Extraction of Wikipedia/Wiktionary-Topics

Closed - This job posting has been filled and work has been completed.

Job Description

I need a list of all articles on Wikipedia (de.wikipedia.org and en.wikipedia.org) and all entries in Wiktionary (German and English).

The items can be extracted by an automated script, but this is not required. It is also fine to create the lists partly automatically and partly by hand. Either way, they must be complete in the following respects:

The list of Wikipedia articles should cover:
- categories
- content articles
- referring articles
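
One way to retrieve these three kinds of pages is the MediaWiki API's `allpages` list. The sketch below only builds the request URLs. It assumes that "categories" means the Category namespace (14), "content articles" means non-redirect pages in the main namespace (0), and "referring articles" means redirects. That last reading is a guess, and the names `LIST_TYPES` and `allpages_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed mapping from the list types above to "allpages" filters.
# Namespace 0 is the main article namespace, 14 is the Category namespace.
LIST_TYPES = {
    "categories": {"apnamespace": 14, "apfilterredir": "all"},
    "content": {"apnamespace": 0, "apfilterredir": "nonredirects"},
    "referring": {"apnamespace": 0, "apfilterredir": "redirects"},  # assumption: redirects
}


def allpages_url(lang, list_type, continue_from=None):
    """Build one paged request URL for the given language and list type."""
    params = {
        "action": "query",
        "list": "allpages",
        "format": "json",
        "aplimit": "max",
        **LIST_TYPES[list_type],
    }
    # The API returns a continuation value; pass it back to get the next page.
    if continue_from is not None:
        params["apcontinue"] = continue_from
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


print(allpages_url("de", "referring"))
```

For lists of this size, the published database dumps may turn out to be more practical than paging through the API.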

The Wiktionary export should, if possible, be grouped by word type (e.g. "Noun", "Adjective", ...).
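
A minimal sketch of such a grouping, assuming the raw wikitext of each entry is already available and that word types appear as section headings such as `===Noun===` (the convention on English Wiktionary; German Wiktionary marks word types differently and would need its own pattern). `POS_HEADINGS`, `word_types` and `group_by_type` are hypothetical names, and the heading set is deliberately incomplete.

```python
import re
from collections import defaultdict

# Illustrative subset of word-type headings; a real export would need more.
POS_HEADINGS = {"Noun", "Adjective", "Verb", "Adverb"}

# Matches level-3 or deeper headings such as "===Noun===" on a single line.
HEADING_RE = re.compile(r"^={3,}[ \t]*([^=\n]+?)[ \t]*={3,}[ \t]*$", re.MULTILINE)


def word_types(wikitext):
    """Return the set of known word-type headings found in one entry."""
    return {h for h in HEADING_RE.findall(wikitext) if h in POS_HEADINGS}


def group_by_type(entries):
    """Map each word type to the titles whose wikitext contains that heading."""
    groups = defaultdict(list)
    for title, wikitext in entries.items():
        for pos in sorted(word_types(wikitext)):
            groups[pos].append(title)
    return dict(groups)
```

An entry with several word types is listed under each of them, so the per-type files can overlap.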

The lists should be plain text files like this:

content-pages-en.txt:
:
:
A.C. Smith
A.C. St. Louis
A.C. Stephens
:
:

If possible, also include the URL ending or the full URL:

content-pages-en.txt:
:
:
A.C. Smith; A.C._Smith
A.C. St. Louis; A.C._St._Louis
A.C. Stephens; A.C._Stephens
:
:
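
The "title; URL ending" lines shown above could be produced along these lines. This is only a sketch: it assumes the URL ending is the title with spaces replaced by underscores, as in the sample, and the function names are hypothetical.

```python
from urllib.parse import quote


def url_ending(title):
    """Turn a page title into its URL ending (spaces become underscores)."""
    return title.replace(" ", "_")


def format_line(title):
    """Render one output line in the 'title; URL ending' form."""
    return f"{title}; {url_ending(title)}"


def full_url(lang, title):
    """Build the full article URL, percent-encoding characters that need it."""
    return f"https://{lang}.wikipedia.org/wiki/" + quote(url_ending(title))


def render_list(titles):
    """Return the file content: one line per title, sorted by title."""
    return "\n".join(format_line(t) for t in sorted(titles)) + "\n"


print(render_list(["A.C. Stephens", "A.C. Smith", "A.C. St. Louis"]), end="")
```

The result of `render_list` can be written to a file such as content-pages-en.txt using UTF-8 encoding, which matters for German titles with umlauts.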

Please let me know if you expect significantly more or less effort than described.