
Wikipedia:Database download


All Wikipedia content is licensed under the GNU Free Documentation License; see Wikipedia:Copyrights for more info.

See also Wikipedia:PHP script for the software that runs the wiki. If you're just looking for the database schema, it's described in schema.doc (http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/%2Acheckout%2A/wikipedia/phpwiki/newcodebase/docs/schema.doc?rev=HEAD&content-type=text/plain). Despite the .doc extension this is a plain text file, not a Microsoft Word document; IE users beware.

Database dumps, updated approximately weekly

Backup dumps of the database are available at http://download.wikipedia.org/. They can be loaded into a MySQL relational database for leisurely analysis, for testing the Wikipedia software (http://wikipedia.sourceforge.net/), and, with appropriate preprocessing, perhaps for offline reading.
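
As a rough illustration of the first step in loading a dump, the Python sketch below streams a compressed dump file to an uncompressed SQL file without holding it all in memory. It assumes the dump is gzip- or bzip2-compressed, which this page does not state; the function name decompress_dump and the file names are hypothetical.

```python
import bz2
import gzip
from pathlib import Path

# Assumption: dumps are gzip- or bzip2-compressed. The page only says
# "compressed", so check the actual file extension before relying on this.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def decompress_dump(src, dest, chunk_size=1 << 20):
    """Stream a compressed dump to an uncompressed file.

    Reads in fixed-size chunks so that a large dump never has to fit
    in memory. Returns the number of uncompressed bytes written.
    """
    src, dest = Path(src), Path(dest)
    opener = OPENERS.get(src.suffix)
    if opener is None:
        raise ValueError(f"unrecognised compression suffix: {src.suffix!r}")

    written = 0
    with opener(src, "rb") as fin, open(dest, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # end of the compressed stream
                break
            fout.write(chunk)
            written += len(chunk)
    return written
```

The resulting SQL file can then be fed to the standard mysql command-line client on its standard input, for example `mysql wikidb < cur_table.sql`, where the database name wikidb and the file name are placeholders.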

The database schema is explained in schema.doc (http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/wikipedia/phpwiki/newcodebase/docs/schema.doc?rev=HEAD&content-type=text/vnd.viewcvs-markup). The cur tables contain the current revision of every page; the old tables contain the prior edit history. The file sizes given for the dumps are approximate and refer to the compressed files; uncompressed, they will be significantly larger.
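
To make the cur/old split concrete, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for MySQL. The table layout is deliberately simplified and the column names (cur_title, cur_text, cur_timestamp, and the old_ equivalents) are assumptions for illustration; consult schema.doc for the real column list.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with simplified cur and old tables.

    Assumption: this is a cut-down illustration, not the real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cur (
            cur_title TEXT PRIMARY KEY,
            cur_text TEXT,
            cur_timestamp TEXT
        );
        CREATE TABLE old (
            old_id INTEGER PRIMARY KEY,
            old_title TEXT,
            old_text TEXT,
            old_timestamp TEXT
        );
        """
    )
    # cur holds exactly one row per page: its latest revision.
    conn.execute(
        "INSERT INTO cur VALUES (?, ?, ?)",
        ("Example", "third draft", "20030317120000"),
    )
    # old holds every earlier revision of that page.
    conn.executemany(
        "INSERT INTO old (old_title, old_text, old_timestamp) VALUES (?, ?, ?)",
        [
            ("Example", "second draft", "20030210090000"),
            ("Example", "first draft", "20030101080000"),
        ],
    )
    return conn


def full_history(conn, title):
    """Return the text of every revision of a page, oldest first.

    Prior revisions come from old; the final entry comes from cur.
    """
    rows = conn.execute(
        """
        SELECT old_timestamp AS ts, old_text AS body
          FROM old WHERE old_title = ?
        UNION ALL
        SELECT cur_timestamp, cur_text
          FROM cur WHERE cur_title = ?
        ORDER BY ts
        """,
        (title, title),
    ).fetchall()
    return [body for _ts, body in rows]
```

Reading only the cur table is enough for a snapshot of the current articles; the old table is needed only when the edit history matters, which is why the two are worth loading separately.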

Static HTML tree dumps for mirroring or CD distribution

In development... If you'd like to help set up an automatic dump-to-static function, please drop us a note on the developers' mailing list.

Daily tarballs of older non-English Wikipedias

These wikis have not yet been upgraded and still run on UseMod-wiki. The software and data are included together in a single tarball.

See also Wikipedia:TomeRaider database


Please do not use a web crawler to download large numbers of articles. Aggressive crawling of the server can cause a dramatic slowdown of Wikipedia.

wikipedia.org dumped 2003-03-17 with terodump