WP2TXT: Free Ruby-based Wikipedia-to-text converter
WP2TXT Ranking & Summary
- License: Freeware
- Price: Free
- Publisher Name: Yoichiro Hasebe
- Publisher web site: http://rubyforge.org/users/yohasebe/
- Operating Systems: Mac OS X 10.5 or later
- File Size: 11 MB
WP2TXT Description
WP2TXT decompresses a Wikipedia dump file, which is bz2-compressed XML containing MediaWiki markup, and converts it into plain text files. It extracts the article text and strips all MediaWiki markup and other metadata.

WP2TXT was originally intended for researchers looking for an easy way to obtain open-source multilingual corpora, but it may be handy for anyone who needs article text from Wikipedia.

WP2TXT is written in the Ruby programming language and has a GUI built with wxRuby. Mac OS X and Windows packages are provided.

NOTE: WP2TXT is developed, licensed and released under the terms of the MIT License.
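To make the pipeline described above concrete (decompress the bz2 dump, walk the XML, strip the MediaWiki markup), here is a minimal sketch in Python. It is not WP2TXT's own code, which is written in Ruby. The function names `strip_markup` and `iter_articles` are hypothetical, and the regular expressions only handle simple cases such as non-nested templates and basic links, so treat it as an illustration of the idea rather than a full converter.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def strip_markup(wikitext):
    """Rough MediaWiki-to-text cleanup (assumption: simple, non-nested markup)."""
    text = re.sub(r"<!--.*?-->", "", wikitext, flags=re.DOTALL)        # HTML comments
    text = re.sub(r"<ref[^>]*/>", "", text)                            # self-closing refs
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)   # paired refs
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)                         # non-nested templates
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)      # [[target|label]] -> label
    text = re.sub(r"'{2,}", "", text)                                  # bold/italic quote runs
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text,
                  flags=re.MULTILINE)                                  # == Heading == -> Heading
    return text.strip()


def iter_articles(dump_path):
    """Yield (title, plain_text) pairs from a bz2-compressed XML dump."""
    with bz2.open(dump_path, "rb") as stream:
        title = None
        # iterparse streams the file, so the whole dump is never held in memory.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
            if tag == "title":
                title = elem.text or ""
            elif tag == "text":
                yield title, strip_markup(elem.text or "")
            elif tag == "page":
                elem.clear()  # release the finished page's child elements
```

A caller could loop over `iter_articles("dump.xml.bz2")` (a placeholder path) and write each article to its own text file. Real dumps contain nested templates, tables, and other constructs that this sketch leaves in place.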