Enca

Detects the encoding of text files
Enca Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: David Necas
  • Publisher web site: http://gwyddion.net/


Enca Description

Enca detects the encoding of text files on the basis of knowledge of their language.

Enca is an Extremely Naive Charset Analyser. It detects the character set and encoding of text files and can also convert them to other encodings. The charset-detecting functionality is also available as a library. Work has begun on pyenca, a Python interface to libenca.

Here are some key features of "Enca":

  • recognises several multibyte encodings: UCS-2, UCS-4, UTF-8, UTF-7 and TeX accents
  • recognises all common EOL types, byte orders and also Quoted-printable
  • detects files accidentally converted twice to UTF-8 from some 8-bit encoding
  • can report charset names following the conventions of various standards (or programs), as well as human-readable descriptions; accepts all common charset aliases
  • works with multiple files and can act as an intelligent filter
  • converts files using a built-in converter, the GNU recode library, UNIX98 iconv functions or an external converter specified on the command line (e.g. cstocs, GNU recode)
  • automagically converts files to your locale's preferred character set when called as enconv
  • has a special ambiguous mode for very short texts
  • can filter out binary parts of a file and/or box-drawing characters before guessing, so it can determine the encoding of pretty messy files
  • uses various tricks to resolve hard-to-decide cases, such as distinguishing between iso8859-2 and cp1250
  • is fairly portable, runs on GNU/Linux and all sane Unices

What's New in This Release:

  • There is a new upstream maintainer.
  • Belarusian detection has been fixed.
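
To illustrate what "converted twice to UTF-8" means in the feature list above, here is a minimal Python sketch. It is not Enca's algorithm and does not use libenca or pyenca; the function name looks_double_utf8 and the choice of Latin-1 as the assumed 8-bit encoding are hypothetical, picked only for illustration. The idea: if undoing one UTF-8 layer through the 8-bit encoding yields bytes that are again valid UTF-8, and the result differs from the input text, the data was probably encoded twice.

```python
def looks_double_utf8(data: bytes, legacy: str = "latin-1") -> bool:
    """Heuristic: does `data` look like UTF-8 that was encoded to UTF-8 twice?

    `legacy` is the 8-bit encoding the text was wrongly assumed to be in
    during the second conversion (an assumption of this sketch).
    """
    try:
        # First layer: the data must at least be valid UTF-8.
        once = data.decode("utf-8")
        # Undo the suspected second conversion by mapping each character
        # back to a single byte in the legacy encoding.
        inner = once.encode(legacy)
        # If those bytes are valid UTF-8 again, a second layer existed.
        restored = inner.decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    # Pure ASCII round-trips unchanged, so it is no evidence either way.
    return restored != once


if __name__ == "__main__":
    good = "Dvořák".encode("utf-8")
    # Simulate the accident: UTF-8 bytes misread as Latin-1, re-encoded.
    bad = good.decode("latin-1").encode("utf-8")
    print(looks_double_utf8(good), looks_double_utf8(bad))
```

A real detector needs more than this: the sketch checks a single candidate 8-bit encoding and, unlike Enca, uses no knowledge of the text's language.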

