Sherlock Holmes

A universal search engine.

Sherlock Holmes Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: Martin Mares
  • Publisher web site: http://mj.ucw.cz/linux.shtml

Sherlock Holmes Description

Sherlock Holmes is a universal search engine: a system for gathering and indexing textual data (text files, web pages, etc.), both locally and over the network.

Here are some key features of "Sherlock Holmes":

  • Gathers files via HTTP or from the local file system.
  • Parses text files, HTML, PDF, and several other formats (such as MS Word and PostScript) using external parsers.
  • The whole system is modular, so adding your own data sources or parsers is just a matter of plugging in the right module (well, usually also writing it).
  • Works well in mixed-charset environments.
  • Treats multiple occurrences of the same file (even with minor changes) as a single document with multiple URLs.
  • Everything is highly configurable. You can write filtering rules in a special language that lets you tweak configuration variables depending on the document being processed.
  • Searches for words, phrases, and boolean expressions, including in filenames and link texts.
  • Proximity search and proximity weighting of regular searches.
  • Language recognition and easy integration of stemmers and synonym dictionaries.
  • Spelling checker based on word frequencies observed in the indexed data, which hints to users that their query might be misspelled.
  • Search results include context from each document.
  • Scales well to tens of millions of documents on normal PC hardware.
  • The user interface (the front end) is completely separated from the rest of the system, making it easy to modify and to embed the search engine in existing applications.
  • Downloaded files and indices are compressed to save space.
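
The frequency-based spelling checker mentioned in the feature list can be illustrated with a small sketch. This is not Sherlock Holmes code and does not use its API or configuration. It is a generic Python example of the idea, and the function names (build_frequencies, edits1, suggest) and the ratio threshold are assumptions made up for illustration. It counts words in the indexed text, generates strings one edit away from the query word, and hints at a correction only when a candidate is much more frequent than the word as typed.

```python
from collections import Counter
import re
import string


def build_frequencies(documents):
    """Count how often each lowercase ASCII word occurs across the documents."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)


def suggest(word, counts, ratio=10):
    """Return a likely correction, or None if the word looks fine.

    Hypothetical rule: a candidate is suggested only if it is at least
    `ratio` times more frequent in the index than the word as typed.
    """
    word = word.lower()
    own = counts.get(word, 0)
    candidates = [w for w in edits1(word) if w in counts and w != word]
    if not candidates:
        return None
    best = max(candidates, key=lambda w: counts[w])
    if counts[best] >= ratio * max(own, 1):
        return best
    return None


if __name__ == "__main__":
    docs = ["search engine indexes documents"] * 20 + ["serach"]
    counts = build_frequencies(docs)
    print(suggest("serach", counts))
    print(suggest("search", counts))
```

Comparing frequencies rather than consulting a fixed dictionary means the hints adapt to whatever vocabulary the indexed documents actually contain, which is the property the feature description emphasizes.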

