full-text-search 0.2.0.0

In-memory full text search engine

An in-memory full-text search engine library. It lets you run full-text queries over a collection of your own documents.

Features:

  • Can search over any type of "document". (You supply the function that extracts search terms from each one.)

  • Supports documents with multiple fields (e.g. title, body)

  • Supports documents with non-term features (e.g. quality score, page rank)

  • Uses the state-of-the-art BM25F ranking function

  • Adjustable ranking parameters (including field weights and non-term feature scores)

  • In-memory but quite compact. It does not keep a copy of your original documents.
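To make the ranking-related features above more concrete, here is a minimal sketch of the textbook BM25F formula for a single query term. It is not taken from this library's source: the record and function names (FieldParams, FieldStats, weightedTf, idf, termScore) are hypothetical, and the library's exact variant, including how it folds in non-term feature scores, may differ.

```haskell
-- Per-field ranking parameters (hypothetical names, for illustration).
data FieldParams = FieldParams
  { fieldWeight :: Double  -- relative importance of the field
  , fieldB      :: Double  -- length-normalisation strength, between 0 and 1
  }

-- Statistics for one term in one field of one document.
data FieldStats = FieldStats
  { termFreq    :: Double  -- occurrences of the term in this field
  , fieldLen    :: Double  -- length of this field in this document
  , avgFieldLen :: Double  -- average length of this field in the collection
  }

-- Weighted, length-normalised term frequency, summed over all fields.
weightedTf :: [(FieldParams, FieldStats)] -> Double
weightedTf fields = sum
  [ fieldWeight p * termFreq s
      / (1 - fieldB p + fieldB p * fieldLen s / avgFieldLen s)
  | (p, s) <- fields
  ]

-- Inverse document frequency: n is the number of documents in the
-- collection, df the number of documents containing the term.
idf :: Double -> Double -> Double
idf n df = log ((n - df + 0.5) / (df + 0.5))

-- Score contribution of one query term; k1 controls how quickly
-- repeated occurrences of the term stop adding to the score.
termScore :: Double -> Double -> Double -> [(FieldParams, FieldStats)] -> Double
termScore k1 n df fields = idf n df * tf / (k1 + tf)
  where
    tf = weightedTf fields
```

A document's score for a query would then be the sum of termScore over the query terms. Raising fieldWeight for a title field makes title matches count for more than body matches.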

The library is independent of the document type, so you write the document-specific parts yourself: extracting search terms and applying any case normalisation or stemming. Libraries such as tokenize and snowball make this straightforward.
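As a rough illustration of the document-specific parts described above, the sketch below extracts case-normalised terms from a two-field document using only base. The Doc and DocField types and the extractTerms and fieldTerms functions are hypothetical. The hand-rolled splitter is a stand-in for a real tokeniser and stemmer, and the exact function shape the library expects is not shown here.

```haskell
import Data.Char (isAlphaNum, toLower)

-- Hypothetical document type, for illustration only.
data Doc = Doc
  { docTitle :: String
  , docBody  :: String
  }

-- Hypothetical field type: one constructor per searchable field.
data DocField = TitleField | BodyField
  deriving (Eq, Ord, Enum, Bounded, Show)

-- Lower-case the text and split it on every non-alphanumeric character.
-- A real setup would likely use a proper tokeniser and add stemming.
extractTerms :: String -> [String]
extractTerms = words . map normalise
  where
    normalise c
      | isAlphaNum c = toLower c
      | otherwise    = ' '

-- The search terms found in one field of a document.
fieldTerms :: Doc -> DocField -> [String]
fieldTerms doc TitleField = extractTerms (docTitle doc)
fieldTerms doc BodyField  = extractTerms (docBody doc)
```

Query strings should go through the same normalisation as document text, otherwise a query for "Search" would not match a document indexed under "search".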

For an example, see the hackage-server source code, where this library is used for the package search feature.