Type Alias Options<Document, SummaryField, IndexField>

Configuration options for creating a search index.

type Options<
    Document extends Record<string, unknown>,
    SummaryField extends keyof Document = keyof Document,
    IndexField extends keyof Document = keyof Document,
> = {
    errorRate?: number;
    fields: IndexField[] | Record<IndexField, number>;
    minSize?: number;
    ngrams?: number;
    preprocess?: PreprocessFunction<Document, IndexField>;
    seed?: number;
    stemmer?: StemmerFunction;
    stopwords?: StopwordsFunction;
    summary: SummaryField[];
    termFrequencyBuckets?: number[];
    tokenizer?: TokenizerFunction;
}

Type Parameters

Document extends Record<string, unknown>
The type of document being indexed.
SummaryField extends keyof Document = keyof Document
The document keys that can be returned in search results.
IndexField extends keyof Document = keyof Document
The document keys to be indexed for searching.
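
As an illustration of how the three type parameters interact, here is a minimal sketch. The `Options` type is re-declared locally (reduced to a few properties) so that the snippet compiles on its own. The `Article` document shape and all values are hypothetical.

```typescript
// Local stand-in for the Options type documented on this page, reduced to a
// few properties. In real code, import the type from the library instead.
type Options<
  Document extends Record<string, unknown>,
  SummaryField extends keyof Document = keyof Document,
  IndexField extends keyof Document = keyof Document,
> = {
  errorRate?: number;
  fields: IndexField[] | Record<IndexField, number>;
  ngrams?: number;
  summary: SummaryField[];
};

// Hypothetical document shape, for illustration only.
type Article = { id: string; title: string; body: string; author: string };

// Only "title" and "body" are searchable; only "id" and "title" are returned.
const options: Options<Article, "id" | "title", "title" | "body"> = {
  fields: { title: 5, body: 1 }, // weighted record form
  summary: ["id", "title"],
  errorRate: 0.001,
  ngrams: 2,
};

console.log(options.summary);
```

Narrowing `SummaryField` and `IndexField` lets the compiler reject keys such as `"author"` in `fields` or `summary`.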

Index

Properties

errorRate? fields minSize? ngrams? preprocess? seed? stemmer? stopwords? summary termFrequencyBuckets? tokenizer?

Properties

`Optional` errorRate

errorRate?: number

Determines the desired error rate. A lower number yields more reliable results but makes the index larger. The value defaults to 0.0001 (or 0.01%).
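
To give a feel for the size trade-off, the sketch below applies the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2. Whether the library sizes its filters with exactly this formula is an assumption, and `bloomFilterBits` is a hypothetical helper rather than part of the API.

```typescript
// Textbook Bloom filter sizing: bits needed to hold `terms` distinct items
// at a target false-positive rate `errorRate`.
// Assumption: the library may size its filters differently.
function bloomFilterBits(terms: number, errorRate: number): number {
  return Math.ceil((-terms * Math.log(errorRate)) / (Math.LN2 * Math.LN2));
}

const loose = bloomFilterBits(1000, 0.01);
const strict = bloomFilterBits(1000, 0.0001); // the documented default rate

console.log({ loose, strict });
```

Under this formula, moving from an error rate of 0.01 to 0.0001 roughly doubles the filter size, from about 9.6 to about 19.2 bits per term.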

fields

fields: IndexField[] | Record<IndexField, number>

The fields to index, provided as an array or as a record of field keys and weight values.
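
The two accepted shapes can be reduced to a single weights record, as sketched below. The `toWeights` helper is hypothetical, and treating the array form as a weight of 1 per field is an assumption, not documented behaviour.

```typescript
// Normalises either accepted shape of `fields` into a record of weights.
// Assumption: the array form gives every field a weight of 1.
function toWeights(
  fields: string[] | Record<string, number>,
): Record<string, number> {
  if (!Array.isArray(fields)) return fields;
  const weights: Record<string, number> = {};
  for (const field of fields) weights[field] = 1;
  return weights;
}

const fromArray = toWeights(["title", "body"]);
const fromRecord = toWeights({ title: 5, body: 1 });

console.log(fromArray, fromRecord);
```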

`Optional` minSize

minSize?: number

Minimum term cardinality used to calculate the Bloom filter size. This can be used to reduce false positives when dealing with small documents with sparse term frequency distribution. The default value is 0.
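
One plausible reading of this option is that it acts as a floor on the number of distinct terms used for sizing. The sketch below shows that reading only. `effectiveCardinality` is a hypothetical helper, and the library's actual calculation may differ.

```typescript
// Assumption: minSize is a lower bound on the distinct-term count that
// feeds the Bloom filter size calculation.
function effectiveCardinality(tokens: string[], minSize = 0): number {
  const distinctTerms = new Set(tokens).size;
  return Math.max(distinctTerms, minSize);
}

const shortDocument = ["bloom", "filter", "bloom"];

console.log(effectiveCardinality(shortDocument)); // 2 distinct terms
console.log(effectiveCardinality(shortDocument, 50)); // floor of 50 applies
```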

`Optional` ngrams

ngrams?: number

Indexes n-grams beyond the single text tokens. A value of 2 indexes digrams, a value of 3 indexes digrams and trigrams, and so forth. This allows searching the index for simple phrases (a phrase search is entered "between quotes"). Indexing n-grams increases the size of the generated indices roughly by a factor of n. The default value is 1 (no n-grams are indexed).
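
The following sketch shows which terms end up indexed for a given value of n, and why the index grows roughly n-fold. `withNgrams` is a hypothetical helper, and joining tokens with a space is an assumption about how n-grams are represented.

```typescript
// Returns the single tokens plus every n-gram of size 2..n.
// Assumption: n-grams are represented as space-joined tokens.
function withNgrams(tokens: string[], n: number): string[] {
  const terms: string[] = [];
  for (let size = 1; size <= n; size++) {
    for (let start = 0; start + size <= tokens.length; start++) {
      terms.push(tokens.slice(start, start + size).join(" "));
    }
  }
  return terms;
}

console.log(withNgrams(["quick", "brown", "fox"], 2));
// → ["quick", "brown", "fox", "quick brown", "brown fox"]
```

For a document of t tokens, each n-gram size contributes close to t terms, which is where the factor of n comes from.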

`Optional` preprocess

preprocess?: PreprocessFunction<Document, IndexField>

Preprocessing function, executed before all others. The function serialises each field as a string and optionally processes it before indexing. For example, you might use this function to strip HTML from a field value. By default, this class simply converts the field value into a string.
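
A minimal sketch of the HTML-stripping use case follows. The exact `PreprocessFunction` signature is not shown on this page, so `stripHtml` is written as a hypothetical standalone helper that a preprocess function could call. The regular expression is deliberately naive and is not a substitute for a real HTML parser.

```typescript
// Serialises a field value as a string and removes anything that looks like
// an HTML tag. Naive on purpose: it does not decode entities or handle
// malformed markup.
function stripHtml(value: unknown): string {
  return String(value ?? "")
    .replace(/<[^>]*>/g, " ") // replace tags with a space
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

console.log(stripHtml("<p>Hello <b>world</b></p>")); // "Hello world"
```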

`Optional` seed

seed?: number

Hash seed to use in Bloom filters. Defaults to 0x00c0ffee.

`Optional` stemmer

stemmer?: StemmerFunction

Allows plugging in a custom stemming function. By default, this class does not change text tokens.
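
As a rough illustration, a custom stemmer might look like the sketch below. It assumes `StemmerFunction` maps one token to one stemmed token, which is not confirmed on this page. The suffix rules are a toy example and not a real stemming algorithm such as Porter's.

```typescript
// Toy suffix-stripping stemmer (hypothetical). It removes a few common
// English suffixes while keeping at least three leading characters.
// Assumption: a stemmer has the shape (token: string) => string.
function naiveStemmer(token: string): string {
  return token.replace(/^(.{3,}?)(ing|ed|es|s)$/, "$1");
}

console.log(["searching", "searches", "tokens", "is"].map((t) => naiveStemmer(t)));
// → ["search", "search", "token", "is"]
```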

`Optional` stopwords

stopwords?: StopwordsFunction

Filters tokens so that words that are too short or too common may be excluded from the index. By default, no stopwords are excluded.
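
A stopword filter along these lines could be sketched as a predicate over tokens. The shape `(token: string) => boolean` and the convention that `true` means "keep the token" are both assumptions to verify against the `StopwordsFunction` type. The word list is a hypothetical sample.

```typescript
// Hypothetical sample of very common English words.
const COMMON_WORDS = new Set(["a", "an", "and", "of", "the", "to"]);

// Assumption: returning true keeps the token, returning false excludes it.
function keepToken(token: string): boolean {
  return token.length >= 2 && !COMMON_WORDS.has(token.toLowerCase());
}

console.log(["the", "index", "of", "x", "terms"].filter((t) => keepToken(t)));
// → ["index", "terms"]
```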

summary

summary: SummaryField[]

Determines which fields in the document can be stored in the index and returned as a search result.

`Optional` termFrequencyBuckets

termFrequencyBuckets?: number[]

Optimises storage by grouping indexed terms into buckets according to term frequency in a document. Defaults to [1, 2, 3, 4, 8, 16, 32, 64].
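
One way to picture the bucketing is to round each term's frequency down to the nearest bucket value, as sketched below. This rounding rule is an assumption made for illustration. The page does not specify how frequencies map to buckets, and `bucketFor` is a hypothetical helper.

```typescript
const DEFAULT_BUCKETS = [1, 2, 3, 4, 8, 16, 32, 64]; // documented default

// Assumption: a term is assigned to the largest bucket that does not exceed
// its frequency in the document. Returns undefined if no bucket fits.
function bucketFor(
  frequency: number,
  buckets: number[] = DEFAULT_BUCKETS,
): number | undefined {
  let chosen: number | undefined;
  for (const bucket of buckets) {
    if (bucket <= frequency && (chosen === undefined || bucket > chosen)) {
      chosen = bucket;
    }
  }
  return chosen;
}

console.log([1, 5, 10, 100].map((f) => bucketFor(f))); // → [1, 4, 8, 64]
```

Under this reading, coarser buckets at high frequencies mean fewer groups to store, at the cost of less precise frequency information.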

`Optional` tokenizer

tokenizer?: TokenizerFunction

Allows plugging in a custom tokenizer function. By default, content is transformed to lowercase, split at every whitespace character or hyphen, and stripped of any non-word characters (anything other than A-Z, 0-9, and _).
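
The default behaviour described above can be approximated as follows. This is a reconstruction from the description, not the library's source, and it assumes a tokenizer takes a string and returns an array of tokens. `defaultLikeTokenizer` is a hypothetical name.

```typescript
// Approximates the documented default: lowercase, split at whitespace or
// hyphens, strip non-word characters, and drop empty tokens.
// Assumption: a tokenizer has the shape (text: string) => string[].
function defaultLikeTokenizer(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s-]+/)
    .map((token) => token.replace(/\W/g, ""))
    .filter((token) => token.length > 0);
}

console.log(defaultLikeTokenizer("Full-text search, made EASY!"));
// → ["full", "text", "search", "made", "easy"]
```

Note that `\W` in a JavaScript regular expression without the `u` flag also strips accented letters, so content in languages other than English may call for a custom tokenizer.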
