mirror of https://gitlab.com/SIGBUS/nyaa.git
synced 2024-12-22 20:40:00 +00:00
Optimize Elasticsearch fullword field
Since the main display_name field ngrams words up to 15 characters, any word at or under that length is already indexed there. The fullword field (which exists for words longer than 15 characters) therefore only needs to index words longer than that.
parent 81806d7bc9
commit f31af836d9
@@ -32,13 +32,19 @@ settings:
         filter:
           - lowercase
           - word_delimit
-          # These should be enough, as my_index_analyzer will match the rest
+          # Skip tokens shorter than N characters,
+          # since they're already indexed in the main field
+          - fullword_min
 
     filter:
       my_ngram:
         type: edgeNGram
         min_gram: 1
         max_gram: 15
+      fullword_min:
+        type: length
+        # Remember to change this if you change the max_gram below!
+        min: 16
       resolution:
         type: pattern_capture
         patterns: ["(\\d+)[xX](\\d+)"]
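To illustrate why `min: 16` pairs with `max_gram: 15`, here is a minimal Python sketch that imitates the two filters on plain strings. It is not how Elasticsearch is implemented, and it is not code from this repository. The function names `edge_ngrams`, `fullword_tokens` and `is_findable` are hypothetical; only the two constants come from the diff. It assumes the edge n-gram filter emits prefixes only up to `max_gram` and that the length filter's `min` is inclusive.

```python
# Illustrative simulation only; Elasticsearch performs the real analysis.
MAX_GRAM = 15      # max_gram of the my_ngram filter in the diff
FULLWORD_MIN = 16  # min of the fullword_min length filter in the diff


def edge_ngrams(token, min_gram=1, max_gram=MAX_GRAM):
    """Return the prefixes of `token` that are min_gram..max_gram characters long."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def fullword_tokens(tokens, min_length=FULLWORD_MIN):
    """Keep only tokens with at least `min_length` characters, like a length filter."""
    return [t for t in tokens if len(t) >= min_length]


def is_findable(word, query):
    """True if an exact-term query matches either simulated field for `word`."""
    return query in edge_ngrams(word) or query in fullword_tokens([word])


if __name__ == "__main__":
    short_word = "a" * 15  # fully covered by the ngram field
    long_word = "a" * 16   # too long for the ngram field, kept by fullword
    for word in (short_word, long_word):
        print(len(word), word in edge_ngrams(word), fullword_tokens([word]) == [word])
```

Under these assumptions a 15-character word is matched whole by the ngram field and dropped from the fullword field, while a 16-character word is only matched whole through the fullword field. That is why the two numbers have to be kept exactly one apart.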