I have the following word: "Μιχάλης"
I am creating an index with these settings and this mapping:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "mapping",
          "mappings": [
            "α => a", "β => b", "γ => g", "δ => d", "ε => e", "ζ => z", "η => i", "θ => th",
            "ι => i", "κ => k", "λ => l", "μ => m", "ν => n", "ξ => x", "ο => o", "π => p",
            "ρ => r", "σ => s", "τ => t", "υ => u", "φ => f", "χ => x", "ψ => ps", "ω => o",
            "ς => s", "έ => e", "ύ => u", "ί => i", "ό => o", "ά => a", "ή => i", "ώ => o"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      }
    }
  }
}
If I analyze the word with my_analyzer, I get the following result:
{
  "tokens": [
    {
      "token": "Μixalis",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 0
    }
  ]
}
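To illustrate how this result comes about, here is a rough imitation of the char filter in plain Python. It is only a sketch outside Elasticsearch: `MAPPINGS` and `apply_mapping` are names I made up, and the function replaces each character once using the same table as my_char_filter. The capital "Μ" has no entry in the table, so it passes through unchanged, which is why the token still starts with a Greek letter.

```python
import unicodedata

# Same single-valued table as my_char_filter (lowercase letters only).
_RAW_MAPPINGS = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z", "η": "i", "θ": "th",
    "ι": "i", "κ": "k", "λ": "l", "μ": "m", "ν": "n", "ξ": "x", "ο": "o", "π": "p",
    "ρ": "r", "σ": "s", "τ": "t", "υ": "u", "φ": "f", "χ": "x", "ψ": "ps", "ω": "o",
    "ς": "s", "έ": "e", "ύ": "u", "ί": "i", "ό": "o", "ά": "a", "ή": "i", "ώ": "o",
}

# Normalize keys to NFC so accented letters are compared as single code points.
MAPPINGS = {unicodedata.normalize("NFC", k): v for k, v in _RAW_MAPPINGS.items()}


def apply_mapping(text):
    """Replace every mapped character once; leave unmapped characters as they are."""
    text = unicodedata.normalize("NFC", text)
    return "".join(MAPPINGS.get(ch, ch) for ch in text)


print(apply_mapping("Μιχάλης"))  # the leading capital "Μ" is not in the table
```

Each key has exactly one replacement here, so the whole input always yields exactly one output string.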
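It works... However, the problem is that some characters have more than one possible transliteration. For example,
"χ" can be "x" or "ch" (and "ξ" can be "x" or "ks"), so I want to produce multiple variants.
The correct result would probably be: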
{
  "tokens": [
    {
      "token": "Μixalis",
      "position": 0
    },
    {
      "token": "Μichalis",
      "end_offset": 8,
      "position": 0
    }
  ]
}
So I want to generate every word variant from all the character variants.
(Elasticsearch 7)
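To make the goal concrete, here is a minimal sketch in plain Python (outside Elasticsearch) of the expansion I am after. The names `RULES`, `EXTRA_VARIANTS`, `build_table` and `expand_variants` are hypothetical, and the extra variants for "χ" and "ξ" are just the two examples mentioned above. It takes the Cartesian product of the possible replacements for each character.

```python
import itertools
import unicodedata

# Single-valued rules, written in the same "key => value" form as my_char_filter.
RULES = [
    "α => a", "β => b", "γ => g", "δ => d", "ε => e", "ζ => z", "η => i", "θ => th",
    "ι => i", "κ => k", "λ => l", "μ => m", "ν => n", "ξ => x", "ο => o", "π => p",
    "ρ => r", "σ => s", "τ => t", "υ => u", "φ => f", "χ => x", "ψ => ps", "ω => o",
    "ς => s", "έ => e", "ύ => u", "ί => i", "ό => o", "ά => a", "ή => i", "ώ => o",
]

# Assumed additional transliterations for the ambiguous letters.
EXTRA_VARIANTS = {"χ": ["ch"], "ξ": ["ks"]}


def build_table(rules, extra):
    """Map each character to the list of all its possible replacements."""
    table = {}
    for rule in rules:
        key, value = rule.split(" => ")
        table[unicodedata.normalize("NFC", key)] = [value]
    for key, values in extra.items():
        table.setdefault(unicodedata.normalize("NFC", key), []).extend(values)
    return table


TABLE = build_table(RULES, EXTRA_VARIANTS)


def expand_variants(word):
    """Return every combination of per-character replacements, in table order."""
    word = unicodedata.normalize("NFC", word)
    # Unmapped characters (such as the capital "Μ") keep themselves as the only option.
    options = [TABLE.get(ch, [ch]) for ch in word]
    return ["".join(combo) for combo in itertools.product(*options)]


print(expand_variants("Μιχάλης"))  # ['Μixalis', 'Μichalis']
```

The number of variants is the product of the option counts per character, so a word containing both "χ" and "ξ" would already give four variants. What I am looking for is a way to get the same set of tokens from the analyzer itself.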