Problem description
I want to build my own analyzer that combines a tokenizer with filters.
That is, I want the same field to be indexed as a keyword (the entire stream as a single token) and also lowercased.
If I use only KeywordAnalyzer, the field value keeps its original case, so matching is case-sensitive. If I use LowerCaseTokenizer or LowerCaseFilter, I have to combine them with other analyzers that do not keep the value as a single token the way KeywordAnalyzer does (they split on non-letters or on whitespace, remove stop words, etc.).
The question is: is there any way, using Lucene filters, analyzers, or tokenizers, to index that field as a keyword (the entire stream as a single token) and lowercase it as well?
(Google translated, sorry about errors)
Recommended answer
This should work:
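import java.io.Reader;
import org.apache.lucene.analysis.KeywordTokenizer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.ReusableAnalyzerBase;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.util.Version;
public final class YourAnalyzer extends ReusableAnalyzerBase {
    @Override
    protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
        // Emit the whole field value as one token, then lowercase that token.
        final Tokenizer source = new KeywordTokenizer(reader);
        return new TokenStreamComponents(source, new LowerCaseFilter(Version.LUCENE_36, source));
    }
}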
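To make the difference concrete, here is a rough plain-Java simulation of the tokens each approach would produce. It is only a sketch: the class `AnalyzerSketch` and its methods are hypothetical names, not Lucene APIs, and the lowercasing only approximates what Lucene does. It assumes, as the question describes, that a keyword tokenizer emits the whole value as one token while a LowerCaseTokenizer-style tokenizer splits at non-letters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java simulation of token output; none of these names are Lucene APIs.
final class AnalyzerSketch {

    // Approximates KeywordTokenizer + LowerCaseFilter: one token, lowercased.
    static List<String> keywordLowercase(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(value.toLowerCase(Locale.ROOT));
        return tokens;
    }

    // Approximates LowerCaseTokenizer: split at non-letters, lowercase each piece.
    static List<String> letterLowercase(String value) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String value = "New York-NY";
        System.out.println(keywordLowercase(value)); // prints [new york-ny]
        System.out.println(letterLowercase(value));  // prints [new, york, ny]
    }
}
```

With the analyzer above, a field value such as "New York-NY" should therefore be indexed as the single term "new york-ny", so an exact match on the whole value works regardless of case.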