Problem description
I'm trying to match the text "Config migration from ASA5505 8.2 to ASA5516" in the TITLE field.
My program looks like this:
Directory directory = FSDirectory.open(indexDir);
MultiFieldQueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_35,new String[] {"TITLE"}, new StandardAnalyzer(Version.LUCENE_35));
IndexReader reader = IndexReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
queryParser.setPhraseSlop(0);
queryParser.setLowercaseExpandedTerms(true);
Query query = queryParser.parse("TITLE:Config migration from ASA5505 8.2 to ASA5516");
System.out.println(query);
TopDocs topDocs = searcher.search(query,100);
System.out.println(topDocs.totalHits);
ScoreDoc[] hits = topDocs.scoreDocs;
System.out.println(hits.length + " Record(s) Found");
for (int i = 0; i < hits.length; i++) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println("\"Title :\" " + d.get("TITLE"));
}
But it returns:
"Title :" Config migration from ASA5505 8.2 to ASA5516
"Title :" Firewall migration from ASA5585 to ASA5555
"Title :" Firewall migration from ASA5585 to ASA5555
The last two results are not expected. What modification is required to match the exact text "Config migration from ASA5505 8.2 to ASA5516"?
My indexing function looks like this:
public class Lucene {
    public static final String INDEX_DIR = "./Lucene";
    private static final String JDBC_DRIVER = "oracle.jdbc.OracleDriver";
    private static final String CONNECTION_URL = "jdbc:oracle:thin:xxxxxxx";
    private static final String USER_NAME = "localhost";
    private static final String PASSWORD = "localhost";
    private static final String QUERY = "select * from TITLE_TABLE";

    public static void main(String[] args) throws Exception {
        File indexDir = new File(INDEX_DIR);
        Lucene indexer = new Lucene();
        try {
            Date start = new Date();
            Class.forName(JDBC_DRIVER).newInstance();
            Connection conn = DriverManager.getConnection(CONNECTION_URL, USER_NAME, PASSWORD);
            SimpleAnalyzer analyzer = new SimpleAnalyzer(Version.LUCENE_35);
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
            IndexWriter indexWriter = new IndexWriter(FSDirectory.open(indexDir), indexWriterConfig);
            System.out.println("Indexing to directory '" + indexDir + "'...");
            int indexedDocumentCount = indexer.indexDocs(indexWriter, conn);
            indexWriter.close();
            System.out.println(indexedDocumentCount + " records have been indexed successfully");
            System.out.println("Total Time:" + (new Date().getTime() - start.getTime()) / (1000));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    int indexDocs(IndexWriter writer, Connection conn) throws Exception {
        String sql = QUERY;
        Statement stmt = conn.createStatement();
        stmt.setFetchSize(100000);
        ResultSet rs = stmt.executeQuery(sql);
        int i = 0;
        while (rs.next()) {
            System.out.println("Adding Doc No:" + i);
            Document d = new Document();
            System.out.println(rs.getString("TITLE"));
            d.add(new Field("TITLE", rs.getString("TITLE"), Field.Store.YES, Field.Index.ANALYZED));
            d.add(new Field("NAME", rs.getString("NAME"), Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(d);
            i++;
        }
        return i;
    }
}
Recommended answer
PVR is correct that a phrase query is probably the right solution here, but they missed how to use the PhraseQuery class. You are already using QueryParser, though, so just use the query parser syntax and enclose your search text in quotes:
Query query = queryParser.parse("TITLE:\"Config migration from ASA5505 8.2 to ASA5516\"");
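To make the difference concrete: without quotes, the parser treats each word as a separate optional term (OR is QueryParser's default operator), so any title that shares a word such as "migration" is a hit. With quotes and a phrase slop of 0, the terms have to appear next to each other, in order. The plain-Java sketch below models those two behaviours. It is only a toy and does not call Lucene; the class PhraseMatchSketch and its methods are names made up for illustration, and the whitespace tokenizer is an assumed stand-in for a real analyzer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model of term (OR) matching vs. phrase matching. Hypothetical names, not Lucene code.
final class PhraseMatchSketch {

    // Lowercase and split on whitespace (an assumed stand-in for a real analyzer).
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Unquoted query: the document matches if it contains ANY of the query terms.
    static boolean matchesAnyTerm(String doc, String query) {
        List<String> docTokens = tokens(doc);
        for (String term : tokens(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Quoted query with slop 0: all query terms must appear consecutively and in order.
    static boolean matchesPhrase(String doc, String query) {
        return Collections.indexOfSubList(tokens(doc), tokens(query)) >= 0;
    }

    public static void main(String[] args) {
        String query = "Config migration from ASA5505 8.2 to ASA5516";
        String wanted = "Config migration from ASA5505 8.2 to ASA5516";
        String other = "Firewall migration from ASA5585 to ASA5555";

        System.out.println(matchesAnyTerm(other, query)); // true: shares "migration", "from", "to"
        System.out.println(matchesPhrase(other, query));  // false
        System.out.println(matchesPhrase(wanted, query)); // true
    }
}
```

This is why the two "Firewall migration ..." titles showed up for the unquoted query and should drop out once the text is quoted.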
Based on your update, you are using different analyzers at index time and query time. SimpleAnalyzer and StandardAnalyzer don't do the same things: SimpleAnalyzer splits on every non-letter character, so digits are dropped and ASA5505 is indexed as just asa, whereas StandardAnalyzer keeps asa5505 and 8.2 as whole tokens. Unless you have a very good reason to do otherwise, you should analyze the same way when indexing and querying.
So, change the analyzer in your indexing code to StandardAnalyzer and rebuild the index (or vice versa, use SimpleAnalyzer when querying), and you should see better results.
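If it helps to see the mismatch, here is a rough plain-Java approximation of what the two analyzers produce for the title in question. It does not use Lucene: AnalyzerSketch, letterTokens, and wordTokens are hypothetical names, letterTokens mimics SimpleAnalyzer's "runs of letters, lowercased" rule, and wordTokens is only a loose stand-in for StandardAnalyzer (whitespace splitting plus a small assumed subset of its English stop words).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Rough approximations of two Lucene analyzers. Hypothetical names, not the real implementations.
final class AnalyzerSketch {

    // Assumed partial stop list; the real default set is longer.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "to", "of"));

    // Like SimpleAnalyzer: keep only runs of letters, lowercased. Digits act as separators.
    static List<String> letterTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Loosely like StandardAnalyzer: split on whitespace, lowercase, drop stop words.
    static List<String> wordTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String title = "Config migration from ASA5505 8.2 to ASA5516";
        // prints [config, migration, from, asa, to, asa]
        System.out.println(letterTokens(title));
        // prints [config, migration, from, asa5505, 8.2, asa5516]
        System.out.println(wordTokens(title));
    }
}
```

With the index built from the first token list and the query analyzed into the second, the query terms asa5505, 8.2, and asa5516 have nothing to match in the index, so even a quoted phrase cannot line up until both sides use the same analyzer.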