Adjusting Lucene search result scores by weighting one specific field among fields with the same name
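I am currently using Lucene as our full-text search engine, but we need to rank the search results according to a specific field.
For example, suppose our index contains the following three documents, which are identical except for the id field.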
import org.apache.lucene.document.{Document, Field}

// Document 1: an id plus two fields that share the name "contents"
val document01 = new Document()
val field0100 = new Field("id", "1", Field.Store.YES, Field.Index.ANALYZED)
val field0101 = new Field("contents", "This is a test: Linux", Field.Store.YES, Field.Index.ANALYZED)
val field0102 = new Field("contents", "This is a test: Windows", Field.Store.YES, Field.Index.ANALYZED)
document01.add(field0100)
document01.add(field0101)
document01.add(field0102)

// Document 2: same contents, id 2
val document02 = new Document()
val field0200 = new Field("id", "2", Field.Store.YES, Field.Index.ANALYZED)
val field0201 = new Field("contents", "This is a test: Linux", Field.Store.YES, Field.Index.ANALYZED)
val field0202 = new Field("contents", "This is a test: Windows", Field.Store.YES, Field.Index.ANALYZED)
document02.add(field0200)
document02.add(field0201)
document02.add(field0202)

// Document 3: same contents, id 3
val document03 = new Document()
val field0300 = new Field("id", "3", Field.Store.YES, Field.Index.ANALYZED)
val field0301 = new Field("contents", "This is a test: Linux", Field.Store.YES, Field.Index.ANALYZED)
val field0302 = new Field("contents", "This is a test: Windows", Field.Store.YES, Field.Index.ANALYZED)
document03.add(field0300)
document03.add(field0301)
document03.add(field0302)
Now, when I search for Linux using IndexSearcher, I get the following results:
Document<stored,indexed,tokenized<id:1> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:2> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:3> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Searching for Windows returns the same results:
Document<stored,indexed,tokenized<id:1> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:2> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:3> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
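The question is: is there a way to weight a specific field while building the index? For example, I want field0201 to get a higher score if it is the one matched by the search.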
In other words, when I search for Linux, I want to get the results in the following order:
Document<stored,indexed,tokenized<id:2> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:1> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:3> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
When I search for Windows, the results should keep the original order, as shown below:
Document<stored,indexed,tokenized<id:1> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:2> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
Document<stored,indexed,tokenized<id:3> stored,indexed,tokenized<contents:This is a test: Linux> stored,indexed,tokenized<contents:This is a test: Windows>>
I tried using field0201.setBoost(), but that changes the order of the results whether I search for Linux or for Windows.
1 answer
泪琉踞檄
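Then search with a query (a Boolean query containing two sub-queries, both set to the term Linux). If the match is in that field, the boost on the content-high field should then increase the document's score.
Then check whether it works.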
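The approach described above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original posts: Lucene 3.6.x on the classpath, a second field name content-low for the unboosted text, a boost value of 2.0, norms disabled so that field length does not affect the ranking, and the helper names makeDoc, buildIndex and search, which are all hypothetical.

```scala
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field}
import org.apache.lucene.index.{IndexReader, IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.search.{BooleanClause, BooleanQuery, IndexSearcher, TermQuery}
import org.apache.lucene.store.RAMDirectory
import org.apache.lucene.util.Version

object BoostSketch {
  // Assumption: Lucene 3.6.x; adjust the constant for other 3.x releases.
  private val version = Version.LUCENE_36
  private val analyzer = new StandardAnalyzer(version)

  // Hypothetical helper: builds a document from an id and (fieldName, text) pairs.
  private def makeDoc(id: String, texts: (String, String)*): Document = {
    val doc = new Document()
    doc.add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED))
    for ((name, text) <- texts)
      // Norms are omitted so that field length does not influence the score.
      doc.add(new Field(name, text, Field.Store.YES, Field.Index.ANALYZED_NO_NORMS))
    doc
  }

  def buildIndex(): RAMDirectory = {
    val dir = new RAMDirectory()
    val writer = new IndexWriter(dir, new IndexWriterConfig(version, analyzer))
    val linux = "This is a test: Linux"
    val windows = "This is a test: Windows"
    writer.addDocument(makeDoc("1", "content-low" -> linux, "content-low" -> windows))
    // Only document 2 stores its Linux text under the boosted field name.
    writer.addDocument(makeDoc("2", "content-high" -> linux, "content-low" -> windows))
    writer.addDocument(makeDoc("3", "content-low" -> linux, "content-low" -> windows))
    writer.close()
    dir
  }

  // Returns the ids of the matching documents, best score first.
  def search(dir: RAMDirectory, word: String): Seq[String] = {
    val term = word.toLowerCase // StandardAnalyzer lowercases tokens at index time
    val high = new TermQuery(new Term("content-high", term))
    high.setBoost(2.0f) // query-time boost; the value is an arbitrary choice
    val low = new TermQuery(new Term("content-low", term))

    val query = new BooleanQuery()
    query.add(high, BooleanClause.Occur.SHOULD)
    query.add(low, BooleanClause.Occur.SHOULD)

    val reader = IndexReader.open(dir)
    val searcher = new IndexSearcher(reader)
    try searcher.search(query, 10).scoreDocs.map(sd => searcher.doc(sd.doc).get("id")).toSeq
    finally { searcher.close(); reader.close() }
  }

  def main(args: Array[String]): Unit = {
    val dir = buildIndex()
    println("Linux:   " + search(dir, "Linux").mkString(", "))
    println("Windows: " + search(dir, "Windows").mkString(", "))
  }
}
```

Because the boost is attached to the query clause for content-high rather than to an indexed field, it only takes effect when the match is in that field. Searching for Linux should therefore rank document 2 first, while searching for Windows matches all three documents through content-low with equal scores and leaves them in index order. To inspect how a score was computed, IndexSearcher.explain(query, docId) prints the per-clause breakdown.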