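In a real project I had to meet the following search requirement: take the keyword the user enters, run an analyzed (tokenized) match against several fields, and sort the hits by each document's user rating and favorites count, while guaranteeing that an exact match appears first. The exactly matching document does not necessarily have the highest rating or favorites count, so sorting by those two values alone will not put it in first place. Relying on the Elasticsearch relevance score alone does give the exact match the highest score, but then the rest of the ordering is wrong. My approach is to change the sort order: score first, then user rating, then favorites count. The rating and favorites count are original data that must not be tampered with, so the natural move is to change the scoring rule instead: artificially give the exact match the highest score and give every other document a score of 0. That achieves the goal.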
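This calls for the Function Score features of Elasticsearch. I have written about them before; this post focuses on how to apply Filter and Weight to get the behavior described above. First, the mapping: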
GET /product_production/_mapping
{
  "product_production_v2020052902" : {
    "mappings" : {
      "properties" : {
        "BrandCName" : {
          "type" : "text",
          "fields" : {
            "PingYin" : {
              "type" : "text",
              "analyzer" : "pinyin"
            },
            "Raw" : {
              "type" : "keyword"
            }
          },
          "analyzer" : "ik_smart"
        },
        "BrandEName" : {
          "type" : "text",
          "fields" : {
            "Raw" : {
              "type" : "keyword"
            }
          }
        },
        "CategoryID" : {
          "type" : "integer"
        },
        "CategoryName" : {
          "type" : "keyword"
        },
        "CommentCount" : {
          "type" : "integer"
        },
        "CreateTime" : {
          "type" : "date",
          "format" : "yyyy-MM-dd HH:mm:ss || yyyy-MM-dd ||epoch_millis"
        },
        "IsTryMakeup" : {
          "type" : "boolean"
        },
        "ProductCName" : {
          "type" : "text",
          "fields" : {
            "PingYin" : {
              "type" : "text",
              "analyzer" : "pinyin"
            },
            "Raw" : {
              "type" : "keyword"
            }
          },
          "analyzer" : "ik_smart"
        },
        "ProductEName" : {
          "type" : "text",
          "fields" : {
            "Raw" : {
              "type" : "keyword"
            }
          }
        },
        "ProductID" : {
          "type" : "keyword"
        },
        "ProductImage" : {
          "type" : "keyword"
        },
        "ProductKeyNO" : {
          "type" : "keyword"
        },
        "Ranks" : {
          "type" : "float"
        },
        "categoryName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
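The requirement is: given a keyword, find related documents through ProductCName, ProductEName, BrandCName, and BrandEName, then sort by Ranks, CommentCount, and CreateTime, with the exact match placed first. Here is the search DSL I built: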
GET /product_production/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query": "明星挚爱哑光唇膏",
                "fields": [
                  "BrandCName^8.0",
                  "BrandEName^2.0",
                  "ProductCName^10.0",
                  "ProductEName^2.0"
                ],
                "type": "best_fields",
                "operator": "OR",
                "slop": 0,
                "prefix_length": 0,
                "max_expansions": 50,
                "zero_terms_query": "NONE",
                "auto_generate_synonyms_phrase_query": true,
                "fuzzy_transpositions": true,
                "boost": 1
              }
            }
          ],
          "adjust_pure_negative": true,
          "boost": 1
        }
      },
      "functions": [
        {
          "filter": {
            "bool": {
              "should": [
                {
                  "term": {
                    "ProductCName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "ProductEName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "BrandCName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "BrandEName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                }
              ],
              "adjust_pure_negative": true,
              "boost": 1
            }
          },
          "weight": 1000
        },
        {
          "filter": {
            "bool": {
              "must_not": [
                {
                  "term": {
                    "ProductCName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "ProductEName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "BrandCName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                },
                {
                  "term": {
                    "BrandEName.Raw": {
                      "value": "明星挚爱哑光唇膏",
                      "boost": 1
                    }
                  }
                }
              ],
              "adjust_pure_negative": true,
              "boost": 1
            }
          },
          "weight": 0
        }
      ],
      "score_mode": "multiply",
      "max_boost": 3.4028235e+38,
      "boost": 1
    }
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "Ranks": {
        "order": "desc"
      }
    },
    {
      "CommentCount": {
        "order": "desc"
      }
    },
    {
      "CreateTime": {
        "order": "desc"
      }
    }
  ]
}
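To make the effect of the two weighted filters concrete, here is a small plain-Java simulation of the ranking this request should produce. It is only a sketch under stated assumptions: the class ExactMatchRankingSketch, the Hit type, and the sample scores are all made up for illustration; the exact-match check looks at a single name instead of the four Raw fields; and CreateTime is left out of the tie-breakers. It relies on the query score being multiplied by the function weight, which is what score_mode multiply together with the default boost_mode gives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ExactMatchRankingSketch {

    /** Hypothetical stand-in for one hit; queryScore plays the role of the multi_match score. */
    static final class Hit {
        final String productCName;
        final double queryScore;
        final double ranks;
        final int commentCount;

        Hit(String productCName, double queryScore, double ranks, int commentCount) {
            this.productCName = productCName;
            this.queryScore = queryScore;
            this.ranks = ranks;
            this.commentCount = commentCount;
        }
    }

    /** Weight chosen by the two filter functions: 1000 for an exact match, 0 otherwise. */
    static double weight(Hit hit, String keyword) {
        return hit.productCName.equals(keyword) ? 1000.0 : 0.0;
    }

    /** Final score = query score * function weight (multiply). */
    static double finalScore(Hit hit, String keyword) {
        return hit.queryScore * weight(hit, keyword);
    }

    /** Sorts a copy of the hits by final score, then Ranks, then CommentCount, all descending. */
    static List<Hit> rank(List<Hit> hits, String keyword) {
        List<Hit> sorted = new ArrayList<>(hits);
        Comparator<Hit> byScore = Comparator.comparingDouble(h -> finalScore(h, keyword));
        Comparator<Hit> byRanks = Comparator.comparingDouble(h -> h.ranks);
        Comparator<Hit> byComments = Comparator.comparingInt(h -> h.commentCount);
        sorted.sort(byScore.reversed()
                .thenComparing(byRanks.reversed())
                .thenComparing(byComments.reversed()));
        return sorted;
    }

    public static void main(String[] args) {
        String keyword = "明星挚爱哑光唇膏";
        List<Hit> hits = Arrays.asList(
                new Hit("哑光唇膏", 3.2, 4.9, 800),
                new Hit("明星挚爱哑光唇膏", 9.5, 4.1, 120),
                new Hit("挚爱唇膏", 4.0, 4.9, 950));
        for (Hit h : rank(hits, keyword)) {
            System.out.println(h.productCName + " score=" + finalScore(h, keyword));
        }
    }
}
```

In this sample the exact match has the lowest Ranks and CommentCount, yet it is the only hit with a non-zero score, so it sorts first; the other two tie at score 0 and fall back to Ranks and then CommentCount.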
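multi_match handles the multi-field matching with per-field boosts. Inside function_score, filter functions built from bool queries with should and must_not give an exact match a high score and every other hit a score of 0: because the weight is multiplied into the query score, an exact match gets its query score times 1000 and everything else gets 0. The results are then sorted by score, Ranks, CommentCount, and CreateTime. Now let's look at the Java implementation.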
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortOrder;

public PagedResult<ProductEntity> selectProductByKeyword(String productDataName, String keyWord, int pageSize, int currentPage) {
    SearchRequest searchRequest = new SearchRequest(productDataName);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
            .from((currentPage - 1) * pageSize)
            .size(pageSize);

    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    // Without a keyword the empty bool query is used as-is and matches every document.
    QueryBuilder query = boolQueryBuilder;
    if (keyWord != null) {
        // Analyzed match across the four name fields, with per-field boosts.
        Map<String, Float> fields = new HashMap<>();
        fields.put("ProductCName", 10.0f);
        fields.put("ProductEName", 2.0f);
        fields.put("BrandCName", 8.0f);
        fields.put("BrandEName", 2.0f);
        boolQueryBuilder.should(QueryBuilders.multiMatchQuery(keyWord).fields(fields));
        if (!this.IsChinese(keyWord)) {
            // Non-Chinese input may be pinyin, so also match the pinyin subfields.
            boolQueryBuilder.should(QueryBuilders.matchQuery("ProductCName.PingYin", keyWord));
            boolQueryBuilder.should(QueryBuilders.matchQuery("BrandCName.PingYin", keyWord));
        }

        // termQuery rejects a null value, so the exact-match filters are built only when a keyword exists.
        String[] rawFields = {"ProductCName.Raw", "ProductEName.Raw", "BrandCName.Raw", "BrandEName.Raw"};
        BoolQueryBuilder shouldEqual = QueryBuilders.boolQuery();
        BoolQueryBuilder mustNot = QueryBuilders.boolQuery();
        for (String field : rawFields) {
            shouldEqual.should(QueryBuilders.termQuery(field, keyWord));
            mustNot.mustNot(QueryBuilders.termQuery(field, keyWord));
        }
        FunctionScoreQueryBuilder.FilterFunctionBuilder[] filtersFns = {
            // Exact match on any Raw field: weight 1000, the same value as in the DSL above.
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                    shouldEqual, ScoreFunctionBuilders.weightFactorFunction(1000)),
            // No exact match on any Raw field: weight 0, so the final score is 0.
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                    mustNot, ScoreFunctionBuilders.weightFactorFunction(0))
        };
        query = QueryBuilders.functionScoreQuery(boolQueryBuilder, filtersFns);
    }

    // Score first, then the untouched business fields as tie-breakers.
    searchSourceBuilder
            .sort("_score", SortOrder.DESC)
            .sort("Ranks", SortOrder.DESC)
            .sort("CommentCount", SortOrder.DESC)
            .sort("CreateTime", SortOrder.DESC)
            .query(query);
    searchRequest.source(searchSourceBuilder);
    System.out.println(searchSourceBuilder); // prints the generated DSL for debugging

    SearchResponse response = null;
    try {
        response = client.search(searchRequest, RequestOptions.DEFAULT);
    } catch (IOException e) {
        // On failure response stays null; returnProductEntity has to handle that case.
        e.printStackTrace();
    }
    return returnProductEntity(response, pageSize);
}
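The method above calls this.IsChinese(keyWord), which this post does not show. Below is a minimal sketch of one way such a check could be written, assuming it should return true when the keyword contains at least one Han character; the actual rule in the project may well differ (for example, it might require every character to be Chinese). The class name KeywordScriptSketch and the lower-case method name are hypothetical.

```java
final class KeywordScriptSketch {

    /**
     * Assumed behavior: true when the keyword contains at least one Han (Chinese) character.
     * A null keyword is treated as not Chinese.
     */
    static boolean isChinese(String keyword) {
        if (keyword == null) {
            return false;
        }
        // Iterate over code points so characters outside the BMP are classified correctly.
        return keyword.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }
}
```

With this rule, a keyword such as "mingxing" counts as non-Chinese and would trigger the extra pinyin matches, while "明星挚爱哑光唇膏" would skip them.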
This is an original article by Lokie.Wang. You do not need to contact me before reposting, but please credit the lokie blog: http://lokie.wang