feat: [2.6] struct element hybrid search design and impl#50369
feat: [2.6] struct element hybrid search design and impl#50369zhengbuqian wants to merge 13 commits into
Conversation
issue: milvus-io#42148 design doc: docs/design-docs/design_docs/20260602-struct_hybrid_search.md (cherry picked from commit 2f3f19e) Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
|
[ci-v2-notice] To rerun ci-v2 checks, comment with:
If you have any questions or requests, please contact @zhikunyao. |
| Topks: append([]int64(nil), data.GetTopks()...), | ||
| FieldsData: data.GetFieldsData(), | ||
| Scores: append([]float32(nil), data.GetScores()...), | ||
| Ids: &schemapb.IDs{IdField: &schemapb.IDs_StrId{StrId: &schemapb.StringArray{Data: keys}}}, |
There was a problem hiding this comment.
prepareElementLevelHybridResult always emits IDs_StrId, but the reranker binds its generic T to the collection PK type and processOneSearchData does an unchecked col.ids.([]T) (rrf_function.go:77). On an int64-PK collection T is int64, so asserting the synthetic string-ID slice to []int64 is an invalid type assertion with no PK-type guard, panicking on every element-level hybrid search over an int64-PK collection (the common case). Existing tests miss this because they bypass the real reranker; fix by emitting IDs that match the PK type, or converting/guarding before the reranker.
| } | ||
|
|
||
| annsField := typeutil.GetField(t.schema.CollectionSchema, t.FieldId) | ||
| if annsField != nil && annsField.GetDataType() == schemapb.DataType_ArrayOfVector { |
There was a problem hiding this comment.
parseSearchInfo still contains a stale guard (search_util.go:303-317) that rejects any ArrayOfVector anns-field using radius/group_by/iterator with the old "embedding list" message before the placeholder type is known. Because it runs inside tryGeneratePlan->parseSearchInfo, it preempts the new placeholder-aware validation in both initSearchRequest (single search, line 806) and the per-sub-request loop in initAdvancedSearchRequest (hybrid, line 452), and since GetFieldByName resolves struct sub-fields it also fires for struct element/emb-list searches. The PR's own new tests fail as a result — 'element-level range search should succeed' and 'element-level iterator v2 should succeed' get rejected, and the hybrid range/iterator cases match the old message instead of the new '...in hybrid search' one — consistent with the red ci-v2/ut-go check. master deleted this guard and moved the checks into task_search.go; port that by removing or making the search_util.go guard placeholder-aware.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 2.6 #50369 +/- ##
==========================================
+ Coverage 76.99% 77.72% +0.72%
==========================================
Files 1700 2000 +300
Lines 262533 329431 +66898
==========================================
+ Hits 202142 256042 +53900
- Misses 53550 65509 +11959
- Partials 6841 7880 +1039
🚀 New features to boost your workflow:
|
| return merr.WrapErrParameterInvalid("", "", | ||
| "range search is not supported for vector array ("+searchKind+") fields in hybrid search, fieldName:"+annsField.GetName()) | ||
| } | ||
| if t.rankParams.GetGroupByFieldId() > 0 || len(t.rankParams.GetGroupByFieldIds()) > 0 { |
There was a problem hiding this comment.
Line 490 calls t.rankParams.GetGroupByFieldIds(), but the rankParams type only defines the singular groupByFieldId field and GetGroupByFieldId() accessor — no plural accessor exists anywhere in the repo. This is an undefined-method compile error that prevents internal/proxy (and the binary) from building. Drop the || len(t.rankParams.GetGroupByFieldIds()) > 0 clause and rely on the singular GetGroupByFieldId() > 0.
| "legacy search iterator is not supported for element-level search on embedding list fields; use search iterator v2") | ||
| } | ||
|
|
||
| groupByFieldIDs := queryInfo.GetGroupByFieldIds() |
There was a problem hiding this comment.
Line 861 calls queryInfo.GetGroupByFieldIds(), but planpb.QueryInfo only has the singular GroupByFieldId field (proto field 6) with accessor GetGroupByFieldId() — there is no plural field or method in this 2.6 proto. This is a second undefined-method compile error in internal/proxy. Use the singular GetGroupByFieldId(); the following two lines already fall back to it.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
| if t.rankParams.GetGroupByFieldId() > 0 { | ||
| return merr.WrapErrParameterInvalid("", "", | ||
| "group by search is not supported for vector array (embedding list) fields in hybrid search, fieldName:"+annsField.GetName()) | ||
| if err := validateElementCollapseMetricType(collapseConfig, queryInfo.GetMetricType()); err != nil { |
There was a problem hiding this comment.
In initAdvancedSearchRequest (task_search.go:467), validateElementCollapseMetricType is passed queryInfo.GetMetricType(), which is empty whenever the caller omits metric_type and relies on the index-resolved metric (parseSearchInfo leaves it "" at search_util.go:232). Because PositivelyRelated("") is false, a valid element_scope collapse with sum/topk_sum on an IP/COSINE field is rejected with 'only supported for positively related metrics'. The runtime collapse stage already re-validates against the metric resolved from results (search_pipeline.go: 'the metrictype in the request may be empty, it can only be obtained from the result'), so this proxy-side check is premature — validate against the resolved metric or drop it. Note the defect is in initAdvancedSearchRequest, not initSearchRequest, and metric_type is not schema-defaulted.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
| } | ||
| for _, field := range cloned.GetFields() { | ||
| if field.GetIsPrimaryKey() { | ||
| field.DataType = schemapb.DataType_VarChar |
There was a problem hiding this comment.
elementLevelHybridRerankSchema clones the schema and flips the primary key field's DataType to VarChar (task_search.go:651). newRerankBase derives both pkType and the rerank input-field types from that cloned schema, so when a decay reranker's input field is the primary key itself, a numeric PK is read as VarChar and newDecayFunction rejects it via its else branch at decay_function.go:96 ("only support numeric field"). The same decay-over-numeric-PK configuration is accepted on the normal row-level hybrid path, so element-level hybrid behaves inconsistently. This only triggers when the decay input field is the PK; if that config should be supported, classify the decay input against the PK's real type rather than the flipped VarChar.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: zhengbuqian The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
| }) | ||
|
|
||
| t.Run("regular vector advanced controls should succeed", func(t *testing.T) { | ||
| task := makeTask("regular_vec", commonpb.PlaceholderType_FloatVector, rangeParams, true, false, "scalar_field") |
There was a problem hiding this comment.
This case calls makeTask("regular_vec", …, rangeParams, /*withIterator=*/true, false, /*groupByField=*/"scalar_field") then asserts initSearchRequest succeeds, but it enables iterator and group-by at the same time. parseSearchInfo (reached via initSearchRequest → tryGeneratePlan) hits the pre-existing guard if isIterator && groupByFieldId > 0 { return ...WrapErrParameterInvalid(..., "Not allowed to do groupBy when doing iteration") } before any ArrayOfVector validation runs, so the call returns an error and assert.NoError fails deterministically on every CI run with the proxy test environment available. Split this into separate range / iterator / group-by regular-vector cases like TestSearchTask_ArrayOfVectorHybridSearch does, or drop one of the conflicting withIterator/groupByField controls.
|
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
You can set milestone by commenting: Use /refresh-label to update related check and label manually |
|
/ci-rerun-ut |
|
/ci-rerun-e2e-default |
|
/ci-rerun-ut-go |
|
/ci-rerun-e2e-default |
|
/ci-rerun-gosdk |
|
/ci-rerun-ut-go |
|
/ci-rerun-e2e-default |
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
Use /refresh-label to update related check and label manually |
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
Use /refresh-label to update related check and label manually |
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
Use /refresh-label to update related check and label manually |
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
Use /refresh-label to update related check and label manually |
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
|
[INFO] PR Label Summary by Default
Use /refresh-label to update related check and label manually |
issue: #42148
pr: #50243
design doc: docs/design-docs/design_docs/20260602-struct_hybrid_search.md