Skip to content

fix: [2.5] self-heal compaction segment positions and add L0 force-select bypass#48909

Merged
sre-ci-robot merged 2 commits into
milvus-io:2.5from
XuanYang-cn:cp25/silent-delete-loss
Apr 23, 2026
Merged

fix: [2.5] self-heal compaction segment positions and add L0 force-select bypass#48909
sre-ci-robot merged 2 commits into
milvus-io:2.5from
XuanYang-cn:cp25/silent-delete-loss

Conversation

@XuanYang-cn

Copy link
Copy Markdown
Contributor

Compaction inherits StartPosition/DmlPosition from source segments via getMinPosition without recalculating from actual data. The import position bug (PR #47276) wrote wrong timestamps on imported segments, and these wrong positions persist and propagate through compaction. L0 compaction then misses L1/L2 segments due to StartPosition mismatches, causing zombie L0 segments and silent delete loss.

There is also a latent bug: DmlPosition in mix/clustering compaction uses getMinPosition, but DmlPosition represents the latest entity timestamp and should use max.

See also: #46435
pr: #48907

@sre-ci-robot sre-ci-robot added the size/XL Denotes a PR that changes 500-999 lines. label Apr 10, 2026
@mergify mergify Bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels Apr 10, 2026
@sre-ci-robot sre-ci-robot added do-not-merge/need-merge-master-first any pr merge to release branch need to merge master first do-not-merge/need-milestone generate by v2-label-manager labels Apr 10, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

[WARNING] Milestone not set

You can set milestone by commenting:
/set-milestone
Example:
/set-milestone 2.5.0

Use /refresh-label to update related check and label manually

@sre-ci-robot

Copy link
Copy Markdown
Contributor

[ci-v2-notice]
Notice: New ci-v2 system is enabled for this PR.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-build-all // for ci-v2/build-all (multi-arch builds)
  • /ci-rerun-buildenv // for ci-v2/build-env (build milvus-env builder images)
  • /ci-rerun-ut-integration // for ci-v2/ut-integration, will rerun ci-v2/build
  • /ci-rerun-ut-go // for ci-v2/ut-go, will rerun ci-v2/build
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp, will rerun ci-v2/build
  • /ci-rerun-e2e-default // for ci-v2/e2e-default
  • /ci-rerun-e2e-amd // for ci-v2/e2e-amd (e2e pool dispatcher)
  • /ci-rerun-build-ut-cov // for ci-v2/build-ut-cov (build + unit tests in one pipeline)
  • /ci-rerun-gosdk // for ci-v2/go-sdk (Go SDK E2E tests, ARM)

If you have any questions or requests, please contact @zhikunyao.

@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

[WARNING] Milestone not set

You can set milestone by commenting:
/set-milestone
Example:
/set-milestone 2.5.0

Use /refresh-label to update related check and label manually

@codecov

codecov Bot commented Apr 10, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.05%. Comparing base (4204fb1) to head (268177c).
⚠️ Report is 1 commits behind head on 2.5.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##              2.5   #48909       +/-   ##
===========================================
+ Coverage   58.41%   82.05%   +23.63%     
===========================================
  Files        1470     1589      +119     
  Lines      239133   249573    +10440     
===========================================
+ Hits       139698   204791    +65093     
+ Misses      92387    38738    -53649     
+ Partials     7048     6044     -1004     
Components Coverage Δ
Client 78.92% <ø> (∅)
Core 84.51% <ø> (+<0.01%) ⬆️
Go 82.38% <100.00%> (+34.28%) ⬆️
Files with missing lines Coverage Δ
internal/datacoord/compaction_l0_view.go 91.05% <100.00%> (+21.95%) ⬆️
internal/datacoord/meta.go 90.18% <100.00%> (+31.02%) ⬆️
pkg/util/paramtable/component_param.go 98.74% <100.00%> (+0.87%) ⬆️

... and 934 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sre-ci-robot sre-ci-robot added the low-code-coverage add test-label from zhikun, diff coverage > 80% label Apr 10, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

[WARNING] Milestone not set

You can set milestone by commenting:
/set-milestone
Example:
/set-milestone 2.5.0

Use /refresh-label to update related check and label manually

@sre-ci-robot sre-ci-robot added low-code-coverage add test-label from zhikun, diff coverage > 80% and removed low-code-coverage add test-label from zhikun, diff coverage > 80% labels Apr 14, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

[WARNING] Milestone not set

You can set milestone by commenting:
/set-milestone
Example:
/set-milestone 2.5.0

Use /refresh-label to update related check and label manually

@sre-ci-robot sre-ci-robot removed the low-code-coverage add test-label from zhikun, diff coverage > 80% label Apr 14, 2026
@yanliang567 yanliang567 added this to the 2.5.28 milestone Apr 15, 2026
@mergify

mergify Bot commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

@XuanYang-cn Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

@mergify mergify Bot added needs-dco DCO is missing in this pull request. and removed dco-passed DCO check passed. labels Apr 21, 2026
@sre-ci-robot sre-ci-robot removed the do-not-merge/need-milestone generate by v2-label-manager label Apr 21, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

Use /refresh-label to update related check and label manually

@sre-ci-robot sre-ci-robot added low-code-coverage add test-label from zhikun, diff coverage > 80% and removed low-code-coverage add test-label from zhikun, diff coverage > 80% labels Apr 21, 2026
@XuanYang-cn XuanYang-cn force-pushed the cp25/silent-delete-loss branch from 53b78cc to d95a1b7 Compare April 21, 2026 14:47
@mergify mergify Bot added dco-passed DCO check passed. and removed needs-dco DCO is missing in this pull request. labels Apr 21, 2026
…lect bypass

Compaction inherits StartPosition/DmlPosition from source segments via
getMinPosition without recalculating from actual data. The import position
bug (PR milvus-io#47276) wrote wrong timestamps on imported segments, and these
wrong positions persist and propagate through compaction. L0 compaction
then misses L1/L2 segments due to StartPosition mismatches, causing
zombie L0 segments and silent delete loss.

There is also a latent bug: DmlPosition in mix/clustering compaction
uses getMinPosition, but DmlPosition represents the latest entity
timestamp and should use max.

See also: milvus-io#46435
pr: milvus-io#48907

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
@XuanYang-cn XuanYang-cn force-pushed the cp25/silent-delete-loss branch from d95a1b7 to 75bd3b8 Compare April 21, 2026 14:49
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

Use /refresh-label to update related check and label manually

1 similar comment
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

Use /refresh-label to update related check and label manually

@mergify

mergify Bot commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

@XuanYang-cn go-sdk check failed, comment rerun go-sdk can trigger the job again.

…estGrowingSegmentPos > realL0DmlTs

buildView() set earliestGrowingSegmentPos.Timestamp == realL0DmlTs (5000),
so Trigger()'s dmlPos < earliestGrowingSegmentPos filter always produced
an empty validSegments slice and Trigger() returned nil.  The subsequent
s.Require().NotNil aborted both subtests before selectSealedSegment() was
ever called, leaving the SelectSegments mock expectation unmet.

Fix: use Timestamp=10000 for earliestGrowingSegmentPos — strictly greater
than realL0DmlTs (5000) so L0 segments pass the Trigger() filter, and
strictly less than highPosSeg.StartPosition (20000) so the flag-off path
(task.Pos=10000) still excludes highPosSeg while the flag-on path
(task.Pos=MaxUint64) includes it.

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[FAILED] PR #48907 not merged

Use /refresh-label to update related check and label manually

@XuanYang-cn

Copy link
Copy Markdown
Contributor Author

/ci-rerun-ut-go

@mergify mergify Bot added the ci-passed label Apr 22, 2026
@tedxu

tedxu commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

/lgtm
/approve

@sre-ci-robot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tedxu, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot added approved and removed do-not-merge/need-merge-master-first any pr merge to release branch need to merge master first labels Apr 23, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[INFO] PR Label Summary by Default
[SUCCESS] PR #48907 merged to master

Use /refresh-label to update related check and label manually

@sre-ci-robot sre-ci-robot merged commit 2b5074b into milvus-io:2.5 Apr 23, 2026
17 of 21 checks passed
sre-ci-robot pushed a commit that referenced this pull request Apr 28, 2026
See also: #46435, #48907, #48909

pr: #47154

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved ci-passed dco-passed DCO check passed. kind/bug Issues or changes related a bug lgtm size/XL Denotes a PR that changes 500-999 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants