fix(jobsdb): pq 42P01 error when trying to store to a dataset that has been dropped#7041

Merged
atzoum merged 1 commit into master from fix.jobsdbStorePanic
Jun 3, 2026

Conversation

@atzoum atzoum commented Jun 2, 2026

Contributor

Description

Since #6963, Store operates against a snapshot of the dataset list. In the rare case where a compaction drops the previous rightmost dataset between the moment the snapshot is taken and the moment the store transaction executes, the COPY INTO fails with PostgreSQL error 42P01 (undefined table). This error was treated as unrecoverable, causing the server to panic.

Fix: Treat 42P01 (undefined table) the same as RS001 (readonly trigger): return ErrStaleDsList so inStoreSafeCtx refreshes the dataset list and retries on the current last dataset. Set a maximum number of stale ds list retries to avoid infinite loops in test scenarios (default: 3).
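
For illustration only, the error classification and bounded retry described above might be sketched as follows. This is not the code from this PR: `pgError`, `classifyStoreErr`, and `storeWithRetry` are hypothetical names, and the driver error is replaced by a stand-in type so the snippet runs without external dependencies. Only `ErrStaleDsList`, the two error codes, and the default of 3 retries come from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStaleDsList is named in the PR description; this definition is an assumption.
var ErrStaleDsList = errors.New("stale dataset list")

// pgError is a hypothetical stand-in for a driver error that exposes a SQLSTATE code.
type pgError struct{ Code string }

func (e *pgError) Error() string { return "pg error " + e.Code }

const (
	codeUndefinedTable  = "42P01" // PostgreSQL undefined_table
	codeReadonlyTrigger = "RS001" // raised by the readonly trigger, per the description
)

// classifyStoreErr maps both "dataset is no longer writable" codes to ErrStaleDsList
// and returns every other error unchanged.
func classifyStoreErr(err error) error {
	var pe *pgError
	if errors.As(err, &pe) && (pe.Code == codeUndefinedTable || pe.Code == codeReadonlyTrigger) {
		return ErrStaleDsList
	}
	return err
}

// storeWithRetry runs store and, on ErrStaleDsList, calls refresh and tries again,
// giving up once maxRetries retries have been used.
func storeWithRetry(maxRetries int, refresh func(), store func() error) error {
	for attempt := 0; ; attempt++ {
		err := classifyStoreErr(store())
		if !errors.Is(err, ErrStaleDsList) {
			return err // success (nil) or an unrelated error
		}
		if attempt >= maxRetries {
			return fmt.Errorf("giving up after %d stale ds list retries: %w", maxRetries, err)
		}
		refresh() // hypothetical hook: reload the dataset list before retrying
	}
}

func main() {
	calls := 0
	err := storeWithRetry(3, func() {}, func() error {
		calls++
		if calls == 1 {
			// First attempt targets a dataset that was dropped in the meantime.
			return &pgError{Code: codeUndefinedTable}
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

With a cap in place, a store that keeps hitting a stale list fails with a wrapped `ErrStaleDsList` after the initial attempt plus the allowed retries, rather than looping indefinitely.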

Linear Ticket

resolves PIPE-3080

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@atzoum atzoum force-pushed the fix.jobsdbStorePanic branch 2 times, most recently from a367f90 to 8976201 Compare June 2, 2026 14:47
codecov Bot commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 72.72727% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.67%. Comparing base (0dcf2b4) to head (1510d64).

Files with missing lines Patch % Lines
jobsdb/jobsdb.go 72.72% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7041      +/-   ##
==========================================
- Coverage   79.77%   79.67%   -0.11%     
==========================================
  Files         566      566              
  Lines       64321    64328       +7     
==========================================
- Hits        51312    51252      -60     
- Misses       9964    10022      +58     
- Partials     3045     3054       +9     

@atzoum atzoum force-pushed the fix.jobsdbStorePanic branch from 8976201 to 1510d64 Compare June 3, 2026 06:08
Comment thread jobsdb/jobsdb.go
@atzoum atzoum requested a review from mihir20 June 3, 2026 06:56
@atzoum atzoum merged commit 9db406d into master Jun 3, 2026
108 of 111 checks passed
@atzoum atzoum deleted the fix.jobsdbStorePanic branch June 3, 2026 07:35
This was referenced Jun 8, 2026
itsmihir pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.77.0-rc.1](v1.76.0...v1.77.0-rc.1)
(2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
This was referenced Jun 8, 2026
atzoum pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.77.0-rc.2](v1.76.0...v1.77.0-rc.2)
(2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* **jobsdb:** stale table count gauge after dropping dataset async
([#7063](#7063))
([4abb293](4abb293))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
This was referenced Jun 8, 2026
atzoum pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.77.0-rc.3](v1.76.0...v1.77.0-rc.3)
(2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** jobsdb_lock_total_time for dsMigrationLock missing
([#7066](#7066))
([7e00370](7e00370))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* **jobsdb:** stale table count gauge after dropping dataset async
([#7063](#7063))
([4abb293](4abb293))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
itsmihir pushed a commit that referenced this pull request Jun 9, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.77.0](v1.76.0...v1.77.0)
(2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** jobsdb_lock_total_time for dsMigrationLock missing
([#7066](#7066))
([7e00370](7e00370))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* **jobsdb:** stale table count gauge after dropping dataset async
([#7063](#7063))
([4abb293](4abb293))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>