chore: capture scheduler lag in stats #7017

Merged: atzoum merged 1 commit into master from chore.schedulerLagStat on Jun 3, 2026

atzoum (Contributor) commented May 28, 2026

Description

Captures Go scheduler lag as a metric to help detect CPU contention and scheduling bottlenecks. Uses a SchedulerLagCollector from rudder-go-kit with a configurable sampling interval (default 100ms).

Also bumps several dependencies including rudder-go-kit to v0.76.2 and rudder-observability-kit to v0.0.7.

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

atzoum changed the title from "chore: capture scheduler lag" to "chore: capture scheduler lag stat" on May 28, 2026
atzoum changed the title from "chore: capture scheduler lag stat" to "chore: capture scheduler lag in stats" on May 28, 2026
codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.11%. Comparing base (9e8a0a8) to head (df8e650).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7017      +/-   ##
==========================================
+ Coverage   79.74%   80.11%   +0.37%     
==========================================
  Files         566      566              
  Lines       64318    64321       +3     
==========================================
+ Hits        51288    51532     +244     
+ Misses       9978     9716     -262     
- Partials     3052     3073      +21     


atzoum force-pushed the chore.schedulerLagStat branch from b8536af to 6fdc6e0 on May 28, 2026 14:47
atzoum force-pushed the chore.schedulerLagStat branch from 6fdc6e0 to df8e650 on June 2, 2026 13:21
atzoum merged commit 0dcf2b4 into master on Jun 3, 2026
73 checks passed
atzoum deleted the chore.schedulerLagStat branch on June 3, 2026 06:05
This was referenced Jun 8, 2026
itsmihir pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---


## [1.77.0-rc.1](v1.76.0...v1.77.0-rc.1) (2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
This was referenced Jun 8, 2026
atzoum pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---

## [1.77.0-rc.2](v1.76.0...v1.77.0-rc.2) (2026-06-08)

The Features and Miscellaneous entries are the same as in 1.77.0-rc.1 above, including "capture scheduler lag in stats" ([#7017](#7017)) ([0dcf2b4](0dcf2b4)). The Bug Fixes are those of 1.77.0-rc.1 plus:

* **jobsdb:** stale table count gauge after dropping dataset async ([#7063](#7063)) ([4abb293](4abb293))

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
This was referenced Jun 8, 2026
atzoum pushed a commit that referenced this pull request Jun 8, 2026
🤖 I have created a release *beep* *boop*
---

## [1.77.0-rc.3](v1.76.0...v1.77.0-rc.3) (2026-06-08)

The Features and Miscellaneous entries are the same as in 1.77.0-rc.1 above, including "capture scheduler lag in stats" ([#7017](#7017)) ([0dcf2b4](0dcf2b4)). The Bug Fixes are those of 1.77.0-rc.1 plus:

* **jobsdb:** jobsdb_lock_total_time for dsMigrationLock missing ([#7066](#7066)) ([7e00370](7e00370))
* **jobsdb:** stale table count gauge after dropping dataset async ([#7063](#7063)) ([4abb293](4abb293))

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>
itsmihir pushed a commit that referenced this pull request Jun 9, 2026
🤖 I have created a release *beep* *boop*
---


## [1.77.0](v1.76.0...v1.77.0) (2026-06-08)


### Features

* **jobsdb:** deferred status table lock during compaction
([#7020](#7020))
([d156057](d156057))
* **jobsdb:** non-blocking dataset compaction
([#6967](#6967))
([e0838dc](e0838dc))
* **jobsdb:** versioned dataset list and non-blocking drop for completed
datasets
([#6962](#6962))
([d7fe1ae](d7fe1ae))


### Bug Fixes

* flaky cluster test
([#7003](#7003))
([d7fe1ae](d7fe1ae))
* **jobsdb:** jobsdb_lock_total_time for dsMigrationLock missing
([#7066](#7066))
([7e00370](7e00370))
* **jobsdb:** pq 42P01 error when trying to store to a dataset that has
been dropped
([#7041](#7041))
([9db406d](9db406d))
* **jobsdb:** stale table count gauge after dropping dataset async
([#7063](#7063))
([4abb293](4abb293))
* server panics with sql database is closed in case of an etcd error
([#7023](#7023))
([15dc39a](15dc39a))
* support scoped batchrouter datePrefixOverride
([#7037](#7037))
([9e8a0a8](9e8a0a8))


### Miscellaneous

* capture scheduler lag in stats
([#7017](#7017))
([0dcf2b4](0dcf2b4))
* **deps:** bump actions/create-github-app-token from 3.0.0 to 3.1.1
([#6969](#6969))
([ee7f735](ee7f735))
* **deps:** bump actions/create-github-app-token from 3.1.1 to 3.2.0
([#7013](#7013))
([ac5681b](ac5681b))
* **deps:** bump actions/labeler from 5.0.0 to 6.1.0
([#6970](#6970))
([e748b86](e748b86))
* **deps:** bump actions/stale from 10.2.0 to 10.3.0
([#7031](#7031))
([6cf6804](6cf6804))
* **deps:** bump aws-actions/amazon-ecr-login from 2.1.4 to 2.1.5
([#6972](#6972))
([e38a99d](e38a99d))
* **deps:** bump aws-actions/configure-aws-credentials from 5.0.0 to
6.1.1 ([#6971](#6971))
([1114b7f](1114b7f))
* **deps:** bump aws-actions/configure-aws-credentials from 6.1.1 to
6.1.2 ([#7049](#7049))
([e210e68](e210e68))
* **deps:** bump codecov/codecov-action from 6.0.0 to 6.0.1
([#7016](#7016))
([3d639f5](3d639f5))
* **deps:** bump docker/login-action from 4.1.0 to 4.2.0
([#7035](#7035))
([b39183b](b39183b))
* **deps:** bump docker/metadata-action from 6.0.0 to 6.1.0
([#7032](#7032))
([574f7c8](574f7c8))
* **deps:** bump docker/setup-buildx-action from 4.0.0 to 4.1.0
([#7034](#7034))
([ff49009](ff49009))
* **deps:** bump golangci/golangci-lint-action from 8.0.0 to 9.2.1
([#7044](#7044))
([56070dc](56070dc))
* **deps:** bump slackapi/slack-github-action from 3.0.1 to 3.0.3
([#6968](#6968))
([fd70e60](fd70e60))
* **deps:** bump step-security/harden-runner from 2.19.1 to 2.19.3
([#7012](#7012))
([b4dc9b2](b4dc9b2))
* **deps:** bump step-security/harden-runner from 2.19.3 to 2.19.4
([#7033](#7033))
([ba82100](ba82100))
* **jobsdb:** avoid refreshing ds list from database in case of a stale
ds error while updating job statuses
([#7024](#7024))
([de98ede](de98ede))
* **jobsdb:** lateral join for getting jobs
([#7007](#7007))
([7148a30](7148a30))
* remove deprecated bulkStatusEnabled flag for snowpipe streaming
([#6992](#6992))
([d7fe1ae](d7fe1ae))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: rudderstack-github-actions[bot] <236995729+rudderstack-github-actions[bot]@users.noreply.github.com>