## Summary Many teams self-host Gitea + Act Runner at scale. The current runner design causes excessive HTTP requests to the Gitea server, leading to high server load. This PR addresses three root causes: aggressive fixed-interval polling, per-task status reporting every 1 second regardless of activity, and unoptimized HTTP client configuration. ## Problem The original architecture has these issues: **1. Fixed 1-second reporting interval (RunDaemon)** - Every running task calls ReportLog + ReportState every 1 second (2 HTTP requests/sec/task) - These requests are sent even when there are no new log rows or state changes - With 200 runners × 3 tasks each = **1,200 req/sec just for status reporting** **2. Fixed 2-second polling interval (no backoff)** - Idle runners poll FetchTask every 2 seconds forever, even when no jobs are queued - No exponential backoff or jitter — all runners can synchronize after network recovery (thundering herd) - 200 idle runners = **100 req/sec doing nothing useful** **3. HTTP client not tuned** - Uses http.DefaultClient with MaxIdleConnsPerHost=2, causing frequent TCP/TLS reconnects - Creates two separate http.Client instances (one for Ping, one for Runner service) instead of sharing **Total: ~1,300 req/sec for 200 runners with 3 tasks each** ## Solution ### Adaptive Event-Driven Log Reporting Replace the recursive `time.AfterFunc(1s)` pattern in RunDaemon with a goroutine-based select event loop using three trigger mechanisms: | Trigger | Default | Purpose | |---------|---------|---------| | `log_report_max_latency` | 3s | Guarantee even a single log line is delivered within this time | | `log_report_interval` | 5s | Periodic sweep — steady-state cadence | | `log_report_batch_size` | 100 rows | Immediate flush during bursty output (e.g., npm install) | **Key design**: `log_report_max_latency` (3s) must be less than `log_report_interval` (5s) so the max-latency timer fires before the periodic ticker for single-line scenarios. State reporting is decoupled to its own `state_report_interval` (default 5s), with immediate flush on step transitions (start/stop) via a stateNotify channel for responsive frontend UX. Additionally: - Skip ReportLog when `len(rows) == 0` (no pending log rows) - Skip ReportState when `stateChanged == false && len(outputs) == 0` (nothing changed) - Move expensive `proto.Clone` after the early-return check to avoid deep copies on no-op paths ### Polling Backoff with Jitter Replace fixed `rate.Limiter` with adaptive exponential backoff: - Track `consecutiveEmpty` and `consecutiveErrors` counters - Interval doubles with each empty/error response: `base × 2^(n-1)`, capped at `fetch_interval_max` (default 60s) - Add ±20% random jitter to prevent thundering herd - Fetch first, sleep after ��� preserves burst=1 behavior for immediate first fetch on startup and after task completion ### HTTP Client Tuning - Configure custom `http.Transport` with `MaxIdleConnsPerHost=10` (was 2) - Share a single `http.Client` between PingService and RunnerService - Add `IdleConnTimeout=90s` for clean connection lifecycle ## Load Reduction For 200 runners × 3 tasks (70% with active log output): | Component | Before | After | Reduction | |-----------|--------|-------|-----------| | Polling (idle) | 100 req/s | ~3.4 req/s | 97% | | Log reporting | 420 req/s | ~84 req/s | 80% | | State reporting | 126 req/s | ~25 req/s | 80% | | **Total** | **~1,300 req/s** | **~113 req/s** | **~91%** | ## Frontend UX Impact | Scenario | Before | After | Notes | |----------|--------|-------|-------| | Continuous output (npm install) | ~1s | ~5s | Periodic ticker sweep | | Single line then silence | ~1s | ≤3s | maxLatencyTimer guarantee | | Bursty output (100+ lines) | ~1s | <1s | Batch size immediate flush | | Step start/stop | ~1s | <1s | stateNotify immediate flush | | Job completion | ~1s | ~1s | Close() retry unchanged | ## New Configuration Options All have safe defaults — existing config files need no changes: ```yaml runner: fetch_interval_max: 60s # Max backoff interval when idle log_report_interval: 5s # Periodic log flush interval log_report_max_latency: 3s # Max time a log row waits (must be < log_report_interval) log_report_batch_size: 100 # Immediate flush threshold state_report_interval: 5s # State flush interval (step transitions are always immediate) ``` Config validation warns on invalid combinations: - `fetch_interval_max < fetch_interval` → auto-corrected - `log_report_max_latency >= log_report_interval` → warning (timer would be redundant) ## Test Plan - [x] `go build ./...` passes - [x] `go test ./...` passes (all existing + 3 new tests) - [x] `golangci-lint run` — 0 issues - [x] TestReporter_MaxLatencyTimer — verifies single log line flushed by maxLatencyTimer before logTicker - [x] TestReporter_BatchSizeFlush — verifies batch size threshold triggers immediate flush - [x] TestReporter_StateNotifyFlush — verifies step transition triggers immediate state flush - [x] TestReporter_EphemeralRunnerDeletion — verifies Close/RunDaemon race safety - [x] TestReporter_RunDaemonClose_Race — verifies concurrent Close safety Reviewed-on: https://gitea.com/gitea/act_runner/pulls/819 Reviewed-by: Nicolas <173651+bircni@noreply.gitea.com> Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com> Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
act runner
Act runner is a runner for Gitea based on Gitea fork of act.
Installation
Prerequisites
Docker Engine Community version is required for docker mode. To install Docker CE, follow the official install instructions.
Download pre-built binary
Visit here and download the right version for your platform.
Build from source
make build
Build a docker image
make docker
Quickstart
Actions are disabled by default, so you need to add the following to the configuration file of your Gitea instance to enable it:
[actions]
ENABLED=true
Register
./act_runner register
And you will be asked to input:
- Gitea instance URL, like
http://192.168.8.8:3000/. You should use your gitea instance ROOT_URL as the instance argument and you should not uselocalhostor127.0.0.1as instance IP; - Runner token, you can get it from
http://192.168.8.8:3000/admin/actions/runners; - Runner name, you can just leave it blank;
- Runner labels, you can just leave it blank.
The process looks like:
INFO Registering runner, arch=amd64, os=darwin, version=0.1.5.
WARN Runner in user-mode.
INFO Enter the Gitea instance URL (for example, https://gitea.com/):
http://192.168.8.8:3000/
INFO Enter the runner token:
fe884e8027dc292970d4e0303fe82b14xxxxxxxx
INFO Enter the runner name (if set empty, use hostname: Test.local):
INFO Enter the runner labels, leave blank to use the default labels (comma-separated, for example, ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest):
INFO Registering runner, name=Test.local, instance=http://192.168.8.8:3000/, labels=[ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest ubuntu-22.04:docker://docker.gitea.com/runner-images:ubuntu-22.04 ubuntu-20.04:docker://docker.gitea.com/runner-images:ubuntu-20.04].
DEBU Successfully pinged the Gitea instance server
INFO Runner registered successfully.
You can also register with command line arguments.
./act_runner register --instance http://192.168.8.8:3000 --token <my_runner_token> --no-interactive
If the registry succeed, it will run immediately. Next time, you could run the runner directly.
Run
./act_runner daemon
Run with docker
docker run -e GITEA_INSTANCE_URL=https://your_gitea.com -e GITEA_RUNNER_REGISTRATION_TOKEN=<your_token> -v /var/run/docker.sock:/var/run/docker.sock --name my_runner gitea/act_runner:nightly
Configuration
You can also configure the runner with a configuration file.
The configuration file is a YAML file, you can generate a sample configuration file with ./act_runner generate-config.
./act_runner generate-config > config.yaml
You can specify the configuration file path with -c/--config argument.
./act_runner -c config.yaml register # register with config file
./act_runner -c config.yaml daemon # run with config file
You can read the latest version of the configuration file online at config.example.yaml.
Example Deployments
Check out the examples directory for sample deployment types.