mirror of https://gitea.com/gitea/act_runner.git synced 2026-04-23 20:30:07 +08:00

Go to file

Bo-Yi Wu f2d545565f perf: reduce runner-to-server connection load with adaptive reporting and polling (#819 )

## Summary

Many teams self-host Gitea + Act Runner at scale. The current runner design causes excessive HTTP requests to the Gitea server, leading to high server load. This PR addresses three root causes: aggressive fixed-interval polling, per-task status reporting every 1 second regardless of activity, and unoptimized HTTP client configuration.

## Problem

The original architecture has these issues:

**1. Fixed 1-second reporting interval (RunDaemon)**

- Every running task calls ReportLog + ReportState every 1 second (2 HTTP requests/sec/task)
- These requests are sent even when there are no new log rows or state changes
- With 200 runners × 3 tasks each = **1,200 req/sec just for status reporting**

**2. Fixed 2-second polling interval (no backoff)**

- Idle runners poll FetchTask every 2 seconds forever, even when no jobs are queued
- No exponential backoff or jitter — all runners can synchronize after network recovery (thundering herd)
- 200 idle runners = **100 req/sec doing nothing useful**

**3. HTTP client not tuned**

- Uses http.DefaultClient with MaxIdleConnsPerHost=2, causing frequent TCP/TLS reconnects
- Creates two separate http.Client instances (one for Ping, one for Runner service) instead of sharing

**Total: ~1,300 req/sec for 200 runners with 3 tasks each**

## Solution

### Adaptive Event-Driven Log Reporting

Replace the recursive `time.AfterFunc(1s)` pattern in RunDaemon with a goroutine-based select event loop using three trigger mechanisms:

| Trigger | Default | Purpose |
|---------|---------|---------|
| `log_report_max_latency` | 3s | Guarantee even a single log line is delivered within this time |
| `log_report_interval` | 5s | Periodic sweep — steady-state cadence |
| `log_report_batch_size` | 100 rows | Immediate flush during bursty output (e.g., npm install) |

**Key design**: `log_report_max_latency` (3s) must be less than `log_report_interval` (5s) so the max-latency timer fires before the periodic ticker for single-line scenarios.

State reporting is decoupled to its own `state_report_interval` (default 5s), with immediate flush on step transitions (start/stop) via a stateNotify channel for responsive frontend UX.

Additionally:
- Skip ReportLog when `len(rows) == 0` (no pending log rows)
- Skip ReportState when `stateChanged == false && len(outputs) == 0` (nothing changed)
- Move expensive `proto.Clone` after the early-return check to avoid deep copies on no-op paths

### Polling Backoff with Jitter

Replace fixed `rate.Limiter` with adaptive exponential backoff:
- Track `consecutiveEmpty` and `consecutiveErrors` counters
- Interval doubles with each empty/error response: `base × 2^(n-1)`, capped at `fetch_interval_max` (default 60s)
- Add ±20% random jitter to prevent thundering herd
- Fetch first, sleep after ��� preserves burst=1 behavior for immediate first fetch on startup and after task completion

### HTTP Client Tuning

- Configure custom `http.Transport` with `MaxIdleConnsPerHost=10` (was 2)
- Share a single `http.Client` between PingService and RunnerService
- Add `IdleConnTimeout=90s` for clean connection lifecycle

## Load Reduction

For 200 runners × 3 tasks (70% with active log output):

| Component | Before | After | Reduction |
|-----------|--------|-------|-----------|
| Polling (idle) | 100 req/s | ~3.4 req/s | 97% |
| Log reporting | 420 req/s | ~84 req/s | 80% |
| State reporting | 126 req/s | ~25 req/s | 80% |
| **Total** | **~1,300 req/s** | **~113 req/s** | **~91%** |

## Frontend UX Impact

| Scenario | Before | After | Notes |
|----------|--------|-------|-------|
| Continuous output (npm install) | ~1s | ~5s | Periodic ticker sweep |
| Single line then silence | ~1s | ≤3s | maxLatencyTimer guarantee |
| Bursty output (100+ lines) | ~1s | <1s | Batch size immediate flush |
| Step start/stop | ~1s | <1s | stateNotify immediate flush |
| Job completion | ~1s | ~1s | Close() retry unchanged |

## New Configuration Options

All have safe defaults — existing config files need no changes:

```yaml
runner:
  fetch_interval_max: 60s        # Max backoff interval when idle
  log_report_interval: 5s        # Periodic log flush interval
  log_report_max_latency: 3s     # Max time a log row waits (must be < log_report_interval)
  log_report_batch_size: 100     # Immediate flush threshold
  state_report_interval: 5s      # State flush interval (step transitions are always immediate)
```

Config validation warns on invalid combinations:
- `fetch_interval_max < fetch_interval` → auto-corrected
- `log_report_max_latency >= log_report_interval` → warning (timer would be redundant)

## Test Plan

- [x] `go build ./...` passes
- [x] `go test ./...` passes (all existing + 3 new tests)
- [x] `golangci-lint run` — 0 issues
- [x] TestReporter_MaxLatencyTimer — verifies single log line flushed by maxLatencyTimer before logTicker
- [x] TestReporter_BatchSizeFlush — verifies batch size threshold triggers immediate flush
- [x] TestReporter_StateNotifyFlush — verifies step transition triggers immediate state flush
- [x] TestReporter_EphemeralRunnerDeletion — verifies Close/RunDaemon race safety
- [x] TestReporter_RunDaemonClose_Race — verifies concurrent Close safety

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/819
Reviewed-by: Nicolas <173651+bircni@noreply.gitea.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>

2026-04-14 11:29:25 +00:00

.gitea/workflows

Semver tags for Docker images (#720 )

2026-02-25 19:09:53 +00:00

examples

examples: improve DIND rootless network performance (#786 )

2026-02-16 07:56:45 +00:00

internal

perf: reduce runner-to-server connection load with adaptive reporting and polling (#819 )

2026-04-14 11:29:25 +00:00

scripts

feat: docker env vars for ephemeral and once (#685 )

2025-05-07 15:43:05 +00:00

.editorconfig

Add .editorconfig and .gitattributes (#186 )

2023-05-13 23:51:22 +08:00

.gitattributes

Add .editorconfig and .gitattributes (#186 )

2023-05-13 23:51:22 +08:00

.gitignore

Support basic, dind and dind-rootless as multiple kinds of images (#619 )

2024-11-06 03:15:51 +00:00

.golangci.yml

chore(lint): add golangci-lint v2 and fix all lint issues (#803 )

2026-02-22 17:35:08 +00:00

.goreleaser.checksum.sh

…

.goreleaser.yaml

ci: release binary for linux/loong64 (#756 )

2025-10-21 17:42:08 +00:00

build.go

…

Dockerfile

chore(deps): bump Go version to 1.26 in Dockerfile and Makefile (#788 )

2026-02-14 10:32:16 +00:00

go.mod

perf: reduce runner-to-server connection load with adaptive reporting and polling (#819 )

2026-04-14 11:29:25 +00:00

go.sum

Upgrade yaml (#816 )

2026-03-28 16:18:47 +00:00

LICENSE

…

main.go

Refactor to new framework (#98 )

2023-04-04 21:32:04 +08:00

Makefile

chore(lint): add golangci-lint v2 and fix all lint issues (#803 )

2026-02-22 17:35:08 +00:00

README.md

use new docker image URLs (#661 )

2025-03-01 20:21:52 +00:00

renovate.json5

Add renovate config (#408 )

2023-11-23 20:41:10 +00:00

README.md

act runner

Act runner is a runner for Gitea based on Gitea fork of act.

Installation

Prerequisites

Docker Engine Community version is required for docker mode. To install Docker CE, follow the official install instructions.

Download pre-built binary

Visit here and download the right version for your platform.

Build from source

make build

Build a docker image

make docker

Quickstart

Actions are disabled by default, so you need to add the following to the configuration file of your Gitea instance to enable it:

[actions]
ENABLED=true

Register

./act_runner register

And you will be asked to input:

Gitea instance URL, like http://192.168.8.8:3000/. You should use your gitea instance ROOT_URL as the instance argument and you should not use localhost or 127.0.0.1 as instance IP;
Runner token, you can get it from http://192.168.8.8:3000/admin/actions/runners;
Runner name, you can just leave it blank;
Runner labels, you can just leave it blank.

The process looks like:

INFO Registering runner, arch=amd64, os=darwin, version=0.1.5.
WARN Runner in user-mode.
INFO Enter the Gitea instance URL (for example, https://gitea.com/):
http://192.168.8.8:3000/
INFO Enter the runner token:
fe884e8027dc292970d4e0303fe82b14xxxxxxxx
INFO Enter the runner name (if set empty, use hostname: Test.local):

INFO Enter the runner labels, leave blank to use the default labels (comma-separated, for example, ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest):

INFO Registering runner, name=Test.local, instance=http://192.168.8.8:3000/, labels=[ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest ubuntu-22.04:docker://docker.gitea.com/runner-images:ubuntu-22.04 ubuntu-20.04:docker://docker.gitea.com/runner-images:ubuntu-20.04].
DEBU Successfully pinged the Gitea instance server
INFO Runner registered successfully.

You can also register with command line arguments.

./act_runner register --instance http://192.168.8.8:3000 --token <my_runner_token> --no-interactive

If the registry succeed, it will run immediately. Next time, you could run the runner directly.

Run

./act_runner daemon

Run with docker

docker run -e GITEA_INSTANCE_URL=https://your_gitea.com -e GITEA_RUNNER_REGISTRATION_TOKEN=<your_token> -v /var/run/docker.sock:/var/run/docker.sock --name my_runner gitea/act_runner:nightly

Configuration

You can also configure the runner with a configuration file. The configuration file is a YAML file, you can generate a sample configuration file with ./act_runner generate-config.

./act_runner generate-config > config.yaml

You can specify the configuration file path with -c/--config argument.

./act_runner -c config.yaml register # register with config file
./act_runner -c config.yaml daemon # run with config file

You can read the latest version of the configuration file online at config.example.yaml.

Example Deployments

Check out the examples directory for sample deployment types.