Commit Graph

18 Commits

Author SHA1 Message Date
silverwind
48944e136c Use golangci-lint fmt to format code (#823)
Use `golangci-lint fmt` to format code, replacing the previous gofumpt-based formatter. https://github.com/daixiang0/gci is used to order the imports.

Also drops the `gitea-vet` dependency since `gci` now handles import ordering.

Mirrors https://github.com/go-gitea/gitea/pull/37194.

---
This PR was written with the help of Claude Opus 4.7

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/823
Reviewed-by: Nicolas <173651+bircni@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-04-17 22:05:01 +00:00
Bo-Yi Wu
f33e5a6245 feat: add Prometheus metrics endpoint for runner observability (#820)
## What

Add an optional Prometheus `/metrics` HTTP endpoint to `act_runner` so operators can observe runner health, polling behavior, job outcomes, and RPC latency without scraping logs.

New surface:

- `internal/pkg/metrics/metrics.go` — metric definitions, custom `Registry`, static Go/process collectors, label constants, `ResultToStatusLabel` helper.
- `internal/pkg/metrics/server.go` — hardened `http.Server` serving `/metrics` and `/healthz` with Slowloris-safe timeouts (`ReadHeaderTimeout` 5s, `ReadTimeout`/`WriteTimeout` 10s, `IdleTimeout` 60s) and a 5s graceful shutdown.
- `daemon.go` wires it up behind `cfg.Metrics.Enabled` (disabled by default).
- `poller.go` / `reporter.go` / `runner.go` instrument their existing hot paths with counters/histograms/gauges — no behavior change.

Metrics exported (namespace `act_runner_`):

| Subsystem | Metric | Type | Labels |
|---|---|---|---|
| — | `info` | Gauge | `version`, `name` |
| — | `capacity`, `uptime_seconds` | Gauge | — |
| `poll` | `fetch_total`, `client_errors_total` | Counter | `result` / `method` |
| `poll` | `fetch_duration_seconds`, `backoff_seconds` | Histogram / Gauge | — |
| `job` | `total` | Counter | `status` |
| `job` | `duration_seconds`, `running`, `capacity_utilization_ratio` | Histogram / GaugeFunc | — |
| `report` | `log_total`, `state_total` | Counter | `result` |
| `report` | `log_duration_seconds`, `state_duration_seconds` | Histogram | — |
| `report` | `log_buffer_rows` | Gauge | — |
| — | `go_*`, `process_*` | standard collectors | — |

All label values are predefined constants — **no high-cardinality labels** (no task IDs, repo URLs, branches, tokens, or secrets) so scraping is safe and bounded.

## Why

Teams self-hosting Gitea + `act_runner` at scale need to answer basic SRE questions that are currently invisible:

- How often are RPCs failing? Which RPC? (`act_runner_client_errors_total`)
- Are runners saturated? (`act_runner_job_capacity_utilization_ratio`, `act_runner_job_running`)
- How long do jobs take? (`act_runner_job_duration_seconds`)
- Is polling backing off? (`act_runner_poll_backoff_seconds`, `act_runner_poll_fetch_total{result=\"error\"}`)
- Are log/state reports slow? (`act_runner_report_{log,state}_duration_seconds`)
- Is the log buffer draining? (`act_runner_report_log_buffer_rows`)

Today operators have to grep logs. This PR makes all of the above first-class metrics so they can feed dashboards and alerts (`rate(act_runner_client_errors_total[5m]) > 0.1`, capacity saturation alerts, etc.).

The endpoint is **disabled by default** and binds to `127.0.0.1:9101` when enabled, so it's opt-in and safe for existing deployments.

## How

### Config

```yaml
metrics:
  enabled: false           # opt-in
  addr: 127.0.0.1:9101     # change to 0.0.0.0:9101 only behind a reverse proxy
```

`config.example.yaml` documents both fields plus a security note about binding externally without auth.

### Wiring

1. `daemon.go` calls `metrics.Init()` (guarded by `sync.Once`), sets `act_runner_info`, `act_runner_capacity`, registers uptime + running-jobs GaugeFuncs, then starts the server goroutine with the daemon context — it shuts down cleanly on `ctx.Done()`.
2. `poller.fetchTask` observes RPC latency / result / error counters. `DeadlineExceeded` (long-poll idle) is treated as an empty result and **not** observed into the histogram so the 5s timeout doesn't swamp the buckets.
3. `poller.pollOnce` reports `poll_backoff_seconds` using the pre-jitter base interval (the true backoff level), and only when it changes — prevents noisy no-op gauge updates at the `FetchIntervalMax` plateau.
4. `reporter.ReportLog` / `ReportState` record duration histograms and success/error counters; `log_buffer_rows` is updated only when the value changes, guarded by the already-held `clientM`.
5. `runner.Run` observes `job_duration_seconds` and increments `job_total` by outcome via `metrics.ResultToStatusLabel`.

### Safety / security review

- All timeouts set; Slowloris-safe.
- Custom `prometheus.NewRegistry()` — no global registration side-effects.
- No sensitive data in labels (reviewed every instrumentation site).
- Single new dependency: `github.com/prometheus/client_golang v1.23.2`.
- Endpoint is unauthenticated by design and documented as such; default localhost bind mitigates exposure. Operators exposing externally should front it with a reverse proxy.

## Verification

### Unit tests

\`\`\`bash
go build ./...
go vet ./...
go test ./...
\`\`\`

### Manual smoke test

1. Enable metrics in `config.yaml`:
   \`\`\`yaml
   metrics:
     enabled: true
     addr: 127.0.0.1:9101
   \`\`\`
2. Start the runner against a Gitea instance: \`./act_runner daemon\`.
3. Scrape the endpoint:
   \`\`\`bash
   curl -s http://127.0.0.1:9101/metrics | grep '^act_runner_'
   curl -s http://127.0.0.1:9101/healthz   # → ok
   \`\`\`
4. Confirm the static series appear immediately: \`act_runner_info\`, \`act_runner_capacity\`, \`act_runner_uptime_seconds\`, \`act_runner_job_running\`, \`act_runner_job_capacity_utilization_ratio\`.
5. Trigger a workflow and confirm counters increment: \`act_runner_poll_fetch_total{result=\"task\"}\`, \`act_runner_job_total{status=\"success\"}\`, \`act_runner_report_log_total{result=\"success\"}\`.
6. Leave the runner idle and confirm \`act_runner_poll_backoff_seconds\` settles (and does **not** churn on every poll).
7. Ctrl-C and confirm a clean \"metrics server shutdown\" log line (no port-in-use error on restart within 5s).

### Prometheus integration

Add to \`prometheus.yml\`:

\`\`\`yaml
scrape_configs:
  - job_name: act_runner
    static_configs:
      - targets: ['127.0.0.1:9101']
\`\`\`

Sample alert to try:

\`\`\`
sum(rate(act_runner_client_errors_total[5m])) by (method) > 0.1
\`\`\`

## Out of scope (follow-ups)

- TLS and auth on the metrics endpoint (mitigated today by localhost default; add when operators need external scraping).
- Per-task labels (intentionally avoided for cardinality safety).

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/820
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
2026-04-15 01:27:34 +00:00
silverwind
658101d9cb chore(lint): add golangci-lint v2 and fix all lint issues (#803)
## Summary
- Replace old `.golangci.yml` (v1 format) with v2 format, aligned with gitea's lint config
- Add `lint-go`, `lint-go-fix`, and `lint` Makefile targets using golangci-lint v2.10.1
- Replace `make vet` with `make lint` in CI workflow (lint includes vet)
- Fix all 35 lint issues: modernize (maps.Copy, range over int, any), perfsprint (errors.New), unparam (remove unused parameters), revive (var naming), staticcheck, forbidigo exclusion for cmd/
- Make `security-check` non-fatal (apply https://github.com/go-gitea/gitea/pull/36681)
- Remove dead gocritic exclusion rules (commentFormatting, exitAfterDefer)
- Remove dead linter exclusions and disabled checks (singleCaseSwitch, ST1003, QF1001, QF1006, QF1008, testifylint go-require/require-error, test file exclusions for dupl/errcheck/staticcheck/unparam)

## Test plan
- [x] `golangci-lint run` passes
- [x] `go build ./...` passes
- [x] `go test ./...` passes

---------

Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/803
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
2026-02-22 17:35:08 +00:00
Christopher Homberger
8920c4a170 Timeout to wait for and optionally require docker always (#741)
* do not require docker cli in the runner image for waiting for a sidecar dockerd
* allow to specify that `:host` labels would also only be accepted if docker is reachable

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/741
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-08-28 17:28:08 +00:00
Pablo Carranza
6a9a447f86 Report errors by setting raw_output when it's error level (#645)
This solves #643 by setting the "raw_output" entry attribute when the log level is error.  This results in the log line being shipped to the Gitea UI.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/645
Reviewed-by: Zettat123 <zettat123@noreply.gitea.com>
Co-authored-by: Pablo Carranza <pcarranza@gmail.com>
Co-committed-by: Pablo Carranza <pcarranza@gmail.com>
2025-06-05 17:53:13 +00:00
Christopher Homberger
b1ae30dda8 ephemeral act runner (#649)
Works for both interactive and non-interactive registration mode.

A further enhancement would be jitconfig support of the daemon command, because after some changes in Gitea Actions the registration token became reusable.

removing runner and fail seems not possible at the current api level

Part of https://github.com/go-gitea/gitea/pull/33570

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/649
Reviewed-by: Zettat123 <zettat123@noreply.gitea.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-03-13 21:57:44 +00:00
garet90
8bc0275e74 feat: add once flag to daemon command (#19) (#598)
Once flag polls and completes one job then exits.

I use this with Windows Sandbox (and creating users with local brew install on Mac) to create a fresh environment every time.

Co-authored-by: Garet Halliday <garet@pit.dev>
Co-authored-by: Jason Song <wolfogre@noreply.gitea.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/598
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Reviewed-by: Jason Song <wolfogre@noreply.gitea.com>
Co-authored-by: garet90 <garet90@noreply.gitea.com>
Co-committed-by: garet90 <garet90@noreply.gitea.com>
2024-11-06 17:16:08 +00:00
rowan-allspice
d1d3cad4b0 feat: allow graceful shutdowns (#546)
Add a `Shutdown(context.Context) error` method to the Poller. Calling this method will first shutdown all active polling, preventing any new jobs from spawning. It will then wait for either all jobs to finish, or for the context to be cancelled. If the context is cancelled, it will then force all jobs to end, and then exit.

Fixes https://gitea.com/gitea/act_runner/issues/107

Co-authored-by: Rowan Bohde <rowan.bohde@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/546
Reviewed-by: Jason Song <i@wolfogre.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: rowan-allspice <rowan-allspice@noreply.gitea.com>
Co-committed-by: rowan-allspice <rowan-allspice@noreply.gitea.com>
2024-05-27 07:38:55 +00:00
Jason Song
be2df361ef Ensure declare to use new labels (#530)
Ensure "declare" is supported, to use new labels, see https://gitea.com/gitea/act_runner/pulls/529

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/530
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
2024-04-02 07:39:40 +00:00
Jason Song
23ec12b8cf Bump act to v0.260.0 (#522)
Related to https://gitea.com/gitea/act/issues/99.

Also update other main dependencies.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/522
Reviewed-by: Zettat123 <zettat123@noreply.gitea.com>
2024-03-27 03:17:04 +00:00
TheFox0x7
ed35b09b8f change podman socket path (#341)
port of https://github.com/nektos/act/pull/1961
closes gitea/act_runner#274

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/341
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: TheFox0x7 <thefox0x7@gmail.com>
Co-committed-by: TheFox0x7 <thefox0x7@gmail.com>
2023-08-21 04:01:12 +00:00
Bo-Yi Wu
db662b3690 ci(lint): refactor code for clarity and linting compliance (#289)
- Removed `deadcode`, `structcheck`, and `varcheck` linters from `.golangci.yml`
- Fixed a typo in a comment in `daemon.go`
- Renamed `defaultActionsUrl` to `defaultActionsURL` in `exec.go`
- Removed unnecessary else clause in `exec.go` and `runner.go`
- Simplified variable initialization in `exec.go`
- Changed function name from `getHttpClient` to `getHTTPClient` in `http.go`
- Removed unnecessary else clause in `labels_test.go`

Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/289
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
2023-07-13 01:10:54 +00:00
Zettat123
f2629f2ea3 Add support for finding docker daemon from common socket paths (#263)
Caused by #260

act_runner will fail to start if user does not set `docker_host` configuration and `DOCKER_HOST` env. This PR adds the support for finding docker daemon from common socket paths so act_runner could detect the docker socket from these paths.

The `commonSocketPaths` is from [nektos/act](e60018a6d9/cmd/root.go (L124-L131))

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/263
Co-authored-by: Zettat123 <zettat123@noreply.gitea.com>
Co-committed-by: Zettat123 <zettat123@noreply.gitea.com>
2023-07-01 01:27:54 +00:00
Zettat123
ccc27329dc Improve the usage of docker_host configuration (#260)
Follow #242, #244
Fixes #258

Users could use `docker_host` configuration to specify which docker daemon will be used by act_runner.
- If `docker_host` is **empty**, act_runner will find an available docker host automatically.
- If `docker_host` is **"-"**, act_runner will find an available docker host automatically, but the docker host won't be mounted to the job containers and service containers.
- If `docker_host` is **not empty or "-"**, the specified docker host will be used. An error will be returned if it doesn't work.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/260
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2023-06-30 04:00:04 +00:00
appleboy
9e4a5f7363 feat: improve Docker configuration and detection handling (#242)
- Pass `cfg` to `envcheck.CheckIfDockerRunning` function
- Add `Docker` struct to `config.go` for Docker configuration
- Update `config.example.yaml` with `docker` configuration options
- Modify `CheckIfDockerRunning` in `docker.go` to use Docker host from config if provided

Signed-off-by: appleboy <appleboy.tw@gmail.com>

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/242
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: wxiaoguang <wxiaoguang@noreply.gitea.com>
Co-authored-by: appleboy <appleboy.tw@gmail.com>
Co-committed-by: appleboy <appleboy.tw@gmail.com>
2023-06-18 05:38:38 +00:00
sillyguodong
67b1363d25 Support changing labels (#201)
Implement proposal: https://github.com/go-gitea/gitea/issues/24540

Related:
- Protocol: https://gitea.com/gitea/actions-proto-def/pulls/9
- Gitea side: https://github.com/go-gitea/gitea/pull/24806

Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/201
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: sillyguodong <gedong_1994@163.com>
Co-committed-by: sillyguodong <gedong_1994@163.com>
2023-06-15 03:59:15 +00:00
Bo-Yi Wu
69c55ee003 refactor: daemon, config, and logging for better clarity (#225)
- Import "path", "runtime", "strconv", and "strings" packages in daemon.go
- Move "Starting runner daemon" log message to a different location
- Refactor log formatter initialization and add debug level caller information
- Split Config struct into separate Log, Runner, Cache, and Container structs with comments in config.go

Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/225
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
2023-06-05 13:11:23 +00:00
Jason Song
220efa69c0 Refactor to new framework (#98)
- Adjust directory structure
```text
├── internal
│   ├── app
│   │   ├── artifactcache
│   │   ├── cmd
│   │   ├── poll
│   │   └── run
│   └── pkg
│       ├── client
│       ├── config
│       ├── envcheck
│       ├── labels
│       ├── report
│       └── ver
└── main.go
```
- New pkg `labels` to parse label
- New pkg `report` to report logs to Gitea
- Remove pkg `engine`, use `envcheck` to check if docker running.
- Rewrite `runtime` to `run`
- Rewrite `poller` to `poll`
- Simplify some code and remove what's useless.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/98
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Jason Song <i@wolfogre.com>
Co-committed-by: Jason Song <i@wolfogre.com>
2023-04-04 21:32:04 +08:00