Merges the `gitea.com/gitea/act` fork into this repository as the `act/`
directory and consumes it as a local package. The `replace github.com/nektos/act
=> gitea.com/gitea/act` directive is removed; act's dependencies are merged
into the root `go.mod`.
- Imports rewritten: `github.com/nektos/act/pkg/...` → `gitea.com/gitea/act_runner/act/...`
(flattened — `pkg/` boundary dropped to match the layout forgejo-runner adopted).
- Dropped act's CLI (`cmd/`, `main.go`) and all upstream project files; kept
the library tree + `LICENSE`.
- Added `// Copyright <year> The Gitea Authors ...` / `// Copyright <year> nektos`
headers to 104 `.go` files.
- Pre-existing act lint violations annotated inline with
`//nolint:<linter> // pre-existing issue from nektos/act`.
`.golangci.yml` is unchanged vs `main`.
- Makefile test target: `-race -short` (matches forgejo-runner).
- Pre-existing integration test failures fixed: race in parallel executor
(atomic counters); TestSetupEnv / command_test / expression_test /
run_context_test updated to match gitea fork runtime; TestJobExecutor and
TestActionCache gated on `testing.Short()`.
Full `gitea/act` commit history is reachable via the second parent.
Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
## What
Add an optional Prometheus `/metrics` HTTP endpoint to `act_runner` so operators can observe runner health, polling behavior, job outcomes, and RPC latency without scraping logs.
New surface:
- `internal/pkg/metrics/metrics.go` — metric definitions, custom `Registry`, static Go/process collectors, label constants, `ResultToStatusLabel` helper.
- `internal/pkg/metrics/server.go` — hardened `http.Server` serving `/metrics` and `/healthz` with Slowloris-safe timeouts (`ReadHeaderTimeout` 5s, `ReadTimeout`/`WriteTimeout` 10s, `IdleTimeout` 60s) and a 5s graceful shutdown.
- `daemon.go` wires it up behind `cfg.Metrics.Enabled` (disabled by default).
- `poller.go` / `reporter.go` / `runner.go` instrument their existing hot paths with counters/histograms/gauges — no behavior change.
Metrics exported (namespace `act_runner_`):
| Subsystem | Metric | Type | Labels |
|---|---|---|---|
| — | `info` | Gauge | `version`, `name` |
| — | `capacity`, `uptime_seconds` | Gauge | — |
| `poll` | `fetch_total`, `client_errors_total` | Counter | `result` / `method` |
| `poll` | `fetch_duration_seconds`, `backoff_seconds` | Histogram / Gauge | — |
| `job` | `total` | Counter | `status` |
| `job` | `duration_seconds`, `running`, `capacity_utilization_ratio` | Histogram / GaugeFunc | — |
| `report` | `log_total`, `state_total` | Counter | `result` |
| `report` | `log_duration_seconds`, `state_duration_seconds` | Histogram | — |
| `report` | `log_buffer_rows` | Gauge | — |
| — | `go_*`, `process_*` | standard collectors | — |
All label values are predefined constants — **no high-cardinality labels** (no task IDs, repo URLs, branches, tokens, or secrets) so scraping is safe and bounded.
## Why
Teams self-hosting Gitea + `act_runner` at scale need to answer basic SRE questions that are currently invisible:
- How often are RPCs failing? Which RPC? (`act_runner_client_errors_total`)
- Are runners saturated? (`act_runner_job_capacity_utilization_ratio`, `act_runner_job_running`)
- How long do jobs take? (`act_runner_job_duration_seconds`)
- Is polling backing off? (`act_runner_poll_backoff_seconds`, `act_runner_poll_fetch_total{result=\"error\"}`)
- Are log/state reports slow? (`act_runner_report_{log,state}_duration_seconds`)
- Is the log buffer draining? (`act_runner_report_log_buffer_rows`)
Today operators have to grep logs. This PR makes all of the above first-class metrics so they can feed dashboards and alerts (`rate(act_runner_client_errors_total[5m]) > 0.1`, capacity saturation alerts, etc.).
The endpoint is **disabled by default** and binds to `127.0.0.1:9101` when enabled, so it's opt-in and safe for existing deployments.
## How
### Config
```yaml
metrics:
enabled: false # opt-in
addr: 127.0.0.1:9101 # change to 0.0.0.0:9101 only behind a reverse proxy
```
`config.example.yaml` documents both fields plus a security note about binding externally without auth.
### Wiring
1. `daemon.go` calls `metrics.Init()` (guarded by `sync.Once`), sets `act_runner_info`, `act_runner_capacity`, registers uptime + running-jobs GaugeFuncs, then starts the server goroutine with the daemon context — it shuts down cleanly on `ctx.Done()`.
2. `poller.fetchTask` observes RPC latency / result / error counters. `DeadlineExceeded` (long-poll idle) is treated as an empty result and **not** observed into the histogram so the 5s timeout doesn't swamp the buckets.
3. `poller.pollOnce` reports `poll_backoff_seconds` using the pre-jitter base interval (the true backoff level), and only when it changes — prevents noisy no-op gauge updates at the `FetchIntervalMax` plateau.
4. `reporter.ReportLog` / `ReportState` record duration histograms and success/error counters; `log_buffer_rows` is updated only when the value changes, guarded by the already-held `clientM`.
5. `runner.Run` observes `job_duration_seconds` and increments `job_total` by outcome via `metrics.ResultToStatusLabel`.
### Safety / security review
- All timeouts set; Slowloris-safe.
- Custom `prometheus.NewRegistry()` — no global registration side-effects.
- No sensitive data in labels (reviewed every instrumentation site).
- Single new dependency: `github.com/prometheus/client_golang v1.23.2`.
- Endpoint is unauthenticated by design and documented as such; default localhost bind mitigates exposure. Operators exposing externally should front it with a reverse proxy.
## Verification
### Unit tests
\`\`\`bash
go build ./...
go vet ./...
go test ./...
\`\`\`
### Manual smoke test
1. Enable metrics in `config.yaml`:
\`\`\`yaml
metrics:
enabled: true
addr: 127.0.0.1:9101
\`\`\`
2. Start the runner against a Gitea instance: \`./act_runner daemon\`.
3. Scrape the endpoint:
\`\`\`bash
curl -s http://127.0.0.1:9101/metrics | grep '^act_runner_'
curl -s http://127.0.0.1:9101/healthz # → ok
\`\`\`
4. Confirm the static series appear immediately: \`act_runner_info\`, \`act_runner_capacity\`, \`act_runner_uptime_seconds\`, \`act_runner_job_running\`, \`act_runner_job_capacity_utilization_ratio\`.
5. Trigger a workflow and confirm counters increment: \`act_runner_poll_fetch_total{result=\"task\"}\`, \`act_runner_job_total{status=\"success\"}\`, \`act_runner_report_log_total{result=\"success\"}\`.
6. Leave the runner idle and confirm \`act_runner_poll_backoff_seconds\` settles (and does **not** churn on every poll).
7. Ctrl-C and confirm a clean \"metrics server shutdown\" log line (no port-in-use error on restart within 5s).
### Prometheus integration
Add to \`prometheus.yml\`:
\`\`\`yaml
scrape_configs:
- job_name: act_runner
static_configs:
- targets: ['127.0.0.1:9101']
\`\`\`
Sample alert to try:
\`\`\`
sum(rate(act_runner_client_errors_total[5m])) by (method) > 0.1
\`\`\`
## Out of scope (follow-ups)
- TLS and auth on the metrics endpoint (mitigated today by localhost default; add when operators need external scraping).
- Per-task labels (intentionally avoided for cardinality safety).
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/820
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Before this commit, when running locally `act_runner exec` to test workflows, we could only fill env and secrets but not vars
This commit add a new exec option `--var` based on what is done for env and secret
Example:
`act_runner exec --env MY_ENV=testenv -s MY_SECRET=testsecret --var MY_VAR=testvariable`
workflow
```
name: Gitea Actions test
on: [push]
jobs:
TestAction:
runs-on: ubuntu-latest
steps:
- run: echo "VAR -> ${{ vars.MY_VAR }}"
- run: echo "ENV -> ${{ env.MY_ENV }}"
- run: echo "SECRET -> ${{ secrets.MY_SECRET }}"
```
Will echo var, env and secret values sent in the command line
Fixesgitea/act_runner#692
Co-authored-by: Lautriva <gitlactr@dbn.re>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/704
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: lautriva <lautriva@noreply.gitea.com>
Co-committed-by: lautriva <lautriva@noreply.gitea.com>
I used to be able to do something like `./act_runner register --instance https://gitea.com --token testdcff --name test` on GitHub Actions Runners, but act_runner always asked me to enter instance, token etc. again and requiring me to use `--no-interactive` including passing everything per cli.
My idea was to extract the preset input of some stages to skip the prompt for this value if it is a non empty string. Labels is the only question that has been asked more than once if validation failed, in this case the error path have to unset the values of the input structure to not end in a non-interactive loop.
_I have written this initially for my own gitea runner, might be useful to everyone using the official runner as well_
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/682
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
- Removed `deadcode`, `structcheck`, and `varcheck` linters from `.golangci.yml`
- Fixed a typo in a comment in `daemon.go`
- Renamed `defaultActionsUrl` to `defaultActionsURL` in `exec.go`
- Removed unnecessary else clause in `exec.go` and `runner.go`
- Simplified variable initialization in `exec.go`
- Changed function name from `getHttpClient` to `getHTTPClient` in `http.go`
- Removed unnecessary else clause in `labels_test.go`
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/289
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
This PR
- adds the `cache-server` command so act_runner can run as a cache server. When running as a cache server, act_runner only processes the requests related to cache and does not run jobs.
- adds the `external_server` configuration for cache. If specified, act_runner will use this URL as the ACTIONS_CACHE_URL instead of starting a cache server itself.
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/275
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
Follow #242, #244Fixes#258
Users could use `docker_host` configuration to specify which docker daemon will be used by act_runner.
- If `docker_host` is **empty**, act_runner will find an available docker host automatically.
- If `docker_host` is **"-"**, act_runner will find an available docker host automatically, but the docker host won't be mounted to the job containers and service containers.
- If `docker_host` is **not empty or "-"**, the specified docker host will be used. An error will be returned if it doesn't work.
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/260
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
- Import "path", "runtime", "strconv", and "strings" packages in daemon.go
- Move "Starting runner daemon" log message to a different location
- Refactor log formatter initialization and add debug level caller information
- Split Config struct into separate Log, Runner, Cache, and Container structs with comments in config.go
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/225
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Close https://gitea.com/gitea/act_runner/issues/177
Related https://gitea.com/gitea/act/pulls/56
### ⚠️ Breaking
The `container.network_mode` is a deprecated configuration item. It may be removed after Gitea 1.20 released.
Previously, if the value of `container.network_mode` is `bridge`, it means that `act_runner` will create a new network for job.But `bridge` is easily confused with the bridge network created by Docker by default.
We recommand that using `container.network` to specify the network to which containers created by `act_runner` connect.
### 🆕 container.network
The configuration file of `act_runner` add a new item of `contianer.network`.
In `config.example.yaml`:
```yaml
container:
# Specifies the network to which the container will connect.
# Could be host, bridge or the name of a custom network.
# If it's empty, act_runner will create a network automatically.
network: ""
```
As the comment in the example above says, the purpose of the `container.network` is specifying the network to which containers created by `act_runner` will connect.
`container.network` accepts the following valid values:
- `host`: All of containers (including job containers and service contianers) created by `act_runner` will be connected to the network named `host` which is created automatically by Docker. Containers will share the host’s network stack and all interfaces from the host will be available to these containers.
- `bridge`: It is similar to `host`. All of containers created by `act_runner` will be connected to the network named `bridge` which is created automatically by Docker. All containers connected to the `bridge` (Perhaps there are containers that are not created by `act_runner`) are allowed to communicate with each other, while providing isolation from containers which are not connected to that `bridge` network.
- `<custom_network>`: Please make sure that the `<custom_network>` network already exists firstly (`act_runner` does not detect whether the specified network exists currently. If not exists yet, will return error in the stage of `docker create`). All of containers created by `act_runner` will be connected to `<custom_network>`. After the job is executed, containers are removed and automatically disconnected from the `<custom_network>`.
- empty: `act_runner` will create a new network for each job container and their service containers (if defined in workflow). So each job container and their service containers share a network environment, but are isolated from others container and the Docker host. Of course, these networks created by `act_runner` will be removed at last.
### Others
- If you do not have special needs, we highly recommend that setting `container.network` to empty string (and do not use `container.network_mode` any more). Because the containers created by `act_runner` will connect to the networks that are created by itself. This point will provide better isolation.
- If you set `contianer.network` to empty string or `<custom_network>`, we can be access to service containers by `<service-id>:<port>` in the steps of job. Because we added an alias to the service container when connecting to the network.
Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/184
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: sillyguodong <gedong_1994@163.com>
Co-committed-by: sillyguodong <gedong_1994@163.com>
This adds a very simple Dockerfile and run script for running `act_runner` as a container.
It also allows setting `Privileged` and `ContainerOptions` flags via the new config file when spawning task containers. The combination makes it possible to use Docker-in-Docker (which requires `privileged` mode) as well as pass any other options child Docker containers may require.
For example, if Gitea is running in Docker on the same machine, for the `checkout` action to behave as expected from a task container launched by `act_runner`, it might be necessary to map the hostname via something like:
```
container:
network_mode: bridge
privileged: true
options: --add-host=my.gitea.hostname:host-gateway
```
> NOTE: Description updated to reflect latest code.
> NOTE: Description updated to reflect latest code (again).
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/84
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Thomas E Lackey <telackey@bozemanpass.com>
Co-committed-by: Thomas E Lackey <telackey@bozemanpass.com>