Commit Graph

1573 Commits

Author SHA1 Message Date
silverwind
8af9a2b47a Merge gitea/act into act/ folder' (#827)
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/827
Reviewed-by: ChristopherHX <38043+christopherhx@noreply.gitea.com>
2026-04-23 16:41:15 +00:00
silverwind
fab2d6ae04 Merge gitea/act into act/
Merges the `gitea.com/gitea/act` fork into this repository as the `act/`
directory and consumes it as a local package. The `replace github.com/nektos/act
=> gitea.com/gitea/act` directive is removed; act's dependencies are merged
into the root `go.mod`.

- Imports rewritten: `github.com/nektos/act/pkg/...` → `gitea.com/gitea/act_runner/act/...`
  (flattened — `pkg/` boundary dropped to match the layout forgejo-runner adopted).
- Dropped act's CLI (`cmd/`, `main.go`) and all upstream project files; kept
  the library tree + `LICENSE`.
- Added `// Copyright <year> The Gitea Authors ...` / `// Copyright <year> nektos`
  headers to 104 `.go` files.
- Pre-existing act lint violations annotated inline with
  `//nolint:<linter> // pre-existing issue from nektos/act`.
  `.golangci.yml` is unchanged vs `main`.
- Makefile test target: `-race -short` (matches forgejo-runner).
- Pre-existing integration test failures fixed: race in parallel executor
  (atomic counters); TestSetupEnv / command_test / expression_test /
  run_context_test updated to match gitea fork runtime; TestJobExecutor and
  TestActionCache gated on `testing.Short()`.

Full `gitea/act` commit history is reachable via the second parent.

Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
2026-04-22 22:29:06 +02:00
Pascal Zimmermann
15dd63a839 feat: Add support for Dynamic Matrix Strategies with Job Outputs (#149)
# Matrix Strategy: Support for Scalar Values and Template Expressions

## 🎯 Overview

This Pull Request implements support for **scalar values and template expressions** in matrix strategies. Previously, every matrix value had to be declared as a YAML array — a bare scalar like `os: ubuntu-latest` or `version: ${{ fromJSON(...) }}` caused silent failures. Now, scalars are automatically wrapped into single-element arrays, enabling the `${{ fromJSON(...) }}` pattern that is commonly used in GitHub Actions workflows for dynamic matrix generation.

## ⚠️ Semantic Changes

Two subtle behavior changes are introduced that are worth understanding:

### 1. Scalar matrix values now produce a 1-element matrix expansion

**Before this PR:** A scalar value like

```yaml
strategy:
  matrix:
    os: ubuntu-latest   # no brackets
```

failed to decode as `map[string][]interface{}`, so `Matrix()` returned `nil`. The job ran **once, without any matrix context** (`matrix.os` was undefined).

**After this PR:** The same input is wrapped to `["ubuntu-latest"]`. The job runs **one matrix iteration** with `matrix.os = "ubuntu-latest"` set.

> **Impact on existing workflows:** Workflows that accidentally used bare scalars instead of arrays will now run with a matrix context where they previously did not. The functional result is usually the same, but `matrix.*` variables are now populated. This is considered an improvement, but is a semantic change.

---

### 2. Unevaluated template expressions fall back to a literal string value

The actual resolution of `${{ fromJSON(needs.job1.outputs.versions) }}` into an array happens **before** `Matrix()` is called — in `EvaluateYamlNode` inside `pkg/runner/runner.go`. This PR does not change that evaluation path.

If evaluation succeeds → the YAML node already contains a proper array → `Matrix()` decodes it normally.

If evaluation **fails or is skipped** (e.g., expression not yet resolved) → the scalar string is now wrapped as `["${{ fromJSON(...) }}"]` → the job runs **one iteration with the literal unexpanded string** as the matrix value.

> **Note:** The test `TestJobMatrix/matrix_with_template_expression` covers this fallback path, not the successful resolution path. The literal-wrap behavior is the fallback, not the primary mechanism for template expression support.

---

## 🔑 Key Changes

### 1. Flexible Matrix Decoding (`pkg/model/workflow.go`)

#### New Helper Function: `normalizeMatrixValue`
```go
func normalizeMatrixValue(key string, val interface{}) ([]interface{}, error)
```
- Converts a single matrix value to the expected array format
- Handles three cases:
  1. **Arrays**: Pass through unchanged
  2. **Scalars**: Wrap in single-element array (including template expressions)
  3. **Invalid types** (maps, etc.): Return error to detect misconfigurations early

**Supported scalar types:**
- `string` — including unevaluated template expressions
- `int`, `float64` — numeric values
- `bool` — boolean values
- `nil` — null values

#### Enhanced `Matrix()` Method

**Before:**
```go
func (j *Job) Matrix() map[string][]interface{} {
    if j.Strategy.RawMatrix.Kind == yaml.MappingNode {
        var val map[string][]interface{}
        if !decodeNode(j.Strategy.RawMatrix, &val) {
            return nil
        }
        return val
    }
    return nil
}
```

**After:**
```go
func (j *Job) Matrix() map[string][]interface{} {
    if j.Strategy == nil || j.Strategy.RawMatrix.Kind != yaml.MappingNode {
        return nil
    }

    // First decode to flexible map[string]interface{} to handle scalars and arrays
    var flexVal map[string]interface{}
    err := j.Strategy.RawMatrix.Decode(&flexVal)
    if err != nil {
        // If that fails, try the strict array format
        var val map[string][]interface{}
        if !decodeNode(j.Strategy.RawMatrix, &val) {
            return nil
        }
        return val
    }

    // Convert flexible format to expected format with validation
    val := make(map[string][]interface{})
    for k, v := range flexVal {
        normalized, err := normalizeMatrixValue(k, v)
        if err != nil {
            log.Errorf("matrix validation error: %v", err)
            return nil
        }
        val[k] = normalized
    }
    return val
}
```

**Key improvements:**
- Handles unevaluated template expressions as scalar strings
- Automatic conversion to array format for consistent processing
- Validation to catch nested maps and invalid configurations
- Fallback to strict array format decoding for backward compatibility
- Early nil check for strategy to prevent nil pointer panics

---

## 🎁 Use Cases

### Example 1: Template Expression in Matrix (primary motivation)
```yaml
jobs:
  setup:
    runs-on: ubuntu-latest
    outputs:
      versions: ${{ steps.version-map.outputs.versions }}
    steps:
      - id: version-map
        run: echo "versions=[\"1.17\",\"1.18\",\"1.19\"]" >> $GITHUB_OUTPUT

  test:
    needs: setup
    strategy:
      matrix:
        # EvaluateYamlNode resolves this to an array before Matrix() is called
        version: ${{ fromJSON(needs.setup.outputs.versions) }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Testing version ${{ matrix.version }}"
```

### Example 2: Mixed Scalar and Array Values
```yaml
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]  # Array — unchanged
        go-version: 1.19                      # Scalar — wrapped to [1.19]
        node: [14, 16, 18]                    # Array — unchanged
    runs-on: ${{ matrix.os }}
```

### Example 3: Single Value Matrix
```yaml
jobs:
  deploy:
    strategy:
      matrix:
        environment: production  # Scalar — wrapped to ["production"]
    runs-on: ubuntu-latest
```

---

##  Benefits

-  **Feature Parity with GitHub Actions**: Supports template expressions in matrix strategies
- 🔒 **Robust Validation**: Detects misconfigured matrices (nested maps) early
- 🔄 **Backward Compatible**: Existing array-based workflows continue to work
- 📝 **Clear Error Messages**: Helpful validation messages for debugging
- 🛡️ **Type Safety**: Handles all YAML scalar types correctly

---

## 🚀 Breaking Changes

**Technically none in a strictly backward-incompatible sense**, but two semantic changes exist (see [⚠️ Semantic Changes](#️-semantic-changes-read-before-merging) above):

1. Scalar matrix values now iterate (previously they caused a `nil` matrix / no iteration)
2. Unevaluated template strings now produce a literal iteration instead of skipping

Both changes are improvements in practice, but downstream users should be aware.

---

## 📚 Related Work

- Related Gitea Server PR: [go-gitea/gitea#36564](https://github.com/go-gitea/gitea/pull/36564)
- Enables dynamic matrix generation from job outputs
- Supports advanced CI/CD patterns used in GitHub Actions

## 🔗 References

- [GitHub Actions: Matrix strategies](https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs)
- [GitHub Actions: Expression syntax](https://docs.github.com/en/actions/learn-github-actions/expressions)
- [GitHub Actions: `fromJSON` in matrix context](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstrategymatrix)

Reviewed-on: https://gitea.com/gitea/act/pulls/149
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-authored-by: Pascal Zimmermann <pascal.zimmermann@theiotstudio.com>
Co-committed-by: Pascal Zimmermann <pascal.zimmermann@theiotstudio.com>
2026-04-19 22:36:34 +00:00
Bo-Yi Wu
9aafec169b perf: use single poller with semaphore-based capacity control (#822)
## Background

#819 replaced the shared `rate.Limiter` with per-worker exponential backoff counters to add jitter and adaptive polling. Before #819, the poller used:

```go
limiter := rate.NewLimiter(rate.Every(p.cfg.Runner.FetchInterval), 1)
```

This limiter was **shared across all N polling goroutines with burst=1**, effectively serializing their `FetchTask` calls — so even with `capacity=60`, the runner issued roughly one `FetchTask` per `FetchInterval` total.

#819 replaced this with independent per-worker `consecutiveEmpty` / `consecutiveErrors` counters. Each goroutine now backs off **independently**, which inadvertently removed the cross-worker serialization. With `capacity=N`, the runner now has N goroutines each polling on their own schedule — a regression from the pre-#819 baseline for any runner with `capacity > 1`.

(Thanks to @ChristopherHX for catching this in review.)

## Problem

With the post-#819 code:

- `capacity=N` maintains **N persistent polling goroutines**, each calling `FetchTask` independently
- At idle, N goroutines each wake up and send a `FetchTask` RPC per `FetchInterval`
- At full load, N goroutines **continue polling** even though no slot is available to run a new task — every one of those RPCs is wasted
- The `Shutdown()` timeout branch has a pre-existing bug: the "non-blocking check" is actually a blocking receive, so `shutdownJobs()` is never reached on timeout

## Real-World Impact: 3 Runners × capacity=60

Current production environment: 3 runners each with `capacity=60`.

| Metric | Post-#819 (current) | This PR | Reduction |
|--------|---------------------|---------|-----------|
| Polling goroutines (total) | 3 × 60 = **180** | 3 × 1 = **3** | **98.3%** (177 fewer) |
| FetchTask RPCs per poll cycle (idle) | **180** | **3** | **98.3%** |
| FetchTask RPCs per poll cycle (full load) | **180** (all wasted) | **0** (blocked on semaphore) | **100%** |
| Concurrent connections to Gitea | **180** | **3** | **98.3%** |
| Backoff state objects | 180 (per-worker) | 3 (one per runner) | Simplified |

### Idle scenario

All 180 goroutines wake up every `FetchInterval`, each sending a `FetchTask` RPC that returns empty. Server handles 180 RPCs per cycle for zero useful work. After this PR: **3 RPCs per cycle** — one per runner.

> Note: pre-#819 idle behavior was already ~3 RPCs/cycle due to the shared `rate.Limiter`. This PR restores that property while also addressing the full-load case below.

### Full-load scenario (all 180 slots occupied)

All 180 goroutines **continue polling** even though no slot is available. Every RPC is wasted. After this PR: all 3 pollers are **blocked on the semaphore** — **zero RPCs** until a task completes.

> This is a scenario neither the pre-#819 shared limiter nor the post-#819 per-worker backoff handles — both still issue `FetchTask` RPCs when no slot is free. The semaphore is the only approach of the three that ties polling to available capacity.

## Why Not Just Revert to `rate.Limiter`?

Reverting would restore the serialized behavior but is not the right long-term fix:

- **`rate.Limiter` has no concept of available capacity.** At full load it still hands out tokens and issues `FetchTask` RPCs that can't be acted on. The semaphore blocks polling entirely in that case — zero wasted RPCs.
- **It composes poorly with adaptive backoff from #819.** A shared limiter and per-worker backoff pull in different directions.
- **N goroutines serializing on a shared limiter means N-1 of them exist only to wait in line.** A single poller expresses the same behavior more directly.

The semaphore approach ties polling to capacity explicitly: `acquire slot → fetch → dispatch → release`. That invariant becomes structural rather than emergent from a rate limiter.

## Solution

Replace N polling goroutines with a **single polling loop** that uses a buffered channel as a semaphore to control concurrent task execution:

```go
// New: poller.go Poll()
sem := make(chan struct{}, p.cfg.Runner.Capacity)
for {
    select {
    case sem <- struct{}{}:       // Acquire slot (blocks at capacity)
    case <-p.pollingCtx.Done():
        return
    }
    task, ok := p.fetchTask(...)  // Single FetchTask RPC
    if !ok {
        <-sem                     // Release slot on empty response
        // backoff...
        continue
    }
    go func(t *runnerv1.Task) {   // Dispatch task
        defer func() { <-sem }()  // Release slot when done
        p.runTaskWithRecover(p.jobsCtx, t)
    }(task)
}
```

The exponential backoff and jitter from #819 are preserved — just driven by a single `workerState` instead of N per-worker states.

## Shutdown Bug Fix

Fixed a pre-existing bug in `Shutdown()` where the timeout branch could never force-cancel running jobs:

```go
// Before (BROKEN): blocking receive, shutdownJobs() never reached
_, ok := <-p.done   // blocks until p.done is closed
if !ok { return nil }
p.shutdownJobs()    // dead code when jobs are still running

// After (FIXED): proper non-blocking check
select {
case <-p.done:
    return nil
default:
}
p.shutdownJobs()    // now correctly reached on timeout
```

## Code Changes

| Area | Detail |
|------|--------|
| `Poller.runner` | `*run.Runner` → `TaskRunner` interface (enables mock-based testing) |
| `Poll()` | N goroutines → single loop with buffered-channel semaphore |
| `PollOnce()` | Inlined from removed `pollOnce()` |
| `waitBackoff()` | New helper, eliminates duplicated backoff logic |
| `resetBackoff()` | New method on `workerState`, also resets stale `lastBackoff` metric |
| `Shutdown()` | Fixed blocking receive → proper non-blocking select |
| Removed | `poll()`, `pollOnce()` private methods (-2 methods, -42 lines) |

## Test Coverage

Added `TestPoller_ConcurrencyLimitedByCapacity` which verifies:

- With `capacity=3`, at most 3 tasks execute concurrently (`maxConcurrent <= 3`)
- Tasks actually overlap in execution (`maxConcurrent >= 2`)
- `FetchTask` is never called concurrently — confirms single poller (`maxFetchConcur == 1`)
- All 6 tasks complete successfully (`totalCompleted == 6`)
- Mock runner respects context cancellation, enabling shutdown path verification

```
=== RUN   TestPoller_ConcurrencyLimitedByCapacity
--- PASS: TestPoller_ConcurrencyLimitedByCapacity (0.10s)
PASS
ok  	gitea.com/gitea/act_runner/internal/app/poll	0.59s
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/822
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
v0.4.1
2026-04-19 08:10:23 +00:00
silverwind
f923badec7 Use golangci-lint fmt to format code (#163)
Use `golangci-lint fmt` to format code, upgrading `.golangci.yml` to v2 and mirroring the linter configuration used by https://github.com/go-gitea/gitea. `gci` now handles import ordering into standard, project-local, blank, and default groups.

Mirrors https://github.com/go-gitea/gitea/pull/37194.

Changes:
- Upgrade `.golangci.yml` to v2 format with the same linter set as gitea (minus `prealloc`, `unparam`, `testifylint`, `nilnil` which produced too many pre-existing issues)
- Add path-based exclusions (`bodyclose`, `gosec` in tests; `gosec:G115`/`G117` globally)
- Run lint via `make lint-go` in CI instead of `golangci/golangci-lint-action`, matching the pattern used by other Gitea repos
- Apply safe auto-fixes (`modernize`, `perfsprint`, `usetesting`, etc.)
- Add explanations to existing `//nolint` directives
- Remove dead code (unused `newRemoteReusableWorkflow` and `networkName`), duplicate imports, and shadowed `max` builtins
- Replace deprecated `docker/distribution/reference` with `distribution/reference`
- Fix `Deprecated:` comment casing and simplify nil/len checks

---
This PR was written with the help of Claude Opus 4.7

Reviewed-on: https://gitea.com/gitea/act/pulls/163
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-04-18 09:10:09 +00:00
silverwind
48944e136c Use golangci-lint fmt to format code (#823)
Use `golangci-lint fmt` to format code, replacing the previous gofumpt-based formatter. https://github.com/daixiang0/gci is used to order the imports.

Also drops the `gitea-vet` dependency since `gci` now handles import ordering.

Mirrors https://github.com/go-gitea/gitea/pull/37194.

---
This PR was written with the help of Claude Opus 4.7

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/823
Reviewed-by: Nicolas <173651+bircni@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-04-17 22:05:01 +00:00
Bo-Yi Wu
40dcee0991 chore(deps): upgrade golangci-lint from v2.10.1 to v2.11.4 (#821)
## Summary
- Bump golangci-lint from v2.10.1 to v2.11.4
- Remove unused `//nolint:revive` directive on metrics package declaration (detected by stricter nolintlint in new version)

## Changes between v2.10.1 and v2.11.4
- **v2.11.0** — Multiple linter dependency upgrades, Go 1.26 support
- **v2.11.2** — Bug fix for `fmt` with path
- **v2.11.3** — gosec update
- **v2.11.4** — Dependency updates (sqlclosecheck, noctx, etc.)

No breaking changes.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/821
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
v0.4.0
2026-04-15 03:56:34 +00:00
Bo-Yi Wu
f33e5a6245 feat: add Prometheus metrics endpoint for runner observability (#820)
## What

Add an optional Prometheus `/metrics` HTTP endpoint to `act_runner` so operators can observe runner health, polling behavior, job outcomes, and RPC latency without scraping logs.

New surface:

- `internal/pkg/metrics/metrics.go` — metric definitions, custom `Registry`, static Go/process collectors, label constants, `ResultToStatusLabel` helper.
- `internal/pkg/metrics/server.go` — hardened `http.Server` serving `/metrics` and `/healthz` with Slowloris-safe timeouts (`ReadHeaderTimeout` 5s, `ReadTimeout`/`WriteTimeout` 10s, `IdleTimeout` 60s) and a 5s graceful shutdown.
- `daemon.go` wires it up behind `cfg.Metrics.Enabled` (disabled by default).
- `poller.go` / `reporter.go` / `runner.go` instrument their existing hot paths with counters/histograms/gauges — no behavior change.

Metrics exported (namespace `act_runner_`):

| Subsystem | Metric | Type | Labels |
|---|---|---|---|
| — | `info` | Gauge | `version`, `name` |
| — | `capacity`, `uptime_seconds` | Gauge | — |
| `poll` | `fetch_total`, `client_errors_total` | Counter | `result` / `method` |
| `poll` | `fetch_duration_seconds`, `backoff_seconds` | Histogram / Gauge | — |
| `job` | `total` | Counter | `status` |
| `job` | `duration_seconds`, `running`, `capacity_utilization_ratio` | Histogram / GaugeFunc | — |
| `report` | `log_total`, `state_total` | Counter | `result` |
| `report` | `log_duration_seconds`, `state_duration_seconds` | Histogram | — |
| `report` | `log_buffer_rows` | Gauge | — |
| — | `go_*`, `process_*` | standard collectors | — |

All label values are predefined constants — **no high-cardinality labels** (no task IDs, repo URLs, branches, tokens, or secrets) so scraping is safe and bounded.

## Why

Teams self-hosting Gitea + `act_runner` at scale need to answer basic SRE questions that are currently invisible:

- How often are RPCs failing? Which RPC? (`act_runner_client_errors_total`)
- Are runners saturated? (`act_runner_job_capacity_utilization_ratio`, `act_runner_job_running`)
- How long do jobs take? (`act_runner_job_duration_seconds`)
- Is polling backing off? (`act_runner_poll_backoff_seconds`, `act_runner_poll_fetch_total{result=\"error\"}`)
- Are log/state reports slow? (`act_runner_report_{log,state}_duration_seconds`)
- Is the log buffer draining? (`act_runner_report_log_buffer_rows`)

Today operators have to grep logs. This PR makes all of the above first-class metrics so they can feed dashboards and alerts (`rate(act_runner_client_errors_total[5m]) > 0.1`, capacity saturation alerts, etc.).

The endpoint is **disabled by default** and binds to `127.0.0.1:9101` when enabled, so it's opt-in and safe for existing deployments.

## How

### Config

```yaml
metrics:
  enabled: false           # opt-in
  addr: 127.0.0.1:9101     # change to 0.0.0.0:9101 only behind a reverse proxy
```

`config.example.yaml` documents both fields plus a security note about binding externally without auth.

### Wiring

1. `daemon.go` calls `metrics.Init()` (guarded by `sync.Once`), sets `act_runner_info`, `act_runner_capacity`, registers uptime + running-jobs GaugeFuncs, then starts the server goroutine with the daemon context — it shuts down cleanly on `ctx.Done()`.
2. `poller.fetchTask` observes RPC latency / result / error counters. `DeadlineExceeded` (long-poll idle) is treated as an empty result and **not** observed into the histogram so the 5s timeout doesn't swamp the buckets.
3. `poller.pollOnce` reports `poll_backoff_seconds` using the pre-jitter base interval (the true backoff level), and only when it changes — prevents noisy no-op gauge updates at the `FetchIntervalMax` plateau.
4. `reporter.ReportLog` / `ReportState` record duration histograms and success/error counters; `log_buffer_rows` is updated only when the value changes, guarded by the already-held `clientM`.
5. `runner.Run` observes `job_duration_seconds` and increments `job_total` by outcome via `metrics.ResultToStatusLabel`.

### Safety / security review

- All timeouts set; Slowloris-safe.
- Custom `prometheus.NewRegistry()` — no global registration side-effects.
- No sensitive data in labels (reviewed every instrumentation site).
- Single new dependency: `github.com/prometheus/client_golang v1.23.2`.
- Endpoint is unauthenticated by design and documented as such; default localhost bind mitigates exposure. Operators exposing externally should front it with a reverse proxy.

## Verification

### Unit tests

\`\`\`bash
go build ./...
go vet ./...
go test ./...
\`\`\`

### Manual smoke test

1. Enable metrics in `config.yaml`:
   \`\`\`yaml
   metrics:
     enabled: true
     addr: 127.0.0.1:9101
   \`\`\`
2. Start the runner against a Gitea instance: \`./act_runner daemon\`.
3. Scrape the endpoint:
   \`\`\`bash
   curl -s http://127.0.0.1:9101/metrics | grep '^act_runner_'
   curl -s http://127.0.0.1:9101/healthz   # → ok
   \`\`\`
4. Confirm the static series appear immediately: \`act_runner_info\`, \`act_runner_capacity\`, \`act_runner_uptime_seconds\`, \`act_runner_job_running\`, \`act_runner_job_capacity_utilization_ratio\`.
5. Trigger a workflow and confirm counters increment: \`act_runner_poll_fetch_total{result=\"task\"}\`, \`act_runner_job_total{status=\"success\"}\`, \`act_runner_report_log_total{result=\"success\"}\`.
6. Leave the runner idle and confirm \`act_runner_poll_backoff_seconds\` settles (and does **not** churn on every poll).
7. Ctrl-C and confirm a clean \"metrics server shutdown\" log line (no port-in-use error on restart within 5s).

### Prometheus integration

Add to \`prometheus.yml\`:

\`\`\`yaml
scrape_configs:
  - job_name: act_runner
    static_configs:
      - targets: ['127.0.0.1:9101']
\`\`\`

Sample alert to try:

\`\`\`
sum(rate(act_runner_client_errors_total[5m])) by (method) > 0.1
\`\`\`

## Out of scope (follow-ups)

- TLS and auth on the metrics endpoint (mitigated today by localhost default; add when operators need external scraping).
- Per-task labels (intentionally avoided for cardinality safety).

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/820
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
2026-04-15 01:27:34 +00:00
Bo-Yi Wu
f2d545565f perf: reduce runner-to-server connection load with adaptive reporting and polling (#819)
## Summary

Many teams self-host Gitea + Act Runner at scale. The current runner design causes excessive HTTP requests to the Gitea server, leading to high server load. This PR addresses three root causes: aggressive fixed-interval polling, per-task status reporting every 1 second regardless of activity, and unoptimized HTTP client configuration.

## Problem

The original architecture has these issues:

**1. Fixed 1-second reporting interval (RunDaemon)**

- Every running task calls ReportLog + ReportState every 1 second (2 HTTP requests/sec/task)
- These requests are sent even when there are no new log rows or state changes
- With 200 runners × 3 tasks each = **1,200 req/sec just for status reporting**

**2. Fixed 2-second polling interval (no backoff)**

- Idle runners poll FetchTask every 2 seconds forever, even when no jobs are queued
- No exponential backoff or jitter — all runners can synchronize after network recovery (thundering herd)
- 200 idle runners = **100 req/sec doing nothing useful**

**3. HTTP client not tuned**

- Uses http.DefaultClient with MaxIdleConnsPerHost=2, causing frequent TCP/TLS reconnects
- Creates two separate http.Client instances (one for Ping, one for Runner service) instead of sharing

**Total: ~1,300 req/sec for 200 runners with 3 tasks each**

## Solution

### Adaptive Event-Driven Log Reporting

Replace the recursive `time.AfterFunc(1s)` pattern in RunDaemon with a goroutine-based select event loop using three trigger mechanisms:

| Trigger | Default | Purpose |
|---------|---------|---------|
| `log_report_max_latency` | 3s | Guarantee even a single log line is delivered within this time |
| `log_report_interval` | 5s | Periodic sweep — steady-state cadence |
| `log_report_batch_size` | 100 rows | Immediate flush during bursty output (e.g., npm install) |

**Key design**: `log_report_max_latency` (3s) must be less than `log_report_interval` (5s) so the max-latency timer fires before the periodic ticker for single-line scenarios.

State reporting is decoupled to its own `state_report_interval` (default 5s), with immediate flush on step transitions (start/stop) via a stateNotify channel for responsive frontend UX.

Additionally:
- Skip ReportLog when `len(rows) == 0` (no pending log rows)
- Skip ReportState when `stateChanged == false && len(outputs) == 0` (nothing changed)
- Move expensive `proto.Clone` after the early-return check to avoid deep copies on no-op paths

### Polling Backoff with Jitter

Replace fixed `rate.Limiter` with adaptive exponential backoff:
- Track `consecutiveEmpty` and `consecutiveErrors` counters
- Interval doubles with each empty/error response: `base × 2^(n-1)`, capped at `fetch_interval_max` (default 60s)
- Add ±20% random jitter to prevent thundering herd
- Fetch first, sleep after ��� preserves burst=1 behavior for immediate first fetch on startup and after task completion

### HTTP Client Tuning

- Configure custom `http.Transport` with `MaxIdleConnsPerHost=10` (was 2)
- Share a single `http.Client` between PingService and RunnerService
- Add `IdleConnTimeout=90s` for clean connection lifecycle

## Load Reduction

For 200 runners × 3 tasks (70% with active log output):

| Component | Before | After | Reduction |
|-----------|--------|-------|-----------|
| Polling (idle) | 100 req/s | ~3.4 req/s | 97% |
| Log reporting | 420 req/s | ~84 req/s | 80% |
| State reporting | 126 req/s | ~25 req/s | 80% |
| **Total** | **~1,300 req/s** | **~113 req/s** | **~91%** |

## Frontend UX Impact

| Scenario | Before | After | Notes |
|----------|--------|-------|-------|
| Continuous output (npm install) | ~1s | ~5s | Periodic ticker sweep |
| Single line then silence | ~1s | ≤3s | maxLatencyTimer guarantee |
| Bursty output (100+ lines) | ~1s | <1s | Batch size immediate flush |
| Step start/stop | ~1s | <1s | stateNotify immediate flush |
| Job completion | ~1s | ~1s | Close() retry unchanged |

## New Configuration Options

All have safe defaults — existing config files need no changes:

```yaml
runner:
  fetch_interval_max: 60s        # Max backoff interval when idle
  log_report_interval: 5s        # Periodic log flush interval
  log_report_max_latency: 3s     # Max time a log row waits (must be < log_report_interval)
  log_report_batch_size: 100     # Immediate flush threshold
  state_report_interval: 5s      # State flush interval (step transitions are always immediate)
```

Config validation warns on invalid combinations:
- `fetch_interval_max < fetch_interval` → auto-corrected
- `log_report_max_latency >= log_report_interval` → warning (timer would be redundant)

## Test Plan

- [x] `go build ./...` passes
- [x] `go test ./...` passes (all existing + 3 new tests)
- [x] `golangci-lint run` — 0 issues
- [x] TestReporter_MaxLatencyTimer — verifies single log line flushed by maxLatencyTimer before logTicker
- [x] TestReporter_BatchSizeFlush — verifies batch size threshold triggers immediate flush
- [x] TestReporter_StateNotifyFlush — verifies step transition triggers immediate state flush
- [x] TestReporter_EphemeralRunnerDeletion — verifies Close/RunDaemon race safety
- [x] TestReporter_RunDaemonClose_Race — verifies concurrent Close safety

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/819
Reviewed-by: Nicolas <173651+bircni@noreply.gitea.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-committed-by: Bo-Yi Wu <appleboy.tw@gmail.com>
2026-04-14 11:29:25 +00:00
Lunny Xiao
90c1275f0e Upgrade yaml (#816)
~wait https://gitea.com/gitea/act/pulls/157~

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/816
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-03-28 16:18:47 +00:00
Lunny Xiao
3232358e71 downgrade yaml from rc.4 to rc.3 and upgrade action lint (#158)
Reviewed-on: https://gitea.com/gitea/act/pulls/158
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-03-28 04:13:07 +00:00
Lunny Xiao
2e98baa34a Upgrade yaml (#157)
Reviewed-on: https://gitea.com/gitea/act/pulls/157
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-03-28 03:31:27 +00:00
Zettat123
505907eb2a Add run_attempt to context (#632)
Blocked by https://gitea.com/gitea/act/pulls/126
Fix https://github.com/go-gitea/gitea/issues/33135

---------

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/632
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
v0.3.1
2026-03-26 20:07:22 +00:00
silverwind
9933ea0d92 feat: add configurable bind_workdir option with workspace cleanup for DinD setups (#810)
## Summary

Adds a `container.bind_workdir` config option that exposes the nektos/act `BindWorkdir` setting. When enabled, workspaces are bind-mounted from the host filesystem instead of Docker volumes, which is required for DinD setups where jobs use `docker compose` with bind mounts (e.g. `.:/app`).

Each job gets an isolated workspace at `/workspace/<task_id>/<owner>/<repo>` to prevent concurrent jobs from the same repo interfering with each other. The task directory is cleaned up after job execution.

### Configuration

```yaml
container:
  bind_workdir: true
```

When using this with DinD, also mount the workspace parent into the runner container and add it to `valid_volumes`:
```yaml
container:
  valid_volumes:
    - /workspace/**
```

*This PR was authored by Claude (AI assistant)*

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/810
Reviewed-by: ChristopherHX <38043+christopherhx@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-03-03 10:15:06 +00:00
RoboMagus
5dd5436169 Semver tags for Docker images (#720)
The main Gitea docker image is already distributed with proper semver tags, such that users can pin to e.g. the minor version and still pull in patch releases.  This is something that has been lacking on the act runner images.

This PR expands the docker image tag versioning strategy such that when `v0.2.13` is released the following image tags are produced:

Basic:
 - `0`
 - `0.2`
 - `0.2.13`
 - `latest`

DinD:
 - `0-dind`
 - `0.2-dind`
 - `0.2.13-dind`
 - `latest-dind`

DinD-Rootless:
 - `0-dind-rootless`
 - `0.2-dind-rootless`
 - `0.2.13-dind-rootless`
 - `latest-dind-rootless`

To verify the `docker/metadata-action` produces the expected results in a Gitea workflow environment I've executed a release workflow. Results can be seen in [this run](https://gitea.com/RoboMagus/gitea_act_runner/actions/runs/14). (Note though that as the repository name of my fork changed from `act_runner` to `gitea_act_runner` this is reflected in the produced docker tags in this test run!)

---------

Co-authored-by: RoboMagus <68224306+RoboMagus@users.noreply.github.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: techknowlogick <techknowlogick@noreply.gitea.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/720
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: RoboMagus <robomagus@noreply.gitea.com>
Co-committed-by: RoboMagus <robomagus@noreply.gitea.com>
2026-02-25 19:09:53 +00:00
Lunny Xiao
28740d7788 Upgrade go mod (#154)
Reviewed-on: https://gitea.com/gitea/act/pulls/154
Reviewed-by: silverwind <silverwind@noreply.gitea.com>
2026-02-22 20:39:43 +00:00
Lunny Xiao
ddf9159a8f Remove jobparser because of moving to Gitea main project (#155)
Reviewed-on: https://gitea.com/gitea/act/pulls/155
Reviewed-by: silverwind <silverwind@noreply.gitea.com>
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
2026-02-22 19:04:16 +00:00
silverwind
658101d9cb chore(lint): add golangci-lint v2 and fix all lint issues (#803)
## Summary
- Replace old `.golangci.yml` (v1 format) with v2 format, aligned with gitea's lint config
- Add `lint-go`, `lint-go-fix`, and `lint` Makefile targets using golangci-lint v2.10.1
- Replace `make vet` with `make lint` in CI workflow (lint includes vet)
- Fix all 35 lint issues: modernize (maps.Copy, range over int, any), perfsprint (errors.New), unparam (remove unused parameters), revive (var naming), staticcheck, forbidigo exclusion for cmd/
- Make `security-check` non-fatal (apply https://github.com/go-gitea/gitea/pull/36681)
- Remove dead gocritic exclusion rules (commentFormatting, exitAfterDefer)
- Remove dead linter exclusions and disabled checks (singleCaseSwitch, ST1003, QF1001, QF1006, QF1008, testifylint go-require/require-error, test file exclusions for dupl/errcheck/staticcheck/unparam)

## Test plan
- [x] `golangci-lint run` passes
- [x] `go build ./...` passes
- [x] `go test ./...` passes

---------

Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/803
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
2026-02-22 17:35:08 +00:00
ChristopherHX
f0f9f0c8ab fix: composite action log result reported as step result (#801)
Act logs an array of stepID to signal that this is an partial step result within an composite actions, this is the case for years and act_runner never handled it correctly.

Ref: <43e6958fa3/pkg/runner/logger.go (L142)>

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/801
Reviewed-by: silverwind <silverwind@noreply.gitea.com>
2026-02-22 14:56:09 +00:00
silverwind
ae6e5dfcf7 fix: race condition in reporter between RunDaemon and Close (#796)
## Summary

- Fix data race on `r.closed` between `RunDaemon()` and `Close()` by protecting it with the existing `stateMu` — `closed` is part of the reporter state. `RunDaemon()` reads it under `stateMu.RLock()`, `Close()` sets it inside the existing `stateMu.Lock()` block
- `ReportState` now has a parameter to not report results from runDaemon even if set, from now on `Close` reports the result
- `Close` waits for `RunDaemon()` to signal exit via a closed channel `daemon` before reporting the final logs and state with result, unless something really wrong happens it does not time out
- Add `TestReporter_EphemeralRunnerDeletion` which reproduces the exact scenario from #793: RunDaemon's `ReportState` racing with `Close`, causing the ephemeral runner to be deleted before final logs are sent
- Add `TestReporter_RunDaemonClose_Race` which exercises `RunDaemon()` and `Close()` concurrently to verify no data race on `r.closed` under `go test -race`
- Enable `-race` flag in `make test` so CI catches data races going forward

Based on #794, with fixes for the remaining unprotected `r.closed` reads that the race detector catches.

Fixes #793

---------

Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-authored-by: rmawatson <rmawatson@hotmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/796
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-22 14:52:49 +00:00
silverwind
f4418eff18 chore(deps): update go dependencies (#802)
## Summary
- Update all Go dependencies to latest compatible versions
- Notable updates: connectrpc v1.19.1, cobra v1.10.2, protobuf v1.36.11, otel v1.40.0, actionlint v1.7.11
- docker/docker kept at v25 (v28+ has breaking module restructuring)
- Fix govulncheck issues: update go-git v5.16.2 → v5.16.5 (GO-2026-4473), containerd v1.7.13 → v1.7.29 (GO-2025-4108, GO-2025-4100, GO-2025-3528)
- Remove unnecessary go-git replace directive
- Add `make tidy` and `make tidy-check` Makefile targets
- Clarify `distribution/reference` replace directive comment (needed because `docker/distribution@v2.8.3` uses `reference.SplitHostname` removed in v0.6.0)

## Test plan
- [x] `go build ./...` passes
- [x] `go test ./...` passes
- [x] `govulncheck ./...` passes

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/802
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-22 11:59:47 +00:00
silverwind
0f7efae806 chore(deps): update workflow action versions (#792)
## Summary
- `actions/checkout`: v4/v5 → v6
- `actions/setup-go`: v5 → v6
- `docker/build-push-action`: v5 → v6

All other actions (`goreleaser/goreleaser-action@v6`, `docker/setup-qemu-action@v3`, `docker/setup-buildx-action@v3`, `docker/login-action@v3`, `crazy-max/ghaction-import-gpg@v6`) were already at their latest major versions.

No breaking changes affect the current workflow configurations. The main changes in the updated actions are Node.js 24 runtime upgrades and minor feature additions.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/792
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-18 06:05:34 +00:00
silverwind
aab249000c chore(deps): update act to v0.261.8 (#791)
## Summary
- Update `gitea.com/gitea/act` from pseudo-version `v0.261.7-0.20251202193638-5417d3ac6742` to `v0.261.8`
- Update yaml import in `workflow.go` from `gopkg.in/yaml.v3` to `go.yaml.in/yaml/v4` to match act's yaml v4 migration
- Add tests to verify yaml/v4 upgrade works correctly

## Changes included in act v0.261.8
- Recover from panics in parallel executor workers ([gitea/act#153](https://gitea.com/gitea/act/pulls/153))
- Fix max-parallel support for matrix jobs ([gitea/act#150](https://gitea.com/gitea/act/pulls/150))
- Fix yaml with prefixed newline broken setjob + yaml v4 ([gitea/act#144](https://gitea.com/gitea/act/pulls/144))
- Fixed typo ([gitea/act#151](https://gitea.com/gitea/act/pulls/151))

## Tests added
- **`Test_generateWorkflow`**: 7 new cases (was 1), covering: no needs, needs as list, workflow env/triggers, container+services, matrix strategy, special YAML characters, and invalid YAML error handling
- **`Test_yamlV4NodeRoundTrip`**: Directly exercises `go.yaml.in/yaml/v4` — `yaml.Node` construction, marshal/unmarshal round-trip, and node kind constants

Fixes: https://gitea.com/gitea/act_runner/issues/371
Fixes: https://gitea.com/gitea/act_runner/issues/772
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/791
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
v0.3.0
2026-02-18 05:37:21 +00:00
silverwind
43e6958fa3 Recover from panics in parallel executor workers (#153)
The worker goroutines in `NewParallelExecutor` had no panic recovery. A panic in any executor (e.g. expression evaluation) would crash the entire runner process, leaving all steps stuck in the UI because the final status was never reported back.

Add `defer`/`recover` in the worker goroutines to convert panics into errors that propagate through the normal error channel.

Issue is present in latest `nektos/act` as well.

Fixes https://gitea.com/gitea/act_runner/issues/371

(Partially generated by Claude)

Reviewed-on: https://gitea.com/gitea/act/pulls/153
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-18 03:29:07 +00:00
stecklars
c82077e6b5 examples: improve DIND rootless network performance (#786)
## Summary
- Add `DOCKERD_ROOTLESS_ROOTLESSKIT_NET=slirp4netns` and `DOCKERD_ROOTLESS_ROOTLESSKIT_MTU=65520` to the DIND docker-compose example
- The `docker:dind-rootless` base image defaults to vpnkit as the network driver, which has substantially lower throughput than slirp4netns

## The problem

I noticed that pulling containers as well as downloading data within the container when running act_runner as DIND was very slow (see Ookla speedtest results in the following). While analysing the issue, I found that this was caused by the usage of vpnkit.

The `docker:dind-rootless` base image defaults to vpnkit as the network driver. slirp4netns was [added as an opt-in option](https://github.com/docker-library/docker/pull/543) and must be explicitly enabled via `DOCKERD_ROOTLESS_ROOTLESSKIT_NET=slirp4netns`.

This means anyone following the current DIND example gets vpnkit, which has significantly lower network throughput. This affects **all** network operations in the container — image pulls, package installs, and CI tasks.

Per the [rootlesskit iperf3 benchmarks](https://github.com/rootless-containers/rootlesskit/blob/master/docs/network.md):

| Driver | MTU 1500 | MTU 65520 |
|--------|----------|-----------|
| **vpnkit** | 0.60 Gbps | not supported |
| **slirp4netns** | 1.06 Gbps | 7.55 Gbps |

## Real-world benchmark results (Ookla speedtest, same server)

| | Download | Upload |
|---|---|---|
| **Default (vpnkit)** | ~130 Mbps | ~126 Mbps |
| **slirp4netns + MTU 65520** | ~958 Mbps | ~462 Mbps |

## References
- [docker-library/docker#543](https://github.com/docker-library/docker/pull/543) — added slirp4netns to dind-rootless as opt-in (vpnkit remains default)
- [rootlesskit network docs](https://github.com/rootless-containers/rootlesskit/blob/master/docs/network.md) — iperf3 benchmarks

---------

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/786
Reviewed-by: silverwind <silverwind@noreply.gitea.com>
Co-authored-by: stecklars <stecklars@noreply.gitea.com>
Co-committed-by: stecklars <stecklars@noreply.gitea.com>
2026-02-16 07:56:45 +00:00
silverwind
9f91fdd0eb chore(deps): bump Go version to 1.26 in Dockerfile and Makefile (#788)
## Summary
- Bump Dockerfile builder image from `golang:1.24-alpine` to `golang:1.26-alpine`
- Bump Makefile `XGO_VERSION` from `go-1.24.x` to `go-1.26.x`

These were missed in #787 which bumped `go.mod` to Go 1.26.0, causing CI build failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/788
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-14 10:32:16 +00:00
silverwind
7a7a5d9051 chore(deps): bump Go version to 1.26.0 (#787)
## Summary
- Bumps Go version from 1.24.0 (toolchain go1.24.11) to 1.26.0
- Fixes CI `govulncheck` failures caused by three standard library vulnerabilities in go1.24.11:
  - GO-2026-4341: Memory exhaustion in `net/url` query parameter parsing
  - GO-2026-4340: Handshake messages at incorrect encryption level in `crypto/tls`
  - GO-2026-4337: Unexpected session resumption in `crypto/tls`

## Test plan
- [x] `make vet` passes
- [x] `make build` passes
- [x] `make test` passes (includes `govulncheck` and all unit tests)

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/787
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
2026-02-14 10:19:45 +00:00
Pascal Zimmermann
c0f19d9a26 fix: Max parallel Support for Matrix Jobs and Remote Action Tests (#150)
## Summary

This PR fixes the `max-parallel` strategy configuration for matrix jobs and resolves all failing tests in `step_action_remote_test.go`. The implementation ensures that matrix jobs respect the `max-parallel` setting, preventing resource exhaustion when running GitHub Actions workflows.

## Problem Statement

### Issue 1: max-parallel Not Working Correctly
Matrix jobs were running in parallel regardless of the `max-parallel` setting in the strategy configuration. This caused:
- Resource contention on limited runners
- Unpredictable job execution behavior
- Inability to control concurrency for resource-intensive workflows

### Issue 2: Failing Remote Action Tests
All tests in `step_action_remote_test.go` were failing due to:
- Missing `ActionCacheDir` configuration
- Incorrect mock expectations using fixed strings instead of hash-based paths
- Incompatibility with the hash-based action cache implementation

## Changes

### 1. max-parallel Implementation (`pkg/runner/runner.go`)

#### Robust Initialization
Added fallback logic to ensure `MaxParallel` is always properly initialized:
```go
if job.Strategy.MaxParallel == 0 {
    job.Strategy.MaxParallel = job.Strategy.GetMaxParallel()
}
```

#### Eliminated Unnecessary Nesting
Fixed inefficient nested parallelization when only one pipeline element exists:
```go
if len(pipeline) == 1 {
    // Execute directly without additional wrapper
    log.Debugf("Single pipeline element, executing directly")
    return pipeline[0](ctx)
}
```

#### Enhanced Logging
Added comprehensive debug and info logging:
- Shows which `maxParallel` value is being used
- Logs adjustments based on matrix size
- Reports final parallelization decisions

### 2. Worker Logging (`pkg/common/executor.go`)

Enhanced `NewParallelExecutor` with detailed worker activity logging:
```go
log.Infof("NewParallelExecutor: Creating %d workers for %d executors", parallel, len(executors))

for i := 0; i < parallel; i++ {
    go func(workerID int, work <-chan Executor, errs chan<- error) {
        log.Debugf("Worker %d started", workerID)
        taskCount := 0
        for executor := range work {
            taskCount++
            log.Debugf("Worker %d executing task %d", workerID, taskCount)
            errs <- executor(ctx)
        }
        log.Debugf("Worker %d finished (%d tasks executed)", workerID, taskCount)
    }(i, work, errs)
}
```

**Benefits:**
- Easy verification of correct parallelization
- Better debugging capabilities
- Clear visibility into worker activity

### 3. Documentation (`pkg/model/workflow.go`)

Added clarifying comment to ensure strategy values are always set:
```go
// Always set these values, even if there's an error later
j.Strategy.FailFast = j.Strategy.GetFailFast()
j.Strategy.MaxParallel = j.Strategy.GetMaxParallel()
```

### 4. Test Fixes (`pkg/runner/step_action_remote_test.go`)

#### Added Missing Configuration
All tests now include `ActionCacheDir`:
```go
Config: &Config{
    GitHubInstance: "github.com",
    ActionCacheDir: "/tmp/test-cache",
}
```

#### Hash-Based Suffix Matchers
Updated mocks to use hash-based paths instead of fixed strings:
```go
// Before
sarm.On("readAction", sar.Step, suffixMatcher("org-repo-path@ref"), ...)

// After
sarm.On("readAction", sar.Step, suffixMatcher(sar.Step.UsesHash()), ...)
```

#### Flexible Exec Matcher for Post Tests
Implemented flexible path matching for hash-based action directories:
```go
execMatcher := mock.MatchedBy(func(args []string) bool {
    if len(args) != 2 {
        return false
    }
    return args[0] == "node" && strings.Contains(args[1], "post.js")
})
```

#### Token Test Improvements
- Uses unique cache directory to force cloning
- Validates URL redirection to github.com
- Accepts realistic token behavior

### 5. New Tests

#### Unit Tests (`pkg/runner/max_parallel_test.go`)
Tests various `max-parallel` configurations:
- `max-parallel: 1` → Sequential execution
- `max-parallel: 2` → Max 2 parallel jobs
- `max-parallel: 4` (default) → Max 4 parallel jobs
- `max-parallel: 10` → Max 10 parallel jobs

#### Concurrency Test (`pkg/common/executor_max_parallel_test.go`)
Verifies that `max-parallel: 2` actually limits concurrent execution using atomic counters.

## Expected Behavior

### Before
-  Jobs ran in parallel regardless of `max-parallel` setting
-  Unnecessary nested parallelization (8 workers for 1 element)
-  No logging to debug parallelization issues
-  All `step_action_remote_test.go` tests failing

### After
-  `max-parallel: 1` → Jobs run strictly sequentially
-  `max-parallel: N` → Maximum N jobs run simultaneously
-  Efficient single-level parallelization for matrix jobs
-  Comprehensive logging for debugging
-  All tests passing (10/10)

## Example Log Output

With `max-parallel: 2` and 6 matrix jobs:
```
[DEBUG] Using job.Strategy.MaxParallel: 2
[INFO] Running job with maxParallel=2 for 6 matrix combinations
[DEBUG] Single pipeline element, executing directly
[INFO] NewParallelExecutor: Creating 2 workers for 6 executors
[DEBUG] Worker 0 started
[DEBUG] Worker 1 started
[DEBUG] Worker 0 executing task 1
[DEBUG] Worker 1 executing task 1
...
[DEBUG] Worker 0 finished (3 tasks executed)
[DEBUG] Worker 1 finished (3 tasks executed)
```

## Test Results

All tests pass successfully:
```
 TestStepActionRemote (3 sub-tests)
 TestStepActionRemotePre
 TestStepActionRemotePreThroughAction
 TestStepActionRemotePreThroughActionToken
 TestStepActionRemotePost (4 sub-tests)
 TestMaxParallelStrategy
 TestMaxParallel2Quick

Total: 12/12 tests passing
```

## Breaking Changes

None. This PR is fully backward compatible. All changes improve existing behavior without altering the API.

## Impact

-  Fixes resource management for CI/CD pipelines
-  Prevents runner exhaustion on limited infrastructure
-  Enables sequential execution for resource-intensive jobs
-  Improves debugging with detailed logging
-  Ensures test suite reliability

## Files Modified

### Core Implementation
- `pkg/runner/runner.go` - max-parallel fix + logging
- `pkg/common/executor.go` - Worker logging
- `pkg/model/workflow.go` - Documentation

### Tests
- `pkg/runner/step_action_remote_test.go` - Fixed all 10 tests
- `pkg/runner/max_parallel_test.go` - **NEW** - Unit tests
- `pkg/common/executor_max_parallel_test.go` - **NEW** - Concurrency test

## Verification

### Manual Testing
```bash
# Build
go build -o dist/act main.go

# Test with max-parallel: 2
./dist/act -W test-max-parallel-2.yml -v

# Expected: Only 2 jobs run simultaneously
```

### Automated Testing
```bash
# Run all tests
go test ./pkg/runner -run TestStepActionRemote -v
go test ./pkg/runner -run TestMaxParallel -v
go test ./pkg/common -run TestMaxParallel -v
```

## Related Issues

Fixes issues where matrix jobs in Gitea ignored the `max-parallel` strategy setting, causing resource contention and unpredictable behavior.

Reviewed-on: https://gitea.com/gitea/act/pulls/150
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: silverwind <silverwind@noreply.gitea.com>
Co-authored-by: Pascal Zimmermann <pascal.zimmermann@theiotstudio.com>
Co-committed-by: Pascal Zimmermann <pascal.zimmermann@theiotstudio.com>
2026-02-11 00:43:38 +00:00
wrzooz
495185446f Fixed typo (#151)
Reviewed-on: https://gitea.com/gitea/act/pulls/151
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Co-authored-by: wrzooz <wrzooz@noreply.gitea.com>
Co-committed-by: wrzooz <wrzooz@noreply.gitea.com>
2026-01-22 08:09:32 +00:00
mkesper
90d11b8692 Remove duplicate assignment of DIST (#777)
The assignment already happens at line 1.

Signed-off-by: mkesper <mkesper@noreply.gitea.com>

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/777
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: mkesper <mkesper@noreply.gitea.com>
Co-committed-by: mkesper <mkesper@noreply.gitea.com>
2025-12-26 00:38:38 +00:00
Christopher Homberger
c4b57fbcb2 chore(deps): upgrade dependencies (#775)
CI uses latest go 24, we may need a cron job after go updates.

Closes https://gitea.com/gitea/act_runner/issues/774

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/775
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-12-20 23:35:15 +00:00
Christopher Homberger
3a07d231a0 Fix yaml with prefixed newline broken setjob + yaml v4 (#144)
* go-yaml v3 **and** v4 workaround
* avoid yaml.Marshal broken indention

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/144
Reviewed-by: wxiaoguang <wxiaoguang@noreply.gitea.com>
Reviewed-by: Zettat123 <zettat123@noreply.gitea.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-12-09 02:28:56 +00:00
Zettat123
e6dbe2a1ca Bump gitea/act (#770)
Related to https://gitea.com/gitea/act/pulls/145

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/770
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-12-02 22:38:31 +00:00
Zettat123
5417d3ac67 Interpolate uses for remote reusable workflows (#145)
Related to #127

Reviewed-on: https://gitea.com/gitea/act/pulls/145
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-12-02 19:36:38 +00:00
Max P.
774f316c8b fix(dockerfile): Pin docker dind images to version 28, as version 29 has breaking changes in the API that are currently causing problems with the act fork of Gitea (#769)
The effect probably does not **yet** occur in the published versions because the last publication took place before the release of Docker v29.

Therefore, no one should expect version 29 of Docker to be used, so there are basically no side effects.

---

fix #768

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/769
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Co-authored-by: Max P. <mail@0xMax42.io>
Co-committed-by: Max P. <mail@0xMax42.io>
2025-11-25 14:40:27 +00:00
DavidToneian
823e9489d6 Fix some typos in internal/pkg/config/config.example.yaml (#764)
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/764
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Co-authored-by: DavidToneian <davidtoneian@noreply.gitea.com>
Co-committed-by: DavidToneian <davidtoneian@noreply.gitea.com>
2025-11-19 18:16:45 +00:00
DavidToneian
dc38bf1895 Fix URL to documentation of runner images (#765)
The previous URL, `https://gitea.com/docker.gitea.com/runner-images`, leads to a 404 currently.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/765
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Co-authored-by: DavidToneian <davidtoneian@noreply.gitea.com>
Co-committed-by: DavidToneian <davidtoneian@noreply.gitea.com>
2025-11-19 18:16:08 +00:00
Akkuman
96b866a3a8 chore: update config.example.yaml for https://gitea.com/gitea/act_runner/pulls/724 (#763)
for pr https://gitea.com/gitea/act_runner/pulls/724

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/763
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: Akkuman <akkumans@qq.com>
Co-committed-by: Akkuman <akkumans@qq.com>
2025-11-18 07:18:38 +00:00
Zettat123
f56fd693ee Add run_attempt to context (#126)
Fix https://github.com/go-gitea/gitea/issues/33135

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/126
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-11-14 20:14:23 +00:00
WANG Xuerui
3b11bac2ad ci: release binary for linux/loong64 (#756)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/756
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: WANG Xuerui <git@xen0n.name>
Co-committed-by: WANG Xuerui <git@xen0n.name>
2025-10-21 17:42:08 +00:00
Zettat123
47caafd037 Bump gitea/act (#753)
Related to https://gitea.com/gitea/act/pulls/123

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/753
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-10-14 03:32:35 +00:00
Zettat123
34f68b3c18 Support cloning actions and workflows from private repos (#123)
Related to https://github.com/go-gitea/gitea/pull/32562

Resolve https://gitea.com/gitea/act_runner/issues/102

To support using actions and workflows from private repositories, we need to enable act_runner to clone private repositories.
~~But it is not easy to know if a repository is private and whether a token is required when cloning. In this PR, I added a new option `RetryToken`. By default, token is empty. When cloning a repo returns an `authentication required` error, `act_runner` will try to clone the repo again using `RetryToken` as the token.~~

In this PR, I added a new `getGitCloneToken` function. This function returns `GITEA_TOKEN` for cloning remote actions or remote reusable workflows when the cloneURL is from the same Gitea instance that the runner is registered to. Otherwise, it returns an empty string as token for cloning public repos from other instances (such as GitHub).

Thanks @ChristopherHX for https://gitea.com/gitea/act/pulls/123#issuecomment-1046171 and https://gitea.com/gitea/act/pulls/123#issuecomment-1046285.

Reviewed-on: https://gitea.com/gitea/act/pulls/123
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-10-14 02:37:09 +00:00
ChristopherHX
50e0509007 Do not implicitly mount /var/run/docker.sock (#751)
* podman creates an folder
* dind sees a folder and fails

Was adding the mount a mistake?, a feature with side effects?

Closes https://gitea.com/gitea/act_runner/issues/750

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/751
Reviewed-by: Jason Song <wolfogre@noreply.gitea.com>
Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-committed-by: ChristopherHX <christopher.homberger@web.de>
2025-10-09 07:16:56 +00:00
Zettat123
ac6e4b7517 Support inputs context when parsing jobs (#143)
Reusable workflows or manually triggered workflows may get data from [`inputs` context](https://docs.github.com/en/actions/reference/workflows-and-actions/contexts#inputs-context).

For example:

```
name: Build

run-name: Build app on ${{ inputs.os }}

on:
  workflow_dispatch:
    inputs:
      os:
        description: Select the OS
        required: true
        type: choice
        options:
          - windows
          - linux

...
```

Reviewed-on: https://gitea.com/gitea/act/pulls/143
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
2025-10-03 18:05:12 +00:00
Christopher Homberger
8920c4a170 Timeout to wait for and optionally require docker always (#741)
* do not require docker cli in the runner image for waiting for a sidecar dockerd
* allow to specify that `:host` labels would also only be accepted if docker is reachable

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/741
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
v0.2.13
2025-08-28 17:28:08 +00:00
Akkuman
bbf9d7e90f feat: support github mirror (#716)
ref: https://github.com/go-gitea/gitea/issues/34858

when github_mirror='https://ghfast.top/https://github.com'

it will clone from the github mirror

However, there is an exception: because the cache is hashed using the string, if the same cache has been used before, it will still be pulled from github, only for the newly deployed act_ruuner

In the following two scenarios, it will solve the problem encountered:

1. some github mirror is  https://xxx.com/https://github.com/actions/checkout@v4, it will report error `Expected format {org}/{repo}[/path]@ref. Actual ‘https://xxx.com/https://github.com/actions/checkout@v4’ Input string was not in a correct format`
2. If I use an action that has a dependency on another action, even if I configure the url of the action I want to use, the action that the action introduces will still pull from github.
for example 02882cc2d9/action.yml (L127-L132)

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/716
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Akkuman <akkumans@qq.com>
Co-committed-by: Akkuman <akkumans@qq.com>
2025-08-28 16:57:01 +00:00
Daan Selen
46f471a900 refac(workflow): use matrix to compile different docker images (#740)
I have made this to speed up and make it more robust.

The matrix executes the jobs in parallel, doing some things perhaps double. But making overall management easier due to the simple defined variables at the top of the matrix declaration.

Co-authored-by: Daan Selen <dselen@systemec.nl>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/740
Reviewed-by: techknowlogick <techknowlogick@noreply.gitea.com>
Co-authored-by: Daan Selen <dselen@nerthus.nl>
Co-committed-by: Daan Selen <dselen@nerthus.nl>
2025-08-22 23:49:20 +00:00
DaanSelen
aa28f8d99c feat(docker): add TZ env variable working (#738)
Relevant: https://gitea.com/gitea/act_runner/issues/735

See my example below, `edit` is my PR, `non-edit` is the origin/main.
```
dselen@N-DESKTOP1:~/development/act_runner$ docker images
REPOSITORY   TAG        IMAGE ID       CREATED          SIZE
runner       edit       b12322f8c3f0   26 seconds ago   43.5MB
runner       non-edit   e5593ad32c16   34 minutes ago   43.1MB

dselen@N-DESKTOP1:~/development/act_runner$ docker run -d -e TZ=Europe/Amsterdam runner:non-edit
5f26979515f461a2a7e342aa586d7b91224d2d3c3dcf1ed0c1e7293ff00645a4

dselen@N-DESKTOP1:~/development/act_runner$ docker run -d -e TZ=Europe/Amsterdam runner:edit
9cc5fc6b364cf07776d97c6c60c03f23372eb2c93c7da8d3d80f4f6dc2a6b10e

dselen@N-DESKTOP1:~/development/act_runner$ docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS     NAMES
9cc5fc6b364c   runner:edit       "/sbin/tini -- run.sh"   2 seconds ago   Up 2 seconds             serene_bardeen
5f26979515f4   runner:non-edit   "/sbin/tini -- run.sh"   5 seconds ago   Up 5 seconds             jovial_euler

dselen@N-DESKTOP1:~/development/act_runner$ docker exec -it jovial_euler bash
5f26979515f4:/# date
Thu Aug 21 16:40:35 UTC 2025

dselen@N-DESKTOP1:~/development/act_runner$ docker exec -it serene_bardeen bash
9cc5fc6b364c:/# date
Thu Aug 21 18:40:42 CEST 2025
```

I do not see why this would not be acceptable, its only 400KB

Regards.

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/738
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Co-authored-by: DaanSelen <daanselen@noreply.gitea.com>
Co-committed-by: DaanSelen <daanselen@noreply.gitea.com>
2025-08-21 23:56:32 +00:00
Christopher Homberger
6a7e18b124 Allow node24 actions (#737)
* `node` Tool is used regardless of this change
* upgrade images with `node` = `node24` if required
* actions/checkout@v5 now passing validation

Fixes #729

Reviewed-on: https://gitea.com/gitea/act_runner/pulls/737
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-08-21 17:47:26 +00:00
Christopher Homberger
91852faf93 refactor: simplify adding new node versions add node 24 (#140)
* backport

Reviewed-on: https://gitea.com/gitea/act/pulls/140
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-committed-by: Christopher Homberger <christopher.homberger@web.de>
2025-08-21 16:04:08 +00:00