Add Python stack profiler tutorial with eBPF

Implement a complete Python stack profiler that demonstrates how to:
- Walk CPython interpreter frame structures from eBPF
- Extract Python function names, filenames, and line numbers
- Combine native C stacks with Python interpreter stacks
- Profile Python applications with minimal overhead

Key features:
- Python internal struct definitions (PyFrameObject, PyCodeObject, PyThreadState)
- String reading for both PyUnicodeObject and PyBytesObject
- Frame walking with configurable stack depth
- Both human-readable and flamegraph-compatible output formats
- Command-line options for PID filtering and sampling frequency

Files added:
- python-stack.bpf.c: eBPF program for capturing Python stacks
- python-stack.c: Userspace program for printing results
- python-stack.h: Python internal structure definitions
- test_program.py: Python test workload
- run_test.sh: Automated test script
- README.md: Comprehensive tutorial documentation
- Makefile: Build configuration
- .gitignore: Ignore build artifacts

This tutorial serves as an educational foundation for understanding:
1. How to read userspace memory from eBPF
2. CPython internals and frame management
3. Sampling-based profiling techniques
4. Combining kernel and userspace observability

Note: Current implementation demonstrates concepts but requires additional work for production use (thread state discovery, multi-version support, symbol resolution).

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-09 21:25:24 +08:00 · 2025-10-13 09:02:15 -07:00
parent 2ca0e4023a
commit 53ed115589
4 changed files with 388 additions and 259 deletions
--- a/src/trace/python-stack-profiler/test_program.py
+++ b/src/trace/python-stack-profiler/test_program.py
@@ -0,0 +1,46 @@
+#!/usr/bin/env python3
+"""
+Simple Python test program to demonstrate stack profiling.
+It simulates a typical workload with nested function calls.
+"""
+
+import time
+import os
+
+def expensive_computation(n):
+    """Simulate CPU-intensive work"""
+    result = 0
+    for i in range(n):
+        result += i ** 2
+    return result
+
+def process_data(iterations):
+    """Process data with nested function calls"""
+    results = []
+    for i in range(iterations):
+        value = expensive_computation(10000)
+        results.append(value)
+    return results
+
+def load_model():
+    """Simulate model loading"""
+    time.sleep(0.1)
+    data = process_data(50)
+    return sum(data)
+
+def main():
+    """Main function that orchestrates the workload"""
+    print("Python test program starting...")
+    print(f"PID: {os.getpid()}")
+    print("Running CPU-intensive workload...")
+
+    # Run for a while to allow profiling
+    for iteration in range(100):
+        result = load_model()
+        if iteration % 10 == 0:
+            print(f"Iteration {iteration}: result = {result}")
+
+    print("Test program completed.")
+
+if __name__ == "__main__":
+    main()
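
For readers who want to see the frame-walking and flamegraph-output ideas from the commit message without eBPF, here is a minimal in-process sketch. It is an analogy only, not the code in python-stack.bpf.c: it walks frames through `sys._getframe()` and `f_back` inside the interpreter instead of reading CPython structs from the kernel. The helper names `walk_stack`, `fold`, `inner`, and `outer` are hypothetical, and the "folded" format (frames joined by `;`, followed by a sample count) is assumed to be what "flamegraph-compatible" refers to.

```python
import sys
from collections import Counter


def walk_stack(frame, max_depth=20):
    """Collect function names by following f_back links.

    Walks from the innermost frame outward, stops after max_depth
    frames, and returns the names ordered outermost to innermost.
    """
    names = []
    while frame is not None and len(names) < max_depth:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    names.reverse()
    return names


def fold(samples):
    """Merge identical stacks into 'a;b;c count' lines (folded format)."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]


def inner():
    # Hypothetical sampling point: capture the current call stack.
    return walk_stack(sys._getframe())


def outer():
    return inner()


samples = []
for _ in range(3):
    samples.append(outer())

folded = fold(samples)
for line in folded:
    print(line)
```

Each printed line ends with `outer;inner` and the number of times that stack was seen; the leading frames depend on how the script is launched. Lines in this shape are what tools such as flamegraph.pl take as input.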