mirror of
https://github.com/eunomia-bpf/bpf-developer-tutorial.git
synced 2026-05-07 06:02:47 +08:00
linted markdown documents
@@ -1,4 +1,4 @@
|
|||||||
## eBPF Beginner's Practical Tutorial:
|
# eBPF Beginner's Practical Tutorial
|
||||||
|
|
||||||
## Notes
|
## Notes
|
||||||
|
|
||||||
@@ -8,7 +8,7 @@
|
|||||||
|
|
||||||
origin from:
|
origin from:
|
||||||
|
|
||||||
https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c
|
<https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c>
|
||||||
|
|
||||||
## Compile and Run
|
## Compile and Run
|
||||||
|
|
||||||
@@ -30,10 +30,10 @@ TODO: support union in C
|
|||||||
|
|
||||||
Demonstrations of tcpconnect, the Linux eBPF/bcc version.
|
Demonstrations of tcpconnect, the Linux eBPF/bcc version.
|
||||||
|
|
||||||
|
|
||||||
This tool traces the kernel function performing active TCP connections
|
This tool traces the kernel function performing active TCP connections
|
||||||
(e.g., via a connect() syscall; accept() calls are passive connections). Some example
|
(e.g., via a connect() syscall; accept() calls are passive connections). Some example
|
||||||
output (IP addresses changed to protect the innocent):
|
output (IP addresses changed to protect the innocent):
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect
|
# ./tcpconnect
|
||||||
PID COMM IP SADDR DADDR DPORT
|
PID COMM IP SADDR DADDR DPORT
|
||||||
@@ -43,6 +43,7 @@ PID COMM IP SADDR DADDR DPORT
|
|||||||
1991 telnet 6 ::1 ::1 23
|
1991 telnet 6 ::1 ::1 23
|
||||||
2015 ssh 6 fe80::2000:bff:fe82:3ac fe80::2000:bff:fe82:3ac 22
|
2015 ssh 6 fe80::2000:bff:fe82:3ac fe80::2000:bff:fe82:3ac 22
|
||||||
```
|
```
|
||||||
|
|
||||||
This output shows four connections, one from a "telnet" process, two from
|
This output shows four connections, one from a "telnet" process, two from
|
||||||
"curl", and one from "ssh". The output details show the IP version, source
|
"curl", and one from "ssh". The output details show the IP version, source
|
||||||
address, destination address, and destination port. This traces attempted
|
address, destination address, and destination port. This traces attempted
|
||||||
@@ -52,8 +53,8 @@ The overhead of this tool should be negligible, since it is only tracing the
|
|||||||
kernel functions performing connect. It is not tracing every packet and then
|
kernel functions performing connect. It is not tracing every packet and then
|
||||||
filtering.
|
filtering.
|
||||||
|
|
||||||
|
|
||||||
The -t option prints a timestamp column:
|
The -t option prints a timestamp column:
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect -t
|
# ./tcpconnect -t
|
||||||
TIME(s) PID COMM IP SADDR DADDR DPORT
|
TIME(s) PID COMM IP SADDR DADDR DPORT
|
||||||
@@ -64,6 +65,7 @@ TIME(s) PID COMM IP SADDR DADDR DPORT
|
|||||||
90.928 2482 local_agent 4 10.103.219.236 10.102.64.230 7001
|
90.928 2482 local_agent 4 10.103.219.236 10.102.64.230 7001
|
||||||
90.938 2482 local_agent 4 10.103.219.236 10.115.167.169 7101
|
90.938 2482 local_agent 4 10.103.219.236 10.115.167.169 7101
|
||||||
```
|
```
|
||||||
|
|
||||||
The output shows some periodic connections (or attempts) from a "local_agent"
|
The output shows some periodic connections (or attempts) from a "local_agent"
|
||||||
process to various other addresses. A few connections occur every minute.
|
process to various other addresses. A few connections occur every minute.
|
||||||
|
|
||||||
@@ -74,6 +76,7 @@ in this column. Queries for 127.0.0.1 and ::1 are automatically associated with
|
|||||||
"localhost". If the time between when the DNS response was received and a
|
"localhost". If the time between when the DNS response was received and a
|
||||||
connect call was traced exceeds 100ms, the tool will print the time delta
|
connect call was traced exceeds 100ms, the tool will print the time delta
|
||||||
after the query name. See below for www.domain.com for an example.
|
after the query name. See below for www.domain.com for an example.
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect -d
|
# ./tcpconnect -d
|
||||||
PID COMM IP SADDR DADDR DPORT QUERY
|
PID COMM IP SADDR DADDR DPORT QUERY
|
||||||
@@ -86,6 +89,7 @@ PID COMM IP SADDR DADDR DPORT QUERY
|
|||||||
```
|
```
|
||||||
|
|
||||||
The -L option prints a LPORT column:
|
The -L option prints a LPORT column:
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect -L
|
# ./tcpconnect -L
|
||||||
PID COMM IP SADDR LPORT DADDR DPORT
|
PID COMM IP SADDR LPORT DADDR DPORT
|
||||||
@@ -95,6 +99,7 @@ PID COMM IP SADDR LPORT DADDR DPORT
|
|||||||
```
|
```
|
||||||
|
|
||||||
The -U option prints a UID column:
|
The -U option prints a UID column:
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect -U
|
# ./tcpconnect -U
|
||||||
UID PID COMM IP SADDR DADDR DPORT
|
UID PID COMM IP SADDR DADDR DPORT
|
||||||
@@ -105,14 +110,17 @@ UID PID COMM IP SADDR DADDR DPORT
|
|||||||
```
|
```
|
||||||
|
|
||||||
The -u option filters by UID:
|
The -u option filters by UID:
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect -Uu 1000
|
# ./tcpconnect -Uu 1000
|
||||||
UID PID COMM IP SADDR DADDR DPORT
|
UID PID COMM IP SADDR DADDR DPORT
|
||||||
1000 31338 telnet 6 ::1 ::1 23
|
1000 31338 telnet 6 ::1 ::1 23
|
||||||
1000 31338 telnet 4 127.0.0.1 127.0.0.1 23
|
1000 31338 telnet 4 127.0.0.1 127.0.0.1 23
|
||||||
```
|
```
|
||||||
|
|
||||||
To spot heavy outbound connections quickly one can use the -c flag. It will
|
To spot heavy outbound connections quickly one can use the -c flag. It will
|
||||||
count all active connections per source IP and destination IP/port.
|
count all active connections per source IP and destination IP/port.
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect.py -c
|
# ./tcpconnect.py -c
|
||||||
Tracing connect ... Hit Ctrl-C to end
|
Tracing connect ... Hit Ctrl-C to end
|
||||||
@@ -126,17 +134,18 @@ LADDR RADDR RPORT CONNECTS
|
|||||||
|
|
||||||
The --cgroupmap option filters based on a cgroup set. It is meant to be used
|
The --cgroupmap option filters based on a cgroup set. It is meant to be used
|
||||||
with an externally created map.
|
with an externally created map.
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# ./tcpconnect --cgroupmap /sys/fs/bpf/test01
|
# ./tcpconnect --cgroupmap /sys/fs/bpf/test01
|
||||||
```
|
```
|
||||||
For more details, see docs/special_filtering.md
|
|
||||||
|
|
||||||
|
For more details, see docs/special_filtering.md
|
||||||
|
|
||||||
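## eBPF Beginner's Practical Tutorial: Measuring TCP Connection Latency with a libbpf-bootstrap Program
|
## eBPF Beginner's Practical Tutorial: Measuring TCP Connection Latency with a libbpf-bootstrap Program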
|
||||||
|
|
||||||
## Source
|
## Source
|
||||||
|
|
||||||
Modified from https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c
|
Modified from <https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c>
|
||||||
|
|
||||||
## Compile and Run
|
## Compile and Run
|
||||||
|
|
||||||
@@ -147,7 +156,8 @@ For more details, see docs/special_filtering.md
|
|||||||
- ```sudo ./tcpconnlat```
|
- ```sudo ./tcpconnlat```
|
||||||
|
|
||||||
## Output
|
## Output
|
||||||
```
|
|
||||||
|
```plain
|
||||||
root@yutong-VirtualBox:~/libbpf-bootstrap/examples/c# ./tcpconnlat
|
root@yutong-VirtualBox:~/libbpf-bootstrap/examples/c# ./tcpconnlat
|
||||||
PID COMM IP SADDR DADDR DPORT LAT(ms)
|
PID COMM IP SADDR DADDR DPORT LAT(ms)
|
||||||
222564 wget 4 192.168.88.15 110.242.68.3 80 25.29
|
222564 wget 4 192.168.88.15 110.242.68.3 80 25.29
|
||||||
|
|||||||
@@ -1,4 +1,6 @@
|
|||||||
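## eBPF Beginner's Practical Tutorial: Writing the eBPF Program tcpconnlat to Measure TCP Connection Latency
|
# eBPF Beginner's Practical Tutorial: Writing the eBPF Program tcpconnlat to Measure TCP Connection Latency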
|
||||||
|
|
||||||
|
## Code Explanation
|
||||||
|
|
||||||
### Background
|
### Background
|
||||||
|
|
||||||
@@ -36,10 +38,9 @@ tcp 连接的整个过程如图所示:
|
|||||||
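- the half-open connection queue, also called the SYN queue;
|
- the half-open connection queue, also called the SYN queue;
|
||||||
- the full connection queue, also called the accept queue;
|
- the full connection queue, also called the accept queue;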
|
||||||
|
|
||||||
|
|
||||||
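After the server receives a SYN request from a client, the kernel stores the connection in the half-open (SYN) queue and replies to the client with SYN+ACK. The client then returns an ACK. When the server receives this ACK, the third step of the handshake, the kernel removes the connection from the SYN queue, creates a new fully established connection, and adds it to the accept queue, where it waits until the process calls accept() to take it out.
|
After the server receives a SYN request from a client, the kernel stores the connection in the half-open (SYN) queue and replies to the client with SYN+ACK. The client then returns an ACK. When the server receives this ACK, the third step of the handshake, the kernel removes the connection from the SYN queue, creates a new fully established connection, and adds it to the accept queue, where it waits until the process calls accept() to take it out.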
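
The queue handling described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `listener_model`, its fields, the `model_on_*` functions, and the single shared `backlog` limit are all hypothetical names invented for this sketch, and the kernel's real data structures and overflow behavior are considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a listening socket's two queues.
 * It only counts entries; it is not how the kernel stores them. */
struct listener_model {
    size_t syn_queue_len;     /* half-open connections (SYN received) */
    size_t accept_queue_len;  /* established, waiting for accept() */
    size_t backlog;           /* assumed capacity of each queue */
};

/* Handshake step 1: a SYN arrives and the connection enters the SYN queue. */
static bool model_on_syn(struct listener_model *l)
{
    if (l->syn_queue_len >= l->backlog)
        return false;         /* queue full: this model drops the SYN */
    l->syn_queue_len++;
    return true;
}

/* Handshake step 3: the final ACK arrives; move the connection from the
 * SYN queue to the accept queue. */
static bool model_on_ack(struct listener_model *l)
{
    if (l->syn_queue_len == 0 || l->accept_queue_len >= l->backlog)
        return false;
    l->syn_queue_len--;
    l->accept_queue_len++;
    return true;
}

/* The application calls accept(): take one connection off the accept queue. */
static bool model_on_accept(struct listener_model *l)
{
    if (l->accept_queue_len == 0)
        return false;         /* nothing to accept yet */
    l->accept_queue_len--;
    return true;
}
```

Calling `model_on_syn`, `model_on_ack`, and `model_on_accept` in that order on a zero-initialized model with a non-zero `backlog` walks one connection through both queues.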
|
||||||
|
|
||||||
Our eBPF code is implemented in https://github.com/yunwei37/Eunomia/blob/master/bpftools/tcpconnlat/tcpconnlat.bpf.c :
|
Our eBPF code is implemented in <https://github.com/yunwei37/Eunomia/blob/master/bpftools/tcpconnlat/tcpconnlat.bpf.c>:
|
||||||
|
|
||||||
It mainly uses probe points such as trace_tcp_rcv_state_process and kprobe/tcp_v4_connect:
|
It mainly uses probe points such as trace_tcp_rcv_state_process and kprobe/tcp_v4_connect:
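
The idea behind pairing a connect probe with a state-processing probe can be sketched in plain userspace C: remember a start timestamp per socket when the connect begins, then compute the elapsed time when the handshake reply is processed. This is a simulation under stated assumptions, not the BPF program itself. The fixed-size `start_table` stands in for a BPF hash map, and `record_connect_start`, `finish_connect`, and `MAX_TRACKED` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64   /* assumed table size for this sketch */

/* One in-flight connect(): socket identity plus the time it started. */
struct start_entry {
    const void *sk;      /* socket address used as the key; NULL = free slot */
    uint64_t start_ns;
};

static struct start_entry start_table[MAX_TRACKED];

/* Linear search for the slot holding `sk` (or a free slot when sk is NULL). */
static struct start_entry *find_slot(const void *sk)
{
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (start_table[i].sk == sk)
            return &start_table[i];
    return NULL;
}

/* Connect probe: remember when this socket began connecting.
 * Returns 0 on success, -1 if the key is invalid or the table is full. */
static int record_connect_start(const void *sk, uint64_t now_ns)
{
    struct start_entry *e;

    if (sk == NULL)
        return -1;
    e = find_slot(sk);
    if (e == NULL)
        e = find_slot(NULL);  /* no entry yet: take a free slot */
    if (e == NULL)
        return -1;
    e->sk = sk;
    e->start_ns = now_ns;
    return 0;
}

/* State-processing probe: return the connect latency in microseconds and
 * forget the socket, or -1 if the socket was never recorded. */
static int64_t finish_connect(const void *sk, uint64_t now_ns)
{
    struct start_entry *e;
    int64_t delta_us;

    if (sk == NULL)
        return -1;
    e = find_slot(sk);
    if (e == NULL)
        return -1;
    delta_us = (int64_t)(now_ns - e->start_ns) / 1000;
    e->sk = NULL;             /* free the slot, like deleting a map entry */
    return delta_us;
}
```

With a start at 1,000,000 ns and a finish at 26,290,000 ns, `finish_connect` returns 25290 microseconds, i.e. 25.29 ms.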
|
||||||
|
|
||||||
@@ -162,7 +163,7 @@ PID COMM IP SRC DEST PORT LAT(ms) CONATINER
|
|||||||
|
|
||||||
Use the following query to see a chart of the latency statistics:
|
Use the following query to see a chart of the latency statistics:
|
||||||
|
|
||||||
```
|
```plain
|
||||||
rate(eunomia_observed_tcpconnlat_v4_histogram_sum[5m])
|
rate(eunomia_observed_tcpconnlat_v4_histogram_sum[5m])
|
||||||
/
|
/
|
||||||
rate(eunomia_observed_tcpconnlat_v4_histogram_count[5m])
|
rate(eunomia_observed_tcpconnlat_v4_histogram_count[5m])
|
||||||
@@ -178,9 +179,9 @@ PID COMM IP SRC DEST PORT LAT(ms) CONATINER
|
|||||||
|
|
||||||
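> `Eunomia` is a lightweight, high-performance cloud-native monitoring tool based on eBPF and developed in C/C++. It aims to help users understand container behavior and monitor suspicious container security events, and strives to provide a lightweight open-source monitoring solution that covers the full container lifecycle. It uses `Linux` `eBPF` technology to trace your system and applications at runtime, and analyzes the collected events to detect suspicious behavior patterns. It currently includes performance analysis, container cluster network visualization*, container security alerting, one-click deployment, and persistent storage monitoring, and provides a variety of eBPF tracing points. Its core exporter/command-line tool needs a binary of only about 4MB at minimum and can start on any supported Linux kernel.
|
> `Eunomia` is a lightweight, high-performance cloud-native monitoring tool based on eBPF and developed in C/C++. It aims to help users understand container behavior and monitor suspicious container security events, and strives to provide a lightweight open-source monitoring solution that covers the full container lifecycle. It uses `Linux` `eBPF` technology to trace your system and applications at runtime, and analyzes the collected events to detect suspicious behavior patterns. It currently includes performance analysis, container cluster network visualization*, container security alerting, one-click deployment, and persistent storage monitoring, and provides a variety of eBPF tracing points. Its core exporter/command-line tool needs a binary of only about 4MB at minimum and can start on any supported Linux kernel.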
|
||||||
|
|
||||||
Project URL: https://github.com/yunwei37/Eunomia
|
Project URL: <https://github.com/yunwei37/Eunomia>
|
||||||
|
|
||||||
### References
|
### References
|
||||||
|
|
||||||
1. http://kerneltravel.net/blog/2020/tcpconnlat/
|
1. <http://kerneltravel.net/blog/2020/tcpconnlat/>
|
||||||
2. https://network.51cto.com/article/640631.html
|
2. <https://network.51cto.com/article/640631.html>
|
||||||
|
|||||||
@@ -1,10 +1,10 @@
|
|||||||
## eBPF Beginner's Practical Tutorial:
|
# eBPF Beginner's Practical Tutorial
|
||||||
|
|
||||||
## origin
|
## origin
|
||||||
|
|
||||||
origin from:
|
origin from:
|
||||||
|
|
||||||
https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c
|
<https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpconnlat.bpf.c>
|
||||||
|
|
||||||
## Compile and Run
|
## Compile and Run
|
||||||
|
|
||||||
@@ -13,6 +13,7 @@ Compile:
|
|||||||
```shell
|
```shell
|
||||||
docker run -it -v `pwd`/:/src/ yunwei37/ebpm:latest
|
docker run -it -v `pwd`/:/src/ yunwei37/ebpm:latest
|
||||||
```
|
```
|
||||||
|
|
||||||
Run:
|
Run:
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
@@ -23,9 +24,9 @@ sudo ./ecli run package.json
|
|||||||
|
|
||||||
Demonstrations of tcpstates, the Linux BPF/bcc version.
|
Demonstrations of tcpstates, the Linux BPF/bcc version.
|
||||||
|
|
||||||
|
|
||||||
tcpstates prints TCP state change information, including the duration in each
|
tcpstates prints TCP state change information, including the duration in each
|
||||||
state as milliseconds. For example, a single TCP session:
|
state as milliseconds. For example, a single TCP session:
|
||||||
|
|
||||||
```console
|
```console
|
||||||
# tcpstates
|
# tcpstates
|
||||||
SKADDR C-PID C-COMM LADDR LPORT RADDR RPORT OLDSTATE -> NEWSTATE MS
|
SKADDR C-PID C-COMM LADDR LPORT RADDR RPORT OLDSTATE -> NEWSTATE MS
|
||||||
@@ -36,6 +37,7 @@ ffff9fd7e8192000 0 swapper/5 100.66.100.185 63446 52.33.159.26 80 FI
|
|||||||
ffff9fd7e8192000 0 swapper/5 100.66.100.185 63446 52.33.159.26 80 FIN_WAIT2 -> CLOSE 0.006
|
ffff9fd7e8192000 0 swapper/5 100.66.100.185 63446 52.33.159.26 80 FIN_WAIT2 -> CLOSE 0.006
|
||||||
^C
|
^C
|
||||||
```
|
```
|
||||||
|
|
||||||
This showed that the most time was spent in the ESTABLISHED state (which then
|
This showed that the most time was spent in the ESTABLISHED state (which then
|
||||||
transitioned to FIN_WAIT1), which was 176.042 milliseconds.
|
transitioned to FIN_WAIT1), which was 176.042 milliseconds.
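
A per-state duration like the one above can be derived by remembering when the socket last changed state and subtracting that from the current time on the next change. The following is a minimal userspace sketch of that bookkeeping. `state_clock` and `on_state_change` are hypothetical names, and the sketch reports microseconds, which is an assumption of this example rather than something taken from the tool's output format.

```c
#include <stdint.h>

/* Hypothetical per-socket record: when the socket entered its current state. */
struct state_clock {
    uint64_t last_change_ns;  /* 0 means no transition has been seen yet */
};

/* Called on every state change. Returns the microseconds spent in the state
 * being left, or 0 for the first transition seen on this socket. */
static uint64_t on_state_change(struct state_clock *c, uint64_t now_ns)
{
    uint64_t delta_us = 0;

    if (c->last_change_ns != 0)
        delta_us = (now_ns - c->last_change_ns) / 1000;
    c->last_change_ns = now_ns;  /* the new state starts now */
    return delta_us;
}
```

A socket that enters ESTABLISHED at 5,000,000 ns and leaves it at 181,042,000 ns yields 176042 microseconds, which prints as 176.042 ms.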
|
||||||
|
|
||||||
@@ -49,7 +51,7 @@ process context. If that's not the case, they may show kernel details.
|
|||||||
|
|
||||||
## Source
|
## Source
|
||||||
|
|
||||||
Modified from https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpstates.bpf.c
|
Modified from <https://github.com/iovisor/bcc/blob/master/libbpf-tools/tcpstates.bpf.c>
|
||||||
|
|
||||||
## Compile and Run
|
## Compile and Run
|
||||||
|
|
||||||
@@ -60,7 +62,8 @@ process context. If that's not the case, they may show kernel details.
|
|||||||
- ```sudo ./tcpstates```
|
- ```sudo ./tcpstates```
|
||||||
|
|
||||||
## Output
|
## Output
|
||||||
```
|
|
||||||
|
```plain
|
||||||
root@yutong-VirtualBox:~/libbpf-bootstrap/examples/c# ./tcpstates
|
root@yutong-VirtualBox:~/libbpf-bootstrap/examples/c# ./tcpstates
|
||||||
SKADDR PID COMM LADDR LPORT RADDR RPORT OLDSTATE -> NEWSTATE MS
|
SKADDR PID COMM LADDR LPORT RADDR RPORT OLDSTATE -> NEWSTATE MS
|
||||||
ffff9bf61bb62bc0 164978 node 192.168.88.15 0 52.178.17.2 443 CLOSE -> SYN_SENT 0.000
|
ffff9bf61bb62bc0 164978 node 192.168.88.15 0 52.178.17.2 443 CLOSE -> SYN_SENT 0.000
|
||||||
@@ -87,8 +90,6 @@ int handle_set_state(struct trace_event_raw_inet_sock_set_state *ctx)
|
|||||||
|
|
||||||
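Attach an eBPF tracing function at the point where a socket changes state.
|
Attach an eBPF tracing function at the point where a socket changes state.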
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
```c
|
```c
|
||||||
if (ctx->protocol != IPPROTO_TCP)
|
if (ctx->protocol != IPPROTO_TCP)
|
||||||
return 0;
|
return 0;
|
||||||
@@ -105,8 +106,6 @@ int handle_set_state(struct trace_event_raw_inet_sock_set_state *ctx)
|
|||||||
|
|
||||||
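Once the tracing function is called, it first checks whether the socket that is changing state matches our filter conditions; if it does not, nothing is recorded.
|
Once the tracing function is called, it first checks whether the socket that is changing state matches our filter conditions; if it does not, nothing is recorded.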
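
A filter check of this kind might look like the following userspace sketch. Everything here is illustrative: `trace_filter`, `state_event_view`, and `should_record` are hypothetical names, and the choice of filterable fields (address family, source port, destination port) is an assumption of this example, not a description of the program's actual options.

```c
#include <stdbool.h>
#include <stdint.h>
#include <netinet/in.h>   /* IPPROTO_TCP, IPPROTO_UDP */
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Hypothetical user-selected filter; a zero field means "match anything". */
struct trace_filter {
    int family;           /* 0, AF_INET, or AF_INET6 */
    uint16_t sport;       /* source port in host byte order */
    uint16_t dport;       /* destination port in host byte order */
};

/* Hypothetical subset of the fields carried by a state-change event. */
struct state_event_view {
    int protocol;
    int family;
    uint16_t sport;
    uint16_t dport;
};

/* Return true only if the event is TCP and passes every active filter. */
static bool should_record(const struct trace_filter *f,
                          const struct state_event_view *ev)
{
    if (ev->protocol != IPPROTO_TCP)
        return false;     /* non-TCP sockets are never recorded */
    if (f->family != 0 && f->family != ev->family)
        return false;
    if (f->sport != 0 && f->sport != ev->sport)
        return false;
    if (f->dport != 0 && f->dport != ev->dport)
        return false;
    return true;
}
```

Rejecting early keeps the cost of uninteresting events low, because no timestamps are looked up and no event is built for them.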
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
```c
|
```c
|
||||||
tsp = bpf_map_lookup_elem(&timestamps, &sk);
|
tsp = bpf_map_lookup_elem(&timestamps, &sk);
|
||||||
ts = bpf_ktime_get_ns();
|
ts = bpf_ktime_get_ns();
|
||||||
@@ -139,16 +138,12 @@ int handle_set_state(struct trace_event_raw_inet_sock_set_state *ctx)
|
|||||||
|
|
||||||
- This uses the CO-RE support in ```libbpf```.
|
- This uses the CO-RE support in ```libbpf```.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
```c
|
```c
|
||||||
bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event, sizeof(event));
|
bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event, sizeof(event));
|
||||||
```
|
```
|
||||||
|
|
||||||
Send the event structure to the user-space program.
|
Send the event structure to the user-space program.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
```c
|
```c
|
||||||
if (ctx->newstate == TCP_CLOSE)
|
if (ctx->newstate == TCP_CLOSE)
|
||||||
bpf_map_delete_elem(&timestamps, &sk);
|
bpf_map_delete_elem(&timestamps, &sk);
|
||||||
|
|||||||