Files
bpf-developer-tutorial/29-sockops/index.html

433 lines
34 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<!DOCTYPE HTML>
<html lang="en" class="light" dir="ltr">
<head>
<!-- Book generated using mdBook -->
<meta charset="UTF-8">
<title>使用 sockops 加速网络请求转发 - bpf-developer-tutorial</title>
<!-- Custom HTML head -->
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#ffffff">
<link rel="icon" href="../favicon.svg">
<link rel="shortcut icon" href="../favicon.png">
<link rel="stylesheet" href="../css/variables.css">
<link rel="stylesheet" href="../css/general.css">
<link rel="stylesheet" href="../css/chrome.css">
<link rel="stylesheet" href="../css/print.css" media="print">
<!-- Fonts -->
<link rel="stylesheet" href="../FontAwesome/css/font-awesome.css">
<link rel="stylesheet" href="../fonts/fonts.css">
<!-- Highlight.js Stylesheets -->
<link rel="stylesheet" href="../highlight.css">
<link rel="stylesheet" href="../tomorrow-night.css">
<link rel="stylesheet" href="../ayu-highlight.css">
<!-- Custom theme stylesheets -->
</head>
<body class="sidebar-visible no-js">
<div id="body-container">
<!-- Provide site root to javascript -->
<script>
var path_to_root = "../";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "navy" : "light";
</script>
<!-- Work around some values being stored in localStorage wrapped in quotes -->
<script>
try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }
</script>
<!-- Set the theme before any content is loaded, prevents flash -->
<script>
var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
var html = document.querySelector('html');
html.classList.remove('light')
html.classList.add(theme);
var body = document.querySelector('body');
body.classList.remove('no-js')
body.classList.add('js');
</script>
<input type="checkbox" id="sidebar-toggle-anchor" class="hidden">
<!-- Hide / unhide sidebar before it is displayed -->
<script>
var body = document.querySelector('body');
var sidebar = null;
var sidebar_toggle = document.getElementById("sidebar-toggle-anchor");
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
} else {
sidebar = 'hidden';
}
sidebar_toggle.checked = sidebar === 'visible';
body.classList.remove('sidebar-visible');
body.classList.add("sidebar-" + sidebar);
</script>
<nav id="sidebar" class="sidebar" aria-label="Table of contents">
<div class="sidebar-scrollbox">
<ol class="chapter"><li class="chapter-item expanded affix "><a href="../https://github.com/eunomia-bpf/bpf-developer-tutorial.html">https://github.com/eunomia-bpf/bpf-developer-tutorial</a></li><li class="chapter-item expanded affix "><li class="part-title">入门文档</li><li class="chapter-item expanded "><a href="../0-introduce/index.html"><strong aria-hidden="true">1.</strong> lesson 0-introduce</a></li><li class="chapter-item expanded "><a href="../1-helloworld/index.html"><strong aria-hidden="true">2.</strong> lesson 1-helloworld</a></li><li class="chapter-item expanded "><a href="../2-kprobe-unlink/index.html"><strong aria-hidden="true">3.</strong> lesson 2-kprobe-unlink</a></li><li class="chapter-item expanded "><a href="../3-fentry-unlink/index.html"><strong aria-hidden="true">4.</strong> lesson 3-fentry-unlink</a></li><li class="chapter-item expanded "><a href="../4-opensnoop/index.html"><strong aria-hidden="true">5.</strong> lesson 4-opensnoop</a></li><li class="chapter-item expanded "><a href="../5-uprobe-bashreadline/index.html"><strong aria-hidden="true">6.</strong> lesson 5-uprobe-bashreadline</a></li><li class="chapter-item expanded "><a href="../6-sigsnoop/index.html"><strong aria-hidden="true">7.</strong> lesson 6-sigsnoop</a></li><li class="chapter-item expanded "><a href="../7-execsnoop/index.html"><strong aria-hidden="true">8.</strong> lesson 7-execsnoop</a></li><li class="chapter-item expanded "><a href="../8-exitsnoop/index.html"><strong aria-hidden="true">9.</strong> lesson 8-execsnoop</a></li><li class="chapter-item expanded "><a href="../9-runqlat/index.html"><strong aria-hidden="true">10.</strong> lesson 9-runqlat</a></li><li class="chapter-item expanded "><a href="../10-hardirqs/index.html"><strong aria-hidden="true">11.</strong> lesson 10-hardirqs</a></li><li class="chapter-item expanded affix "><li class="part-title">进阶文档和示例</li><li class="chapter-item expanded "><a href="../11-bootstrap/index.html"><strong aria-hidden="true">12.</strong> lesson 11-bootstrap</a></li><li class="chapter-item expanded "><a href="../12-profile/index.html"><strong aria-hidden="true">13.</strong> lesson 12-profile</a></li><li class="chapter-item expanded "><a href="../13-tcpconnlat/index.html"><strong aria-hidden="true">14.</strong> lesson 13-tcpconnlat</a></li><li class="chapter-item expanded "><a href="../14-tcpstates/index.html"><strong aria-hidden="true">15.</strong> lesson 14-tcpstates</a></li><li class="chapter-item expanded "><a href="../15-javagc/index.html"><strong aria-hidden="true">16.</strong> lesson 15-javagc</a></li><li class="chapter-item expanded "><a href="../16-memleak/index.html"><strong aria-hidden="true">17.</strong> lesson 16-memleak</a></li><li class="chapter-item expanded "><a href="../17-biopattern/index.html"><strong aria-hidden="true">18.</strong> lesson 17-biopattern</a></li><li class="chapter-item expanded "><a href="../18-further-reading/index.html"><strong aria-hidden="true">19.</strong> lesson 18-further-reading</a></li><li class="chapter-item expanded "><a href="../19-lsm-connect/index.html"><strong aria-hidden="true">20.</strong> lesson 19-lsm-connect</a></li><li class="chapter-item expanded "><a href="../20-tc/index.html"><strong aria-hidden="true">21.</strong> lesson 20-tc</a></li><li class="chapter-item expanded "><a href="../21-xdp/index.html"><strong aria-hidden="true">22.</strong> lesson 21-xdp</a></li><li class="chapter-item expanded affix "><li class="part-title">高级主题</li><li class="chapter-item expanded "><a href="../22-android/index.html"><strong aria-hidden="true">23.</strong> 在 Android 上使用 eBPF 程序</a></li><li class="chapter-item expanded "><a href="../30-sslsniff/index.html"><strong aria-hidden="true">24.</strong> 使用 uprobe 捕获多种库的 SSL/TLS 明文数据</a></li><li class="chapter-item expanded "><a href="../23-http/index.html"><strong aria-hidden="true">25.</strong> 使用 eBPF socket filter 或 syscall trace 追踪 HTTP 请求和其他七层协议</a></li><li class="chapter-item expanded "><a href="../29-sockops/index.html" class="active"><strong aria-hidden="true">26.</strong> 使用 sockops 加速网络请求转发</a></li><li class="chapter-item expanded "><a href="../24-hide/index.html"><strong aria-hidden="true">27.</strong> 使用 eBPF 隐藏进程或文件信息</a></li><li class="chapter-item expanded "><a href="../25-signal/index.html"><strong aria-hidden="true">28.</strong> 使用 bpf_send_signal 发送信号终止进程</a></li><li class="chapter-item expanded "><a href="../26-sudo/index.html"><strong aria-hidden="true">29.</strong> 使用 eBPF 添加 sudo 用户</a></li><li class="chapter-item expanded "><a href="../27-replace/index.html"><strong aria-hidden="true">30.</strong> 使用 eBPF 替换任意程序读取或写入的文本</a></li><li class="chapter-item expanded "><a href="../28-detach/index.html"><strong aria-hidden="true">31.</strong> BPF 的生命周期:使用 Detached 模式在用户态应用退出后持续运行 eBPF 程序</a></li><li class="chapter-item expanded "><a href="../18-further-reading/ebpf-security.zh.html"><strong aria-hidden="true">32.</strong> eBPF 运行时的安全性与面临的挑战</a></li><li class="chapter-item expanded "><a href="../34-syscall/index.html"><strong aria-hidden="true">33.</strong> 使用 eBPF 修改系统调用参数</a></li><li class="chapter-item expanded "><a href="../35-user-ringbuf/index.html"><strong aria-hidden="true">34.</strong> eBPF开发实践使用 user ring buffer 向内核异步发送信息</a></li><li class="chapter-item expanded "><a href="../36-userspace-ebpf/index.html"><strong aria-hidden="true">35.</strong> 用户空间 eBPF 运行时:深度解析与应用实践</a></li><li class="chapter-item expanded "><a href="../37-uprobe-rust/index.html"><strong aria-hidden="true">36.</strong> 使用 uprobe 追踪 Rust 应用程序</a></li><li class="chapter-item expanded "><a href="../38-btf-uprobe/index.html"><strong aria-hidden="true">37.</strong> 借助 eBPF 和 BTF让用户态也能一次编译、到处运行</a></li><li class="chapter-item expanded affix "><li class="part-title">bcc 和 bpftrace 教程与文档</li><li class="chapter-item expanded "><a href="../bcc-documents/kernel-versions.html"><strong aria-hidden="true">38.</strong> BPF Features by Linux Kernel Version</a></li><li class="chapter-item expanded "><a href="../bcc-documents/kernel_config.html"><strong aria-hidden="true">39.</strong> Kernel Configuration for BPF Features</a></li><li class="chapter-item expanded "><a href="../bcc-documents/reference_guide.html"><strong aria-hidden="true">40.</strong> bcc Reference Guide</a></li><li class="chapter-item expanded "><a href="../bcc-documents/special_filtering.html"><strong aria-hidden="true">41.</strong> Special Filtering</a></li><li class="chapter-item expanded "><a href="../bcc-documents/tutorial.html"><strong aria-hidden="true">42.</strong> bcc Tutorial</a></li><li class="chapter-item expanded "><a href="../bcc-documents/tutorial_bcc_python_developer.html"><strong aria-hidden="true">43.</strong> bcc Python Developer Tutorial</a></li><li class="chapter-item expanded "><a href="../bpftrace-tutorial/index.html"><strong aria-hidden="true">44.</strong> bpftrace Tutorial</a></li></ol>
</div>
<div id="sidebar-resize-handle" class="sidebar-resize-handle">
<div class="sidebar-resize-indicator"></div>
</div>
</nav>
<!-- Track and set sidebar scroll position -->
<script>
var sidebarScrollbox = document.querySelector('#sidebar .sidebar-scrollbox');
sidebarScrollbox.addEventListener('click', function(e) {
if (e.target.tagName === 'A') {
sessionStorage.setItem('sidebar-scroll', sidebarScrollbox.scrollTop);
}
}, { passive: true });
var sidebarScrollTop = sessionStorage.getItem('sidebar-scroll');
sessionStorage.removeItem('sidebar-scroll');
if (sidebarScrollTop) {
// preserve sidebar scroll position when navigating via links within sidebar
sidebarScrollbox.scrollTop = sidebarScrollTop;
} else {
// scroll sidebar to current active section when navigating via "next/previous chapter" buttons
var activeSection = document.querySelector('#sidebar .active');
if (activeSection) {
activeSection.scrollIntoView({ block: 'center' });
}
}
</script>
<div id="page-wrapper" class="page-wrapper">
<div class="page">
<div id="menu-bar-hover-placeholder"></div>
<div id="menu-bar" class="menu-bar sticky">
<div class="left-buttons">
<label id="sidebar-toggle" class="icon-button" for="sidebar-toggle-anchor" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
<i class="fa fa-bars"></i>
</label>
<button id="theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="theme-list">
<i class="fa fa-paint-brush"></i>
</button>
<ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
<li role="none"><button role="menuitem" class="theme" id="light">Light</button></li>
<li role="none"><button role="menuitem" class="theme" id="rust">Rust</button></li>
<li role="none"><button role="menuitem" class="theme" id="coal">Coal</button></li>
<li role="none"><button role="menuitem" class="theme" id="navy">Navy</button></li>
<li role="none"><button role="menuitem" class="theme" id="ayu">Ayu</button></li>
</ul>
<button id="search-toggle" class="icon-button" type="button" title="Search. (Shortkey: s)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="S" aria-controls="searchbar">
<i class="fa fa-search"></i>
</button>
</div>
<h1 class="menu-title">bpf-developer-tutorial</h1>
<div class="right-buttons">
<a href="../print.html" title="Print this book" aria-label="Print this book">
<i id="print-button" class="fa fa-print"></i>
</a>
</div>
</div>
<div id="search-wrapper" class="hidden">
<form id="searchbar-outer" class="searchbar-outer">
<input type="search" id="searchbar" name="searchbar" placeholder="Search this book ..." aria-controls="searchresults-outer" aria-describedby="searchresults-header">
</form>
<div id="searchresults-outer" class="searchresults-outer hidden">
<div id="searchresults-header" class="searchresults-header"></div>
<ul id="searchresults">
</ul>
</div>
</div>
<!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
<script>
document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});
</script>
<div id="content" class="content">
<main>
<h1 id="ebpf-开发实践使用-sockops-加速网络请求转发"><a class="header" href="#ebpf-开发实践使用-sockops-加速网络请求转发">eBPF 开发实践:使用 sockops 加速网络请求转发</a></h1>
<p>eBPF扩展的伯克利数据包过滤器是 Linux 内核中的一个强大功能,可以在无需更改内核源代码或重启内核的情况下,运行、加载和更新用户定义的代码。这种功能让 eBPF 在网络和系统性能分析、数据包过滤、安全策略等方面有了广泛的应用。</p>
<p>本教程将关注 eBPF 在网络领域的应用,特别是如何使用 sockops 类型的 eBPF 程序来加速本地网络请求的转发。这种应用通常在使用软件负载均衡器进行请求转发的场景中很有价值,比如使用 Nginx 或 HAProxy 之类的工具。</p>
<p>在许多工作负载中如微服务架构下的服务间通信通过本机进行的网络请求的性能开销可能会对整个应用的性能产生显著影响。由于这些请求必须经过本机的网络栈其处理性能可能会成为瓶颈尤其是在高并发的场景下。为了解决这个问题sockops 类型的 eBPF 程序可以用于加速本地的请求转发。sockops 程序可以在内核空间管理套接字,实现在本机上的套接字之间直接转发数据包,从而降低了在 TCP/IP 栈中进行数据包转发所需的 CPU 时间。</p>
<p>本教程将会通过一个具体的示例演示如何使用 sockops 类型的 eBPF 程序来加速网络请求的转发。为了让你更好地理解如何使用 sockops 程序,我们将逐步介绍示例程序的代码,并讨论每个部分的工作原理。完整的源代码和工程可以在 <a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/29-sockops">https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/29-sockops</a> 中找到。</p>
<h2 id="利用-ebpf-的-sockops-进行性能优化"><a class="header" href="#利用-ebpf-的-sockops-进行性能优化">利用 eBPF 的 sockops 进行性能优化</a></h2>
<p>网络连接本质上是 socket 之间的通讯eBPF 提供了一个 <a href="https://man7.org/linux/man-pages/man7/bpf-helpers.7.html">bpf_msg_redirect_hash</a> 函数,用来将应用发出的包直接转发到对端的 socket可以极大地加速包在内核中的处理流程。</p>
<p>这里 sock_map 是记录 socket 规则的关键部分,即根据当前的数据包信息,从 sock_map 中挑选一个存在的 socket 连接来转发请求。所以需要先在 sockops 的 hook 处或者其它地方,将 socket 信息保存到 sock_map并提供一个规则 (一般为四元组) 根据 key 查找到 socket。</p>
<p>Merbridge 项目就是这样实现了用 eBPF 代替 iptables 为 Istio 进行加速。在使用 Merbridge (eBPF) 优化之后,出入口流量会直接跳过很多内核模块,明显提高性能,如下图所示:</p>
<p><img src="merbridge.png" alt="merbridge" /></p>
<h2 id="示例程序"><a class="header" href="#示例程序">示例程序</a></h2>
<p>此示例程序从发送者的套接字(出口)重定向流量至接收者的套接字(入口),<strong>跳过 TCP/IP 内核网络栈</strong>。在这个示例中,我们假定发送者和接收者都在<strong>同一台</strong>机器上运行。这个示例程序有两个部分,它们共享一个 map 定义:</p>
<p>bpf_sockmap.h</p>
<pre><code class="language-c">#include "vmlinux.h"
#include &lt;bpf/bpf_endian.h&gt;
#include &lt;bpf/bpf_helpers.h&gt;
#define LOCALHOST_IPV4 16777343
struct sock_key {
__u32 sip;
__u32 dip;
__u32 sport;
__u32 dport;
__u32 family;
};
struct {
__uint(type, BPF_MAP_TYPE_SOCKHASH);
__uint(max_entries, 65535);
__type(key, struct sock_key);
__type(value, int);
} sock_ops_map SEC(".maps");
</code></pre>
<p>这个示例程序中的 BPF 程序被分为两个部分 <code>bpf_redirect.bpf.c</code><code>bpf_contrack.bpf.c</code></p>
<ul>
<li>
<p><code>bpf_contrack.bpf.c</code> 中的 BPF 代码定义了一个套接字操作(<code>sockops</code>)程序,它的功能主要是当本机(使用 localhost上的任意 TCP 连接被创建时,根据这个新连接的五元组(源地址,目标地址,源端口,目标端口,协议),在 <code>sock_ops_map</code> 这个 BPF MAP 中创建一个条目。这个 BPF MAP 被定义为 <code>BPF_MAP_TYPE_SOCKHASH</code> 类型,可以存储套接字和对应的五元组。这样使得每当本地 TCP 连接被创建的时候,这个连接的五元组信息也能够在 BPF MAP 中找到。</p>
</li>
<li>
<p><code>bpf_redirect.bpf.c</code> 中的 BPF 代码定义了一个网络消息 (sk_msg) 处理程序,当本地套接字上有消息到达时会调用这个程序。然后这个 sk_msg 程序检查该消息是否来自本地地址,如果是,根据获取的五元组信息(源地址,目标地址,源端口,目标端口,协议)在 <code>sock_ops_map</code> 查找相应的套接字,并将该消息重定向到在 <code>sock_ops_map</code> 中找到的套接字上,这样就实现了绕过内核网络栈。</p>
</li>
</ul>
<p>举个例子,我们假设有两个进程在本地运行,进程 A 绑定在 8000 端口上,进程 B 绑定在 9000 端口上,进程 A 向进程 B 发送消息。</p>
<ol>
<li>
<p>当进程 A 首次和进程 B 建立 TCP 连接时,触发 <code>bpf_contrack.bpf.c</code> 中的 <code>sockops</code> 程序,这个程序将五元组 <code>{127.0.0.1, 127.0.0.1, 8000, 9000, TCP}</code> 存入 <code>sock_ops_map</code>,值为进程 A 的套接字。</p>
</li>
<li>
<p>当进程 A 发送消息时,触发 <code>bpf_redirect.bpf.c</code> 中的 <code>sk_msg</code> 程序,然后 <code>sk_msg</code> 程序将消息从进程 A 的套接字重定向到 <code>sock_ops_map</code> 中存储的套接字(进程 A 的套接字)上,因此,消息被直接从进程 A 输送到进程 B绕过了内核网络栈。</p>
</li>
</ol>
<p>这个示例程序就是通过 BPF 实现了在本地通信时,快速将消息从发送者的套接字重定向到接收者的套接字,从而绕过了内核网络栈,以提高传输效率。</p>
<p>bpf_redirect.bpf.c</p>
<pre><code class="language-c">#include "bpf_sockmap.h"
char LICENSE[] SEC("license") = "Dual BSD/GPL";
SEC("sk_msg")
int bpf_redir(struct sk_msg_md *msg)
{
if(msg-&gt;remote_ip4 != LOCALHOST_IPV4 || msg-&gt;local_ip4!= LOCALHOST_IPV4)
return SK_PASS;
struct sock_key key = {
.sip = msg-&gt;remote_ip4,
.dip = msg-&gt;local_ip4,
.dport = bpf_htonl(msg-&gt;local_port), /* convert to network byte order */
.sport = msg-&gt;remote_port,
.family = msg-&gt;family,
};
return bpf_msg_redirect_hash(msg, &amp;sock_ops_map, &amp;key, BPF_F_INGRESS);
}
</code></pre>
<p>bpf_contrack.bpf.c</p>
<pre><code class="language-c">#include "bpf_sockmap.h"
char LICENSE[] SEC("license") = "Dual BSD/GPL";
SEC("sockops")
int bpf_sockops_handler(struct bpf_sock_ops *skops){
u32 family, op;
family = skops-&gt;family;
op = skops-&gt;op;
if (op != BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
&amp;&amp; op != BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB) {
return BPF_OK;
}
if(skops-&gt;remote_ip4 != LOCALHOST_IPV4 || skops-&gt;local_ip4!= LOCALHOST_IPV4) {
return BPF_OK;
}
struct sock_key key = {
.dip = skops-&gt;remote_ip4,
.sip = skops-&gt;local_ip4,
.sport = bpf_htonl(skops-&gt;local_port), /* convert to network byte order */
.dport = skops-&gt;remote_port,
.family = skops-&gt;family,
};
bpf_printk("&gt;&gt;&gt; new connection: OP:%d, PORT:%d --&gt; %d\n", op, bpf_ntohl(key.sport), bpf_ntohl(key.dport));
bpf_sock_hash_update(skops, &amp;sock_ops_map, &amp;key, BPF_NOEXIST);
return BPF_OK;
}
</code></pre>
<h3 id="编译-ebpf-程序"><a class="header" href="#编译-ebpf-程序">编译 eBPF 程序</a></h3>
<p>这里我们使用 libbpf 编译这个 eBPF 程序。完整的源代码和工程可以在 <a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/29-sockops">https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/29-sockops</a> 中找到。关于如何安装依赖,请参考:<a href="https://eunomia.dev/tutorials/11-bootstrap/">https://eunomia.dev/tutorials/11-bootstrap/</a></p>
<pre><code class="language-shell"># Compile the bpf program with libbpf
make
</code></pre>
<h3 id="加载-ebpf-程序"><a class="header" href="#加载-ebpf-程序">加载 eBPF 程序</a></h3>
<p>我们编写了一个脚本来加载 eBPF 程序,它会自动加载两个 eBPF 程序并创建一个 BPF MAP</p>
<pre><code class="language-shell">sudo ./load.sh
</code></pre>
<p>这个脚本实际上完成了这些操作:</p>
<pre><code class="language-sh">#!/bin/bash
set -x
set -e
sudo mount -t bpf bpf /sys/fs/bpf/
# check if old program already loaded
if [ -e "/sys/fs/bpf/bpf_sockops" ]; then
echo "&gt;&gt;&gt; bpf_sockops already loaded, uninstalling..."
./unload.sh
echo "&gt;&gt;&gt; old program already deleted..."
fi
# load and attach sock_ops program
sudo bpftool prog load bpf_contrack.bpf.o /sys/fs/bpf/bpf_sockops type sockops pinmaps /sys/fs/bpf/
sudo bpftool cgroup attach "/sys/fs/cgroup/" sock_ops pinned "/sys/fs/bpf/bpf_sockops"
# load and attach sk_msg program
sudo bpftool prog load bpf_redirect.bpf.o "/sys/fs/bpf/bpf_redir" map name sock_ops_map pinned "/sys/fs/bpf/sock_ops_map"
sudo bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map
</code></pre>
<p>这是一个 BPF 的加载脚本。它的主要功能是加载和附加 BPF 程序到内核系统中,并将关联的 BPF map 一并存储pin到 BPF 文件系统中,以便 BPF 程序能访问和操作这些 map。</p>
<p>让我们详细地看一下脚本的每一行是做什么的。</p>
<ol>
<li><code>sudo mount -t bpf bpf /sys/fs/bpf/</code> 这一行用于挂载 BPF 文件系统,使得 BPF 程序和相关的 map 可以被系统访问和操作。</li>
<li>判断条件 <code>[ -e "/sys/fs/bpf/bpf_sockops" ]</code> 是检查是否已经存在 <code>/sys/fs/bpf/bpf_sockops</code> 文件,如果存在,则说明 <code>bpf_sockops</code> 程序已经被加载到系统中,那么将会通过 <code>./unload.sh</code> 脚本将其卸载。</li>
<li><code>sudo bpftool prog load bpf_contrack.bpf.o /sys/fs/bpf/bpf_sockops type sockops pinmaps /sys/fs/bpf/</code> 这一行是加载上文中 <code>bpf_contrack.bpf.c</code> 编译得到的 BPF 对象文件 <code>bpf_contrack.bpf.o</code> 到 BPF 文件系统中,存储至 <code>/sys/fs/bpf/bpf_sockops</code>,并且指定它的类型为 <code>sockops</code><code>pinmaps /sys/fs/bpf/</code> 是指定将加载的 BPF 程序相关的 map 存储在 <code>/sys/fs/bpf/</code> 下。</li>
<li><code>sudo bpftool cgroup attach "/sys/fs/cgroup/" sock_ops pinned "/sys/fs/bpf/bpf_sockops"</code> 这一行是将已经加载到 BPF 文件系统的 <code>bpf_sockops</code> 程序附加到 cgroup此路径为"/sys/fs/cgroup/")。附加后,所有属于这个 cgroup 的套接字操作都会受到 <code>bpf_sockops</code> 的影响。</li>
<li><code>sudo bpftool prog load bpf_redirect.bpf.o "/sys/fs/bpf/bpf_redir" map name sock_ops_map pinned "/sys/fs/bpf/sock_ops_map"</code> 这一行是加载 <code>bpf_redirect.bpf.c</code> 编译得到的 BPF 对象文件 <code>bpf_redirect.bpf.o</code> 到 BPF 文件系统中,存储至 <code>/sys/fs/bpf/bpf_redir</code> ,并且指定它的相关 map为 <code>sock_ops_map</code>这个map在 <code>/sys/fs/bpf/sock_ops_map</code> 中。</li>
<li><code>sudo bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map</code> 这一行是将已经加载的 <code>bpf_redir</code> 附加到 <code>sock_ops_map</code> 上,附加方式为 <code>msg_verdict</code>,表示当该 map 对应的套接字收到消息时,将会调用 <code>bpf_redir</code> 程序处理。</li>
</ol>
<p>综上,此脚本的主要作用就是将两个用于处理本地套接字流量的 BPF 程序分别加载到系统并附加到正确的位置,以便它们能被正确地调用,并且确保它们可以访问和操作相关的 BPF map。</p>
<p>您可以使用 <a href="https://github.com/torvalds/linux/blob/master/tools/bpf/bpftool/Documentation/bpftool-prog.rst">bpftool utility</a> 检查这两个 eBPF 程序是否已经加载。</p>
<pre><code class="language-console">$ sudo bpftool prog show
63: sock_ops name bpf_sockops_handler tag 275467be1d69253d gpl
loaded_at 2019-01-24T13:07:17+0200 uid 0
xlated 1232B jited 750B memlock 4096B map_ids 58
64: sk_msg name bpf_redir tag bc78074aa9dd96f4 gpl
loaded_at 2019-01-24T13:07:17+0200 uid 0
xlated 304B jited 233B memlock 4096B map_ids 58
</code></pre>
<h3 id="使用-iperf3-或-curl-进行测试"><a class="header" href="#使用-iperf3-或-curl-进行测试">使用 iperf3 或 curl 进行测试</a></h3>
<p>运行 <a href="https://iperf.fr/">iperf3</a> 服务器</p>
<pre><code class="language-shell">iperf3 -s -p 5001
</code></pre>
<p>运行 <a href="https://iperf.fr/">iperf3</a> 客户端</p>
<pre><code class="language-shell">iperf3 -c 127.0.0.1 -t 10 -l 64k -p 5001
</code></pre>
<p>或者也可以用 Python 和 curl 进行测试:</p>
<pre><code class="language-sh">python3 -m http.server
curl http://0.0.0.0:8000/
</code></pre>
<h3 id="收集追踪"><a class="header" href="#收集追踪">收集追踪</a></h3>
<p>查看<code>sock_ops</code>追踪本地连接建立</p>
<pre><code class="language-console">$ ./trace_bpf_output.sh # 实际上就是 sudo cat /sys/kernel/debug/tracing/trace_pipe
iperf3-9516 [001] .... 22500.634108: 0: &lt;&lt;&lt; ipv4 op = 4, port 18583 --&gt; 4135
iperf3-9516 [001] ..s1 22500.634137: 0: &lt;&lt;&lt; ipv4 op = 5, port 4135 --&gt; 18583
iperf3-9516 [001] .... 22500.634523: 0: &lt;&lt;&lt; ipv4 op = 4, port 19095 --&gt; 4135
iperf3-9516 [001] ..s1 22500.634536: 0: &lt;&lt;&lt; ipv4 op = 5, port 4135 --&gt; 19095
</code></pre>
<p>当iperf3 -c建立连接后你应该可以看到上述用于套接字建立的事件。如果你没有看到任何事件那么 eBPF 程序可能没有正确地附加上。</p>
<p>此外,当<code>sk_msg</code>生效后,可以发现当使用 tcpdump 捕捉本地lo设备流量时只能捕获三次握手和四次挥手流量而iperf数据流量没有被捕获到。如果捕获到iperf数据流量那么 eBPF 程序可能没有正确地附加上。</p>
<pre><code class="language-console">$ ./trace_lo_traffic.sh # tcpdump -i lo port 5001
# 三次握手
13:24:07.181804 IP localhost.46506 &gt; localhost.5001: Flags [S], seq 620239881, win 65495, options [mss 65495,sackOK,TS val 1982813394 ecr 0,nop,wscale 7], length 0
13:24:07.181815 IP localhost.5001 &gt; localhost.46506: Flags [S.], seq 1084484879, ack 620239882, win 65483, options [mss 65495,sackOK,TS val 1982813394 ecr 1982813394,nop,wscale 7], length 0
13:24:07.181832 IP localhost.46506 &gt; localhost.5001: Flags [.], ack 1, win 512, options [nop,nop,TS val 1982813394 ecr 1982813394], length 0
# 四次挥手
13:24:12.475649 IP localhost.46506 &gt; localhost.5001: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 1982818688 ecr 1982813394], length 0
13:24:12.479621 IP localhost.5001 &gt; localhost.46506: Flags [.], ack 2, win 512, options [nop,nop,TS val 1982818692 ecr 1982818688], length 0
13:24:12.481265 IP localhost.5001 &gt; localhost.46506: Flags [F.], seq 1, ack 2, win 512, options [nop,nop,TS val 1982818694 ecr 1982818688], length 0
13:24:12.481270 IP localhost.46506 &gt; localhost.5001: Flags [.], ack 2, win 512, options [nop,nop,TS val 1982818694 ecr 1982818694], length 0
</code></pre>
<h3 id="卸载-ebpf-程序"><a class="header" href="#卸载-ebpf-程序">卸载 eBPF 程序</a></h3>
<pre><code class="language-shell">sudo ./unload.sh
</code></pre>
<h2 id="参考资料"><a class="header" href="#参考资料">参考资料</a></h2>
<p>最后,如果您对 eBPF 技术感兴趣,并希望进一步了解和实践,可以访问我们的教程代码仓库 <a href="https://github.com/eunomia-bpf/bpf-developer-tutorial">https://github.com/eunomia-bpf/bpf-developer-tutorial</a> 和教程网站 <a href="https://eunomia.dev/zh/tutorials/%E3%80%82">https://eunomia.dev/zh/tutorials/。</a></p>
<ul>
<li><a href="https://github.com/zachidan/ebpf-sockops">https://github.com/zachidan/ebpf-sockops</a></li>
<li><a href="https://github.com/merbridge/merbridge">https://github.com/merbridge/merbridge</a></li>
</ul>
<blockquote>
<p>原文地址:<a href="https://eunomia.dev/zh/tutorials/29-sockops/">https://eunomia.dev/zh/tutorials/29-sockops/</a> 转载请注明出处。</p>
</blockquote>
</main>
<nav class="nav-wrapper" aria-label="Page navigation">
<!-- Mobile navigation buttons -->
<a rel="prev" href="../23-http/index.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../24-hide/index.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
<div style="clear: both"></div>
</nav>
</div>
</div>
<nav class="nav-wide-wrapper" aria-label="Page navigation">
<a rel="prev" href="../23-http/index.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../24-hide/index.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
</nav>
</div>
<script>
window.playground_copyable = true;
</script>
<script src="../elasticlunr.min.js"></script>
<script src="../mark.min.js"></script>
<script src="../searcher.js"></script>
<script src="../clipboard.min.js"></script>
<script src="../highlight.js"></script>
<script src="../book.js"></script>
<!-- Custom JS scripts -->
</div>
</body>
</html>