add translate for sslsniff (#69)

云微
2023-08-27 13:39:22 +01:00
committed by GitHub
parent 45fabebccc
commit 82cd415d84
6 changed files with 574 additions and 32 deletions


@@ -1,9 +1,18 @@
# eBPF Developer Tutorial and Knowledge Base: Learn eBPF by example tools
# eBPF Developer Tutorial and Knowledge Base: eBPF Tutorial by Example
[![CI](https://github.com/eunomia-bpf/bpf-developer-tutorial/actions/workflows/main.yml/badge.svg)](https://github.com/eunomia-bpf/bpf-developer-tutorial/actions/workflows/main.yml)
[GitHub](https://github.com/eunomia-bpf/bpf-developer-tutorial)
[Gitee Mirror](https://gitee.com/yunwei37/bpf-developer-tutorial)
[English Version](README_en.md)
Dive straight into eBPF development with this concise tutorial, built around the powerful CO-RE (Compile Once, Run Everywhere) philosophy. Whether you're a newbie or a pro, we've got you covered with:
- 🛠 **Practical Examples:** Start coding with bite-sized examples, some as short as just 20 lines!
- 🔍 **Focused Learning:** We prioritize hands-on learning, skipping the lengthy theory. Each directory offers an independent eBPF tool example.
- 💼 **Modern Frameworks:** Get comfortable with the latest eBPF frameworks such as libbpf, Cilium, libbpf-rs, and eunomia-bpf.
- 🌐 **Multi-language Support:** Play with code samples in C, Go, and Rust.
- 🌍 **Bilingual Content:** This tutorial is available in both Chinese and English. For the English version, click [here](README_en.md) or check the README_en.md inside each directory.
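This is an eBPF development tutorial based on `CO-RE` (Compile Once, Run Everywhere). It provides eBPF development practice from beginner to advanced level, including basic concepts, code examples, and real-world applications. Unlike BCC, we use frameworks such as libbpf, Cilium, libbpf-rs, and eunomia-bpf for development, with examples in languages such as C, Go, and Rust.
@@ -11,12 +20,6 @@
The tutorial focuses on eBPF examples in areas such as observability, networking, and security.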
This is a development tutorial for eBPF based on CO-RE (Compile Once, Run Everywhere). It provides practical eBPF development practices from beginner to advanced, including basic concepts, code examples, and real-world applications. Unlike BCC, we use frameworks like libbpf, Cilium, libbpf-rs, and eunomia-bpf for development, with examples in languages such as C, Go, and Rust.
This tutorial does not cover complex concepts and scenario introductions. Its main purpose is to provide examples of eBPF tools (**very short, starting with twenty lines of code!**) to help eBPF application developers quickly grasp eBPF development methods and techniques. The tutorial content can be found in the directory, with each directory being an independent eBPF tool example.
The tutorial is in Chinese and English versions. For English version, please refer to [README_en.md](README_en.md) and the README_en.md in each directory.
## Table of Contents
### Getting Started Documents
@@ -56,6 +59,7 @@ Android:
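Networking and tracing:
- [Capturing plaintext SSL/TLS data from multiple libraries using uprobe](src/30-sslsniff/README.md)
- [Tracing HTTP requests or other layer-7 protocols using eBPF](src/23-http/README.md)
- [Accelerating network request forwarding using sockops](src/29-sockops/README.md)
@@ -175,6 +179,6 @@ eunomia-bpf consists of a compilation toolchain and a runtime library, compared to traditional
The complete conversation log can be found here: [ChatGPT.md](ChatGPT.md)
We have also built a demo command-line tool, trained on this tutorial, that can automatically write eBPF programs from natural-language descriptions and trace the Linux system https://github.com/eunomia-bpf/GPTtrace
We have also built a demo command-line tool, trained on this tutorial, that can automatically write eBPF programs from natural-language descriptions and trace the Linux system: <https://github.com/eunomia-bpf/GPTtrace>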
![ebpf-chatgpt-signal](https://github.com/eunomia-bpf/GPTtrace/blob/main/doc/result.gif)


@@ -51,6 +51,7 @@ Networking and tracing:
- [Tracing HTTP requests or other layer-7 protocols using eBPF](src/23-http/README_en.md)
- [Accelerating network request forwarding using sockops](src/29-sockops/README_en.md)
- [Capturing Plain Text Data of Various Libraries' SSL/TLS Using uprobe](src/30-sslsniff/README_en.md)
Security:
@@ -123,17 +124,17 @@ TIME COMM TID LAT(us)
## Why do we need tutorials based on libbpf and BPF CO-RE?
> Historically, developing a BPF application meant choosing the BCC framework to load BPF programs into the kernel, for example when implementing BPF programs for tracepoints. BCC provides a built-in Clang compiler that compiles BPF code at runtime and tailors it to the specific host kernel. This was the only way to develop maintainable BPF applications against constantly changing kernel internals. The article "BPF Portability and CO-RE" covers BPF portability and the introduction of CO-RE in detail, explaining why BCC was previously the only viable option and why libbpf is now considered a better choice. Last year, libbpf improved significantly in capability and sophistication, eliminating many differences with BCC (especially for tracepoint applications) and adding many new and powerful features that BCC does not support (such as global variables and BPF skeletons).
>
> Admittedly, BCC does its best to simplify the work of BPF developers, but that convenience sometimes makes problems harder to locate and fix. Users must remember its naming conventions and the autogenerated structures for tracepoints, and they must rely on BCC rewriting their code to read kernel data and access kprobe parameters. Using BPF maps requires writing half-object-oriented C code that does not quite match what happens in the kernel. Furthermore, BCC leads to a large amount of boilerplate code in user space, including manual configuration of even the most trivial parts.
>
> As mentioned above, BCC relies on runtime compilation and embeds a large LLVM/Clang library, which creates certain gaps between BCC and an ideal usage scenario:
>
> - High resource utilization (memory and CPU) at compile time, which may interfere with the main process in busy servers.
> - It relies on the kernel header package and needs to be installed on each target host. Even so, if certain kernel contents are not exposed through public header files, type definitions need to be copied and pasted into the BPF code to achieve the purpose.
> - Even the smallest compile-time errors can only be detected at runtime, followed by recompiling and restarting the user-space application. This greatly affects the iteration time of development (and increases frustration...).
>
> Libbpf + BPF CO-RE (Compile Once - Run Everywhere) takes a different approach, considering BPF programs as normal user-space programs: they only need to be compiled into small binaries that can be deployed on target hosts without modification. libbpf acts as a loader for BPF programs, responsible for configuration work (relocating, loading, and verifying BPF programs, creating BPF maps, attaching to BPF hooks, etc.), and developers only need to focus on the correctness and performance of BPF programs. This approach minimizes overhead, eliminates dependencies, and improves the overall developer experience.
>
> In terms of API and code conventions, libbpf follows the principle of "least surprise": most things must be stated explicitly, no header files are implied, and no code is rewritten. Most of the tedious steps can be handled with plain C code and suitable helper macros. In addition, what users write is what gets executed: the structure of the BPF application maps one-to-one to what the kernel finally verifies and executes.
Reference: [BCC to Libbpf Conversion Guide (Translation) - Deep Dive into eBPF](https://www.ebpf.top/post/bcc-to-libbpf-guid/)
@@ -148,9 +149,10 @@ With eunomia-bpf, you can:
- Use Wasm to develop eBPF user mode programs, controlling the entire eBPF program loading and execution, as well as handling related data within the WASM virtual machine.
- eunomia-bpf can package pre-compiled eBPF programs into universal JSON or WASM modules for distribution across architectures and kernel versions, allowing dynamic loading and execution without the need for recompilation.
eunomia-bpf consists of a compilation toolchain and a runtime library. Compared to traditional frameworks like BCC and native libbpf, it greatly simplifies the development process of eBPF programs, where in most cases, only the kernel mode code needs to be written to easily build, package, and publish complete eBPF applications. At the same time, the kernel mode eBPF code guarantees 100% compatibility with mainstream development frameworks such as libbpf, libbpfgo, libbpf-rs, and more. When user mode code needs to be written, multiple languages can be used with the help of Webassembly. Compared to script tools like bpftrace, eunomia-bpf maintains similar convenience, while not being limited to trace scenarios and can be used in various other fields such as networking and security.> - eunomia-bpf project GitHub address: <https://github.com/eunomia-bpf/eunomia-bpf>
>
> - gitee mirror: <https://gitee.com/anolis/eunomia>
eunomia-bpf consists of a compilation toolchain and a runtime library. Compared to traditional frameworks like BCC and native libbpf, it greatly simplifies the development process of eBPF programs, where in most cases, only the kernel mode code needs to be written to easily build, package, and publish complete eBPF applications. At the same time, the kernel mode eBPF code guarantees 100% compatibility with mainstream development frameworks such as libbpf, libbpfgo, libbpf-rs, and more. When user mode code needs to be written, multiple languages can be used with the help of Webassembly. Compared to script tools like bpftrace, eunomia-bpf maintains similar convenience, while not being limited to trace scenarios and can be used in various other fields such as networking and security.
- eunomia-bpf project GitHub address: <https://github.com/eunomia-bpf/eunomia-bpf>
- gitee mirror: <https://gitee.com/anolis/eunomia>
## Let ChatGPT Help Us


@@ -1,12 +1,10 @@
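# eBPF Practical Tutorial: Capturing Plaintext SSL/TLS Data from Multiple Libraries in User Space with eBPF
# eBPF Practical Tutorial: Capturing Plaintext SSL/TLS Data from Multiple Libraries with uprobe
With the widespread use of TLS in modern network environments, tracing microservice RPC messages has become increasingly difficult. Traditional traffic sniffing techniques are often limited to capturing encrypted data, so the original content of the communication cannot be observed. This limitation is a considerable obstacle to system debugging and analysis.
But now we have a new solution.
But now we have a new solution. eBPF, through its ability to probe user space, offers a way to recover the plaintext data so that we can directly view the communication content before encryption. However, each application may use a different library, and each library has multiple versions; this diversity complicates tracing.
eBPF, through its ability to probe user space, offers a way to recover the plaintext data so that we can directly view the communication content before encryption. However, each application may use a different library, and each library has multiple versions; this diversity complicates tracing.
In this tutorial, we will walk you through a technique that works across multiple conditions. It can trace user-space libraries such as GnuTLS and OpenSSL at the same time, and it greatly reduces the maintenance work needed for new library versions compared with earlier approaches.
In this tutorial, we will walk you through an eBPF tracing technique that works across multiple user-space SSL/TLS libraries. It can trace user-space libraries such as GnuTLS and OpenSSL at the same time, and it greatly reduces the maintenance work needed for new library versions compared with earlier approaches.
## Background Knowledge
@@ -24,7 +22,7 @@ Transport Layer Security (TLS) is a cryptographic protocol, designed for computer net
#### Handshake Process
When a client connects to a TLS-enabled server and requests a secure connection, the handshake process begins. The handshake allows the client and server to establish the security parameters of the connection using asymmetric cryptography
When a client connects to a TLS-enabled server and requests a secure connection, the handshake process begins. The handshake allows the client and server to establish the security parameters of the connection using asymmetric cryptography. The complete process is as follows:
1. **Initial handshake**: The client connects to the TLS-enabled server, requests a secure connection, and provides a list of the cipher suites it supports (encryption algorithms and hash functions).
2. **Selecting the cipher suite**: From the provided list, the server selects a cipher suite and hash function that it also supports and notifies the client of the decision.
@@ -32,19 +30,19 @@ Transport Layer Security (TLS) is a cryptographic protocol, designed for computer net
4. **Verifying the certificate**: The client confirms the validity of the certificate before proceeding.
5. **Generating the session key**: To generate the session key used for the secure connection, the client has two options:
- Encrypt a random number (PreMasterSecret) with the server's public key and send the result to the server (only the server can decrypt it with its private key); both parties then use the random number to generate a unique session key for encrypting and decrypting data during the session.
- Use Diffie-Hellman key exchange (or its variant, elliptic-curve DH) to securely generate a random and unique session key for encryption and decryption. This key has the additional property of forward secrecy: even if the server's private key is disclosed in the future, it cannot be used to decrypt the current session, even if a third party intercepts and records the session.
- Use Diffie-Hellman key exchange (or its variant, elliptic-curve DH) to securely generate a random and unique session key for encryption and decryption. This key has the additional property of forward secrecy: even if the server's private key is disclosed in the future, it cannot be used to decrypt the current session, even if a third party intercepts and records the session.
Once the above steps complete successfully, the handshake ends and the encrypted connection begins. This connection is encrypted and decrypted with the session key until it is closed. If any of the above steps fails, the TLS handshake fails and the connection is not established.
#### TLS in the OSI Model
TLS and SSL do not fit neatly into any single layer of the OSI model or the TCP/IP model. TLS runs "on top of some reliable transport protocol (e.g., TCP)", which means it sits above the transport layer. It provides encryption to higher layers, which is normally the function of the presentation layer. However, applications that use TLS generally treat it as a transport layer, even though they must actively control the initiation of TLS handshakes and the handling of exchanged authentication certificates.
TLS and SSL do not fit neatly into any single layer of the OSI model or the TCP/IP model. TLS runs "on top of some reliable transport protocol (e.g., TCP)", which means it sits above the transport layer. It provides encryption to higher layers, which is normally the function of the presentation layer. However, applications that use TLS generally treat it as a transport layer, even though they must actively control the initiation of TLS handshakes and the handling of exchanged authentication certificates.
### eBPF and uprobe
eBPF (Extended Berkeley Packet Filter): a kernel technology that allows users to run predefined programs in kernel space without modifying the kernel source code or reloading modules. It creates a bridge that lets user space and kernel space interact, providing unprecedented capabilities for tasks such as system monitoring, performance analysis, and network traffic analysis.
uprobes: an important feature of eBPF that allows us to dynamically insert probe points into user-space applications, particularly suitable for tracing function calls in SSL/TLS libraries.
uprobes are an important feature of eBPF that allows us to dynamically insert probe points into user-space applications, particularly suitable for tracing function calls in SSL/TLS libraries.
### User-Space Libraries
@@ -60,9 +58,7 @@ OpenSSL is a widely used open-source library, providing SSL and TLS protocol
### 1. SSL_read Function
When we want to read data from an established SSL connection, we can use the `SSL_read` or `SSL_read_ex` function.
#### Function Prototype
When we want to read data from an established SSL connection, we can use the `SSL_read` or `SSL_read_ex` function. The function prototypes are as follows: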
```c
int SSL_read_ex(SSL *ssl, void *buf, size_t num, size_t *readbytes);
@@ -86,7 +82,7 @@ int SSL_write(SSL *ssl, const void *buf, int num);
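## Writing the eBPF Kernel-Space Code
eBPF (Extended Berkeley Packet Filter) is a powerful programming framework in the Linux kernel that allows developers to dynamically insert custom programs into the kernel without modifying the kernel source code. In our example, we use eBPF to hook the `SSL_read` and `SSL_write` functions so that custom actions run when data is read from or written to an SSL connection.
In our example, we use eBPF to hook the `SSL_read` and `SSL_write` functions so that custom actions run when data is read from or written to an SSL connection.
### Data Structures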
@@ -305,8 +301,6 @@ int BPF_URETPROBE(probe_SSL_do_handshake_exit) {
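Through these hook functions, we can obtain data on whether the handshake succeeded, how long it took, and the related process information. This gives us deep insight into the system's SSL/TLS behavior and helps with further analysis and optimization when needed.
## User-Space Helper Code Analysis
## User-Space Helper Code Analysis and Interpretation
In the eBPF ecosystem, user-space and kernel-space code often work together. The kernel-space code is responsible for collecting data, while the user-space code is responsible for setting up, managing, and processing that data. In this section, we explain how the user-space code works with eBPF to trace SSL/TLS interactions.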


@@ -1,2 +1,542 @@
# eBPF openssl
# eBPF Practical Tutorial: Capturing Plaintext SSL/TLS Data from Multiple Libraries Using uprobe
With the widespread use of TLS in modern network environments, tracing microservice RPC messages has become increasingly challenging. Traditional traffic sniffing techniques can usually access only the encrypted data, so the original communication content cannot be observed. This restriction is a significant obstacle to system debugging and analysis.
However, a new solution is now available. eBPF, with its ability to probe user space, offers a way to recover the plaintext data, allowing us to view the communication content before it is encrypted. Nevertheless, each application may use a different library, and each library comes in multiple versions, which complicates tracing.
In this tutorial, we will guide you through an eBPF tracing technique that works across multiple user-space SSL/TLS libraries. This technique not only traces user-space libraries such as GnuTLS and OpenSSL at the same time but also significantly reduces the maintenance effort for new library versions compared to previous methods.
## Background Knowledge
Before delving into the main topic of this tutorial, we need to grasp some core concepts that will serve as the foundation for our subsequent discussions.
### SSL and TLS
SSL (Secure Sockets Layer): Developed by Netscape in the early 1990s, SSL provides data encryption for communication between two machines on a network. Because of known security vulnerabilities, SSL has been superseded by TLS.
TLS (Transport Layer Security): TLS is the successor to SSL, aiming to provide stronger and more secure data encryption methods. TLS operates through a handshake process during which a client and a server select an encryption algorithm and corresponding keys. Once the handshake is complete, data transmission begins, with all data being encrypted using the chosen algorithm and keys.
### Operation Principles of TLS
Transport Layer Security (TLS) is a cryptographic protocol designed to provide security for communication over computer networks. Its primary goal is to provide security, including privacy (confidentiality), integrity, and authenticity, for two or more communicating computer applications over a network using cryptography, such as certificates. TLS consists of two sub-layers: the TLS Record Protocol and the TLS Handshake Protocol.
#### Handshake Process
When a client connects to a TLS-enabled server and requests a secure connection, the handshake process begins. The handshake allows the client and server to establish security parameters for the connection using asymmetric cryptography. The complete process is as follows:
1. **Initial Handshake**: The client connects to the TLS-enabled server, requests a secure connection, and provides a list of supported cipher suites (encryption algorithms and hash functions).
2. **Selecting Cipher Suite**: From the provided list, the server chooses a cipher suite and hash function that it also supports and notifies the client of the decision.
3. **Providing Digital Certificate**: Usually, the server then provides identity authentication in the form of a digital certificate. This certificate includes the server's name, the trusted certificate authority (CA) that vouches for the certificate's authenticity, and the server's public encryption key.
4. **Certificate Verification**: The client verifies the certificate's validity before proceeding.
5. **Generating Session Key**: To create a session key for a secure connection, the client has two methods:
   - Encrypt a random number (PreMasterSecret) with the server's public key and send the result to the server (only the server can decrypt it with its private key); both parties then use this random number to generate a unique session key for encrypting and decrypting data during the session.
   - Use Diffie-Hellman key exchange (or its variant, Elliptic Curve DH) to securely generate a random and unique session key for encryption and decryption. This key has the additional property of forward secrecy: even if the server's private key is exposed in the future, it can't be used to decrypt the current session, even if a third party intercepts and records the session.
Once these steps are successfully completed, the handshake process concludes, and the encrypted connection begins. This connection uses the session key for encryption and decryption until the connection is closed. If any of the above steps fail, the TLS handshake fails, and the connection won't be established.
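To make the Diffie-Hellman option in step 5 more concrete, here is a toy sketch of the key agreement. It is not how any TLS library implements it: the parameters (p = 23, g = 5) are tiny textbook values chosen for illustration and provide no security, and the function names `mod_pow`, `dh_public`, and `dh_shared` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Modular exponentiation by repeated squaring: (base^exp) mod m.
 * Only safe for small moduli (m < 2^32) because of the 64-bit products. */
static uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t m)
{
    assert(m > 0 && m < (1ULL << 32));
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Toy public parameters (assumption for illustration only);
 * real deployments use very large groups or elliptic curves. */
enum { DH_P = 23, DH_G = 5 };

/* Public value that each side sends over the wire. */
static uint64_t dh_public(uint64_t private_key)
{
    return mod_pow(DH_G, private_key, DH_P);
}

/* Shared secret derived from the peer's public value and our own private key. */
static uint64_t dh_shared(uint64_t peer_public, uint64_t private_key)
{
    return mod_pow(peer_public, private_key, DH_P);
}
```

With private keys 6 and 15, the two sides exchange `dh_public(6)` and `dh_public(15)`, and each then calls `dh_shared` with the value it received. Both arrive at the same secret without ever transmitting it, which is the property the handshake relies on.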
#### TLS in the OSI Model
TLS and SSL don't fit neatly into any single layer of the OSI model or the TCP/IP model. TLS "runs over some reliable transport protocol (such as TCP)," which means it sits above the transport layer. It provides encryption to higher layers, which is normally the function of the presentation layer. However, applications generally use TLS as if it were a transport layer, even though they must actively control the initiation of TLS handshakes and the handling of exchanged authentication certificates.
### eBPF and uprobes
eBPF (Extended Berkeley Packet Filter): A kernel technology that allows users to run sandboxed, user-supplied programs in kernel space without modifying the kernel source code or loading kernel modules. It creates a bridge between user space and kernel space, providing unprecedented capabilities for tasks like system monitoring, performance analysis, and network traffic analysis.
uprobes: User-space probes are a Linux kernel mechanism that eBPF programs can attach to. They allow probe points to be inserted dynamically into user-space applications, which makes them particularly useful for tracing function calls in SSL/TLS libraries.
### User-Space Libraries
The implementation of the SSL/TLS protocol heavily relies on user-space libraries. Here are some common ones:
- OpenSSL: An open-source, feature-rich cryptographic library widely used in many open-source and commercial projects.
- BoringSSL: A fork of OpenSSL maintained by Google, focusing on simplification and optimization for Google's needs.
- GnuTLS: Part of the GNU project, offering an implementation of SSL, TLS, and DTLS protocols. GnuTLS differs from OpenSSL and BoringSSL in API design, module structure, and licensing.
## OpenSSL API Analysis
OpenSSL is a widely used open-source library providing a complete implementation of the SSL and TLS protocols, ensuring data transmission security in various applications. Among its functions, SSL_read() and SSL_write() are two core API functions for reading from and writing to TLS/SSL connections. In this section, we'll delve into these functions to help you understand their mechanisms.
### 1. SSL_read Function
When we want to read data from an established SSL connection, we can use the `SSL_read` or `SSL_read_ex` function. The function prototypes are as follows:
```c
int SSL_read_ex(SSL *ssl, void *buf, size_t num, size_t *readbytes);
int SSL_read(SSL *ssl, void *buf, int num);
```
`SSL_read` and `SSL_read_ex` attempt to read up to `num` bytes of data from the specified `ssl` into the buffer `buf`. Upon success, `SSL_read_ex` stores the actual number of read bytes in `*readbytes`.
### 2. SSL_write Function
When we want to write data to an established SSL connection, we can use the `SSL_write` or `SSL_write_ex` function. The function prototypes are as follows:
```c
int SSL_write_ex(SSL *s, const void *buf, size_t num, size_t *written);
int SSL_write(SSL *ssl, const void *buf, int num);
```
`SSL_write` and `SSL_write_ex` will write up to `num` bytes of data from the buffer `buf` into the specified `ssl` connection. Upon success, `SSL_write_ex` will store the actual number of written bytes in `*written`.
## Writing eBPF Kernel Code
In our example, we use eBPF to hook the `SSL_read` and `SSL_write` functions to perform custom actions when data is read from or written to an SSL connection.
### Data Structures
Firstly, we define a data structure `probe_SSL_data_t` to transfer data between kernel and user space:
```c
#define MAX_BUF_SIZE 8192
#define TASK_COMM_LEN 16

struct probe_SSL_data_t {
    __u64 timestamp_ns;        // Timestamp (nanoseconds)
    __u64 delta_ns;            // Function execution time (nanoseconds)
    __u32 pid;                 // Process ID
    __u32 tid;                 // Thread ID
    __u32 uid;                 // User ID
    __u32 len;                 // Length of read/write data
    int buf_filled;            // 1 if the data was copied into buf successfully, 0 otherwise
    int rw;                    // 0 for read, 1 for write, 2 for handshake
    char comm[TASK_COMM_LEN];  // Process name
    __u8 buf[MAX_BUF_SIZE];    // Data buffer
    int is_handshake;          // Whether it's a handshake event
};
```
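The `SSL_exit` function shown later passes `EVENT_SIZE(buf_copy_size)` as the event length, but the macro's definition does not appear in this tutorial excerpt. One plausible reading, sketched below as an assumption, is "the fixed header up to `buf`, plus the number of payload bytes actually copied". The helper name `event_size` is hypothetical, and the struct is a user-space mirror that uses `stdint.h` types in place of the kernel's `__u64`/`__u32`/`__u8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUF_SIZE 8192
#define TASK_COMM_LEN 16

/* User-space mirror of the struct above, using fixed-width stdint types. */
struct probe_SSL_data_t {
    uint64_t timestamp_ns;
    uint64_t delta_ns;
    uint32_t pid;
    uint32_t tid;
    uint32_t uid;
    uint32_t len;
    int buf_filled;
    int rw;
    char comm[TASK_COMM_LEN];
    uint8_t buf[MAX_BUF_SIZE];
    int is_handshake;
};

/* Hypothetical helper (assumed meaning of EVENT_SIZE): bytes needed for the
 * fixed header that precedes buf, plus `payload` bytes of captured data. */
static size_t event_size(size_t payload)
{
    assert(payload <= MAX_BUF_SIZE);
    return offsetof(struct probe_SSL_data_t, buf) + payload;
}
```

Sizing events this way would avoid sending the full 8 KB buffer for every small read or write; only the header and the bytes that were actually captured cross the kernel/user boundary.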
### Hook Functions
Our goal is to hook into the `SSL_read` and `SSL_write` functions. We define a function `SSL_exit` to handle the return of these two functions. It first decides whether to trace the call, based on the current user ID and process ID, and then collects the data.
```c
static int SSL_exit(struct pt_regs *ctx, int rw) {
    int ret = 0;
    u32 zero = 0;
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 pid = pid_tgid >> 32;
    u32 tid = (u32)pid_tgid;
    u32 uid = bpf_get_current_uid_gid();
    u64 ts = bpf_ktime_get_ns();

    if (!trace_allowed(uid, pid)) {
        return 0;
    }

    /* look up the buffer pointer and start time stored at function entry */
    u64 *bufp = bpf_map_lookup_elem(&bufs, &tid);
    if (bufp == 0)
        return 0;

    u64 *tsp = bpf_map_lookup_elem(&start_ns, &tid);
    if (!tsp)
        return 0;
    u64 delta_ns = ts - *tsp;

    int len = PT_REGS_RC(ctx);
    if (len <= 0)  // no data
        return 0;

    struct probe_SSL_data_t *data = bpf_map_lookup_elem(&ssl_data, &zero);
    if (!data)
        return 0;

    data->timestamp_ns = ts;
    data->delta_ns = delta_ns;
    data->pid = pid;
    data->tid = tid;
    data->uid = uid;
    data->len = (u32)len;
    data->buf_filled = 0;
    data->rw = rw;
    data->is_handshake = false;
    u32 buf_copy_size = min((size_t)MAX_BUF_SIZE, (size_t)len);

    bpf_get_current_comm(&data->comm, sizeof(data->comm));

    if (bufp != 0)
        ret = bpf_probe_read_user(&data->buf, buf_copy_size, (char *)*bufp);

    bpf_map_delete_elem(&bufs, &tid);
    bpf_map_delete_elem(&start_ns, &tid);

    if (!ret)
        data->buf_filled = 1;
    else
        buf_copy_size = 0;

    bpf_perf_event_output(ctx, &perf_SSL_events, BPF_F_CURRENT_CPU, data,
                          EVENT_SIZE(buf_copy_size));
    return 0;
}
```
The `rw` parameter here indicates whether it's a read or write operation. 0 represents read, and 1 represents write.
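On the user-space side, the `rw` value can be turned back into a readable label. The sketch below is illustrative only: value 2 is the one the handshake return probe later in this tutorial writes into `rw`, the `READ/RECV` label matches the sample output shown further down, and the other label strings, the enum, and the function name `rw_label` are assumptions.

```c
#include <assert.h>

/* Values of the `rw` field as used by the probes in this tutorial. */
enum ssl_event_kind {
    SSL_EVENT_READ = 0,
    SSL_EVENT_WRITE = 1,
    SSL_EVENT_HANDSHAKE = 2
};

/* Compile-time check that the enum matches the value the handshake probe uses. */
static_assert(SSL_EVENT_HANDSHAKE == 2, "handshake probe sets rw = 2");

/* Map an `rw` value to a display label; the label strings are illustrative. */
static const char *rw_label(int rw)
{
    switch (rw) {
    case SSL_EVENT_READ:      return "READ/RECV";
    case SSL_EVENT_WRITE:     return "WRITE/SEND";
    case SSL_EVENT_HANDSHAKE: return "HANDSHAKE";
    default:                  return "UNKNOWN";
    }
}
```

Keeping the mapping in one function means an unexpected value from the kernel side shows up as `UNKNOWN` instead of being silently printed as a read or write.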
#### Data Collection Process
1. Obtain the ID of the current process and thread, along with the ID of the current user.
2. Use `trace_allowed` to determine if tracing is allowed for this process.
3. Get the current timestamp, which is used to calculate the execution time of the function.
4. Look up the buffer pointer and the start timestamp for the current thread in the `bufs` and `start_ns` maps; if either is missing, stop.
5. If the function returned data, look up the scratch `probe_SSL_data_t` entry in the `ssl_data` map and populate its fields.
6. Copy the data from user space to the buffer, ensuring it doesn't exceed the designated size.
7. Finally, send the data to user space.
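Two small pieces of arithmetic in these steps are easy to misread: how the 64-bit value from `bpf_get_current_pid_tgid()` splits into a process ID and a thread ID, and how the copy length is clamped to the buffer size. The plain C sketch below reproduces both outside the kernel; the function names `pid_of`, `tid_of`, and `copy_size` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUF_SIZE 8192

/* Upper 32 bits of the helper's return value: the process ID (kernel tgid). */
static uint32_t pid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the thread ID (what the kernel itself calls pid). */
static uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}

/* Number of bytes to copy for a call that returned `len` (> 0):
 * never more than the event buffer can hold. */
static uint32_t copy_size(int len)
{
    assert(len > 0);
    size_t n = (size_t)len;
    return (uint32_t)(n < MAX_BUF_SIZE ? n : MAX_BUF_SIZE);
}
```

The maps in `SSL_exit` are keyed by the thread ID because the entry and return of one call run on the same thread, while the filter uses the process ID because users think in terms of processes.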
Note: We use two user-space return probes (`uretprobe`) to hook the returns of `SSL_read` and `SSL_write`, respectively:
```c
SEC("uretprobe/SSL_read")
int BPF_URETPROBE(probe_SSL_read_exit) {
    return (SSL_exit(ctx, 0));  // 0 indicates read operation
}

SEC("uretprobe/SSL_write")
int BPF_URETPROBE(probe_SSL_write_exit) {
    return (SSL_exit(ctx, 1));  // 1 indicates write operation
}
```
### Hooking into the Handshake Process
In SSL/TLS, the handshake is a special process used to establish a secure connection between a client and a server. To analyze this process, we hook OpenSSL's `SSL_do_handshake` function (the section names in the code below abbreviate it to `do_handshake`) to track the start and end of the handshake.
#### Entering the Handshake
We use a `uprobe` to set a probe for the `do_handshake` function:
```c
SEC("uprobe/do_handshake")
int BPF_UPROBE(probe_SSL_do_handshake_enter, void *ssl) {
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 pid = pid_tgid >> 32;
    u32 tid = (u32)pid_tgid;
    u64 ts = bpf_ktime_get_ns();
    u32 uid = bpf_get_current_uid_gid();

    if (!trace_allowed(uid, pid)) {
        return 0;
    }

    /* store arg info for later lookup */
    bpf_map_update_elem(&start_ns, &tid, &ts, BPF_ANY);
    return 0;
}
```
The main functionality of this code is as follows:
1. Obtain the current `pid`, `tid`, `ts`, and `uid`.
2. Use `trace_allowed` to verify if the process is allowed to be traced.
3. Store the current timestamp in the `start_ns` map, which will be used to calculate the duration of the handshake process later.
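The body of `trace_allowed` is not shown in this tutorial excerpt. As a rough idea of what such a filter could look like, here is a plain C sketch under stated assumptions: the `trace_filter` struct, the `FILTER_ANY` sentinel, and the name `trace_allowed_sketch` are all hypothetical, and the real helper may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_ANY UINT32_MAX  /* hypothetical sentinel meaning "no filter" */

/* Hypothetical filter settings, e.g. filled in from command-line options. */
struct trace_filter {
    uint32_t target_pid;
    uint32_t target_uid;
};

/* Returns true when the event's uid and pid pass both filters. */
static bool trace_allowed_sketch(const struct trace_filter *f,
                                 uint32_t uid, uint32_t pid)
{
    assert(f != NULL);
    if (f->target_pid != FILTER_ANY && f->target_pid != pid)
        return false;
    if (f->target_uid != FILTER_ANY && f->target_uid != uid)
        return false;
    return true;
}
```

Running the filter at the very start of each probe keeps the cost low for processes nobody asked to trace, since the probe returns before touching any map.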
#### Exiting the Handshake
Similarly, we've set a `uretprobe` for the return of `do_handshake`:
```c
SEC("uretprobe/do_handshake")
int BPF_URETPROBE(handle_do_handshake_exit) {
    // Code to execute upon exiting the do_handshake function.
    return 0;
}
```
The `uretprobe` runs its handler when the `do_handshake` function returns. The complete handler looks like this:
```c
SEC("uretprobe/do_handshake")
int BPF_URETPROBE(probe_SSL_do_handshake_exit) {
    u32 zero = 0;
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 pid = pid_tgid >> 32;
    u32 tid = (u32)pid_tgid;
    u32 uid = bpf_get_current_uid_gid();
    u64 ts = bpf_ktime_get_ns();
    int ret = 0;

    /* apply the same uid/pid filter as the other probes */
    if (!trace_allowed(uid, pid)) {
        return 0;
    }

    /* look up the start time stored by the entry probe */
    u64 *tsp = bpf_map_lookup_elem(&start_ns, &tid);
    if (tsp == 0)
        return 0;

    ret = PT_REGS_RC(ctx);
    if (ret <= 0)  // handshake failed
        return 0;

    struct probe_SSL_data_t *data = bpf_map_lookup_elem(&ssl_data, &zero);
    if (!data)
        return 0;

    data->timestamp_ns = ts;
    data->delta_ns = ts - *tsp;
    data->pid = pid;
    data->tid = tid;
    data->uid = uid;
    data->len = ret;
    data->buf_filled = 0;
    data->rw = 2;
    data->is_handshake = true;
    bpf_get_current_comm(&data->comm, sizeof(data->comm));
    bpf_map_delete_elem(&start_ns, &tid);

    bpf_perf_event_output(ctx, &perf_SSL_events, BPF_F_CURRENT_CPU, data,
                          EVENT_SIZE(0));
    return 0;
}
```
Logic of this Function:
1. Obtain the current `pid`, `tid`, `ts`, and `uid`.
2. Use `trace_allowed` to recheck if tracing is allowed.
3. Look up the timestamp in the `start_ns` map for calculating handshake duration.
4. Use `PT_REGS_RC(ctx)` to get the return value of `do_handshake` and determine if the handshake was successful.
5. Look up the scratch `probe_SSL_data_t` entry in the `ssl_data` map (key 0) to hold the event.
6. Update the data structure's fields, including timestamp, duration, process information, etc.
7. Use `bpf_perf_event_output` to send the data to user space.
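The entry/return pairing used for the latency measurement can be modelled in ordinary C. The sketch below replaces the `start_ns` BPF hash map with a tiny fixed-size table so it runs in user space. The table type, its capacity, and the names `probe_enter` and `probe_return` are assumptions made for illustration, not part of the tutorial's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64  /* arbitrary capacity for this sketch */

/* A tiny table standing in for the start_ns hash map (tid -> start time). */
struct start_table {
    uint32_t tid[MAX_THREADS];
    uint64_t ts[MAX_THREADS];
    bool used[MAX_THREADS];
};

/* Entry probe: remember when this thread entered the function. */
static bool probe_enter(struct start_table *t, uint32_t tid, uint64_t now_ns)
{
    assert(t != NULL);
    int free_slot = -1;
    for (int i = 0; i < MAX_THREADS; i++) {
        if (t->used[i] && t->tid[i] == tid) {  /* overwrite, like BPF_ANY */
            t->ts[i] = now_ns;
            return true;
        }
        if (!t->used[i] && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return false;  /* table full */
    t->used[free_slot] = true;
    t->tid[free_slot] = tid;
    t->ts[free_slot] = now_ns;
    return true;
}

/* Return probe: compute the elapsed time and drop the entry. Returns false
 * when no entry exists, i.e. the entry probe never ran for this thread. */
static bool probe_return(struct start_table *t, uint32_t tid,
                         uint64_t now_ns, uint64_t *delta_ns)
{
    assert(t != NULL && delta_ns != NULL);
    for (int i = 0; i < MAX_THREADS; i++) {
        if (t->used[i] && t->tid[i] == tid) {
            *delta_ns = now_ns - t->ts[i];
            t->used[i] = false;
            return true;
        }
    }
    return false;
}
```

The `false` path of `probe_return` corresponds to the early `return 0` in the eBPF code when the `start_ns` lookup fails, for example when tracing started while a call was already in progress.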
Our eBPF code not only tracks data transmission for `SSL_read` and `SSL_write` but also covers the SSL/TLS handshake process. This information is crucial for a deeper understanding and optimization of the performance of secure connections.
Through these hook functions, we can obtain data regarding the success of the handshake, the time taken for the handshake, and related process information. This provides us with insights into the behavior of the system's SSL/TLS, enabling us to perform more in-depth analysis and optimization when necessary.
## User-Space Helper Code Analysis and Interpretation
In the eBPF ecosystem, user-space and kernel-space code often work together. Kernel-space code is responsible for data collection, while user-space code sets up, manages, and processes this data. In this section, we will explain how the user-space code works with eBPF to trace SSL/TLS interactions.
### 1. Supported Library Attachment
In the code snippet below, depending on the fields set in the `env` structure (which appears to hold the tool's runtime options rather than an environment variable), the program can attach to three common encryption libraries (OpenSSL, GnuTLS, and NSS). This means that we can trace calls to multiple libraries with the same tool.
To achieve this, the `find_library_path` function first determines the library's path. Then, depending on the library type, the corresponding `attach_` function is called to attach the eBPF programs to the library's functions.
```c
if (env.openssl) {
    char *openssl_path = find_library_path("libssl.so");
    printf("OpenSSL path: %s\n", openssl_path);
    // attach to the discovered path rather than a hard-coded one
    attach_openssl(obj, openssl_path);
}
if (env.gnutls) {
    char *gnutls_path = find_library_path("libgnutls.so");
    printf("GnuTLS path: %s\n", gnutls_path);
    attach_gnutls(obj, gnutls_path);
}
if (env.nss) {
    char *nss_path = find_library_path("libnspr4.so");
    printf("NSS path: %s\n", nss_path);
    attach_nss(obj, nss_path);
}
```
This section primarily covers the attachment logic for the OpenSSL, GnuTLS, and NSS libraries. NSS (Network Security Services) is a set of libraries that supports the development of security-enabled client and server applications. It was originally developed by Netscape and is now maintained by Mozilla. The other two libraries were introduced earlier and are not described again here.
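The implementation of `find_library_path` is not shown here. One plausible approach, offered purely as an assumption, is to scan the output of `ldconfig -p`, whose lines have the form `name (flags) => /path`. The sketch below handles a single such line; the function name `parse_ldconfig_line` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Given one line in the `ldconfig -p` style ("name (flags) => /path"),
 * copy the path into `out` if the line mentions `lib_name`.
 * Returns 1 on success, 0 if the line does not match or `out` is too small. */
static int parse_ldconfig_line(const char *line, const char *lib_name,
                               char *out, size_t out_len)
{
    assert(line && lib_name && out && out_len > 0);
    if (strstr(line, lib_name) == NULL)
        return 0;
    const char *arrow = strstr(line, "=> ");
    if (arrow == NULL)
        return 0;
    const char *path = arrow + 3;
    size_t n = strcspn(path, "\r\n");  /* stop at the end of the line */
    if (n == 0 || n >= out_len)
        return 0;
    memcpy(out, path, n);
    out[n] = '\0';
    return 1;
}
```

A caller would feed it each line read from the command's output and stop at the first match. Matching on a prefix such as `libssl.so` rather than a full versioned name is what lets the same lookup work across library versions.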
### 2. Detailed Attachment Logic
The specific `attach` functions are as follows:
```c
#define __ATTACH_UPROBE(skel, binary_path, sym_name, prog_name, is_retprobe) \
    do { \
        LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts, .func_name = #sym_name, \
                    .retprobe = is_retprobe); \
        skel->links.prog_name = bpf_program__attach_uprobe_opts( \
            skel->progs.prog_name, env.pid, binary_path, 0, &uprobe_opts); \
    } while (false)

int attach_openssl(struct sslsniff_bpf *skel, const char *lib) {
    ATTACH_UPROBE_CHECKED(skel, lib, SSL_write, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, SSL_write, probe_SSL_write_exit);
    ATTACH_UPROBE_CHECKED(skel, lib, SSL_read, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, SSL_read, probe_SSL_read_exit);

    if (env.latency && env.handshake) {
        ATTACH_UPROBE_CHECKED(skel, lib, SSL_do_handshake,
                              probe_SSL_do_handshake_enter);
        ATTACH_URETPROBE_CHECKED(skel, lib, SSL_do_handshake,
                                 probe_SSL_do_handshake_exit);
    }

    return 0;
}

int attach_gnutls(struct sslsniff_bpf *skel, const char *lib) {
    ATTACH_UPROBE_CHECKED(skel, lib, gnutls_record_send, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, gnutls_record_send, probe_SSL_write_exit);
    ATTACH_UPROBE_CHECKED(skel, lib, gnutls_record_recv, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, gnutls_record_recv, probe_SSL_read_exit);

    return 0;
}

int attach_nss(struct sslsniff_bpf *skel, const char *lib) {
    ATTACH_UPROBE_CHECKED(skel, lib, PR_Write, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, PR_Write, probe_SSL_write_exit);
    ATTACH_UPROBE_CHECKED(skel, lib, PR_Send, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, PR_Send, probe_SSL_write_exit);
    ATTACH_UPROBE_CHECKED(skel, lib, PR_Read, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, PR_Read, probe_SSL_read_exit);
    ATTACH_UPROBE_CHECKED(skel, lib, PR_Recv, probe_SSL_rw_enter);
    ATTACH_URETPROBE_CHECKED(skel, lib, PR_Recv, probe_SSL_read_exit);

    return 0;
}
```
Looking more closely at the `attach_` functions, we can see that they all use the `ATTACH_UPROBE_CHECKED` and `ATTACH_URETPROBE_CHECKED` macros to implement the attachment logic. These two macros set up a uprobe (function entry) and a uretprobe (function return), respectively.
Considering that different libraries have different API function names (for example, OpenSSL uses `SSL_write`, while GnuTLS uses `gnutls_record_send`), we need to write a separate `attach_` function for each library.
For instance, in the `attach_openssl` function, we set up probes for both `SSL_write` and `SSL_read`. If both latency reporting (`env.latency`) and handshake tracing (`env.handshake`) are enabled, we also set up probes for `SSL_do_handshake`.
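The `__ATTACH_UPROBE` macro above relies on the preprocessor's stringification operator: `#sym_name` turns the bare identifier `SSL_write` into the string `"SSL_write"` for the `.func_name` option. The sketch below isolates that trick without libbpf. The `attach_request` struct, the macros, and `request_targets` are hypothetical stand-ins that only record what would be attached.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the attach call: records what would be attached. */
struct attach_request {
    const char *func_name;  /* symbol to probe, as a string */
    int retprobe;           /* 0 = entry probe, 1 = return probe */
};

/* `#sym_name` converts the bare identifier into a string literal at compile time. */
#define MAKE_REQUEST(sym_name, is_retprobe) \
    ((struct attach_request){ .func_name = #sym_name, .retprobe = (is_retprobe) })

/* Convenience wrappers mirroring the uprobe/uretprobe split. */
#define UPROBE_REQUEST(sym_name)    MAKE_REQUEST(sym_name, 0)
#define URETPROBE_REQUEST(sym_name) MAKE_REQUEST(sym_name, 1)

/* Small helper for checking which symbol a request targets. */
static int request_targets(const struct attach_request *req, const char *name)
{
    assert(req != NULL && name != NULL);
    return strcmp(req->func_name, name) == 0;
}
```

Because the symbol is written as an identifier rather than a quoted string, each `attach_` function reads like a list of library functions, and only the symbol names change from one library to the next.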
In the eBPF ecosystem, `perf_buffer` is an efficient mechanism for transferring data from kernel space to user space. Kernel-space eBPF programs cannot call into user space directly, so they write events into the perf buffer and user space reads them asynchronously. We use the `perf_buffer__poll` function to read the data reported from kernel space, as shown below:
```c
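while (!exiting) {
    err = perf_buffer__poll(pb, PERF_POLL_TIMEOUT_MS);
    // -EINTR only means the poll was interrupted by a signal, so it is not fatal
    if (err < 0 && err != -EINTR) {
        warn("error polling perf buffer: %s\n", strerror(-err));
        goto cleanup;
    }
    // reset err so that we return 0 when exiting
    err = 0;
}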
```
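The loop above depends on two things that are easy to miss: an `exiting` flag that something else must set, and the special treatment of `-EINTR`. The following sketch reproduces that control flow without libbpf so it can be compiled on its own. All names here (`demo_exiting`, `demo_sig_handler`, `demo_poll_fn`, `demo_poll_loop`, and the two stub pollers) are hypothetical; the function pointer merely stands in for `perf_buffer__poll`.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set from a signal handler (for example on Ctrl-C) to stop the loop. */
volatile sig_atomic_t demo_exiting = 0;

void demo_sig_handler(int signo)
{
    (void)signo;
    demo_exiting = 1;
}

/* Hypothetical poll callback standing in for perf_buffer__poll():
 * returns >= 0 on success or a negative errno value on failure. */
typedef int (*demo_poll_fn)(int timeout_ms);

/* Poll until demo_exiting is set. An interrupted poll (-EINTR) is not an
 * error; any other negative value stops the loop and is returned. */
int demo_poll_loop(demo_poll_fn poll_once, int timeout_ms)
{
    assert(poll_once != NULL);
    while (!demo_exiting) {
        int err = poll_once(timeout_ms);
        if (err < 0 && err != -EINTR) {
            fprintf(stderr, "error polling: %s\n", strerror(-err));
            return err;
        }
    }
    return 0;
}

/* Stub: behaves as if a signal arrived while the poll was blocked. */
int demo_poll_interrupted(int timeout_ms)
{
    (void)timeout_ms;
    demo_sig_handler(SIGINT);
    return -EINTR;
}

/* Stub: behaves as if the poll failed with a real I/O error. */
int demo_poll_broken(int timeout_ms)
{
    (void)timeout_ms;
    return -EIO;
}
```

With `demo_poll_interrupted` the loop ends cleanly and returns 0; with `demo_poll_broken` it returns `-EIO` immediately. In a real program the handler would typically be installed with `signal(SIGINT, demo_sig_handler)` before the loop starts.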
Finally, in the `print_event` function, we print the data to standard output:
```c
// Function to print the event from the perf buffer
void print_event(struct probe_SSL_data_t *event, const char *evt) {
    ...
    if (buf_size != 0) {
        if (env.hexdump) {
            // 2 characters for each byte + null terminator
            char hex_data[MAX_BUF_SIZE * 2 + 1] = {0};
            buf_to_hex((uint8_t *)buf, buf_size, hex_data);
            printf("\n%s\n", s_mark);
            for (size_t i = 0; i < strlen(hex_data); i += 32) {
                printf("%.32s\n", hex_data + i);
            }
            printf("%s\n\n", e_mark);
        } else {
            printf("\n%s\n%s\n%s\n\n", s_mark, buf, e_mark);
        }
    }
}
```
```
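The body of `buf_to_hex` is not shown in this section, so the following is only a guess at what such a helper could look like, written under the assumption that it emits two lowercase hex digits per byte. `demo_buf_to_hex` and `demo_print_hex_lines` are hypothetical names; the second function mirrors the 32-characters-per-line loop in `print_event`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write the lowercase hex form of buf[0..len) into out, which must have
 * room for 2 * len + 1 bytes (two digits per byte plus the terminator). */
void demo_buf_to_hex(const uint8_t *buf, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    assert(buf != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[buf[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[buf[i] & 0x0f]; /* low nibble  */
    }
    out[2 * len] = '\0';
}

/* Print a hex string 32 characters (16 bytes) per line.
 * Returns the number of lines printed. */
int demo_print_hex_lines(const char *hex)
{
    assert(hex != NULL);
    size_t n = strlen(hex); /* computed once, not on every iteration */
    int lines = 0;
    for (size_t i = 0; i < n; i += 32) {
        printf("%.32s\n", hex + i);
        lines++;
    }
    return lines;
}
```

For the three bytes `0x50 0x52 0x49` the helper produces the string `505249`. A 24-byte buffer becomes 48 hex characters, which the printing loop splits into one full line of 32 characters and one line of 16.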
You can find the complete source code here: [https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/30-sslsniff](https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/30-sslsniff)
## Compilation and Execution
To start using `sslsniff`, you need to first compile it:
```sh
make
```
Once done, follow these steps:
### Start sslsniff
In a terminal, execute the following command to start `sslsniff`:
```sh
sudo ./sslsniff
```
### Run the curl Command
In another terminal, execute:
```console
curl https://example.com
```
Under normal circumstances, you will see output similar to the following:
```html
<!doctype html>
<html>
<head>
<title>Example Domain</title>
...
<body>
<div>
...
</div>
</body>
</html>
```
### sslsniff Output
After executing the `curl` command, `sslsniff` will display the following content:
```txt
READ/RECV 0.132786160 curl 47458 1256
----- DATA -----
<!doctype html>
...
<div>
<h1>Example Domain</h1>
...
</div>
</body>
</html>
----- END DATA -----
```
**Note**: The displayed HTML content may vary depending on the specific content of the `example.com` page.
### Displaying Latency and Handshake Process
To view latency and the handshake process, run `sslsniff` with the `-l` and `--handshake` options:
```console
$ sudo ./sslsniff -l --handshake
OpenSSL path: /lib/x86_64-linux-gnu/libssl.so.3
GnuTLS path: /lib/x86_64-linux-gnu/libgnutls.so.30
NSS path: /lib/x86_64-linux-gnu/libnspr4.so
FUNC         TIME(s)        COMM    PID     LEN    LAT(ms)
HANDSHAKE    0.000000000    curl    6460    1      1.384
WRITE/SEND   0.000115400    curl    6460    24     0.014
```
### Hexadecimal Output
To display data in hexadecimal format, execute the following command:
```console
$ sudo ./sslsniff --hexdump
WRITE/SEND 0.000000000 curl 16104 24
----- DATA -----
505249202a20485454502f322e300d0a
0d0a534d0d0a0d0a
----- END DATA -----
...
```
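The 24 bytes in the dump above are plain ASCII: they decode to `PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n`, the HTTP/2 client connection preface that `curl` sends first on an HTTP/2 connection. As a small illustration of reading such a dump, here is a sketch of a decoder that turns the hex text back into bytes. `demo_hex_value` and `demo_hex_to_buf` are hypothetical helpers written for this example and are not part of `sslsniff`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
int demo_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into out (capacity out_cap bytes).
 * Returns the number of bytes written, or -1 if the input has odd
 * length, contains a non-hex character, or does not fit in out. */
long demo_hex_to_buf(const char *hex, unsigned char *out, size_t out_cap)
{
    assert(hex != NULL && out != NULL);
    size_t n = strlen(hex);
    if (n % 2 != 0 || n / 2 > out_cap)
        return -1;
    for (size_t i = 0; i < n / 2; i++) {
        int hi = demo_hex_value(hex[2 * i]);
        int lo = demo_hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return (long)(n / 2);
}
```

Feeding the two dump lines, joined into one string, to `demo_hex_to_buf` yields 24 bytes, which matches the `LEN` column of the `WRITE/SEND` event.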
## Summary
eBPF is a powerful technology that gives us deeper insight into how a system works. This tutorial is a simple example of using eBPF to monitor SSL/TLS communication. If you are interested in eBPF and want to learn and practice more, visit our tutorial code repository at <https://github.com/eunomia-bpf/bpf-developer-tutorial> and the tutorial website at <https://eunomia.dev/zh/tutorials/>.
References:
- <https://github.com/iovisor/bcc/pull/4706>
- <https://github.com/openssl/openssl>
- <https://www.openssl.org/docs/man1.1.1/man3/SSL_read.html>
- <https://github.com/iovisor/bcc/blob/master/tools/sslsniff_example.txt>
- <https://en.wikipedia.org/wiki/Transport_Layer_Security>


@@ -27,6 +27,7 @@
- [在 Android 上使用 eBPF 程序](22-android/README.md)
- [使用 eBPF 追踪 HTTP 请求或其他七层协议](23-http/README.md)
- [使用 uprobe 捕获多种库的 SSL/TLS 明文数据](30-sslsniff/README.md)
- [使用 sockops 加速网络请求转发](29-sockops/README.md)
- [使用 eBPF 隐藏进程或文件信息](24-hide/README.md)
- [使用 bpf_send_signal 发送信号终止进程](25-signal/README.md)


@@ -27,6 +27,7 @@
- [Using eBPF programs on Android](22-android/README.md)
- [Tracing HTTP requests or other layer 7 protocols using eBPF](23-http/README.md)
- [Capturing Plain Text Data of Various Libraries' SSL/TLS Using uprobe](30-sslsniff/README_en.md)
- [Accelerating network request forwarding using sockops](29-sockops/README.md)
- [Hiding process or file information using eBPF](24-hide/README.md)
- [Terminating processes by sending signals using bpf_send_signal](25-signal/README.md)