中文 | English
Machine Learning Systems: Design and Implementation
An open-source book on the design principles and practical implementation of modern machine learning systems, covering the full technology stack from programming interfaces and computational graphs to compilers and distributed training.
English version 1 (stable): openmlsys.github.io/html-en/
English version 2: Under reconstruction.
Target Audience
- Students: Those who have mastered machine learning fundamentals and want to deeply understand the design and implementation of modern ML systems.
- Researchers: Those who need to develop custom operators or leverage distributed execution for large model development.
- Engineers: Those who build ML infrastructure and need to tune system performance or customize ML systems for business needs.
Content Overview
The book is organized into three parts: Fundamentals, Advanced Topics, and Extensions.
Part I: Fundamentals
| Chapter | Content |
|---|---|
| Programming Interface | Framework API design, ML workflows, deep learning model definition, C/C++ framework development |
| Computational Graph | Graph components, generation methods, scheduling strategies, automatic differentiation |
Part II: Advanced Topics
| Chapter | Content |
|---|---|
| Compiler Frontend & IR | Type inference, intermediate representation (IR), automatic differentiation, common optimization passes |
| Compiler Backend & Runtime | Graph optimization, operator selection, memory allocation, compute scheduling and execution |
| Hardware Accelerators | GPU/Ascend architecture, high-performance programming interfaces (CUDA/CANN) |
| Data Processing | Usability, efficiency, order preservation, distributed data processing |
| Model Deployment | Model conversion, compression, inference, and security |
| Distributed Training | Data parallelism, model parallelism, pipeline parallelism, collective communication, parameter servers |
Part III: Extensions
| Chapter | Content |
|---|---|
| Recommender Systems | Recommendation principles, large-scale industrial architecture |
| Federated Learning | Federated learning methods, privacy protection, system implementation |
| Reinforcement Learning Systems | Single-agent and multi-agent RL systems |
| Explainable AI Systems | XAI methods and production practices |
| Robot Learning Systems | Robot perception, planning, control, and system safety |
Changelog
| Date | Event |
|---|---|
| 2022-01 | Project initialized; writing of the Chinese content began |
| 2022-05 | Extension chapters released (Federated Learning, RL Systems, Explainable AI) |
| 2023-05 | Codebase adapted to MindSpore 2.0 |
| 2026-03 | Bilingual (CN/EN) build architecture refactored; English version launched |
Build Guide
Prerequisites
- Python >= 3.10
- pandoc >= 2.19
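The two version requirements above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the helper names `version_ge` and `check_tool` are hypothetical, and the comparison assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report on one tool.
# Arguments: tool name, detected version (may be empty), minimum version.
check_tool() {
    if [ -z "$2" ]; then
        echo "$1: not found (need >= $3)"
        return 1
    elif version_ge "$2" "$3"; then
        echo "$1 $2: OK"
    else
        echo "$1 $2: too old (need >= $3)"
        return 1
    fi
}

# Detect the installed versions; an empty string means the tool is missing.
py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)
pandoc_version=$(pandoc --version 2>/dev/null | head -n 1 | awk '{print $2}')

# Minimum versions taken from the prerequisites list.
check_tool python "$py_version" 3.10
check_tool pandoc "$pandoc_version" 2.19
```

Version sort is used rather than a plain string or numeric comparison because `3.9` must rank below `3.10`, which neither of those orderings gets right.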
Installation
# Clone the repository
git clone https://github.com/openmlsys/openmlsys-zh.git
cd openmlsys-zh
# Install d2lbook
git clone https://github.com/openmlsys/d2l-book.git
cd d2l-book && pip install . && cd ..
# Install Python dependencies
pip install -r requirements.txt
Build HTML
sh build_html.sh
# Output is in _build/html/
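After the build finishes, a quick sanity check is to confirm that HTML files were actually written to the output directory. This is a small sketch under that assumption only; the `count_html` helper is hypothetical and not part of the repository's scripts.

```shell
#!/bin/sh
# Hypothetical helper: count the .html files under a directory.
# Defaults to _build/html, the output location named above.
# Prints 0 if the directory does not exist.
count_html() {
    find "${1:-_build/html}" -type f -name '*.html' 2>/dev/null | wc -l | tr -d ' '
}

n=$(count_html)
if [ "$n" -gt 0 ]; then
    echo "Build looks complete: $n HTML files in _build/html/"
else
    echo "No HTML files found in _build/html/; check the build log"
fi
```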
For more details, see the Build Guide.
Contributing
We welcome all forms of contributions, including:
- Errata: If you find errors in the text or figures, please open an issue and @-mention the chapter editors, or submit a PR directly.
- Content updates: Submit PRs to update or add Markdown files.
- New chapters: We welcome community contributions on topics such as meta-learning systems, automatic parallelism, cluster scheduling, green AI, and graph learning.
Before contributing, please read:
Community
Join our WeChat group by scanning the QR code in info/mlsys_group.png.
Citation
If this book has been helpful to your research or work, please cite it as:
Plain text:
OpenMLSys Team. Machine Learning Systems: Design and Implementation. 2022. https://openmlsys.github.io/
BibTeX:
@book{openmlsys2022,
title = {Machine Learning Systems: Design and Implementation},
author = {OpenMLSys Team},
year = {2022},
url = {https://openmlsys.github.io/},
note = {Open-source textbook, \url{https://github.com/openmlsys/openmlsys-zh}}
}
License
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
