quantumiracle
2022-05-22 23:32:57 -04:00
51 changed files with 1328 additions and 109 deletions

Binary file not shown.

View File

@@ -66,3 +66,7 @@ K均值聚类算法是一种解决聚类问题的算法算法过程如下
本章结束语:
在系统角度看,机器学习的算法无论是什么算法,涉及高维数据任务的计算都是通过矩阵运算实现的。
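上面"高维数据任务最终都落到矩阵运算"这一点,可以用本节上下文中K均值聚类的样本分配步骤来直观说明。下面是一个基于NumPy的示意草图(函数与变量名均为假设,并非书中配套代码):

```python
import numpy as np

def assign_clusters(X, C):
    """K均值的分配步骤,完全用矩阵代数表达。

    X: (n, d) 个样本点;C: (k, d) 个聚类中心。
    利用 ||x - c||^2 = ||x||^2 - 2 x·c + ||c||^2,
    整个步骤化为一次矩阵乘法加广播,没有任何Python层循环。
    """
    sq = (X ** 2).sum(axis=1, keepdims=True)   # (n, 1)
    cq = (C ** 2).sum(axis=1)                  # (k,)
    d2 = sq - 2.0 * X @ C.T + cq               # (n, k) 平方距离矩阵
    return d2.argmin(axis=1)                   # 每个样本最近的中心下标

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
C = np.array([[0.0, 0.0], [5.0, 5.0]])
print(assign_clusters(X, C))  # -> [0 0 1]
```

这也解释了为什么加速矩阵乘法(GEMM)就能加速绝大多数机器学习负载。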
## 参考文献
:bibliography:`../references/appendix.bib`

View File

@@ -11,6 +11,11 @@
## 扩展阅读
- CUDA编程指导 [CUDA](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html)
- 昇腾社区 [Ascend](https://gitee.com/ascend)
- MLIR应用进展 [MLIR](https://mlir.llvm.org/talks)
## 参考文献
:bibliography:`../references/accelerator.bib`

View File

@@ -7,7 +7,7 @@
:label:`simpledag`
### 张量和算子
在计算框架中基础组件包含张量和算子张量是基础数据结构算子是基本运算单元。在数学中定义张量是基于向量与矩阵的推广涵盖标量、向量与矩阵的概念。可以将标量理解为零阶张量向量为一阶张量我们熟悉的RGB彩色图像即为三阶张量。在计算框架中张量不仅存储数据还存储数据类型、数据形状、维度或秩以及梯度传递状态等多个属性如:numref:`tensor_attr`所示,列举了主要的属性和功能。
在计算框架中基础组件包含张量和算子张量是基础数据结构算子是基本运算单元。在数学中定义张量是基于向量与矩阵的推广涵盖标量、向量与矩阵的概念。可以将标量理解为零阶张量向量为一阶张量我们熟悉的RGB彩色图像即为三阶张量。在计算框架中张量不仅存储数据还存储数据类型、数据形状、维度或秩以及梯度传递状态等多个属性如:numref:`tensor_attr`所示,列举了主要的属性和功能。可以通过[代码示例](https://github.com/openmlsys/openmlsys-pytorch/blob/master/chapter_computational_graph/tensor.py)查看张量的属性和部分操作展示
:张量属性
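上文列举的张量属性(数据类型、数据形状、秩、梯度传递状态)可以用一个不依赖任何框架的极简类来示意(草图,类名与字段名均为假设,并非 PyTorch 或 MindSpore 的真实 API):

```python
from typing import Optional

class Tensor:
    """玩具版张量:除数据外,还保存上文讨论的元信息属性。"""

    def __init__(self, data, dtype="float32", requires_grad=False):
        self.data = data
        self.dtype = dtype                  # 数据类型
        self.requires_grad = requires_grad  # 是否参与梯度传递
        self.grad: Optional[list] = None    # 反向传播后由框架填充

    @property
    def shape(self):                        # 数据形状
        s, x = [], self.data
        while isinstance(x, list):
            s.append(len(x))
            x = x[0]
        return tuple(s)

    @property
    def ndim(self):                         # 秩:0为标量,1为向量……
        return len(self.shape)

img = Tensor([[[0.1] * 4] * 3] * 2)         # 一张微型"RGB图像":2x3x4 的三阶张量
print(img.shape, img.ndim)                  # -> (2, 3, 4) 3
print(Tensor(1.0).ndim)                     # -> 0,标量是零阶张量
```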
@@ -175,4 +175,4 @@ grad_W = matmul(transpose(X), grad_X1)
:width:`600px`
:label:`chain`
在深度学习计算框架中,控制流可以进行嵌套,比如多重循环和循环条件控制,计算图会对复杂控制流进行准确的描述,以便于执行正确的计算调度与执行任务。
在深度学习计算框架中,控制流可以进行嵌套,比如多重循环和循环条件控制,计算图会对复杂控制流进行准确的描述,以便于执行正确的计算调度与执行任务。可以通过[代码示例](https://github.com/openmlsys/openmlsys-pytorch/blob/master/chapter_computational_graph/control_flow.py)查看在条件控制和循环控制下,前向和反向计算的数据流。
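上述循环控制下前向与反向的数据流,可以用一个手工展开的微型例子说明(纯 Python 草图,手工套用链式法则,并非真实框架的自动微分实现):

```python
def forward_backward(x, w, n):
    """前向:y 在循环中被 w 连乘 n 次;同时记录中间值,
    以便反向按相反顺序回放这段被"展开"的控制流。"""
    tape = []                 # 记录每次迭代的输入,相当于展开后的计算图
    y = x
    for _ in range(n):        # 控制流:循环每执行一次,图展开一层
        tape.append(y)
        y = y * w
    dy_dw = 0.0
    grad = 1.0                # d(输出)/d(当前节点)
    for y_i in reversed(tape):
        dy_dw += grad * y_i   # 节点 y_i * w 对 w 的局部导数
        grad *= w             # 对 y_i 的局部导数,继续向后链式传递
    return y, dy_dw

y, g = forward_backward(x=2.0, w=3.0, n=3)
print(y, g)  # y = 2*3^3 = 54.0;与解析解 dy/dw = 3*x*w^2 = 54.0 一致
```

循环次数 n 改变时,"图"的展开深度随之改变——这正是计算图需要准确描述控制流的原因。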

View File

@@ -129,7 +129,7 @@ def model(X, flag):
```
代码中模型整体可以采用动态生成,而\@ms\_function可以使用基于源码转换的技术将模块*add_and_relu*转化为静态图结构。与动态生成中代码执行相同,模型接受输入按照模型定义的计算顺序进行调度执行,并生成临时图结构,当执行语句*Y=add_and_relu(Y,b)* 时,计算框架会自动调用该模块静态生成的图结构执行计算。模块*add_and_relu* 可以利用静态图中的优化技术来提高计算性能,实现动态图和静态图的混合执行。此外,动静态转换的技术常用于模型部署阶段,动态图预测部署时除了需要已经训练完成的参数文件,还须提供最初的模型组网前端代码,这使得动态图部署存在局限性,部署硬件中往往难以提供前端语言的执行环境。因此当使用动态图模式训练完成模型参数后,可以将整体网络结构转换为静态图格式,将神经网络模型和参数文件进行序列化保存,与前端代码完全解耦,扩大模型部署的硬件支持范围。
主流的计算框架TensorFlow、MindSpore等均提供动静态相互转换与融合执行的技术我们将各框架中支持源码转换和追踪转换技术的接口梳理如 :numref:`dynamic_static_switch`所示。
主流的计算框架TensorFlow、MindSpore等均提供动静态相互转换与融合执行的技术我们将各框架中支持源码转换和追踪转换技术的接口梳理如 :numref:`dynamic_static_switch`所示。可以通过[代码示例](https://github.com/openmlsys/openmlsys-pytorch/blob/master/chapter_computational_graph/generate_static_graph.py)查看PyTorch计算框架中是如何将动态图模型转化为静态图模型并且展示静态图结构信息。
:主流框架动态图转换静态图支持
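上文提到的追踪(trace)式动静态转换,其核心思想可以用几十行纯 Python 勾勒出来(示意草图:运行一次动态代码并记录算子序列,之后脱离 Python 控制流回放;并非 torch.jit.trace 的真实实现):

```python
def trace(fn, *example_inputs):
    """对 fn 追踪执行一次,把每个基本算子记入 graph(静态图),
    并返回可脱离原 Python 代码回放的执行器。"""
    graph = []

    class Rec:                            # 在计算的同时记录算子
        def __init__(self, v):
            self.v = v
        def __add__(self, other):
            graph.append(("add", other))
            return Rec(self.v + other)
        def relu(self):
            graph.append(("relu", None))
            return Rec(max(self.v, 0.0))

    fn(Rec(example_inputs[0]), *example_inputs[1:])

    def replay(x):                        # "静态执行":只按图回放,不再运行 Python 逻辑
        for op, arg in graph:
            x = x + arg if op == "add" else max(x, 0.0)
        return x

    return graph, replay

def model(x, b):                          # 动态图风格的模型代码
    return (x + b).relu()

graph, static_model = trace(model, 1.0, -3.0)
print(graph)              # -> [('add', -3.0), ('relu', None)]
print(static_model(5.0))  # -> 2.0
```

注意追踪得到的图把常量 b 与当次走过的控制流分支"冻结"了下来——这正是追踪转换相对源码转换的固有局限。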

View File

@@ -1,3 +1,8 @@
## 总结
本章我们围绕着易用性、高效性和保序性三个维度展开研究如何设计实现机器学习系统中的数据预处理模块。在易用性维度,我们重点探讨了数据模块的编程模型,通过借鉴历史上优秀的并行数据处理系统的设计经验,我们认为基于描述数据集变换的编程抽象较为适合作为数据模块的编程模型,在具体的系统实现中,我们不仅要在上述的编程模型的基础上提供足够多内置算子方便用户的数据预处理编程,同时还要考虑如何支持用户方便地使用自定义算子。在高效性方面,我们从数据读取和计算两个方面分别介绍了特殊文件格式设计和计算并行架构设计。我们也使用在前几章中学习到的模型计算图编译优化技术来优化用户的数据预处理计算图,以进一步达到更高的数据处理吞吐率。机器学习场景中模型对数据输入顺序敏感,于是衍生出来保序性这一特殊性质,我们在本章中对此进行了分析,并通过MindSpore中的Connector的特殊约束实现来展示真实系统实现中如何确保保序性。最后我们也针对部分情况下单机CPU数据预处理性能的问题,介绍了当前基于异构处理加速的纵向扩展方案和基于分布式数据预处理的横向扩展方案,我们相信读者学习了本章后,能够对机器学习系统中的数据模块有深刻的认知,也对数据模块未来面临的挑战有所了解。
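上述"描述数据集变换"的编程抽象可以用一个极简的链式流水线来勾勒(示意草图,类与方法名均为假设,并非 MindData 或 DataLoader 的真实接口):

```python
class Dataset:
    """玩具版数据集:每个变换返回新的 Dataset,
    用户以声明式的链式调用描述整条预处理流水线。"""

    def __init__(self, source):
        self.source = source

    def map(self, fn):                     # 逐样本变换算子
        return Dataset(fn(x) for x in self.source)

    def batch(self, n):                    # 按顺序攒批,天然保序
        def gen():
            buf = []
            for x in self.source:
                buf.append(x)
                if len(buf) == n:
                    yield buf
                    buf = []
            if buf:
                yield buf
        return Dataset(gen())

    def __iter__(self):
        return iter(self.source)

ds = Dataset(range(6)).map(lambda x: x * 10).batch(2)
print(list(ds))  # -> [[0, 10], [20, 30], [40, 50]]
```

真实系统会在 map 内部做多线程或多进程并行,并像正文所述用 Connector 等机制在并行之后恢复样本顺序。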
## 扩展阅读
- 流水线粒度并行实现示例建议阅读 [Pytorch DataLoader](https://github.com/pytorch/pytorch/tree/master/torch/utils/data)。
- 算子粒度并行实现示例建议阅读 [MindData](https://gitee.com/mindspore/mindspore/tree/master/mindspore/ccsrc/minddata)。

View File

@@ -110,17 +110,7 @@ TCAV就可以通过计算类$k$的具有正$S_{C,k,l}$s的样本的比率来
$$\textbf{TCAV}_{Q_{C,k,l}}=\frac{\vert \{\mathbf{x}\in X_{k}:S_{C,k,l}(\mathbf{x})>0\}\vert}{\vert X_{k}\vert}
\label{eq:TCAV}$$
结合$t$-分布假设方法,如果$\textbf{TCAV}_{Q_{C,k,l}}$大于0.5,则表明概念$C$对类$k$有重大影响。
(此处原为一张用 LaTeX/TikZ 绘制的流程图,转换后残留为代码片段,其内容为 TCAV 的四个步骤:收集一个概念的正负样本 → 输入正负样本到模型获取中间层的激活 → 通过线性回归获取 CAVs → 计算 TCAV 分值。)
结合$t$-分布假设方法,如果$\textbf{TCAV}_{Q_{C,k,l}}$大于0.5,则表明概念$C$对类$k$有重大影响。
![TCAV流程(图片来源于 :cite:`2020tkde_li`)](../img/ch11/xai_tcav.png)
:width:`800px`
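上式定义的 TCAV 分值本身的计算非常直接:有了各样本的方向导数 $S_{C,k,l}(\mathbf{x})$ 之后,统计其为正的比例即可(示意草图,输入为虚构数值):

```python
def tcav_score(directional_derivatives):
    """TCAV_{Q_{C,k,l}}:类 k 样本中概念敏感度 S 为正的比例(见上式)。"""
    positive = sum(1 for s in directional_derivatives if s > 0)
    return positive / len(directional_derivatives)

S = [0.3, -0.1, 0.7, 0.2, -0.4]  # 5 个样本的方向导数(虚构)
print(tcav_score(S))             # -> 0.6,大于 0.5,说明概念 C 对该类有显著影响
```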
@@ -132,16 +122,16 @@ right of=step1\] [输入正负样模型获取中间层的激活]{}; (step3)
:width:`800px`
:label:`tb_net`
TB-Net的框架如图 :numref:`tb_net`所示:其中,Step代表步骤historical代表历史记录在图谱中的节点。Path extraction代表路径抽取embedding propagation代表图嵌入向量传导技术R代表关系矩阵e代表图谱中的实体节点pair block代表物品配对块user response代表用户兴趣反馈向量update代表更新词向量concat代表拼接计算。步骤1TB-Net得到目标项$\tau$,用户$u$和该用户的子图。子图是通过历史点击项集合$I_u$蓝色部分来构建生成步骤2连接$\tau$和$I_u$之间的路径提取路径作为双向嵌入传播网络TB-Net的输入。词向量的计算从路径的左侧和右侧传播到中间节点图中的绿色节点步骤3,计算左右两个流向的向量汇集到同一中间实体的概率。概率用于表示用户对中间实体的喜好程度,并作为解释的依据步骤4TB-Net同时输出推荐结果和具有语义级别的解释。
TB-Net的框架如图 :numref:`tb_net`所示:其中,$i_c$代表待推荐物品,$h_n$代表历史记录中用户交互的物品,$r$和$e$代表图谱中的关系relation和实体entity它们的向量化表达拼接在一起形成关系矩阵和实体矩阵。首先TB-Net通过$i_c$和$h_n$的相同特征值来构建用户$u$的子图谱,每一对$i_c$和$h_n$都由关系和实体所组成的路径来连接。然后TB-Net的路径双向传导方法将物品、实体和关系向量的计算从路径的左侧和右侧分别传播到中间节点,计算左右两个流向的向量汇集到同一中间实体的概率。概率用于表示用户对中间实体的喜好程度,并作为解释的依据。最后TB-Net识别子图谱中关键路径(即关键实体和关系),输出推荐结果和具有语义级别的解释。
以游戏推荐为场景,随机对一个用户推荐新的游戏,如图 :numref:`xai_kg_recommendataion`所示其中Half-Life, DOTA 2, Team Fortress 2等为游戏名称。关系属性中game.year 代表游戏发行年份game.genres代表游戏属性game.developer代表游戏的开发商game.categories代表游戏分类。属性节点中MOBA代表多人在线战术竞技游戏valve代表威尔乌游戏公司action代表动作类Multi-player代表多人游戏Valve Anti-Cheat enabled代表威尔乌防作弊类Free代表免费Cross-Platform代表跨平台。边的游戏是从训练数据中选取的评分项。而测试数据中正确推荐的游戏是“Team Fortress 2”。
以游戏推荐为场景,随机对一个用户推荐新的游戏,如图 :numref:`xai_kg_recommendation`所示其中Half-Life, DOTA 2, Team Fortress 2等为游戏名称。关系属性中game.year 代表游戏发行年份game.genres代表游戏属性game.developer代表游戏的开发商game.categories代表游戏分类。属性节点中MOBA代表多人在线战术竞技游戏Valve代表威尔乌游戏公司Action代表动作类Multi-player代表多人游戏Valve Anti-Cheat enabled代表威尔乌防作弊类Free代表免费Cross-Platform代表跨平台。边的游戏是用户历史记录中玩过的游戏。而测试数据中正确推荐的游戏是“Team Fortress 2”。
![Steam游戏推荐可解释示例 (用户玩过的游戏: Half-Life, DOTA 2。推荐命中的游戏: “Team Fortress 2”。具有属性信息的节点如game.genres: Action, free-to-play; game.developer: Valve; game.categories:
Multiplayer, MOBA。)](../img/ch11/xai_kg_recommendataion.png)
Multiplayer, MOBA。)](../img/ch11/xai_kg_recommendation.png)
:width:`800px`
:label:`xai_kg_recommendataion`
:label:`xai_kg_recommendation`
在图 :numref:`xai_kg_recommendataion`中有两个突出显示的相关概率(38.6%, 21.1%),它们是在推荐过程中模型计算的路径被激活的概率。实线箭头突出显示从“Team Fortress 2”到历史项目“Half-Life”之间的路径。它表明TB-Net能够通过各种关系连接向用户推荐物品,并给出关键因素作为解释。因此,将“Team Fortress 2”推荐给用户的解释可以翻译成固定话术:“Team Fortress 2”是游戏公司“Valve”开发的一款动作类、多人在线、射击类(“action”)电子游戏。这与用户历史玩过的游戏“Half-Life”有高度关联。
在图 :numref:`xai_kg_recommendation`中有两个突出显示的相关概率(38.6%, 21.1%),它们是在推荐过程中模型计算的关键路径被激活的概率。红色箭头突出显示从“Team Fortress 2”到历史项目“Half-Life”之间的关键路径。它表明TB-Net能够通过各种关系连接向用户推荐物品,并给出关键路径作为解释。因此,将“Team Fortress 2”推荐给用户的解释可以翻译成固定话术:“Team Fortress 2”是游戏公司“Valve”开发的一款动作类、多人在线、射击类电子游戏。这与用户历史玩过的游戏“Half-Life”有高度关联。
## 未来可解释AI
@@ -153,3 +143,7 @@ Multiplayer, MOBA.](../img/ch11/xai_kg_recommendataion.png)
此外XAI系统的部署也非常需要一个更加标准和更加统一的评估框架。为了构建标准统一的评估框架我们可能需要同时利用不同的指标相互补充。不同的指标可能适用于不同的任务和用户。统一的评价框架应具有相应的灵活性。
最后我们相信跨学科合作将是有益的。XAI的发展不仅需要计算机科学家来开发先进的算法还需要物理学家、生物学家和认知科学家来揭开人类认知的奥秘以及特定领域的专家来贡献他们的领域知识。
## 参考文献
:bibliography:`../references/explainable.bib`

View File

@@ -30,7 +30,7 @@
: 中间表示的分类
:::
1\) 线性中间表示/Users/liangzhibo/Desktop/中间表示-中间表示结构.png
1\) 线性中间表示
线性中间表示类似抽象机的汇编代码,将被编译代码表示为操作的有序序列,对操作序列规定了一种清晰且实用的顺序。由于大多数处理器采用线性的汇编语言,线性中间表示广泛应用于编译器设计。
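上文所说"操作的有序序列",可以用把嵌套表达式下沉为三地址码的过程来示意(草图,指令命名为作者假设):

```python
def lower(expr):
    """把嵌套的表达式元组下沉为线性的有序指令序列(三地址码风格)。"""
    code, counter = [], [0]

    def emit(e):
        if not isinstance(e, tuple):        # 叶子:变量或常量,直接返回名字
            return e
        op, lhs, rhs = e
        a, b = emit(lhs), emit(rhs)         # 先线性化两个操作数
        counter[0] += 1
        t = f"t{counter[0]}"                # 为结果分配一个临时变量
        code.append(f"{t} = {op} {a} {b}")  # 每行一条操作,顺序即求值顺序
        return t

    emit(expr)
    return code

print(lower(("add", ("mul", "x", "y"), "z")))
# -> ['t1 = mul x y', 't2 = add t1 z']
```

树形表达式被"压平"为一条指令接一条指令,这正是线性中间表示贴近汇编的原因。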

chapter_preface/index.md Normal file
View File

@@ -0,0 +1,50 @@
# 序言
## 缘起
我在2020年来到了爱丁堡大学信息学院爱丁堡大学是AI研究的发源地之一很多学生慕名而来学习机器学习技术。因此我们拥有许多出色的AI课程自然语言处理计算机视觉计算神经学等同时也拥有一系列关于计算机系统的基础课程操作系统编程语言编译器计算机体系架构等。但是当我在教学的过程中问起学生机器学习是如何利用计算机系统来做到计算加速和大规模部署的许多学生都会报来疑惑的眼神。而这也促使我思考在爱丁堡大学乃至于其他世界顶尖大学的教学大纲里我们是不是缺乏了一门课程来衔接机器学习技术和计算机系统知识。
我第一反应是寻找一门已有的课程来借鉴。当其时加州伯克利大学的AI Systems较为知名。这门课描述了机器学习系统的不同研究方向内容以研读论文为主。可惜的是许多论文已经无法经受住时间的检验。更重要的是这门课缺乏对于知识的整体梳理形成完整的知识体系架构。学习完这个课程学生并没有明确的思路可以从头搭建起来一个机器学习框架。而将目光投向其他地方华盛顿大学曾短期开过Deep Learning Systems课程这门课程讲述了机器学习程序的编译过程其受限于服务TVM的目的对于机器学习系统缺乏完整的解读。另外斯坦福大学的Machine Learning Systems Design因为课程设计人是数据库背景因此课程专注数据清洗数据管理数据标注等数据专题。
当时觉得比较合适的是微软亚洲研究院的AI Systems。这门课程在研读论文的同时,一定程度上讲述了AI框架背后的设计理念。但是当我准备将其教授给本科生的时候,我发现这门课对于机器学习系统核心设计理念讲解很浅,同时也要求学生具有大量的背景知识,实际上更适合给博士生授课。抛开内容不谈,上述的全部课程共同的核心问题是:它们给学生的阅读材料都是高深,零散,甚至过时的论文,而不是一本全面,注重基础,语言通俗易懂的面向本科生和工程师的教科书,这给机器学习系统相关知识的传播造成了极大的困难。
回首2020年的世界我们已经拥有了优秀的操作系统数据库分布式系统等基础性教材。在机器学习领域我们也拥有了一系列机器学习算法的教材。然而无论是英语世界还是中文世界我竟找不到任何一本系统性讲述机器学习系统的教材。而这本教材的缺乏让许多公司和高校实验室不得不花费大量的人力和物力从头培养学生和工程师对于机器学习基础架构的认识这已经制约了高校培养出符合业界学界和时代发展的人才了因此我开始思考我们学界和业界是不是需要一本机器学习系统的教科书了呢
## 开端
带着写书的构想,我开始和身边的朋友沟通。几乎全部人都非常认可这本书的巨大价值,但是现实的情况是,没有人愿意做这么一件吃力不讨好的事情。我当时的博士后导师也劝我,我现在处在助理教授的关键阶段,追求高影响力的学术论文是当务之急,写一本书要耗费3-4年的精力,最后可能也无法面世。而当我和同行交流时,也发现人们更愿意改进市面上已经有的教科书,做有迹可循的事情,而不是摸着石头过河,做从无到有的事情。特别是对于机器学习系统这个快速发展,依然在试错的领域,能不能写出一本能够经受住时间检验的书,也是一个巨大的未知数。
我因此不得不将这个想法暂时藏在了心里数月,直到一次探亲回国和朋友聊天。这个朋友就是MindSpore的架构师金雪锋。和雪锋的相识是在疫情前的最后一个圣诞节左右,雪锋来伦敦访问,他正在领导MindSpore的开发,当时1.0还没有发布。而在世界的另一端,我在2018年也和好友一起试图从头搭建一个AI框架,虽然最终资源不足,无疾而终,不过许多的思考成就了我之后发表的多篇AI系统论文。和雪锋聊起来,我们都对AI框架开发之难深有同感。我们共同的感慨就是:找到懂AI框架开发的人,太难了。现今的学生们都一心学习机器学习算法,很多学生对于底层的运作原理理解很粗浅。而当他们在真实世界中应用机器学习,意识到系统的重要性,想去学习的时候,却没有了在学校中最充沛的学习时间。我因此对雪锋苦笑道:我是准备写一本机器学习系统教材的,但是可能还要等个3-4年。雪锋这时候说:我也有这个想法啊,你要是写的话,我能帮助到你吗?
雪锋这句话其实点醒了我。传统的书籍写作往往是依赖于1-2个教授将学科十余年的发展慢慢总结整理出书。这种模式类似于传统软件开发的瀑布流方式。可是科技的世界已经变了,软件的发展从传统的瀑布流进化到如今的开源敏捷开发。而书籍的写作为什么还要停留在传统方式呢?MXNet团队构建开源社区来编写的专注于深度学习算法的书籍《Dive into Deep Learning》就是一个很好的例子啊。我因此马上找到当年一起创立TensorLayer开源社区的小伙伴,北京大学的董豪,我们一拍即合,说干就干!雪锋也很高兴我和董豪愿意开始做这件事,也邀请了他的同事干志良进来帮助我们。我们终于开始书籍的写作了!
经过几轮的讨论,我们将书籍的名字定为《机器学习系统:设计和实现》。我们希望这本书能教给学生经受住时间检验的机器学习系统设计原理,同时也提供大量的系统实现经验分享,让他们将来工作,科研中遇到实际问题知道该如何分析和解决。
## 社区的构建
考虑到机器学习系统本身就是一个依然在发展试错并且频繁孕育细分领域的学科。我从一开始就在思考如何设计一个高度可扩展Scalable的社区架构来保证这本书的可持续发展呢因为我是专注于大规模软件系统的老师我决定借鉴几个分布式系统的设计要点来构建社区
* 预防单点瓶颈现代分布式系统往往采用控制层和数据层分离的设计来避免单点故障和瓶颈。那么我们在设计高度可扩展的写作社区的时候也要如此。因此我们设计了如下分布式机制编辑类似于分布式系统的Leader决定花最大的时间来寻找每个章节最优秀主动负责任的章节负责人(Local leader)。而章节负责人可以进一步寻找其他作者Follower共同协作。而章节负责人和章节作者进行密切的沟通按照给定时间节点全速异步推进。而编辑和章节负责人设定了每隔1周的讨论来同步Synchronise写作的进展确保并行完成的章节质量能够持续符合编辑和社区的整体预期。
* 迭代式改进深度学习的优化算法随机梯度下降本质上是在复杂问题中利用局部梯度进行海量迭代最终找到优秀的局部最优解。我因此利用了同样的思路来设计书籍质量的迭代提高。我们首先在Overleaf上写作好书籍的初版类似于初始参数Initial Weights。接下来我们进一步将书籍的内容做成标准的Git代码仓库Book as code。建立机制鼓励开源社区和广大读者开启Issue和PR频繁改进书籍相当于梯度Gradients而我们设置好完善的书籍构建工具持续集成工具贡献者讨论会标准化的Issue和PR合并流程等等就可以让书籍的质量持续提高实现随机梯度下降Stochastic Gradient Descent一样的最终最优性。
* 高可用性:我们要有7x24小时在线的平台,让书籍可以在全球任何时区,任何语言平台下都能参与开发,倾听社区的反馈。因此我们将Git仓库放置在GitHub上,并准备之后在Gitee做好镜像。这样我们就搭建了一套高可用的写作平台了。
* 内容中立:一个分布式系统要能长久运行,其中的每一个节点我们要同等对待,遇到故障才能用统一的办法来进行故障恢复。考虑到书籍写作中的故障(设计无法经受时间检验,写作人中途不得不退出等等)可以来源于方方面面,我们让不同背景的参与者共同完成每一个章节,确保写出中立,客观,包容各类型观点的书籍内容,并且写作不会因为故障而中断。
## 现状和未来
机制一旦建立好,写作就自动化地跑起来了,同行人也越来越多:我带过的学生袁秀龙、丁子涵、符尧也很用心参与,董豪邀请了鹏城实验室的韩佳容和赖铖,志良邀请了许多MindSpore的小伙伴进来贡献,许多资深的AI框架的设计者也和我们在各个渠道展开讨论,提供了非常多宝贵的写作建议。另外学界的教授(Peter Pietzuch老师,陈雷老师等)也持续给我们内容提供详细的反馈。
充分发动了“分布式系统”的力量后书籍的内容得以持续高质量的合并了进来。当我们开源了书籍以后书籍的受众快速增长GitHub上的关注度增长让我们受宠若惊。在社区的推动下书籍的中文版英文版阿拉伯语版都已经开始推进。这么多年来我第一次意识到我在分布式系统和机器学习里面学习到的知识在解决现实复杂问题的时候是如此的有用
很多时候,当我们面对未知而巨大的困难,个人的力量真的渺小。而和朋友,社区一起,就变成了强大的力量,让我们鼓起勇气,走出了最关键的第一步!希望我的一些思考,能给其他复杂问题的求解带来一些小小的启发。
最后我们非常欢迎新成员的加入来帮助书籍提升质量扩展内容。感兴趣的读者可以通过书籍的GitHub社区https://github.com/openmlsys/ 联系到我们,我们非常期待和大家一起努力,写出世界上第一本机器学习系统的书籍!
麦络
写于英国爱丁堡
2022年5月4日

View File

@@ -12,4 +12,8 @@
- 利用多级缓存支持超大规模深度学习推荐系统训练:[Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems](https://arxiv.org/abs/2003.05622)
- 工业界机器学习系统的实践:[Hidden Technical Debt in Machine Learning Systems](https://papers.nips.cc/paper/2015/hash/86df7dcfd896fcaf2674f757a2463eba-Abstract.html)
## 参考文献
:bibliography:`../references/recommender.bib`

View File

@@ -1,10 +0,0 @@
```eval_rst
.. only:: html
参考文献
==========
```
:bibliography:`../mlsys.bib`

View File

@@ -28,7 +28,7 @@
Ray :cite:`moritz2018ray`是由伯克利大学几名研究人员发起的一个分布式计算框架基于Ray之上构建了一个专门针对强化学习的系统RLlib :cite:`liang2017ray`。RLlib是一个面向工业级应用的开源强化学习框架同时包含了强化学习的算法库它对非强化学习专家使用也很方便。
![RLlib分布式训练](../img/ch12/ch12-rllib-distributed.png)
![RLlib分布式训练](../img/ch12/ch12-rllib-distributed.svg)
:width:`600px`

View File

@@ -1,3 +1,7 @@
## 小结
在这一章我们简单介绍了强化学习的基本概念包括单智能体和多智能体强化学习算法、单节点和分布式强化学习系统等给读者对强化学习问题的基本认识。当前强化学习是一个快速发展的深度学习分支许多实际问题都有可能通过强化学习算法的进一步发展得到解决。另一方面由于强化学习问题设置的特殊性如需要与环境交互进行采样等也使得相应算法对计算系统的要求更高如何更好地平衡样本采集和策略训练过程如何均衡CPU和GPU等不同计算硬件的能力如何在大规模分布式系统上有效部署强化学习智能体等等都需要对计算机系统的设计和使用有更好的理解。
在这一章我们简单介绍了强化学习的基本概念包括单智能体和多智能体强化学习算法、单节点和分布式强化学习系统等给读者对强化学习问题的基本认识。当前强化学习是一个快速发展的深度学习分支许多实际问题都有可能通过强化学习算法的进一步发展得到解决。另一方面由于强化学习问题设置的特殊性如需要与环境交互进行采样等也使得相应算法对计算系统的要求更高如何更好地平衡样本采集和策略训练过程如何均衡CPU和GPU等不同计算硬件的能力如何在大规模分布式系统上有效部署强化学习智能体等等都需要对计算机系统的设计和使用有更好的理解。
## 参考文献
:bibliography:`../references/reinforcement.bib`

View File

@@ -4,14 +4,14 @@
从机器学习的角度来看,未来的主要挑战之一是超越模式识别并解决数据驱动控制和动态过程优化方面的问题。
理论方面线性二次控制Linear-Quadratic
Control是经典的控制方法最近有关于图神经网络在分布式线性二次控制的研究 :cite:`pmlr-v144-gama21a`。作者称将线性二次问题转换为自监督学习问题能够找到基于图神经网络Graph
Control是经典的控制方法最近有关于图神经网络在分布式线性二次控制的研究。作者称将线性二次问题转换为自监督学习问题能够找到基于图神经网络Graph
Neural
NetworksGNN的最佳分布式控制器他们还推导出了所得闭环系统稳定的充分条件。随着基于数据和学习的机器人控制方法不断得到重视研究人员必须了解何时以及如何在现实世界中最好地利用这些方法因为安全是至关重要的有的研究通过学习不确定的动力学来安全地提高性能鼓励安全或稳健的强化学习方法以及可以正式认证所学控制策略的安全性的方法 :cite:`brunke2021safe`。 :numref:`safe\_learning\_control`展示了安全学习控制Safe Learning
Control系统的框架图用数据驱动的方法来学习控制策略兼顾安全性。Lyapunov :cite:`pmlr-v144-mehrjou21a`
NetworksGNN的最佳分布式控制器他们还推导出了所得闭环系统稳定的充分条件。随着基于数据和学习的机器人控制方法不断得到重视研究人员必须了解何时以及如何在现实世界中最好地利用这些方法因为安全是至关重要的有的研究通过学习不确定的动力学来安全地提高性能鼓励安全或稳健的强化学习方法以及可以正式认证所学控制策略的安全性的方法。 :numref:`safe\_learning\_control`展示了安全学习控制Safe Learning
Control系统的框架图用数据驱动的方法来学习控制策略兼顾安全性。Lyapunov
函数是评估非线性动力系统稳定性的有效工具最近有人提出Neural
Lyapunov来将安全性纳入考虑。
应用方面,有基于神经网络的自动驾驶汽车模型预测控制 :cite:`vianna2021neural`,也有研究将最优控制和学习相结合并应用在陌生环境中的视觉导航 :cite:`pmlr-v100-bansal20a`该研究将基于模型的控制与基于学习的感知相结合来解决。基于学习的感知模块产生一系列航路点通过无碰撞路径引导机器人到达目标。基于模型的规划器使用这些航路点来生成平滑且动态可行的轨迹该轨迹使用反馈控制在物理系统上执行。在模拟的现实世界杂乱环境和实际地面车辆上的实验表明与纯粹基于几何映射或基于端到端学习的替代方案相比这种新的系统可以在新环境中更可靠、更有效地到达目标位置。强化学习和模仿学习与控制论有密切联系LEOC :cite:`pmlr-v144-zhang21b`整合了强化学习和经典控制理论的原则方法。有人将基于模型的离线强化学习算法扩展到高维视觉观察空间并在真实机器人上执行基于图像的抽屉关闭任务方面表现出色 :cite:`pmlr-v144-rafailov21a`。控制部分通过神经网络优化可以更加平滑、节能、安全,如何将
应用方面有基于神经网络的自动驾驶汽车模型预测控制也有研究将最优控制和学习相结合并应用在陌生环境中的视觉导航该研究将基于模型的控制与基于学习的感知相结合来解决。基于学习的感知模块产生一系列航路点通过无碰撞路径引导机器人到达目标。基于模型的规划器使用这些航路点来生成平滑且动态可行的轨迹该轨迹使用反馈控制在物理系统上执行。在模拟的现实世界杂乱环境和实际地面车辆上的实验表明与纯粹基于几何映射或基于端到端学习的替代方案相比这种新的系统可以在新环境中更可靠、更有效地到达目标位置。强化学习和模仿学习与控制论有密切联系LEOC整合了强化学习和经典控制理论的原则方法。有人将基于模型的离线强化学习算法扩展到高维视觉观察空间并在真实机器人上执行基于图像的抽屉关闭任务方面表现出色。控制部分通过神经网络优化可以更加平滑、节能、安全如何将
神经网络和传统控制理论结合,特别是和运动学算法相结合,将会是一个有趣的方向。
![安全学习控制系统,数据被用来更新控制策略或安全滤波器 :cite:`brunke2021safe`](../img/ch13/safe_learning_control.png)

View File

@@ -1,20 +1,20 @@
## 感知系统
感知系统不仅可以包括视觉,还可以包含触觉、声音等。在未知环境中,机器人想实现自主移动和导航必须知道自己在哪(例如通过相机重定位 :cite:`ding2019camnet`周围什么情况例如通过3D物体检测 :cite:`yi2020segvoxelnet`或语义分割),这些要依靠感知系统来实现 :cite:`xu2019depth,xu2020selfvoxelo,xu2022rnnpose,xu2022robust,yang2021pdnet,huang2021vs,huang2021life,huang2019prior,zhu2020ssn`
感知系统不仅可以包括视觉,还可以包含触觉、声音等。在未知环境中,机器人想实现自主移动和导航必须知道自己在哪(例如通过相机重定位 :cite:`ding2019camnet`周围什么情况例如通过3D物体检测 :cite:`yi2020segvoxelnet`或语义分割),这些要依靠感知系统来实现 :cite:`xu2019depth`
一提到感知系统不得不提的就是即时定位与建图Simultaneous Localization
and
MappingSLAM)系统。SLAM大致过程包括地标提取、数据关联、状态估计、状态更新以及地标更新等。视觉里程计Visual
Odometry是SLAM中的重要部分它估计两个时刻机器人的相对运动Ego-motion。ORB-SLAM :cite:`campos2021orb`系列是视觉SLAM中有代表性的工作 :numref:`orbslam3` 展示了最新的ORB-SLAM3的主要系统组件。香港科技大学开源的基于单目视觉与惯导融合的SLAM技术VINS-Mono :cite:`8421746`也很值得关注。多传感器融合、优化数据关联与回环检测、与前端异构处理器集成、提升鲁棒性和重定位精度都是SLAM技术接下来的发展方向。
Odometry是SLAM中的重要部分它估计两个时刻机器人的相对运动Ego-motion。ORB-SLAM系列是视觉SLAM中有代表性的工作 :numref:`orbslam3` 展示了最新的ORB-SLAM3的主要系统组件。香港科技大学开源的基于单目视觉与惯导融合的SLAM技术VINS-Mono也很值得关注。多传感器融合、优化数据关联与回环检测、与前端异构处理器集成、提升鲁棒性和重定位精度都是SLAM技术接下来的发展方向。
最近随着机器学习的兴起基于学习的SLAM框架也被提了出来。TartanVO是第一个基于学习的视觉里程计VO模型该模型可以推广到多个数据集和现实世界场景并优于传统基于几何的方法。
UnDeepVO :cite:`li2018undeepvo`是一个无监督深度学习方案,能够通过使用深度神经网络估计单目相机的
6-DoF 位姿及其视图深度。DROID-SLAM :cite:`teed2021droid`是用于单目、立体和
UnDeepVO是一个无监督深度学习方案能够通过使用深度神经网络估计单目相机的
6-DoF 位姿及其视图深度。DROID-SLAM是用于单目、立体和
RGB-D 相机的深度视觉 SLAM它通过Bundle
Adjustment层对相机位姿和像素深度的反复迭代更新,具有很强的鲁棒性:故障大大减少;尽管对单目视频进行了训练,但它可以利用双目立体或
RGB-D 视频在测试时提高性能。其中Bundle Adjustment
(BA)与机器学习的结合被广泛研究 :cite:`tang2018ba,tanaka2021learning`。CMU提出通过主动神经
(BA)与机器学习的结合被广泛研究。CMU提出通过主动神经
SLAM
的模块化系统帮助智能机器人在未知环境中的高效探索 :cite:`chaplot2020learning`
的模块化系统帮助智能机器人在未知环境中的高效探索。
![ORB-SLAM3主要系统组件 :cite:`campos2021orb`](../img/ch13/orbslam3.png)
@@ -22,4 +22,3 @@ SLAM
:label:`orbslam3`
:bibliography:`../mlsys.bib`

View File

@@ -2,15 +2,13 @@
规划不仅包含运动路径规划,还包含高级任务规划 :cite:`9712373`。其中,运动规划是机器人技术的核心问题之一,应用范围从导航到复杂环境中的操作。它具有悠久的研究历史,方法需要有概率完整性和最优性的保证。然而,当经典运动规划在处理现实世界的机器人问题(在高维空间中)时,挑战仍然存在。研究人员在继续开发新算法来克服与这些方法相关的限制,包括优化计算和内存负载、更好的规划表示和处理维度灾难等。
相比之下,机器学习的最新进展为机器人专家研究运动规划问题开辟了新视角:经典运动规划器的瓶颈可以以数据驱动的方式解决;基于深度学习的规划器可以避免几何输入的局限性,例如使用视觉或语义输入进行规划等。最近的工作有:基于深度神经网络的四足机器人快速运动规划框架,通过贝叶斯学习进行运动规划 :cite:`quintero2021motion`通过运动规划器指导的视觉运动策略学习。ML4KP :cite:`ML4KP`是一个用于有效运动动力学运动规划的C++库,该库可以轻松地将机器学习方法集成到规划过程中。
自动驾驶领域和行人和车辆轨迹预测 :cite:`qiu2021egocentric`方面也涌现出使用机器学习解决运动规划的工作比如斯坦福大学提出Trajectron++ :cite:`salzmann2020trajectron++`。强化学习在规划系统上也有重要应用 :cite:`aradi2020survey,sun2021adversarial`比如基于MetaDrive模拟器 :cite:`li2021metadrive`,最近有一些关于多智能体强化学习,多智能体车流模拟、驾驶行为分析 :cite:`peng2021learning`,考虑安全性因素的强化学习 :cite:`peng2021safe`以及拓展到由真人专家在旁边监督出现危险的时候接管的专家参与的强化学习工作Online
相比之下机器学习的最新进展为机器人专家研究运动规划问题开辟了新视角经典运动规划器的瓶颈可以以数据驱动的方式解决基于深度学习的规划器可以避免几何输入的局限性例如使用视觉或语义输入进行规划等。最近的工作有基于深度神经网络的四足机器人快速运动规划框架通过贝叶斯学习进行运动规划通过运动规划器指导的视觉运动策略学习。ML4KP是一个用于有效运动动力学运动规划的C++库,该库可以轻松地将机器学习方法集成到规划过程中。
自动驾驶领域和行人和车辆轨迹预测方面也涌现出使用机器学习解决运动规划的工作比如斯坦福大学提出Trajectron++。强化学习在规划系统上也有重要应用 :cite:`sun2021adversarial`比如基于MetaDrive模拟器 :cite:`li2021metadrive`,最近有一些关于多智能体强化学习,多智能体车流模拟、驾驶行为分析 :cite:`peng2021learning`,考虑安全性因素的强化学习 :cite:`peng2021safe`以及拓展到由真人专家在旁边监督出现危险的时候接管的专家参与的强化学习工作Online
Imitation Learning、Offline
RL:cite:`li2021efficient`样本效率极高是单纯强化学习算法的50倍。为了更好地说明强化学习是如何应用在自动驾驶中的 :numref:`rl\_ad`展示了一个基于深度强化学习的自动驾驶POMDP模型。
RL :cite:`li2021efficient`样本效率极高是单纯强化学习算法的50倍。为了更好地说明强化学习是如何应用在自动驾驶中的 :numref:`rl\_ad`展示了一个基于深度强化学习的自动驾驶POMDP模型。
![基于深度强化学习的自动驾驶POMDP模型 :cite:`aradi2020survey`](../img/ch13/rl_ad.png)
:width:`800px`
:label:`rl\_ad`
:bibliography:`../mlsys.bib`

View File

@@ -1,32 +1,32 @@
## 概述
机器人学是一个交叉学科,它涉及了计算机科学、机械工程、电气工程、生物医学工程、数学等多种学科,并有诸多应用,比如自动驾驶汽车、机械臂、无人机、医疗机器人等。机器人能够自主地完成一种或多种任务或者辅助人类完成指定任务。通常,人们把机器人系统划分为感知系统、决策(规划)和控制系统等组成部分 :cite:`buehler2009darpa`
机器人学是一个交叉学科,它涉及了计算机科学、机械工程、电气工程、生物医学工程、数学等多种学科,并有诸多应用,比如自动驾驶汽车、机械臂、无人机、医疗机器人等。机器人能够自主地完成一种或多种任务或者辅助人类完成指定任务。通常,人们把机器人系统划分为感知系统、决策(规划)和控制系统等组成部分。
近些年随着机器学习的兴起经典机器人技术出现和机器学习技术结合的趋势称为机器人学习Robot
Learning:cite:`peters2016robot`。机器人学习包含了计算机视觉、自然语言处理、语音处理、强化学习和模仿学习等人工智能技术在机器人上的应用,让机器人通过学习,自主地执行各种决策控制任务。
Learning。机器人学习包含了计算机视觉、自然语言处理、语音处理、强化学习和模仿学习等人工智能技术在机器人上的应用让机器人通过学习自主地执行各种决策控制任务。
机器人学习系统Robot Learning
System是一个较新的概念。作为系统和机器人学习的交叉方向仿照机器学习系统的概念我们把机器人学习系统定义为"支持机器人模型训练和部署的系统"。按照涉及的机器人数量,可以划分为单机器人学习系统和多机器人学习系统。多机器人学习系统协作和沟通中涉及的安全和隐私问题,也会是一个值得研究的方向。最近机器人学习系统在室内自主移动 :cite:`zhu2017target,pmlr-v100-bansal20a,9123682,huang2018navigationnet`,道路自动驾驶 :cite:`pmlr-v155-huang21a,pmlr-v155-sun21a,Sun2022SelfSupervisedTA`,机械臂工业操作 :cite:`tobin2017domain,finn2017deep,chen2020transferable,duan2017one`等行业场景得到充分应用和发展。一些机器人学习基础设施项目也在进行中,如具备从公开可用的互联网资源、计算机模拟和
真实机器人试验中学习能力的大规模的计算系统RobotBrain :cite:`saxena2014robobrain`。在自动驾驶领域,受联网的自动驾驶汽车
(CAV) 对传统交通运输行业的影响,"车辆计算"(Vehicle Computing) :cite:`9491826`
(如 :numref:`vehicle-computing`)概念引起广泛关注并激发了如何让计算能力有限使用周围的CAV计算平台来执行复杂的计算任务的研究。最近有很多自动驾驶系统的模拟器代表性的比如CARLA :cite:`Dosovitskiy17`支持安全RL、MARL、真实地图数据导入、泛化性测试等任务的MetaDrive :cite:`li2021metadrive`还有CarSim和
TruckSim :cite:`benekohal1988carsim`它们可以作为各种自动驾驶算法的训练场并对算法效果进行评估。另外针对自动驾驶的系统开发平台也不断涌现如ERDOS,
System是一个较新的概念。作为系统和机器人学习的交叉方向仿照机器学习系统的概念我们把机器人学习系统定义为"支持机器人模型训练和部署的系统"。按照涉及的机器人数量,可以划分为单机器人学习系统和多机器人学习系统。多机器人学习系统协作和沟通中涉及的安全和隐私问题,也会是一个值得研究的方向。最近机器人学习系统在室内自主移动 :cite:`9123682,huang2018navigationnet`,道路自动驾驶 :cite:`pmlr-v155-huang21a,pmlr-v155-sun21a,Sun2022SelfSupervisedTA`,机械臂工业操作等行业场景得到充分应用和发展。一些机器人学习基础设施项目也在进行中,如具备从公开可用的互联网资源、计算机模拟和
真实机器人试验中学习能力的大规模的计算系统RobotBrain。在自动驾驶领域受联网的自动驾驶汽车
(CAV) 对传统交通运输行业的影响,"车辆计算"(Vehicle Computing)
(如 :numref:`vehicle-computing`)概念引起广泛关注并激发了如何让计算能力有限使用周围的CAV计算平台来执行复杂的计算任务的研究。最近有很多自动驾驶系统的模拟器代表性的比如CARLA支持安全RL、MARL、真实地图数据导入、泛化性测试等任务的MetaDrive :cite:`li2021metadrive`还有CarSim和
TruckSim它们可以作为各种自动驾驶算法的训练场并对算法效果进行评估。另外针对自动驾驶的系统开发平台也不断涌现如ERDOS,
D3 (Dynamic
Deadline-Driven) :cite:`10.1145/3492321.3519576`和强调模块化思想的Pylot :cite:`gog2021pylot`,可以让模型训练与部署系统与这些平台对接。
Deadline-Driven)和强调模块化思想的Pylot,可以让模型训练与部署系统与这些平台对接。
![车辆计算框架图 :cite:`9491826`](../img/ch13/vehicle_computing.png)
:width:`800px`
:label:`vehicle\_computing`
:label:`vehicle-computing`
:numref:`learning\_decision\_module`是一个典型的感知、规划、控制的模块化设计的自动驾驶系统框架图,接下来,我们也将按照这个顺序依次介绍通用框架、感知系统、规划系统和控制系统。
![通过模仿学习进行自动驾驶框架图。
绿线表示自主驾驶系统的模块化流程。橙色实线表示神经判别器的训练。而橙色虚线表示规划和控制模块是不可微的。但是决策策略可以通过判别器对控制行动的奖励和重新参数化技术进行训练,如蓝色虚线所示 :cite:`pmlr-v155-huang21a`](../img/ch13/idm.png)
绿线表示自主驾驶系统的模块化流程。橙色实线表示神经判别器的训练。而橙色虚线表示规划和控制模块是不可微的。但是决策策略可以通过判别器对控制行动的奖励和重新参数化技术进行训练,如蓝色虚线所示。](../img/ch13/idm.png)
:width:`800px`
:label:`learning\_decision\_module`
:bibliography:`../mlsys.bib`

View File

@@ -6,7 +6,7 @@
:label:`ROS2\_arch`
机器人操作系统(ROS) :cite:`quigley2009ros,maruyama2016exploring,koubaa2017robot`起源于斯坦福大学人工智能实验室的一个机器人项目。它是一个自由、开源的框架,提供接口、工具来构建先进的机器人。由于机器人领域的快速发展和复杂化,代码复用和模块化的需求日益强烈,ROS适用于机器人这种多节点多任务的复杂场景。目前也有一些机器人、无人机甚至无人车都开始采用ROS作为开发平台。在机器人学习方面,ROS/ROS2可以与深度学习结合,有开发人员为ROS/ROS2开发了深度学习节点,并支持NVIDIA
机器人操作系统(ROS)起源于斯坦福大学人工智能实验室的一个机器人项目。它是一个自由、开源的框架,提供接口、工具来构建先进的机器人。由于机器人领域的快速发展和复杂化,代码复用和模块化的需求日益强烈,ROS适用于机器人这种多节点多任务的复杂场景。目前也有一些机器人、无人机甚至无人车都开始采用ROS作为开发平台。在机器人学习方面,ROS/ROS2可以与深度学习结合,有开发人员为ROS/ROS2开发了深度学习节点,并支持NVIDIA
Jetson和TensorRT。NVIDIA
Jetson是NVIDIA为自主机器开发的一个嵌入式系统包括CPU、GPU、PMIC、DRAM
和闪存的一个模组化系统可以将自主机器软件运作系统运行速率提升。TensorRT
@@ -18,11 +18,11 @@ Server、动作库ActionLib这四种。
ROS提供了很多内置工具比如三维可视化器rviz用于可视化机器人、它们工作的环境和传感器数据。它是一个高度可配置的工具具有许多不同类型的可视化和插件。catkin是ROS
构建系统类似于Linux下的CMakeCatkin
Workspace是创建、修改、编译catkin软件包的目录。roslaunch可用于在本地和远程启动多个ROS
节点以及在ROS参数服务器上设置参数的工具。此外还有机器人仿真工具Gazebo和移动操作软件和规划框架MoveIt! :cite:`coleman2014reducing`。ROS为机器人开发者提供了不同编程语言的接口比如C++语言ROS接口roscpppython语言的ROS接口rospy。ROS中提供了许多机器人的统一机器人描述格式URDFUnified
节点以及在ROS参数服务器上设置参数的工具。此外还有机器人仿真工具Gazebo和移动操作软件和规划框架MoveIt!。ROS为机器人开发者提供了不同编程语言的接口比如C++语言ROS接口roscpppython语言的ROS接口rospy。ROS中提供了许多机器人的统一机器人描述格式URDFUnified
Robot Description
Format文件URDF使用XML格式描述机器人文件。ROS也有一些需要提高的地方比如它的通信实时性能有限与工业级要求的系统稳定性还有一定差距。
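ROS 赖以组织多节点系统的话题(topic)发布/订阅模型,其核心语义可以用几行纯 Python 勾勒(玩具草图,类与话题名均为假设,并非 rospy/rclpy 的真实 API):

```python
class Bus:
    """玩具版话题总线:按话题名登记回调,发布时逐一投递。"""

    def __init__(self):
        self.topics = {}                       # 话题名 -> 订阅者回调列表

    def subscribe(self, topic, callback):
        self.topics.setdefault(topic, []).append(callback)

    def publish(self, topic, msg):
        for cb in self.topics.get(topic, []):  # 投递给该话题的每个订阅者
            cb(msg)

bus = Bus()
received = []
bus.subscribe("/cmd_vel", received.append)     # "控制器"节点订阅速度指令
bus.publish("/cmd_vel", {"linear": 0.5})       # "规划器"节点发布一条消息
print(received)  # -> [{'linear': 0.5}]
```

真实的 ROS 在此之上增加了跨进程传输、消息类型定义以及服务(Service)、动作(Action)等更丰富的通信原语。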
ROS2 :cite:`maruyama2016exploring`项目在ROSCon 2014上被宣布第一个ROS2发行版
ROS2项目在ROSCon 2014上被宣布第一个ROS2发行版
Ardent Apalone
是于2017年发布。ROS2增加了对多机器人系统的支持提高了多机器人之间通信的网络性能而且支持微控制器和跨系统平台不仅可以运行在现有的X86和ARM系统上还将支持MCU等嵌入式微控制器不止能运行在Linux系统之上还增加了对Windows、MacOS、RTOS等系统的支持。更重要的是ROS
2还加入了实时控制的支持可以提高控制的时效性和整体机器人的性能。ROS
@@ -102,7 +102,5 @@ rqt是ROS的一个软件框架以插件的形式实现了各种 GUI 工具。
:label:`ros2\_actions`
:bibliography:`../mlsys.bib`
[^1]: https://docs.ros.org/en/foxy/Tutorials/Understanding-ROS2-Nodes.html

View File

@@ -1,3 +1,7 @@
## 小结
在这一章,我们简单介绍了机器人学习系统的基本概念,包括通用机器人操作系统、感知系统、规划系统和控制系统等,给读者对机器人学习问题的基本认识。当前,机器人学习是一个快速发展的人工智能分支,许多实际问题都有可能通过机器人学习算法的进一步发展得到解决。另一方面,由于机器人学习问题设置的特殊性,也使得相应系统与相关硬件的耦合程度更高、更复杂:如何更好地平衡各种传感器负载?如何在计算资源有限的情况下最大化计算效率(实时性)?等等,都需要对计算机系统的设计和使用有更好的理解。
## 参考文献
:bibliography:`../references/rlsys.bib`

View File

@@ -19,7 +19,7 @@ lang = zh
notebooks = *.md */*.md
# A list of files that will be copied to the build folder.
resources = img/ mlsyszh/ mlsys.bib
resources = img/ references/
# Files that will be skipped.
exclusions = */*_origin.md README.md info/* contrib/*md
@@ -86,24 +86,3 @@ html_logo = static/logo-with-text.png
# post_latex = ./static/post_latex/main.py
latex_logo = static/logo.png
#[deploy]
#other_file_s3urls = s3://d2l-webdata/releases/d2l-zh/d2l-zh-1.0.zip
# s3://d2l-webdata/releases/d2l-zh/d2l-zh-1.1.zip
# s3://d2l-webdata/releases/d2l-zh/d2l-zh-2.0.0.zip
#google_analytics_tracking_id = UA-96378503-2
#[colab]
#github_repo = mxnet, d2l-ai/d2l-zh-colab
# pytorch, d2l-ai/d2l-zh-pytorch-colab
# tensorflow, d2l-ai/d2l-zh-tensorflow-colab
#replace_svg_url = img, http://d2l.ai/_images
#libs = mxnet, mxnet, -U mxnet-cu101==1.7.0
# mxnet, d2l, git+https://github.com/d2l-ai/d2l-zh@release # installing d2l
# pytorch, d2l, git+https://github.com/d2l-ai/d2l-zh@release # installing d2l
# tensorflow, d2l, git+https://github.com/d2l-ai/d2l-zh@release # installing d2l

(另有 10 处 img/ 目录下的二进制图片变更,内容未显示。)

View File

@@ -5,6 +5,7 @@
:maxdepth: 2
:numbered:
chapter_preface/index
chapter_introduction/index
chapter_programming_interface/index
chapter_computational_graph/index
@@ -32,5 +33,4 @@ chapter_rl_sys/index
:maxdepth: 1
appendix_machine_learning_introduction/index
chapter_references/index
```

View File

@@ -6,37 +6,39 @@
项目README: [@luomai](https://github.com/luomai)
第1章 - 导论[@luomai](https://github.com/luomai)
序言[@luomai](https://github.com/luomai)
第2章 - 编程接口:[@Laicheng0830](https://github.com/Laicheng0830)
导论:[@luomai](https://github.com/luomai)
第3章 - 计算图:[@hanjr92](https://github.com/hanjr92)
编程接口:[@Laicheng0830](https://github.com/Laicheng0830)
计算图:[@hanjr92](https://github.com/hanjr92)
进阶篇序言:[@ganzhiliang](https://github.com/ganzhiliang)
第4章 - 编译器前端和IR [@LiangZhibo](https://github.com/LiangZhibo)
编译器前端和IR [@LiangZhibo](https://github.com/LiangZhibo)
第5章 - 编译器后端和运行时: [@chujinjin101](https://github.com/chujinjin101)
编译器后端和运行时: [@chujinjin101](https://github.com/chujinjin101)
第6章 - 硬件加速器:[@anyrenwei](https://github.com/anyrenwei)
硬件加速器:[@anyrenwei](https://github.com/anyrenwei)
第7章 - 数据处理框架: [@eedalong](https://github.com/eedalong)
数据处理框架: [@eedalong](https://github.com/eedalong)
第8章 - 模型部署: [@AssassinG](https://github.com/AssassinGQ)
模型部署: [@AssassinG](https://github.com/AssassinGQ)
第9章 - 分布式训练系统: [@luomai](https://github.com/luomai)
分布式训练系统: [@luomai](https://github.com/luomai)
拓展篇序言:[@luomai](https://github.com/luomai)
第10章 - 深度学习推荐系统:[@future-xy](https://github.com/future-xy)
深度学习推荐系统:[@future-xy](https://github.com/future-xy)
第11章 - 联邦学习系统:[@chengtianwu](https://github.com/chengtianwu)
联邦学习系统:[@chengtianwu](https://github.com/chengtianwu)
第12章 - 强化学习系统:[@quantumiracle](https://github.com/quantumiracle)
强化学习系统:[@quantumiracle](https://github.com/quantumiracle)
第13章 - 可解释性AI系统[@HaoyangLee](https://github.com/HaoyangLee)
可解释性AI系统[@HaoyangLee](https://github.com/HaoyangLee)
第14章 - 机器人学习系统:[@Jack](https://github.com/Jiankai-Sun)
机器人系统:[@Jack](https://github.com/Jiankai-Sun)
附录:机器学习介绍:[@Hao](https://github.com/zsdonghao)

info/refenence_guide.md Normal file
View File

@@ -0,0 +1,35 @@
# 参考文献引用方式
下面为参考文献的引用,需要注意引用时前面需要有一个空格:
1. 单篇参考文献
这篇文章参考了论文 :cite:`cnn2015`
2. 多篇参考文献可以用逗号分开
这篇文章参考了论文 :cite:`cnn2015,rnn2015`
3. 此时在对应bib中应该有如下参考文献
@inproceedings{cnn2015,
title = {CNN},
author = {xxx},
year = {2015},
keywords = {xxx}
}
@inproceedings{rnn2015,
title = {RNN},
author = {xxx},
year = {2015},
keywords = {xxx}
}
# 参考文献置于章节末尾方式
1.将章节所引用的全部参考文献生成一个chapter.bib放置于references文件夹下。
如机器人系统章节将该章节参考文献全部放在rlsys.bib,并将其放在references文件夹下。
```
参考文献目录
/references/rlsys.bib
```
2.将对应章节参考文献引用添加至文章末尾处如机器人系统章节在summary最后加上
```
## 参考文献
:bibliography:`../references/rlsys.bib`
```

View File

@@ -92,7 +92,7 @@
:eqlabel:`linear`
公式引用使用 :eqref:`linear`
```
* 参考文献引用方式,参考文献放在mlsys.bib如需新增只需在该文件中添加即可。参考文献使用 :cite:`文献`
* 参考文献引用方式,参考文献放在references/xxx.bib如需新增只需在该文件中添加即可。参考文献使用 :cite:`文献`
需要注意的是bib里的参考文献不能有重复的。
```python
下面参考文献的引用:
@@ -101,7 +101,7 @@
2. 多篇参考文献可以用逗号分开
这篇文章参考了论文 :cite:`cnn2015,rnn2015`
此时在mlsys.bib中应该有如下参考文献
此时在对应bib中应该有如下参考文献
@inproceedings{cnn2015,
title = {CNN},
author = {xxx},

View File

@@ -735,7 +735,7 @@ numpages = {22}
}
@misc{kim2018interpretability,
title={Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)},
author={Been Kim and Martin Wattenberg and Justin Gilmer and Carrie Cai and James Wexler and Fernanda Viegas and Rory Sayres},
year={2018},
eprint={1711.11279},
@@ -1529,4 +1529,4 @@ series = {EuroSys '22}
author={Jiankai Sun and Shreyas Kousik and David Fridovich-Keil and Mac Schwager},
journal={arXiv preprint},
year={2022}
}

View File

@@ -0,0 +1,92 @@
@misc{2017NVIDIA,
author={NVIDIA},
title={NVIDIA Tesla V100 GPU Architecture: The World's Most Advanced Datacenter GPU},
year={2017},
howpublished = "Website",
note = {\url{http://www.nvidia.com/object/volta-architecture-whitepaper.html}}
}
@inproceedings{2021Ascend,
title={Ascend: a Scalable and Unified Architecture for Ubiquitous Deep Neural Network Computing : Industry Track Paper},
author={Liao, Heng and Tu, Jiajin and Xia, Jing and Liu, Hu and Zhou, Xiping and Yuan, Honghui and Hu, Yuxing},
booktitle={2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA)},
year={2021},
pages = {789--801},
doi = {10.1109/HPCA51647.2021.00071},
}
@article{ragan2013halide,
title={Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines},
author={Ragan-Kelley, Jonathan and Barnes, Connelly and Adams, Andrew and Paris, Sylvain and Durand, Fr{\'e}do and Amarasinghe, Saman},
journal={Acm Sigplan Notices},
volume={48},
number={6},
pages={519--530},
year={2013},
publisher={ACM New York, NY, USA}
}
@article{chen2018tvm,
title={TVM: end-to-end optimization stack for deep learning},
author={Chen, Tianqi and Moreau, Thierry and Jiang, Ziheng and Shen, Haichen and Yan, Eddie Q and Wang, Leyuan and Hu, Yuwei and Ceze, Luis and Guestrin, Carlos and Krishnamurthy, Arvind},
journal={arXiv preprint arXiv:1802.04799},
volume={11},
pages={20},
year={2018},
publisher={CoRR}
}
@inproceedings{verdoolaege2010isl,
title={isl: An integer set library for the polyhedral model},
author={Verdoolaege, Sven},
booktitle={International Congress on Mathematical Software},
pages={299--302},
year={2010},
organization={Springer}
}
@inproceedings{zheng2020ansor,
title={Ansor: Generating {High-Performance} Tensor Programs for Deep Learning},
author={Zheng, Lianmin and Jia, Chengfan and Sun, Minmin and Wu, Zhao and Yu, Cody Hao and Haj-Ali, Ameer and Wang, Yida and Yang, Jun and Zhuo, Danyang and Sen, Koushik and others},
booktitle={14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)},
pages={863--879},
year={2020}
}
@article{lattner2020mlir,
title={MLIR: A compiler infrastructure for the end of Moore's law},
author={Lattner, Chris and Amini, Mehdi and Bondhugula, Uday and Cohen, Albert and Davis, Andy and Pienaar, Jacques and Riddle, River and Shpeisman, Tatiana and Vasilache, Nicolas and Zinenko, Oleksandr},
journal={arXiv preprint arXiv:2002.11054},
year={2020}
}
@inproceedings{zhao2021akg,
title={AKG: automatic kernel generation for neural processing units using polyhedral transformations},
author={Zhao, Jie and Li, Bojie and Nie, Wang and Geng, Zhen and Zhang, Renwei and Gao, Xiong and Cheng, Bin and Wu, Chen and Cheng, Yun and Li, Zheng and others},
booktitle={Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation},
pages={1233--1248},
year={2021}
}
@article{vasilache2022composable,
title={Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction},
author={Vasilache, Nicolas and Zinenko, Oleksandr and Bik, Aart JC and Ravishankar, Mahesh and Raoux, Thomas and Belyaev, Alexander and Springer, Matthias and Gysi, Tobias and Caballero, Diego and Herhut, Stephan and others},
journal={arXiv preprint arXiv:2202.03293},
year={2022}
}
@inproceedings{bastoul2004code,
title={Code generation in the polyhedral model is easier than you think},
author={Bastoul, C{\'e}dric},
booktitle={Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.},
pages={7--16},
year={2004},
organization={IEEE}
}
@article{2018Modeling,
title={Modeling Deep Learning Accelerator Enabled GPUs},
author={Raihan, M. A. and Goli, N. and Aamodt, T.},
journal={arXiv preprint arXiv:1811.08309},
year={2018}
}

references/appendix.bib Normal file
@@ -0,0 +1,103 @@
@article{rosenblatt1958perceptron,
title={The perceptron: a probabilistic model for information storage and organization in the brain.},
author={Rosenblatt, Frank},
journal={Psychological Review},
volume={65},
number={6},
pages={386},
year={1958},
publisher={American Psychological Association}
}
@article{lecun1989backpropagation,
title={Backpropagation applied to handwritten zip code recognition},
author={LeCun, Yann and Boser, Bernhard and Denker, John S and Henderson, Donnie and Howard, Richard E and Hubbard, Wayne and Jackel, Lawrence D},
journal={Neural computation},
volume={1},
number={4},
pages={541--551},
year={1989},
publisher={MIT Press}
}
@inproceedings{krizhevsky2012imagenet,
title={Imagenet classification with deep convolutional neural networks},
author={Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E},
booktitle={Advances in Neural Information Processing Systems},
pages={1097--1105},
year={2012}
}
@inproceedings{he2016deep,
title={{Deep Residual Learning for Image Recognition}},
author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2016}
}
@article{rumelhart1986learning,
title={Learning representations by back-propagating errors},
author={Rumelhart, David E and Hinton, Geoffrey E and Williams, Ronald J},
journal={Nature},
volume={323},
number={6088},
pages={533},
year={1986},
publisher={Nature Publishing Group}
}
@article{Hochreiter1997lstm,
author = {Hochreiter, Sepp and Schmidhuber, J{\"{u}}rgen},
issn = {0899-7667},
journal = {Neural Computation},
number = {8},
pages = {1735--1780},
pmid = {9377276},
title = {{Long Short-Term Memory}},
volume = {9},
year = {1997}
}
@inproceedings{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
booktitle={Advances in Neural Information Processing Systems},
pages={5998--6008},
year={2017}
}
@article{lecun2015deep,
title={Deep learning},
author={LeCun, Yann and Bengio, Yoshua and Hinton, Geoffrey},
journal={Nature},
volume={521},
number={7553},
pages={436},
year={2015},
publisher={Nature Publishing Group}
}
@inproceedings{KingmaAdam2014,
title = {{Adam}: A Method for Stochastic Optimization},
author = {Kingma, Diederik and Ba, Jimmy},
booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
year = {2014}
}
@techreport{tieleman2012rmsprop,
title={Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude},
author={Tieleman, Tijmen and Hinton, Geoffrey},
year={2012},
institution={COURSERA: Neural Networks for Machine Learning}
}
@article{duchi2011adagrad,
title={Adaptive subgradient methods for online learning and stochastic optimization},
author={Duchi, John and Hazan, Elad and Singer, Yoram},
journal={Journal of Machine Learning Research (JMLR)},
volume={12},
number={Jul},
pages={2121--2159},
year={2011}
}

references/backend.bib Normal file
references/data.bib Normal file
@@ -0,0 +1,59 @@
@ARTICLE{2020tkde_li,
author={Li, Xiao-Hui and Cao, Caleb Chen and Shi, Yuhan and Bai, Wei and Gao, Han and Qiu, Luyu and Wang, Cong and Gao, Yuanyuan and Zhang, Shenjia and Xue, Xun and Chen, Lei},
journal={IEEE Transactions on Knowledge and Data Engineering},
title={A Survey of Data-driven and Knowledge-aware eXplainable AI},
year={2020},
pages={1--1},
doi={10.1109/TKDE.2020.2983930}
}
@article{erhan2009visualizing,
title={Visualizing higher-layer features of a deep network},
author={Erhan, Dumitru and Bengio, Yoshua and Courville, Aaron and Vincent, Pascal},
journal={University of Montreal},
volume={1341},
number={3},
pages={1},
year={2009}
}
@misc{kim2018interpretability,
title={Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)},
author={Been Kim and Martin Wattenberg and Justin Gilmer and Carrie Cai and James Wexler and Fernanda Viegas and Rory Sayres},
year={2018},
eprint={1711.11279},
archivePrefix={arXiv},
primaryClass={stat.ML}
}
@article{riedl2019human,
title={Human-centered artificial intelligence and machine learning},
author={Riedl, Mark O.},
journal={Human Behavior and Emerging Technologies},
volume={1},
number={1},
pages={33--36},
year={2019},
publisher={Wiley Online Library}
}
@inproceedings{10.1145/2988450.2988454,
author = {Cheng, Heng-Tze and Koc, Levent and Harmsen, Jeremiah and Shaked, Tal and Chandra, Tushar and Aradhye, Hrishi and Anderson, Glen and Corrado, Greg and Chai, Wei and Ispir, Mustafa and Anil, Rohan and Haque, Zakaria and Hong, Lichan and Jain, Vihan and Liu, Xiaobing and Shah, Hemal},
title = {Wide & Deep Learning for Recommender Systems},
year = {2016},
isbn = {9781450347952},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2988450.2988454},
doi = {10.1145/2988450.2988454},
abstract = {Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort. With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank. In this paper, we present Wide & Deep learning---jointly trained wide linear models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems. We productionized and evaluated the system on Google Play, a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models. We have also open-sourced our implementation in TensorFlow.},
booktitle = {Proceedings of the 1st Workshop on Deep Learning for Recommender Systems},
pages = {7-10},
numpages = {4},
keywords = {Recommender Systems, Wide & Deep Learning},
location = {Boston, MA, USA},
series = {DLRS 2016}
}

references/extension.bib Normal file
references/federated.bib Normal file
references/frontend.bib Normal file
references/graph.bib Normal file
references/interface.bib Normal file
references/model.bib Normal file
references/recommender.bib Normal file
@@ -0,0 +1,284 @@
@inproceedings{10.1145/2988450.2988454,
author = {Cheng, Heng-Tze and Koc, Levent and Harmsen, Jeremiah and Shaked, Tal and Chandra, Tushar and Aradhye, Hrishi and Anderson, Glen and Corrado, Greg and Chai, Wei and Ispir, Mustafa and Anil, Rohan and Haque, Zakaria and Hong, Lichan and Jain, Vihan and Liu, Xiaobing and Shah, Hemal},
title = {Wide & Deep Learning for Recommender Systems},
year = {2016},
isbn = {9781450347952},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2988450.2988454},
doi = {10.1145/2988450.2988454},
abstract = {Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort. With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank. In this paper, we present Wide & Deep learning---jointly trained wide linear models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems. We productionized and evaluated the system on Google Play, a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models. We have also open-sourced our implementation in TensorFlow.},
booktitle = {Proceedings of the 1st Workshop on Deep Learning for Recommender Systems},
pages = {7-10},
numpages = {4},
keywords = {Recommender Systems, Wide & Deep Learning},
location = {Boston, MA, USA},
series = {DLRS 2016}
}
@inproceedings{10.1145/3124749.3124754,
author = {Wang, Ruoxi and Fu, Bin and Fu, Gang and Wang, Mingliang},
title = {Deep & Cross Network for Ad Click Predictions},
year = {2017},
isbn = {9781450351942},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3124749.3124754},
doi = {10.1145/3124749.3124754},
abstract = {Feature engineering has been the key to the success of many prediction models. However, the process is nontrivial and often requires manual feature engineering or exhaustive searching. DNNs are able to automatically learn feature interactions; however, they generate all the interactions implicitly, and are not necessarily efficient in learning all types of cross features. In this paper, we propose the Deep & Cross Network (DCN) which keeps the benefits of a DNN model, and beyond that, it introduces a novel cross network that is more efficient in learning certain bounded-degree feature interactions. In particular, DCN explicitly applies feature crossing at each layer, requires no manual feature engineering, and adds negligible extra complexity to the DNN model. Our experimental results have demonstrated its superiority over the state-of-art algorithms on the CTR prediction dataset and dense classification dataset, in terms of both model accuracy and memory usage.},
booktitle = {Proceedings of the ADKDD'17},
articleno = {12},
numpages = {7},
keywords = {CTR Prediction, Deep Learning, Neural Networks, Feature Crossing},
location = {Halifax, NS, Canada},
series = {ADKDD'17}
}
@inproceedings{ijcai2017-239,
author = {Huifeng Guo and Ruiming Tang and Yunming Ye and Zhenguo Li and Xiuqiang He},
title = {DeepFM: A Factorization-Machine based Neural Network for CTR Prediction},
booktitle = {Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, {IJCAI-17}},
pages = {1725--1731},
year = {2017},
doi = {10.24963/ijcai.2017/239},
url = {https://doi.org/10.24963/ijcai.2017/239},
}
@article{naumov2019deep,
title={Deep learning recommendation model for personalization and recommendation systems},
author={Naumov, Maxim and Mudigere, Dheevatsa and Shi, Hao-Jun Michael and Huang, Jianyu and Sundaraman, Narayanan and Park, Jongsoo and Wang, Xiaodong and Gupta, Udit and Wu, Carole-Jean and Azzolini, Alisson G and others},
journal={arXiv preprint arXiv:1906.00091},
year={2019}
}
@misc{Merlin,
note={Accessed on 2022-03-24},
author = {NVIDIA},
year = {2022},
title = {{{NVIDIA Merlin}}},
howpublished = {\url{https://github.com/NVIDIA-Merlin/Merlin}},
}
@inproceedings{NIPS2015_86df7dcf,
author = {Sculley, D. and Holt, Gary and Golovin, Daniel and Davydov, Eugene and Phillips, Todd and Ebner, Dietmar and Chaudhary, Vinay and Young, Michael and Crespo, Jean-Fran\c{c}ois and Dennison, Dan},
booktitle = {Advances in Neural Information Processing Systems},
editor = {C. Cortes and N. Lawrence and D. Lee and M. Sugiyama and R. Garnett},
publisher = {Curran Associates, Inc.},
title = {Hidden Technical Debt in Machine Learning Systems},
url = {https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf},
volume = {28},
year = {2015}
}
@misc{NVTabular,
note={Accessed on 2022-03-24},
author = {NVIDIA},
year = {2022},
title = {{{NVIDIA NVTabular}}},
howpublished = {\url{https://github.com/NVIDIA-Merlin/NVTabular}},
}
@misc{HugeCTR,
note={Accessed on 2022-03-24},
author = {NVIDIA},
year = {2022},
title = {{{NVIDIA HugeCTR}}},
howpublished = {\url{https://github.com/NVIDIA-Merlin/HugeCTR}},
}
@misc{Triton,
note={Accessed on 2022-03-24},
author = {NVIDIA},
year = {2022},
title = {{{NVIDIA Triton}}},
howpublished = {\url{https://github.com/triton-inference-server/server}},
}
@article{zionex,
title={Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models},
author={Mudigere, Dheevatsa and Hao, Yuchen and Huang, Jianyu and Jia, Zhihao and Tulloch, Andrew and Sridharan, Srinivas and Liu, Xing and Ozdal, Mustafa and Nie, Jade and Park, Jongsoo and others},
journal={arXiv preprint arXiv:2104.05158},
year={2021}
}
@inproceedings{10.1145/3437801.3441578,
author = {Fang, Jiarui and Yu, Yang and Zhao, Chengduo and Zhou, Jie},
title = {TurboTransformers: An Efficient GPU Serving System for Transformer Models},
year = {2021},
isbn = {9781450382946},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3437801.3441578},
doi = {10.1145/3437801.3441578},
abstract = {The transformer is the most critical algorithm innovation of the Nature Language Processing (NLP) field in recent years. Unlike the Recurrent Neural Network (RNN) models, transformers are able to process on dimensions of sequence lengths in parallel, therefore leads to better accuracy on long sequences. However, efficient deployments of them for online services in data centers equipped with GPUs are not easy. First, more computation introduced by transformer structures makes it more challenging to meet the latency and throughput constraints of serving. Second, NLP tasks take in sentences of variable length. The variability of input dimensions brings a severe problem to efficient memory management and serving optimization.To solve the above challenges, this paper designed a transformer serving system called TurboTransformers, which consists of a computing runtime and a serving framework. Three innovative features make it stand out from other similar works. An efficient parallel algorithm is proposed for GPU-based batch reduction operations, like Softmax and LayerNorm, which are major hot spots besides BLAS routines. A memory allocation algorithm, which better balances the memory footprint and allocation/free efficiency, is designed for variable-length input situations. A serving framework equipped with a new batch scheduler using dynamic programming achieves the optimal throughput on variable-length requests. The system can achieve the state-of-the-art transformer model serving performance on GPU platforms and can be seamlessly integrated into your PyTorch code with a few lines of code.},
booktitle = {Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
pages = {389--402},
numpages = {14},
keywords = {serving system, deep learning runtime, GPU, transformers},
location = {Virtual Event, Republic of Korea},
series = {PPoPP '21}
}
@inproceedings{wang-etal-2021-lightseq,
title = "{L}ight{S}eq: A High Performance Inference Library for Transformers",
author = "Wang, Xiaohui and
Xiong, Ying and
Wei, Yang and
Wang, Mingxuan and
Li, Lei",
booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers",
month = jun,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.naacl-industry.15",
doi = "10.18653/v1/2021.naacl-industry.15",
pages = "113--120",
abstract = "Transformer and its variants have achieved great success in natural language processing. Since Transformer models are huge in size, serving these models is a challenge for real industrial applications. In this paper, we propose LightSeq, a highly efficient inference library for models in the Transformer family. LightSeq includes a series of GPU optimization techniques to both streamline the computation of Transformer layers and reduce memory footprint. LightSeq supports models trained using PyTorch and Tensorflow. Experimental results on standard machine translation benchmarks show that LightSeq achieves up to 14x speedup compared with TensorFlow and 1.4x speedup compared with a concurrent CUDA implementation. The code will be released publicly after the review.",
}
@inproceedings{MLSYS2021_979d472a,
author = {Yin, Chunxing and Acun, Bilge and Wu, Carole-Jean and Liu, Xing},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {A. Smola and A. Dimakis and I. Stoica},
pages = {448--462},
title = {TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models},
url = {https://proceedings.mlsys.org/paper/2021/file/979d472a84804b9f647bc185a877a8b5-Paper.pdf},
volume = {3},
year = {2021}
}
@inproceedings{MLSYS2020_f7e6c855,
author = {Zhao, Weijie and Xie, Deping and Jia, Ronglai and Qian, Yulei and Ding, Ruiquan and Sun, Mingming and Li, Ping},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {I. Dhillon and D. Papailiopoulos and V. Sze},
pages = {412--428},
title = {Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems},
url = {https://proceedings.mlsys.org/paper/2020/file/f7e6c85504ce6e82442c770f7c8606f0-Paper.pdf},
volume = {2},
year = {2020}
}
@inproceedings{10.1145/2020408.2020444,
author = {Chu, Wei and Zinkevich, Martin and Li, Lihong and Thomas, Achint and Tseng, Belle},
title = {Unbiased Online Active Learning in Data Streams},
year = {2011},
isbn = {9781450308137},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2020408.2020444},
doi = {10.1145/2020408.2020444},
abstract = {Unlabeled samples can be intelligently selected for labeling to minimize classification error. In many real-world applications, a large number of unlabeled samples arrive in a streaming manner, making it impossible to maintain all the data in a candidate pool. In this work, we focus on binary classification problems and study selective labeling in data streams where a decision is required on each sample sequentially. We consider the unbiasedness property in the sampling process, and design optimal instrumental distributions to minimize the variance in the stochastic process. Meanwhile, Bayesian linear classifiers with weighted maximum likelihood are optimized online to estimate parameters. In empirical evaluation, we collect a data stream of user-generated comments on a commercial news portal in 30 consecutive days, and carry out offline evaluation to compare various sampling strategies, including unbiased active learning, biased variants, and random sampling. Experimental results verify the usefulness of online active learning, especially in the non-stationary situation with concept drift.},
booktitle = {Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
pages = {195--203},
numpages = {9},
keywords = {unbiasedness, bayesian online learning, active learning, data streaming, adaptive importance sampling},
location = {San Diego, California, USA},
series = {KDD '11}
}
@inproceedings{10.1145/2648584.2648589,
author = {He, Xinran and Pan, Junfeng and Jin, Ou and Xu, Tianbing and Liu, Bo and Xu, Tao and Shi, Yanxin and Atallah, Antoine and Herbrich, Ralf and Bowers, Stuart and Candela, Joaquin Qui\~{n}onero},
title = {Practical Lessons from Predicting Clicks on Ads at Facebook},
year = {2014},
isbn = {9781450329996},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2648584.2648589},
doi = {10.1145/2648584.2648589},
abstract = {Online advertising allows advertisers to only bid and pay for measurable user responses, such as clicks on ads. As a consequence, click prediction systems are central to most online advertising systems. With over 750 million daily active users and over 1 million active advertisers, predicting clicks on Facebook ads is a challenging machine learning task. In this paper we introduce a model which combines decision trees with logistic regression, outperforming either of these methods on its own by over 3%, an improvement with significant impact to the overall system performance. We then explore how a number of fundamental parameters impact the final prediction performance of our system. Not surprisingly, the most important thing is to have the right features: those capturing historical information about the user or ad dominate other types of features. Once we have the right features and the right model (decisions trees plus logistic regression), other factors play small roles (though even small improvements are important at scale). Picking the optimal handling for data freshness, learning rate schema and data sampling improve the model slightly, though much less than adding a high-value feature, or picking the right model to begin with.},
booktitle = {Proceedings of the Eighth International Workshop on Data Mining for Online Advertising},
pages = {1--9},
numpages = {9},
location = {New York, NY, USA},
series = {ADKDD'14}
}
@inproceedings{10.1145/3267809.3267817,
author = {Tian, Huangshi and Yu, Minchen and Wang, Wei},
title = {Continuum: A Platform for Cost-Aware, Low-Latency Continual Learning},
year = {2018},
isbn = {9781450360111},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3267809.3267817},
doi = {10.1145/3267809.3267817},
abstract = {Many machine learning applications operate in dynamic environments that change over time, in which models must be continually updated to capture the recent trend in data. However, most of today's learning frameworks perform training offline, without a system support for continual model updating.In this paper, we design and implement Continuum, a general-purpose platform that streamlines the implementation and deployment of continual model updating across existing learning frameworks. In pursuit of fast data incorporation, we further propose two update policies, cost-aware and best-effort, that judiciously determine when to perform model updating, with and without accounting for the training cost (machine-time), respectively. Theoretical analysis shows that cost-aware policy is 2-competitive. We implement both polices in Continuum, and evaluate their performance through EC2 deployment and trace-driven simulations. The evaluation shows that Continuum results in reduced data incorporation latency, lower training cost, and improved model quality in a number of popular online learning applications that span multiple application domains, programming languages, and frameworks.},
booktitle = {Proceedings of the ACM Symposium on Cloud Computing},
pages = {26--40},
numpages = {15},
keywords = {Competitive Analysis, Continual Learning System, Online Algorithm},
location = {Carlsbad, CA, USA},
series = {SoCC '18}
}
@INPROCEEDINGS{9355295,
author={Xie, Minhui and Ren, Kai and Lu, Youyou and Yang, Guangxu and Xu, Qingxing and Wu, Bihai and Lin, Jiazhen and Ao, Hongbo and Xu, Wanhong and Shu, Jiwu},
booktitle={SC20: International Conference for High Performance Computing, Networking, Storage and Analysis},
title={Kraken: Memory-Efficient Continual Learning for Large-Scale Real-Time Recommendations},
year={2020},
volume={},
number={},
pages={1--17},
doi={10.1109/SC41405.2020.00025}
}
@inproceedings{gong2020edgerec,
title={EdgeRec: Recommender System on Edge in Mobile Taobao},
author={Gong, Yu and Jiang, Ziwen and Feng, Yufei and Hu, Binbin and Zhao, Kaiqi and Liu, Qingwen and Ou, Wenwu},
booktitle={Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
pages={2477--2484},
year={2020}
}
@inproceedings{NEURIPS2020_a1d4c20b,
author = {He, Chaoyang and Annavaram, Murali and Avestimehr, Salman},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {14068--14080},
publisher = {Curran Associates, Inc.},
title = {Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge},
url = {https://proceedings.neurips.cc/paper/2020/file/a1d4c20b182ad7137ab3606f0e3fc8a4-Paper.pdf},
volume = {33},
year = {2020}
}
@inproceedings{MLSYS2021_ec895663,
author = {Jiang, Wenqi and He, Zhenhao and Zhang, Shuai and Preu\ss er, Thomas B. and Zeng, Kai and Feng, Liang and Zhang, Jiansong and Liu, Tongxuan and Li, Yong and Zhou, Jingren and Zhang, Ce and Alonso, Gustavo},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {A. Smola and A. Dimakis and I. Stoica},
pages = {845--859},
title = {MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions},
url = {https://proceedings.mlsys.org/paper/2021/file/ec8956637a99787bd197eacd77acce5e-Paper.pdf},
volume = {3},
year = {2021}
}
@inproceedings{10.1145/3394486.3403059,
author = {Shi, Hao-Jun Michael and Mudigere, Dheevatsa and Naumov, Maxim and Yang, Jiyan},
title = {Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems},
year = {2020},
isbn = {9781450379984},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3394486.3403059},
doi = {10.1145/3394486.3403059},
booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
pages = {165--175},
numpages = {11},
keywords = {model compression, recommendation systems, embeddings},
location = {Virtual Event, CA, USA},
series = {KDD '20}
}
@misc{ginart2021mixed,
title={Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems},
author={Antonio Ginart and Maxim Naumov and Dheevatsa Mudigere and Jiyan Yang and James Zou},
year={2021},
eprint={1909.11810},
archivePrefix={arXiv},
primaryClass={cs.LG}
}


@@ -0,0 +1,174 @@
@article{han2020tstarbot,
title={{TStarBot-X}: An open-sourced and comprehensive study for efficient league training in {StarCraft II} full game},
author={Han, Lei and Xiong, Jiechao and Sun, Peng and Sun, Xinghai and Fang, Meng and Guo, Qingwei and Chen, Qiaobo and Shi, Tengfei and Yu, Hongsheng and Wu, Xipeng and others},
journal={arXiv preprint arXiv:2011.13729},
year={2020}
}
@inproceedings{wang2021scc,
title={SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II},
author={Wang, Xiangjun and Song, Junxiao and Qi, Penghui and Peng, Peng and Tang, Zhenkun and Zhang, Wei and Li, Weimin and Pi, Xiongjun and He, Jujie and Gao, Chao and others},
booktitle={International Conference on Machine Learning},
pages={10905--10915},
year={2021},
organization={PMLR}
}
@inproceedings{MLSYS2021_979d472a,
author = {Yin, Chunxing and Acun, Bilge and Wu, Carole-Jean and Liu, Xing},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {A. Smola and A. Dimakis and I. Stoica},
pages = {448--462},
title = {TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models},
url = {https://proceedings.mlsys.org/paper/2021/file/979d472a84804b9f647bc185a877a8b5-Paper.pdf},
volume = {3},
year = {2021}
}
@inproceedings{MLSYS2020_f7e6c855,
author = {Zhao, Weijie and Xie, Deping and Jia, Ronglai and Qian, Yulei and Ding, Ruiquan and Sun, Mingming and Li, Ping},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {I. Dhillon and D. Papailiopoulos and V. Sze},
pages = {412--428},
title = {Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems},
url = {https://proceedings.mlsys.org/paper/2020/file/f7e6c85504ce6e82442c770f7c8606f0-Paper.pdf},
volume = {2},
year = {2020}
}
@article{zionex,
title={Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models},
author={Mudigere, Dheevatsa and Hao, Yuchen and Huang, Jianyu and Jia, Zhihao and Tulloch, Andrew and Sridharan, Srinivas and Liu, Xing and Ozdal, Mustafa and Nie, Jade and Park, Jongsoo and others},
journal={arXiv preprint arXiv:2104.05158},
year={2021}
}
@inproceedings{gong2020edgerec,
title={EdgeRec: Recommender System on Edge in Mobile Taobao},
author={Gong, Yu and Jiang, Ziwen and Feng, Yufei and Hu, Binbin and Zhao, Kaiqi and Liu, Qingwen and Ou, Wenwu},
booktitle={Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
pages={2477--2484},
year={2020}
}
@inproceedings{NEURIPS2020_a1d4c20b,
author = {He, Chaoyang and Annavaram, Murali and Avestimehr, Salman},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {14068--14080},
publisher = {Curran Associates, Inc.},
title = {Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge},
url = {https://proceedings.neurips.cc/paper/2020/file/a1d4c20b182ad7137ab3606f0e3fc8a4-Paper.pdf},
volume = {33},
year = {2020}
}
@INPROCEEDINGS{9355295,
author={Xie, Minhui and Ren, Kai and Lu, Youyou and Yang, Guangxu and Xu, Qingxing and Wu, Bihai and Lin, Jiazhen and Ao, Hongbo and Xu, Wanhong and Shu, Jiwu},
booktitle={SC20: International Conference for High Performance Computing, Networking, Storage and Analysis},
title={Kraken: Memory-Efficient Continual Learning for Large-Scale Real-Time Recommendations},
year={2020},
volume={},
number={},
pages={1--17},
doi={10.1109/SC41405.2020.00025}
}
@inproceedings{MLSYS2021_ec895663,
author = {Jiang, Wenqi and He, Zhenhao and Zhang, Shuai and Preu\ss er, Thomas B. and Zeng, Kai and Feng, Liang and Zhang, Jiansong and Liu, Tongxuan and Li, Yong and Zhou, Jingren and Zhang, Ce and Alonso, Gustavo},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {A. Smola and A. Dimakis and I. Stoica},
pages = {845--859},
title = {MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions},
url = {https://proceedings.mlsys.org/paper/2021/file/ec8956637a99787bd197eacd77acce5e-Paper.pdf},
volume = {3},
year = {2021}
}
@inproceedings{10.1145/3394486.3403059,
author = {Shi, Hao-Jun Michael and Mudigere, Dheevatsa and Naumov, Maxim and Yang, Jiyan},
title = {Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems},
year = {2020},
isbn = {9781450379984},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3394486.3403059},
doi = {10.1145/3394486.3403059},
booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
pages = {165--175},
numpages = {11},
keywords = {model compression, recommendation systems, embeddings},
location = {Virtual Event, CA, USA},
series = {KDD '20}
}
@misc{ginart2021mixed,
title={Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems},
author={Antonio Ginart and Maxim Naumov and Dheevatsa Mudigere and Jiyan Yang and James Zou},
year={2021},
eprint={1909.11810},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@inproceedings{10.1145/2020408.2020444,
author = {Chu, Wei and Zinkevich, Martin and Li, Lihong and Thomas, Achint and Tseng, Belle},
title = {Unbiased Online Active Learning in Data Streams},
year = {2011},
isbn = {9781450308137},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2020408.2020444},
doi = {10.1145/2020408.2020444},
abstract = {Unlabeled samples can be intelligently selected for labeling to minimize classification error. In many real-world applications, a large number of unlabeled samples arrive in a streaming manner, making it impossible to maintain all the data in a candidate pool. In this work, we focus on binary classification problems and study selective labeling in data streams where a decision is required on each sample sequentially. We consider the unbiasedness property in the sampling process, and design optimal instrumental distributions to minimize the variance in the stochastic process. Meanwhile, Bayesian linear classifiers with weighted maximum likelihood are optimized online to estimate parameters. In empirical evaluation, we collect a data stream of user-generated comments on a commercial news portal in 30 consecutive days, and carry out offline evaluation to compare various sampling strategies, including unbiased active learning, biased variants, and random sampling. Experimental results verify the usefulness of online active learning, especially in the non-stationary situation with concept drift.},
booktitle = {Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
pages = {195--203},
numpages = {9},
keywords = {unbiasedness, bayesian online learning, active learning, data streaming, adaptive importance sampling},
location = {San Diego, California, USA},
series = {KDD '11}
}
@inproceedings{10.1145/3267809.3267817,
author = {Tian, Huangshi and Yu, Minchen and Wang, Wei},
title = {Continuum: A Platform for Cost-Aware, Low-Latency Continual Learning},
year = {2018},
isbn = {9781450360111},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3267809.3267817},
doi = {10.1145/3267809.3267817},
abstract = {Many machine learning applications operate in dynamic environments that change over time, in which models must be continually updated to capture the recent trend in data. However, most of today's learning frameworks perform training offline, without a system support for continual model updating.In this paper, we design and implement Continuum, a general-purpose platform that streamlines the implementation and deployment of continual model updating across existing learning frameworks. In pursuit of fast data incorporation, we further propose two update policies, cost-aware and best-effort, that judiciously determine when to perform model updating, with and without accounting for the training cost (machine-time), respectively. Theoretical analysis shows that cost-aware policy is 2-competitive. We implement both polices in Continuum, and evaluate their performance through EC2 deployment and trace-driven simulations. The evaluation shows that Continuum results in reduced data incorporation latency, lower training cost, and improved model quality in a number of popular online learning applications that span multiple application domains, programming languages, and frameworks.},
booktitle = {Proceedings of the ACM Symposium on Cloud Computing},
pages = {26--40},
numpages = {15},
keywords = {Competitive Analysis, Continual Learning System, Online Algorithm},
location = {Carlsbad, CA, USA},
series = {SoCC '18}
}
@inproceedings{10.1145/2648584.2648589,
author = {He, Xinran and Pan, Junfeng and Jin, Ou and Xu, Tianbing and Liu, Bo and Xu, Tao and Shi, Yanxin and Atallah, Antoine and Herbrich, Ralf and Bowers, Stuart and Candela, Joaquin Qui\~{n}onero},
title = {Practical Lessons from Predicting Clicks on Ads at Facebook},
year = {2014},
isbn = {9781450329996},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2648584.2648589},
doi = {10.1145/2648584.2648589},
abstract = {Online advertising allows advertisers to only bid and pay for measurable user responses, such as clicks on ads. As a consequence, click prediction systems are central to most online advertising systems. With over 750 million daily active users and over 1 million active advertisers, predicting clicks on Facebook ads is a challenging machine learning task. In this paper we introduce a model which combines decision trees with logistic regression, outperforming either of these methods on its own by over 3%, an improvement with significant impact to the overall system performance. We then explore how a number of fundamental parameters impact the final prediction performance of our system. Not surprisingly, the most important thing is to have the right features: those capturing historical information about the user or ad dominate other types of features. Once we have the right features and the right model (decisions trees plus logistic regression), other factors play small roles (though even small improvements are important at scale). Picking the optimal handling for data freshness, learning rate schema and data sampling improve the model slightly, though much less than adding a high-value feature, or picking the right model to begin with.},
booktitle = {Proceedings of the Eighth International Workshop on Data Mining for Online Advertising},
pages = {19},
numpages = {9},
location = {New York, NY, USA},
series = {ADKDD'14}
}
@misc{2017NVIDIA,
author={NVIDIA},
title={NVIDIA Tesla V100 GPU Architecture: The World's Most Advanced Datacenter GPU},
year={2017},
howpublished = "Website",
note = {\url{http://www.nvidia.com/object/volta-architecture-whitepaper.html}}
}

references/rlsys.bib Normal file

@@ -0,0 +1,436 @@
@InProceedings{pmlr-v155-huang21a,
title = {Learning a Decision Module by Imitating Driver's Control Behaviors},
author = {Huang, Junning and Xie, Sirui and Sun, Jiankai and Ma, Qiurui and Liu, Chunxiao and Lin, Dahua and Zhou, Bolei},
booktitle = {Proceedings of the 2020 Conference on Robot Learning},
pages = {1--10},
year = {2021},
editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
volume = {155},
series = {Proceedings of Machine Learning Research},
month = {16--18 Nov},
publisher = {PMLR},
pdf = {https://proceedings.mlr.press/v155/huang21a/huang21a.pdf},
url = {https://proceedings.mlr.press/v155/huang21a.html},
abstract = {Autonomous driving systems have a pipeline of perception, decision, planning, and control. The decision module processes information from the perception module and directs the execution of downstream planning and control modules. On the other hand, the recent success of deep learning suggests that this pipeline could be replaced by end-to-end neural control policies, however, safety cannot be well guaranteed for the data-driven neural networks. In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning. This hybrid framework can preserve the merits of the classical pipeline such as the strict enforcement of physical and logical constraints while learning complex driving decisions from data. To circumvent the ambiguous annotation of human driving decisions, our method learns high-level driving decisions by imitating low-level control behaviors. We show in the simulation experiments that our modular driving agent can generalize its driving decision and control to various complex scenarios where the rule-based programs fail. It can also generate smoother and safer driving trajectories than end-to-end neural policies. Demo and code are available at https://decisionforce.github.io/modulardecision/.}
}
@inproceedings{ding2019camnet,
title={CamNet: Coarse-to-fine retrieval for camera re-localization},
author={Ding, Mingyu and Wang, Zhe and Sun, Jiankai and Shi, Jianping and Luo, Ping},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={2871--2880},
year={2019}
}
@incollection{peters2016robot,
title={Robot learning},
author={Peters, Jan and Lee, Daniel D and Kober, Jens and Nguyen-Tuong, Duy and Bagnell, J Andrew and Schaal, Stefan},
booktitle={Springer Handbook of Robotics},
pages={357--398},
year={2016},
publisher={Springer}
}
@article{saxena2014robobrain,
title={Robobrain: Large-scale knowledge engine for robots},
author={Saxena, Ashutosh and Jain, Ashesh and Sener, Ozan and Jami, Aditya and Misra, Dipendra K and Koppula, Hema S},
journal={arXiv preprint arXiv:1412.0691},
year={2014}
}
@inproceedings{zhu2017target,
title={Target-driven visual navigation in indoor scenes using deep reinforcement learning},
author={Zhu, Yuke and Mottaghi, Roozbeh and Kolve, Eric and Lim, Joseph J and Gupta, Abhinav and Fei-Fei, Li and Farhadi, Ali},
booktitle={2017 IEEE international conference on robotics and automation (ICRA)},
pages={3357--3364},
year={2017},
organization={IEEE}
}
@article{9123682,
author={Pan, Bowen and Sun, Jiankai and Leung, Ho Yin Tiga and Andonian, Alex and Zhou, Bolei},
journal={IEEE Robotics and Automation Letters},
title={Cross-View Semantic Segmentation for Sensing Surroundings},
year={2020},
volume={5},
number={3},
pages={4867--4873},
doi={10.1109/LRA.2020.3004325}
}
@article{tang2018ba,
title={Ba-net: Dense bundle adjustment network},
author={Tang, Chengzhou and Tan, Ping},
journal={arXiv preprint arXiv:1806.04807},
year={2018}
}
@inproceedings{tanaka2021learning,
title={Learning To Bundle-Adjust: A Graph Network Approach to Faster Optimization of Bundle Adjustment for Vehicular SLAM},
author={Tanaka, Tetsuya and Sasagawa, Yukihiro and Okatani, Takayuki},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={6250--6259},
year={2021}
}
@inproceedings{tobin2017domain,
title={Domain randomization for transferring deep neural networks from simulation to the real world},
author={Tobin, Josh and Fong, Rachel and Ray, Alex and Schneider, Jonas and Zaremba, Wojciech and Abbeel, Pieter},
booktitle={2017 IEEE/RSJ international conference on intelligent robots and systems (IROS)},
pages={23--30},
year={2017},
organization={IEEE}
}
@inproceedings{finn2017deep,
title={Deep visual foresight for planning robot motion},
author={Finn, Chelsea and Levine, Sergey},
booktitle={2017 IEEE International Conference on Robotics and Automation (ICRA)},
pages={2786--2793},
year={2017},
organization={IEEE}
}
@article{duan2017one,
title={One-shot imitation learning},
author={Duan, Yan and Andrychowicz, Marcin and Stadie, Bradly and Jonathan Ho, OpenAI and Schneider, Jonas and Sutskever, Ilya and Abbeel, Pieter and Zaremba, Wojciech},
journal={Advances in neural information processing systems},
volume={30},
year={2017}
}
@book{koubaa2017robot,
title={Robot Operating System (ROS).},
author={Koub{\^a}a, Anis and others},
volume={1},
year={2017},
publisher={Springer}
}
@article{coleman2014reducing,
title={Reducing the barrier to entry of complex robotic software: a moveit! case study},
author={Coleman, David and Sucan, Ioan and Chitta, Sachin and Correll, Nikolaus},
journal={arXiv preprint arXiv:1404.3785},
year={2014}
}
@inproceedings{salzmann2020trajectron++,
title={Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data},
author={Salzmann, Tim and Ivanovic, Boris and Chakravarty, Punarjay and Pavone, Marco},
booktitle={European Conference on Computer Vision},
pages={683--700},
year={2020},
organization={Springer}
}
@inproceedings{gog2021pylot,
title={Pylot: A modular platform for exploring latency-accuracy tradeoffs in autonomous vehicles},
author={Gog, Ionel and Kalra, Sukrit and Schafhalter, Peter and Wright, Matthew A and Gonzalez, Joseph E and Stoica, Ion},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
pages={8806--8813},
year={2021},
organization={IEEE}
}
@inproceedings{Dosovitskiy17,
title = { {CARLA}: {An} Open Urban Driving Simulator},
author = {Alexey Dosovitskiy and German Ros and Felipe Codevilla and Antonio Lopez and Vladlen Koltun},
booktitle = {Proceedings of the 1st Annual Conference on Robot Learning},
pages = {1--16},
year = {2017}
}
@inproceedings{10.1145/3492321.3519576,
author = {Gog, Ionel and Kalra, Sukrit and Schafhalter, Peter and Gonzalez, Joseph E. and Stoica, Ion},
title = {D3: A Dynamic Deadline-Driven Approach for Building Autonomous Vehicles},
year = {2022},
isbn = {9781450391627},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3492321.3519576},
doi = {10.1145/3492321.3519576},
abstract = {Autonomous vehicles (AVs) must drive across a variety of challenging environments that impose continuously-varying deadlines and runtime-accuracy tradeoffs on their software pipelines. A deadline-driven execution of such AV pipelines requires a new class of systems that enable the computation to maximize accuracy under dynamically-varying deadlines. Designing these systems presents interesting challenges that arise from combining ease-of-development of AV pipelines with deadline specification and enforcement mechanisms.Our work addresses these challenges through D3 (Dynamic Deadline-Driven), a novel execution model that centralizes the deadline management, and allows applications to adjust their computation by modeling missed deadlines as exceptions. Further, we design and implement ERDOS, an open-source realization of D3 for AV pipelines that exposes finegrained execution events to applications, and provides mechanisms to speculatively execute computation and enforce deadlines between an arbitrary set of events. Finally, we address the crucial lack of AV benchmarks through our state-of-the-art open-source AV pipeline, Pylot, that works seamlessly across simulators and real AVs. We evaluate the efficacy of D3 and ERDOS by driving Pylot across challenging driving scenarios spanning 50km, and observe a 68% reduction in collisions as compared to prior execution models.},
booktitle = {Proceedings of the Seventeenth European Conference on Computer Systems},
pages = {453--471},
numpages = {19},
location = {Rennes, France},
series = {EuroSys '22}
}
@article{li2021metadrive,
author = {Li, Quanyi and Peng, Zhenghao and Xue, Zhenghai and Zhang, Qihang and Zhou, Bolei},
journal = {ArXiv preprint},
title = {Metadrive: Composing diverse driving scenarios for generalizable reinforcement learning},
url = {https://arxiv.org/abs/2109.12674},
volume = {abs/2109.12674},
year = {2021}
}
@article{peng2021learning,
author = {Peng, Zhenghao and Li, Quanyi and Hui, Ka Ming and Liu, Chunxiao and Zhou, Bolei},
journal = {Advances in Neural Information Processing Systems},
title = {Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization},
volume = {34},
year = {2021}
}
@inproceedings{peng2021safe,
author = {Peng, Zhenghao and Li, Quanyi and Liu, Chunxiao and Zhou, Bolei},
booktitle = {5th Annual Conference on Robot Learning},
title = {Safe Driving via Expert Guided Policy Optimization},
year = {2021}
}
@article{8421746,
author={Qin, Tong and Li, Peiliang and Shen, Shaojie},
journal={IEEE Transactions on Robotics},
title={VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator},
year={2018},
volume={34},
number={4},
pages={1004--1020},
doi={10.1109/TRO.2018.2853729}
}
@article{campos2021orb,
title={Orb-slam3: An accurate open-source library for visual, visual--inertial, and multimap slam},
author={Campos, Carlos and Elvira, Richard and Rodr{\'\i}guez, Juan J G{\'o}mez and Montiel, Jos{\'e} MM and Tard{\'o}s, Juan D},
journal={IEEE Transactions on Robotics},
volume={37},
number={6},
pages={1874--1890},
year={2021},
publisher={IEEE}
}
@inproceedings{li2021efficient,
author = {Li, Quanyi and Peng, Zhenghao and Zhou, Bolei},
booktitle = {International Conference on Learning Representations},
title = {Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization},
year = {2021}
}
@article{chaplot2020learning,
title={Learning to explore using active neural slam},
author={Chaplot, Devendra Singh and Gandhi, Dhiraj and Gupta, Saurabh and Gupta, Abhinav and Salakhutdinov, Ruslan},
journal={arXiv preprint arXiv:2004.05155},
year={2020}
}
@article{teed2021droid,
title={Droid-slam: Deep visual slam for monocular, stereo, and rgb-d cameras},
author={Teed, Zachary and Deng, Jia},
journal={Advances in Neural Information Processing Systems},
volume={34},
year={2021}
}
@article{brunke2021safe,
title={Safe learning in robotics: From learning-based control to safe reinforcement learning},
author={Brunke, Lukas and Greeff, Melissa and Hall, Adam W and Yuan, Zhaocong and Zhou, Siqi and Panerati, Jacopo and Schoellig, Angela P},
journal={Annual Review of Control, Robotics, and Autonomous Systems},
volume={5},
year={2021},
publisher={Annual Reviews}
}
@InProceedings{pmlr-v144-gama21a,
title = {Graph Neural Networks for Distributed Linear-Quadratic Control},
author = {Gama, Fernando and Sojoudi, Somayeh},
booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
pages = {111--124},
year = {2021},
editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and A. Parrilo, Pablo and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
volume = {144},
series = {Proceedings of Machine Learning Research},
month = {07 -- 08 June},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v144/gama21a/gama21a.pdf},
url = {https://proceedings.mlr.press/v144/gama21a.html},
abstract = {The linear-quadratic controller is one of the fundamental problems in control theory. The optimal solution is a linear controller that requires access to the state of the entire system at any given time. When considering a network system, this renders the optimal controller a centralized one. The interconnected nature of a network system often demands a distributed controller, where different components of the system are controlled based only on local information. Unlike the classical centralized case, obtaining the optimal distributed controller is usually an intractable problem. Thus, we adopt a graph neural network (GNN) as a parametrization of distributed controllers. GNNs are naturally local and have distributed architectures, making them well suited for learning nonlinear distributed controllers. By casting the linear-quadratic problem as a self-supervised learning problem, we are able to find the best GNN-based distributed controller. We also derive sufficient conditions for the resulting closed-loop system to be stable. We run extensive simulations to study the performance of GNN-based distributed controllers and showcase that they are a computationally efficient parametrization with scalability and transferability capabilities.}
}
@InProceedings{pmlr-v144-mehrjou21a,
title = {Neural Lyapunov Redesign},
author = {Mehrjou, Arash and Ghavamzadeh, Mohammad and Sch\"olkopf, Bernhard},
booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
pages = {459--470},
year = {2021},
editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and A. Parrilo, Pablo and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
volume = {144},
series = {Proceedings of Machine Learning Research},
month = {07 -- 08 June},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v144/mehrjou21a/mehrjou21a.pdf},
url = {https://proceedings.mlr.press/v144/mehrjou21a.html},
abstract = {Learning controllers merely based on a performance metric has been proven effective in many physical and non-physical tasks in both control theory and reinforcement learning. However, in practice, the controller must guarantee some notion of safety to ensure that it does not harm either the agent or the environment. Stability is a crucial notion of safety, whose violation can certainly cause unsafe behaviors. Lyapunov functions are effective tools to assess stability in nonlinear dynamical systems. In this paper, we combine an improving Lyapunov function with automatic controller synthesis in an iterative fashion to obtain control policies with large safe regions. We propose a two-player collaborative algorithm that alternates between estimating a Lyapunov function and deriving a controller that gradually enlarges the stability region of the closed-loop system. We provide theoretical results on the class of systems that can be treated with the proposed algorithm and empirically evaluate the effectiveness of our method using an exemplary dynamical system.}
}
@InProceedings{pmlr-v144-zhang21b,
title = {{LEOC}: A Principled Method in Integrating Reinforcement Learning and Classical Control Theory},
author = {Zhang, Naifu and Capel, Nicholas},
booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
pages = {689--701},
year = {2021},
editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and A. Parrilo, Pablo and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
volume = {144},
series = {Proceedings of Machine Learning Research},
month = {07 -- 08 June},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v144/zhang21b/zhang21b.pdf},
url = {https://proceedings.mlr.press/v144/zhang21b.html},
abstract = {There have been attempts in reinforcement learning to exploit a priori knowledge about the structure of the system. This paper proposes a hybrid reinforcement learning controller which dynamically interpolates a model-based linear controller and an arbitrary differentiable policy. The linear controller is designed based on local linearised model knowledge, and stabilises the system in a neighbourhood about an operating point. The coefficients of interpolation between the two controllers are determined by a scaled distance function measuring the distance between the current state and the operating point. The overall hybrid controller is proven to maintain the stability guarantee around the neighborhood of the operating point and still possess the universal function approximation property of the arbitrary non-linear policy. Learning has been done on both model-based (PILCO) and model-free (DDPG) frameworks. Simulation experiments performed in OpenAI gym demonstrate stability and robustness of the proposed hybrid controller. This paper thus introduces a principled method allowing for the direct importing of control methodology into reinforcement learning.}
}
@InProceedings{pmlr-v144-rafailov21a,
title = {Offline Reinforcement Learning from Images with Latent Space Models},
author = {Rafailov, Rafael and Yu, Tianhe and Rajeswaran, Aravind and Finn, Chelsea},
booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
pages = {1154--1168},
year = {2021},
editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and A. Parrilo, Pablo and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
volume = {144},
series = {Proceedings of Machine Learning Research},
month = {07 -- 08 June},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v144/rafailov21a/rafailov21a.pdf},
url = {https://proceedings.mlr.press/v144/rafailov21a.html},
abstract = {Offline reinforcement learning (RL) refers to the task of learning policies from a static dataset of environment interactions. Offline RL enables extensive utilization and re-use of historical datasets, while also alleviating safety concerns associated with online exploration, thereby expanding the real-world applicability of RL. Most prior work in offline RL has focused on tasks with compact state representations. However, the ability to learn directly from rich observation spaces like images is critical for real-world applications like robotics. In this work, we build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces. Model-based offline RL algorithms have achieved state of the art results in state based tasks and are minimax optimal. However, they rely crucially on the ability to quantify uncertainty in the model predictions. This is particularly challenging with image observations. To overcome this challenge, we propose to learn a latent-state dynamics model, and represent the uncertainty in the latent space. Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP. Through experiments on a range of challenging image-based locomotion and robotic manipulation tasks, we find that our algorithm significantly outperforms previous offline model-free RL methods as well as state-of-the-art online visual model-based RL methods. Moreover, we also find that our approach excels on an image-based drawer closing task on a real robot using a pre-existing dataset. All results including videos can be found online at \url{https://sites.google.com/view/lompo/}.}
}
@inproceedings{chen2020transferable,
title={Transferable active grasping and real embodied dataset},
author={Chen, Xiangyu and Ye, Zelin and Sun, Jiankai and Fan, Yuda and Hu, Fang and Wang, Chenxi and Lu, Cewu},
booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
pages={3611--3618},
year={2020},
organization={IEEE}
}
@article{sun2021adversarial,
title={Adversarial inverse reinforcement learning with self-attention dynamics model},
author={Sun, Jiankai and Yu, Lantao and Dong, Pinqian and Lu, Bo and Zhou, Bolei},
journal={IEEE Robotics and Automation Letters},
volume={6},
number={2},
pages={1880--1886},
year={2021},
publisher={IEEE}
}
@article{huang2018navigationnet,
title={NavigationNet: A large-scale interactive indoor navigation dataset},
author={Huang, He and Shen, Yujing and Sun, Jiankai and Lu, Cewu},
journal={arXiv preprint arXiv:1808.08374},
year={2018}
}
@inproceedings{xu2019depth,
title={Depth completion from sparse lidar data with depth-normal constraints},
author={Xu, Yan and Zhu, Xinge and Shi, Jianping and Zhang, Guofeng and Bao, Hujun and Li, Hongsheng},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={2811--2820},
year={2019}
}
@inproceedings{zhu2020ssn,
title={Ssn: Shape signature networks for multi-class object detection from point clouds},
author={Zhu, Xinge and Ma, Yuexin and Wang, Tai and Xu, Yan and Shi, Jianping and Lin, Dahua},
booktitle={European Conference on Computer Vision},
pages={581--597},
year={2020},
organization={Springer}
}
@inproceedings{huang2019prior,
title={Prior guided dropout for robust visual localization in dynamic environments},
author={Huang, Zhaoyang and Xu, Yan and Shi, Jianping and Zhou, Xiaowei and Bao, Hujun and Zhang, Guofeng},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={2791--2800},
year={2019}
}
@article{xu2020selfvoxelo,
title={Selfvoxelo: Self-supervised lidar odometry with voxel-based deep neural networks},
author={Xu, Yan and Huang, Zhaoyang and Lin, Kwan-Yee and Zhu, Xinge and Shi, Jianping and Bao, Hujun and Zhang, Guofeng and Li, Hongsheng},
journal={arXiv preprint arXiv:2010.09343},
year={2020}
}
@article{huang2021life,
title={LIFE: Lighting Invariant Flow Estimation},
author={Huang, Zhaoyang and Pan, Xiaokun and Xu, Runsen and Xu, Yan and Zhang, Guofeng and Li, Hongsheng and others},
journal={arXiv preprint arXiv:2104.03097},
year={2021}
}
@inproceedings{huang2021vs,
title={VS-Net: Voting with Segmentation for Visual Localization},
author={Huang, Zhaoyang and Zhou, Han and Li, Yijin and Yang, Bangbang and Xu, Yan and Zhou, Xiaowei and Bao, Hujun and Zhang, Guofeng and Li, Hongsheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={6101--6111},
year={2021}
}
@article{yang2021pdnet,
title={PDNet: Towards Better One-stage Object Detection with Prediction Decoupling},
author={Yang, Li and Xu, Yan and Wang, Shaoru and Yuan, Chunfeng and Zhang, Ziqi and Li, Bing and Hu, Weiming},
journal={arXiv preprint arXiv:2104.13876},
year={2021}
}
@article{xu2022robust,
title={Robust Self-supervised LiDAR Odometry via Representative Structure Discovery and 3D Inherent Error Modeling},
author={Xu, Yan and Lin, Junyi and Shi, Jianping and Zhang, Guofeng and Wang, Xiaogang and Li, Hongsheng},
journal={IEEE Robotics and Automation Letters},
year={2022},
publisher={IEEE}
}
@article{xu2022rnnpose,
title={RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization},
author={Xu, Yan and Lin, Junyi and Zhang, Guofeng and Wang, Xiaogang and Li, Hongsheng},
journal={arXiv preprint arXiv:2203.12870},
year={2022}
}
@article{Sun2022SelfSupervisedTA,
title={Self-Supervised Traffic Advisors: Distributed, Multi-view Traffic Prediction for Smart Cities},
author={Jiankai Sun and Shreyas Kousik and David Fridovich-Keil and Mac Schwager},
journal={arXiv preprint},
year={2022}
}
@InProceedings{pmlr-v155-sun21a,
title = {Neuro-Symbolic Program Search for Autonomous Driving Decision Module Design},
author = {Sun, Jiankai and Sun, Hao and Han, Tian and Zhou, Bolei},
booktitle = {Proceedings of the 2020 Conference on Robot Learning},
pages = {21--30},
year = {2021},
editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
volume = {155},
series = {Proceedings of Machine Learning Research},
month = {16--18 Nov},
publisher = {PMLR},
pdf = {https://proceedings.mlr.press/v155/sun21a/sun21a.pdf},
url = {https://proceedings.mlr.press/v155/sun21a.html},
abstract = {As a promising topic in cognitive robotics, neuro-symbolic modeling integrates symbolic reasoning and neural representation altogether. However, previous neuro-symbolic models usually wire their structures and the connections manually, making the underlying parameters sub-optimal. In this work, we propose the Neuro-Symbolic Program Search (NSPS) to improve the autonomous driving system design. NSPS is a novel automated search method that synthesizes the Neuro-Symbolic Programs. It can produce robust and expressive Neuro-Symbolic Programs and automatically tune the hyper-parameters. We validate NSPS in the CARLA driving simulation environment. The resulting Neuro-Symbolic Decision Programs successfully handle multiple traffic scenarios. Compared with previous neural-network-based driving and rule-based methods, our neuro-symbolic driving pipeline achieves more stable and safer behaviors in complex driving scenarios while maintaining an interpretable symbolic decision-making process.}
}
@article{9491826,
author={Lu, Sidi and Shi, Weisong},
journal={IEEE Internet Computing},
title={The Emergence of Vehicle Computing},
year={2021},
volume={25},
number={3},
pages={18--22},
doi={10.1109/MIC.2021.3066076}
}
@inproceedings{maruyama2016exploring,
title={Exploring the performance of ROS2},
author={Maruyama, Yuya and Kato, Shinpei and Azumi, Takuya},
booktitle={Proceedings of the 13th ACM SIGBED International Conference on Embedded Software (EMSOFT)},
pages={1--10},
year={2016}
}
@inproceedings{yi2020segvoxelnet,
title={Segvoxelnet: Exploring semantic context and depth-aware features for 3d vehicle detection from point cloud},
author={Yi, Hongwei and Shi, Shaoshuai and Ding, Mingyu and Sun, Jiankai and Xu, Kui and Zhou, Hui and Wang, Zhe and Li, Sheng and Wang, Guoping},
booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
pages={2274--2280},
year={2020},
organization={IEEE}
}
@article{9712373,
author={Sun, Jiankai and Huang, De-An and Lu, Bo and Liu, Yun-Hui and Zhou, Bolei and Garg, Animesh},
journal={IEEE Robotics and Automation Letters},
title={PlaTe: Visually-Grounded Planning With Transformers in Procedural Tasks},
year={2022},
volume={7},
number={2},
pages={4924--4930},
doi={10.1109/LRA.2022.3150855}
}
@article{aradi2020survey,
title={Survey of deep reinforcement learning for motion planning of autonomous vehicles},
author={Aradi, Szil{\'a}rd},
journal={IEEE Transactions on Intelligent Transportation Systems},
year={2020},
publisher={IEEE}
}

references/training.bib Normal file