mirror of
https://github.com/openmlsys/openmlsys-zh.git
synced 2026-03-21 04:27:33 +08:00
fix: fix equation rendering by changing the toolchain to mathjax (#493)
* docs: update README and build guide
* fix: escape * and _ inside math to prevent markdown emphasis corruption
* fix: configure MathJax to use TeX (Computer Modern) font
* feat: enhance markdown processing with label and figure collection
* fix: remove duplicate bibliography directives from chapter summaries

References are already handled at the chapter level, so the :bibliography: directives in summary pages are redundant and cause rendering issues.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
8
.github/workflows/main.yml
vendored
@@ -23,20 +23,12 @@ jobs:
         with:
           mdbook-version: 'latest'
 
-      - name: Cache mdbook-typst-math binary
-        uses: actions/cache@v4
-        with:
-          path: .mdbook-bin
-          key: mdbook-typst-math-v0.3.0-linux-x86_64
-
       - name: Run mdBook regression tests
         run: |
           python3 -m unittest discover -s tests -p 'test_prepare_mdbook.py'
           python3 -m unittest discover -s tests -p 'test_prepare_mdbook_zh.py'
           python3 -m unittest discover -s tests -p 'test_assemble_docs_publish_tree.py'
           python3 -m unittest discover -s tests -p 'test_ensure_book_resources.py'
-          python3 -m unittest discover -s tests -p 'test_mdbook_typst_math.py'
-          python3 -m unittest discover -s tests -p 'test_ensure_mdbook_typst_math.py'
           python3 -m unittest discover -s tests -p 'test_update_docs_workflow.py'
 
       - name: Build English HTML with mdBook
8
.github/workflows/update_docs.yml
vendored
@@ -17,12 +17,6 @@ jobs:
         with:
           python-version: '3.10'
 
-      - name: Cache mdbook-typst-math binary
-        uses: actions/cache@v4
-        with:
-          path: .mdbook-bin
-          key: mdbook-typst-math-v0.3.0-linux-x86_64
-
       - name: Setup mdBook
         uses: peaceiris/actions-mdbook@v2
         with:
@@ -34,8 +28,6 @@ jobs:
           python3 -m unittest discover -s tests -p 'test_prepare_mdbook_zh.py'
           python3 -m unittest discover -s tests -p 'test_assemble_docs_publish_tree.py'
           python3 -m unittest discover -s tests -p 'test_ensure_book_resources.py'
-          python3 -m unittest discover -s tests -p 'test_mdbook_typst_math.py'
-          python3 -m unittest discover -s tests -p 'test_ensure_mdbook_typst_math.py'
 
       - name: Build English HTML with mdBook
         run: bash build_mdbook.sh
19
README.md
@@ -89,8 +89,8 @@
 
 ### Prerequisites
 
 - Python >= 3.10
-- pandoc >= 2.19
+- curl
 - git
 
 ### Installation
@@ -99,19 +99,18 @@
 git clone https://github.com/openmlsys/openmlsys-zh.git
 cd openmlsys-zh
 
-# Install d2lbook
-git clone https://github.com/openmlsys/d2l-book.git
-cd d2l-book && pip install . && cd ..
+# Install the Rust toolchain
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 
-# Install Python dependencies
-pip install -r requirements.txt
+# Install mdbook
+cargo install mdbook
 ```
 
-### Build HTML
+### Build HTML
 
 ```bash
-sh build_html.sh
-# Output is in _build/html/
+sh build_mdbook_zh.sh
+# Output is in .mdbook-zh/book
 ```
 
 For more details, see the [Build Guide](info/info.md).
17
README_EN.md
@@ -91,8 +91,8 @@ The book is organized into three parts: Fundamentals, Advanced Topics, and Exten
 
 ### Prerequisites
 
 - Python >= 3.10
-- pandoc >= 2.19
+- curl
 - git
 
 ### Installation
@@ -101,19 +101,18 @@ The book is organized into three parts: Fundamentals, Advanced Topics, and Exten
 git clone https://github.com/openmlsys/openmlsys-zh.git
 cd openmlsys-zh
 
-# Install d2lbook
-git clone https://github.com/openmlsys/d2l-book.git
-cd d2l-book && pip install . && cd ..
+# Install Rust toolchain (Linux/macOS)
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 
-# Install Python dependencies
-pip install -r requirements.txt
+# Install mdbook
+cargo install mdbook
 ```
 
 ### Build HTML
 
 ```bash
-sh build_html.sh
-# Output is in _build/html/
+sh build_mdbook.sh
+# Output is in .mdbook/book
 ```
 
 For more details, see the [Build Guide](info/info.md).
@@ -11,9 +11,8 @@ create-missing = false
 [preprocessor.openmlsys]
 command = "python3 tools/mdbook_preprocessor.py"
 
-[preprocessor.typst-math]
-
 [output.html]
+mathjax-support = true
 git-repository-url = "https://github.com/openmlsys/openmlsys-zh"
 preferred-dark-theme = "navy"
-additional-css = ["theme/dark-mode-images.css", "theme/typst.css"]
+additional-css = ["theme/dark-mode-images.css"]
@@ -11,9 +11,8 @@ create-missing = false
 [preprocessor.openmlsys-zh]
 command = "python3 ../../tools/mdbook_zh_preprocessor.py"
 
-[preprocessor.typst-math]
-
 [output.html]
+mathjax-support = true
 git-repository-url = "https://github.com/openmlsys/openmlsys-zh"
 preferred-dark-theme = "navy"
-additional-css = ["theme/dark-mode-images.css", "theme/typst.css"]
+additional-css = ["theme/dark-mode-images.css"]
12
books/zh/theme/head.hbs
Normal file
@@ -0,0 +1,12 @@
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    "HTML-CSS": {
+      availableFonts: ["TeX"],
+      preferredFont: "TeX",
+      webFont: "TeX"
+    },
+    SVG: {
+      font: "TeX"
+    }
+  });
+</script>
@@ -14,10 +14,6 @@ if ! command -v mdbook >/dev/null 2>&1; then
   exit 1
 fi
 
-MDBOOK_TYPST_MATH_BIN_DIR="${ROOT}/.mdbook-bin"
-"${PYTHON_BIN}" "${ROOT}/tools/ensure_mdbook_typst_math.py" --output-dir "${MDBOOK_TYPST_MATH_BIN_DIR}" >/dev/null
-export PATH="${MDBOOK_TYPST_MATH_BIN_DIR}:${PATH}"
-
 "${PYTHON_BIN}" "${ROOT}/tools/ensure_book_resources.py" --chapter-dir "${ROOT}/en_chapters"
 "${PYTHON_BIN}" "${ROOT}/tools/prepare_mdbook.py" \
   --source "${ROOT}/en_chapters" \
@@ -14,10 +14,6 @@ if ! command -v mdbook >/dev/null 2>&1; then
   exit 1
 fi
 
-MDBOOK_TYPST_MATH_BIN_DIR="${ROOT}/.mdbook-bin"
-"${PYTHON_BIN}" "${ROOT}/tools/ensure_mdbook_typst_math.py" --output-dir "${MDBOOK_TYPST_MATH_BIN_DIR}" >/dev/null
-export PATH="${MDBOOK_TYPST_MATH_BIN_DIR}:${PATH}"
-
 # ── Create resource links ─────────────────────────────────────────────────────
 "${PYTHON_BIN}" "${ROOT}/tools/ensure_book_resources.py" --chapter-dir "${ROOT}/zh_chapters"
 
25
info/info.md
@@ -1,13 +1,10 @@
 ## Environment Setup
-The Machine Learning Systems book is deployed on GitHub using the d2lbook tool, so we first need to install d2lbook.
+The Machine Learning Systems book is deployed on GitHub using the mdbook tool. We recommend installing mdbook with cargo, Rust's native package manager.
 ```bash
-git clone https://github.com/openmlsys/d2l-book.git
-cd d2l-book
-python setup.py install
+# Install the Rust toolchain to get cargo
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+cargo install mdbook
 ```
-Building HTML with d2lbook requires `pandoc`, which can be installed with `conda install pandoc` (on macOS, Homebrew also works). The pandoc version in the apt repositories is fairly old and may convert tables incorrectly, so please use a recent version of pandoc where possible.
-If the book contains SVG images, building the PDF requires LibRsvg to convert them; `librsvg` can be installed with `apt-get install librsvg` (on macOS, use Homebrew).
-Building the PDF also requires LaTeX, for example [TeX Live](https://www.tug.org/texlive/).
 
 ## Building the HTML Version
 Before building, download [openmlsys-zh](https://github.com/openmlsys/openmlsys-zh); all build commands are run inside this directory.
@@ -15,16 +12,16 @@ python setup.py install
 git clone https://github.com/openmlsys/openmlsys-zh.git
 cd openmlsys-zh
 ```
-Build the HTML with d2lbook. Please use the build_html.sh script whenever possible so that the front page is merged into the book correctly.
-```
-sh build_html.sh
+Build the HTML with mdbook. Please use the build_mdbook.sh script whenever possible so that the front page is merged into the book correctly.
+```bash
+sh build_mdbook.sh
+# Chinese version
+sh build_mdbook_zh.sh
 ```
 
-The generated HTML is placed in `_build/html`.
+The generated HTML is placed in `.mdbook/book` or `.mdbook-zh/book`. At this point, `tools/assemble_docs_publish_tree.py` can be used to assemble the final bilingual release, which is then copied into the docs directory of openmlsys.github.io for publishing.
 
-At this point, copy the entire contents of the built HTML folder into the docs directory of openmlsys.github.io for publishing.
-
-Note that the .nojekyll file in the docs directory must not be deleted; otherwise the site will not render.
+For the exact workflow, see `.github/workflows/update_docs.yml`.
 
 ## Style Guide
 
@@ -1,134 +0,0 @@
-from __future__ import annotations
-
-import gzip
-import hashlib
-import os
-import tempfile
-import unittest
-from pathlib import Path
-from unittest.mock import patch
-
-from tools.ensure_mdbook_typst_math import (
-    ASSET_SHA256,
-    VERSION,
-    build_download_url,
-    ensure_binary,
-    resolve_asset_name,
-    resolve_binary_path,
-    resolve_version_path,
-)
-
-
-class ResolveAssetNameTests(unittest.TestCase):
-    def test_resolve_asset_name_for_supported_targets(self) -> None:
-        self.assertEqual(
-            resolve_asset_name(system="Darwin", machine="arm64"),
-            "mdbook-typst-math-aarch64-apple-darwin.gz",
-        )
-        self.assertEqual(
-            resolve_asset_name(system="Darwin", machine="x86_64"),
-            "mdbook-typst-math-x86_64-apple-darwin.gz",
-        )
-        self.assertEqual(
-            resolve_asset_name(system="Linux", machine="aarch64"),
-            "mdbook-typst-math-aarch64-unknown-linux-gnu.gz",
-        )
-        self.assertEqual(
-            resolve_asset_name(system="Linux", machine="AMD64"),
-            "mdbook-typst-math-x86_64-unknown-linux-gnu.gz",
-        )
-        self.assertEqual(
-            resolve_asset_name(system="Windows", machine="AMD64"),
-            "mdbook-typst-math-x86_64-pc-windows-msvc.exe",
-        )
-
-    def test_resolve_asset_name_rejects_unsupported_targets(self) -> None:
-        with self.assertRaises(ValueError):
-            resolve_asset_name(system="Linux", machine="riscv64")
-
-
-class EnsureBinaryTests(unittest.TestCase):
-    def test_ensure_binary_downloads_and_extracts_gzip_release(self) -> None:
-        payload = b"linux-binary"
-        asset_name = "mdbook-typst-math-x86_64-unknown-linux-gnu.gz"
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            output_dir = Path(tmpdir)
-            urls: list[str] = []
-
-            def fake_downloader(url: str) -> bytes:
-                urls.append(url)
-                return gzip.compress(payload)
-
-            with patch.dict(ASSET_SHA256, {asset_name: hashlib.sha256(gzip.compress(payload)).hexdigest()}):
-                binary_path = ensure_binary(
-                    output_dir,
-                    system="Linux",
-                    machine="x86_64",
-                    downloader=fake_downloader,
-                )
-
-            self.assertEqual(binary_path, resolve_binary_path(output_dir, VERSION, asset_name))
-            self.assertEqual(binary_path.name, "mdbook-typst-math")
-            self.assertEqual(binary_path.read_bytes(), payload)
-            self.assertEqual(resolve_version_path(output_dir).read_text(encoding="utf-8"), VERSION)
-            self.assertEqual(urls, [build_download_url(VERSION, asset_name)])
-            self.assertTrue(os.access(binary_path, os.X_OK))
-
-    def test_ensure_binary_uses_cached_file_without_downloading(self) -> None:
-        with tempfile.TemporaryDirectory() as tmpdir:
-            output_dir = Path(tmpdir)
-            asset_name = "mdbook-typst-math-x86_64-unknown-linux-gnu.gz"
-            cached_binary = resolve_binary_path(output_dir, VERSION, asset_name)
-            output_dir.mkdir(parents=True, exist_ok=True)
-            cached_binary.write_bytes(b"cached")
-            cached_binary.chmod(0o755)
-            resolve_version_path(output_dir).write_text(VERSION, encoding="utf-8")
-
-            def fail_downloader(_: str) -> bytes:
-                raise AssertionError("downloader should not be called for cached binary")
-
-            binary_path = ensure_binary(
-                output_dir,
-                system="Linux",
-                machine="x86_64",
-                downloader=fail_downloader,
-            )
-
-            self.assertEqual(binary_path, cached_binary)
-            self.assertEqual(binary_path.read_bytes(), b"cached")
-
-    def test_ensure_binary_keeps_windows_extension(self) -> None:
-        payload = b"windows-binary"
-        asset_name = "mdbook-typst-math-x86_64-pc-windows-msvc.exe"
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            output_dir = Path(tmpdir)
-
-            def fake_downloader(_: str) -> bytes:
-                return payload
-
-            with patch.dict(ASSET_SHA256, {asset_name: hashlib.sha256(payload).hexdigest()}):
-                binary_path = ensure_binary(
-                    output_dir,
-                    system="Windows",
-                    machine="AMD64",
-                    downloader=fake_downloader,
-                )
-
-            self.assertEqual(binary_path.name, "mdbook-typst-math.exe")
-            self.assertEqual(binary_path.read_bytes(), payload)
-
-    def test_ensure_binary_rejects_checksum_mismatch(self) -> None:
-        with tempfile.TemporaryDirectory() as tmpdir:
-            with self.assertRaises(ValueError):
-                ensure_binary(
-                    Path(tmpdir),
-                    system="Linux",
-                    machine="x86_64",
-                    downloader=lambda _: gzip.compress(b"bad-binary"),
-                )
-
-
-if __name__ == "__main__":
-    unittest.main()
@@ -1,38 +0,0 @@
-from __future__ import annotations
-
-import unittest
-from pathlib import Path
-
-
-REPO_ROOT = Path(__file__).resolve().parents[1]
-BOOK_PATHS = (
-    REPO_ROOT / "book.toml",
-    REPO_ROOT / "books" / "zh" / "book.toml",
-)
-BUILD_SCRIPTS = (
-    REPO_ROOT / "build_mdbook.sh",
-    REPO_ROOT / "build_mdbook_zh.sh",
-)
-
-
-class MdBookTypstMathConfigTests(unittest.TestCase):
-    def test_books_use_typst_math_without_mathjax(self) -> None:
-        for path in BOOK_PATHS:
-            config = path.read_text(encoding="utf-8")
-
-            self.assertIn("[preprocessor.typst-math]", config, path.as_posix())
-            self.assertIn("theme/typst.css", config, path.as_posix())
-            self.assertNotIn("mathjax-support = true", config, path.as_posix())
-
-    def test_build_scripts_bootstrap_prebuilt_typst_math_binary(self) -> None:
-        for path in BUILD_SCRIPTS:
-            script = path.read_text(encoding="utf-8")
-
-            self.assertIn("ensure_mdbook_typst_math.py", script, path.as_posix())
-            self.assertIn("MDBOOK_TYPST_MATH_BIN_DIR", script, path.as_posix())
-            self.assertIn("export PATH=", script, path.as_posix())
-            self.assertNotIn("cargo install mdbook-typst-math", script, path.as_posix())
-
-
-if __name__ == "__main__":
-    unittest.main()
@@ -4,7 +4,17 @@ import tempfile
 import unittest
 from pathlib import Path
 
-from tools.prepare_mdbook import build_title_cache, rewrite_markdown, write_summary
+from tools.prepare_mdbook import (
+    _relative_chapter_path,
+    build_title_cache,
+    collect_figure_labels,
+    collect_labels,
+    convert_math_to_mathjax,
+    normalize_directives,
+    process_figure_captions,
+    rewrite_markdown,
+    write_summary,
+)
 
 
 REPO_ROOT = Path(__file__).resolve().parents[1]
@@ -233,5 +243,287 @@ Reference :cite:`smith2024`.
         self.assertIn("width: 100%;", frontpage)
 
 
+class CollectLabelsTests(unittest.TestCase):
+    def test_standalone_label(self) -> None:
+        md = ":label:`my_fig`\n"
+        self.assertEqual(collect_labels(md), ["my_fig"])
+
+    def test_inline_table_label(self) -> None:
+        md = "|:label:`tbl`|||\n"
+        self.assertEqual(collect_labels(md), ["tbl"])
+
+    def test_escaped_underscores(self) -> None:
+        md = ":label:`ros2\\_topics`\n"
+        self.assertEqual(collect_labels(md), ["ros2\\_topics"])
+
+    def test_empty(self) -> None:
+        md = "No labels here.\n"
+        self.assertEqual(collect_labels(md), [])
+
+    def test_multiple_labels(self) -> None:
+        md = ":label:`fig1`\nsome text\n:label:`fig2`\n"
+        self.assertEqual(collect_labels(md), ["fig1", "fig2"])
+
+
+class LabelToAnchorTests(unittest.TestCase):
+    def test_standalone_label_becomes_anchor(self) -> None:
+        result = normalize_directives(":label:`ROS2_arch`\n")
+        self.assertIn('<a id="ROS2_arch"></a>', result)
+        self.assertNotIn(":label:", result)
+
+    def test_table_row_label_becomes_anchor(self) -> None:
+        result = normalize_directives("|:label:`tbl`|||\n")
+        self.assertIn('|<a id="tbl"></a>|||', result)
+
+    def test_width_line_removed(self) -> None:
+        result = normalize_directives(":width:`800px`\n")
+        self.assertNotIn(":width:", result)
+        self.assertNotIn("800px", result)
+
+
+class NumrefToLinkTests(unittest.TestCase):
+    def test_same_file_link(self) -> None:
+        ref_map = {"my_fig": "chapter/page.md"}
+        result = normalize_directives(
+            "See :numref:`my_fig`.\n",
+            ref_label_map=ref_map,
+            current_source_path="chapter/page.md",
+        )
+        self.assertIn("[my_fig](#my_fig)", result)
+
+    def test_cross_file_link(self) -> None:
+        ref_map = {"my_fig": "other_ch/file.md"}
+        result = normalize_directives(
+            "See :numref:`my_fig`.\n",
+            ref_label_map=ref_map,
+            current_source_path="chapter/page.md",
+        )
+        self.assertIn("[my_fig](../other_ch/file.md#my_fig)", result)
+
+    def test_unknown_label_fallback(self) -> None:
+        result = normalize_directives(
+            "See :numref:`unknown`.\n",
+            ref_label_map={},
+            current_source_path="chapter/page.md",
+        )
+        self.assertIn("`unknown`", result)
+        self.assertNotIn("[unknown]", result)
+
+    def test_no_ref_map_fallback(self) -> None:
+        result = normalize_directives("See :numref:`foo`.\n")
+        self.assertIn("`foo`", result)
+
+    def test_escaped_underscores_in_numref(self) -> None:
+        ref_map = {"ros2\\_topics": "chapter/ros.md"}
+        result = normalize_directives(
+            "See :numref:`ros2\\_topics`.\n",
+            ref_label_map=ref_map,
+            current_source_path="chapter/ros.md",
+        )
+        # _strip_latex_escapes_outside_math removes \_ → _, producing consistent IDs
+        self.assertIn("[ros2_topics](#ros2_topics)", result)
+
+
+class RelativeChapterPathTests(unittest.TestCase):
+    def test_same_file(self) -> None:
+        self.assertEqual(_relative_chapter_path("ch/page.md", "ch/page.md"), "")
+
+    def test_same_dir(self) -> None:
+        result = _relative_chapter_path("ch/a.md", "ch/b.md")
+        self.assertEqual(result, "b.md")
+
+    def test_different_dir(self) -> None:
+        result = _relative_chapter_path("ch1/page.md", "ch2/other.md")
+        self.assertEqual(result, "../ch2/other.md")
+
+
+class CollectFigureLabelsTests(unittest.TestCase):
+    def test_image_followed_by_label(self) -> None:
+        md = "\n:label:`fig1`\n"
+        self.assertEqual(collect_figure_labels(md), ["fig1"])
+
+    def test_image_with_width_and_label(self) -> None:
+        md = "\n:width:`800px`\n:label:`fig1`\n"
+        self.assertEqual(collect_figure_labels(md), ["fig1"])
+
+    def test_image_with_blank_lines(self) -> None:
+        md = "\n\n:width:`800px`\n\n:label:`fig1`\n"
+        self.assertEqual(collect_figure_labels(md), ["fig1"])
+
+    def test_table_label_not_collected(self) -> None:
+        md = "|:label:`tbl`|||\n"
+        self.assertEqual(collect_figure_labels(md), [])
+
+    def test_standalone_label_without_image(self) -> None:
+        md = "# Heading\n:label:`sec1`\n"
+        self.assertEqual(collect_figure_labels(md), [])
+
+    def test_multiple_figures(self) -> None:
+        md = "\n:label:`f1`\n\n\n:label:`f2`\n"
+        self.assertEqual(collect_figure_labels(md), ["f1", "f2"])
+
+
+class ProcessFigureCaptionsTests(unittest.TestCase):
+    def test_figure_with_number_and_caption(self) -> None:
+        md = "\n:width:`800px`\n:label:`fig1`\n"
+        result = process_figure_captions(md, fig_number_map={"fig1": "8.1"})
+        self.assertIn('<a id="fig1"></a>', result)
+        self.assertIn("", result)
+        self.assertIn('<p align="center">图8.1 量化原理</p>', result)
+        self.assertNotIn(":width:", result)
+        self.assertNotIn(":label:", result)
+
+    def test_figure_without_number_map(self) -> None:
+        md = "\n:label:`fig1`\n"
+        result = process_figure_captions(md)
+        self.assertIn('<a id="fig1"></a>', result)
+        self.assertIn("", result)
+        self.assertIn('<p align="center">caption</p>', result)
+
+    def test_image_without_label_passthrough(self) -> None:
+        md = "\nSome text\n"
+        result = process_figure_captions(md)
+        self.assertIn("", result)
+        self.assertNotIn('<a id=', result)
+        self.assertNotIn('<p align="center">', result)
+
+    def test_figure_empty_caption(self) -> None:
+        md = "\n:label:`fig1`\n"
+        result = process_figure_captions(md, fig_number_map={"fig1": "1.1"})
+        self.assertIn('<p align="center">图1.1</p>', result)
+
+
+class NumrefWithFigureNumberTests(unittest.TestCase):
+    def test_numref_shows_figure_number(self) -> None:
+        result = normalize_directives(
+            "See :numref:`my_fig`.\n",
+            ref_label_map={"my_fig": "ch/page.md"},
+            current_source_path="ch/page.md",
+            fig_number_map={"my_fig": "8.1"},
+        )
+        self.assertIn("[图8.1](#my_fig)", result)
+
+    def test_numref_cross_file_with_figure_number(self) -> None:
+        result = normalize_directives(
+            "See :numref:`my_fig`.\n",
+            ref_label_map={"my_fig": "other/page.md"},
+            current_source_path="ch/page.md",
+            fig_number_map={"my_fig": "3.2"},
+        )
+        self.assertIn("[图3.2](../other/page.md#my_fig)", result)
+
+    def test_numref_without_figure_number_shows_name(self) -> None:
+        result = normalize_directives(
+            "See :numref:`tbl`.\n",
+            ref_label_map={"tbl": "ch/page.md"},
+            current_source_path="ch/page.md",
+            fig_number_map={},
+        )
+        self.assertIn("[tbl](#tbl)", result)
+
+
+class LabelNumrefIntegrationTests(unittest.TestCase):
+    def test_rewrite_markdown_with_label_map(self) -> None:
+        with tempfile.TemporaryDirectory() as tmpdir:
+            page = Path(tmpdir) / "chapter" / "page.md"
+            page.parent.mkdir()
+            page.write_text(
+                "# Title\n\n:label:`my_fig`\n\nSee :numref:`my_fig`.\n",
+                encoding="utf-8",
+            )
+            rewritten = rewrite_markdown(
+                page.read_text(encoding="utf-8"),
+                page.resolve(),
+                {page.resolve(): "Title"},
+                ref_label_map={"my_fig": "chapter/page.md"},
+                current_source_path="chapter/page.md",
+            )
+            self.assertIn('<a id="my_fig"></a>', rewritten)
+            self.assertIn("[my_fig](#my_fig)", rewritten)
+
+    def test_rewrite_markdown_cross_file_numref(self) -> None:
+        with tempfile.TemporaryDirectory() as tmpdir:
+            page = Path(tmpdir) / "ch1" / "page.md"
+            page.parent.mkdir()
+            page.write_text(
+                "# Title\n\nSee :numref:`other_fig`.\n",
+                encoding="utf-8",
+            )
+            rewritten = rewrite_markdown(
+                page.read_text(encoding="utf-8"),
+                page.resolve(),
+                {page.resolve(): "Title"},
+                ref_label_map={"other_fig": "ch2/file.md"},
+                current_source_path="ch1/page.md",
+            )
+            self.assertIn("[other_fig](../ch2/file.md#other_fig)", rewritten)
+
+    def test_rewrite_markdown_figure_with_number_and_caption(self) -> None:
+        with tempfile.TemporaryDirectory() as tmpdir:
+            page = Path(tmpdir) / "ch" / "page.md"
+            page.parent.mkdir()
+            page.write_text(
+                "# Title\n\n\n:width:`800px`\n:label:`qfig`\n\nSee :numref:`qfig`.\n",
+                encoding="utf-8",
+            )
+            rewritten = rewrite_markdown(
+                page.read_text(encoding="utf-8"),
+                page.resolve(),
+                {page.resolve(): "Title"},
+                ref_label_map={"qfig": "ch/page.md"},
+                current_source_path="ch/page.md",
+                fig_number_map={"qfig": "8.1"},
+            )
+            self.assertIn('<a id="qfig"></a>', rewritten)
+            self.assertIn("", rewritten)
+            self.assertIn('<p align="center">图8.1 量化原理</p>', rewritten)
+            self.assertIn("[图8.1](#qfig)", rewritten)
+
+
+class ConvertMathToMathjaxTests(unittest.TestCase):
+    def test_display_math(self) -> None:
+        result = convert_math_to_mathjax("before $$x^2$$ after")
+        self.assertEqual(result, "before \\\\[x^2\\\\] after")
+
+    def test_inline_math(self) -> None:
+        result = convert_math_to_mathjax("before $x^2$ after")
+        self.assertEqual(result, "before \\\\(x^2\\\\) after")
+
+    def test_backslash_doubling_inside_math(self) -> None:
+        result = convert_math_to_mathjax("$$a \\\\ b$$")
+        self.assertEqual(result, "\\\\[a \\\\\\\\ b\\\\]")
+
+    def test_math_inside_code_block_not_converted(self) -> None:
+        text = "```\n$x^2$\n```"
+        result = convert_math_to_mathjax(text)
+        self.assertEqual(result, text)
+
+    def test_math_inside_inline_code_not_converted(self) -> None:
+        text = "use `$x$` for math"
+        result = convert_math_to_mathjax(text)
+        self.assertEqual(result, text)
+
+    def test_cjk_dollar_spans_stripped(self) -> None:
+        result = convert_math_to_mathjax("price $100美元$ done")
+        self.assertEqual(result, "price 100美元 done")
+
+    def test_no_math_passthrough(self) -> None:
+        text = "No math here at all."
+        self.assertEqual(convert_math_to_mathjax(text), text)
+
+    def test_mixed_display_and_inline(self) -> None:
+        text = "Inline $a$ and display $$b$$."
+        result = convert_math_to_mathjax(text)
+        self.assertEqual(result, "Inline \\\\(a\\\\) and display \\\\[b\\\\].")
+
+    def test_asterisk_escaped_inside_math(self) -> None:
+        result = convert_math_to_mathjax("$$n*CHW$$")
+        self.assertEqual(result, "\\\\[n\\*CHW\\\\]")
+
+    def test_underscore_escaped_inside_math(self) -> None:
+        result = convert_math_to_mathjax("$x_i$")
+        self.assertEqual(result, "\\\\(x\\_i\\\\)")
+
+
 if __name__ == "__main__":
     unittest.main()
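The `ConvertMathToMathjaxTests` cases in this test file pin down what `convert_math_to_mathjax` is expected to do, but the function itself (in `tools/prepare_mdbook.py`) is not shown in this diff view. The following is a minimal sketch that satisfies the string expectations in those tests. Everything in it is an assumption made for illustration: the name `convert_math_sketch`, the helper names, the regular expressions, and the rough CJK range check are hypothetical and are not the project's implementation.

```python
import re

# Fenced blocks and inline code spans are left untouched (assumed patterns).
_CODE = re.compile(r"(```.*?```|`[^`\n]*`)", re.DOTALL)
# $$...$$ display math is tried first, then single-line $...$ inline math.
_MATH = re.compile(r"\$\$(.+?)\$\$|\$([^$\n]+?)\$", re.DOTALL)
# Rough check: a "$...$" span containing Chinese characters is treated as prose.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def _escape_for_markdown(body: str) -> str:
    """Double backslashes, then escape * and _ so markdown leaves them alone."""
    body = body.replace("\\", "\\\\")
    return re.sub(r"([*_])", r"\\\1", body)


def _convert_span(match: re.Match) -> str:
    display, inline = match.group(1), match.group(2)
    if display is not None:
        return "\\\\[" + _escape_for_markdown(display) + "\\\\]"
    if _CJK.search(inline):
        return inline  # drop the dollar signs, keep the text
    return "\\\\(" + _escape_for_markdown(inline) + "\\\\)"


def convert_math_sketch(text: str) -> str:
    """Hypothetical stand-in for convert_math_to_mathjax."""
    parts = _CODE.split(text)  # odd indices hold the captured code spans
    return "".join(
        part if index % 2 else _MATH.sub(_convert_span, part)
        for index, part in enumerate(parts)
    )
```

Splitting on code spans first keeps the math regex away from anything inside backticks, which is one simple way to get the two "not converted" behaviours the tests ask for.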
12
theme/head.hbs
Normal file
@@ -0,0 +1,12 @@
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    "HTML-CSS": {
+      availableFonts: ["TeX"],
+      preferredFont: "TeX",
+      webFont: "TeX"
+    },
+    SVG: {
+      font: "TeX"
+    }
+  });
+</script>
@@ -1,16 +0,0 @@
-.typst-inline {
-  display: inline-flex;
-  vertical-align: -0.2em;
-}
-
-.typst-display {
-  display: flex;
-  justify-content: center;
-  margin: 1rem 0;
-  overflow-x: auto;
-}
-
-.typst-doc {
-  color: var(--fg);
-  max-width: 100%;
-}
@@ -1,132 +0,0 @@
-from __future__ import annotations
-
-import argparse
-import gzip
-import hashlib
-import os
-import platform
-import urllib.request
-from pathlib import Path
-
-
-VERSION = "v0.3.0"
-RELEASE_BASE_URL = "https://github.com/duskmoon314/mdbook-typst-math/releases/download"
-ASSET_SHA256 = {
-    "mdbook-typst-math-aarch64-apple-darwin.gz": "9c7a94113e16a465edd1324010e2cc432be3c0794320c13d6a44d9523f069384",
-    "mdbook-typst-math-aarch64-unknown-linux-gnu.gz": "bbcf4574e380663400af74dda76dd6ecafd36aff185d653a2e24e294c45321c3",
-    "mdbook-typst-math-x86_64-apple-darwin.gz": "8bb36eb558fc438c55162b442975eca588a7654b8069860526e46cc08c2aee6a",
-    "mdbook-typst-math-x86_64-pc-windows-msvc.exe": "b5d3e07108a7286007d153c66efe434d06ab6caf43fcd22f78b4e6af8a294314",
-    "mdbook-typst-math-x86_64-unknown-linux-gnu.gz": "3b785a42fb3a93bcd3f80106e6ded5c55bb0bcd4cd0634edf8232d14444b6987",
-}
-SUPPORTED_ASSETS = {
-    ("darwin", "aarch64"): "mdbook-typst-math-aarch64-apple-darwin.gz",
-    ("darwin", "x86_64"): "mdbook-typst-math-x86_64-apple-darwin.gz",
-    ("linux", "aarch64"): "mdbook-typst-math-aarch64-unknown-linux-gnu.gz",
-    ("linux", "x86_64"): "mdbook-typst-math-x86_64-unknown-linux-gnu.gz",
-    ("windows", "x86_64"): "mdbook-typst-math-x86_64-pc-windows-msvc.exe",
-}
-
-
-def normalize_machine(machine: str) -> str:
-    normalized = machine.strip().lower()
-    if normalized in {"arm64", "aarch64"}:
-        return "aarch64"
-    if normalized in {"amd64", "x86_64", "x64"}:
-        return "x86_64"
-    return normalized
-
-
-def normalize_system(system: str) -> str:
-    normalized = system.strip().lower()
-    if normalized.startswith("mingw") or normalized.startswith("msys") or normalized.startswith("cygwin"):
-        return "windows"
-    return normalized
-
-
-def resolve_asset_name(system: str | None = None, machine: str | None = None) -> str:
-    resolved_system = normalize_system(system or platform.system())
-    resolved_machine = normalize_machine(machine or platform.machine())
-    asset_name = SUPPORTED_ASSETS.get((resolved_system, resolved_machine))
-    if asset_name is None:
-        raise ValueError(
-            f"Unsupported platform for mdbook-typst-math: system={resolved_system!r}, machine={resolved_machine!r}"
-        )
-    return asset_name
-
-
-def build_download_url(version: str, asset_name: str) -> str:
-    return f"{RELEASE_BASE_URL}/{version}/{asset_name}"
-
-
-def resolve_binary_path(output_dir: Path, version: str, asset_name: str) -> Path:
-    binary_name = "mdbook-typst-math.exe" if asset_name.endswith(".exe") else "mdbook-typst-math"
-    return output_dir / binary_name
-
-
-def resolve_version_path(output_dir: Path) -> Path:
-    return output_dir / ".mdbook-typst-math.version"
-
-
-def download_bytes(url: str) -> bytes:
-    request = urllib.request.Request(url, headers={"User-Agent": "openmlsys-mdbook-bootstrap/1.0"})
-    with urllib.request.urlopen(request) as response:
-        return response.read()
-
-
def ensure_binary(
|
||||
output_dir: Path,
|
||||
*,
|
||||
version: str = VERSION,
|
||||
system: str | None = None,
|
||||
machine: str | None = None,
|
||||
downloader=download_bytes,
|
||||
) -> Path:
|
||||
asset_name = resolve_asset_name(system=system, machine=machine)
|
||||
binary_path = resolve_binary_path(output_dir, version, asset_name)
|
||||
version_path = resolve_version_path(output_dir)
|
||||
if binary_path.exists() and version_path.exists() and version_path.read_text(encoding="utf-8").strip() == version:
|
||||
if binary_path.suffix != ".exe":
|
||||
binary_path.chmod(binary_path.stat().st_mode | 0o111)
|
||||
return binary_path
|
||||
|
||||
expected_sha256 = ASSET_SHA256[asset_name]
|
||||
download_url = build_download_url(version, asset_name)
|
||||
archive_bytes = downloader(download_url)
|
||||
digest = hashlib.sha256(archive_bytes).hexdigest()
|
||||
if digest != expected_sha256:
|
||||
raise ValueError(
|
||||
f"Checksum mismatch for {asset_name}: expected {expected_sha256}, got {digest}"
|
||||
)
|
||||
|
||||
binary_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
payload = gzip.decompress(archive_bytes) if asset_name.endswith(".gz") else archive_bytes
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
temporary_path = binary_path.with_suffix(f"{binary_path.suffix}.tmp")
|
||||
temporary_path.write_bytes(payload)
|
||||
if binary_path.suffix != ".exe":
|
||||
temporary_path.chmod(0o755)
|
||||
os.replace(temporary_path, binary_path)
|
||||
version_path.write_text(version, encoding="utf-8")
|
||||
return binary_path
|
||||
|
||||
|
||||
def parse_args() -> argparse.Namespace:
|
||||
parser = argparse.ArgumentParser(description="Download a pinned mdbook-typst-math release binary.")
|
||||
parser.add_argument(
|
||||
"--output-dir",
|
||||
type=Path,
|
||||
default=Path(".mdbook-bin"),
|
||||
help="Directory used to cache the downloaded mdbook-typst-math binary.",
|
||||
)
|
||||
return parser.parse_args()
|
||||
|
||||
|
||||
def main() -> int:
|
||||
args = parse_args()
|
||||
binary_path = ensure_binary(args.output_dir.resolve())
|
||||
print(binary_path)
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
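The verify-then-install sequence in `ensure_binary` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's code: `install_verified` and `fake_downloader` are hypothetical names, and the payload is a fixed byte string rather than a real release asset.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def install_verified(payload: bytes, expected_sha256: str, target: Path) -> Path:
    """Write *payload* to *target* only if its SHA-256 digest matches."""
    digest = hashlib.sha256(payload).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"Checksum mismatch: expected {expected_sha256}, got {digest}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage the bytes next to the target so the rename stays on one filesystem.
    temporary = target.with_name(target.name + ".tmp")
    temporary.write_bytes(payload)
    os.replace(temporary, target)  # the final path never holds a partial file
    return target


def fake_downloader(url: str) -> bytes:
    """Hypothetical stand-in for a network fetch; returns fixed bytes."""
    return b"pretend-binary"


with tempfile.TemporaryDirectory() as tmp:
    data = fake_downloader("https://example.invalid/asset")
    good_digest = hashlib.sha256(data).hexdigest()
    installed = install_verified(data, good_digest, Path(tmp) / "bin" / "tool")
    installed_bytes = installed.read_bytes()
    try:
        install_verified(data, "0" * 64, Path(tmp) / "bin" / "other")
        rejected = False
    except ValueError:
        rejected = True
    other_exists = (Path(tmp) / "bin" / "other").exists()
```

Checking the digest before touching the filesystem means a corrupted download leaves no file behind, and injecting the downloader as a parameter (as `ensure_binary` does) keeps the flow testable without network access.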
@@ -1,614 +0,0 @@
-"""Convert LaTeX math notation to Typst math notation within markdown content.
-
-This module provides a best-effort converter for the LaTeX math subset used in
-the OpenMLSys textbook. It is **not** a general-purpose LaTeX→Typst transpiler;
-only the commands that actually appear in the zh_chapters are handled.
-"""
-
-from __future__ import annotations
-
-import re
-
-# ---------------------------------------------------------------------------
-# Brace-matching helper
-# ---------------------------------------------------------------------------
-
-def _find_brace_group(s: str, pos: int) -> tuple[str, int] | None:
-    """Return ``(content, end_pos)`` for the ``{…}`` group starting at *pos*.
-
-    Skips leading whitespace. Returns ``None`` when no opening brace is found
-    or braces are unbalanced.
-    """
-    while pos < len(s) and s[pos] in " \t":
-        pos += 1
-    if pos >= len(s) or s[pos] != "{":
-        return None
-    depth = 0
-    start = pos + 1
-    for i in range(pos, len(s)):
-        if s[i] == "{":
-            depth += 1
-        elif s[i] == "}":
-            depth -= 1
-            if depth == 0:
-                return (s[start:i], i + 1)
-    return None
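A short sketch of how the brace matcher is used to read consecutive arguments, such as the two groups of `\frac`. The function below is a standalone copy renamed `find_brace_group` for illustration; the sample string and the variable names are assumptions, not part of the removed module.

```python
from typing import Optional, Tuple


def find_brace_group(s: str, pos: int) -> Optional[Tuple[str, int]]:
    """Standalone copy of the helper above, renamed for this sketch."""
    while pos < len(s) and s[pos] in " \t":
        pos += 1
    if pos >= len(s) or s[pos] != "{":
        return None
    depth = 0
    for i in range(pos, len(s)):
        if s[i] == "{":
            depth += 1
        elif s[i] == "}":
            depth -= 1
            if depth == 0:
                # Content excludes the outer braces; the index points past "}".
                return s[pos + 1:i], i + 1
    return None


source = r"\frac{a_{i}}{b} + c"
after_command = len(r"\frac")  # index just past the command name
numerator = find_brace_group(source, after_command)
# The end index of the first group is where the second group starts.
denominator = find_brace_group(source, numerator[1])
unbalanced = find_brace_group("{a{b}", 0)
```

Returning the end index alongside the content is what lets the caller chain calls, which is the pattern the two-argument commands rely on. Note that escaped braces such as `\{` are counted like ordinary braces by this depth counter.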
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Command tables
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Commands whose Typst name equals the LaTeX name (just drop the backslash).
|
||||
SIMPLE_COMMANDS: set[str] = {
|
||||
# Greek
|
||||
"alpha", "beta", "gamma", "delta", "Delta",
|
||||
"epsilon", "zeta", "eta", "theta", "Theta",
|
||||
"iota", "kappa", "lambda", "Lambda",
|
||||
"mu", "nu", "xi", "Xi",
|
||||
"pi", "Pi", "rho",
|
||||
"sigma", "Sigma", "tau",
|
||||
"upsilon", "Upsilon",
|
||||
"phi", "Phi", "chi", "psi", "Psi",
|
||||
"omega", "Omega",
|
||||
# Operators / relations
|
||||
"times", "partial", "nabla", "in",
|
||||
"top", "prime",
|
||||
"forall", "exists", "approx", "equiv",
|
||||
"subset", "supset",
|
||||
# Big operators / functions
|
||||
"log", "ln", "exp", "sin", "cos", "tan",
|
||||
"min", "max", "lim", "sum",
|
||||
"det", "dim", "ker", "inf", "sup",
|
||||
}
|
||||
|
||||
# Commands that map to a *different* Typst identifier.
|
||||
RENAMED_COMMANDS: dict[str, str] = {
|
||||
"cdot": "dot.c",
|
||||
"cdots": "dots.c",
|
||||
"ldots": "dots",
|
||||
"dots": "dots",
|
||||
"to": "->",
|
||||
"rightarrow": "->",
|
||||
"leftarrow": "<-",
|
||||
"Rightarrow": "=>",
|
||||
"rightsquigarrow": "arrow.r.squiggly",
|
||||
"leq": "<=",
|
||||
"geq": ">=",
|
||||
"prod": "product",
|
||||
"notag": "",
|
||||
"quad": "quad",
|
||||
"qquad": "wide",
|
||||
"label": "", # consumed by :eqlabel: already
|
||||
"sim": "tilde.op",
|
||||
"infty": "infinity",
|
||||
"neq": "eq.not",
|
||||
"ast": "ast.op",
|
||||
"vdots": "dots.v",
|
||||
"ddots": "dots.down",
|
||||
"lVert": "||",
|
||||
"rVert": "||",
|
||||
"vert": "|",
|
||||
"lvert": "|",
|
||||
"rvert": "|",
|
||||
"mid": "|",
|
||||
"cap": "inter",
|
||||
"cup": "union",
|
||||
"le": "<=",
|
||||
"ge": ">=",
|
||||
"odot": "dot.o",
|
||||
"oplus": "plus.circle",
|
||||
"otimes": "times.circle",
|
||||
}
|
||||
|
||||
# \cmd{arg} → typst_func(arg)
|
||||
ONE_ARG_COMMANDS: dict[str, str] = {
|
||||
"boldsymbol": "bold",
|
||||
"mathcal": "cal",
|
||||
"mathbf": "bold",
|
||||
"mathbb": "bb",
|
||||
"hat": "hat",
|
||||
"bar": "overline",
|
||||
"dot": "dot",
|
||||
"tilde": "tilde",
|
||||
"sqrt": "sqrt",
|
||||
"overline": "overline",
|
||||
"pmb": "bold",
|
||||
"textbf": "bold",
|
||||
"textit": "italic",
|
||||
"bm": "bold",
|
||||
}
|
||||
|
||||
# \cmd{arg1}{arg2} → typst_func(arg1, arg2)
|
||||
TWO_ARG_COMMANDS: dict[str, str] = {
|
||||
"frac": "frac",
|
||||
"binom": "binom",
|
||||
}
|
||||
|
||||
# Delimiter-sizing commands to strip (the delimiter char after them is kept).
|
||||
_SIZING_COMMANDS: set[str] = {
|
||||
"left", "right", "bigg", "Bigg", "big", "Big", "biggl", "biggr",
|
||||
}
|
||||
|
||||
-# ---------------------------------------------------------------------------
-# Core single-pass converter
-# ---------------------------------------------------------------------------
-
-def _last_char(out: list[str]) -> str:
-    """Return the last non-empty character in the output buffer, or ``""``."""
-    for part in reversed(out):
-        if part:
-            return part[-1]
-    return ""
-
-
-def _emit(out: list[str], text: str) -> None:
-    """Append *text* to *out*, adding a space separator when needed.
-
-    In Typst math, consecutive letters form a multi-letter identifier which
-    will error if unknown. Similarly, letter→digit transitions form tokens
-    like ``W1``. This helper inserts spaces to prevent such merging, matching
-    LaTeX math semantics where adjacent characters are separate symbols.
-    """
-    if not text:
-        out.append(text)
-        return
-    lc = _last_char(out)
-    fc = text[0]
-    if lc and (
-        # letter→letter (e.g. "ou" → "o u")
-        (lc.isalpha() and fc.isalpha())
-        # letter→digit (e.g. "W1" → "W 1")
-        or (lc.isalpha() and fc.isdigit())
-        # digit→letter (e.g. "2x" → "2 x")
-        or (lc.isdigit() and fc.isalpha())
-        # )→letter/digit (e.g. "bold(X)y" → "bold(X) y")
-        or (lc == ")" and (fc.isalpha() or fc.isdigit()))
-    ):
-        out.append(" ")
-    out.append(text)
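The spacing rule in `_emit` can be tried out on its own. The following is a simplified sketch, assuming tokens arrive as a list of strings; `join_math_tokens` is a hypothetical name and works on a single string rather than the list buffer used above.

```python
def join_math_tokens(tokens):
    """Join tokens, inserting a space wherever two neighbours would merge."""
    out = ""
    for token in tokens:
        if not token:
            continue
        if out:
            last, first = out[-1], token[0]
            first_is_alnum = first.isalpha() or first.isdigit()
            merges = (
                (last.isalpha() and first_is_alnum)      # "ou" or "W1"
                or (last.isdigit() and first.isalpha())  # "2x"
                or (last == ")" and first_is_alnum)      # "bold(X)y"
            )
            if merges:
                out += " "
        out += token
    return out


spaced_letters = join_math_tokens(list("out"))
spaced_call = join_math_tokens(["bold(X)", "y", "^", "2"])
unchanged_digits = join_math_tokens(["1", "2"])
```

Here `spaced_letters` is `"o u t"`, `spaced_call` is `"bold(X) y^2"`, and adjacent digits stay joined as `"12"`, since only boundaries that would create a new multi-character identifier get a separator.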
|
||||
|
||||
|
||||
def _convert(s: str) -> str:
|
||||
"""Convert a single LaTeX math expression to Typst math."""
|
||||
out: list[str] = []
|
||||
i = 0
|
||||
n = len(s)
|
||||
|
||||
while i < n:
|
||||
ch = s[i]
|
||||
|
||||
# ---- backslash commands ----
|
||||
if ch == "\\" and i + 1 < n:
|
||||
nxt = s[i + 1]
|
||||
|
||||
# Double backslash: either markdown-escaped bracket or line-break
|
||||
if nxt == "\\":
|
||||
# \\{ \\} \\[ \\] \\( \\) → markdown-escaped LaTeX delimiters
|
||||
if i + 2 < n and s[i + 2] in "{}[]()":
|
||||
out.append(s[i + 2])
|
||||
i += 3
|
||||
continue
|
||||
out.append(" \\\n")
|
||||
i += 2
|
||||
continue
|
||||
|
||||
# Escaped characters: \{ \} \[ \] \( \) \, \; \! \ \.
|
||||
if nxt in "{}[]()":
|
||||
out.append(nxt)
|
||||
i += 2
|
||||
continue
|
||||
if nxt == ",":
|
||||
out.append("thin ") # thin space
|
||||
i += 2
|
||||
continue
|
||||
if nxt == ";":
|
||||
out.append("med ") # medium space
|
||||
i += 2
|
||||
continue
|
||||
if nxt == "!":
|
||||
out.append("") # negative thin space → ignore
|
||||
i += 2
|
||||
continue
|
||||
if nxt == " ":
|
||||
out.append(" ")
|
||||
i += 2
|
||||
continue
|
||||
if nxt == "\n":
|
||||
out.append(" ")
|
||||
i += 2
|
||||
continue
|
||||
|
||||
# Try to match an alphabetic command name
|
||||
m = re.match(r"[a-zA-Z]+", s[i + 1:])
|
||||
if not m:
|
||||
# Bare backslash before non-alpha → keep the char after
|
||||
out.append(nxt)
|
||||
i += 2
|
||||
continue
|
||||
|
||||
cmd = m.group()
|
||||
after = i + 1 + m.end()
|
||||
|
||||
# -- environments --
|
||||
if cmd == "begin":
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
env_name, env_pos = g
|
||||
end_marker = f"\\end{{{env_name}}}"
|
||||
end_idx = s.find(end_marker, env_pos)
|
||||
if end_idx != -1:
|
||||
body = s[env_pos:end_idx]
|
||||
i = end_idx + len(end_marker)
|
||||
_emit(out, _convert_environment(env_name, body))
|
||||
continue
|
||||
# Fallthrough: couldn't parse, skip \begin
|
||||
i = after
|
||||
continue
|
||||
|
||||
if cmd == "end":
|
||||
# Stray \end (shouldn't happen if \begin matched)
|
||||
g = _find_brace_group(s, after)
|
||||
i = g[1] if g else after
|
||||
continue
|
||||
|
||||
# -- special multi-arg commands --
|
||||
if cmd == "underset":
|
||||
# \underset{below}{base} → attach(base, b: below)
|
||||
g1 = _find_brace_group(s, after)
|
||||
if g1:
|
||||
below, p1 = g1
|
||||
g2 = _find_brace_group(s, p1)
|
||||
if g2:
|
||||
base, p2 = g2
|
||||
_emit(out, f"attach({_convert(base)}, b: {_convert(below)})")
|
||||
i = p2
|
||||
continue
|
||||
|
||||
if cmd == "overset":
|
||||
# \overset{above}{base} → attach(base, t: above)
|
||||
g1 = _find_brace_group(s, after)
|
||||
if g1:
|
||||
above, p1 = g1
|
||||
g2 = _find_brace_group(s, p1)
|
||||
if g2:
|
||||
base, p2 = g2
|
||||
_emit(out, f"attach({_convert(base)}, t: {_convert(above)})")
|
||||
i = p2
|
||||
continue
|
||||
|
||||
if cmd == "operatorname":
|
||||
# \operatorname{name} → op("name")
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
name, pos = g
|
||||
_emit(out, f'op("{name}")')
|
||||
i = pos
|
||||
continue
|
||||
|
||||
if cmd == "tag":
|
||||
# \tag{n} → visual equation number
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
content, pos = g
|
||||
_emit(out, f'quad upright("({content})")')
|
||||
i = pos
|
||||
continue
|
||||
|
||||
if cmd == "eqref":
|
||||
# \eqref{name} → show label name as fallback
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
content, pos = g
|
||||
_emit(out, f'upright("({content})")')
|
||||
i = pos
|
||||
continue
|
||||
|
||||
if cmd in ("mathrm", "text"):
|
||||
# \mathrm{text} → upright("text") — treat as text, not math
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
content, pos = g
|
||||
stripped = content.strip()
|
||||
if stripped:
|
||||
_emit(out, f'upright("{stripped}")')
|
||||
# else: empty mathrm (spacing hack) → drop
|
||||
i = pos
|
||||
continue
|
||||
|
||||
# -- two-arg commands --
|
||||
if cmd in TWO_ARG_COMMANDS:
|
||||
g1 = _find_brace_group(s, after)
|
||||
if g1:
|
||||
c1, p1 = g1
|
||||
g2 = _find_brace_group(s, p1)
|
||||
if g2:
|
||||
c2, p2 = g2
|
||||
func = TWO_ARG_COMMANDS[cmd]
|
||||
_emit(out, f"{func}({_convert(c1)}, {_convert(c2)})")
|
||||
i = p2
|
||||
continue
|
||||
# Fallthrough
|
||||
_emit(out, cmd)
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- one-arg commands --
|
||||
if cmd in ONE_ARG_COMMANDS:
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
content, pos = g
|
||||
func = ONE_ARG_COMMANDS[cmd]
|
||||
_emit(out, f"{func}({_convert(content)})")
|
||||
i = pos
|
||||
continue
|
||||
# Fallthrough: no brace group → just emit the typst name
|
||||
_emit(out, ONE_ARG_COMMANDS[cmd])
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- \rm (applies upright to the rest of the current scope) --
|
||||
if cmd == "rm":
|
||||
raw_rest = s[after:]
|
||||
leading = len(raw_rest) - len(raw_rest.lstrip())
|
||||
rest = raw_rest.lstrip()
|
||||
# Grab one "word"
|
||||
wm = re.match(r"[A-Za-z0-9]+", rest)
|
||||
if wm:
|
||||
word = wm.group()
|
||||
_emit(out, f"upright({word})")
|
||||
i = after + leading + len(word)
|
||||
continue
|
||||
_emit(out, "upright")
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- delimiter sizing --
|
||||
if cmd in _SIZING_COMMANDS:
|
||||
# Skip the command; keep whatever delimiter follows.
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- simple (same name) --
|
||||
if cmd in SIMPLE_COMMANDS:
|
||||
_emit(out, cmd)
|
||||
# Also add right-side space when next char would merge
|
||||
if after < n and (s[after].isalnum() or s[after] == "\\"):
|
||||
out.append(" ")
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- renamed --
|
||||
if cmd in RENAMED_COMMANDS:
|
||||
repl = RENAMED_COMMANDS[cmd]
|
||||
if repl:
|
||||
_emit(out, repl)
|
||||
if after < n and s[after].isalnum():
|
||||
out.append(" ")
|
||||
# If repl is empty the command is silently dropped.
|
||||
# For \label{...} consume the brace group too.
|
||||
if cmd == "label":
|
||||
g = _find_brace_group(s, after)
|
||||
if g:
|
||||
i = g[1]
|
||||
continue
|
||||
i = after
|
||||
continue
|
||||
|
||||
# -- unknown command → emit name without backslash --
|
||||
_emit(out, cmd)
|
||||
if after < n and s[after].isalnum():
|
||||
out.append(" ")
|
||||
i = after
|
||||
continue
|
||||
|
||||
# ---- brace groups (not consumed by a command) ----
|
||||
if ch == "{":
|
||||
g = _find_brace_group(s, i)
|
||||
if g:
|
||||
content, end = g
|
||||
# Check if preceded by ^ or _ → superscript/subscript grouping
|
||||
if out and out[-1] and out[-1][-1] in "^_":
|
||||
out.append(f"({_convert(content)})")
|
||||
i = end
|
||||
continue
|
||||
# Check for {\rm ...} pattern
|
||||
rm_m = re.match(r"\\rm\s+", content)
|
||||
if rm_m:
|
||||
inner = content[rm_m.end():]
|
||||
_emit(out, f"upright({_convert(inner)})")
|
||||
i = end
|
||||
continue
|
||||
# Otherwise, just emit the converted content (braces act as
|
||||
# invisible grouping in LaTeX — no Typst equivalent needed
|
||||
# in most contexts).
|
||||
_emit(out, _convert(content))
|
||||
i = end
|
||||
continue
|
||||
# Unmatched brace — emit as-is
|
||||
out.append(ch)
|
||||
i += 1
|
||||
continue
|
||||
|
||||
# ---- everything else (digits, letters, operators, whitespace) ----
|
||||
# Use _emit so consecutive raw letters get spaces inserted,
|
||||
# matching LaTeX math semantics where adjacent letters are
|
||||
# separate variables (e.g. "out" → "o u t" in Typst).
|
||||
_emit(out, ch)
|
||||
i += 1
|
||||
|
||||
result = "".join(out)
|
||||
# Typst math requires a base before ^ or _; add an invisible base
|
||||
# when the expression starts with a script marker (e.g. $^2$).
|
||||
if result and result.lstrip() and result.lstrip()[0] in "^_":
|
||||
result = '""' + result.lstrip()
|
||||
return result
|
||||
|
||||
|
||||
-# ---------------------------------------------------------------------------
-# Environment converters
-# ---------------------------------------------------------------------------
-
-def _convert_environment(name: str, body: str) -> str:
-    """Convert a ``\\begin{name}…\\end{name}`` block to Typst."""
-    if name in ("matrix", "bmatrix", "pmatrix", "vmatrix"):
-        return _convert_matrix_env(name, body)
-    if name == "cases":
-        return _convert_cases_env(body)
-    if name in ("aligned", "split"):
-        # Just unwrap; Typst math handles & alignment and \ line-breaks.
-        converted = _convert(body)
-        return converted
-    if name == "figure":
-        # Not real math; pass through as-is.
-        return f"\\begin{{{name}}}{body}\\end{{{name}}}"
-    # Unknown environment: pass through converted content
-    return _convert(body)
-
-
-def _convert_matrix_env(name: str, body: str) -> str:
-    """Convert matrix/bmatrix/pmatrix/vmatrix to ``mat(…)``."""
-    delim_map = {
-        "matrix": "",
-        "bmatrix": '"["',
-        "pmatrix": '"("',
-        "vmatrix": '"|"',
-    }
-    # Split rows on \\, columns on &
-    rows: list[str] = []
-    for row_text in re.split(r"\\\\", body):
-        row_text = row_text.strip()
-        if not row_text:
-            continue
-        cells = [_convert(c.strip()) for c in row_text.split("&")]
-        rows.append(", ".join(cells))
-
-    inner = "; ".join(rows)
-    delim = delim_map.get(name, "")
-    if delim:
-        return f"mat(delim: {delim}, {inner})"
-    return f"mat({inner})"
-
-
-def _convert_cases_env(body: str) -> str:
-    """Convert cases environment to ``cases(…)``."""
-    branches: list[str] = []
-    for branch_text in re.split(r"\\\\", body):
-        branch_text = branch_text.strip()
-        if not branch_text:
-            continue
-        branches.append(_convert(branch_text))
-    return "cases(" + ", ".join(branches) + ")"
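The row and column splitting used for matrices can be shown without the rest of the converter. This sketch assumes cell contents need no further conversion (the recursive `_convert` call is left out), and `matrix_to_typst` is a hypothetical name.

```python
import re


def matrix_to_typst(body: str, delim: str = "") -> str:
    """Turn a LaTeX matrix body into a Typst ``mat(...)`` call."""
    rows = []
    # Rows are separated by a LaTeX line break (two backslashes).
    for row_text in re.split(r"\\\\", body):
        row_text = row_text.strip()
        if not row_text:
            continue  # a trailing line break leaves an empty final row
        cells = [cell.strip() for cell in row_text.split("&")]
        rows.append(", ".join(cells))
    inner = "; ".join(rows)
    if delim:
        return f'mat(delim: "{delim}", {inner})'
    return f"mat({inner})"


bracketed = matrix_to_typst(r"a & b \\ c & d", delim="[")
plain = matrix_to_typst(r"1 & 0 \\ 0 & 1 \\")
```

Typst separates cells with commas and rows with semicolons, so `bracketed` becomes `mat(delim: "[", a, b; c, d)` and `plain` becomes `mat(1, 0; 0, 1)`.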
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Markdown-level math-span detection
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_FENCE_RE = re.compile(r"^(`{3,}|~{3,})", re.MULTILINE)
|
||||
|
||||
|
||||
def _iter_math_spans(content: str):
|
||||
"""Yield ``(start, end, is_display)`` for every math span.
|
||||
|
||||
Skips spans inside fenced code blocks and inline code.
|
||||
"""
|
||||
n = len(content)
|
||||
i = 0
|
||||
in_fence: str | None = None # fence marker when inside a code block
|
||||
|
||||
while i < n:
|
||||
# Track fenced code blocks
|
||||
if content[i] == "`" or content[i] == "~":
|
||||
m = _FENCE_RE.match(content, i)
|
||||
if m and (i == 0 or content[i - 1] == "\n"):
|
||||
marker = m.group(1)
|
||||
if in_fence is None:
|
||||
in_fence = marker[0] # opening
|
||||
i = content.index("\n", i) + 1 if "\n" in content[i:] else n
|
||||
continue
|
||||
elif marker[0] == in_fence:
|
||||
in_fence = None # closing
|
||||
i = m.end()
|
||||
continue
|
||||
|
||||
if in_fence:
|
||||
i += 1
|
||||
continue
|
||||
|
||||
# Skip inline code
|
||||
if content[i] == "`":
|
||||
end_tick = content.find("`", i + 1)
|
||||
if end_tick != -1:
|
||||
i = end_tick + 1
|
||||
continue
|
||||
|
||||
# Display math $$...$$
|
||||
if content[i:i + 2] == "$$":
|
||||
start = i
|
||||
close = content.find("$$", i + 2)
|
||||
if close != -1:
|
||||
yield (start + 2, close, True)
|
||||
i = close + 2
|
||||
continue
|
||||
|
||||
# Inline math $...$
|
||||
if content[i] == "$":
|
||||
start = i
|
||||
# Find closing $ — any next $ closes the span (even if followed
|
||||
# by another $, which starts a NEW span).
|
||||
j = i + 1
|
||||
while j < n:
|
||||
if content[j] == "$":
|
||||
if j > i + 1: # non-empty
|
||||
yield (start + 1, j, False)
|
||||
j += 1
|
||||
break
|
||||
if content[j] == "\n" and not content[i + 1:j].strip():
|
||||
break # empty line → not math
|
||||
j += 1
|
||||
i = j
|
||||
continue
|
||||
|
||||
i += 1
|
||||
|
||||
|
||||
_CJK_RE = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff]")
|
||||
|
||||
|
||||
def convert_latex_math_to_typst(content: str) -> str:
|
||||
"""Replace LaTeX math with Typst math throughout *content* (markdown)."""
|
||||
spans = list(_iter_math_spans(content))
|
||||
if not spans:
|
||||
return content
|
||||
|
||||
parts: list[str] = []
|
||||
prev = 0
|
||||
for start, end, is_display in spans:
|
||||
delim = "$$" if is_display else "$"
|
||||
delim_len = len(delim)
|
||||
delim_start = start - delim_len
|
||||
|
||||
latex = content[start:end]
|
||||
|
||||
# Spans containing CJK characters are almost certainly mismatched $.
|
||||
# Strip the $ delimiters and emit the raw text.
|
||||
if _CJK_RE.search(latex):
|
||||
parts.append(content[prev:delim_start])
|
||||
parts.append(latex) # emit without $ delimiters
|
||||
prev = end + delim_len
|
||||
continue
|
||||
|
||||
parts.append(content[prev:delim_start])
|
||||
converted = _convert(latex)
|
||||
# Strip leading/trailing whitespace from inline math so that
|
||||
# ``$ text$`` (space after opening $) never occurs — CommonMark
|
||||
# and mdbook-typst-math treat that as non-math.
|
||||
if not is_display:
|
||||
converted = converted.strip()
|
||||
parts.append(f"{delim}{converted}{delim}")
|
||||
|
||||
prev = end + delim_len
|
||||
|
||||
parts.append(content[prev:])
|
||||
return "".join(parts)
|
||||
@@ -5,9 +5,9 @@ import sys
|
||||
 from pathlib import Path
 
 try:
-    from tools.prepare_mdbook import build_title_cache, parse_bib, rewrite_markdown
+    from tools.prepare_mdbook import build_title_cache, collect_figure_labels, collect_labels, convert_math_to_mathjax, parse_bib, rewrite_markdown
 except ModuleNotFoundError:
-    from prepare_mdbook import build_title_cache, parse_bib, rewrite_markdown
+    from prepare_mdbook import build_title_cache, collect_figure_labels, collect_labels, convert_math_to_mathjax, parse_bib, rewrite_markdown
 
 
 PLACEHOLDER_PREFIX = "[TODO: src = zh_chapters/"
@@ -43,7 +43,25 @@ def main() -> int:
     for key, fields in parse_bib(extra_bib).items():
         bib_db.setdefault(key, fields)
 
-    for chapter in iter_chapters(book.get("items", [])):
+    chapters = iter_chapters(book.get("items", []))
+
+    # Pass 1: collect all :label: directives and figure labels
+    ref_label_map: dict[str, str] = {}
+    fig_number_map: dict[str, str] = {}
+    for chapter in chapters:
+        source_path = chapter.get("source_path") or chapter.get("path")
+        if not source_path:
+            continue
+        for label in collect_labels(chapter["content"]):
+            ref_label_map.setdefault(label, source_path)
+        number = chapter.get("number")
+        if number:
+            prefix = ".".join(str(n) for n in number)
+            for idx, label in enumerate(collect_figure_labels(chapter["content"]), 1):
+                fig_number_map[label] = f"{prefix}.{idx}"
+
+    # Pass 2: rewrite markdown with cross-reference linking
+    for chapter in chapters:
         source_path = chapter.get("source_path") or chapter.get("path")
         if not source_path:
             continue
@@ -56,7 +74,11 @@ def main() -> int:
             bibliography_title=BIBLIOGRAPHY_TITLE,
             frontpage_switch_label=FRONTPAGE_SWITCH_LABEL,
             frontpage_switch_href=FRONTPAGE_SWITCH_HREF,
+            ref_label_map=ref_label_map,
+            current_source_path=source_path,
+            fig_number_map=fig_number_map,
         )
+        chapter["content"] = convert_math_to_mathjax(chapter["content"])
 
     json.dump(book, sys.stdout, ensure_ascii=False)
     return 0
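The two-pass scheme in `main` (collect every label first, then rewrite) can be sketched on toy data. `iter_chapters` is not shown in this diff, so the walker below, `iter_demo_chapters`, is a hypothetical generator, and the `figures` key is an assumption standing in for labels extracted from chapter content. If the real walker is also a generator, it would need to be materialized before being looped over twice, as done here with `list(...)`.

```python
def iter_demo_chapters(items):
    """Hypothetical generator standing in for a chapter walker."""
    for item in items:
        yield item
        yield from iter_demo_chapters(item.get("sub_items", []))


book = [
    {"path": "a.md", "number": [1], "figures": ["fig_a"], "sub_items": [
        {"path": "a/b.md", "number": [1, 1], "figures": ["fig_b1", "fig_b2"], "sub_items": []},
    ]},
]

# A generator is exhausted after one loop, so materialize it before two passes.
chapters = list(iter_demo_chapters(book))

# Pass 1: give every figure label a chapter-scoped number.
fig_number_map = {}
for chapter in chapters:
    prefix = ".".join(str(n) for n in chapter["number"])
    for idx, label in enumerate(chapter["figures"], 1):
        fig_number_map[label] = f"{prefix}.{idx}"

# Pass 2: any chapter can now resolve a label defined in a later chapter.
resolved = [(chapter["path"], fig_number_map["fig_b2"]) for chapter in chapters]

# Walking the same generator object twice yields nothing the second time.
lazy = iter_demo_chapters(book)
first_pass = len(list(lazy))
second_pass = len(list(lazy))
```

The first pass is what makes forward references possible: `a.md` can link to `fig_b2` even though that figure lives in a chapter processed after it.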
|
||||
|
||||
@@ -5,11 +5,9 @@ import sys
|
||||
 from pathlib import Path
 
 try:
-    from tools.prepare_mdbook import build_title_cache, parse_bib, rewrite_markdown
-    from tools.latex_to_typst import convert_latex_math_to_typst
+    from tools.prepare_mdbook import build_title_cache, collect_figure_labels, collect_labels, convert_math_to_mathjax, parse_bib, rewrite_markdown
 except ModuleNotFoundError:
-    from prepare_mdbook import build_title_cache, parse_bib, rewrite_markdown
-    from latex_to_typst import convert_latex_math_to_typst
+    from prepare_mdbook import build_title_cache, collect_figure_labels, collect_labels, convert_math_to_mathjax, parse_bib, rewrite_markdown
 
 
 BIBLIOGRAPHY_TITLE = "参考文献"
@@ -44,7 +42,25 @@ def main() -> int:
     for key, fields in parse_bib(extra_bib).items():
         bib_db.setdefault(key, fields)
 
-    for chapter in iter_chapters(book.get("items", [])):
+    chapters = iter_chapters(book.get("items", []))
+
+    # Pass 1: collect all :label: directives and figure labels
+    ref_label_map: dict[str, str] = {}
+    fig_number_map: dict[str, str] = {}
+    for chapter in chapters:
+        source_path = chapter.get("source_path") or chapter.get("path")
+        if not source_path:
+            continue
+        for label in collect_labels(chapter["content"]):
+            ref_label_map.setdefault(label, source_path)
+        number = chapter.get("number")
+        if number:
+            prefix = ".".join(str(n) for n in number)
+            for idx, label in enumerate(collect_figure_labels(chapter["content"]), 1):
+                fig_number_map[label] = f"{prefix}.{idx}"
+
+    # Pass 2: rewrite markdown with cross-reference linking
+    for chapter in chapters:
         source_path = chapter.get("source_path") or chapter.get("path")
         if not source_path:
             continue
@@ -57,8 +73,11 @@ def main() -> int:
             bibliography_title=BIBLIOGRAPHY_TITLE,
             frontpage_switch_label=FRONTPAGE_SWITCH_LABEL,
             frontpage_switch_href=FRONTPAGE_SWITCH_HREF,
+            ref_label_map=ref_label_map,
+            current_source_path=source_path,
+            fig_number_map=fig_number_map,
         )
-        chapter["content"] = convert_latex_math_to_typst(chapter["content"])
+        chapter["content"] = convert_math_to_mathjax(chapter["content"])
 
     json.dump(book, sys.stdout, ensure_ascii=False)
     return 0
|
||||
|
||||
@@ -4,13 +4,16 @@ import argparse
|
||||
 import os
 import re
 from dataclasses import dataclass
-from pathlib import Path
+from pathlib import Path, PurePosixPath
 
 
 TOC_FENCE = "toc"
 EVAL_RST_FENCE = "eval_rst"
-OPTION_LINE_RE = re.compile(r"^:(width|label):`[^`]+`\s*$", re.MULTILINE)
+WIDTH_LINE_RE = re.compile(r"^:width:`[^`]+`\s*$", re.MULTILINE)
+LABEL_RE = re.compile(r":label:`([^`]+)`")
 NUMREF_RE = re.compile(r":numref:`([^`]+)`")
+IMAGE_LINE_RE = re.compile(r"^!\[([^\]]*)\]\(([^)]+)\)\s*$")
+LABEL_LINE_RE = re.compile(r"^:label:`([^`]+)`\s*$")
 EQREF_RE = re.compile(r":eqref:`([^`]+)`")
 EQLABEL_LINE_RE = re.compile(r"^:eqlabel:`([^`]+)`\s*$")
 CITE_RE = re.compile(r":cite:`([^`]+)`")
|
||||
@@ -343,12 +346,110 @@ def process_equation_labels(markdown: str) -> tuple[str, dict[str, int]]:
|
||||
     return "\n".join(result), label_map
 
 
+def collect_labels(markdown: str) -> list[str]:
+    """Extract all label names from :label: directives."""
+    return LABEL_RE.findall(markdown)
+
+
+def collect_figure_labels(markdown: str) -> list[str]:
+    """Return label names for figures (image lines followed by :label:)."""
+    labels: list[str] = []
+    lines = markdown.splitlines()
+    for i, line in enumerate(lines):
+        if not IMAGE_LINE_RE.match(line.strip()):
+            continue
+        j = i + 1
+        while j < len(lines):
+            s = lines[j].strip()
+            if not s or WIDTH_LINE_RE.match(s):
+                j += 1
+                continue
+            m = LABEL_LINE_RE.match(s)
+            if m:
+                labels.append(m.group(1))
+            break
+    return labels
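A small sketch of the figure-label scan on sample Markdown. The three regular expressions are copied from the constants in this diff; the sample text, the `figure_labels` name, and the chapter prefix `2.1` are assumptions made for illustration.

```python
import re

IMAGE_LINE_RE = re.compile(r"^!\[([^\]]*)\]\(([^)]+)\)\s*$")
WIDTH_LINE_RE = re.compile(r"^:width:`[^`]+`\s*$")
LABEL_LINE_RE = re.compile(r"^:label:`([^`]+)`\s*$")


def figure_labels(markdown):
    """Collect labels that directly follow an image line."""
    labels = []
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        if not IMAGE_LINE_RE.match(line.strip()):
            continue
        # Skip blank lines and :width: options, then inspect the first real line.
        for follower in lines[i + 1:]:
            s = follower.strip()
            if not s or WIDTH_LINE_RE.match(s):
                continue
            m = LABEL_LINE_RE.match(s)
            if m:
                labels.append(m.group(1))
            break
    return labels


sample = "\n".join([
    "![Framework overview](img/overview.png)",
    ":width:`600px`",
    ":label:`fig_overview`",
    "",
    "Some text with :label:`sec_intro` that is not a figure.",
    "",
    "![Unlabelled](img/plain.png)",
    "",
    "A paragraph follows, so this image gets no label.",
])

labels = figure_labels(sample)
# Numbering mirrors the f"{prefix}.{idx}" scheme, with a made-up prefix.
numbers = {label: f"2.1.{idx}" for idx, label in enumerate(labels, 1)}
```

Only `fig_overview` is collected: the inline `sec_intro` label is not on its own line after an image, and the second image is followed by ordinary text, which ends the look-ahead.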
+
+
+def process_figure_captions(
+    markdown: str,
+    fig_number_map: dict[str, str] | None = None,
+) -> str:
+    """Convert image+label blocks into figures with anchors and captions."""
+    lines = markdown.splitlines()
+    result: list[str] = []
+    i = 0
+    while i < len(lines):
+        img_match = IMAGE_LINE_RE.match(lines[i].strip())
+        if img_match:
+            caption = img_match.group(1)
+            img_line = lines[i]
+            # Look ahead for :width: and :label:
+            j = i + 1
+            label = None
+            while j < len(lines):
+                s = lines[j].strip()
+                if not s or WIDTH_LINE_RE.match(s):
+                    j += 1
+                    continue
+                m = LABEL_LINE_RE.match(s)
+                if m:
+                    label = m.group(1)
+                    j += 1
+                break
+
+            if label:
+                fig_num = (fig_number_map or {}).get(label)
+                result.append(f'<a id="{label}"></a>')
+                result.append("")
+                result.append(img_line)
+                if fig_num and caption:
+                    result.append("")
+                    result.append(f'<p align="center">图{fig_num} {caption}</p>')
+                elif fig_num:
+                    result.append("")
+                    result.append(f'<p align="center">图{fig_num}</p>')
+                elif caption:
+                    result.append("")
+                    result.append(f'<p align="center">{caption}</p>')
+                i = j
+                continue
+
+        result.append(lines[i])
+        i += 1
+    return "\n".join(result)
+
+
+def _relative_chapter_path(from_path: str, to_path: str) -> str:
+    """Compute relative path between two mdbook source_paths."""
+    if from_path == to_path:
+        return ""
+    from_dir = str(PurePosixPath(from_path).parent)
+    return PurePosixPath(os.path.relpath(to_path, start=from_dir)).as_posix()
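How a `:numref:` target becomes a relative link can be sketched as follows. This is an illustration under assumptions: it uses `posixpath.relpath` so separators stay as `/` on every platform, the display text uses "Figure" in place of the `图` prefix, and `relative_chapter_path` and `numref_link` are hypothetical names.

```python
import posixpath


def relative_chapter_path(from_path, to_path):
    """Relative POSIX path from the directory of *from_path* to *to_path*."""
    if from_path == to_path:
        return ""
    from_dir = posixpath.dirname(from_path)
    return posixpath.relpath(to_path, start=from_dir or ".")


def numref_link(name, current, label_paths, fig_numbers):
    """Render a cross-reference as a Markdown link, or as code if unknown."""
    if name not in label_paths:
        return f"`{name}`"
    rel = relative_chapter_path(current, label_paths[name])
    display = f"Figure {fig_numbers[name]}" if name in fig_numbers else name
    # An empty relative path means the anchor is on the current page.
    return f"[{display}]({rel}#{name})"


label_paths = {"fig_overview": "chapter_b/index.md", "sec_intro": "chapter_a/intro.md"}
fig_numbers = {"fig_overview": "2.1"}

cross_chapter = numref_link("fig_overview", "chapter_a/intro.md", label_paths, fig_numbers)
same_page = numref_link("sec_intro", "chapter_a/intro.md", label_paths, fig_numbers)
unknown = numref_link("missing", "chapter_a/intro.md", label_paths, fig_numbers)
```

Labels without a figure number fall back to the raw label name as link text, and unknown labels degrade to inline code rather than a broken link, matching the fallback order in `_numref_replace`.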
+
+
 def normalize_directives(
     markdown: str,
     label_map: dict[str, int] | None = None,
+    ref_label_map: dict[str, str] | None = None,
+    current_source_path: str | None = None,
+    fig_number_map: dict[str, str] | None = None,
 ) -> str:
-    normalized = OPTION_LINE_RE.sub("", markdown)
-    normalized = NUMREF_RE.sub(lambda match: f"`{match.group(1)}`", normalized)
+    normalized = WIDTH_LINE_RE.sub("", markdown)
+    normalized = LABEL_RE.sub(lambda m: f'<a id="{m.group(1)}"></a>', normalized)
+
+    def _numref_replace(match: re.Match[str]) -> str:
+        name = match.group(1)
+        if ref_label_map and current_source_path and name in ref_label_map:
+            target_path = ref_label_map[name]
+            rel = _relative_chapter_path(current_source_path, target_path)
+            display = f"图{fig_number_map[name]}" if fig_number_map and name in fig_number_map else name
+            if rel:
+                return f"[{display}]({rel}#{name})"
+            return f"[{display}](#{name})"
+        return f"`{name}`"
+
+    normalized = NUMREF_RE.sub(_numref_replace, normalized)
     if label_map:
         normalized = EQREF_RE.sub(
             lambda m: f"({label_map[m.group(1)]})" if m.group(1) in label_map else f"$\\eqref{{{m.group(1)}}}$",
|
||||
@@ -509,6 +610,121 @@ def process_citations(
|
||||
     return processed
 
 
+_FENCE_RE = re.compile(r"^(`{3,}|~{3,})", re.MULTILINE)
+_CJK_RE = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff]")
+
+
+def _iter_math_spans(content: str):
+    """Yield ``(start, end, is_display)`` for every math span.
+
+    Skips spans inside fenced code blocks and inline code.
+    """
+    n = len(content)
+    i = 0
+    in_fence: str | None = None  # fence marker when inside a code block
+
+    while i < n:
+        # Track fenced code blocks
+        if content[i] == "`" or content[i] == "~":
+            m = _FENCE_RE.match(content, i)
+            if m and (i == 0 or content[i - 1] == "\n"):
+                marker = m.group(1)
+                if in_fence is None:
+                    in_fence = marker[0]  # opening
+                    i = content.index("\n", i) + 1 if "\n" in content[i:] else n
+                    continue
+                elif marker[0] == in_fence:
+                    in_fence = None  # closing
+                    i = m.end()
+                    continue
+
+        if in_fence:
+            i += 1
+            continue
+
+        # Skip inline code
+        if content[i] == "`":
+            end_tick = content.find("`", i + 1)
+            if end_tick != -1:
+                i = end_tick + 1
+                continue
+
+        # Display math $$...$$
+        if content[i:i + 2] == "$$":
+            start = i
+            close = content.find("$$", i + 2)
+            if close != -1:
+                yield (start + 2, close, True)
+                i = close + 2
+                continue
+
+        # Inline math $...$
+        if content[i] == "$":
+            start = i
+            j = i + 1
+            while j < n:
+                if content[j] == "$":
+                    if j > i + 1:  # non-empty
+                        yield (start + 1, j, False)
+                    j += 1
+                    break
+                if content[j] == "\n" and not content[i + 1:j].strip():
+                    break  # empty line → not math
+                j += 1
+            i = j
+            continue
+
+        i += 1
+
+
+def convert_math_to_mathjax(content: str) -> str:
+    """Replace ``$``/``$$`` delimited math with MathJax ``\\(…\\)``/``\\[…\\]``.
+
+    Inside math content, ``\\`` (LaTeX newline) is doubled to ``\\\\`` so that
+    mdBook's markdown processing (which consumes one level of backslash
+    escaping) delivers the correct ``\\`` to MathJax.
+    """
+    spans = list(_iter_math_spans(content))
+    if not spans:
+        return content
+
+    parts: list[str] = []
+    prev = 0
+    for start, end, is_display in spans:
+        delim = "$$" if is_display else "$"
+        delim_len = len(delim)
+        delim_start = start - delim_len
+
+        math = content[start:end]
+
+        # Spans containing CJK characters are almost certainly mismatched $.
+        # Strip the $ delimiters and emit the raw text.
+        if _CJK_RE.search(math):
+            parts.append(content[prev:delim_start])
+            parts.append(math)
+            prev = end + delim_len
+            continue
+
+        parts.append(content[prev:delim_start])
+
+        # Double backslashes inside math so that after mdBook markdown
+        # processing (which eats one backslash layer) MathJax sees the
+        # original LaTeX.
+        math = math.replace("\\\\", "\\\\\\\\")
+        math = math.replace("*", "\\*")
+        math = math.replace("_", "\\_")
+
+        if is_display:
+            parts.append(f"\\\\[{math}\\\\]")
+        else:
+            parts.append(f"\\\\({math}\\\\)")
+
+        prev = end + delim_len
+
+    parts.append(content[prev:])
+    return "".join(parts)
|
||||
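The escaping steps in `convert_math_to_mathjax` can be tried in isolation. The sketch below is a minimal illustration, not the code from this commit: `wrap_math_for_mdbook` is a hypothetical helper that repeats only the three `replace` calls and the delimiter wrapping, and it leaves out span detection and the CJK fallback.

```python
def wrap_math_for_mdbook(math: str, is_display: bool) -> str:
    """Hypothetical helper: escape one math span and wrap it in MathJax delimiters."""
    # A LaTeX line break (two backslashes) becomes four backslashes,
    # because markdown processing consumes one escaping layer.
    math = math.replace("\\\\", "\\\\\\\\")
    # Escape markdown emphasis markers so they reach MathJax unchanged.
    math = math.replace("*", "\\*")
    math = math.replace("_", "\\_")
    if is_display:
        return f"\\\\[{math}\\\\]"
    return f"\\\\({math}\\\\)"


display = wrap_math_for_mdbook("n*CHW + c*HW", is_display=True)
inline = wrap_math_for_mdbook("x_i", is_display=False)
print(display)  # \\[n\*CHW + c\*HW\\]
print(inline)   # \\(x\_i\\)
```

The printed strings are what ends up in the intermediate markdown; after markdown processing removes one backslash layer, MathJax should see `\[n*CHW + c*HW\]` and `\(x_i\)`.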
|
||||
|
||||
def resolve_raw_html_file(current_file: Path, filename: str) -> Path:
    direct = (current_file.parent / filename).resolve()
    if direct.exists():
|
||||
@@ -628,6 +844,9 @@ def rewrite_markdown(
|
||||
    bibliography_title: str = DEFAULT_BIBLIOGRAPHY_TITLE,
    frontpage_switch_label: str | None = None,
    frontpage_switch_href: str | None = None,
    ref_label_map: dict[str, str] | None = None,
    current_source_path: str | None = None,
    fig_number_map: dict[str, str] | None = None,
) -> str:
    output: list[str] = []
    lines = markdown.splitlines()
|
||||
@@ -676,7 +895,14 @@ def rewrite_markdown(
|
||||
|
||||
    raw = "\n".join(output) + "\n"
    result, label_map = process_equation_labels(raw)
    result = normalize_directives(result, label_map=label_map)
    result = process_figure_captions(result, fig_number_map=fig_number_map)
    result = normalize_directives(
        result,
        label_map=label_map,
        ref_label_map=ref_label_map,
        current_source_path=current_source_path,
        fig_number_map=fig_number_map,
    )
    result = process_citations(result, bib_db or {}, bibliography_title=bibliography_title)
    return result
|
||||
|
||||
|
||||
@@ -14,8 +14,3 @@
|
||||
- CUDA programming guide [CUDA](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html)
|
||||
- Ascend community [Ascend](https://gitee.com/ascend)
|
||||
- Progress in MLIR applications [MLIR](https://mlir.llvm.org/talks)
|
||||
|
||||
|
||||
## References
|
||||
|
||||
:bibliography:`../references/accelerator.bib`
|
||||
@@ -28,15 +28,19 @@
|
||||
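However, computer memory cannot hold such a matrix directly; it must first be flattened to one dimension. This raises the question of how a logical index maps to a memory index, that is, how to derive the 1-D offset in memory from the logical data index.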
|
||||
|
||||
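For NCHW data, elements are laid out along the W axis first, then the H axis, then the C axis, and finally the N axis. The mapping between logical storage and physical storage is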
|
||||
|
||||
$$offsetnchw(n,c,h,w) = n*CHW + c*HW + h*W +w$$
|
||||
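As shown in :numref:`nchw`, this format unrolls along the lowest dimension, the W axis, so elements adjacent along W are also adjacent in memory. To fetch the element at the same position in the next image, one must skip over the size of an entire image ($C*H*W$). For example, with 8 RGB images of 32\*32, we have $N=8,C=3,H=32,W=32$. To store them in memory, the data is first unrolled along the W axis and then arranged along the H axis, which completes one channel; the next channel is handled the same way. Once all channels are done, the next image is processed. PyTorch and MindSpore use the NCHW format by default.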
|
||||
|
||||
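As shown in :numref:`nchw`, this format unrolls along the lowest dimension, the W axis, so elements adjacent along W are also adjacent in memory. To fetch the element at the same position in the next image, one must skip over the size of an entire image, $C*H*W$. For example, with 8 RGB images of 32\*32, we have $N=8,C=3,H=32,W=32$. To store them in memory, the data is first unrolled along the W axis and then arranged along the H axis, which completes one channel; the next channel is handled the same way. Once all channels are done, the next image is processed. PyTorch and MindSpore use the NCHW format by default.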
|
||||
|
||||

|
||||
:width:`800px`
|
||||
:label:`nchw`
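The NCHW mapping can be checked numerically. The sketch below is an illustration only: `offset_nchw` is a hypothetical helper name that transcribes the formula above, and the shape is the one from the text (8 RGB images of 32\*32).

```python
def offset_nchw(n: int, c: int, h: int, w: int, C: int, H: int, W: int) -> int:
    """Flat memory offset of logical index (n, c, h, w) in an NCHW buffer."""
    return n * C * H * W + c * H * W + h * W + w


# Shape from the example in the text: N=8, C=3, H=32, W=32.
N, C, H, W = 8, 3, 32, 32

origin = offset_nchw(0, 0, 0, 0, C, H, W)
step_w = offset_nchw(0, 0, 0, 1, C, H, W) - origin  # next element along W
step_n = offset_nchw(1, 0, 0, 0, C, H, W) - origin  # same position, next image
last = offset_nchw(N - 1, C - 1, H - 1, W - 1, C, H, W)

print(step_w, step_n, last)  # 1 3072 24575
```

Neighbors along W are one element apart, while the same position in the next image is $C*H*W = 3072$ elements away, and the last offset equals $N*C*H*W - 1$.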
|
||||
|
||||
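The NHWC format is analogous: data is taken along the C axis first, then W, then H, and finally N. NHWC is the default data format in TensorFlow. In PyTorch this format is called Channels Last.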
|
||||
|
||||
$$offsetnhwc(n,h,w,c) = n*HWC + h*WC + w*C +c$$
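For comparison, the NHWC formula can be evaluated the same way. This is a minimal sketch; `offset_nhwc` is a hypothetical helper name, and the shape reuses the 8-image RGB example from the NCHW discussion.

```python
def offset_nhwc(n: int, h: int, w: int, c: int, H: int, W: int, C: int) -> int:
    """Flat memory offset of logical index (n, h, w, c) in an NHWC buffer."""
    return n * H * W * C + h * W * C + w * C + c


# Same shape as the NCHW example: N=8, H=32, W=32, C=3.
N, H, W, C = 8, 32, 32, 3

step_c = offset_nhwc(0, 0, 0, 1, H, W, C)  # next channel of the same pixel
step_w = offset_nhwc(0, 0, 1, 0, H, W, C)  # next pixel along W
step_h = offset_nhwc(0, 1, 0, 0, H, W, C)  # next row
step_n = offset_nhwc(1, 0, 0, 0, H, W, C)  # next image

print(step_c, step_w, step_h, step_n)  # 1 3 96 3072
```

In NHWC the channels of one pixel are contiguous, so moving one step along W skips $C$ elements rather than one.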
|
||||
|
||||
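:numref:`nchwandnhwc` shows how the logical layout maps to the physical layout in memory under each data format. \[x:1\] denotes the index change from the innermost dimension to the next dimension. For example, \[a:1\] means that once the current row along the W axis ends, the layout moves to the next position along the H axis. \[b:1\] means that once the innermost C axis has been laid out, the layout proceeds along the W axis.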
|
||||
|
||||

|
||||
|
||||
@@ -8,8 +8,4 @@
|
||||
|
||||
- Introduction to message queues: [What is a message queue](https://aws.amazon.com/message-queue/)
|
||||
|
||||
- Introduction to feature stores: [What is a feature store in machine learning](https://www.featurestore.org/what-is-a-feature-store)
|
||||
|
||||
## References
|
||||
|
||||
:bibliography:`../references/recommender.bib`
|
||||
- Introduction to feature stores: [What is a feature store in machine learning](https://www.featurestore.org/what-is-a-feature-store)
|
||||
@@ -1,7 +1,3 @@
|
||||
## Summary
|
||||
|
||||
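In this chapter we briefly introduced the basic concepts of reinforcement learning, including single-agent and multi-agent reinforcement learning algorithms and single-node and distributed reinforcement learning systems, to give readers a basic understanding of reinforcement learning problems. Reinforcement learning is currently a fast-developing branch of deep learning, and many practical problems may be solved through further progress in reinforcement learning algorithms. On the other hand, the particular setting of reinforcement learning (for example, the need to interact with an environment to collect samples) places higher demands on the computing system. How can sample collection and policy training be better balanced? How can the capabilities of different hardware such as CPUs and GPUs be balanced? How can reinforcement learning agents be deployed effectively on large-scale distributed systems? Answering these questions requires a better understanding of how computer systems are designed and used.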
|
||||
|
||||
## References
|
||||
|
||||
:bibliography:`../references/reinforcement.bib`
|
||||
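In this chapter we briefly introduced the basic concepts of reinforcement learning, including single-agent and multi-agent reinforcement learning algorithms and single-node and distributed reinforcement learning systems, to give readers a basic understanding of reinforcement learning problems. Reinforcement learning is currently a fast-developing branch of deep learning, and many practical problems may be solved through further progress in reinforcement learning algorithms. On the other hand, the particular setting of reinforcement learning (for example, the need to interact with an environment to collect samples) places higher demands on the computing system. How can sample collection and policy training be better balanced? How can the capabilities of different hardware such as CPUs and GPUs be balanced? How can reinforcement learning agents be deployed effectively on large-scale distributed systems? Answering these questions requires a better understanding of how computer systems are designed and used.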
|
||||
@@ -4,7 +4,7 @@
|
||||
|
||||
:width:`800px`
|
||||
|
||||
:label:`ROS2\_arch`
|
||||
:label:`ROS2_arch`
|
||||
|
||||
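In this section we take a broad look at the Robot Operating System (ROS). ROS originated in a robotics project at the Stanford Artificial Intelligence Laboratory. It is a free, open-source framework that provides interfaces and tools for building advanced robots. As robotics grows rapidly in scope and complexity, the need for code reuse and modularity has become pressing, and ROS suits the multi-node, multi-task scenarios typical of robots. Some robots, drones, and even autonomous vehicles have adopted ROS as their development platform. For robot learning, ROS/ROS2 can be combined with deep learning: developers have built deep learning nodes for ROS/ROS2 that support NVIDIA Jetson and TensorRT. NVIDIA Jetson is an embedded system that NVIDIA developed for autonomous machines, a system-on-module comprising CPU, GPU, PMIC, DRAM, and flash storage, which speeds up the software running on autonomous machines. TensorRT is a machine learning framework released by NVIDIA for running machine learning inference on its hardware.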
|
||||
|
||||
@@ -12,19 +12,19 @@
|
||||
|
||||
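ROS provides many built-in tools, such as the 3D visualizer rviz, which visualizes robots, the environments they work in, and sensor data. It is a highly configurable tool with many types of visualizations and plugins. catkin is the ROS build system (similar to CMake on Linux), and a Catkin Workspace is the directory in which Catkin packages are created, modified, and built. roslaunch is a tool for launching multiple ROS nodes locally and remotely and for setting parameters on the ROS parameter server. There are also the robot simulator Gazebo and MoveIt!, a mobile manipulation software and planning framework. ROS offers robot developers interfaces in different programming languages, such as roscpp for C++ and rospy for Python. ROS supplies Unified Robot Description Format (URDF) files for many robots; URDF describes a robot in XML. ROS also has room for improvement: for example, its communication has limited real-time performance, and its system stability still falls short of industrial-grade requirements.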
|
||||
|
||||
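The ROS2 project was announced at ROSCon 2014, and the first ROS2 release, Ardent Apalone, came out in 2017. ROS2 adds support for multi-robot systems and improves network performance for communication between robots. It also supports microcontrollers and multiple platforms: beyond existing x86 and ARM systems it will support embedded microcontrollers (MCUs), and beyond Linux it adds support for Windows, macOS, RTOS, and other systems. More importantly, ROS2 adds support for real-time control, which can improve control timeliness and overall robot performance. The ROS2 communication system is based on DDS (Data Distribution Service), as shown in :numref:`ROS2\_arch`.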
|
||||
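The ROS2 project was announced at ROSCon 2014, and the first ROS2 release, Ardent Apalone, came out in 2017. ROS2 adds support for multi-robot systems and improves network performance for communication between robots. It also supports microcontrollers and multiple platforms: beyond existing x86 and ARM systems it will support embedded microcontrollers (MCUs), and beyond Linux it adds support for Windows, macOS, RTOS, and other systems. More importantly, ROS2 adds support for real-time control, which can improve control timeliness and overall robot performance. The ROS2 communication system is based on DDS (Data Distribution Service), as shown in :numref:`ROS2_arch`.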
|
||||
|
||||
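ROS2 relies on combining workspaces through the shell environment. "Workspace" is a ROS term for the location on the system where development with ROS2 takes place. The core ROS2 workspace is called the underlay; subsequent workspaces are called overlays. When developing with ROS2, several workspaces are usually active at the same time. Next we describe the core concepts of ROS2 in detail. This part draws on [^1].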
|
||||
|
||||
### ROS2 Nodes
|
||||
|
||||
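The ROS graph is a network of ROS2 elements that process data together at the same time. It includes all executables and the connections between them. Each node in ROS2 should be responsible for a single, modular purpose (for example, one node controls the wheel motors and another controls the laser rangefinder). Each node can send data to and receive data from other nodes through topics, services, actions, or parameters. A complete robot system consists of many nodes working together, as shown in :numref:`ros2\_graph`. In ROS2, a single executable (a C++ program, a Python program, and so on) can contain one or more nodes.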
|
||||
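The ROS graph is a network of ROS2 elements that process data together at the same time. It includes all executables and the connections between them. Each node in ROS2 should be responsible for a single, modular purpose (for example, one node controls the wheel motors and another controls the laser rangefinder). Each node can send data to and receive data from other nodes through topics, services, actions, or parameters. A complete robot system consists of many nodes working together, as shown in :numref:`ros2_graph`. In ROS2, a single executable (a C++ program, a Python program, and so on) can contain one or more nodes.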
|
||||
|
||||

|
||||
|
||||
:width:`800px`
|
||||
|
||||
:label:`ros2\_graph`
|
||||
:label:`ros2_graph`
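The idea of single-purpose nodes exchanging data over topics can be illustrated without ROS2 installed. The sketch below is a toy, in-process model written for this purpose: `ToyGraph` and `ToyNode` are hypothetical classes, not part of the ROS2 API, and real nodes communicate through the DDS middleware rather than direct callbacks.

```python
from collections import defaultdict
from typing import Callable


class ToyGraph:
    """Hypothetical stand-in for the ROS graph: routes messages by topic name."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> None:
        # Deliver the message to every callback registered on this topic.
        for callback in self._subscribers[topic]:
            callback(message)


class ToyNode:
    """Hypothetical node with one responsibility; records what it receives."""

    def __init__(self, name: str, graph: ToyGraph) -> None:
        self.name = name
        self.graph = graph
        self.received: list[str] = []

    def listen(self, topic: str) -> None:
        self.graph.subscribe(topic, self.received.append)

    def send(self, topic: str, message: str) -> None:
        self.graph.publish(topic, message)


graph = ToyGraph()
laser = ToyNode("laser_rangefinder", graph)  # one node per device, as in the text
wheels = ToyNode("wheel_motor", graph)

wheels.listen("scan")
laser.send("scan", "obstacle at 0.5 m")
print(wheels.received)  # ['obstacle at 0.5 m']
```

The publisher never refers to the subscriber directly; both only know the topic name, which is the decoupling that lets a robot system be assembled from many independent nodes.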
|
||||
|
||||
Nodes discover each other through the underlying ROS2 middleware. The process is summarized as follows:
|
||||
|
||||
|
||||
@@ -1,7 +1,3 @@
|
||||
### Summary
|
||||
|
||||
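In this chapter we briefly introduced the basic concepts of robotic systems, including general-purpose robot operating systems, perception systems, planning systems, and control systems, to give readers a basic understanding of robotics problems. For the general-purpose robot operating system, we reviewed its basic concepts and used code examples to give readers hands-on experience with ROS and a taste of the fun of building a simple robot system. Robotics is currently a fast-developing branch of artificial intelligence, and many practical problems still await further progress in robot algorithms and system design.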
|
||||
|
||||
## References
|
||||
|
||||
:bibliography:`../references/rlsys.bib`
|
||||
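In this chapter we briefly introduced the basic concepts of robotic systems, including general-purpose robot operating systems, perception systems, planning systems, and control systems, to give readers a basic understanding of robotics problems. For the general-purpose robot operating system, we reviewed its basic concepts and used code examples to give readers hands-on experience with ROS and a taste of the fun of building a simple robot system. Robotics is currently a fast-developing branch of artificial intelligence, and many practical problems still await further progress in robot algorithms and system design.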
|
||||