Translate all code to English (#1836)

* Review the EN heading format.

* Fix pythontutor headings.

* Fix pythontutor headings.

* bug fixes

* Fix headings in **/summary.md

* Revisit the CN-to-EN translation for Python code using Claude-4.5

* Revisit the CN-to-EN translation for Java code using Claude-4.5

* Revisit the CN-to-EN translation for Cpp code using Claude-4.5.

* Fix the dictionary.

* Fix cpp code translation for the multipart strings.

* Translate Go code to English.

* Update workflows to test EN code.

* Add EN translation for C.

* Add EN translation for CSharp.

* Add EN translation for Swift.

* Trigger the CI check.

* Revert.

* Update en/hash_map.md

* Add the EN version of Dart code.

* Add the EN version of Kotlin code.

* Add missing code files.

* Add the EN version of JavaScript code.

* Add the EN version of TypeScript code.

* Fix the workflows.

* Add the EN version of Ruby code.

* Add the EN version of Rust code.

* Update the CI check for the English version  code.

* Update Python CI check.

* Fix cmakelists for en/C code.

* Fix Ruby comments
This commit is contained in:
Yudong Jin
2025-12-31 07:44:52 +08:00
committed by GitHub
parent 45e1295241
commit 2778a6f9c7
1284 changed files with 71557 additions and 3275 deletions

View File

@@ -2,7 +2,7 @@
In computers, all data is stored in binary form, and character `char` is no exception. To represent characters, we need to establish a "character set" that defines a one-to-one correspondence between each character and binary numbers. With a character set, computers can convert binary numbers to characters by looking up the table.
## ASCII Character Set
## Ascii Character Set
<u>ASCII code</u> is the earliest character set, with the full name American Standard Code for Information Interchange. It uses 7 binary bits (the lower 7 bits of one byte) to represent a character, and can represent a maximum of 128 different characters. As shown in the figure below, ASCII code includes uppercase and lowercase English letters, numbers 0 ~ 9, some punctuation marks, and some control characters (such as newline and tab).
@@ -12,7 +12,7 @@ However, **ASCII code can only represent English**. With the globalization of co
Worldwide, a batch of EASCII character sets suitable for different regions have appeared successively. The first 128 characters of these character sets are unified as ASCII code, and the last 128 characters are defined differently to adapt to the needs of different languages.
## GBK Character Set
## Gbk Character Set
Later, people found that **EASCII code still cannot meet the character quantity requirements of many languages**. For example, there are nearly one hundred thousand Chinese characters, and several thousand are used daily. In 1980, the China National Standardization Administration released the <u>GB2312</u> character set, which included 6,763 Chinese characters, basically meeting the needs for computer processing of Chinese characters.
@@ -36,7 +36,7 @@ For the above problem, **a straightforward solution is to store all characters a
However, ASCII code has already proven to us that encoding English only requires 1 byte. If the above scheme is adopted, the size of English text will be twice that under ASCII encoding, which is very wasteful of memory space. Therefore, we need a more efficient Unicode encoding method.
## UTF-8 Encoding
## Utf-8 Encoding
Currently, UTF-8 has become the most widely used Unicode encoding method internationally. **It is a variable-length encoding** that uses 1 to 4 bytes to represent a character, depending on the complexity of the character. ASCII characters only require 1 byte, Latin and Greek letters require 2 bytes, commonly used Chinese characters require 3 bytes, and some other rare characters require 4 bytes.

View File

@@ -1,6 +1,6 @@
# Data Structure
# Data Structures
![Data structure](../assets/covers/chapter_data_structure.jpg)
![Data structures](../assets/covers/chapter_data_structure.jpg)
!!! abstract

View File

@@ -1,6 +1,6 @@
# Summary
### Key review
### Key Review
- Data structures can be classified from two perspectives: logical structure and physical structure. Logical structure describes the logical relationships between data elements, while physical structure describes how data is stored in computer memory.
- Common logical structures include linear, tree, and network structures. We typically classify data structures as linear (arrays, linked lists, stacks, queues) and non-linear (trees, graphs, heaps) based on their logical structure. The implementation of hash tables may involve both linear and non-linear data structures.