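IDE (Integrated Development Environment): Simply put, it's where you write your code. The importance of an IDE to a programmer goes without saying, but many IDEs are designed for large engineering projects, so they are bulky and overloaded with features. Nowadays, a lightweight text editor with a rich plugin ecosystem can cover most everyday lightweight programming needs. The editors I use most are VS Code and Sublime (plugin configuration is very simple in the former; the latter is slightly more complex but looks great). For large projects I still use somewhat heavier IDEs, such as PyCharm (Python) and IDEA (Java). (Disclaimer: every IDE is the best IDE in the world.)
Vim: A command-line editor. Its learning curve is somewhat steep, but I think learning it is well worth the effort, because it will greatly improve your development efficiency. Most modern IDEs also support Vim plugins, so you can keep the geeky coolness (ugh) while enjoying a modern development environment.
Emacs: A classic editor as famous as Vim, with equally high development efficiency and even greater extensibility. It can be configured as a lightweight editor, extended into a personalized IDE, or pushed even further with all sorts of clever tricks.
Git: A version control tool for code. Git's learning curve may be even steeper, but Git, created by Linus, the father of Linux, is definitely one of the tools every CS student must master.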
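As a freshman, mastering calculus and linear algebra is at least as important as learning to code. Countless predecessors have surely made this point already, but I will tirelessly stress it once more: mastering calculus and linear algebra really matters! You might complain that you forget these subjects right after the exam, but I think that means you have not grasped their essence and your understanding has not yet sunk in deeply. If you find the content taught in class obscure, consider the course notes of MIT's Calculus Course and 18.06: Linear Algebra. For me, at least, they helped me understand much of the essence of calculus and linear algebra. I also recommend the maths YouTuber 3Blue1Brown, whose channel has many high-quality videos that explain the core of mathematics through vivid animations, with both depth and breadth.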
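Introduction to Information Theory
For computer science students, I think learning some basics of information theory early on is very beneficial. However, most information theory courses target senior undergraduates or even graduate students and are very unfriendly to beginners. MIT's 6.050J: Information theory and Entropy is tailored for freshmen, has almost no prerequisites, and covers coding, compression, communication, information entropy, and more, which makes it very interesting.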
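For computer science students, developing computational thinking is important: modeling and discretizing real-world problems, then simulating and analyzing them on a computer, is a vital skill. The Julia programming language, created at MIT and increasingly popular in the last couple of years, seems set to dominate numerical computing thanks to its C-like speed and Python-like friendly syntax. Many MIT mathematics courses have also started using Julia as a teaching tool, presenting difficult mathematical theory through clear and intuitive code.
ComputationalThinking is an introductory course in computational thinking offered by MIT. All course materials are open source and can be accessed directly on the course website. Using the Julia programming language, the course works through three topics (image processing, social science and data science, and climate modeling) to help students understand algorithms, mathematical modeling, data analysis, interaction design, and data visualization, and to experience the wonderful combination of computation and science. The content is not difficult, but what impressed me most is that the charm of science lies not in deliberately mystifying theory or tongue-twisting jargon, but in using vivid examples and concise, profound language so that every ordinary person can understand.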
Languages are tools, you choose the right tool to do the right thing. Since there's no universally perfect tool, there's no universally perfect language.
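There is a fundamental difference between code that merely "runs" and high-quality industrial code. Therefore, I highly recommend that lower-year students take MIT 6.031: Software Construction. Based on Java, the course uses rich, detailed reading materials and carefully designed programming exercises to teach how to write high-quality code that is bug-resistant, easy to understand, and easy to maintain and modify. From high-level data structure design down to how to write comments, following these details and lessons summarized by predecessors will greatly benefit your future programming career.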
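Professional Course
Of course, if you want to take a systematic software engineering course, I recommend UC Berkeley's UCB CS169: software engineering. Be aware, however, that unlike the software engineering courses at most schools (including your own), this course does not follow the traditional design and document model, which emphasizes class diagrams, flowcharts, and documentation. Instead, it adopts the Agile Development model of small teams and rapid iteration that has become popular in recent years, together with the Software as a Service model built on cloud platforms.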
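MIT6.033: System Engineering is MIT's introductory systems course, covering operating systems, networks, distributed systems, and system security. Besides teaching the material itself, the course also covers writing and presentation skills, so that you learn how to design your own system and how to introduce and analyze it for others. The course's accompanying textbook, Principles of Computer System Design: An Introduction, is also very well written and recommended reading.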
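The most important recent progress in machine learning is the development of deep learning, a branch based on neural networks, but many algorithms based on statistical learning are still widely used in data analysis. If you have never encountered machine learning before and do not want to get bogged down in difficult mathematical proofs right away, start with Andrew Ng's Coursera: Machine Learning. Almost everyone in the field knows this course. With his solid theoretical grounding and excellent communication skills, Ng makes many difficult algorithms accessible and practical. The accompanying assignments are also of very high quality and help you get started quickly.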
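The popularity of AlphaGo a few years ago brought deep learning into the public eye, and quite a few universities even set up dedicated majors. Many other areas of computer science also use deep learning in their research, so whatever you work on, you will probably run into some need for neural networks and deep learning. For a quick introduction, I again recommend Andrew Ng's Coursera: Deep Learning. Its quality speaks for itself: it is one of the rare courses on Coursera with a perfect rating. If you find English-language courses difficult, I recommend Professor Hung-yi Lee's course National Taiwan University: Machine Learning. Although it is titled "Machine Learning," it covers almost every direction in deep learning and is very comprehensive, which makes it well suited to getting a broad overview of the field. The professor is also very humorous, and his lectures are full of memorable lines.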
The field of computer science is vast and complex, with a seemingly endless sea of knowledge. Each specialized area can lead to limitless learning if pursued deeply. Therefore, a clear and definite study plan is very important. I've taken some detours in my years of self-study and finally distilled the following content for your reference.
Before you start learning, I highly recommend a popular science video series for beginners: Crash Course: Computer Science. In just 8 hours, it vividly and comprehensively covers various aspects of computer science: the history of computers, how computers operate, the important modules that make up a computer, key ideas in computer science, and so on. As its slogan says, Computers are not magic! I hope that after watching this video, everyone will have a holistic perception of computer science and embark on the detailed and in-depth learning content below with interest.
Essential Tools
As the saying goes: sharpening your axe will not delay your job of chopping wood. If you are a pure beginner in the world of computers, learning some tools will make you more efficient.
Learn to ask questions: You might be surprised that asking questions is listed first. I think that in the open-source community, learning to ask questions is a very important ability, and it involves two aspects. First, it indirectly cultivates your ability to solve problems independently, because the cycle of forming a question, describing it, getting answers from others, and then understanding the response is quite long. If you expect others to remotely assist you with every trivial issue, then the world of computers might not suit you. Second, if you still can't solve a problem after trying, you can seek help from the open-source community. At that point, being able to explain your situation and goal concisely becomes particularly important. I recommend reading the article How To Ask Questions The Smart Way, which not only increases the probability and efficiency of solving your problems but also keeps those who provide answers in the open-source community in a good mood.
Learn to be a hacker: MIT-Missing-Semester covers many tools that are useful for a hacker and provides detailed usage instructions. I strongly recommend that beginners study this course. Note, however, that the course occasionally uses terms related to the development process, so it is best studied after you have completed at least one introductory computer science course.
GFW: For well-known reasons, sites like Google and GitHub are not accessible in mainland China. However, in many cases, Google and StackOverflow can solve 99% of the problems encountered during development. Therefore, learning to use a VPN is almost an essential skill for a mainland CSer. (Considering legal issues, the methods provided in this book are only applicable to users with a Peking University email address).
Command Line: Proficiency in using the command line is often overlooked or considered difficult to master, but in reality it greatly enhances your flexibility and productivity as an engineer. The Art of Command Line is a classic tutorial that started as a question on Quora. With contributions from many experts, it has become a top GitHub project with over 100,000 stars and has been translated into dozens of languages. The tutorial is not long, and I highly recommend that everyone read it repeatedly and internalize it through practice. Shell scripting is also worth mastering, and you can refer to this tutorial.
IDE (Integrated Development Environment): Simply put, it's where you write your code. The importance of an IDE for a programmer goes without saying, but many IDEs are designed for large-scale projects, so they are bulky and overloaded with features. Nowadays, some lightweight text editors with rich plugin ecosystems can basically meet the needs of daily lightweight programming. My personal favorites are VS Code and Sublime (the former has very simple plugin configuration, while the latter is a bit more complex but aesthetically pleasing). Of course, for large projects, I would still use slightly heavier IDEs, such as PyCharm (Python) and IDEA (Java). (Disclaimer: all IDEs are the best in the world.)
Vim: A command-line editor. Vim has a somewhat steep learning curve, but mastering it, I think, is very necessary because it will greatly improve your development efficiency. Most modern IDEs also support Vim plugins, allowing you to retain the coolness of a geek while enjoying a modern development environment.
Emacs: A classic editor that stands alongside Vim, with equally high development efficiency and even greater extensibility. It can be configured as a lightweight editor, extended into a personalized IDE, or pushed even further with all sorts of clever tricks.
Git: A version control tool for your project. Git, created by the father of Linux, Linus, is definitely one of the must-have tools for every CS student.
GitHub: A code hosting platform based on Git. The world's largest open-source community and a gathering place for CS experts.
GNU Make: An engineering build tool. Proficiency in GNU Make will help you develop a habit of modularizing your code and familiarize you with the compilation and linking processes of large projects.
CMake: A more powerful build tool than GNU Make, recommended for study after mastering GNU Make.
Docker: A lighter-weight software packaging and deployment tool compared to virtual machines.
Practical Toolkit: In addition to the tools mentioned above that are frequently used in development, I have also collected many practical and interesting free tools, such as download tools, design tools, learning websites, etc.
Thesis: Tutorial for writing graduation thesis in Word.
Recommended Books
I believe a good textbook should be people-oriented, rather than a display of technical jargon. It's certainly important to tell readers "what it is," but a better approach would be for the author to integrate decades of experience in the field into the book and narratively convey to the reader "why it is" and what should be done in the future.
What you think of as development — coding frantically in an IDE for hours.
Actual development — setting up the environment for several days without starting to code.
PC Environment Setup
If you are a Mac user, you're in luck, as this guide will walk you through setting up the entire development environment. If you are a Windows user, thanks to the efforts of the open-source community, you can enjoy a similar experience with Scoop.
Additionally, you can refer to an environment setup guide inspired by 6.NULL MIT-Missing-Semester, focusing on terminal beautification. It also includes common software sources (such as GitHub, Anaconda, PyPI) for acceleration and replacement, as well as some IDE configuration and activation tutorials.
Server-Side Environment Setup
Server-side operation and maintenance require basic use of Linux (or other Unix-like systems) and fundamental concepts like processes, devices, networks, etc. Beginners can refer to the Linux 101 online notes compiled by the Linux User Association of the University of Science and Technology of China. If you want to delve deeper into system operation and maintenance, you can refer to the Aspects of System Administration course.
Additionally, if you need to learn a specific concept or tool, I recommend a great GitHub project, DevOps-Guide, which covers a lot of foundational knowledge and tutorials in the administration field, such as Docker, Kubernetes, Linux, CI-CD, GitHub Actions, and more.
Course Map
As mentioned at the beginning of this chapter, this course map is merely a reference guide for course planning, from my perspective as an undergraduate nearing graduation. I am acutely aware that I neither have the right nor the capability to preach to others about “how one should learn”. Therefore, if you find any issues with the course categorization and selection below, I fully accept and deeply apologize for them. You can tailor your own course map in the next section Customize Your Own Course Map.
Apart from courses labeled as basic or introductory, there is no explicit sequence in the following categories. As long as you meet the prerequisites for a course, you are free to choose any course according to your needs and interests.
Mathematical Foundations
Calculus and Linear Algebra
As a freshman, mastering calculus and linear algebra is at least as important as learning to code. This point has been reiterated countless times by predecessors, but I feel compelled to emphasize it again: mastering calculus and linear algebra is really important! You might complain that these subjects are forgotten after exams, but I believe that indicates a lack of deep understanding of their essence. If you find the content taught in class obscure, consider referring to the course notes of MIT’s Calculus Course and 18.06: Linear Algebra. For me, they greatly deepened my understanding of the essence of calculus and linear algebra. I also highly recommend the maths YouTuber 3Blue1Brown, whose channel features high-quality videos that explain the core of mathematics through vivid animations, with both depth and breadth.
Introduction to Information Theory
For computer science students, gaining some foundational knowledge in information theory early on is beneficial. However, most information theory courses are targeted towards senior or even graduate students, making them quite inaccessible to beginners. MIT’s 6.050J: Information theory and Entropy is tailored for freshmen, with almost no prerequisites, covering coding, compression, communication, information entropy, and more, which is very interesting.
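To give a taste of what "information entropy" means, here is a minimal Python sketch (my own illustration, not taken from the course materials) that computes the empirical Shannon entropy $H = \sum_i p_i \log_2 (1/p_i)$ of a string. The function name `shannon_entropy` and the sample strings are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the empirical Shannon entropy of `message`, in bits per symbol."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = sum over symbols of p * log2(1 / p), where p is the observed frequency.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A string with one repeated symbol is fully predictable, so it carries no
# information; the more evenly spread the symbols, the higher the entropy.
for text in ("aaaa", "abab", "abcd"):
    print(text, shannon_entropy(text))
```

Roughly speaking, the entropy is a lower bound on the average number of bits per symbol that a lossless code needs, which is why it shows up in both the coding and the compression parts of such a course.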
Advanced Mathematics
Discrete Mathematics and Probability Theory
Set theory, graph theory, and probability theory are essential tools for algorithm derivation and proof, as well as foundations for more advanced mathematical courses. However, the teaching of these subjects often falls into a rut of being overly theoretical and formalistic, turning classes into mere recitations of theorems and conclusions without helping students grasp the essence of these theories. If theory teaching can be interspersed with examples of algorithm application, students can expand their algorithm knowledge while appreciating the power and charm of theory.
UCB CS70: Discrete Math and Probability Theory and UCB CS126: Probability Theory are UC Berkeley’s probability courses. The former covers the basics of discrete mathematics and probability theory, while the latter delves into stochastic processes and more advanced theoretical content. Both emphasize the integration of theory and practice and feature abundant examples of algorithm application, with the latter including numerous Python programming assignments to apply probability theory to real-world problems.
Numerical Analysis
For computer science students, developing computational thinking is crucial. Modeling and discretizing real-world problems, and simulating and analyzing them on computers, are vital skills. Recently, the Julia programming language, developed at MIT, has become popular in numerical computing thanks to its C-like speed and Python-like friendly syntax. Many MIT mathematics courses have started using Julia as a teaching tool, presenting complex mathematical theories through clear and intuitive code.
ComputationalThinking is an introductory course in computational thinking offered by MIT. All course materials are open source and accessible on the course website. Using the Julia programming language, the course covers image processing, social science and data science, and climatology modeling, helping students understand algorithms, mathematical modeling, data analysis, interactive design, and graph presentation. The course content, though not difficult, profoundly impressed me with the idea that the allure of science lies not in obscure theories or jargon but in presenting complex concepts through vivid examples and concise, deep language.
After completing this introductory course, if you’re still eager for more, consider MIT’s 18.330: Introduction to Numerical Analysis. This course also uses Julia for programming assignments but is more challenging and in-depth. It covers floating-point encoding, root finding, linear systems, differential equations, and more, with the main goal of using discrete computer representations to estimate and approximate continuous mathematical concepts. The course instructor has also written an accompanying open-source textbook, Fundamentals of Numerical Computation, which includes abundant Julia code examples and rigorous formula derivations.
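As a small illustration of two of the topics mentioned above, floating-point representation and root finding, here is a hedged Python sketch (the course itself uses Julia; this is my own example, and `newton_sqrt` is a hypothetical helper name). It shows that decimal fractions are only approximated in binary floating point, and it uses Newton's method on $f(x) = x^2 - a$ to approximate a square root.

```python
import math


def newton_sqrt(a: float, tol: float = 1e-12, max_iter: int = 50) -> float:
    """Approximate sqrt(a) with Newton's method applied to f(x) = x**2 - a."""
    if a < 0:
        raise ValueError("a must be non-negative")
    if a == 0:
        return 0.0
    x = a if a >= 1 else 1.0  # any positive starting guess converges
    for _ in range(max_iter):
        # Newton update x - f(x) / f'(x), simplified for f(x) = x**2 - a.
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) <= tol * x_next:  # relative change is small enough
            return x_next
        x = x_next
    return x


# 0.1, 0.2 and 0.3 have no exact binary representation, so this prints False.
print(0.1 + 0.2 == 0.3)

# The iterative approximation next to the library value.
print(newton_sqrt(2.0), math.sqrt(2.0))
```

The stopping rule compares successive iterates rather than testing for exact equality, which is the usual way to cope with the rounding behaviour shown in the first print.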
Wouldn't it be cool if the motion and development of everything in the world could be described and depicted with equations? Although differential equations are not a mandatory part of any CS curriculum, I believe mastering them provides a new perspective to view the world.
Since differential equations often involve complex variable functions, you can refer to MIT18.04: Complex Variables Functions course notes to fill in prerequisite knowledge.
MIT18.03: Differential Equations mainly covers the solution of ordinary differential equations, and on this basis, MIT18.152: Partial Differential Equations dives into the modeling and solving of partial differential equations. With the powerful tool of differential equations, you will gain enhanced capabilities in modeling real-world problems and intuitively grasping the essence among various noisy variables.
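To make "solving an ordinary differential equation" concrete, here is a minimal Python sketch of the forward Euler method, one of the simplest numerical schemes. It is my own illustration rather than material from either course, and the names `euler` and `decay` are assumptions. It integrates $y' = -y$ with $y(0) = 1$, whose exact solution is $y(t) = e^{-t}$.

```python
import math
from typing import Callable


def euler(f: Callable[[float, float], float],
          t0: float, y0: float, t_end: float, steps: int) -> float:
    """Integrate dy/dt = f(t, y) from t0 to t_end with forward Euler steps."""
    h = (t_end - t0) / steps  # fixed step size
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)  # follow the tangent line for one step
        t += h
    return y


def decay(t: float, y: float) -> float:
    """Right-hand side of the model equation y' = -y (exponential decay)."""
    return -y


approx = euler(decay, 0.0, 1.0, 1.0, 1000)
exact = math.exp(-1.0)
print(approx, exact, abs(approx - exact))
```

Forward Euler is only first-order accurate, so halving the step size roughly halves the error; practical solvers use higher-order or adaptive schemes.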
Advanced Mathematical Topics
As a computer science student, I often hear arguments about the uselessness of mathematics. While I neither agree nor have the authority to oppose such views, if everything is forcibly categorized as useful or useless, it indeed becomes quite dull. Therefore, the following advanced mathematics courses, aimed at senior and even graduate students, are available for those interested.
Languages are tools, and you choose the right tool for the right job. Since there's no universally perfect tool, there's no universally perfect language.
For computer science students, understanding basic circuit knowledge and experiencing the entire pipeline from sensor data collection to data analysis and algorithm prediction can be very helpful for future learning and developing computational thinking. EE16A&B: Designing Information Devices and Systems I&II at UC Berkeley are introductory courses for freshmen in electrical engineering. EE16A focuses on collecting and analyzing data from the real environment through circuits, while EE16B focuses on analyzing these collected data to make predictive actions.
Signals and Systems
Signals and Systems is a course I find very worthwhile. Initially, I studied it out of curiosity about Fourier Transform, but after completing it, I was amazed at how Fourier Transform provided a new perspective to view the world, just like differential equations, immersing you in the elegance and magic of precisely depicting the world with mathematics.
UCB EE120: Signal and Systems has very well-written notes on Fourier Transform and provides many interesting Python programming assignments to practically apply the theories and algorithms of signals and systems.
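For readers who have not met the Fourier transform yet, the following Python sketch (my own illustration, not one of the course assignments) implements the discrete Fourier transform straight from its definition $X_k = \sum_{t=0}^{N-1} x_t e^{-2\pi i k t / N}$. The function name `dft` and the test signal are assumptions made for the example.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform, written from the definition."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


# Eight samples covering exactly one period of a cosine wave.
n = 8
signal = [math.cos(2 * math.pi * t / n) for t in range(n)]

# Magnitude of each frequency bin; the energy concentrates in bins 1 and n - 1.
magnitudes = [abs(c) for c in dft(signal)]
print([round(m, 6) for m in magnitudes])
```

A pure cosine in the time domain collapses to two spikes in the frequency domain, which is the "new perspective" the transform provides. Real applications use the fast Fourier transform, which computes the same result in O(N log N) time.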
Data Structures and Algorithms
Algorithms are the core of computer science and the foundation for almost all professional courses. How to abstract real-world problems into algorithmic problems mathematically and solve them under time and memory constraints using appropriate data structures is the eternal theme of algorithm courses. If you are fed up with your teacher's rote teaching, I highly recommend UC Berkeley's UCB CS61B: Data Structures and Algorithms and Princeton's Coursera: Algorithms I & II. Both courses explain deep ideas in an accessible way and have rich and interesting programming assignments that connect theory with practice.
There is a fundamental difference between “working” code and high-quality industrial code. Therefore, I highly recommend that students in their early undergraduate years take MIT 6.031: Software Construction. Based on Java, this course uses rich, detailed reading materials and well-designed programming exercises to teach how to write high-quality code that is bug-resistant, clear, and easy to maintain and modify. From high-level data structure design to small details like how to write comments, following these lessons summarized by predecessors can greatly benefit your future programming career.
Professional Course
Of course, if you want to systematically take a software engineering course, I recommend UC Berkeley’s UCB CS169: Software Engineering. However, unlike most software engineering courses, this course does not involve the traditional design and document model that emphasizes various class diagrams, flowcharts, and document design. Instead, it adopts the Agile Development model, which has become popular in recent years, featuring small team rapid iterations and the Software as a Service model using cloud platforms.
Computer Architecture
Introductory Course
Since childhood, I've always heard that the world of computers is made of 0s and 1s, which I didn't understand but was deeply impressed by. If you share this curiosity, consider spending one to two months on Coursera: Nand2Tetris, a computer course with no prerequisites. This comprehensive course starts from 0s and 1s and lets you build a computer by hand and run a Tetris game on it. It covers compilation, virtual machines, assembly, architecture, digital circuits, logic gates, and more, from top to bottom and from software to hardware. Its difficulty is carefully designed: it omits many complex details of modern computers and extracts the core essence, aiming to be understandable to everyone. In your early undergraduate years, establishing a bird's-eye view of the entire computer system is very beneficial.
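To hint at how "everything from 0s and 1s" can work, here is a small Python sketch (my own illustration; the course uses its own hardware description language, not Python) that builds the other basic gates and a one-bit full adder out of a single NAND primitive. All function names are assumptions chosen for the example.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate: outputs 0 exactly when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits and return (sum_bit, carry_out), using only the gates above."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


# Print the full truth table of the adder.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_adder(a, b, c))
```

Chaining such full adders bit by bit gives a multi-bit adder, which is the kind of step-by-step construction the course scales up into a complete computer.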
Professional Course
Of course, if you want to delve into the complex details of modern computer architecture, you still need to take a university-level course UCB CS61C: Great Ideas in Computer Architecture. This course emphasizes practice, and you will hand-write assembly to construct neural networks in projects, build a CPU from scratch, and more, all of which will give you a deeper understanding of computer architecture, beyond the monotony of "fetch, decode, execute, memory access, write back."
Introduction to Computer Systems
Computer systems are a vast and profound topic. Before delving into a specific area, having a macro conceptual understanding of each field and some general design principles will reinforce core and even philosophical concepts in your subsequent in-depth study, rather than being shackled by complex internal details and various tricks. In my opinion, the key to learning systems is to grasp these core concepts to design and implement your own systems.
MIT6.033: System Engineering is MIT's introductory course to systems, covering topics like operating systems, networks, distributed systems, and system security. In addition to the theory, this course also teaches some writing and expression skills, helping you learn how to design, introduce, and analyze your own systems. The accompanying textbook Principles of Computer System Design: An Introduction is also very well written and recommended for reading.
CMU 15-213: Introduction to Computer System is CMU’s introductory systems course, covering architecture, operating systems, linking, parallelism, networks, etc., with both breadth and depth. The accompanying textbook Computer Systems: A Programmer's Perspective is also of very high quality and strongly recommended for reading.
Operating Systems
There’s nothing like writing your own kernel to deepen your understanding of operating systems.
Operating systems provide a set of elegant abstractions to virtualize various complex underlying hardware, providing rich functional support for all application software. Understanding the design principles and internal mechanisms of operating systems is greatly beneficial for a programmer who is not satisfied with just being a coder. Out of love for operating systems, I have taken many operating system courses in different colleges, each with its own focus and merits. You can choose based on your interests.
MIT 6.S081: Operating System Engineering, offered by the famous PDOS lab at MIT, features 11 projects that modify xv6, an elegantly implemented Unix-like operating system. This course made me realize that learning systems is not about reading slides; it is about writing tens of thousands of lines of code.
UCB CS162: Operating System, UC Berkeley’s operating system course, uses the same Project as Stanford — an educational operating system, Pintos. As the teaching assistant for Peking University’s 2022 and 2023 Spring Semester Operating Systems Course, I introduced and improved this Project. The course resources are fully open-sourced, with details on the course website.
NJU: Operating System Design and Implementation, offered by Professor Yanyan Jiang at Nanjing University, provides an in-depth and accessible explanation of various operating system concepts, combining a unique system perspective with rich code examples. All course content is in Chinese, making it very convenient for students.
HIT OS: Operating System, taught by Professor Zhijun Li at Harbin Institute of Technology, is a Chinese course on operating systems. Based on the Linux 0.11 source code, the course places great emphasis on code practice, explaining the intricacies of operating systems from the student's perspective.
Parallel and Distributed Systems
In recent years, the most common phrase heard in CS lectures is "Moore's Law is coming to an end." As single-core capabilities reach their limits, multi-core and many-core architectures are becoming increasingly important. The changes in hardware necessitate adaptations and changes in the upper-level programming logic. Writing parallel programs has nearly become a mandatory skill for programmers to fully utilize hardware performance. Meanwhile, the rise of deep learning has brought unprecedented demands on computing power and storage, making the deployment and optimization of large-scale clusters a hot topic.
Whether or not you chose computer science because of a youthful dream of becoming a hacker, the reality is that becoming one is a long and difficult journey.
Theoretical Courses
UCB CS161: Computer Security at UC Berkeley covers stack attacks, cryptography, website security, network security, and more.
ASU CSE466: Computer Systems Security at Arizona State University covers a wide range of topics in system security. It has a high barrier to entry, requiring familiarity with Linux, C, and Python.
SU SEED Labs at Syracuse University, supported by a $1.3 million grant from the NSF, has developed hands-on experimental exercises (called SEED Labs) for cybersecurity education. The course emphasizes both theoretical teaching and practical exercises, including detailed open-source lectures, video tutorials, textbooks (printed in multiple languages), and a ready-to-use virtual machine and Docker-based attack-defense environment. This project is currently used by 1,050 institutions worldwide and covers a wide range of topics in computer and information security, including software security, network security, web security, operating system security, and mobile app security.
Practical Courses
After mastering this theoretical knowledge, it's essential to cultivate and hone these "hacker skills" in practice. CTF competitions are a popular way to comprehensively test your understanding and application of computer knowledge in various fields. Peking University has also successfully held the 0th and 1st editions of its own competition, and you are encouraged to take part and improve your skills through practice. Here are some resources I use for learning (and relaxing):
There’s nothing like writing your own TCP/IP protocol stack to deepen your understanding of computer networks.
The renowned Stanford CS144: Computer Network includes 8 projects that guide you in implementing the entire TCP/IP protocol stack.
If you're just looking to understand computer networks theoretically, I recommend the famous networking textbook "A Top-Down Approach" and its accompanying learning resources Computer Networking: A Top-Down Approach.
Database Systems
There’s nothing like building your own relational database to deepen your understanding of database systems.
CMU's famous database course CMU 15-445: Introduction to Database System guides you through 4 projects to add various functionalities to the educational relational database bustub. The experimental evaluation framework is also open-source, making it very suitable for self-learning. The course experiments also use many new features of C++11, offering a great opportunity to strengthen C++ coding skills.
Berkeley, as the birthplace of the famous open-source database PostgreSQL, has its own course UCB CS186: Introduction to Database System where you will implement a relational database in Java that supports SQL concurrent queries, B+ tree indexing, and fault recovery.
Compiler Theory
There’s nothing like writing your own compiler to deepen your understanding of compilers.
Front-end development is often overlooked in computer science curricula, but mastering these skills has many benefits, such as building your personal website or creating an impressive presentation website for your course projects.
Data science, machine learning, and deep learning are closely related, with a focus on practical application. Berkeley's UCB Data100: Principles and Techniques of Data Science lets you master various data analysis tools and algorithms through extensive programming exercises. The course guides you through extracting desired results from massive datasets and making predictions about future data or user behavior. For those looking to learn industrial-level data mining and analysis techniques, Stanford's big data mining course CS246: Mining Massive Data Sets is an option.
Artificial Intelligence
Artificial intelligence has been one of the hottest fields in computer science over the past decade. If you're not content with just hearing about AI advancements in the media and want to delve into the subject, I highly recommend Harvard's renowned CS50 series AI course Harvard CS50: Introduction to AI with Python. The course is concise and covers several major branches of traditional AI, supplemented with rich and interesting Python programming exercises to reinforce your understanding of AI algorithms. However, the content is somewhat simplified for online learners and doesn't delve into deep mathematical theories. For a more systematic and in-depth study, consider an undergraduate-level course like Berkeley's UCB CS188: Introduction to Artificial Intelligence. This course's projects feature the classic game "Pac-Man," allowing you to use AI algorithms to play the game, which is very fun.
Machine Learning
The most significant recent progress in the field of machine learning is the emergence of deep learning, a branch based on deep neural networks. However, many algorithms based on statistical learning are still widely used in data analysis. If you're new to machine learning and don't want to get bogged down in complex mathematical proofs, start with Andrew Ng's Coursera: Machine Learning. This course is well known in the field, and Ng, with his solid theoretical grounding and excellent presentation skills, makes many complex algorithms accessible and practical. The accompanying assignments are also of high quality, helping you get started quickly.
However, completing this course will only give you a general understanding of the field of machine learning. To truly understand the mathematical principles behind these "magical" algorithms or to engage in related research, you need a more "mathematical" course, such as Stanford CS229: Machine Learning or UCB CS189: Introduction to Machine Learning.
Deep Learning
The popularity of AlphaGo a few years ago brought deep learning to the public eye, leading many universities to establish related majors. Many other areas of computer science also use deep learning technology for research, so regardless of your field, you will likely encounter some needs related to neural networks and deep learning. For a quick introduction, I again recommend Andrew Ng's Coursera: Deep Learning, a top-rated course on Coursera. Additionally, if you find English-language courses challenging, consider Professor Hung-yi Lee's course National Taiwan University: Machine Learning. Although titled "Machine Learning," this course covers almost all areas of deep learning and is very comprehensive, making it suitable for getting a broad overview of the field. The professor is also very humorous, with frequent witty remarks in class.
Due to the rapid development of deep learning, there are now many research branches. For further in-depth study, consider the following representative courses:
The course map above inevitably carries strong personal preferences and may not suit everyone. It is more intended to serve as a starting point for exploration. If you want to select your own areas of interest for study, you can refer to the following resources:
MIT OpenCourseWare: MIT's open-sharing project for course resources, featuring thousands of courses from various disciplines, including computer science courses numbered 6.xxx.
UC Berkeley EECS Course Map: UC Berkeley's EECS curriculum plan, presenting the categories and prerequisites of various courses in a course map format, most of which are included in this book.
The English version is still under development. Please check this issue if you want to contribute.
This is a self-learning guide to computer science, and a memento of my three years of self-learning at university.
It is also a gift to the young students at Peking University. It would be a great encouragement and comfort to me if this book could be of even the slightest help to you in your college life.
The book is currently organized to include the following sections (if you have other good suggestions, or would like to join the ranks of contributors, please feel free to email zhongyinmin@pku.edu.cn or ask questions in the issue).
Productivity Toolkit: IDE, VPN, StackOverflow, Git, GitHub, Vim, LaTeX, GNU Make and so on.
Environment configuration: PC/Server development environment setup, DevOps tutorials and so on.
Book recommendations: Those who have read CSAPP must have realized the importance of good books. I will list links to books and resources in different areas of Computer Science that I find rewarding to read.
List of high quality CS courses: I will summarize all the high quality foreign CS courses I have taken into different categories and give relevant self-learning advice. Most of them will have a separate repository containing relevant resources as well as my homework/project implementations.
The place where dreams start: CS61A
In my freshman year, I was a novice who knew nothing about computers. I installed a giant IDE, Visual Studio, and fought with the OJ (online judge) every day. With my high school maths background, I did pretty well in maths courses, but I struggled with the courses in my major. When it came to programming, all I could do was open up that clunky IDE, create a new project whose purpose I didn't really understand, and then write cin, cout and for loops, followed by endless loops of CE, RE and WA verdicts. I was desperately trying to learn well but didn't know how to learn. I listened carefully in class but couldn't solve the homework problems. I spent almost all my spare time on homework, but the results were disappointing. I still keep the source code of my project for the Introduction to Computing course: a single 1200-line C++ file with no header files, no class abstraction, no unit tests, no makefile and no version control. Its only merit is that it runs; its flaws are everything other than "it runs". For a while I wondered whether I was cut out for computer science, as all my childhood imaginings of geekiness had been completely ruined by my first semester's experience.
It all turned around during the winter break of my freshman year, when I had a hankering to learn Python. I overheard someone recommend CS61A, a freshman introductory course at UC Berkeley on Python. I'll never forget that day, when I opened the CS61A course website. It was like Columbus discovering a new continent, and I opened the door to a new world.
I finished the course in 3 weeks and for the first time I felt that CS could be so fulfilling and interesting, and I was shocked that there existed such a great course in the world.
To avoid any suspicion of pandering to foreign courses, I will describe my experience of studying CS61A purely from a student's perspective.
Course website developed by course staff: The course website integrates all the course resources in one place, with a well-organised course schedule, links to all slides, recorded videos and homework, a detailed and clear syllabus, and a list of exams and solutions from previous years. Aesthetics aside, this website is extremely convenient for students.
Textbook written by course instructor: The course instructor has adapted the classic MIT textbook Structure and Interpretation of Computer Programs (SICP) into Python (the original textbook was based on Scheme). This is a great way to ensure that the classroom content is consistent with the textbook, while adding more details. The entire book is open source and can be read directly online.
Varied, comprehensive and interesting homework: There are 14 labs to reinforce the knowledge gained in class, 10 homework assignments for practice, and 4 projects, each with thousands of lines of code, all with well-organized skeleton code and step-by-step instructions. Unlike old-school OJ and Word document assignments, each lab, homework and project comes with a detailed handout and fully automated grading scripts, and the CS61A staff have even developed an automated assignment submission and grading system. Of course, one might ask, "How much can you learn from a project where most of the code is written by your teaching assistants?" For someone who is new to CS and still stumbling over installing Python, this well-developed skeleton code lets students focus on reinforcing the core knowledge they have learned in class, and it also gives them the sense of achievement of having built a small game after only a month of learning Python. It also gives them the opportunity to read and learn from other people's high-quality code so that they can reuse it later. I think that in freshman year this kind of skeleton code is absolutely beneficial. The only downside is perhaps for the instructors and teaching assistants, as developing such assignments requires a considerable time commitment.
Weekly discussion sessions: The teaching assistants explain the difficult material from class and add supplementary material that may not be covered in lectures. There are also exercises from previous years' exams. All the exercises are typeset in LaTeX and come with solutions.
In CS61A, you don't need any CS prerequisites at all. You just need to pay attention, spend time and work hard. The feeling that you do not know what to do, that you are getting nothing in return for all the time you put in, is gone. It suited me so well that I fell in love with self-learning.
Imagine someone chewing up the hard material and presenting it to you in a vivid and straightforward way, with so many fancy and varied projects to reinforce the theory. You would feel that they were really doing their best to help you master the course, and that not learning it well would be an insult to the people who built it.
If you think I'm exaggerating, start with CS61A, because it's where my dreams began.
Why write this book?
In the Fall 2020 semester, I worked as a teaching assistant for the course Introduction to Computer Systems at Peking University. At that time, I had been studying entirely on my own for over a year, and I enjoyed this style of learning immensely. To share this joy, I made a CS Self-learning Materials List for the students in my seminar. It was purely on a whim at the time, as I wouldn't dare to encourage my students to skip classes and study on their own.
But after another year of maintenance, the list has become quite comprehensive, covering most of the courses in Computer Science, Artificial Intelligence and Software Engineering, and I have built separate repositories for each course, summarising the self-learning materials that I used.
In my last year of college, when I opened my curriculum book, I realized that it was already a subset of my self-learning list. By then, only two and a half years had passed since I started my self-learning journey. Then a bold idea came to mind: perhaps I could create a self-learning book and write down the difficulties I encountered and the joys I found during these years, to make it easier for students who might also enjoy self-learning to start their own journey.
If, in less than three years, you can build up the whole CS foundation, acquire relatively solid mathematical skills and coding ability, work through dozens of projects with thousands of lines of code, master at least C/C++/Java/JS/Python/Go/Rust and other mainstream programming languages, and gain a good understanding of algorithms, circuits, architectures, networks, operating systems, compilers, artificial intelligence, machine learning, computer vision, natural language processing, reinforcement learning, cryptography, information theory, game theory, numerical analysis, statistics, distributed systems, parallel computing, database systems, computer graphics, web development, cloud computing, supercomputing and so on, then I think you will be confident enough to choose the area you are interested in, and you will be quite competitive in both industry and academia.
I firmly believe that if you have read this far, you do not lack the ability or the commitment to learn CS well; you just need a good teacher to teach you a good course. I will try my best to pick such courses for you, based on my three years of experience.
Pros
For me, the biggest advantage of self-learning is that I can adjust the pace of learning entirely according to my own progress. For difficult parts, I can watch the videos over and over again, search Google and ask questions on StackOverflow until I have it all figured out. For the parts I master relatively quickly, I can play the videos at two or even three times the speed.
Another great thing about self-learning is that you can learn from different perspectives. I have taken core courses such as architecture, networking, operating systems, and compilers from different universities. Different instructors may have different views on the same material, which will broaden your horizons.
A third advantage of self-learning is that you do not need to go to class and sit through boring lectures.
Cons
Of course, as a big fan of self-learning, I have to admit that it has its disadvantages.
The first is the difficulty of communication. I'm actually a very keen questioner, and I like to follow up on every point I don't understand. But when you're facing a screen and you hear a teacher talking about something you don't understand, you can't go to the other end of the network and ask him or her for clarification. I try to mitigate this by thinking independently and making good use of Google, but it would be great to have a few friends to study with. You can refer to the README for more information on joining a community group.
The second is that these courses are basically all in English: the videos, the slides and the assignments. You may struggle at first, but I think it is a challenge that will be extremely rewarding once you overcome it. At the moment, as reluctant as I am, I have to admit that in computer science a lot of the high-quality documentation, forums and websites are in English.
The third, and I think the most difficult one, is self-discipline. Having no deadlines can sometimes be a really scary thing, especially as you go deeper and find that many foreign courses are quite difficult. You have to be self-driven enough to force yourself to settle down, read dozens of pages of project handouts, understand thousands of lines of skeleton code and endure hours of debugging. There are no credits, no grades, no teachers and no classmates, just one belief: that you are getting better.
Who is this book for?
As I said in the beginning, anyone who is interested in learning computer science on their own can refer to this book. If you already have some basic skills and are just interested in a particular area, you can pick and choose what you want to study. Of course, if you are a novice who knows nothing about computers, as I was back then, and are just beginning your college journey, I hope this book will be your cheat sheet for getting the knowledge and skills you need in the least amount of time. In a way, this book is more like a course search engine ordered according to my experience, helping you to take high-quality CS courses from the world's top universities without leaving home.
Of course, as an undergraduate student who has not yet graduated, I feel that I am in no position, and have no right, to preach one way of learning. I just hope that this material will help those who are also self-motivated and persistent to gain a richer, more varied and more satisfying college life.
Special thanks
I would like to express my sincere gratitude to all the professors who have made their courses public for free. These courses are the culmination of decades of their teaching careers, and they have chosen to selflessly make such high-quality CS education available to all. Without them, my university life would not have been as fulfilling and enjoyable. Many of the professors even replied with hundreds of words after I sent them a thank-you email, which touched me beyond words. They also constantly remind me that if you decide to do something, you should do it with all your heart and soul.
Want to join as a contributor?
There is a limit to how much one person can do, and this book was written by me under a heavy research schedule, so there are inevitably imperfections. In addition, as I work in the area of systems, many of the courses focus on systems, and there is relatively little content related to advanced mathematics, computing theory, and advanced algorithms. If any of you would like to share your self-learning experience and resources in other areas, you can directly initiate a Pull Request in the project, or feel free to contact me by email (zhongyinmin@pku.edu.cn).
This course, taught by Professor Onur Mutlu, is an in-depth treatment of computer architecture. It appears to be the advanced follow-up to DDCA, and it aims to teach how to design the control and data path hardware for a MIPS-like processor, how to execute machine instructions concurrently through pipelining and simple superscalar execution, and how to design fast memory and storage systems. According to student feedback, the course is harder than CS61C at the very least, and some of its content is cutting-edge. Uploaders on Bilibili recommend it as a supplement to Carnegie Mellon University's 18-447. The reading materials provided are extensive, comparable to attending a semester's worth of lectures.
The official website description is as follows:
"We will learn the fundamental concepts of the different parts of modern computing systems, as well as the latest major research topics in Industry and Academia. We will extensively cover memory systems (including DRAM and new Non-Volatile Memory technologies, memory controllers, flash memory), new paradigms like processing-in-memory, parallel computing systems (including multicore processors, coherence and consistency, GPUs), heterogeneous computing, interconnection networks, specialized systems for major data-intensive workloads (e.g., graph analytics, bioinformatics, machine learning), etc. We will focus on fundamentals as well as cutting-edge research. Significant attention will be given to real-life examples and tradeoffs, as well as critical analysis of modern computing systems."
The programming practice uses Verilog to design and simulate a register-transfer (RT) level implementation of a MIPS-like pipelined processor, which reinforces the material from the lectures. The first few labs therefore involve programming a CPU pipeline in Verilog. Students also develop a cycle-accurate processor simulator in C and use it to explore processor design options.
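To give a feel for the kind of quantity such a simulator measures, here is a toy cycle-count model in Python. It is only a sketch for illustration: the course itself uses Verilog and C, and the function names, the five-stage default and the stall parameter are assumptions, not part of the course materials. It assumes every stage takes exactly one cycle, so an ideal k-stage pipeline finishes n instructions in k + n - 1 cycles, compared with k * n cycles when each instruction runs to completion before the next one starts.

```python
def pipelined_cycles(num_instructions, num_stages=5, stall_cycles=0):
    """Cycles needed by a simple in-order pipeline (hypothetical model).

    The first instruction needs num_stages cycles to pass through the
    pipeline; each later instruction completes one cycle after the
    previous one. stall_cycles is the total number of bubbles inserted
    because of hazards.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


def unpipelined_cycles(num_instructions, num_stages=5):
    """Cycles needed when each instruction occupies all stages by itself."""
    return num_stages * num_instructions


if __name__ == "__main__":
    n = 100
    fast = pipelined_cycles(n)
    slow = unpipelined_cycles(n)
    print(f"pipelined: {fast} cycles, unpipelined: {slow} cycles, "
          f"speedup: {slow / fast:.2f}x")
```

A real cycle-accurate simulator tracks the state of every stage on every cycle instead of using a closed formula, but the formula is a useful sanity check for its output on hazard-free code.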
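First, learn how to write "elegant" code. Many first-year introductory programming courses in China are taught as extremely dull syntax classes, which are less effective than simply having students read the official documentation. In fact, when students first encounter programming, it is very beneficial to have them try to understand what kind of code is elegant and what kind of code "has bad taste". Introductory courses generally start with procedural programming (in C, for example). But even in procedural programming, the ideas of modularity and encapsulation are extremely important. If you only care about getting your code accepted on OpenJudge and take shortcuts with large blocks of copy-pasted code and a bloated main function, your code quality will stay that way. Once you work on a slightly larger project, endless debugging and the cost of communication and maintenance will overwhelm you. So keep asking yourself as you write: Is there a lot of duplicated code? Is the current function too complex (Linux advocates that each function do just one thing well)? Can this piece of code be abstracted into a function? At first you may find this uncomfortable, or even wonder whether such a simple problem deserves so much effort. But remember that good habits are priceless. Even a middle school student can learn C, so why should a company hire you as a programmer?
After procedural programming, the second semester of freshman year usually covers object-oriented programming (in C++ or Java, for example). Here I highly recommend the notes of MIT 6.031: Software Construction, which use Java to explain in great detail how to write "elegant" code, covering test-driven development, the design of function specifications, exception handling and much more. Beyond that, since you are now working with object orientation, it is also worth learning some common design patterns. Object-oriented courses in China likewise tend to turn into extremely dull syntax classes that have students agonize over all kinds of inheritance syntax, or even set pointless brain-teaser questions, even though these things are almost never used in real development. The essence of object-oriented programming is teaching students to abstract real problems into classes and the relationships between them, and design patterns are the distilled abstraction techniques that earlier developers have summarized. Here I recommend the book Big Talk Design Patterns (大话设计模式), which is written in a very accessible way.
Second, try to learn tools and skills that boost productivity, such as Git, Shell and Vim. Here I strongly recommend the MIT missing semester course. These tools may feel awkward when you first use them, but force yourself to keep using them, and once you are fluent your development efficiency will shoot up. Many other applications can also greatly improve your productivity. One rule of thumb: any operation that requires your hands to leave the keyboard should be eliminated if possible. Switching applications, opening files and browsing the web, for example, can all be done with shortcuts provided by plugins or tools (such as Alfred on Mac). If you notice an operation that you perform every day and that takes more than 1 second, find a way to cut it to 0.1 seconds. After all, you will be working with computers for decades to come, so building a smooth workflow pays off many times over. Finally, learn to touch type! If you still need to look at the keyboard while typing, find a tutorial online and learn touch typing right away; it will greatly improve your development efficiency.
Third, balance coursework and self-study. We question the status quo, but we also have to play by the rules; after all, GPA still matters a great deal for recommendation to graduate school (保研). So in freshman year I still suggest following your own course schedule as much as possible, supplemented by some high-quality outside resources. For calculus and linear algebra, for example, you can refer to the course notes of MIT 18.01/18.02 and MIT 18.06. During the holidays you can learn Python through UCB CS61A. At the same time, do what the first and second points above describe: focus on building good programming habits and practical skills. In my personal experience, math courses account for a large share of freshman credits, and the content of math exams varies widely, since different schools and different teachers have very different styles. Self-study may help you grasp the essence of mathematics, but it will not necessarily earn you a good grade. So before exams it is best to work through past papers in a targeted way and prepare thoroughly for the test.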
As the number of contributors grows, the content of this book keeps expanding. It is impractical and unnecessary to try to complete all the courses in the book. Attempting to do so might even be counterproductive, resulting in effort without reward. To better align with our readers and make this book truly useful for you, I have roughly divided readers into the following three categories based on their needs. Everyone can plan their own self-study program accurately according to their actual situation.
Freshmen
If you have just entered the university or are in the lower grades, and you are studying or planning to switch to computer science, then you are lucky. As studying is your main task, you have ample time and freedom to learn what you are interested in without the pressure of work and daily life. You needn't be overly concerned with utilitarian thoughts like "is it useful" or "can it help me find a job". So, how should you arrange your studies? The first point is to break away from the passive learning style formed in high school. As a small-town problem solver, I know that most Chinese high schools fill every minute of your day with tasks, and you just need to passively follow the schedule. As long as you are diligent, the results won’t be too bad. However, once you enter university, you have much more freedom. All your extracurricular time is yours to use, and no one will organize knowledge points or summarize outlines for you. Exams are not as formulaic as in high school. If you still hold the mentality of a "good high school student", following everything step by step, the results may not be as expected. The professional training plan may not be reasonable, the teaching may not be responsible, attending classes may not guarantee understanding, and even the exam content may not relate to what was taught. Jokingly, you might feel that the whole world is against you, and you can only rely on yourself.
Given this reality, if you want to change it, you must first survive and have the ability to question it. In the lower grades, it’s important to lay a solid foundation. This foundation is comprehensive, covering both in-class knowledge and practical skills, which are often lacking in China's undergraduate computer science education. Based on personal experience, I offer the following suggestions for your reference.
First, learn how to write "elegant" code. Many programming introductory courses in China can be extremely boring syntax classes, less effective than reading official documentation. Initially, letting students understand what makes code elegant and what constitutes "bad taste" is beneficial. Introductory courses usually start with procedural programming (like C language), but even here, the concepts of modularity and encapsulation are crucial. If you write code just to pass on OpenJudge, using lengthy copy-pasting and bloated main functions, your code quality will remain poor. For larger projects, endless debugging and maintenance costs will overwhelm you. So, constantly ask yourself, is there a lot of repetitive code? Is the current function too complex (Linux advocates each function should do only one thing)? Can this code be abstracted into a function? Initially, this may seem cumbersome for simple problems, but remember, good habits are invaluable. Even middle school students can master C language, so why should a company hire you as a software engineer?
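As a small illustration of the questions above, the sketch below contrasts copy-pasted logic with the same logic pulled into one function. It uses Python for brevity; the score lists and the function name `average` are made up for this example.

```python
# Before: the same computation copy-pasted for every data set.
math_scores = [90, 85, 77]
physics_scores = [60, 72, 88]

math_total = 0
for s in math_scores:
    math_total += s
math_avg = math_total / len(math_scores)

physics_total = 0
for s in physics_scores:
    physics_total += s
physics_avg = physics_total / len(physics_scores)


# After: one small function that does exactly one thing.
def average(scores):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


# Both versions agree, but only one of them has to be debugged.
assert average(math_scores) == math_avg
assert average(physics_scores) == physics_avg
```

If a third subject is added later, the "before" style needs another four lines and another chance for a typo, while the "after" style needs a single call.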
After procedural programming, the second semester of the freshman year usually introduces object-oriented programming (like C++ or Java). I highly recommend the course notes of MIT 6.031: Software Construction, which use Java (switched to TypeScript after 2022) to explain in detail how to write “elegant” code, including test-driven development, function specification design, exception handling, and more. It is also worth learning the common design patterns when you study object-oriented programming. Domestic object-oriented courses can easily become dull syntax classes that dwell on inheritance syntax and trick questions, even though these are rarely used in real-world development. The essence of object-oriented programming is teaching students to abstract real problems into classes and their relationships, and design patterns are the distilled form of these abstractions. I recommend the book "Big Talk Design Patterns", which is very easy to understand.
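To make "function specification design" and "test-driven development" a little more concrete, here is a minimal sketch. It is written in Python rather than the Java/TypeScript used by the course, and the function `clamp` and its tests are hypothetical examples, not material from 6.031.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high].

    Precondition: low <= high (ValueError is raised otherwise).
    Postcondition: returns low if value < low, high if value > high,
    and value itself otherwise.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


# Tests derived from the specification alone, one per input partition.
def test_clamp():
    assert clamp(5, 0, 10) == 5    # inside the interval
    assert clamp(-3, 0, 10) == 0   # below the interval
    assert clamp(42, 0, 10) == 10  # above the interval
    assert clamp(7, 7, 7) == 7     # degenerate interval
    try:
        clamp(1, 10, 0)            # violates the precondition
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


test_clamp()
```

The point of the exercise is the order: the docstring and the tests can be written before the one-line body, and they stay valid if the body is later rewritten.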
Second, try to learn some productivity-enhancing tools and skills, such as Git, Shell, and Vim. I strongly recommend the MIT Missing Semester course. You may feel awkward at first, but force yourself to use these tools and your development efficiency will skyrocket. Many applications can also greatly increase your productivity. A rule of thumb: any action that requires your hands to leave the keyboard should be eliminated. For example, switching applications, opening files, and browsing the web can all be driven by plugins (like Alfred for Mac). If you find a daily operation that takes more than 1 second, try to reduce it to 0.1 seconds. After all, you'll be dealing with computers for decades, so forming a smooth workflow can greatly enhance efficiency. Lastly, learn to touch type! If you still need to look at the keyboard while typing, find a tutorial online and learn to type without looking. This will significantly increase your development efficiency.
Third, balance coursework and self-learning. We may question the status quo, but we still have to follow the rules, as GPA is still important for postgraduate recommendations. Therefore, in the first year, I suggest mostly following your curriculum, complemented by high-quality extracurricular resources. For example, for calculus and linear algebra, refer to the course notes of MIT 18.01/18.02 and MIT 18.06. During holidays, learn Python through UCB CS61A. Also, keep building the good programming habits and practical skills mentioned in the first two points. From my experience, mathematics courses carry a large share of the credits in the first year, and the content of math exams varies greatly between schools and teachers. Self-learning might help you understand the essence of mathematics, but it may not guarantee good grades. Therefore, before exams it is best to practice past papers in a targeted way.
In your sophomore year, as computer science courses become the majority, you can fully immerse yourself in self-learning. Refer to A Reference Guide for CS Learning, a guide I created based on three years of self-learning, which introduces each course and its importance. For every course in your curriculum, this guide should have a corresponding one, and I believe they are of higher quality. If there are course projects, try to adapt labs or projects from these self-learning courses. For example, I took an operating systems course and found the teacher was still using experiments long abandoned by UC Berkeley, so I emailed the teacher to switch to the MIT 6.S081 xv6 Project I was studying. This allowed me to self-learn while inadvertently promoting curriculum reform. In short, be flexible. Your goal is to master knowledge in the most convenient and efficient way. Anything that contradicts this goal can be “fudged” as necessary. With this attitude, from my junior year onward I barely attended offline classes (I had already spent most of my sophomore year at home due to the pandemic), and it had no impact on my GPA.
Finally, I hope everyone can be less impetuous and more patient in their pursuit. Many ask if self-learning requires strong self-discipline. It depends on what you want. If you still hold the illusion that mastering a programming language will earn you a high salary and a share of the internet’s profits, then whatever I say is pointless. Initially, my motivation was out of pure curiosity and a natural desire for knowledge, not for utilitarian reasons. The process didn't involve “extraordinary efforts”; I spent my days in college as usual and gradually accumulated this wealth of materials. Now, as the US-China confrontation becomes a trend, we still humbly learn techniques from the West. Who will change this? You, the newcomers. So, go for it, young man!
Simplify the Complex
If you have graduated and started postgraduate studies, or have begun working, or are in another field and want to learn coding in your spare time, you may not have enough time to systematically complete the materials in A Reference Guide for CS Learning, but still want to fill the gaps in your undergraduate foundation. Since these readers usually have some programming experience, there is no need to repeat introductory courses. From a practical standpoint, since the general direction of work is already determined, there is no need to deeply study every branch of computer science. Instead, focus on general principles and skills. Based on my own experience, I've selected the most important and highest quality core professional courses to deepen readers' understanding of computer science. After completing these courses, regardless of your specific job, I believe you won't just be an ordinary coder, but will have a deeper understanding of the underlying logic of computers.
If you have a solid grasp of the core professional courses in computer science and have already determined your work or research direction, then there are many courses in the book not mentioned in A Reference Guide for CS Learning for you to explore.
As the number of contributors increases, new branches such as Advanced Machine Learning and Machine Learning Systems will be added to the navigation bar. Under each branch, there are several similar courses from different schools with different emphases and experiments, such as the Operating Systems branch, which includes courses from MIT, UC Berkeley, Nanjing University, and Harbin Institute of Technology. If you want to delve into a field, studying these similar courses will give you different perspectives on similar knowledge. Additionally, I plan to contact researchers in related fields to share research learning paths in specific subfields, enhancing the depth of the CS Self-learning Guide while pursuing breadth.
If you want to contribute in this area, feel free to contact the author via email zhongyinmin@pku.edu.cn.
Functionally, GitHub is an online platform for hosting code. You can host your local Git repositories on GitHub so that a group can develop and maintain them collaboratively. However, GitHub's significance has evolved far beyond that. It has become a very active and resource-rich open-source community. Developers from all over the world share a wide variety of open-source software on GitHub. From industrial-grade deep learning frameworks like PyTorch and TensorFlow to practical scripts consisting of just a few lines of code, GitHub offers hardcore knowledge sharing and beginner-friendly tutorials, and even many technical books are open-sourced here (like the one you're reading now). Browsing GitHub has become a part of my daily life.
On GitHub, stars are the ultimate affirmation for a project. If you find this book useful, you are welcome to enter the repository's homepage via the link in the upper right corner and give your precious star✨.
How to Use GitHub
If you have never created your own remote repository on GitHub or cloned someone else's code, I suggest you start your open-source journey with GitHub's official tutorial.
If you want to keep up with some interesting open-source projects on GitHub, I highly recommend the HelloGitHub website. It regularly features GitHub's recently trending or very interesting open-source projects, giving you the opportunity to access various quality resources firsthand.
I believe GitHub's success is due to the "one for all, all for one" spirit of open source and the joy of sharing knowledge. If you also want to become the next revered open-source giant or the author of a project with tens of thousands of stars, then transform your ideas that spark during development into code and showcase them on GitHub.
However, it's important to note that the open-source community is not lawless. Much open-source software is not meant for arbitrary copying, distribution, or even sale. Understanding the various open-source licenses and complying with them is not only a legal requirement but also the responsibility of every member of the open-source community.
\ No newline at end of file
diff --git a/en/必学工具/Latex/index.html b/en/必学工具/Latex/index.html
new file mode 100644
index 00000000..e9769116
--- /dev/null
+++ b/en/必学工具/Latex/index.html
@@ -0,0 +1,35 @@
+ LaTeX - csdiy.wiki
If you need to write academic papers, please skip directly to the next section, as learning LaTeX is not just a choice but a necessity.
LaTeX is a typesetting system based on TeX, developed by Turing Award winner Lamport, while TeX was originally developed by Knuth, both of whom are giants in the field of computer science. Of course, the developers' prowess is not the reason we learn LaTeX. The biggest difference between LaTeX and the commonly used WYSIWYG (What You See Is What You Get) Word documents is that in LaTeX, users only need to focus on the content of the writing, leaving the typesetting entirely to the software. This allows people without any typesetting experience to produce papers or articles with highly professional formatting.
Berkeley computer science professor Christos Papadimitriou once jokingly said:
Every time I read a LaTeX document, I think, wow, this must be correct!
How to Learn LaTeX
The recommended learning path is as follows:
Setting up the LaTeX environment can be a headache. If you encounter problems with configuring LaTeX locally, consider using Overleaf, an online LaTeX editor. The site not only offers a variety of LaTeX templates to choose from but also eliminates the difficulty of environment setup.
The best way to learn LaTeX is, of course, by writing papers. However, starting with a math class and using LaTeX for homework is also a good choice.
Other recommended introductory materials include:
A brief guide to installing LaTeX [GitHub] or the TEX Live Guide (texlive-zh-cn) [PDF] can help you with installation and environment setup.
A (not so) brief introduction to LaTeX2ε (lshort-zh-cn) [PDF] [GitHub], translated by the CTEX development team, helps you get started quickly and accurately. It's recommended to read it thoroughly.
Liu Haiyang's "Introduction to LaTeX" can be used as a reference book, to be consulted when you have specific questions. Skip the section on CTEX suite.
Setting up a development environment in Windows has always been a complex and challenging task. The lack of a unified standard means that the installation methods for different development environments vary greatly, resulting in unnecessary time costs. Scoop helps you uniformly install and manage common development software, eliminating the need for manual downloads, installations, and environment variable configurations.
For example, to install Python and Node.js, you just need to execute:
scoop install python
scoop install nodejs
-
安装 Scoop
Scoop 需要 Windows PowerShell 5.1 或者 PowerShell 作为运行环境,如果你使用的是 Windows 10 及以上版本,Windows PowerShell 是内置在系统中的。而 Windows 7 内置的 Windows PowerShell 版本过于陈旧,你需要手动安装新版本的 PowerShell。
由于发现很多同学在设置 Windows 用户时使用了中文用户名,导致了用户目录也变成了中文名。如果按照 Scoop 的默认方式将软件安装到用户目录下,可能会造成部分软件执行错误。所以这里推荐安装到自定义目录,如果需要其他安装方式请参考: ScoopInstaller/Install
# 设置 PowerShell 执行策略
+
Installing Scoop
Scoop requires Windows PowerShell 5.1 or PowerShell as its runtime environment. If you are using Windows 10 or later, Windows PowerShell is built into the system. However, the version of Windows PowerShell built into Windows 7 is outdated, and you will need to manually install a newer version of PowerShell.
Many students have encountered issues due to setting up Windows user accounts with Chinese usernames, leading to user directories also being named in Chinese. Installing software via Scoop into user directories in such cases may cause some software to execute incorrectly. Therefore, it is recommended to install in a custom directory. For other installation methods, please refer to: ScoopInstaller/Install
# Set PowerShell execution policy
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
-# 下载安装脚本
+# Download the installation script
irm get.scoop.sh -outfile 'install.ps1'
-# 执行安装, --ScoopDir 参数指定 Scoop 安装路径
+# Run the installation; the -ScoopDir parameter specifies the Scoop installation path
.\install.ps1 -ScoopDir 'C:\Scoop'
-
Scoop's official documentation is very user-friendly for beginners. Instead of elaborating here, it is recommended to read the official documentation or the Quick Start guide.
Q&A
Can Scoop Configure Mirror Sources?
The Scoop community only maintains installation configurations, and all software is downloaded from the official download links provided by the software's creators. Therefore, mirror sources are not provided. If your network environment causes repeated download failures, you may need a bit of magic.
Why Can't I Find Java 8?
For the same reasons mentioned above, the official download links for Java 8 are no longer provided. It is recommended to use ojdkbuild8 as a substitute.
How Do I Install Python 2?
For software that is outdated and no longer in use, the Scoop community removes it from ScoopInstaller/Main and adds it to ScoopInstaller/Versions. If you need such software, you need to manually add the bucket:
2022年,我本科毕业了。在开始动手写毕业论文的时候,我尴尬地发现,我对 Word 的掌握程度仅限于调节字体、保存导出这些傻瓜功能。曾想转战 Latex,但论文的段落格式要求调整起来还是用 Word 更为方便,经过一番痛苦缠斗之后,总算是有惊无险地完成了论文的写作和答辩。为了不让后来者重蹈覆辙,遂把相关资源整理成一份开箱即用的文档,供大家参考。
学习 Word 排版:到达这一步的童鞋分为两类,一是已经拥有了学院提供的标准模版,二是只有一份虚无缥缈的格式要求。那现在当务之急就是学习基础的 Word 排版技术,对于前者可以学会使用模版,对于后者则可以学会制作模版。此时切记不要雄心勃勃地选择一个十几个小时的 Word 教学视频开始头悬梁锥刺股,因为生产一份应付毕业的学术垃圾只要学半小时能上手就够了。我当时看的一个 B 站的教学视频,短小精悍非常实用,全长半小时极速入门。
In 2022, I graduated from my college. When I started writing my thesis, I embarrassingly realized that my command of Word was limited to basic functions like adjusting fonts and saving documents. I considered switching to LaTeX, but formatting requirements for the thesis were more conveniently handled in Word. After a painful struggle, I finally completed the writing and defense of my thesis. To prevent others from following in my footsteps, I compiled relevant resources into a ready-to-use document for everyone's reference.
How to Write a Graduation Thesis in Word
Just as it takes three steps to put an elephant in a fridge, writing a graduation thesis in Word also requires three simple steps:
Determine the Format Requirements of the Thesis: Usually, colleges will provide the formatting requirements for theses (font and size for headings, sections, formatting of figures and citations, etc.), and if you're lucky, they might even provide a thesis template (if so, jump to the next step). Unfortunately, my college did not issue standard format requirements and provided a chaotic and almost useless template. Out of desperation, I found the thesis format requirements of Peking University graduate students and created a template based on their guidelines. Feel free to use it, but I take no responsibility for any issues arising from its use.
Learn Word Formatting: At this stage, you either have a standard template provided by your college or just a vague set of formatting requirements. Now, the priority is to learn basic Word formatting skills. If you have a template, learn to use it; if not, learn to create one. Remember, there's no need to ambitiously start with a Word tutorial video that runs for more than ten hours. Half an hour of learning is enough to produce a paper that gets you through graduation. I watched a concise and practical Bilibili tutorial video, about half an hour long, which is very useful for a quick start.
Produce Academic Work: The easiest step. Everyone has their own way, so unleash your creativity. Best wishes for a smooth graduation!
MSDN, I Tell You: A site for downloading Windows OS images and other software.
Design Tools
excalidraw: A hand-drawn style drawing tool, great for creating diagrams in course reports or PPTs.
tldraw: A drawing tool suitable for flowcharts, architecture diagrams, etc.
draw.io: A powerful and concise online drawing website that supports flowcharts, UML diagrams, architecture diagrams, prototypes, etc., with export options for OneDrive, Google Drive, and GitHub, plus an offline client.
GitHub: The hosting platform for many open-source projects and also their main communication channel; browsing a project's issues can solve many problems.
StackExchange: A programming community composed of 181 Q&A communities (including Stack Overflow).
StackOverflow: An IT technical Q&A site related to programming.
Gitee: A code hosting platform similar to GitHub, where you can find solutions to common questions in the issues of corresponding projects.
Zhihu: A Q&A community similar to Quora, where you can ask questions, with some answers containing computer knowledge.
Cnblogs: A knowledge-sharing community for developers, containing blogs on common questions. Accuracy is not guaranteed, please use with caution.
CSDN: Contains blogs on common questions. Accuracy is not guaranteed, please use with caution.
Miscellaneous
tophub: A collection of trending news headlines (aggregating from Zhihu, Weibo, Baidu, WeChat, etc.).
虽然一手信息很重要,但后面的 N 手信息并非一无是处,因为这 N 手资料里包含了作者对源知识的转化——例如基于某种逻辑的梳理(流程图、思维导图等)或是一些自己的理解(对源知识的抽象、类比、延伸到其他知识点),这些转化可以帮助我们更快地掌握和巩固知识的核心内容,就如同初高中学习时使用的辅导书。 此外,学习的过程中和别人的交流十分重要,这些 N 手信息同时起了和其他作者交流的作用,让我们能采百家之长。所以这提示我们学习一个知识点时先尽量选择质量更高的,信息损失较少的信息源,同时不妨参考多个信息源,让自己的理解更加全面准确。
处理完文本类的信息后,我们还得思考一下怎么处理多媒体类的信息。此处的多媒体我特指英文视频,因为我没有用播客或录音学习的习惯,而且我已经基本不看中文教程了。现在很多国外名校公开课都是以视频的形式,如果能对视频进行做笔记会不会有帮助呢?不知道大家有没这样的想法,就是如果能把老师上课讲的内容转换成文本就好了,因为平时学习时我们看书的速度往往会比老师讲课的速度快。刚好 Language Reactor 这个软件可以将油管和网飞内视频的字幕导出来,同时附上中文翻译。
我们可以把 Language Reactor 导出的字幕复制到 Obsidian 里面作为文章来读。除了出于学习的需求,也可以在平时看油管的视频时打开这个插件,这个插件可以同时显示中英文字幕,并且可以单击选中英文字幕中你认为生僻的单词后显示单词释义。
但阅读文本对于一些抽象的知识点来说并不是效率最高的学习方式。俗话说,一图胜千言,能不能将某一段知识点的文本和对应的图片甚至视频画面操作联系起来呢?我在浏览 Obsidian 的插件市场时,发现了一个叫 Media Extended 的插件,这个插件可以在你的笔记里添加跳转到视频指定时间进度的链接,相当于把你的笔记和视频连接起来了!这刚好可以和我上文提到的生成视频中英文字幕搭配起来,即每一句字幕对应一个时间,并且能根据时间点跳转到视频的指定进度,如此一来如果需要在文章中展示记录了操作过程的视频的话,就不需要自己去截取对应的视频片段,而是直接在文章内就能跳转!
Obsidian 里还有一个很强大的插件,叫 Annotator,它可以实现笔记内跳转到 pdf 原文
The field of computer science is vast and rapidly evolving, making lifelong learning crucial. However, our sources of knowledge in daily development and learning are complex and fragmented. We encounter extensive documentation manuals, brief blogs, and even snippets of news and public accounts on our phones that may contain interesting knowledge. Therefore, it's vital to use various tools to create a learning workflow that suits you, integrating these knowledge fragments into your personal knowledge base for easy reference and review. After two years of learning alongside work, I have developed the following learning workflow:
Core Logic
Initially, when learning new knowledge, I referred to Chinese blogs but often found bugs and gaps in my code practice. Gradually, I realized that the information I referred to might be incorrect, as the threshold for posting blogs is low and their credibility is not high. So, I started consulting some related Chinese books.
Chinese books indeed provide a comprehensive and systematic explanation of concepts. However, given the rapid evolution of computer technology and the US's leadership in CS, content in Chinese books often lags behind the latest knowledge. This led me to realize the importance of firsthand information. Some Chinese books are translations of English ones, and translation can take a year or two, causing a delay in information transmission and loss during translation. If a Chinese book is not a translation, it likely references other books, introducing biases in interpreting the original English text.
Therefore, I naturally started reading English books. The quality of English books is generally higher than that of Chinese ones. As I delved deeper into my studies, I discovered a hierarchy of information reliability: source code > official documentation > English books > English blogs > Chinese blogs. This led me to create an "Information Loss Chart":
Although firsthand information is crucial, subsequent iterations (N-th hand information) are not useless. They include the author's transformation of the source knowledge — such as logical organization (flow charts, mind maps) or personal interpretations (abstractions, analogies, extensions to other knowledge points). These transformations can help us quickly grasp and consolidate core knowledge, like using guidebooks in school. Moreover, interacting with others' interpretations during learning is important, allowing us to benefit from various perspectives. Hence, it's advisable to first choose high-quality, less distorted sources of information while also considering multiple sources for a more comprehensive and accurate understanding.
In real-life work and study, learning rarely follows a linear, deep dive into a single topic. Often, it involves other knowledge points, such as new jargon, classic papers not yet read, or unfamiliar code snippets. This requires us to think deeply and "recursively" learn, establishing connections between multiple knowledge points.
Choosing the Right Note-taking Software
The backbone of the workflow is built around the core logic of "multiple references for a single knowledge point and building connections among various points." This is similar to writing academic papers. Papers usually have footnotes explaining keywords and multiple references at the end. But our daily notes are much more casual, hence the need for a more flexible method.
I'm accustomed to jumping to related functions and implementations in an IDE. It would be great if notes could also be interlinked like code. Current "double-link note-taking software," such as Roam Research, Logseq, Notion, and Obsidian, addresses this need. I chose Obsidian for the following reasons:
Obsidian stores notes locally, opens quickly, and can hold many e-books. My laptop, an Asus TUF Gaming FX505 with 32GB of RAM, runs Obsidian very smoothly.
Obsidian is Markdown-based. This is an advantage because if a note-taking software uses a proprietary format, it's inconvenient for third-party extensions and opening notes with other software.
Obsidian has a rich and active plugin ecosystem, allowing for an "all in one" effect, meaning various knowledge sources can be integrated in one place.
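The "double-link" idea mentioned above can be sketched in a few lines: scan each Markdown note for `[[wikilinks]]` and invert them into a backlink table. This is only an illustration of the concept, not how Obsidian is implemented; the note titles and bodies are invented, and the regular expression handles just the common `[[Target]]`, `[[Target|alias]]` and `[[Target#heading]]` forms.

```python
import re
from collections import defaultdict

# Captures only the target note name, ignoring an optional #heading and |alias.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_backlinks(notes):
    """Map each linked note title to the sorted list of notes that link to it."""
    backlinks = defaultdict(set)
    for title, body in notes.items():
        for match in WIKILINK.finditer(body):
            target = match.group(1).strip()
            backlinks[target].add(title)
    return {target: sorted(sources) for target, sources in backlinks.items()}


# Hypothetical notes, for illustration only.
notes = {
    "Virtual Memory": "See [[Page Table]] and [[TLB|translation cache]].",
    "Page Table": "Used by [[Virtual Memory]]; cached in the [[TLB]].",
    "TLB": "A cache for [[Page Table#Entries]] lookups.",
}

for target, sources in sorted(build_backlinks(notes).items()):
    print(f"{target} <- {', '.join(sources)}")
```

Because the notes are plain Markdown files, a script like this can run over a whole folder of notes without any help from the note-taking software, which is part of the appeal of a non-proprietary format.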
Information Sources
Obsidian's plugins support PDF formats, and it naturally supports Markdown. To achieve "all in one," you can convert other file formats to PDF or Markdown. This presents two questions:
What formats are there?
How to convert them to PDF or Markdown?
Formats
File formats depend on their display platforms. Before considering formats, let's list the sources of information I usually access:

The main categories are articles, papers, e-books, and courses, primarily including formats like web pages, PDFs, MOBI, AZW, and AZW3.
Conversion to PDF or Markdown
Online articles and courses are mostly presented as web pages. To convert web pages to Markdown, I use the clipping software "Simplified Read," which can clip articles from nearly all platforms into Markdown and import them into Obsidian.
For papers and e-books, if the format is already PDF, it's straightforward. Otherwise, I use Calibre for conversion:
Now, using Obsidian's PDF plugin and native Markdown support, I can seamlessly take notes and reference across these documents (see "Information Processing" below for details).
Managing Information Sources
For file resources like PDFs, I use local or cloud storage. For web resources, I categorize and save them in browser bookmarks or clip them into Markdown notes. However, browser bookmarks do not work well for saving pages on mobile. To enable cross-platform web bookmarking, I use Cubox. With a swipe on my phone, I can save interesting web pages in one place. Although the free version limits you to 100 bookmarks, that is usually sufficient and nudges me to process these pages promptly.
Moreover, many of the web pages we bookmark are not from fully-featured blog platforms like Zhihu or Juejin but personal sites without mobile apps. These can be easily overlooked in browser bookmarks, and we might miss new article notifications. Here, RSS comes into play.
RSS (Really Simple Syndication, originally Rich Site Summary) is a type of web feed that allows users to access updates to online content in a standardized format. On desktops, RSSHub Radar helps discover and generate RSS feeds, which can be subscribed to using Feedly (both have official Chrome browser plugins).
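To show what "a standardized format" means in practice, here is a minimal sketch that reads an RSS 2.0 feed with Python's standard library. The feed below is a hand-written stand-in with made-up URLs; a real reader such as Feedly would download the XML from the site and handle many more fields and edge cases.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>
"""


def parse_items(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


for title, link in parse_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

Since every feed exposes the same `<item>` structure, one small parser works for any personal site that publishes RSS, which is why a single reader can follow all of them.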
With this, the information collection process is comprehensive. But no matter how well categorized, information needs to be internalized to be useful. After collecting information, the next step is processing it — reading, understanding the semantics (especially for English sources), highlighting key sentences or paragraphs, noting queries, brainstorming related knowledge points, and writing summaries. What tools are needed for this process?
Information Processing
English Sources
For English materials, I initially used "Youdao Dictionary" for word translation, Google Translate for sentences, and "DeepL" for paragraphs. Eventually, I realized this was too slow and inefficient. Ideally, a single tool would handle word, sentence, and paragraph translation. After researching, I chose "Quicker" + "Saladict" for translation.
This combo allows translation outside browsers and supports words, sentences, and paragraphs, offering results from multiple translation platforms. For non-urgent word lookups, the "Collins Advanced" dictionary is helpful as it explains English words in English, providing context to aid understanding.
Multimedia Information
After processing text-based information, it's important to consider how to handle multimedia information. Specifically, I'm referring to English videos, as I don't have a habit of learning through podcasts or recordings and I rarely watch Chinese tutorials anymore. Many renowned universities offer open courses in video format. Wouldn't it be helpful if you could take notes on these videos? Have you ever thought it would be great if you could convert the content of a lecture into text, since we usually read faster than a lecturer speaks? Fortunately, the software Language Reactor can export subtitles from YouTube and Netflix videos, along with Chinese translations.
We can copy the subtitles exported by Language Reactor into Obsidian and read them as articles. Besides learning purposes, you can also use this plugin while watching YouTube videos. It displays subtitles in both English and Chinese, and you can click on unfamiliar words in the subtitles to see their definitions.
However, reading texts isn't always the most efficient way to learn about some abstract concepts. As the saying goes, "A picture is worth a thousand words." What if we could link a segment of text to corresponding images or even video operations? While browsing the Obsidian plugin marketplace, I discovered a plugin called Media Extended. This plugin allows you to add links in your notes that jump to specific times in a video, effectively connecting your notes to the video! This works well with the video subtitles mentioned earlier, where each line of subtitles corresponds to a time stamp, allowing for jumps to specific parts of the video. This means you don't have to cut specific video segments; instead, you can jump directly within the article!
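The subtitle-plus-timestamp idea above can be sketched as a small script that turns exported subtitle lines into Markdown links. Everything here is an assumption for illustration: the video URL and subtitle lines are made up, and the `#t=<seconds>` media-fragment form is only one plausible link syntax, so check the plugin's documentation for what it actually accepts.

```python
def to_seconds(timestamp):
    """Convert 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def subtitles_to_markdown(video_url, subtitles):
    """Turn (timestamp, text) pairs into Markdown list items with time-coded links."""
    lines = []
    for timestamp, text in subtitles:
        # Assumed link form: the video URL followed by a #t=<seconds> fragment.
        link = f"{video_url}#t={to_seconds(timestamp)}"
        lines.append(f"- [{timestamp}]({link}) {text}")
    return "\n".join(lines)


# Hypothetical video URL and subtitle lines, for illustration only.
subtitles = [
    ("00:05", "Welcome to the lecture."),
    ("01:30", "Let's look at page tables."),
    ("1:02:03", "That wraps up today's topic."),
]
print(subtitles_to_markdown("https://example.com/lecture.mp4", subtitles))
```

Each generated line keeps the subtitle text readable as an article while the timestamp doubles as a jump target into the video.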
Obsidian also has a powerful plugin called Annotator, which allows you to jump from notes to the corresponding section in a PDF.
Now, with Obsidian's built-in double-link feature, we can link notes to each other, and with the above plugins, we can extend these links to multimedia. This completes the information-processing stage. Learning often involves both a challenging ascent and a familiar descent. So, how can we incorporate the review process into this workflow?
Information Review
Obsidian already has a plugin that connects to Anki, the renowned spaced repetition-based memory software. With this plugin, you can export segments of your notes to Anki as flashcards, each containing a link back to the original note.
Conclusion
This workflow evolved over two years of learning in my spare time. Frustration with repetitive processes led to specific needs, which were fortunately met by tools I discovered online. Don't force tools into your workflow just for the sake of satisfaction; life is short, so focus on what's truly important.
By the way, this article discusses the evolution of the workflow. If you're interested in the details of how this workflow is implemented, I recommend reading the following articles in order after this one:
When encountering a problem, remember the first thing is to read the documentation. Don't start by searching online or asking others directly. Reviewing FAQs may quickly provide the answer.
Information retrieval, as I understand it, is essentially about skillfully using search engines to quickly find the information you need, including but not limited to programming.
The most important thing in programming is STFW (search the fucking web) and RTFM (read the fucking manual). First, you should read the documentation, and second, learn to search. With so many resources online, how you use them depends on your information retrieval skills.
To understand how to search effectively, we first need to understand how search engines work.
How Search Engines Work
The working process of a search engine can generally be divided into three stages [1]:
Crawling and Fetching: Search engine spiders visit web pages by tracking links, obtain the HTML code of the pages, and store it in a database.
Preprocessing: The indexing program processes the fetched web page data by extracting text, segmenting Chinese words, indexing, etc., preparing for the ranking program.
Ranking: When users enter keywords, the ranking program uses the indexed data to calculate relevance and then generates the search results page in a specific format.
The first step involves web crawlers, which Python courses often hype. A crawler is simply an automated program that downloads the text, images, and related information from websites and stores them locally.
The second step is the core of a search engine, but users do not need to understand it in detail. Roughly speaking, it cleans the fetched data and builds an index that maps keywords to pages so that they can be queried quickly.
The third step is closely related to us. Whether it's Google, Baidu, Bing, or others, you input keywords or queries, and the search engine returns results. This article teaches you how to obtain better results.
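To make the three stages concrete, here is a toy sketch in Python. It is only an illustration under strong assumptions, not how any real search engine works: the `PAGES` dictionary stands in for pages a crawler has already fetched, tokenization is a plain lowercase split with no Chinese word segmentation, and ranking simply counts how often the query words occur. All names are hypothetical.

```python
import re
from collections import defaultdict

# Stage 1 stand-in: pretend a crawler already fetched these pages (made-up data).
PAGES = {
    "a.html": "Install vcpkg and integrate vcpkg with CMake",
    "b.html": "CMake tutorial for beginners",
    "c.html": "Cooking pasta at home",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Stage 2: map each word to {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Stage 3: score pages by how often they contain the query words."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken by URL for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "vcpkg cmake"))  # ['a.html', 'b.html']
```

Because the engine only matches the words it finds in the index, a query made of precise keywords scores the right pages higher than a long sentence full of filler words.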
Basic Search Techniques
Given how search engines work, you can treat one as a smart database: better query conditions help you find the information you need faster. Here are some search techniques:
Use English
First, it's important to know that in programming, it's best to search in English. Reasons include:
In programming and various software operations, English resources are of higher quality than those in Chinese or other languages.
Due to translation issues, English terms are more accurate and universally applicable than Chinese.
Chinese search engines' word segmentation systems can lead to ambiguity. For example, Google searches in Chinese may not yield many useful results.
If your English is not strong, use translation tools like Baidu or Sogou; they are sufficient.
Refine Keywords
Don't search whole sentences. Although search engines automatically segment words, searching with whole sentences versus keywords can yield significantly different results in accuracy and order. Search engines are machines, not your teachers or colleagues. As mentioned above, searching is really querying a database built by the search engine's crawler, so it's better to break your question down into keywords or phrases.
For example, if you want to know how to integrate vcpkg into a project instead of globally, searching for the full sentence "如何将vcpkg集成到项目中而不是全局" ("how to integrate vcpkg into a project instead of globally") may not yield relevant results. It's better to break it down into keywords like "vcpkg 集成 项目 全局" ("vcpkg integrate project global").
Replace Keywords
If you can't find what you're looking for, try replacing "项目" ("project") with the synonym "工程", or remove "集成" ("integrate"). If that doesn't work, try advanced searching.
Advanced Searching
Most search engines support advanced searching, including Google, Bing, Baidu, Ecosia, etc. Common formats include:
Exact Match: Enclose the search term in quotes for precise matching.
Exclude Keywords: Use a minus sign (-) to exclude specific words.
Include Keywords: Use a plus sign (+) to ensure a keyword is included.
Search Specific File Types: Use filetype:pdf to search for PDF files directly.
Search Specific Websites: Use site:stackoverflow.com to search within a specific site.
Use GitHub's Advanced Search page or refer to GitHub Query Syntax for advanced searches on GitHub. Examples include searching by repository name, description, readme, stars, fork count, size, update/creation date, license, language, user, and organization. These can be used in combination.
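The operators above are just text conventions, so they can be combined into a single query string. The helper below is a hypothetical sketch (`build_query` is not part of any library), and support for each operator varies between search engines, so check the engine's own help page before relying on one.

```python
def build_query(keywords, exact=None, exclude=(), site=None, filetype=None):
    """Compose a search string from keywords and common search operators."""
    parts = list(keywords)
    if exact:
        parts.append(f'"{exact}"')                # exact match
    parts += [f"-{word}" for word in exclude]     # excluded words
    if site:
        parts.append(f"site:{site}")              # restrict to one website
    if filetype:
        parts.append(f"filetype:{filetype}")      # restrict to one file type
    return " ".join(parts)

query = build_query(
    ["vcpkg", "integrate"],
    exact="per project",
    exclude=["global"],
    site="stackoverflow.com",
)
print(query)  # vcpkg integrate "per project" -global site:stackoverflow.com
```

The resulting string can be pasted directly into the search box.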
More Tips
Depending on the context, I recommend specific sites for certain queries:
For language-specific queries (e.g., C++/Qt/OpenGL), add site:stackoverflow.com.
For specific business/development environments or software-related issues, first check BugLists, IssueLists, or relevant forums.
QQ groups are also a place to ask questions, but make sure your queries are meaningful.
Chinese platforms like Zhihu, Jianshu, Cnblogs (Blog Park), and CSDN have a wealth of Chinese notes and experiences.
About Baidu
Many programmers advise against using Baidu, preferring Google or Bing International. However, if you really need it, consider using alternatives like Ecosia or Yandex. For Chinese searches, Baidu might actually be the best option due to its database and indexing policies.
Code Search
In addition to search engines, you might also need to search for code, either your own or from projects. Here are some recommended tools:
Local Code Search
ACK or ACK2, well-established search tools written in Perl.
The Silver Searcher, implemented in C.
The Platinum Searcher, implemented in Go.
FreeCommander's built-in search, efficient on solid-state drives.
IDE's built-in search, though not always the most user-friendly.
Open Source Code Search
Searchcode for searching open source code, known for speed.
NJU OS: Operating System Design and Implementation
Course Introduction
University: Nanjing University
Prerequisites: Computer Architecture + Solid C programming skills
Programming Language: C
Course Difficulty: 🌟🌟🌟🌟
Estimated Study Time: 150 hours
I had always heard that the operating system course taught by Professor Yanyan Jiang at Nanjing University was excellent. This semester, I had the opportunity to watch his lectures on Bilibili and gained a lot. As a young professor with rich coding experience, his teaching is full of a hacker's spirit. Often in class, he would start coding in the command line on a whim, and many important points were illustrated with vivid and straightforward code examples. What struck me most was when he implemented a mini executable file and a series of binary tools to help students better understand the design philosophy of dynamic link libraries, solving many problems that had puzzled me for years.
In the course, Prof. Jiang starts from the perspective that "programs are state machines" to establish an explainable model for the "root of all evil", concurrent programs. Based on this, he discusses common methods of concurrency control and strategies for dealing with concurrency bugs. Then, he views the operating system as a series of objects (processes/threads, address spaces, files, devices, etc.) and the APIs that operate on them (system calls), and uses rich practical examples to show how operating systems use these objects to virtualize hardware resources and provide various services to application software. In the final part about persistence, he builds up various storage devices from 1-bit storage media and abstracts a set of interfaces through device drivers to facilitate the design and implementation of file systems. Although I have taken many operating system courses before, I have not seen this approach anywhere else, and it has given me many fresh perspectives on system software.
In addition to its innovative theoretical instruction, the course's emphasis on practice is a key feature of Prof. Jiang's teaching. In class and through programming assignments, he subtly cultivates the ability to read source code and consult manuals, which are essential skills for computer professionals. During the fifth MiniLab, I read Microsoft's FAT file system manual in detail for the first time, gaining a very valuable experience.
The programming assignments consist of 5 MiniLabs and 4 OSLabs. Unfortunately, the grading system is only open to students at Nanjing University. However, Professor Jiang generously allowed me to participate after I emailed him. I completed the 5 MiniLabs, and the overall experience was excellent. Particularly, the second coroutine experiment left a deep impression on me, where I experienced the beauty and "terror" of context switching in a small experiment of less than a hundred lines. Also, the MiniLabs can be easily tested locally, so the lack of a grading system should not hinder self-learning. Therefore, I hope others will not collectively "harass" the professor for access.
Finally, I want to thank Professor Jiang again for designing and offering such an excellent operating system course, the first independently developed computer course from a domestic university included in this book. It's thanks to young, new-generation teachers like Professor Jiang, who teach with passion despite the heavy pressure of tenure-track evaluation, that many students have an unforgettable undergraduate experience. I also look forward to more such high-quality courses in China, which I will include in this book for the benefit of more people.
This course has only been offered twice so far, in Fall 2013 and Spring 2022, and it discusses some cutting-edge topics in the field of databases. The Fall 2013 session covered topics like Streaming, Graph DB, NVM, etc., while the Spring 2022 session mainly focused on Self-Driving DBMS, with relevant papers provided.
The tasks for the Spring 2022 version of the course included:
Task One: Manual performance tuning based on PostgreSQL.
Task Two: Improving the Self-Driving DBMS based on NoisePage Pilot, with no limitations on features.
The teaching style is more akin to a seminar, with fewer programming assignments. This course can broaden the horizons for general students and may be particularly beneficial for those specializing in databases.
University: California Institute of Technology (Caltech)
Prerequisites: None
Programming Language: Java
Course Difficulty: 🌟🌟🌟🌟🌟
Estimated Study Time: 150 hours
Unlike CMU 15-445, whose labs do not cover the SQL layer, the labs of Caltech's CS122 focus on implementing the SQL layer. They cover various modules of a query optimizer, such as SQL parsing, translation, implementation of joins, statistics and cost estimation, subquery implementation, and the implementation of aggregations and group by operations. Additionally, there are experiments related to B+ trees and Write-Ahead Logging (WAL). This course is suitable for students who have completed CMU 15-445 and are interested in query optimization.
Below is an overview of the first three assignments or lab experiments of this course:
Assignment 1
Provide support for delete and update statements in NanoDB.
Add appropriate pin/unpin code to the Buffer Pool Manager.
Improve the performance of insert statements without excessively inflating the size of the database file.
Assignment 2
Implement a simple plan generator to convert various parsed SQL statements into executable plans.
Implement join plan nodes that support inner and outer joins using the nested-loop join algorithm.
Add unit tests to ensure the correct implementation of inner and outer joins.
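For readers unfamiliar with the nested-loop join mentioned above, the idea fits in a few lines. The sketch below is written in Python purely for illustration and is not NanoDB's API (NanoDB is a Java project with its own plan-node classes); rows are modeled as dictionaries, and the function and table names are hypothetical.

```python
def nested_loop_join(left, right, predicate, outer=False):
    """Join two lists of dict rows; with outer=True, perform a left outer join."""
    right_columns = sorted({col for row in right for col in row})
    result = []
    for l_row in left:                      # outer loop over the left table
        matched = False
        for r_row in right:                 # inner loop over the right table
            if predicate(l_row, r_row):
                result.append({**l_row, **r_row})
                matched = True
        if outer and not matched:
            # Left outer join: keep the unmatched left row and
            # pad the right-hand columns with None (SQL NULL).
            result.append({**dict.fromkeys(right_columns), **l_row})
    return result

students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Bob"}]
enrolled = [{"student_id": 1, "course": "CS122"}]

def same_student(student, enrollment):
    return student["sid"] == enrollment["student_id"]

inner_rows = nested_loop_join(students, enrolled, same_student)
outer_rows = nested_loop_join(students, enrolled, same_student, outer=True)
print(len(inner_rows), len(outer_rows))  # 1 2
```

The cost is proportional to the product of the two table sizes, which is why the later assignments on statistics and cost estimation matter when choosing a plan.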
Assignment 3
Complete the collection of table statistics.
Perform plan cost calculation for various plan nodes.
Calculate the selectivity of various predicates that may appear in the execution plan.
Update the tuple statistics of the plan nodes' outputs based on predicates.
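As a rough illustration of what selectivity estimation means, here is a Python sketch using common textbook heuristics: values are assumed to be uniformly distributed and predicates are assumed to be independent. The `ColumnStats` class and the function names are hypothetical, and the actual formulas required by the assignment may differ.

```python
from dataclasses import dataclass

@dataclass
class ColumnStats:
    """Hypothetical per-column statistics collected by scanning a table."""
    num_distinct: int
    min_value: float
    max_value: float

def eq_selectivity(stats):
    """col = constant: assume rows are spread evenly over the distinct values."""
    return 1.0 / stats.num_distinct if stats.num_distinct > 0 else 0.0

def lt_selectivity(stats, value):
    """col < value: fraction of the [min, max] range that lies below value."""
    width = stats.max_value - stats.min_value
    if width <= 0:
        return 1.0 if value > stats.min_value else 0.0
    fraction = (value - stats.min_value) / width
    return min(1.0, max(0.0, fraction))

def and_selectivity(s1, s2):
    """p1 AND p2, assuming the two predicates are independent."""
    return s1 * s2

def or_selectivity(s1, s2):
    """p1 OR p2, assuming the two predicates are independent."""
    return s1 + s2 - s1 * s2

# Estimate the output size of: WHERE age < 28 AND city = <some constant>
age = ColumnStats(num_distinct=50, min_value=18, max_value=68)
city = ColumnStats(num_distinct=10, min_value=0, max_value=9)
table_rows = 10_000
sel = and_selectivity(lt_selectivity(age, 28), eq_selectivity(city))
estimated_rows = table_rows * sel
print(round(estimated_rows))  # 200
```

The estimated row count of each plan node then feeds into the cost calculation of the nodes above it.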
For the remaining Assignments and Challenges, please refer to the course description. It is recommended to use IDEA to open the project and Maven for building, keeping in mind the log-related configurations.
RedBase, the project for CS346, involves the implementation of a simplified database system and is highly structured. The project can be divided into the following parts, which also correspond to the four labs that need to be completed:
The Record Management Component: This involves the implementation of record management functionalities.
The Index Component: Focuses on the management of B+ tree indexing.
The System Management Component: Deals with DDL statements, command-line tools, data loading commands, and metadata management.
The Query Language Component: In this part, students are required to implement RQL (the RedBase Query Language), including select, insert, delete, and update statements.
Extension Component: Beyond the basic components of a database system, students must implement an extension component, which could be a Blob type, network module, join algorithms, CBO optimizer, OLAP, transactions, etc.
RedBase is an ideal follow-up project for students who have completed CMU 15-445 and wish to learn other components of a database system. Due to its manageable codebase, it allows for convenient expansion as needed. Furthermore, as it is entirely written in C++, it also serves as good practice for C++ programming skills.
Course Resources: The course website includes slides, notes, videos, homework, and project materials.
CMU's course on Probabilistic Graphical Models, taught by Eric P. Xing, is a foundational and advanced course on graphical models. The curriculum covers the basics of graphical models, their integration with neural networks, applications in reinforcement learning, and non-parametric methods, making it a highly rigorous and comprehensive course.
For students with a solid background in machine learning, deep learning, and reinforcement learning, this course provides a deep dive into the theoretical and practical aspects of probabilistic graphical models. The extensive resources available on the course website make it an invaluable learning tool for anyone looking to master this complex and rapidly evolving field.
This course offers a rigorous blend of classical learning theory and the latest developments in deep learning theory, making it exceptionally challenging and comprehensive. Previously taught by Percy Liang, the course is now led by Tengyu Ma, ensuring a high level of expertise and insight into the theoretical aspects of machine learning.
The curriculum is designed for students with a solid foundation in machine learning, deep learning, and statistics, aiming to deepen their understanding of the underlying theoretical principles in these fields. This course is an excellent choice for anyone looking to gain a thorough understanding of both the traditional and contemporary theoretical approaches in machine learning.
"Minimizing Expectations" is an advanced Ph.D. level research course, focusing on the interplay between inference and control. The course is taught by Chris Maddison, a founding member of AlphaGo and a NeurIPS 2014 best paper awardee.
This course is notably challenging and is designed for students who have a strong background in Bayesian Inference and Reinforcement Learning. The curriculum explores deep theoretical concepts and their practical applications in the fields of machine learning and artificial intelligence.
Chris Maddison's expertise and his significant contributions to the field, particularly in the development of AlphaGo, make this course highly prestigious and insightful for Ph.D. students and researchers looking to deepen their understanding of inference and control in advanced machine learning contexts. The course website provides valuable resources for anyone interested in this specialized area of study.
"Deep Generative Models" is a Ph.D. level seminar course at Columbia University, taught by John Cunningham. This course is structured around weekly paper presentations and discussions, focusing on deep generative models, which represent the intersection of graphical models and neural networks and are one of the most important directions in modern machine learning.
The course is designed to explore the latest advancements and theoretical foundations in deep generative models. Participants engage in in-depth discussions about current research papers, fostering a deep understanding of the subject matter. This format not only helps students keep abreast of the latest developments in this rapidly evolving field but also sharpens their critical thinking and research skills.
Given the advanced nature of the course, it is ideal for Ph.D. students and researchers who have a solid foundation in machine learning, deep learning, and graphical models, and are looking to delve into the cutting-edge of deep generative models. The course website provides a valuable resource for accessing the curriculum and related materials.
This learning path is suitable for students who have already learned the basics of machine learning (ML, NLP, CV, RL), such as senior undergraduates or junior graduate students, and have published at least one paper in top conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, CVPR, ICCV) and are interested in pursuing a research path in machine learning.
The goal of this path is to lay the theoretical groundwork for understanding and publishing papers at top machine learning conferences, especially in the track of Probabilistic Methods.
There can be multiple advanced learning paths in machine learning, and this one represents the best path as understood by the author Yao Fu, focusing on probabilistic modeling methods under the Bayesian school and involving interdisciplinary knowledge.
Essential Textbooks
PRML: Pattern Recognition and Machine Learning by Christopher Bishop
AoS: All of Statistics by Larry Wasserman
These two books respectively represent classic teachings of the Bayesian and frequentist schools, complementing each other nicely.
Reference Books
MLAPP: Machine Learning: A Probabilistic Perspective by Kevin Murphy
Convex Optimization by Stephen Boyd and Lieven Vandenberghe
Advanced Books
W&J: Graphical Models, Exponential Families, and Variational Inference by Martin Wainwright and Michael Jordan
Theory of Point Estimation by E. L. Lehmann and George Casella
Reading Guidelines
How to Approach
Essential textbooks are a must-read.
Reference books are like dictionaries: consult them when encountering unfamiliar concepts (instead of Wikipedia).
Advanced books should be approached after completing the essential textbooks, which should be read multiple times for thorough understanding.
Comparative reading is crucial: open two books on the same topic and compare their similarities, differences, and connections.
Recall previously read papers during reading and compare them with textbook content.
Basic Pathway
Start with AoS Chapter 6: Models, Statistical Inference, and Learning as a basic introduction.
Read PRML Chapters 10 and 11:
Chapter 10 covers Variational Inference, and Chapter 11 covers MCMC, the two main routes for Bayesian inference.
Consult earlier chapters in PRML or MLAPP for any unclear terms.
AoS Chapter 8 (Parametric Inference) and Chapter 11 (Bayesian Inference) can also serve as references. Compare these chapters with the relevant PRML chapters.
After PRML Chapters 10 and 11, proceed to AoS Chapter 24 (Simulation Methods) and compare it with PRML Chapter 11, focusing on MCMC.
If foundational concepts are still unclear, review PRML Chapter 3 and compare it with AoS Chapter 11.
Read PRML Chapter 13 (skip Chapter 12) and compare it with MLAPP Chapters 17 and 18, focusing on HMM and LDS.
After completing PRML Chapter 13, move on to Chapter 8 (Graphical Models).
Cross-reference these topics with CMU 10-708 PGM course materials.
By this point, you should have a grasp of:
Basic definitions of probabilistic models
Exact inference - Sum-Product
Approximate inference - MCMC
Approximate inference - VI
Afterward, you can proceed to more advanced topics.
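As a quick self-check on the "exact inference" item, the sketch below runs sum-product (forward and backward message passing) on a tiny chain of three binary variables and compares the result with brute-force enumeration. The potentials are made-up numbers and all function names are hypothetical; this is only a minimal illustration, not code from any of the books above.

```python
from itertools import product

# Made-up potentials for a chain x0 - x1 - x2 of binary variables.
UNARY = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
PAIR = [[2.0, 1.0], [1.0, 2.0]]  # shared by both edges; favors equal neighbors

def chain_marginals(unary, pair):
    """Sum-product on a chain: pass messages both ways, then normalize beliefs."""
    n, k = len(unary), len(unary[0])
    fwd = [[1.0] * k for _ in range(n)]  # message reaching node i from the left
    bwd = [[1.0] * k for _ in range(n)]  # message reaching node i from the right
    for i in range(1, n):
        for b in range(k):
            fwd[i][b] = sum(fwd[i - 1][a] * unary[i - 1][a] * pair[a][b]
                            for a in range(k))
    for i in range(n - 2, -1, -1):
        for a in range(k):
            bwd[i][a] = sum(bwd[i + 1][b] * unary[i + 1][b] * pair[a][b]
                            for b in range(k))
    marginals = []
    for i in range(n):
        belief = [fwd[i][s] * unary[i][s] * bwd[i][s] for s in range(k)]
        total = sum(belief)
        marginals.append([value / total for value in belief])
    return marginals

def brute_force_marginals(unary, pair):
    """Enumerate every joint assignment; only feasible for tiny models."""
    n, k = len(unary), len(unary[0])
    sums = [[0.0] * k for _ in range(n)]
    for states in product(range(k), repeat=n):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pair[states[i]][states[i + 1]]
        for i, s in enumerate(states):
            sums[i][s] += weight
    return [[value / sum(row) for value in row] for row in sums]

print(chain_marginals(UNARY, PAIR))
print(brute_force_marginals(UNARY, PAIR))
```

Both functions should print the same marginals; the difference is that message passing scales linearly with the chain length, while enumeration scales exponentially.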
Amirkabir University of Technology 1400-2: Advanced Programming Course
Course Introduction
Affiliated University: Amirkabir University of Technology
Prerequisites: None
Programming Language: C++
Course Difficulty: 🌟🌟🌟🌟🌟
Estimated Study Time: 50 hours
I came across this C++ course by accident. The quality of the homework assignments is outstanding: each one is self-contained and simple and comes with comprehensive unit tests, which makes the course highly suitable for learning C++ programming. The course includes a total of 7 homework assignments, as follows:
Implement a Matrix class and related functions.
Implement a program that simulates the operation of a cryptocurrency client/server.
Implement a Binary Search Tree (BST).
Implement SharedPtr and UniquePtr smart pointers in C++.
Use inheritance and polymorphism to implement multiple classes.
Solve 4 problems using the STL library.
There's a Python project, for those interested.
The course homepage was not found, but the source code for the homework (named AP1400-2-HW) can be found on GitHub.
This is an introductory course on Linux from UCB, which I find more systematic and clearer than MIT's similarly aimed open course, Missing Semester. This is the main reason I recommend it. While Missing Semester seems more like a course for filling gaps for students who have started programming but haven't systematically used these tools, DeCal is more suitable for absolute beginners. The twelve-week course covers Linux basics, shell programming (including tmux and vim), package management, services, basic computer networks, network services, security (key management), Git, Docker, Kubernetes, Puppet, and CUDA. It's ideal for newcomers to understand and get started with the Linux environment.
A slight drawback is that some course assignments require operations on remote servers, like exercises on ssh, which need UCB internal account access. However, most assignments can be practiced by setting up a virtual machine and using tools like Xshell or directly using a Linux desktop version. After completing the full course and assignments, you should have a basic understanding of Linux.
To compensate for the inability to use remote servers and to become familiar with the Linux command line, I recommend bandit. Bandit is a wargame from OverTheWire, which provides a free practice range for CTF enthusiasts. The first 15 levels of bandit cover basic Linux operations and require no CTF knowledge. These exercises perfectly supplement the parts of DeCal that are inaccessible to external students (mainly remote connections, file permissions, etc.).
Prerequisites: Linear Algebra, Advanced Mathematics, Python
Programming Language: Python
Course Difficulty: 🌟🌟🌟
Estimated Study Time: 40 hours
Official Description:
This introductory course in computer graphics begins with using Blender to generate images and understanding the underlying mathematical concepts, including triangles, normals, interpolation, texture mapping, bump mapping, and more. It then delves into light and color and how they affect computer displays and printing. The course also covers BRDF and some basic lighting and shading models. Towards the end, topics like ray tracing, anti-aliasing, and acceleration structures are introduced.
For more detailed information, you can visit the course website.
This course is somewhat less in-depth compared to GAMES101 and uses Python, making it more accessible for students who are not familiar with C++.
University: University of California, Santa Barbara (UCSB)
Prerequisites: Linear Algebra, Advanced Mathematics, C++
Programming Language: C++
Course Difficulty: 🌟🌟🌟
Estimated Study Time: 80 hours
Official Introduction:
This course comprehensively and systematically introduces the four major components of modern computer graphics: (1) rasterization imaging, (2) geometric representation, (3) the theory of light propagation, and (4) animation and simulation. Each aspect is explained from basic principles to practical applications, along with the introduction of cutting-edge theoretical research. Through this course, you can learn the mathematics and physics behind computer graphics and enhance your practical programming skills. As an introduction, this course aims to cover as many aspects of graphics as possible, clearly explaining the basic concepts of each part to provide a complete, top-down understanding of computer graphics. A global understanding is crucial; after completing this course, you will realize that graphics is not just OpenGL or ray tracing but a set of methods for creating virtual worlds. The title of this course also contains the word "modern," indicating that the knowledge imparted is contemporary and essential for the modern graphics industry.
GAMES101 is a well-known graphics course in China. Graphics is traditionally perceived as heavy on math and algorithms, but this course introduces the field in a very vivid way.
Each project is not code-heavy but is quite interesting. Through these projects, students will implement simple rasterization to render basic models and ray tracing for better rendering quality. Each project also includes optional extensions to enhance the quality and speed of the rendered models.
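To give a feel for what "simple rasterization" involves, here is a minimal Python sketch that tests each pixel center against the three edges of a triangle. It is only an illustration under simplifying assumptions (one 2D triangle, no depth buffer, no anti-aliasing), not the course's framework code, and all function names are hypothetical.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed-area test: tells which side of the edge A->B the point P lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize_triangle(v0, v1, v2, width, height):
    """Return a height x width grid with 1 where the pixel center is covered."""
    image = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            # Covered if all edge functions share a sign (either winding order).
            all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
            if all_non_negative or all_non_positive:
                image[y][x] = 1
    return image

image = rasterize_triangle((1, 1), (7, 1), (1, 7), 8, 8)
for row in image:
    print("".join("#" if covered else "." for covered in row))
```

Looping over every pixel is wasteful; restricting the loops to the triangle's bounding box is an easy first optimization.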
If you enjoy gaming, you might be familiar with real-time ray tracing, a technology that the course instructor, Lingqi Yan, has directly contributed to. By following the course videos and completing each project, you'll likely develop a strong interest in graphics and modern rendering techniques, just as I did.
University: Style3D / Oregon State University (OSU)
Prerequisites: Linear Algebra, Advanced Mathematics, College Physics, Programming Skills, Basic Graphics Knowledge
Programming Language: C#
Course Difficulty: 🌟🌟🌟🌟
Estimated Study Time: 50 hours
Official Introduction:
This course serves as an introduction to physics-based computer animation techniques, focusing on various fundamental physical animation simulation technologies.
The course mainly covers four areas: 1) Rigid body simulation; 2) Particle systems, springs, constraints, and cloth simulation; 3) Elastic body simulation based on the finite element method; 4) Fluid simulation.
The course content will not delve into specific physical simulation engines but will discuss the technologies behind various engines and their pros and cons. Since developing and learning physical simulations requires a solid mathematical foundation, the initial stages of the course will also spend some time reviewing necessary mathematical concepts. Upon successful completion of the course, students should have a deep understanding of basic physical simulation techniques and some exposure to advanced simulation technologies.
In graphics, the field can be roughly divided into rendering, simulation, and geometry. While GAMES101 and GAMES202 mainly focus on rendering, GAMES103 is an excellent resource for learning about physical simulation.
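As a small taste of physical simulation, the sketch below integrates a single particle attached to a spring in one dimension using semi-implicit (symplectic) Euler and compares the result with the analytic solution. It is a hypothetical Python illustration with made-up parameters, not taken from the course assignments.

```python
import math

def simulate_spring(x0, v0, stiffness, mass, dt, steps):
    """Semi-implicit Euler for a 1D particle on a spring anchored at x = 0."""
    x, v = x0, v0
    for _ in range(steps):
        force = -stiffness * x       # Hooke's law with rest position at x = 0
        v += (force / mass) * dt     # update the velocity first...
        x += v * dt                  # ...then the position with the new velocity
    return x, v

# With stiffness = mass = 1 and zero initial velocity, the exact solution is x(t) = cos(t).
x, v = simulate_spring(x0=1.0, v0=0.0, stiffness=1.0, mass=1.0, dt=0.01, steps=100)
print(x, math.cos(1.0))  # simulated position next to the analytic value at t = 1
```

Updating the velocity before the position keeps the energy of this system bounded over long runs, whereas plain explicit Euler steadily gains energy.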
Course Assignments: Four assignments, available on the official BBS mini-app or the unofficial Repo: GAMES103 HW
Resource Summary
All resources and homework requirements used by @indevn during the course are compiled at GAMES103 Unofficial. For detailed implementations of the assignments, there are many articles on Zhihu that provide in-depth explanations and can be referenced.