add manuscripts on machine learning 'ml_scripts.md', add 912v1.0.tex, update words.

2019-12-31 09:05:31 +08:00
parent 8bd47316df
commit 9bc971279d
3 changed files with 167 additions and 0 deletions
--- a/912v1.0.tex
+++ b/912v1.0.tex
@@ -0,0 +1,102 @@
+\documentclass[UTF8,12pt]{ctexart}
+\usepackage{ctex}
+\usepackage{amsmath}
+\usepackage{graphicx}
+\CTEXsetup[format={\Large\bfseries}]{section}
+\title{\kaishu 912 Exam, Recalled Version}
+\author{by Shine Wong}
+\date{12/22}
+
+\begin{document}
+\maketitle
+
+	\section{Data Structures}
+
+	\subsection{True/False Questions}
+
+		\begin{itemize}
+
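+			\item[1)]$\log^n n = \Omega(n^{\log n})$.
+			\item[2)]Inserting into an AVL tree triggers at most $\Omega(\log n)$ local adjustment operations.
+			\item[3)]Quicksort on an ideally random input sequence achieves $O(\log n)$ time complexity in both the average case and the worst case.
+			\item[4)]Under ideally random input, although deletion from a complete binary heap takes $O(\log n)$ time in the worst case, its average time complexity is only $O(1)$.
+			\item[5)]In a skip list, the tower corresponding to each node has an average height of $O(\log n)$.\\
+			\item[6)]A comparison-based algorithm can find the largest 10\% of the elements of a sequence in $O(n)$ time.
+			\item[7)]If a DFS on a directed graph labels $k$ edges as BACKWARD, the graph does not necessarily contain $k$ cycles.\\
+			\item[8)]Compared with a winner tree, a loser tree has better asymptotic time complexity.\\
+			\item[9)]Compared with closed hashing, open hashing makes better use of data locality.\\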
+			\item[10)]...remain to be added
+
+
+		\end{itemize}
+
+	\subsection{Single-Choice Questions}
+
+		\begin{itemize}
+
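+			\item[1)]For a directed acyclic graph, a topological ordering of the graph is given exactly by the $\underline{\hbox to 10mm{}}$ of a DFS.\\
+			A.\ order of discovery\\
+			B.\ reverse order of discovery\\
+			C.\ order of backtracking\\
+			D.\ reverse order of backtracking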
+
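+			\item[2)]If radix sort uses an unstable sorting algorithm as its underlying subroutine, the result is $\underline{\hbox to 10mm{}}$, and the stability of radix sort is $\underline{\hbox to 10mm{}}$\\
+			A.\ no longer correct \ no longer preserved\\
+			B.\ no longer correct \ still preserved\\
+			C.\ still correct \ no longer preserved\\
+			D.\ still correct \ still preserved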
+
+			\item[3)]The reverse Polish expression $Blalala$ evaluates to 2019; the missing operator in the middle is\\
+			A \ + \\
+			B \ - \\
+			C \ * \\
+			D \ / \\
+			E \ \^{} \\
+			F \ !
+
+			\item[4)]A Huffman coding tree is built for a character set whose weights are 1, 1, 2, 3, 5, 8, 13, 21; the maximum depth in the tree is\\
+			A.\ 6\\
+			B.\ 7\\
+			C.\ 8\\
+			D.\ 9
+
+			\item[5)]The number of proper binary trees with $\underline{\hbox to 10mm{}}$ nodes equals the number of valid expressions formed by 2019 pairs of parentheses.\\
+			A.\ 1009\\
+			B.\ 1010\\
+			C.\ 2019\\
+			D.\ 4039
+
+			\item[6)]For the pattern string HHBFHHBFHHBFSHH, consider the improved next array; then $next[14] - next[0] = \underline{\hbox to 10mm{}}$\\
+			A.\ 2\\
+			B.\ 3\\
+			C.\ 4\\
+			D.\ 5
+
+		\end{itemize}
+
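+	\subsection{Proof Question}
+
+		Given the preorder and postorder traversal sequences of a binary search tree, can its level-order traversal sequence be constructed? If so, give a proof; otherwise, give a counterexample. (5 points)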
+
+	\subsection{Programming Question}
+
+		The binary tree node BinNode is defined as follows:\\
+
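+		\noindent class BinNode\{\\
+		public:\\
+			\indent BinNode* parent;\\
+			\indent BinNode* lc;\\
+			\indent BinNode* rc;\\
+			\indent int lsize;\\
+
+			\indent BinNode* zig(BinNode* x);// rotate clockwise about the current node; still returns the left subtree of the root after the rotation\\
+			\indent BinNode* zag(BinNode* x);// rotate counterclockwise about the current node; still returns the right subtree of the root after the rotation\\
+		\};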
+
+		\begin{itemize}
+
+			\item[]
+
+		\end{itemize}
+
+\end{document}
--- a/ml_scipts.md
+++ b/ml_scipts.md
@@ -0,0 +1,26 @@
+Notes Taken While Studying Machine Learning
+===========================================
+
+> What is machine learning?
+
+Note the three key elements: experience E, task T, and performance measure P.
+
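+> What is the difference between supervised and unsupervised learning?
+
+Supervised learning means that, for the input data, the corresponding outputs are known. It can be divided into two categories, regression problems and classification problems, which differ in whether the output is continuous. Concrete examples include house price prediction (a regression problem) and deciding whether a tumor is benign (a classification problem).
+
+In unsupervised learning the input data carry no labels: every input is treated alike, with no state or category information given in advance (such as a house price or whether a tumor is malignant), and it is left to the machine to discern the properties of the different data. Typical examples are `clustering` problems and the cocktail party algorithm.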
+
+> Some thoughts on deep learning algorithms.
+
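+An artificial neuron is designed as a `linear kernel` followed by a `nonlinear activation`. Depending on the `linear kernel`, networks can be classified as `DNN`, `CNN`, or `RNN`, each suited to different scenarios. However, this way of modeling is clearly inaccurate and one-sided, because real neurons handle problems of all kinds well. There should therefore be a better way to model neurons.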
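
As a rough illustration of the "`linear kernel` plus `nonlinear activation`" view, here is a minimal sketch of a single artificial neuron in Python. The function names (`linear_kernel`, `sigmoid`, `neuron`), the choice of the sigmoid as the activation, and the weights and inputs are all assumptions made for this example; they are not taken from the notes.

```python
import math


def linear_kernel(weights, bias, inputs):
    """Linear part of the neuron: the weighted sum w . x + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def sigmoid(z):
    """One common nonlinear activation; maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, bias, inputs, activation=sigmoid):
    """Apply the nonlinear activation on top of the linear kernel."""
    return activation(linear_kernel(weights, bias, inputs))


# Hypothetical weights and inputs, chosen only for illustration.
output = neuron(weights=[0.5, -0.25], bias=0.0, inputs=[1.0, 2.0])
```

In this picture, swapping `linear_kernel` for a convolution or a recurrent update while keeping the activation step is what distinguishes the network families mentioned above.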
+
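+The essence of artificial neural networks lies in simulating the neurons of the brain. But I wonder whether neurons are necessarily the most efficient way to solve problems: although they have gone through billions of years of evolution and natural selection, they may not be the optimal solution to real-world problems and could be merely a local optimum. AlphaGo illustrates this point: the Go strategies that humans developed over thousands of years turned out to be only a local optimum.
+
+On the other hand, having computers imitate the human brain is not necessarily the best approach either. So I wonder whether it is possible to break free of the constraints of existing neurons and devise a better algorithm, which might in turn even be used to improve human neurons.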
+
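+> Problems with gradient descent.
+
+The first is the choice of the learning rate. If $\alpha$ is too small, many iterations are needed to reach a local optimum, so training takes a long time; if $\alpha$ is too large, an update may step straight past the minimum, so that the method fails to converge or even diverges.
+
+In addition, gradient descent evidently finds only a local optimum, not necessarily the global one. In fact, the solution it finds depends on the choice of starting point. For linear regression, however, this problem does not arise, because the cost function of linear regression is a convex function: it has only one extremum, which is its global optimum, so gradient descent (given a suitable learning rate) always reaches the unique optimal solution.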
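
The two points above can be tried out with a minimal sketch of batch gradient descent on a one-variable linear regression. The data set, the parameter names `theta0` and `theta1`, the specific values of `alpha`, and the step counts are all assumptions picked for illustration; the cost is the usual mean squared error divided by two.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, steps, theta0=0.0, theta1=0.0):
    """Run `steps` simultaneous updates of (theta0, theta1) with learning rate alpha."""
    m = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Too small: after 100 steps the parameters have barely moved.
small = gradient_descent(xs, ys, alpha=0.001, steps=100)
# Moderate: converges to roughly (1, 2).
good = gradient_descent(xs, ys, alpha=0.1, steps=5000)
# Too large: every step overshoots the minimum and the cost grows.
large = gradient_descent(xs, ys, alpha=1.0, steps=20)
# Different starting point, same convex cost: same answer as `good`.
other_start = gradient_descent(xs, ys, alpha=0.1, steps=5000,
                               theta0=-10.0, theta1=10.0)
```

Comparing `cost(*small, xs, ys)`, `cost(*good, xs, ys)`, and `cost(*large, xs, ys)` shows the slow, convergent, and divergent regimes, while `good` and `other_start` agree because the cost surface of linear regression has a single minimum.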
--- a/words.md
+++ b/words.md
@@ -2066,3 +2066,42 @@ Some Words

 	- a pathological liar
 	- He experiences chronic, almost pathological jealousy.
+
+## 30th, December
+
+ spam
+> (n)unwanted email, usually advertisements</br>
+> (v)to send someone advertisements by email that they do not want.
+
+	- Some Internet service providers block spam to subscribers.
+	- He spammed the message to 30,000 addresses in a week.
+
+ tumor
+> (n)a mass of cells in the body that grows faster than usual and can cause illness.
+
+	- a malignant/benign tumor
+
+ malignant
+> (adj)a malignant disease or growth is cancer or is related to cancer, and is likely to be harmful.</br>
+> (adj)having a strong wish to do harm
+
+	- The process by which malignant cancer cells multiply isn't fully understood.
+	- He developed a malignant hatred for the land of his birth.
+
+ benign
+> (adj)pleasant or kind; not harmful or severe</br>
+> (adj)a benign growth is not cancer and is not likely to be harmful
+
+	- a benign tumor
+	- They are normally a more benign audience.
+	- I just smiled benignly and stood back.
+
+ inventory
+> (n)a detailed list of all the things in a place.</br>
+> (n)the amount of goods a store or business has for sale at a particular time, or their value.
+
+	- About half of the shop's inventory was damaged in the tornado.
+	- Before starting, he made an inventory of everything that was to stay.
+
+ tornado
+> (n)a strong, dangerous wind that forms itself into an upside-down spinning cone and is able to destroy buildings as it moves across the ground.