diff --git a/ml/logistic_regression/initial_data.png b/ml/logistic_regression/images/initial_data.png
similarity index 100%
rename from ml/logistic_regression/initial_data.png
rename to ml/logistic_regression/images/initial_data.png
diff --git a/ml/logistic_regression/justfit.png b/ml/logistic_regression/images/justfit.png
similarity index 100%
rename from ml/logistic_regression/justfit.png
rename to ml/logistic_regression/images/justfit.png
diff --git a/ml/logistic_regression/overfit.png b/ml/logistic_regression/images/overfit.png
similarity index 100%
rename from ml/logistic_regression/overfit.png
rename to ml/logistic_regression/images/overfit.png
diff --git a/ml/logistic_regression/logistic_regression.md b/ml/logistic_regression/logistic_regression.md
index 70b3e35..7fd4300 100644
--- a/ml/logistic_regression/logistic_regression.md
+++ b/ml/logistic_regression/logistic_regression.md
@@ -112,9 +112,9 @@ $$
 cost(h_\theta(x), y) = \left\{
 \begin{aligned}
 -log(h_\theta(x))&, y = 1\\\\
--log(1 - h_theta(x))&, y = 0
+-log(1 - h_\theta(x))&, y = 0
 \end{aligned}
-\right
+\right.
 $$
 
 Plotting the loss separately for $y = 0$ and $y = 1$ gives a fairly intuitive picture. It is easy to see that when $y = 1$, the closer the prediction $h_\theta(x)$ is to one, the smaller the loss, and the closer it is to zero, the larger the loss; moreover, $cost(h_\theta(x), 1) = 0$ when $h_\theta(x) = 1$ and $cost(h_\theta(x), 1) = +\infty$ when $h_\theta(x) = 0$. This property is easy to understand: the closer the prediction is to the true label, the smaller the corresponding loss. The case $y = 0$ behaves analogously.
@@ -239,7 +239,7 @@ H = \frac{1}{m}\left[
 \right]
 $$
 
-It is easy to see that(?,
+It is easy to see that(?
 
 $$
 H = \frac{1}{m}(GX)^TGX
@@ -270,7 +270,7 @@
 To address overfitting, one can again add a **regularization term** to the log loss above; the regularized loss function $J(\theta)$ is
 
 $$
-J(\theta) = -\frac{1}{m}[\Sigma_{i = 1}^my^{(i)}log(h_\theta(x^{(i)})) + (1-y^{(i)})log(1 - logh_\theta(x^{(i)}))] + \frac{\lambda}{2m}\Sigma_{i = j}^n\theta_j^2
+J(\theta) = -\frac{1}{m}[\Sigma_{i = 1}^my^{(i)}log(h_\theta(x^{(i)})) + (1-y^{(i)})log(1 - h_\theta(x^{(i)}))] + \frac{\lambda}{2m}\Sigma_{j = 1}^n\theta_j^2
 $$
 
 The following concrete example gives a visual demonstration of overfitting and of the effect regularization has on it. It is a two-class classification problem whose input $x$ is a two-dimensional vector, i.e. it has two features $x_1, x_2$. First, the raw data is visualized, as shown in the figure below:
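The piecewise cost fixed in the first hunk can be spot-checked numerically. The sketch below is a minimal illustration, not part of the patch; NumPy and the helper name `log_cost` are my own choices, not anything from the document.

```python
import numpy as np

def log_cost(h, y):
    """Piecewise log loss from the first hunk:
    -log(h) when y = 1, -log(1 - h) when y = 0."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

# Reproduces the limiting behavior described after the formula:
# for y = 1 the loss tends to 0 as h -> 1 and to +inf as h -> 0,
# and symmetrically for y = 0.
for h in (0.999, 0.5, 0.001):
    print(f"h={h}: cost(h,1)={log_cost(h, 1):.3f}, cost(h,0)={log_cost(h, 0):.3f}")
```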
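The factorization $H = \frac{1}{m}(GX)^TGX$ in the second hunk, whose derivation the author flags with "(?", can also be checked numerically. One reading consistent with the logistic-regression Hessian is that $G$ is the diagonal matrix with entries $\sqrt{h_\theta(x^{(i)})(1 - h_\theta(x^{(i)}))}$; that definition is an assumption on my part, since $G$ is defined outside the hunk shown. Under it, $(GX)^TGX = X^T\mathrm{diag}(h(1-h))X$, which also makes $H$ visibly positive semidefinite.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumption: G = diag(sqrt(h_i * (1 - h_i))); G's actual definition
# lies outside the hunk shown in the diff.
rng = np.random.default_rng(1)
m, n = 6, 3
X = rng.normal(size=(m, n))
theta = rng.normal(size=n)
h = sigmoid(X @ theta)

G = np.diag(np.sqrt(h * (1 - h)))
H_factored = (G @ X).T @ (G @ X) / m            # H = (1/m)(GX)^T GX
H_direct = X.T @ np.diag(h * (1 - h)) @ X / m   # standard logistic Hessian

print(np.allclose(H_factored, H_direct))  # True under this assumption
```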
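The regularized loss in the last hunk can be instantiated directly. A minimal sketch, assuming NumPy and the usual convention, matching the sum over $j = 1, \dots, n$, that the intercept $\theta_0$ is not penalized; the toy data below is made up for the demonstration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, X, y, lam):
    """J(theta) = -(1/m) sum_i [ y_i log h_i + (1 - y_i) log(1 - h_i) ]
                  + (lambda / 2m) sum_{j=1}^{n} theta_j^2   (theta_0 excluded)."""
    m = X.shape[0]
    h = sigmoid(X @ theta)
    eps = 1e-12  # guards log(0) at saturated predictions
    data_term = -(y @ np.log(h + eps) + (1 - y) @ np.log(1 - h + eps)) / m
    reg_term = lam / (2 * m) * np.sum(theta[1:] ** 2)
    return data_term + reg_term

# Toy two-feature problem with an intercept column, echoing the example
# the section goes on to describe (input x has features x1, x2).
rng = np.random.default_rng(0)
X = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 2))])
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
theta = rng.normal(size=3)
print(regularized_cost(theta, X, y, lam=1.0))  # larger lam -> larger penalty term
```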