mirror of
https://github.com/apachecn/ailearning.git
synced 2026-02-13 15:26:28 +08:00
Update documentation data
@@ -14,7 +14,7 @@
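 Machine learning: machine learning is the most fundamental of these fields (and currently one of the hottest areas for startups and research labs).
 In the early 1990s, people began to realize that pattern-recognition algorithms could be built more effectively by replacing experts (people with extensive knowledge of images) with data (which can be collected using inexpensive labor).
 "Machine learning" emphasizes that once a computer program (or machine) is given some data, it must do something with it, namely learn from that data, and this learning step is explicit.
-Machine learning is the discipline that studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and reorganize existing knowledge structures so as to continually improve the performance of itself.
+Machine learning is the discipline that studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and reorganize existing knowledge structures so as to continually improve its own performance.
 Deep learning: deep learning is a very new and influential frontier field; we do not yet even think about a post-deep-learning era.
 Deep learning is a new area of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns; it mimics the mechanisms of the human brain to interpret data such as images, sound, and text.
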
@@ -36,18 +36,18 @@ http://baike.baidu.com/link?url=76P-uA4EBrC3G-I__P1tqeO7eoDS709Kp4wYuHxc7GNkz_xn
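 ## A Brief Overview of Machine Learning

 `Machine learning` turns unordered data into useful information; it helps us cut through the data haze and extract the information that matters.
-* 1. A massive amount of data must be acquired first
-* 2. Only then can useful information be extracted from that data
+* 1. Acquire a massive amount of data
+* 2. Extract useful information from that data

 ## The Main Tasks of Machine Learning

 > The main tasks of machine learning are classification and regression

 * Classification: assign instance data to the appropriate category.
-* Regression: mainly used to predict numeric data. (Example: curve fitting, i.e. the best-fit curve through the given data points)
+* Regression: mainly used to predict numeric data. (Example: fitting the optimal curve to a given set of data points)
 * Target variable
     * The target variable is the outcome that a machine learning prediction algorithm tries to predict.
-    * In classification algorithms the target variable is usually nominal, whereas in regression algorithms it is usually continuous.
+    * In classification algorithms the target variable is usually nominal (e.g. true or false), whereas in regression algorithms it is usually continuous (e.g. 1~100).

 * The machine learning training process
     * ![Diagram of the machine learning training process](/images/1.MLFoundation/机器学习基础训练过程.jpg)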
|
||||
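The distinction drawn in the hunk above between a nominal target (classification) and a continuous target (regression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the weight threshold, and the helper names `classify` and `fit_line` are hypothetical and do not come from the tutorial.

```python
# Toy illustration: nominal target (classification) vs. continuous target
# (regression). All data and function names here are hypothetical.

def classify(weight_g, threshold=100.0):
    """Classification: map a feature to one of a fixed set of categories."""
    return "heavy" if weight_g >= threshold else "light"

def fit_line(xs, ys):
    """Regression: least-squares fit of y = a * x + b to the data points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

label = classify(120.0)                       # nominal: "heavy" or "light"
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])   # points lie on y = 2x + 1
prediction = a * 5 + b                        # continuous value for x = 5
print(label, prediction)
```

The classifier can only ever return one of two labels, while the fitted line returns any real number, which is the practical meaning of "nominal" versus "continuous" here.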
@@ -57,7 +57,7 @@ http://baike.baidu.com/link?url=76P-uA4EBrC3G-I__P1tqeO7eoDS709Kp4wYuHxc7GNkz_xn
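 * You must know what to predict, that is, the class information of the target variable must be known. Classification and regression are both supervised learning.
 * Sample set: training data + test data
 * Training sample = features + target variable (label)
-* The collection of training samples is called the training set. The value of the target variable must be known with certainty for the training set, so that the machine learning algorithm can discover the relationship between the features and the target variable.
+* The collection of training samples is called the training set. The training set must specify the value of the target variable, so that the machine learning algorithm can discover the relationship between the features and the target variable.
 * Features (check whether any values are missing) + target variable (classification: discrete values such as A/B/C or yes/no; regression: continuous values such as 0~100 or -999~999)
 * Features, or attributes, are usually the columns of the training set. They are the results of independent measurements, and multiple features together make up one training sample.
 * `Knowledge representation`: (for example, the process by which the machine has learned how to recognize birds)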
|
||||
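The structure described in the hunk above (sample set = training data + test data, each sample = features + label) can be sketched as follows. The bird measurements, the 20% test ratio, and the helper name `split_samples` are assumptions made for illustration only.

```python
import random

# Each sample = features + target variable (label). Hypothetical toy data:
# features are (weight_g, wingspan_cm); the label is a nominal class.
samples = [
    ((1000.1, 125.0), "hawk"),
    ((3000.7, 200.0), "eagle"),
    ((4.1, 11.0), "hummingbird"),
    ((3.3, 10.5), "hummingbird"),
    ((1100.0, 130.0), "hawk"),
]

def split_samples(data, test_ratio=0.2, seed=0):
    """Shuffle a copy of the sample set and split it into (train, test)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_samples(samples)

# Separate the feature columns from the target variable for training.
train_features = [features for features, _ in train]
train_labels = [label for _, label in train]
print(len(train), len(test))
```

Every training sample keeps its label, which is what lets a supervised algorithm relate features to the target variable; the held-out test samples are only used to check the result.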
@@ -76,7 +76,7 @@ http://baike.baidu.com/link?url=76P-uA4EBrC3G-I__P1tqeO7eoDS709Kp4wYuHxc7GNkz_xn
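
 ![Algorithm selection diagram](/images/1.MLFoundation/机器学习基础-选择算法.jpg)

-## Reasons to Learn Machine Learning
+## Learning Machine Learning

 * Two questions to consider when choosing an algorithm
     * The purpose of using a machine learning algorithm
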
@@ -24,8 +24,8 @@
 > Introduction to the Sigmoid function

```
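-The function we want should accept any input and predict a class. For example, with two classes, the function outputs 0 or 1. Such a function is called the Heaviside step function (海维塞得阶跃函数), or simply the unit step function.
-However, the problem with the Heaviside step function (海维塞得阶跃函数) is that at the step point it jumps instantaneously from 0 to 1, and this instantaneous jump is sometimes hard to handle. Fortunately, another function has a similar property (here meaning that its output approaches 0 and 1),
+The function we want should accept any input and predict a class. For example, with two classes, the function outputs 0 or 1. Such a function is called the Heaviside step function (海维塞德阶跃函数), or simply the unit step function.
+However, the problem with the Heaviside step function (海维塞德阶跃函数) is that at the step point it jumps instantaneously from 0 to 1, and this instantaneous jump is sometimes hard to handle. Fortunately, another function has a similar property (here meaning that its output approaches 0 and 1),
 and is easier to handle mathematically: the Sigmoid function, introduced below.

 The Sigmoid function is computed as follows: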
|
||||
|
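The comparison between the unit step function and the Sigmoid function in the hunk above can be sketched in Python. This is a minimal illustration, assuming the standard definition sigmoid(z) = 1 / (1 + e^(-z)); the function names and the choice of 1 as the step function's value at exactly 0 are assumptions, not taken from the tutorial.

```python
import math

def heaviside(x):
    """Unit step: jumps from 0 to 1 at x = 0 (value at 0 chosen as 1 here)."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Sigmoid 1 / (1 + e^(-x)), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z is in (0, 1) and cannot overflow
    return z / (1.0 + z)

# The step function switches abruptly; the sigmoid moves smoothly through 0.5.
for x in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(x, heaviside(x), round(sigmoid(x), 4))
```

Because the Sigmoid is smooth and differentiable everywhere, it can be optimized with gradient-based methods, which is the sense in which it is "easier to handle mathematically" than the step function.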
||||
157
input/16.RecommenderSystems/ml-100k/README
Normal file
@@ -0,0 +1,157 @@
SUMMARY & USAGE LICENSE
=============================================

MovieLens data sets were collected by the GroupLens Research Project
at the University of Minnesota.

This data set consists of:
	* 100,000 ratings (1-5) from 943 users on 1682 movies.
	* Each user has rated at least 20 movies.
	* Simple demographic info for the users (age, gender, occupation, zip)

The data was collected through the MovieLens web site
(movielens.umn.edu) during the seven-month period from September 19th,
1997 through April 22nd, 1998. This data has been cleaned up - users
who had fewer than 20 ratings or did not have complete demographic
information were removed from this data set. Detailed descriptions of
the data files can be found at the end of this file.

Neither the University of Minnesota nor any of the researchers
involved can guarantee the correctness of the data, its suitability
for any particular purpose, or the validity of results based on the
use of the data set. The data set may be used for any research
purposes under the following conditions:

     * The user may not state or imply any endorsement from the
       University of Minnesota or the GroupLens Research Group.

     * The user must acknowledge the use of the data set in
       publications resulting from the use of the data set
       (see below for citation information).

     * The user may not redistribute the data without separate
       permission.

     * The user may not use this information for any commercial or
       revenue-bearing purposes without first obtaining permission
       from a faculty member of the GroupLens Research Project at the
       University of Minnesota.

If you have any further questions or comments, please contact GroupLens
<grouplens-info@cs.umn.edu>.

CITATION
==============================================

To acknowledge use of the dataset in publications, please cite the
following paper:

F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets:
History and Context. ACM Transactions on Interactive Intelligent
Systems (TiiS) 5, 4, Article 19 (December 2015), 19 pages.
DOI=http://dx.doi.org/10.1145/2827872


ACKNOWLEDGEMENTS
==============================================

Thanks to Al Borchers for cleaning up this data and writing the
accompanying scripts.

PUBLISHED WORK THAT HAS USED THIS DATASET
==============================================

Herlocker, J., Konstan, J., Borchers, A., Riedl, J.. An Algorithmic
Framework for Performing Collaborative Filtering. Proceedings of the
1999 Conference on Research and Development in Information
Retrieval. Aug. 1999.

FURTHER INFORMATION ABOUT THE GROUPLENS RESEARCH PROJECT
==============================================

The GroupLens Research Project is a research group in the Department
of Computer Science and Engineering at the University of Minnesota.
Members of the GroupLens Research Project are involved in many
research projects related to the fields of information filtering,
collaborative filtering, and recommender systems. The project is led
by professors John Riedl and Joseph Konstan. The project began to
explore automated collaborative filtering in 1992, but is best
known for its worldwide trial of an automated collaborative filtering
system for Usenet news in 1996. The technology developed in the
Usenet trial formed the base for the formation of Net Perceptions,
Inc., which was founded by members of GroupLens Research. Since then
the project has expanded its scope to research overall information
filtering solutions, integrating content-based methods as well as
improving current collaborative filtering technology.

Further information on the GroupLens Research project, including
research publications, can be found at the following web site:

        http://www.grouplens.org/

GroupLens Research currently operates a movie recommender based on
collaborative filtering:

        http://www.movielens.org/

DETAILED DESCRIPTIONS OF DATA FILES
==============================================

Here are brief descriptions of the data.
|
||||
|
||||
ml-data.tar.gz   -- Compressed tar file.  To rebuild the u data files do this:
                gunzip ml-data.tar.gz
                tar xvf ml-data.tar
                mku.sh

u.data     -- The full u data set, 100000 ratings by 943 users on 1682 items.
              Each user has rated at least 20 movies.  Users and items are
              numbered consecutively from 1.  The data is randomly
              ordered. This is a tab separated list of
	         user id | item id | rating | timestamp.
              The time stamps are unix seconds since 1/1/1970 UTC.

u.info     -- The number of users, items, and ratings in the u data set.

u.item     -- Information about the items (movies); this is a pipe ("|")
              separated list of
              movie id | movie title | release date | video release date |
              IMDb URL | unknown | Action | Adventure | Animation |
              Children's | Comedy | Crime | Documentary | Drama | Fantasy |
              Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi |
              Thriller | War | Western |
              The last 19 fields are the genres, a 1 indicates the movie
              is of that genre, a 0 indicates it is not; movies can be in
              several genres at once.
              The movie ids are the ones used in the u.data data set.

u.genre    -- A list of the genres.

u.user     -- Demographic information about the users; this is a pipe ("|")
              separated list of
              user id | age | gender | occupation | zip code
              The user ids are the ones used in the u.data data set.

u.occupation -- A list of the occupations.

u1.base    -- The data sets u1.base and u1.test through u5.base and u5.test
u1.test       are 80%/20% splits of the u data into training and test data.
u2.base       Each of u1, ..., u5 has a disjoint test set; this is for
u2.test       5 fold cross validation (where you repeat your experiment
u3.base       with each training and test set and average the results).
u3.test       These data sets can be generated from u.data by mku.sh.
u4.base
u4.test
u5.base
u5.test

ua.base    -- The data sets ua.base, ua.test, ub.base, and ub.test
ua.test       split the u data into a training set and a test set with
ub.base       exactly 10 ratings per user in the test set.  The sets
ub.test       ua.test and ub.test are disjoint.  These data sets can
              be generated from u.data by mku.sh.

allbut.pl  -- The script that generates training and test sets where
              all but n of a user's ratings are in the training data.

mku.sh     -- A shell script to generate all the u data sets from u.data.
|
||||
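The record layouts described in the README above can be read with a few lines of Python. This is a sketch under stated assumptions: the function names `parse_rating` and `parse_user` are hypothetical, the u.data row is a made-up example in the documented tab-separated format, and the u.user row is copied from the u.user data shown in this commit (which is pipe-separated).

```python
from datetime import datetime, timezone

def parse_rating(line):
    """Parse one tab-separated u.data row: user, item, rating, timestamp."""
    user_id, item_id, rating, timestamp = line.rstrip("\n").split("\t")
    return {
        "user_id": int(user_id),
        "item_id": int(item_id),
        "rating": int(rating),
        # Timestamps are unix seconds since 1/1/1970 UTC.
        "time": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
    }

def parse_user(line):
    """Parse one pipe-separated u.user row."""
    user_id, age, gender, occupation, zip_code = line.rstrip("\n").split("|")
    return {
        "user_id": int(user_id),
        "age": int(age),
        "gender": gender,
        "occupation": occupation,
        # Keep the zip code as text: some entries (e.g. "T8H1N") are not numeric.
        "zip_code": zip_code,
    }

rating = parse_rating("1\t2\t3\t881250949\n")    # made-up example row
user = parse_user("74|39|M|scientist|T8H1N\n")   # row from u.user
print(rating["rating"], rating["time"].year, user["occupation"])
```

Keeping the zip code as a string avoids losing leading zeros (such as `05201`) and lets non-US postal codes pass through unchanged.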
34
input/16.RecommenderSystems/ml-100k/allbut.pl
Executable file
@@ -0,0 +1,34 @@
#!/usr/local/bin/perl

# get args: the usage line lists four required positional arguments
if (@ARGV < 4) {
	print STDERR "Usage: $0 base_name start stop max_test [ratings ...]\n";
	exit 1;
}
$basename = shift;
$start = shift;
$stop = shift;
$maxtest = shift;

# open files
open( TESTFILE, ">$basename.test" ) or die "Cannot open $basename.test for writing\n";
open( BASEFILE, ">$basename.base" ) or die "Cannot open $basename.base for writing\n";

# init variables
$testcnt = 0;

# read ratings from the remaining file arguments (or stdin)
while (<>) {
	# the first whitespace-separated field is the user id
	($user) = split;
	if (! defined $ratingcnt{$user}) {
		$ratingcnt{$user} = 0;
	}
	++$ratingcnt{$user};
	# a user's start-th through stop-th ratings (in input order) go to the
	# test set, up to max_test lines in total (max_test <= 0 means no limit)
	if (($testcnt < $maxtest || $maxtest <= 0)
	&& $ratingcnt{$user} >= $start && $ratingcnt{$user} <= $stop) {
		++$testcnt;
		print TESTFILE;
	}
	else {
		print BASEFILE;
	}
}
|
||||
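For readers less familiar with Perl, the hold-out rule implemented by the loop in allbut.pl above can be sketched in Python. The function name `allbut_split` and the toy rating lines are hypothetical; the sketch keeps results in lists instead of writing `.base` and `.test` files.

```python
def allbut_split(lines, start, stop, max_test=0):
    """Mirror the allbut.pl loop on an iterable of rating lines.

    A user's start-th through stop-th ratings (in input order) go to the
    test set, up to max_test lines in total; max_test <= 0 means no limit.
    """
    rating_count = {}
    base, test = [], []
    for line in lines:
        user = line.split()[0]  # first whitespace-separated field
        rating_count[user] = rating_count.get(user, 0) + 1
        under_limit = max_test <= 0 or len(test) < max_test
        if under_limit and start <= rating_count[user] <= stop:
            test.append(line)
        else:
            base.append(line)
    return base, test

# Hypothetical input: two users with four ratings each.
lines = [f"{u}\t{i}\t5\t0" for u in (1, 2) for i in range(1, 5)]
base, test = allbut_split(lines, start=1, stop=2)
print(len(base), len(test))
```

With `start=1, stop=2`, each user's first two ratings are held out for testing and the rest stay in the training data, which is the "all but n" idea behind the script's name.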
25
input/16.RecommenderSystems/ml-100k/mku.sh
Executable file
@@ -0,0 +1,25 @@
#!/bin/sh

# remove the temporary file if the script is interrupted
# (single quotes defer the command until the signal arrives)
trap 'rm -f tmp.$$; exit 1' 1 2 15

# u.data is tab separated, so sort on a literal tab character
TAB=`printf '\t'`

for i in 1 2 3 4 5
do
	# the i-th block of 20000 ratings becomes the test set
	head -n `expr $i \* 20000` u.data | tail -n 20000 > tmp.$$
	sort -t"$TAB" -k 1,1n -k 2,2n tmp.$$ > u$i.test
	# the remaining 80000 ratings become the training set
	head -n `expr \( $i - 1 \) \* 20000` u.data > tmp.$$
	tail -n `expr \( 5 - $i \) \* 20000` u.data >> tmp.$$
	sort -t"$TAB" -k 1,1n -k 2,2n tmp.$$ > u$i.base
done

# run allbut.pl through perl so it works even when the current directory
# is not on PATH or perl is not installed at the shebang location
perl allbut.pl ua 1 10 100000 u.data
sort -t"$TAB" -k 1,1n -k 2,2n ua.base > tmp.$$
mv tmp.$$ ua.base
sort -t"$TAB" -k 1,1n -k 2,2n ua.test > tmp.$$
mv tmp.$$ ua.test

perl allbut.pl ub 11 20 100000 u.data
sort -t"$TAB" -k 1,1n -k 2,2n ub.base > tmp.$$
mv tmp.$$ ub.base
sort -t"$TAB" -k 1,1n -k 2,2n ub.test > tmp.$$
mv tmp.$$ ub.test
|
||||
|
||||
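The loop in mku.sh above builds five 80%/20% splits by taking each contiguous fifth of u.data in turn as the test set. A rough Python equivalent is sketched below; the function name `contiguous_folds` and the ten-row toy data set are assumptions, and the sketch requires the row count to be divisible by the number of folds (as 100000 is by 5).

```python
import random

def contiguous_folds(rows, k=5):
    """Return k (base, test) pairs; fold i uses the i-th contiguous block
    of rows as the test set and all remaining rows as the training set."""
    if len(rows) % k != 0:
        raise ValueError("this sketch assumes len(rows) is divisible by k")
    fold_size = len(rows) // k
    folds = []
    for i in range(k):
        lo, hi = i * fold_size, (i + 1) * fold_size
        # Sorting tuples orders by user id, then item id, like the sort calls.
        test = sorted(rows[lo:hi])
        base = sorted(rows[:lo] + rows[hi:])
        folds.append((base, test))
    return folds

# Hypothetical rows: (user id, item id, rating, timestamp), randomly ordered.
rows = [(user, item, 3, 0) for user in range(1, 6) for item in range(1, 3)]
random.Random(0).shuffle(rows)
folds = contiguous_folds(rows)
print([(len(base), len(test)) for base, test in folds])
```

Because u.data is already randomly ordered, slicing contiguous blocks yields random, mutually disjoint test sets, which is what makes the five splits suitable for 5-fold cross validation.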
100000
input/16.RecommenderSystems/ml-100k/u.data
Normal file
File diff suppressed because it is too large
20
input/16.RecommenderSystems/ml-100k/u.genre
Normal file
@@ -0,0 +1,20 @@
unknown|0
Action|1
Adventure|2
Animation|3
Children's|4
Comedy|5
Crime|6
Documentary|7
Drama|8
Fantasy|9
Film-Noir|10
Horror|11
Musical|12
Mystery|13
Romance|14
Sci-Fi|15
Thriller|16
War|17
Western|18
|
||||
|
||||
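The genre table above gives the order of the 19 genre flags that end each u.item record. A small Python sketch of decoding those flags follows. The helper names `load_genres` and `movie_genres` are hypothetical, and the u.item row is a made-up example built in the documented field order, not a row from the data set.

```python
GENRE_TABLE = """\
unknown|0
Action|1
Adventure|2
Animation|3
Children's|4
Comedy|5
Crime|6
Documentary|7
Drama|8
Fantasy|9
Film-Noir|10
Horror|11
Musical|12
Mystery|13
Romance|14
Sci-Fi|15
Thriller|16
War|17
Western|18
"""

def load_genres(text):
    """Return genre names ordered by their numeric id from u.genre text."""
    names = {}
    for line in text.splitlines():
        if line.strip():
            name, idx = line.split("|")
            names[int(idx)] = name
    return [names[i] for i in range(len(names))]

def movie_genres(item_line, genre_names):
    """Decode the trailing 0/1 genre flags of one pipe-separated u.item row."""
    fields = item_line.rstrip("\n").split("|")
    flags = fields[-len(genre_names):]
    return [name for name, flag in zip(genre_names, flags) if flag == "1"]

GENRES = load_genres(GENRE_TABLE)

# Made-up row: id, title, release date, video release date, IMDb URL, flags.
flags = ["0"] * len(GENRES)
for genre_id in (3, 4, 5):
    flags[genre_id] = "1"
row = "|".join(["1", "Example Movie (1995)", "01-Jan-1995", "", ""] + flags)
print(movie_genres(row, GENRES))
```

Reading the flags from the end of the record keeps the decoder independent of how many descriptive fields precede them.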
3
input/16.RecommenderSystems/ml-100k/u.info
Normal file
@@ -0,0 +1,3 @@
943 users
1682 items
100000 ratings
|
||||
1682
input/16.RecommenderSystems/ml-100k/u.item
Normal file
File diff suppressed because it is too large
21
input/16.RecommenderSystems/ml-100k/u.occupation
Normal file
@@ -0,0 +1,21 @@
administrator
artist
doctor
educator
engineer
entertainment
executive
healthcare
homemaker
lawyer
librarian
marketing
none
other
programmer
retired
salesman
scientist
student
technician
writer
|
||||
943
input/16.RecommenderSystems/ml-100k/u.user
Normal file
@@ -0,0 +1,943 @@
|
||||
1|24|M|technician|85711
|
||||
2|53|F|other|94043
|
||||
3|23|M|writer|32067
|
||||
4|24|M|technician|43537
|
||||
5|33|F|other|15213
|
||||
6|42|M|executive|98101
|
||||
7|57|M|administrator|91344
|
||||
8|36|M|administrator|05201
|
||||
9|29|M|student|01002
|
||||
10|53|M|lawyer|90703
|
||||
11|39|F|other|30329
|
||||
12|28|F|other|06405
|
||||
13|47|M|educator|29206
|
||||
14|45|M|scientist|55106
|
||||
15|49|F|educator|97301
|
||||
16|21|M|entertainment|10309
|
||||
17|30|M|programmer|06355
|
||||
18|35|F|other|37212
|
||||
19|40|M|librarian|02138
|
||||
20|42|F|homemaker|95660
|
||||
21|26|M|writer|30068
|
||||
22|25|M|writer|40206
|
||||
23|30|F|artist|48197
|
||||
24|21|F|artist|94533
|
||||
25|39|M|engineer|55107
|
||||
26|49|M|engineer|21044
|
||||
27|40|F|librarian|30030
|
||||
28|32|M|writer|55369
|
||||
29|41|M|programmer|94043
|
||||
30|7|M|student|55436
|
||||
31|24|M|artist|10003
|
||||
32|28|F|student|78741
|
||||
33|23|M|student|27510
|
||||
34|38|F|administrator|42141
|
||||
35|20|F|homemaker|42459
|
||||
36|19|F|student|93117
|
||||
37|23|M|student|55105
|
||||
38|28|F|other|54467
|
||||
39|41|M|entertainment|01040
|
||||
40|38|M|scientist|27514
|
||||
41|33|M|engineer|80525
|
||||
42|30|M|administrator|17870
|
||||
43|29|F|librarian|20854
|
||||
44|26|M|technician|46260
|
||||
45|29|M|programmer|50233
|
||||
46|27|F|marketing|46538
|
||||
47|53|M|marketing|07102
|
||||
48|45|M|administrator|12550
|
||||
49|23|F|student|76111
|
||||
50|21|M|writer|52245
|
||||
51|28|M|educator|16509
|
||||
52|18|F|student|55105
|
||||
53|26|M|programmer|55414
|
||||
54|22|M|executive|66315
|
||||
55|37|M|programmer|01331
|
||||
56|25|M|librarian|46260
|
||||
57|16|M|none|84010
|
||||
58|27|M|programmer|52246
|
||||
59|49|M|educator|08403
|
||||
60|50|M|healthcare|06472
|
||||
61|36|M|engineer|30040
|
||||
62|27|F|administrator|97214
|
||||
63|31|M|marketing|75240
|
||||
64|32|M|educator|43202
|
||||
65|51|F|educator|48118
|
||||
66|23|M|student|80521
|
||||
67|17|M|student|60402
|
||||
68|19|M|student|22904
|
||||
69|24|M|engineer|55337
|
||||
70|27|M|engineer|60067
|
||||
71|39|M|scientist|98034
|
||||
72|48|F|administrator|73034
|
||||
73|24|M|student|41850
|
||||
74|39|M|scientist|T8H1N
|
||||
75|24|M|entertainment|08816
|
||||
76|20|M|student|02215
|
||||
77|30|M|technician|29379
|
||||
78|26|M|administrator|61801
|
||||
79|39|F|administrator|03755
|
||||
80|34|F|administrator|52241
|
||||
81|21|M|student|21218
|
||||
82|50|M|programmer|22902
|
||||
83|40|M|other|44133
|
||||
84|32|M|executive|55369
|
||||
85|51|M|educator|20003
|
||||
86|26|M|administrator|46005
|
||||
87|47|M|administrator|89503
|
||||
88|49|F|librarian|11701
|
||||
89|43|F|administrator|68106
|
||||
90|60|M|educator|78155
|
||||
91|55|M|marketing|01913
|
||||
92|32|M|entertainment|80525
|
||||
93|48|M|executive|23112
|
||||
94|26|M|student|71457
|
||||
95|31|M|administrator|10707
|
||||
96|25|F|artist|75206
|
||||
97|43|M|artist|98006
|
||||
98|49|F|executive|90291
|
||||
99|20|M|student|63129
|
||||
100|36|M|executive|90254
|
||||
101|15|M|student|05146
|
||||
102|38|M|programmer|30220
|
||||
103|26|M|student|55108
|
||||
104|27|M|student|55108
|
||||
105|24|M|engineer|94043
|
||||
106|61|M|retired|55125
|
||||
107|39|M|scientist|60466
|
||||
108|44|M|educator|63130
|
||||
109|29|M|other|55423
|
||||
110|19|M|student|77840
|
||||
111|57|M|engineer|90630
|
||||
112|30|M|salesman|60613
|
||||
113|47|M|executive|95032
|
||||
114|27|M|programmer|75013
|
||||
115|31|M|engineer|17110
|
||||
116|40|M|healthcare|97232
|
||||
117|20|M|student|16125
|
||||
118|21|M|administrator|90210
|
||||
119|32|M|programmer|67401
|
||||
120|47|F|other|06260
|
||||
121|54|M|librarian|99603
|
||||
122|32|F|writer|22206
|
||||
123|48|F|artist|20008
|
||||
124|34|M|student|60615
|
||||
125|30|M|lawyer|22202
|
||||
126|28|F|lawyer|20015
|
||||
127|33|M|none|73439
|
||||
128|24|F|marketing|20009
|
||||
129|36|F|marketing|07039
|
||||
130|20|M|none|60115
|
||||
131|59|F|administrator|15237
|
||||
132|24|M|other|94612
|
||||
133|53|M|engineer|78602
|
||||
134|31|M|programmer|80236
|
||||
135|23|M|student|38401
|
||||
136|51|M|other|97365
|
||||
137|50|M|educator|84408
|
||||
138|46|M|doctor|53211
|
||||
139|20|M|student|08904
|
||||
140|30|F|student|32250
|
||||
141|49|M|programmer|36117
|
||||
142|13|M|other|48118
|
||||
143|42|M|technician|08832
|
||||
144|53|M|programmer|20910
|
||||
145|31|M|entertainment|V3N4P
|
||||
146|45|M|artist|83814
|
||||
147|40|F|librarian|02143
|
||||
148|33|M|engineer|97006
|
||||
149|35|F|marketing|17325
|
||||
150|20|F|artist|02139
|
||||
151|38|F|administrator|48103
|
||||
152|33|F|educator|68767
|
||||
153|25|M|student|60641
|
||||
154|25|M|student|53703
|
||||
155|32|F|other|11217
|
||||
156|25|M|educator|08360
|
||||
157|57|M|engineer|70808
|
||||
158|50|M|educator|27606
|
||||
159|23|F|student|55346
|
||||
160|27|M|programmer|66215
|
||||
161|50|M|lawyer|55104
|
||||
162|25|M|artist|15610
|
||||
163|49|M|administrator|97212
|
||||
164|47|M|healthcare|80123
|
||||
165|20|F|other|53715
|
||||
166|47|M|educator|55113
|
||||
167|37|M|other|L9G2B
|
||||
168|48|M|other|80127
|
||||
169|52|F|other|53705
|
||||
170|53|F|healthcare|30067
|
||||
171|48|F|educator|78750
|
||||
172|55|M|marketing|22207
|
||||
173|56|M|other|22306
|
||||
174|30|F|administrator|52302
|
||||
175|26|F|scientist|21911
|
||||
176|28|M|scientist|07030
|
||||
177|20|M|programmer|19104
|
||||
178|26|M|other|49512
|
||||
179|15|M|entertainment|20755
|
||||
180|22|F|administrator|60202
|
||||
181|26|M|executive|21218
|
||||
182|36|M|programmer|33884
|
||||
183|33|M|scientist|27708
|
||||
184|37|M|librarian|76013
|
||||
185|53|F|librarian|97403
|
||||
186|39|F|executive|00000
|
||||
187|26|M|educator|16801
|
||||
188|42|M|student|29440
|
||||
189|32|M|artist|95014
|
||||
190|30|M|administrator|95938
|
||||
191|33|M|administrator|95161
|
||||
192|42|M|educator|90840
|
||||
193|29|M|student|49931
|
||||
194|38|M|administrator|02154
|
||||
195|42|M|scientist|93555
|
||||
196|49|M|writer|55105
|
||||
197|55|M|technician|75094
|
||||
198|21|F|student|55414
|
||||
199|30|M|writer|17604
|
||||
200|40|M|programmer|93402
|
||||
201|27|M|writer|E2A4H
|
||||
202|41|F|educator|60201
|
||||
203|25|F|student|32301
|
||||
204|52|F|librarian|10960
|
||||
205|47|M|lawyer|06371
|
||||
206|14|F|student|53115
|
||||
207|39|M|marketing|92037
|
||||
208|43|M|engineer|01720
|
||||
209|33|F|educator|85710
|
||||
210|39|M|engineer|03060
|
||||
211|66|M|salesman|32605
|
||||
212|49|F|educator|61401
|
||||
213|33|M|executive|55345
|
||||
214|26|F|librarian|11231
|
||||
215|35|M|programmer|63033
|
||||
216|22|M|engineer|02215
|
||||
217|22|M|other|11727
|
||||
218|37|M|administrator|06513
|
||||
219|32|M|programmer|43212
|
||||
220|30|M|librarian|78205
|
||||
221|19|M|student|20685
|
||||
222|29|M|programmer|27502
|
||||
223|19|F|student|47906
|
||||
224|31|F|educator|43512
|
||||
225|51|F|administrator|58202
|
||||
226|28|M|student|92103
|
||||
227|46|M|executive|60659
|
||||
228|21|F|student|22003
|
||||
229|29|F|librarian|22903
|
||||
230|28|F|student|14476
|
||||
231|48|M|librarian|01080
|
||||
232|45|M|scientist|99709
|
||||
233|38|M|engineer|98682
|
||||
234|60|M|retired|94702
|
||||
235|37|M|educator|22973
|
||||
236|44|F|writer|53214
|
||||
237|49|M|administrator|63146
|
||||
238|42|F|administrator|44124
|
||||
239|39|M|artist|95628
|
||||
240|23|F|educator|20784
|
||||
241|26|F|student|20001
|
||||
242|33|M|educator|31404
|
||||
243|33|M|educator|60201
|
||||
244|28|M|technician|80525
|
||||
245|22|M|student|55109
|
||||
246|19|M|student|28734
|
||||
247|28|M|engineer|20770
|
||||
248|25|M|student|37235
|
||||
249|25|M|student|84103
|
||||
250|29|M|executive|95110
|
||||
251|28|M|doctor|85032
|
||||
252|42|M|engineer|07733
|
||||
253|26|F|librarian|22903
|
||||
254|44|M|educator|42647
|
||||
255|23|M|entertainment|07029
|
||||
256|35|F|none|39042
|
||||
257|17|M|student|77005
|
||||
258|19|F|student|77801
|
||||
259|21|M|student|48823
|
||||
260|40|F|artist|89801
|
||||
261|28|M|administrator|85202
|
||||
262|19|F|student|78264
|
||||
263|41|M|programmer|55346
|
||||
264|36|F|writer|90064
|
||||
265|26|M|executive|84601
|
||||
266|62|F|administrator|78756
|
||||
267|23|M|engineer|83716
|
||||
268|24|M|engineer|19422
|
||||
269|31|F|librarian|43201
|
||||
270|18|F|student|63119
|
||||
271|51|M|engineer|22932
|
||||
272|33|M|scientist|53706
|
||||
273|50|F|other|10016
|
||||
274|20|F|student|55414
|
||||
275|38|M|engineer|92064
|
||||
276|21|M|student|95064
|
||||
277|35|F|administrator|55406
|
||||
278|37|F|librarian|30033
|
||||
279|33|M|programmer|85251
|
||||
280|30|F|librarian|22903
|
||||
281|15|F|student|06059
|
||||
282|22|M|administrator|20057
|
||||
283|28|M|programmer|55305
|
||||
284|40|M|executive|92629
|
||||
285|25|M|programmer|53713
|
||||
286|27|M|student|15217
|
||||
287|21|M|salesman|31211
|
||||
288|34|M|marketing|23226
|
||||
289|11|M|none|94619
|
||||
290|40|M|engineer|93550
|
||||
291|19|M|student|44106
|
||||
292|35|F|programmer|94703
|
||||
293|24|M|writer|60804
|
||||
294|34|M|technician|92110
|
||||
295|31|M|educator|50325
|
||||
296|43|F|administrator|16803
|
||||
297|29|F|educator|98103
|
||||
298|44|M|executive|01581
|
||||
299|29|M|doctor|63108
|
||||
300|26|F|programmer|55106
|
||||
301|24|M|student|55439
|
||||
302|42|M|educator|77904
|
||||
303|19|M|student|14853
|
||||
304|22|F|student|71701
|
||||
305|23|M|programmer|94086
|
||||
306|45|M|other|73132
|
||||
307|25|M|student|55454
|
||||
308|60|M|retired|95076
|
||||
309|40|M|scientist|70802
|
||||
310|37|M|educator|91711
|
||||
311|32|M|technician|73071
|
||||
312|48|M|other|02110
|
||||
313|41|M|marketing|60035
|
||||
314|20|F|student|08043
|
||||
315|31|M|educator|18301
|
||||
316|43|F|other|77009
|
||||
317|22|M|administrator|13210
|
||||
318|65|M|retired|06518
|
||||
319|38|M|programmer|22030
|
||||
320|19|M|student|24060
|
||||
321|49|F|educator|55413
|
||||
322|20|M|student|50613
|
||||
323|21|M|student|19149
|
||||
324|21|F|student|02176
|
||||
325|48|M|technician|02139
|
||||
326|41|M|administrator|15235
|
||||
327|22|M|student|11101
|
||||
328|51|M|administrator|06779
|
||||
329|48|M|educator|01720
|
||||
330|35|F|educator|33884
|
||||
331|33|M|entertainment|91344
|
||||
332|20|M|student|40504
|
||||
333|47|M|other|V0R2M
|
||||
334|32|M|librarian|30002
|
||||
335|45|M|executive|33775
|
||||
336|23|M|salesman|42101
|
||||
337|37|M|scientist|10522
|
||||
338|39|F|librarian|59717
|
||||
339|35|M|lawyer|37901
|
||||
340|46|M|engineer|80123
|
||||
341|17|F|student|44405
|
||||
342|25|F|other|98006
|
||||
343|43|M|engineer|30093
|
||||
344|30|F|librarian|94117
|
||||
345|28|F|librarian|94143
|
||||
346|34|M|other|76059
|
||||
347|18|M|student|90210
|
||||
348|24|F|student|45660
|
||||
349|68|M|retired|61455
|
||||
350|32|M|student|97301
|
||||
351|61|M|educator|49938
|
||||
352|37|F|programmer|55105
|
||||
353|25|M|scientist|28480
|
||||
354|29|F|librarian|48197
|
||||
355|25|M|student|60135
|
||||
356|32|F|homemaker|92688
|
||||
357|26|M|executive|98133
|
||||
358|40|M|educator|10022
|
||||
359|22|M|student|61801
|
||||
360|51|M|other|98027
|
||||
361|22|M|student|44074
|
||||
362|35|F|homemaker|85233
|
||||
363|20|M|student|87501
|
||||
364|63|M|engineer|01810
|
||||
365|29|M|lawyer|20009
|
||||
366|20|F|student|50670
|
||||
367|17|M|student|37411
|
||||
368|18|M|student|92113
|
||||
369|24|M|student|91335
|
||||
370|52|M|writer|08534
|
||||
371|36|M|engineer|99206
|
||||
372|25|F|student|66046
|
||||
373|24|F|other|55116
|
||||
374|36|M|executive|78746
|
||||
375|17|M|entertainment|37777
|
||||
376|28|F|other|10010
|
||||
377|22|M|student|18015
|
||||
378|35|M|student|02859
|
||||
379|44|M|programmer|98117
|
||||
380|32|M|engineer|55117
|
||||
381|33|M|artist|94608
|
||||
382|45|M|engineer|01824
|
||||
383|42|M|administrator|75204
|
||||
384|52|M|programmer|45218
|
||||
385|36|M|writer|10003
|
||||
386|36|M|salesman|43221
|
||||
387|33|M|entertainment|37412
|
||||
388|31|M|other|36106
|
||||
389|44|F|writer|83702
|
||||
390|42|F|writer|85016
|
||||
391|23|M|student|84604
|
||||
392|52|M|writer|59801
|
||||
393|19|M|student|83686
|
||||
394|25|M|administrator|96819
|
||||
395|43|M|other|44092
|
||||
396|57|M|engineer|94551
|
||||
397|17|M|student|27514
|
||||
398|40|M|other|60008
|
||||
399|25|M|other|92374
|
||||
400|33|F|administrator|78213
|
||||
401|46|F|healthcare|84107
|
||||
402|30|M|engineer|95129
|
||||
403|37|M|other|06811
|
||||
404|29|F|programmer|55108
|
||||
405|22|F|healthcare|10019
|
||||
406|52|M|educator|93109
|
||||
407|29|M|engineer|03261
|
||||
408|23|M|student|61755
|
||||
409|48|M|administrator|98225
|
||||
410|30|F|artist|94025
|
||||
411|34|M|educator|44691
|
||||
412|25|M|educator|15222
|
||||
413|55|M|educator|78212
|
||||
414|24|M|programmer|38115
|
||||
415|39|M|educator|85711
|
||||
416|20|F|student|92626
|
||||
417|27|F|other|48103
|
||||
418|55|F|none|21206
|
||||
419|37|M|lawyer|43215
|
||||
420|53|M|educator|02140
|
||||
421|38|F|programmer|55105
|
||||
422|26|M|entertainment|94533
|
||||
423|64|M|other|91606
|
||||
424|36|F|marketing|55422
|
||||
425|19|M|student|58644
|
||||
426|55|M|educator|01602
|
||||
427|51|M|doctor|85258
|
||||
428|28|M|student|55414
|
||||
429|27|M|student|29205
|
||||
430|38|M|scientist|98199
|
||||
431|24|M|marketing|92629
|
||||
432|22|M|entertainment|50311
|
||||
433|27|M|artist|11211
|
||||
434|16|F|student|49705
|
||||
435|24|M|engineer|60007
|
||||
436|30|F|administrator|17345
|
||||
437|27|F|other|20009
|
||||
438|51|F|administrator|43204
|
||||
439|23|F|administrator|20817
|
||||
440|30|M|other|48076
|
||||
441|50|M|technician|55013
|
||||
442|22|M|student|85282
|
||||
443|35|M|salesman|33308
|
||||
444|51|F|lawyer|53202
|
||||
445|21|M|writer|92653
|
||||
446|57|M|educator|60201
|
||||
447|30|M|administrator|55113
|
||||
448|23|M|entertainment|10021
|
||||
449|23|M|librarian|55021
|
||||
450|35|F|educator|11758
|
||||
451|16|M|student|48446
|
||||
452|35|M|administrator|28018
|
||||
453|18|M|student|06333
|
||||
454|57|M|other|97330
|
||||
455|48|M|administrator|83709
|
||||
456|24|M|technician|31820
|
||||
457|33|F|salesman|30011
|
||||
458|47|M|technician|Y1A6B
|
||||
459|22|M|student|29201
|
||||
460|44|F|other|60630
|
||||
461|15|M|student|98102
|
||||
462|19|F|student|02918
|
||||
463|48|F|healthcare|75218
|
||||
464|60|M|writer|94583
|
||||
465|32|M|other|05001
|
||||
466|22|M|student|90804
|
||||
467|29|M|engineer|91201
|
||||
468|28|M|engineer|02341
|
||||
469|60|M|educator|78628
|
||||
470|24|M|programmer|10021
|
||||
471|10|M|student|77459
|
||||
472|24|M|student|87544
|
||||
473|29|M|student|94708
|
||||
474|51|M|executive|93711
|
||||
475|30|M|programmer|75230
|
||||
476|28|M|student|60440
|
||||
477|23|F|student|02125
|
||||
478|29|M|other|10019
|
||||
479|30|M|educator|55409
|
||||
480|57|M|retired|98257
|
||||
481|73|M|retired|37771
|
||||
482|18|F|student|40256
|
||||
483|29|M|scientist|43212
|
||||
484|27|M|student|21208
|
||||
485|44|F|educator|95821
|
||||
486|39|M|educator|93101
|
||||
487|22|M|engineer|92121
|
||||
488|48|M|technician|21012
|
||||
489|55|M|other|45218
|
||||
490|29|F|artist|V5A2B
|
||||
491|43|F|writer|53711
|
||||
492|57|M|educator|94618
|
||||
493|22|M|engineer|60090
|
||||
494|38|F|administrator|49428
|
||||
495|29|M|engineer|03052
|
||||
496|21|F|student|55414
|
||||
497|20|M|student|50112
|
||||
498|26|M|writer|55408
|
||||
499|42|M|programmer|75006
|
||||
500|28|M|administrator|94305
|
||||
501|22|M|student|10025
|
||||
502|22|M|student|23092
|
||||
503|50|F|writer|27514
|
||||
504|40|F|writer|92115
|
||||
505|27|F|other|20657
|
||||
506|46|M|programmer|03869
|
||||
507|18|F|writer|28450
|
||||
508|27|M|marketing|19382
|
||||
509|23|M|administrator|10011
|
||||
510|34|M|other|98038
|
||||
511|22|M|student|21250
|
||||
512|29|M|other|20090
|
||||
513|43|M|administrator|26241
|
||||
514|27|M|programmer|20707
|
||||
515|53|M|marketing|49508
|
||||
516|53|F|librarian|10021
|
||||
517|24|M|student|55454
|
||||
518|49|F|writer|99709
|
||||
519|22|M|other|55320
|
||||
520|62|M|healthcare|12603
|
||||
521|19|M|student|02146
|
||||
522|36|M|engineer|55443
|
||||
523|50|F|administrator|04102
|
524|56|M|educator|02159
525|27|F|administrator|19711
526|30|M|marketing|97124
527|33|M|librarian|12180
528|18|M|student|55104
529|47|F|administrator|44224
530|29|M|engineer|94040
531|30|F|salesman|97408
532|20|M|student|92705
533|43|M|librarian|02324
534|20|M|student|05464
535|45|F|educator|80302
536|38|M|engineer|30078
537|36|M|engineer|22902
538|31|M|scientist|21010
539|53|F|administrator|80303
540|28|M|engineer|91201
541|19|F|student|84302
542|21|M|student|60515
543|33|M|scientist|95123
544|44|F|other|29464
545|27|M|technician|08052
546|36|M|executive|22911
547|50|M|educator|14534
548|51|M|writer|95468
549|42|M|scientist|45680
550|16|F|student|95453
551|25|M|programmer|55414
552|45|M|other|68147
553|58|M|educator|62901
554|32|M|scientist|62901
555|29|F|educator|23227
556|35|F|educator|30606
557|30|F|writer|11217
558|56|F|writer|63132
559|69|M|executive|10022
560|32|M|student|10003
561|23|M|engineer|60005
562|54|F|administrator|20879
563|39|F|librarian|32707
564|65|M|retired|94591
565|40|M|student|55422
566|20|M|student|14627
567|24|M|entertainment|10003
568|39|M|educator|01915
569|34|M|educator|91903
570|26|M|educator|14627
571|34|M|artist|01945
572|51|M|educator|20003
573|68|M|retired|48911
574|56|M|educator|53188
575|33|M|marketing|46032
576|48|M|executive|98281
577|36|F|student|77845
578|31|M|administrator|M7A1A
579|32|M|educator|48103
580|16|M|student|17961
581|37|M|other|94131
582|17|M|student|93003
583|44|M|engineer|29631
584|25|M|student|27511
585|69|M|librarian|98501
586|20|M|student|79508
587|26|M|other|14216
588|18|F|student|93063
589|21|M|lawyer|90034
590|50|M|educator|82435
591|57|F|librarian|92093
592|18|M|student|97520
593|31|F|educator|68767
594|46|M|educator|M4J2K
595|25|M|programmer|31909
596|20|M|artist|77073
597|23|M|other|84116
598|40|F|marketing|43085
599|22|F|student|R3T5K
600|34|M|programmer|02320
601|19|F|artist|99687
602|47|F|other|34656
603|21|M|programmer|47905
604|39|M|educator|11787
605|33|M|engineer|33716
606|28|M|programmer|63044
607|49|F|healthcare|02154
608|22|M|other|10003
609|13|F|student|55106
610|22|M|student|21227
611|46|M|librarian|77008
612|36|M|educator|79070
613|37|F|marketing|29678
614|54|M|educator|80227
615|38|M|educator|27705
616|55|M|scientist|50613
617|27|F|writer|11201
618|15|F|student|44212
619|17|M|student|44134
620|18|F|writer|81648
621|17|M|student|60402
622|25|M|programmer|14850
623|50|F|educator|60187
624|19|M|student|30067
625|27|M|programmer|20723
626|23|M|scientist|19807
627|24|M|engineer|08034
628|13|M|none|94306
629|46|F|other|44224
630|26|F|healthcare|55408
631|18|F|student|38866
632|18|M|student|55454
633|35|M|programmer|55414
634|39|M|engineer|T8H1N
635|22|M|other|23237
636|47|M|educator|48043
637|30|M|other|74101
638|45|M|engineer|01940
639|42|F|librarian|12065
640|20|M|student|61801
641|24|M|student|60626
642|18|F|student|95521
643|39|M|scientist|55122
644|51|M|retired|63645
645|27|M|programmer|53211
646|17|F|student|51250
647|40|M|educator|45810
648|43|M|engineer|91351
649|20|M|student|39762
650|42|M|engineer|83814
651|65|M|retired|02903
652|35|M|other|22911
653|31|M|executive|55105
654|27|F|student|78739
655|50|F|healthcare|60657
656|48|M|educator|10314
657|26|F|none|78704
658|33|M|programmer|92626
659|31|M|educator|54248
660|26|M|student|77380
661|28|M|programmer|98121
662|55|M|librarian|19102
663|26|M|other|19341
664|30|M|engineer|94115
665|25|M|administrator|55412
666|44|M|administrator|61820
667|35|M|librarian|01970
668|29|F|writer|10016
669|37|M|other|20009
670|30|M|technician|21114
671|21|M|programmer|91919
672|54|F|administrator|90095
673|51|M|educator|22906
674|13|F|student|55337
675|34|M|other|28814
676|30|M|programmer|32712
677|20|M|other|99835
678|50|M|educator|61462
679|20|F|student|54302
680|33|M|lawyer|90405
681|44|F|marketing|97208
682|23|M|programmer|55128
683|42|M|librarian|23509
684|28|M|student|55414
685|32|F|librarian|55409
686|32|M|educator|26506
687|31|F|healthcare|27713
688|37|F|administrator|60476
689|25|M|other|45439
690|35|M|salesman|63304
691|34|M|educator|60089
692|34|M|engineer|18053
693|43|F|healthcare|85210
694|60|M|programmer|06365
695|26|M|writer|38115
696|55|M|other|94920
697|25|M|other|77042
698|28|F|programmer|06906
699|44|M|other|96754
700|17|M|student|76309
701|51|F|librarian|56321
702|37|M|other|89104
703|26|M|educator|49512
704|51|F|librarian|91105
705|21|F|student|54494
706|23|M|student|55454
707|56|F|librarian|19146
708|26|F|homemaker|96349
709|21|M|other|N4T1A
710|19|M|student|92020
711|22|F|student|15203
712|22|F|student|54901
713|42|F|other|07204
714|26|M|engineer|55343
715|21|M|technician|91206
716|36|F|administrator|44265
717|24|M|technician|84105
718|42|M|technician|64118
719|37|F|other|V0R2H
720|49|F|administrator|16506
721|24|F|entertainment|11238
722|50|F|homemaker|17331
723|26|M|executive|94403
724|31|M|executive|40243
725|21|M|student|91711
726|25|F|administrator|80538
727|25|M|student|78741
728|58|M|executive|94306
729|19|M|student|56567
730|31|F|scientist|32114
731|41|F|educator|70403
732|28|F|other|98405
733|44|F|other|60630
734|25|F|other|63108
735|29|F|healthcare|85719
736|48|F|writer|94618
737|30|M|programmer|98072
738|35|M|technician|95403
739|35|M|technician|73162
740|25|F|educator|22206
741|25|M|writer|63108
742|35|M|student|29210
743|31|M|programmer|92660
744|35|M|marketing|47024
745|42|M|writer|55113
746|25|M|engineer|19047
747|19|M|other|93612
748|28|M|administrator|94720
749|33|M|other|80919
750|28|M|administrator|32303
751|24|F|other|90034
752|60|M|retired|21201
753|56|M|salesman|91206
754|59|F|librarian|62901
755|44|F|educator|97007
756|30|F|none|90247
757|26|M|student|55104
758|27|M|student|53706
759|20|F|student|68503
760|35|F|other|14211
761|17|M|student|97302
762|32|M|administrator|95050
763|27|M|scientist|02113
764|27|F|educator|62903
765|31|M|student|33066
766|42|M|other|10960
767|70|M|engineer|00000
768|29|M|administrator|12866
769|39|M|executive|06927
770|28|M|student|14216
771|26|M|student|15232
772|50|M|writer|27105
773|20|M|student|55414
774|30|M|student|80027
775|46|M|executive|90036
776|30|M|librarian|51157
777|63|M|programmer|01810
778|34|M|student|01960
779|31|M|student|K7L5J
780|49|M|programmer|94560
781|20|M|student|48825
782|21|F|artist|33205
783|30|M|marketing|77081
784|47|M|administrator|91040
785|32|M|engineer|23322
786|36|F|engineer|01754
787|18|F|student|98620
788|51|M|administrator|05779
789|29|M|other|55420
790|27|M|technician|80913
791|31|M|educator|20064
792|40|M|programmer|12205
793|22|M|student|85281
794|32|M|educator|57197
795|30|M|programmer|08610
796|32|F|writer|33755
797|44|F|other|62522
798|40|F|writer|64131
799|49|F|administrator|19716
800|25|M|programmer|55337
801|22|M|writer|92154
802|35|M|administrator|34105
803|70|M|administrator|78212
804|39|M|educator|61820
805|27|F|other|20009
806|27|M|marketing|11217
807|41|F|healthcare|93555
808|45|M|salesman|90016
809|50|F|marketing|30803
810|55|F|other|80526
811|40|F|educator|73013
812|22|M|technician|76234
813|14|F|student|02136
814|30|M|other|12345
815|32|M|other|28806
816|34|M|other|20755
817|19|M|student|60152
818|28|M|librarian|27514
819|59|M|administrator|40205
820|22|M|student|37725
821|37|M|engineer|77845
822|29|F|librarian|53144
823|27|M|artist|50322
824|31|M|other|15017
825|44|M|engineer|05452
826|28|M|artist|77048
827|23|F|engineer|80228
828|28|M|librarian|85282
829|48|M|writer|80209
830|46|M|programmer|53066
831|21|M|other|33765
832|24|M|technician|77042
833|34|M|writer|90019
834|26|M|other|64153
835|44|F|executive|11577
836|44|M|artist|10018
837|36|F|artist|55409
838|23|M|student|01375
839|38|F|entertainment|90814
840|39|M|artist|55406
841|45|M|doctor|47401
842|40|M|writer|93055
843|35|M|librarian|44212
844|22|M|engineer|95662
845|64|M|doctor|97405
846|27|M|lawyer|47130
847|29|M|student|55417
848|46|M|engineer|02146
849|15|F|student|25652
850|34|M|technician|78390
851|18|M|other|29646
852|46|M|administrator|94086
853|49|M|writer|40515
854|29|F|student|55408
855|53|M|librarian|04988
856|43|F|marketing|97215
857|35|F|administrator|V1G4L
858|63|M|educator|09645
859|18|F|other|06492
860|70|F|retired|48322
861|38|F|student|14085
862|25|M|executive|13820
863|17|M|student|60089
864|27|M|programmer|63021
865|25|M|artist|11231
866|45|M|other|60302
867|24|M|scientist|92507
868|21|M|programmer|55303
869|30|M|student|10025
870|22|M|student|65203
871|31|M|executive|44648
872|19|F|student|74078
873|48|F|administrator|33763
874|36|M|scientist|37076
875|24|F|student|35802
876|41|M|other|20902
877|30|M|other|77504
878|50|F|educator|98027
879|33|F|administrator|55337
880|13|M|student|83702
881|39|M|marketing|43017
882|35|M|engineer|40503
883|49|M|librarian|50266
884|44|M|engineer|55337
885|30|F|other|95316
886|20|M|student|61820
887|14|F|student|27249
888|41|M|scientist|17036
889|24|M|technician|78704
890|32|M|student|97301
891|51|F|administrator|03062
892|36|M|other|45243
893|25|M|student|95823
894|47|M|educator|74075
895|31|F|librarian|32301
896|28|M|writer|91505
897|30|M|other|33484
898|23|M|homemaker|61755
899|32|M|other|55116
900|60|M|retired|18505
901|38|M|executive|L1V3W
902|45|F|artist|97203
903|28|M|educator|20850
904|17|F|student|61073
905|27|M|other|30350
906|45|M|librarian|70124
907|25|F|other|80526
908|44|F|librarian|68504
909|50|F|educator|53171
910|28|M|healthcare|29301
911|37|F|writer|53210
912|51|M|other|06512
913|27|M|student|76201
914|44|F|other|08105
915|50|M|entertainment|60614
916|27|M|engineer|N2L5N
917|22|F|student|20006
918|40|M|scientist|70116
919|25|M|other|14216
920|30|F|artist|90008
921|20|F|student|98801
922|29|F|administrator|21114
923|21|M|student|E2E3R
924|29|M|other|11753
925|18|F|salesman|49036
926|49|M|entertainment|01701
927|23|M|programmer|55428
928|21|M|student|55408
929|44|M|scientist|53711
930|28|F|scientist|07310
931|60|M|educator|33556
932|58|M|educator|06437
933|28|M|student|48105
934|61|M|engineer|22902
935|42|M|doctor|66221
936|24|M|other|32789
937|48|M|educator|98072
938|38|F|technician|55038
939|26|F|student|33319
940|32|M|administrator|02215
941|20|M|student|97229
942|48|F|librarian|78209
943|22|M|student|77841
80000
input/16.RecommenderSystems/ml-100k/u1.base
Normal file
File diff suppressed because it is too large
20000
input/16.RecommenderSystems/ml-100k/u1.test
Normal file
File diff suppressed because it is too large
80000
input/16.RecommenderSystems/ml-100k/u2.base
Normal file
File diff suppressed because it is too large
20000
input/16.RecommenderSystems/ml-100k/u2.test
Normal file
File diff suppressed because it is too large
80000
input/16.RecommenderSystems/ml-100k/u3.base
Normal file
File diff suppressed because it is too large
20000
input/16.RecommenderSystems/ml-100k/u3.test
Normal file
File diff suppressed because it is too large
80000
input/16.RecommenderSystems/ml-100k/u4.base
Normal file
File diff suppressed because it is too large
20000
input/16.RecommenderSystems/ml-100k/u4.test
Normal file
File diff suppressed because it is too large
80000
input/16.RecommenderSystems/ml-100k/u5.base
Normal file
File diff suppressed because it is too large
20000
input/16.RecommenderSystems/ml-100k/u5.test
Normal file
File diff suppressed because it is too large
90570
input/16.RecommenderSystems/ml-100k/ua.base
Normal file
File diff suppressed because it is too large
9430
input/16.RecommenderSystems/ml-100k/ua.test
Normal file
File diff suppressed because it is too large
90570
input/16.RecommenderSystems/ml-100k/ub.base
Normal file
File diff suppressed because it is too large
9430
input/16.RecommenderSystems/ml-100k/ub.test
Normal file
File diff suppressed because it is too large
@@ -3,7 +3,7 @@
'''
Created on 2017-05-18
Update on 2017-05-18
-@author: Peter Harrington/山上有课树
+@author: Peter Harrington/1988/片刻
Machine Learning in Action, updates at: https://github.com/apachecn/MachineLearning
'''
from numpy import random, mat, eye
68
src/python/16.RecommenderSystems/sklearn-RS-demo-cf.py
Normal file
@@ -0,0 +1,68 @@
#!/usr/bin/python
# coding:utf8

from math import sqrt

import numpy as np
import pandas as pd
from scipy.sparse.linalg import svds
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics.pairwise import pairwise_distances

# Load the dataset
header = ['user_id', 'item_id', 'rating', 'timestamp']
# http://files.grouplens.org/datasets/movielens/ml-100k.zip
dataFile = 'input/16.RecommenderSystems/ml-100k/u.data'
df = pd.read_csv(dataFile, sep='\t', names=header)

n_users = df.user_id.unique().shape[0]
n_items = df.item_id.unique().shape[0]
print('Number of users = ' + str(n_users) + ' | Number of movies = ' + str(n_items))

# Split the dataset
train_data, test_data = train_test_split(df, test_size=0.25)

# Build the user-item matrices: one for the training data and one for the test data
train_data_matrix = np.zeros((n_users, n_items))
for line in train_data.itertuples():
    train_data_matrix[line[1]-1, line[2]-1] = line[3]
test_data_matrix = np.zeros((n_users, n_items))
for line in test_data.itertuples():
    test_data_matrix[line[1]-1, line[2]-1] = line[3]
# Use sklearn's pairwise_distances; note metric="cosine" returns cosine distance (1 - cosine similarity)
user_similarity = pairwise_distances(train_data_matrix, metric="cosine")
item_similarity = pairwise_distances(train_data_matrix.T, metric="cosine")


def predict(rating, similarity, type='user'):
    if type == 'user':
        mean_user_rating = rating.mean(axis=1)
        rating_diff = (rating - mean_user_rating[:, np.newaxis])
        pred = mean_user_rating[:, np.newaxis] + similarity.dot(rating_diff)/np.array([np.abs(similarity).sum(axis=1)]).T
    elif type == 'item':
        pred = rating.dot(similarity) / np.array([np.abs(similarity).sum(axis=1)])
    return pred


user_prediction = predict(train_data_matrix, user_similarity, type='user')
item_prediction = predict(train_data_matrix, item_similarity, type='item')


def rmse(prediction, ground_truth):
    prediction = prediction[ground_truth.nonzero()].flatten()  # score only entries rated in ground_truth
    ground_truth = ground_truth[ground_truth.nonzero()].flatten()
    return sqrt(mean_squared_error(prediction, ground_truth))


print('User based CF RMSE: ' + str(rmse(user_prediction, test_data_matrix)))
print('Item based CF RMSE: ' + str(rmse(item_prediction, test_data_matrix)))

sparsity = round(1.0 - len(df)/float(n_users*n_items), 3)
print('The sparsity level of MovieLens 100K is ' + str(sparsity * 100) + '%')


u, s, vt = svds(train_data_matrix, k=20)
s_diag_matrix = np.diag(s)
x_pred = np.dot(np.dot(u, s_diag_matrix), vt)
print('SVD-based CF RMSE: ' + str(rmse(x_pred, test_data_matrix)))
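The script above stores the output of `pairwise_distances(..., metric="cosine")` in variables named `user_similarity` and `item_similarity`. That metric is a distance, i.e. 1 minus the cosine similarity. The dependency-free sketch below illustrates the relationship on made-up vectors; the helper names and the toy ratings are hypothetical and not part of the commit.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


# Hypothetical toy ratings (0 = not rated), not taken from u.data.
user_a = [5, 3, 0, 4]
user_b = [5, 3, 0, 4]  # same ratings as user_a
user_c = [0, 0, 5, 0]  # no rated items in common with user_a

print(cosine_similarity(user_a, user_b), cosine_distance(user_a, user_b))
print(cosine_similarity(user_a, user_c), cosine_distance(user_a, user_c))
```

Two users with the same ratings get a distance close to 0, and two users with nothing in common get a distance of 1. If a similarity weight is what the prediction step is meant to use, `1 - distance` would be the quantity to pass in; whether that was the author's intent is not stated in the commit.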
30
src/python/16.RecommenderSystems/sklearn-RS-demo-item.py
Normal file
@@ -0,0 +1,30 @@
#!/usr/bin/python
# coding:utf8

import numpy as np
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt

RATE_MATRIX = np.array(
    [[5, 5, 3, 0, 5, 5],
     [5, 0, 4, 0, 4, 4],
     [0, 3, 0, 5, 4, 5],
     [5, 4, 3, 3, 5, 5]]
)

nmf = NMF(n_components=2)
user_distribution = nmf.fit_transform(RATE_MATRIX)
item_distribution = nmf.components_

item_distribution = item_distribution.T
plt.plot(item_distribution[:, 0], item_distribution[:, 1], "b*")
plt.xlim((-1, 3))
plt.ylim((-1, 3))

plt.title(u'the distribution of items (NMF)')
count = 1
for item in item_distribution:
    plt.text(item[0], item[1], 'item '+str(count), bbox=dict(facecolor='red', alpha=0.2))
    count += 1

plt.show()
31
src/python/16.RecommenderSystems/sklearn-RS-demo-user.py
Normal file
@@ -0,0 +1,31 @@
#!/usr/bin/python
# coding:utf8

import numpy as np
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt

RATE_MATRIX = np.array(
    [[5, 5, 3, 0, 5, 5],
     [5, 0, 4, 0, 4, 4],
     [0, 3, 0, 5, 4, 5],
     [5, 4, 3, 3, 5, 5]]
)

nmf = NMF(n_components=2)
user_distribution = nmf.fit_transform(RATE_MATRIX)
item_distribution = nmf.components_

users = ['Ben', 'Tom', 'John', 'Fred']
zip_data = zip(users, user_distribution)

plt.title(u'the distribution of users (NMF)')
plt.xlim((-1, 3))
plt.ylim((-1, 4))
for item in zip_data:
    user_name = item[0]
    data = item[1]
    plt.plot(data[0], data[1], "b*")
    plt.text(data[0], data[1], user_name, bbox=dict(facecolor='red', alpha=0.2))

plt.show()
22
src/python/16.RecommenderSystems/sklearn-RS-demo.py
Normal file
@@ -0,0 +1,22 @@
#!/usr/bin/python
# coding:utf8

import numpy as np
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt

RATE_MATRIX = np.array(
    [[5, 5, 3, 0, 5, 5],
     [5, 0, 4, 0, 4, 4],
     [0, 3, 0, 5, 4, 5],
     [5, 4, 3, 3, 5, 5]]
)

nmf = NMF(n_components=2)  # assume 2 latent topics
user_distribution = nmf.fit_transform(RATE_MATRIX)
item_distribution = nmf.components_

print('Topic distribution of users:')
print(user_distribution)
print('Topic distribution of items:')
print(item_distribution)
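The three demo scripts above factor `RATE_MATRIX` into a user-by-topic matrix and a topic-by-item matrix with `sklearn.decomposition.NMF`. To make that factorization concrete, here is a rough pure-Python sketch using multiplicative updates. It is an illustration under assumptions, not scikit-learn's implementation (which by default uses a coordinate-descent solver and a different initialization), so its numbers will not match the scripts' output. The `nmf`, `matmul`, `transpose`, and `frobenius_error` helpers are hypothetical names.

```python
import random

RATE_MATRIX = [
    [5, 5, 3, 0, 5, 5],
    [5, 0, 4, 0, 4, 4],
    [0, 3, 0, 5, 4, 5],
    [5, 4, 3, 3, 5, 5],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, n_components=2, n_iter=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_components)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_components)]
    initial_error = frobenius_error(v, w, h)
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(n_components)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_components)]
             for i in range(n_rows)]
    return w, h, initial_error


user_distribution, item_distribution, initial_error = nmf(RATE_MATRIX)
final_error = frobenius_error(RATE_MATRIX, user_distribution, item_distribution)
print('reconstruction error: %.3f -> %.3f' % (initial_error, final_error))
```

Each row of `user_distribution` plays the role of one user's topic weights, and each column of `item_distribution` one item's topic weights, which is what the plotting scripts scatter in two dimensions.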
Binary file not shown.