This project applies machine learning algorithms to college admissions.
Files contained in this directory:
- prepare_data.rb: reads an initial CSV file of data instances and generates text files of training and testing instances, formatted for each machine learning algorithm
- tree.rb: an ID3 decision tree which can be trained with a file generated by prepare_data.rb
- ann.rb: an artificial neural network that can be trained with a file generated by prepare_data.rb
- naive_bayes.rb: a naive Bayes implementation that can be trained with a file generated by prepare_data.rb
- svm.rb: a support vector machine that can be trained with a file generated by prepare_data.rb
- classifiers.rb: trains and runs all algorithms for comparison purposes
- zips.rb: calculates the distance between two zip codes
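
As an illustration of the kind of calculation zips.rb performs, here is a minimal sketch of a great-circle (haversine) distance between two zip codes. It is not taken from zips.rb: the ZIP_COORDS lookup table, its approximate coordinates, and the zip_distance name are assumptions made for the example, and the real script may use a different data source or formula.

```ruby
# Hypothetical lookup table: zip code => [latitude, longitude] in degrees.
# The coordinates are approximate and included for illustration only.
ZIP_COORDS = {
  "10001" => [40.7506, -73.9972],
  "90210" => [34.0901, -118.4065]
}.freeze

EARTH_RADIUS_MILES = 3958.8

def to_radians(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in miles between two zip codes (haversine formula).
# Raises KeyError if either zip code is missing from the lookup table.
def zip_distance(zip_a, zip_b, coords = ZIP_COORDS)
  lat1, lon1 = coords.fetch(zip_a).map { |d| to_radians(d) }
  lat2, lon2 = coords.fetch(zip_b).map { |d| to_radians(d) }
  dlat = lat2 - lat1
  dlon = lon2 - lon1
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
end
```

Calling zip_distance("10001", "90210") returns the distance in miles as a Float, which could then be used as a numeric attribute for each applicant.
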
Proposed schedule for feature implementation:
Milestone #1:
- implement artificial neural network and decision tree algorithms
- include support for training and testing
- process data to be used by the algorithms
Milestone #2:
- implement naive_bayes
- ensure that training results can be saved for all algorithms (config files)
- write separate file to train and test all algorithms
Milestone #3:
- reorganize and improve prepare_data.rb
- implement support vector machine
- incorporate naive_bayes into design paradigm (accept input from prepare_data.rb)
- error breakdown for classifications
- partition training and test sets by year for realism
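
The year-based partition mentioned above could look roughly like the sketch below: earlier years go to training and later years to testing, so the classifier is always evaluated on applicants it could not have seen. The :year key, the split_by_year name, and the sample records are assumptions for illustration; prepare_data.rb may represent instances differently.

```ruby
# Splits instances into [training, testing] by application year.
# Instances from years before first_test_year are used for training;
# instances from first_test_year onward are held out for testing.
# Each instance is assumed to be a Hash with a :year key (hypothetical).
def split_by_year(instances, first_test_year)
  instances.partition { |inst| inst[:year] < first_test_year }
end

# Made-up sample records, for illustration only.
sample = [
  { year: 2008, label: "accept" },
  { year: 2009, label: "reject" },
  { year: 2010, label: "accept" }
]
training, testing = split_by_year(sample, 2010)
```
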
Milestone #4:
- calculate distances between zip codes
- perform necessary processing to utilize all useful data attributes provided
- allow the option of using two class values (accept or not) rather than three
- make adjustments to each algorithm to gauge their highest level of performance
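
The two-class option listed above can be sketched as a simple label-collapsing step applied before training. The label strings used here ("accept", "not_accept", and the three-class examples in the comments) are assumptions, since this README does not name the three class values.

```ruby
# Assumed label names; the three original class values are not specified
# in this README, so these strings are hypothetical.
ACCEPT_LABEL = "accept"
NOT_ACCEPT_LABEL = "not_accept"

# Maps any label other than ACCEPT_LABEL to NOT_ACCEPT_LABEL.
def to_two_class(label)
  label == ACCEPT_LABEL ? ACCEPT_LABEL : NOT_ACCEPT_LABEL
end

# Returns a new array of labels, collapsed to two classes when
# two_class is true and copied unchanged otherwise.
def collapse_labels(labels, two_class: true)
  two_class ? labels.map { |label| to_two_class(label) } : labels.dup
end
```

Keeping the collapse behind a flag means the same prepared data can be run in either mode, which makes it easier to compare two-class and three-class accuracy for each algorithm.
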
Wrap-up:
- confusion matrices to better understand the error breakdown
- select the single most promising algorithm
- finish improvements to selected algorithm
- build interface to allow use by admissions staff
- if feasible, implement any additional helpful features requested by the admissions office
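
For the confusion matrices in the wrap-up items, a minimal sketch is shown below. It counts how often each actual class was predicted as each class, which shows where a classifier's errors are concentrated. The confusion_matrix name and the nested-Hash layout are assumptions for illustration, not the project's actual code.

```ruby
# Builds a confusion matrix as a nested Hash: matrix[actual][predicted]
# is the number of instances of class `actual` classified as `predicted`.
# Unseen combinations read as 0.
def confusion_matrix(actual, predicted)
  unless actual.length == predicted.length
    raise ArgumentError, "actual and predicted must have the same length"
  end

  matrix = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  actual.zip(predicted).each do |truth, guess|
    matrix[truth][guess] += 1
  end
  matrix
end

# Prints one row per actual class, one column per class label.
def print_confusion_matrix(matrix, labels)
  puts (["actual\\predicted"] + labels).join("\t")
  labels.each do |truth|
    puts ([truth] + labels.map { |guess| matrix[truth][guess] }).join("\t")
  end
end
```

The diagonal entries are correct classifications; any off-diagonal count identifies a specific kind of mistake, which is more informative than a single accuracy figure when comparing algorithms.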