This repository contains a set of Python scripts that count the occurrences of words in the *.txt files of a given directory (all other files are ignored). The count is computed for each unique word across all the documents taken together.
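To make the idea concrete, here is a minimal sketch of counting words across every *.txt file in a directory. It is not the code from this repository: the function name count_words and the tokenization rule (lowercased runs of letters, digits, and apostrophes) are assumptions made for illustration.

```python
import os
import re
from collections import Counter


def count_words(directory):
    """Count occurrences of each word across all *.txt files in `directory`.

    Hypothetical helper for illustration; not taken from the repository.
    """
    counts = Counter()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Only regular files ending in .txt are read; everything else is ignored.
        if not (os.path.isfile(path) and name.endswith(".txt")):
            continue
        with open(path, encoding="utf-8", errors="ignore") as handle:
            text = handle.read().lower()
        # Assumption: a word is a run of letters, digits, or apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", text))
    return counts
```

Calling count_words on a directory returns a single Counter that maps each unique word to its total count over all the documents, which matches the "considered together" behaviour described above.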
RUNNING THE SCRIPTS
-------------------
gen_doc_class_input.py is the main script. Run it as follows:
$ python gen_doc_class_input.py <path to a directory with *.txt files>
If you want the final word count vectors to be written to a file, run the program as follows:
$ python gen_doc_class_input.py <path to a directory with *.txt files> -f <output_file_name>
Note: the first argument must always be the path, and -f must always be followed by the name of the file to write to.
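The command-line shape described above (a required path plus an optional -f with a file name) could be handled along these lines. This is only a sketch using the standard argparse module. The names parse_args and output_file are hypothetical, and the actual script may parse its arguments differently.

```python
import argparse


def parse_args(argv):
    """Parse a directory path and an optional -f <output_file_name>.

    Hypothetical helper for illustration; not taken from the repository.
    """
    parser = argparse.ArgumentParser(
        description="Count words in the *.txt files of a directory."
    )
    # Positional argument: the directory that holds the *.txt files.
    parser.add_argument("path", help="directory containing *.txt files")
    # Optional -f: where to write the word counts; None means no file is written.
    parser.add_argument("-f", dest="output_file", default=None,
                        help="file to write the word counts to")
    return parser.parse_args(argv)
```

In a script this would typically be called as parse_args(sys.argv[1:]). Note that argparse also accepts -f before the path, which is more permissive than the ordering rule stated in the note.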
For any further discussion, please mail me at <[email protected]>