-
Install Couchbase (CB) and create an empty bucket
-
Load the "salaries" dataset by running 'generate_salaries_dataset.py'
-
Start the parser by running primitive_v3/run.py
APIs:
http://localhost:5000/api/ping - returns pong
http://localhost:5000/api/run - POST request to start indexing (the body must contain the CB server settings and the tag rules)
http://localhost:5000/api/progress - general status of the parser
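
As a rough illustration of calling the two GET endpoints from Python instead of a browser, here is a minimal sketch using only the standard library. The helper names `build_get` and `fetch` are hypothetical, and the response format of api/progress is not documented here, so the body is returned as plain text.

```python
import urllib.request

BASE_URL = "http://localhost:5000"  # host and port taken from the URLs above


def build_get(path):
    """Build (but do not send) a GET request for one of the parser endpoints."""
    return urllib.request.Request(BASE_URL + path, method="GET")


def fetch(path, timeout=5):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_get(path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With run.py running locally, `fetch("/api/ping")` should return the pong response, and `fetch("/api/progress")` can be called repeatedly to watch the parser status.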
-
Start the indexer with a POST request to api/run. Example request:
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/run -d '{"server": {"name":"localhost", "bucket":"train", "port":"8091", "password":"password"},"rules":[{"name": "city", "source":"city", "value":"", "filter":""},{"name":"position", "source": "position", "value":"any java positions", "filter":"include(java)"},{"name":"position", "source": "position", "value":"any developer position", "filter":"include(developer)"},{"name":"position", "source": "position", "value":"only java developers", "filter":"include(java, developer)"},{"name":"position", "source": "position", "value":"all qa", "filter":"include(qa)"}, {"name":"salary", "source": "salary", "value":"below 150K", "filter":"range(<150)"},{"name":"salary", "source": "salary", "value":"above 150K", "filter":"range(>=150)"}]}'
JSON structure:
{ "server": {"name":"localhost", "bucket":"train", "port":"8091", "password":"password"}, "rules":[ {"name": "city", "source":"city", "value":"", "filter":""}, {"name":"position", "source": "position", "value":"any java positions", "filter":"include(java)"}, {"name":"position", "source": "position", "value":"any developer position", "filter":"include(developer)"}, {"name":"position", "source": "position", "value":"only java developers", "filter":"include(java, developer)"}, {"name":"position", "source": "position", "value":"all qa", "filter":"include(qa)"}, {"name":"salary", "source": "salary", "value":"below 150K", "filter":"range(<150)"}, {"name":"salary", "source": "salary", "value":"above 150K", "filter":"range(>150)"}, {"name": "experience", "source":"experience", "value":"", "filter":""} ] }
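
The same request body can be assembled in Python rather than typed by hand, which avoids unbalanced braces. This is only a sketch: `make_rule` and `build_run_request` are hypothetical helper names, and the field names are copied from the structure above.

```python
import json
import urllib.request

RUN_URL = "http://localhost:5000/api/run"


def make_rule(name, source, value="", filter_expr=""):
    """Build one tag rule with the four fields used in the JSON structure."""
    return {"name": name, "source": source, "value": value, "filter": filter_expr}


def build_run_request(server, rules, url=RUN_URL):
    """Build (but do not send) the POST request that starts indexing."""
    body = json.dumps({"server": server, "rules": rules}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


server = {"name": "localhost", "bucket": "train", "port": "8091", "password": "password"}
rules = [
    make_rule("city", "city"),
    make_rule("position", "position", "any java positions", "include(java)"),
    make_rule("salary", "salary", "below 150K", "range(<150)"),
]
request = build_run_request(server, rules)
# To actually start indexing, send it: urllib.request.urlopen(request)
```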
The POC supports 3 types of filters: all values (an empty filter, as in the "city" rule), include(test) and range(number)
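
To make the three filter types concrete, the sketch below shows one plausible way such expressions could be evaluated. It is not the parser's actual code: the function `matches` is hypothetical, and the semantics (an empty filter passes everything, include requires all listed words as case-insensitive substrings, range is a numeric comparison) are assumptions inferred from the example rules.

```python
import re


def matches(filter_expr, value):
    """Return True if value passes the filter expression.

    Assumed semantics (not confirmed by the parser source):
      ""             -> every value passes
      include(a, b)  -> value contains all listed words, case-insensitively
      range(<150)    -> numeric comparison using <, <=, > or >=
    """
    expr = filter_expr.strip()
    if not expr:
        return True

    include = re.fullmatch(r"include\((.*)\)", expr)
    if include:
        words = [w.strip().lower() for w in include.group(1).split(",") if w.strip()]
        text = str(value).lower()
        return all(word in text for word in words)

    rng = re.fullmatch(r"range\(\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*\)", expr)
    if rng:
        op, bound = rng.group(1), float(rng.group(2))
        number = float(value)
        return {
            "<": number < bound,
            "<=": number <= bound,
            ">": number > bound,
            ">=": number >= bound,
        }[op]

    raise ValueError(f"unsupported filter: {filter_expr!r}")
```

Under these assumptions, a salary of exactly 150 passes range(>=150) but neither range(<150) nor range(>150), so the choice of operator decides which tag a boundary value receives.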
-
Indexer APIs:
/api/index_stats - returns the size of the index
/api/query - POST request to run a query
-
Queries supported:
get_all_tags
get_values_by_tag
get_docids_by_tag_and_value
get_tags_by_docid
get_values_by_docid_and_tag
Examples:
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/query -d '{"type":"all_tags"}'
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/query -d '{"type":"values_by_tag", "tag":"city"}'
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/query -d '{"type":"ids_by_tag_and_value", "tag":"city", "value":"Chicago"}'
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/query -d '{"type":"tags_by_id", "id":"13"}'
curl -i -H "Content-Type: application/json" -XPOST http://localhost:5000/api/query -d '{"type":"value_by_id_and_tag", "tag":"city", "id":"23"}'
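
For scripted use, the query bodies from the curl examples can be built and checked in Python before they are sent. This is a minimal sketch with hypothetical helper names (`build_query`, `run_query`). The table of types and fields is taken from the curl examples, and the response format is not documented here, so the response is returned as raw text.

```python
import json
import urllib.request

QUERY_URL = "http://localhost:5000/api/query"

# Query types and the extra fields each one takes, as used in the curl examples.
QUERY_FIELDS = {
    "all_tags": (),
    "values_by_tag": ("tag",),
    "ids_by_tag_and_value": ("tag", "value"),
    "tags_by_id": ("id",),
    "value_by_id_and_tag": ("tag", "id"),
}


def build_query(query_type, **fields):
    """Return the JSON body for /api/query, checking the expected fields."""
    expected = QUERY_FIELDS[query_type]  # KeyError for an unknown type
    missing = [name for name in expected if name not in fields]
    if missing:
        raise ValueError(f"{query_type} needs: {', '.join(missing)}")
    body = {"type": query_type}
    # The examples send every field as a string, so ids are converted too.
    body.update({name: str(fields[name]) for name in expected})
    return body


def run_query(query_type, **fields):
    """POST the query and return the raw response text."""
    data = json.dumps(build_query(query_type, **fields)).encode("utf-8")
    req = urllib.request.Request(
        QUERY_URL, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

For example, `run_query("ids_by_tag_and_value", tag="city", value="Chicago")` sends the same body as the third curl command, provided the parser is running and indexing has finished.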
agyryk/train-parser