Docker revamp #19
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,8 +22,8 @@ services: | |
soft: 2048 | ||
hard: 2048 | ||
#expose this for local dev only! | ||
#ports: | ||
# - "9200:9200" | ||
ports: | ||
- "9200:9200" | ||
redis: | ||
Review comment: Why has a Redis container been added?
Reply: It serves as an article cache. When news articles are fetched from the news page, an article that has already been added is not added again. |
||
build: | ||
context: ./redis-docker | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,19 @@ | ||
import hashlib | ||
import re | ||
import time | ||
from datetime import datetime | ||
|
||
import nltk | ||
import hashlib | ||
|
||
try: | ||
import urllib.parse as urlparse | ||
except ImportError: | ||
import urlparse | ||
|
||
from config import * | ||
from Initializer.str_unicode import * | ||
from Helper.Sentiment import * | ||
from Initializer.ElasticSearchInit import es | ||
from Initializer.RedisInit import rds | ||
from Initializer.LoggerInit import * | ||
Review comment: What happened to LoggerInit? |
||
|
||
from Sentiment.Initializer.ElasticSearchInit import es | ||
from Sentiment.Initializer.str_unicode import * | ||
from Sentiment.Initializer.RedisInit import rds | ||
from Sentiment.Helper.Sentiment import * | ||
|
||
|
||
class NewsHeadlineListener: | ||
|
@@ -29,8 +26,9 @@ def __init__(self, symbol,url=None): | |
# add any new headlines | ||
for htext, htext_url in new_headlines: | ||
|
||
md5Hash = hashlib.md5( (htext+htext_url).encode() ).hexdigest() | ||
if rds.exists(md5Hash): | ||
md5_hash = hashlib.md5((htext+htext_url).encode()).hexdigest() | ||
|
||
if rds.exists(md5_hash) is 0: | ||
Review comment: easier to read if we say |
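The suggested wording in the comment above did not survive on this page. One readable alternative, sketched here under assumptions, is a plain truthiness test: redis-py's `exists()` returns the number of matching keys as an integer, so `not rds.exists(key)` reads naturally, whereas `is 0` compares identity rather than value and raises a `SyntaxWarning` on Python 3.8 and later. `FakeRedis`, `headline_key`, and `is_new_headline` are hypothetical names used only to keep the sketch runnable without a server.

```python
import hashlib


class FakeRedis:
    """Hypothetical in-memory stand-in for the two client calls used here."""

    def __init__(self):
        self._store = {}

    def exists(self, *names):
        # Like redis-py, return how many of the given keys exist.
        return sum(1 for name in names if name in self._store)

    def set(self, name, value, ex=None):
        # The expiry is accepted but ignored in this stand-in.
        self._store[name] = value
        return True


def headline_key(htext, htext_url):
    """Build the cache key the same way the diff does: MD5 of text + URL."""
    return hashlib.md5((htext + htext_url).encode()).hexdigest()


def is_new_headline(rds, htext, htext_url):
    """Return True when the headline has not been cached yet."""
    # Truthiness check instead of `is 0`: 0 keys found means "new".
    return not rds.exists(headline_key(htext, htext_url))
```

If an explicit comparison is preferred, `rds.exists(key) == 0` has the same behavior without relying on integer identity.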
||
|
||
datenow = datetime.utcnow().isoformat() | ||
# output news data | ||
|
@@ -49,6 +47,7 @@ def __init__(self, symbol,url=None): | |
for t in nltk_tokens_ignored: | ||
if t in tokens: | ||
logger.info("Text contains token from ignore list, not adding") | ||
rds.set(md5_hash,1,2628000) | ||
Review comment: Magic numbers. Why has True been replaced with the more abstract
Reply: It's better if you review directly from the latest commit. Changes like this one are outdated and no longer exist in the latest commit. |
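For reference, the "magic numbers" concern could be addressed with named constants. This is only a sketch, not what the latest commit does: the constant and function names are hypothetical. The value 2628000 is one twelfth of a 365-day year in seconds, and in redis-py the third positional argument of `set()` is `ex`, the expiry in seconds.

```python
# One twelfth of a 365-day year, in seconds: 31536000 // 12 == 2628000.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
ARTICLE_CACHE_TTL_SECONDS = SECONDS_PER_YEAR // 12

# Marker value stored under each headline hash. An integer is used here
# because redis-py 3.0 and later reject bool values; whether that is the
# reason for the change in this PR is an assumption, not stated in the thread.
ARTICLE_SEEN = 1


def mark_seen(rds, md5_hash):
    """Record a headline hash so it is skipped until the entry expires."""
    rds.set(md5_hash, ARTICLE_SEEN, ex=ARTICLE_CACHE_TTL_SECONDS)
```

Passing `ex=` by keyword also makes it clear at the call site that the number is an expiry.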
||
continue | ||
# check required tokens from config | ||
tokenspass = False | ||
|
@@ -65,6 +64,7 @@ def __init__(self, symbol,url=None): | |
break | ||
if not tokenspass: | ||
logger.info("Text does not contain token from required list, not adding") | ||
rds.set(md5_hash,1,2628000) | ||
continue | ||
|
||
# get sentiment values | ||
|
@@ -80,7 +80,7 @@ def __init__(self, symbol,url=None): | |
"polarity": polarity, | ||
"subjectivity": subjectivity, | ||
"sentiment": sentiment}) | ||
rds.set(md5Hash,True) | ||
Review comment: This made more sense. I am not familiar with Redis, and I intuitively knew this was setting the hash as present in the data structure server.
Reply: The original console script stored the article cache as a Python list. Since the Python code exits once it has finished fetching the articles in Docker, the cache has to move to a third-party data store. |
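The reply's point can be illustrated with a small sketch: a list inside the process is lost when the script exits, while a store that outlives the process still knows which headlines were handled. `InMemoryStore` and `run_once` are hypothetical; the store object here merely plays the role of the external Redis server so the example runs without one.

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for a Redis server that outlives each run."""

    def __init__(self):
        self._data = {}

    def exists(self, name):
        return 1 if name in self._data else 0

    def set(self, name, value, ex=None):
        # Expiry is accepted but not enforced in this stand-in.
        self._data[name] = value
        return True


def run_once(store, headlines):
    """Simulate one run of the fetcher; return the headlines newly added."""
    added = []
    for htext, htext_url in headlines:
        key = hashlib.md5((htext + htext_url).encode()).hexdigest()
        if store.exists(key):
            continue  # already handled in this or an earlier run
        added.append(htext)
        store.set(key, 1, ex=2628000)
    return added


store = InMemoryStore()
first_run = run_once(store, [("A", "u1"), ("B", "u2")])
second_run = run_once(store, [("B", "u2"), ("C", "u3")])
```

In the second run only "C" is added, because "B" was recorded by the first run; with a per-process list, "B" would have been added twice.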
||
rds.set(md5_hash,1,2628000) | ||
|
||
|
||
def get_news_headlines(self, url): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
import argparse | ||
|
||
from Initializer.LoggerInit import * | ||
from Initializer.ElasticSearchInit import es | ||
from Sentiment.Initializer.ElasticSearchInit import es | ||
from Sentiment.Initializer.LoggerInit import * | ||
Review comment: Thank you for alphabetizing. Good clean code principles. |
||
|
||
if __name__ == '__main__': | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,21 +11,14 @@ | |
""" | ||
|
||
import argparse | ||
import json | ||
Review comment: Were these removed because they are imported indirectly through NewsHeadlineListener? Looks a lot cleaner without all the imports.
Reply: Yes, they are imported through NewsHeadlineListener. PyCharm suggested removing them. |
||
import re | ||
import sys | ||
import time | ||
|
||
import nltk | ||
import requests | ||
|
||
try: | ||
import urllib.parse as urlparse | ||
except ImportError: | ||
import urlparse | ||
|
||
# import elasticsearch host, twitter keys and tokens | ||
Review comment: Is this done in NewsHeadlineListener or do we need to add our own import statement? |
||
from NewsHeadlineListener import * | ||
from Sentiment.NewsHeadlineListener import * | ||
|
||
|
||
STOCKSIGHT_VERSION = '0.1-b.6' | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
#!/bin/bash | ||
|
||
|
||
sleep 20; | ||
sleep 30; | ||
|
||
while true | ||
do | ||
|
Review comment: What happens when this is exposed permanently?