Hello Twitter! Teknek drinking from the hose
edwardcapriolo edited this page Jan 6, 2014
·
6 revisions
This example comes from the class io.teknek.twitter.EndToEndTest.
The TwitterStreamFeed is layered on top of Twitter's hbc client. The credentials come from Twitter, so you cannot run this example without your own.
@Test
public void hangAround() throws JsonGenerationException, JsonMappingException, IOException {
p = new Plan().withFeedDesc(
new FeedDesc().withFeedClass(TwitterStreamFeed.class.getName()).withProperties(getCredentialsOrDie()))
.withRootOperator(new OperatorDesc(new EmitStatusAsTextOperator())
.withNextOperator(new OperatorDesc(new BeLoudOperator())));
p.setName("yell");
p.setMaxWorkers(1);
td.applyPlan(p);
try {
// Give the workers ten seconds to consume tweets before the test exits.
Thread.sleep(10000);
} catch (InterruptedException e) { }
}
If we let this example run we get some random tweets.
{statusAsText=RT @Pesadohein: "Pode ser pra SEMPRE?" - FERRO TUDOOOOOOOO}
{statusAsText=RT @kabasimsek: abi sen dershaneye mi gittin hira mağarasına mı? @kemalgulen}
{statusAsText=RT @Maeecol: Nos vamos a Elche o no te gusta la aventura?? @Mariyeyes}
{statusAsText=RT @Zackkkk_: My circle so small I can talk to Myself.}
{statusAsText=@soykarolay Ire al de aqui de la casa toda las tardes.}
{statusAsText=@JulianRohatgi I would rather smash my face into rusty nails than study and take this math exam.}
That is nifty. Let's say we only want to extract URLs from the stream. Operators can be supplied a map of properties. In this case we made an operator called EmitFieldsMatchingPattern. This is a general-purpose operator that takes a regular expression pattern and a source field, and emits each token in that field that matches the pattern (here, each URL) separately.
public void hangAround() throws JsonGenerationException, JsonMappingException, IOException {
Map<String,Object> params = MapBuilder.makeMap(EmitFieldsMatchingPattern.SOURCE_FIELD, "statusAsText",
EmitFieldsMatchingPattern.REGEX, "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]" ) ;
p = new Plan().withFeedDesc(
new FeedDesc().withFeedClass(TwitterStreamFeed.class.getName()).withProperties(getCredentialsOrDie()))
.withRootOperator(new OperatorDesc(new EmitStatusAsTextOperator())
//.withNextOperator(new OperatorDesc(new BeLoudOperator())));
.withNextOperator( new OperatorDesc(new EmitFieldsMatchingPattern()).withParameters(params))
.withNextOperator(new OperatorDesc(new BeLoudOperator())));
Then we can see our output is only URLs.
DEBUG 22:37:34,757 No children operators for this operator. Tuple not being passed on {out=http://t.co/9q7ajc0acs}
DEBUG 22:37:34,759 No children operators for this operator. Tuple not being passed on {out=http://t.co/XWsPCJgeDf}
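If you want to see what that regular expression does on its own, here is a minimal standalone sketch using plain java.util.regex. It is not the source of EmitFieldsMatchingPattern; the class name UrlExtractor and the method extract are made up for illustration, and only the pattern string is taken from the plan above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Teknek: applies the same regex to a string.
public class UrlExtractor {
  // Same expression as the REGEX parameter passed to the operator above.
  static final Pattern URL = Pattern.compile(
      "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

  // Returns every match in the text, in order, one entry per match.
  static List<String> extract(String text) {
    List<String> out = new ArrayList<>();
    Matcher m = URL.matcher(text);
    while (m.find()) {
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    String status = "see http://t.co/9q7ajc0acs and http://t.co/XWsPCJgeDf.";
    for (String url : extract(status)) {
      System.out.println(url);
    }
  }
}
```

The last character class in the pattern leaves out punctuation such as `.` and `,`, so the period that ends the sample sentence is not swallowed into the second URL.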
Next you could build an Operator that just ticks off Cassandra counters.
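As a rough idea of what such a counting step might look like, here is a sketch that tallies URLs in memory. Everything in it is an assumption made for illustration: the class UrlCounter is hypothetical, a tuple is modelled as a plain Map, the field name "out" is borrowed from the output above, and a ConcurrentHashMap stands in for Cassandra. It does not use the Teknek Operator API or any Cassandra client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a counting operator; not Teknek or Cassandra code.
public class UrlCounter {
  // In-memory substitute for a Cassandra counter column per URL.
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

  // Assumes the tuple carries the URL in a field named "out".
  void handle(Map<String, ?> tuple) {
    Object url = tuple.get("out");
    if (url != null) {
      // A real operator would issue a counter increment to Cassandra here.
      counts.computeIfAbsent(url.toString(), k -> new LongAdder()).increment();
    }
  }

  // Current tally for a URL, or 0 if it has never been seen.
  long count(String url) {
    LongAdder adder = counts.get(url);
    return adder == null ? 0L : adder.sum();
  }
}
```

Calling handle once per emitted tuple keeps a running total per URL, and tuples without an "out" field are simply ignored.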