Zinc - Twitter - Streaming API - Long lived connections

drush66
Hi,

I was not sure where to fire this question, but Seaside groups seemed like the most probable place to reach people who might know the answer.

So, I might need to access the Twitter Streaming API. The catch is that the Streaming API uses long-lived HTTP connections, over which the Twitter server pushes JSON-encoded chunks of data as they appear in the user's timeline. For that, the HTTP client used (I guess Zinc) must not wait for the server to close the connection before returning the data to the Smalltalk code; instead, it should forward the data to the Smalltalk code as it arrives.

Can Zinc do that? (or some other HTTP Client)

Also, for folks who have used STunnel and Nginx: can they be used to tunnel long-lived HTTP connections? (The Streaming API seems to be available only from an https URL.)

Many thanks,

Davorin Rusevljan
http://www.cloud208.com/


_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: Zinc - Twitter - Streaming API - Long lived connections

Sven Van Caekenberghe
Davorin,

On 21 Oct 2011, at 10:20, Davorin Rusevljan wrote:

> I was not sure where to fire this question, but Seaside groups seemed like the most probable place to reach people who might know the answer.
>
> So, I might need to access the Twitter Streaming API. The catch is that the Streaming API uses long-lived HTTP connections, over which the Twitter server pushes JSON-encoded chunks of data as they appear in the user's timeline. For that, the HTTP client used (I guess Zinc) must not wait for the server to close the connection before returning the data to the Smalltalk code; instead, it should forward the data to the Smalltalk code as it arrives.
>
> Can Zinc do that? (or some other HTTP Client)

I had a very quick look at the Twitter Streaming API. It's interesting and challenging at the same time.

Zinc HTTP Components is first and foremost a framework for dealing with HTTP, with a functional client and server on top of that. The requested use case is not supported out of the box, but it should not be too difficult to write a loop that keeps on consuming responses and deals with them in a streaming fashion.

Also note that you will need OAuth support, which is also a challenge (not provided out of the box, although some people have been working on it).

> Also, for folks who have used STunnel and Nginx: can they be used to tunnel long-lived HTTP connections? (The Streaming API seems to be available only from an https URL.)

Sven


Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
On Fri, Oct 21, 2011 at 10:48 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> Davorin,
>
> > Can Zinc do that? (or some other HTTP Client)
>
> I had a very quick look at the Twitter Streaming API. It's interesting and challenging at the same time.
>
> Zinc HTTP Components is first and foremost a framework for dealing with HTTP, with a functional client and server on top of that. The requested use case is not supported out of the box, but it should not be too difficult to write a loop that keeps on consuming responses and deals with them in a streaming fashion.

Is there a place where Zinc-related questions can be raised and discussed, or is this the right group? I might have many more Zinc questions cooking up :)

> Also note that you will need OAuth support, which is also a challenge (not provided out of the box, although some people have been working on it).

I will double-check, but for the part of the Streaming API that I might need (the API has three parts), basic HTTP authentication still seems to be supported. But even if it is not, the CloudFork demo page seems to be able to connect to Twitter through OAuth 1.0a.

Thanks!

Davorin Rusevljan
http://www.cloud208.com/



Re: Zinc - Twitter - Streaming API - Long lived connections

Sven Van Caekenberghe

On 21 Oct 2011, at 10:55, Davorin Rusevljan wrote:

> On Fri, Oct 21, 2011 at 10:48 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> Davorin,
>
> > Can Zinc do that? (or some other HTTP Client)
>
> I had a very quick look at the Twitter Streaming API. It's interesting and challenging at the same time.
>
> Zinc HTTP Components is first and foremost a framework for dealing with HTTP, with a functional client and server on top of that. The requested use case is not supported out of the box, but it should not be too difficult to write a loop that keeps on consuming responses and deals with them in a streaming fashion.
>
> Is there a place where Zinc-related questions can be raised and discussed, or is this the right group? I might have many more Zinc questions cooking up :)

The Pharo list is probably the best place.

> Also note that you will need OAuth support, which is also a challenge (not provided out of the box, although some people have been working on it).
>
> I will double-check, but for the part of the Streaming API that I might need (the API has three parts), basic HTTP authentication still seems to be supported. But even if it is not, the CloudFork demo page seems to be able to connect to Twitter through OAuth 1.0a.

Yeah, the CloudFork project is cool, they have been doing lots of interesting things.

Sven

Re: Zinc - Twitter - Streaming API - Long lived connections

Igor Stasenko
In my little project (SCouchDB), I also had to deal with the same situation:
the CouchDB server API uses long-lived connections to notify observers
about database updates, which can be used for replicating data, watching
for updates, etc.

So the idea is not to wait until all the data has arrived for an HTTP
request, but to answer a stream (which represents the contents) as soon
as the HTTP response header has arrived. User code may then request
#contents, which forces the stream to read all data until the connection
is closed (or up to the content length, if one is specified), or it may
consume the data in portions by sending #next / #next: to the stream.

This is also useful when the amount of data to transfer is big: you don't
collect it into a huge buffer and only then hand it over to the user, but
let the user process the data in portions as it arrives over the wire.
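
A minimal sketch of the two ways of consuming such a stream, not SCouchDB's actual code. It assumes an in-memory ReadStream as a stand-in for the socket-backed stream that would be answered once the response header has been read; the variable names are made up for illustration, and #upToEnd plays the role that #contents would have on a live connection.

```smalltalk
| body firstPortion rest |
"In-memory stand-in for the response body; a real client would answer
 a socket-backed stream here (assumption, not SCouchDB code)."
body := ReadStream on: 'chunk-1chunk-2chunk-3'.

"Consume one fixed-size portion, as user code would with #next:."
firstPortion := body next: 7.

"Read everything that is left; on a live connection this would block
 until the server closes it or the content length is reached."
rest := body upToEnd.
```

On a long-lived connection only the portion-wise style is practical, since reading to the end would never return while the server keeps the connection open.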

--
Best regards,
Igor Stasenko.

Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
Did you use Zinc in SCouchDB, or were you working with plain sockets?

thanks,

Davorin



Re: Zinc - Twitter - Streaming API - Long lived connections

Igor Stasenko
On 21 October 2011 12:02, Davorin Rusevljan <[hidden email]> wrote:

> Did you use Zinc in SCouchDB, or were you working with plain sockets?
>

When I worked on it there was no Zinc yet, so I had to implement a
somewhat simplified HTTP protocol (enough for use with CouchDB) on my
own, over a plain socket connection.


--
Best regards,
Igor Stasenko.

Re: Zinc - Twitter - Streaming API - Long lived connections

radoslav hodnicak
On Fri, Oct 21, 2011 at 12:14 PM, Igor Stasenko <[hidden email]> wrote:
> When I worked on it there was no Zinc yet, so I had to implement a
> somewhat simplified HTTP protocol (enough for use with CouchDB) on my
> own, over a plain socket connection.

Does the JSON parser support streaming? I have some big CouchDB
results to work with; so far they fit into memory, but that might
not always be the case.

rado

Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
On Fri, Oct 21, 2011 at 1:07 PM, radoslav hodnicak <[hidden email]> wrote:
> On Fri, Oct 21, 2011 at 12:14 PM, Igor Stasenko <[hidden email]> wrote:
> > When I worked on it there was no Zinc yet, so I had to implement a
> > somewhat simplified HTTP protocol (enough for use with CouchDB) on my
> > own, over a plain socket connection.
>
> Does the JSON parser support streaming? I have some big CouchDB
> results to work with; so far they fit into memory, but that might
> not always be the case.

How would you interface with such a JSON parser? I mean, how would it partition the result into smaller chunks that it can return to you? It would probably fire an event for each subnode parsed, but I doubt current parsers are equipped for that.

Maybe refined map-reduce functions on the CouchDB server could help avoid such large JSON documents?

Davorin Rusevljan
http://www.cloud208.com/



Re: Zinc - Twitter - Streaming API - Long lived connections

Nick

> Does the json parser support streaming? I have some big couchdb
> results to work with and so far they fit into the memory, but they
> might not always do that.




Re: Zinc - Twitter - Streaming API - Long lived connections

radoslav hodnicak
In reply to this post by drush66
On Fri, Oct 21, 2011 at 1:43 PM, Davorin Rusevljan
<[hidden email]> wrote:

> On Fri, Oct 21, 2011 at 1:07 PM, radoslav hodnicak <[hidden email]> wrote:
>>
>> On Fri, Oct 21, 2011 at 12:14 PM, Igor Stasenko <[hidden email]>
>> wrote:
>> > when i worked on it, there was no Zinc yet, so i had to implement a
>> > bit simplified HTTP protocol
>> > (enough for using with CouchDB) by own over plain socket connection.
>>
>> Does the json parser support streaming? I have some big couchdb
>> results to work with and so far they fit into the memory, but they
>> might not always do that.
>>
>
> How would you interface with such a JSON parser? I mean, how would it
> partition the result into smaller chunks that it can return to you? It
> would probably fire an event for each subnode parsed, but I doubt current
> parsers are equipped for that.

CouchDB results have a defined structure: one of the attributes is
"rows", which contains the actual data, so it would stream the rows one
by one.
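
A rough sketch of what row-by-row streaming could look like. It assumes the response puts each row on its own line, which the thread does not confirm, and it uses an in-memory ReadStream in place of the live connection. The onRow block is a hypothetical callback, and each row is handed over as raw text rather than parsed, since no particular JSON parser is implied here.

```smalltalk
| lf response rows onRow line |
lf := String with: Character lf.
"In-memory stand-in for a view response; assumes one row per line."
response := ReadStream on:
    '{"total_rows":2,"offset":0,"rows":[', lf,
    '{"id":"a","key":"a","value":1},', lf,
    '{"id":"b","key":"b","value":2}', lf,
    ']}'.
rows := OrderedCollection new.
"Hypothetical callback, invoked once per row as it is read."
onRow := [ :rowText | rows add: rowText ].
"Skip the header line that opens the rows array."
response upTo: Character lf.
[ response atEnd ] whileFalse: [
    line := response upTo: Character lf.
    "Row lines start with an opening brace; the closing line does not."
    (line notEmpty and: [ line first = ${ ]) ifTrue: [
        "Drop the comma that separates a row from the next one."
        line last = $, ifTrue: [ line := line allButLast ].
        onRow value: line ] ].
```

Only one row is held in memory at a time, apart from whatever the callback chooses to keep; here it collects the rows only so the result can be inspected.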

> Maybe refined map-reduce functions on the CouchDB server could help avoid
> such large JSON documents?

Sure, there are various ways to limit the data or split it into
chunks on the CouchDB side. But sometimes it would be easier to just
say "hey, give me everything you've got and I'll chew through it". I
run into these situations quite often when I need to update all
documents (or a subset of documents) for whatever reason, and you can
only do that one by one in CouchDB.

rado

Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
On Fri, Oct 21, 2011 at 1:57 PM, radoslav hodnicak <[hidden email]> wrote:
> CouchDB results have a defined structure: one of the attributes is
> "rows", which contains the actual data, so it would stream the rows one
> by one.

Yes, you would partition your JSON tree at the level of the rows array, but someone else would need it at another level, so some general event-based/actor interface would be needed. Nick's link to XStreams looks interesting.

Davorin



Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
In reply to this post by Nick
On Fri, Oct 21, 2011 at 1:55 PM, Nick Ager <[hidden email]> wrote:


Interesting! If I am reading it correctly, it is an event-generating parser over the stream.

I have been tempted to use such an approach to decode AMQP data from a network stream: describe the AMQP codec as a grammar, put the resulting parser to chew on the data stream, and handle the generated events down the road. I always wanted to ask Lukas whether PetitParser could be used in such a scenario.

I think it would be great if we could have an event/actor-based ecosystem for Pharo: an evented framework with interfaces to async I/O libs, HTTP, and other protocols. And if we could stick an event-generating parser into that ... :)

I am afraid it is a _lot_ of work, but it makes me envious that Ruby, Python and even *sigh* Java folks have had that for years.

Davorin Rusevljan
http://www.cloud208.com/



Re: Zinc - Twitter - Streaming API - Long lived connections

Andreas.Raab
In reply to this post by drush66
On 10/21/2011 10:20, Davorin Rusevljan wrote:
> Can Zinc do that? (or some other HTTP Client)

WebClient [1] + SqueakSSL [2] does it quite nicely:

| wc resp stream |
wc := WebClient new.
wc username: 'YOUR_TWITTER_USERNAME'.
wc password: 'YOUR_TWITTER_PASSWORD'.
resp := wc httpGet: 'https://stream.twitter.com/1/statuses/sample.json'.
resp isSuccess ifFalse:[^self error: 'Request failed'].

"Process incoming data"
stream := resp contentStream.
["Could also set the stream to be non-signaling, but perhaps we want
to keep track of timeouts at some point"
[stream waitForData] on: ConnectionTimedOut do:[:ignore].
stream isDataAvailable ifTrue:[
        "Transcript out the next chunk of data"
        Transcript show: resp nextChunk.
]] repeat.

[1] http://www.squeaksource.com/WebClient.html
[2] http://www.squeaksource.com/SqueakSSL.html

Cheers,
   - Andreas

Re: Zinc - Twitter - Streaming API - Long lived connections

drush66
On Fri, Oct 21, 2011 at 3:00 PM, Andreas Raab <[hidden email]> wrote:
> On 10/21/2011 10:20, Davorin Rusevljan wrote:
> > Can Zinc do that? (or some other HTTP Client)
>
> WebClient [1] + SqueakSSL [2] does it quite nicely:



thanks!

Davorin Rusevljan
http://www.cloud208.com/



Re: Zinc - Twitter - Streaming API - Long lived connections

Sven Van Caekenberghe
In reply to this post by Andreas.Raab

On 21 Oct 2011, at 15:00, Andreas Raab wrote:

> On 10/21/2011 10:20, Davorin Rusevljan wrote:
>> Can Zinc do that? (or some other HTTP Client)
>
> WebClient [1] + SqueakSSL [2] does it quite nicely:
>
> | wc resp stream |
> wc := WebClient new.
> wc username: 'YOUR_TWITTER_USERNAME'.
> wc password: 'YOUR_TWITTER_PASSWORD'.
> resp := wc httpGet: 'https://stream.twitter.com/1/statuses/sample.json'.
> resp isSuccess ifFalse:[^self error: 'Request failed'].
>
> "Process incoming data"
> stream := resp contentStream.
[…]

Thanks for the actual code example, Andreas; that is what I meant by writing some loop. If that is indeed the answer, an indefinite but otherwise regular HTTP response, then it can be done with Zinc + Zodiac as well, through Zinc's streaming capabilities:

| stream |
stream := ZnNeoClient new
        username: 'twitter_username' password: 'twitter_password';
        streaming: true;
        get: 'https://stream.twitter.com/1/statuses/sample.json'.
[…]

With error handling:

| stream |
stream := ZnNeoClient new
        username: 'twitter_username' password: 'twitter_password';
        streaming: true;
        enforceHttpSuccess: true;
        accept: ZnMimeType applicationJson;
        enforceAcceptContentType: true;
        ifFail: [ :exception | ^ self error: 'Failed with: ', exception printString ];
        get: 'https://stream.twitter.com/1/statuses/sample.json'.
[…]

Using JsJsonParser from Seaside's JavaScript-Core-JSON package, you could do streaming parsing, if the stream contains separate, independent JSON objects:

| parser |
parser := JsJsonParser on: stream.
[ stream atEnd ] whileFalse: [ | nextObject |
        nextObject := parser whitespace; parseValue ]
       
Sven

PS: Soon, ZnNeoClient will be renamed to simply ZnClient

