Re: ZnClient: getting more than 19 tweets for data scraping

Posted by Sven Van Caekenberghe-2 on Apr 08, 2015; 3:03pm
URL: https://forum.world.st/ZnClient-getting-more-that-19-tweet-for-data-scrapping-tp4818162p4818385.html


> On 08 Apr 2015, at 15:29, Paul DeBruicker <[hidden email]> wrote:
>
> Offray - What Sven said is correct. You're not getting an answer about how
> to violate their Terms of Service because this isn't that kind of place.
> You've asked 3 times. Once is usually enough. Use the API. For the Soup
> questions, get an inspector on an instance of a SoupTag, start sending it
> messages it understands, and see what you get. Trial and error. Or read the
> Python Soup docs, as the commands probably have an equivalent in the
> Smalltalk library. Most of this programming stuff is reading, doing a
> little experiment, thinking, then trying again.
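
A minimal sketch of the exploratory approach described above, meant to be run in a Pharo Playground with the Soup package loaded. The HTML fragment is a hand-written stand-in (an assumption, not real page content), so nothing is downloaded; only messages already used in this thread (fromString:, findAllTagsByClass:, text) plus standard reflection (inspect, selectors, respondsTo:) are sent.

```smalltalk
"Parse a small hand-written HTML fragment; no network access is needed."
| soup tag |
soup := Soup fromString: '<div><p class="ProfileTweet-text">Hello Pharo</p></div>'.
tag := (soup findAllTagsByClass: 'ProfileTweet-text') first.

"Open an inspector to browse the tag's instance variables."
tag inspect.

"List the messages the tag's class implements, sorted for easier reading."
tag class selectors asSortedCollection inspect.

"Check that the tag understands a message before sending it."
(tag respondsTo: #text) ifTrue: [ tag text inspect ].
```

Browsing the selector list this way is how one would discover #text and any attribute accessors without guessing their names.
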
>
> Sven - I only showed him that SoupTag has a #text message. I'm sure you're
> busy and had forgotten that the first time he/she asked, they stated that
> they don't want to use the API:
> http://forum.world.st/Data-scrapping-in-pharo-Extracting-tweets-contents-td4817746.html
> and provided the download code in a ws.stfx.eu snippet.

Paul, I know you understand, we're on the same page. Sven

> Hope this helps
>
> Paul
>
>
> Sven Van Caekenberghe-2 wrote
>> What Paul showed is basically just a hack.
>>
>> What you probably want is full API access to Twitter, that gives you the
>> real thing, but it is more work and you have to understand all the
>> technical details (unless somebody already did it for you, I don't know -
>> I know that Zinc-SSO can connect to Twitter).
>>
>> https://dev.twitter.com/overview/api
>>
>>> On 07 Apr 2015, at 20:23, Offray Vladimir Luna Cárdenas <offray@> wrote:
>>>
>>> Hi,
>>>
>>> Recently Paul DeBruicker taught me how to refine my code for getting
>>> tweets properly. Consider this:
>>>
>>> =[1]====================================
>>> | source anUrl tweets |
>>>
>>> anUrl := 'https://twitter.com/offrayLC'.
>>> source := Soup fromString: (ZnEasy get: anUrl) contents asString.
>>> tweets := (source findAllTagsByClass: 'ProfileTweet-text')
>>>     collect: [:ea | ea text].
>>> ========================================
>>>
>>> This is working fine, but I would like to get more than the 19 tweets
>>> you get by default. Is there any way to tell ZnEasy and friends to
>>> get more tweets, similar to what happens when you scroll down
>>> a Twitter page?
>>>
>>> And by the way, I would like to make more sense of the Soup I got in the
>>> last line. ea text gives me the tweet contents, but how can I interpret
>>> the metadata in the soup (whether it is a retweet, the date of publishing,
>>> and so on)? I could do this for most of the Twitter profile page, but the
>>> tweet is kind of elusive. For example, how would I know that "text" is the
>>> proper message for getting the tweet content? Any pointer on how to make
>>> sense of it by myself is greatly appreciated.
>>>
>>> Cheers,
>>>
>>> Offray
>>>
>
> --
> View this message in context: http://forum.world.st/ZnClient-getting-more-that-19-tweet-for-data-scrapping-tp4818162p4818361.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.