
Re: Data scrapping in pharo: Extracting tweets contents

Posted by Paul DeBruicker on Apr 06, 2015; 3:14am
URL: https://forum.world.st/Data-scrapping-in-pharo-Extracting-tweets-contents-tp4817746p4817756.html

Is this what you want?


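| source anUrl tweets |

"Fetch the profile page and parse the HTML with Soup."
anUrl := 'https://twitter.com/offrayLC'.
source := Soup fromString: (ZnEasy get: anUrl) contents asString.
"Find every tag with the tweet-text class, then keep only its text."
tweets := source findAllTagsByClass: 'ProfileTweet-text'.
tweets collect: [:ea | ea text].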

Offray wrote
Hi,

I'm making a data scraper for Twitter. I know that the Twitter API is
there, but I would like to make the scraped data available to anyone,
even if the person has not signed an API agreement. I also think that
this kind of external data is important for making agile visualization
less self-referential, and it could bring some interesting examples with
data that is familiar to the usual "netizen".

I have made some progress that you can test easily by executing the code
at [1], and I have already scraped and filled out the data from a Twitter
profile page.

[1] http://ws.stfx.eu/E3LD464QI0GR

Now I'm having problems extracting tweet data. If I execute the code at
[2] I can get a list of tweets (the first 19) and I can explore inside any
member of the collection, but I can't make sense of the SoupTag data
inside. How can I extract the tweet contents in particular?

[2] http://ws.stfx.eu/JXYM7W7WL1H9

Any help with this will be greatly appreciated.

Cheers,

Offray