Hi,
I'm making a data scraper for Twitter. I know the Twitter API exists, but I would like to make the scraped data available to anyone, even if the person has not signed an API agreement. I also think this kind of external data is important for making agile visualization less self-referential, and it could provide some interesting examples with data that is familiar to the usual "netizen".

I have made some progress that you can test easily by executing the code at [1], and I have already scraped and filled out the data from a Twitter profile page.

[1] http://ws.stfx.eu/E3LD464QI0GR

Now I'm having problems extracting tweet data. If I execute the code at [2], I get a list of tweets (the first 19) and I can explore any member of the collection, but I can't make sense of the SoupTag data inside. How can I extract the tweet contents in particular?

[2] http://ws.stfx.eu/JXYM7W7WL1H9

Any help with this will be greatly appreciated.

Cheers,

Offray
Is this what you want?
| source anUrl tweet |

anUrl := 'https://twitter.com/offrayLC'.
source := Soup fromString: (ZnEasy get: anUrl) contents asString.
tweet := (source findAllTagsByClass: 'ProfileTweet-text').
tweet collect: [:ea | ea text].
Paul,
Thanks, that's pretty much what I'm looking for! Your clue raises a new question: by default I get only the last 19 tweets from someone. Is there any way to tell ZnClient to load more data, similar to what happens when you scroll down the Twitter page? Sven, any suggestion here?

Thanks again,

Offray

On 05/04/15 at 22:14, Paul DeBruicker wrote:
> Is this what you want?
> [...]