Hi,
Recently Paul DeBruicker taught me how to refine my code for getting tweets properly. Consider this:

=[1]====================================
| source anUrl tweets |

anUrl := 'https://twitter.com/offrayLC'.
source := Soup fromString: (ZnEasy get: anUrl) contents asString.
tweets := (source findAllTagsByClass: 'ProfileTweet-text') collect: [:ea | ea text].
========================================

It is working fine, but I would like to get more than 19 tweets, which is what you get by default. Is there any way to tell ZnEasy and friends to get more tweets, something similar to what happens when you scroll down a Twitter page?

And by the way, I would like to make more sense of the Soup I got in the last line. "ea text" gives me the tweet contents, but how can I interpret the metadata in the soup (whether it is a retweet, the date of publishing and so on)? I could do this for most of the Twitter profile page, but the tweet is kind of elusive. For example, how do I know that "text" is the proper message for getting the tweet content? Any pointer on how to make sense of it by myself is greatly appreciated.

Cheers,

Offray
What Paul showed is basically just a hack.
What you probably want is full API access to Twitter. That gives you the real thing, but it is more work and you have to understand all the technical details (unless somebody already did it for you, I don't know; I do know that Zinc-SSO can connect to Twitter).

https://dev.twitter.com/overview/api

> On 07 Apr 2015, at 20:23, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
>
> [...]
Offray - What Sven said is correct. You're not getting an answer about how to violate their Terms of Service because this isn't that kind of place. You've asked 3 times. Once is usually enough. Use the API.

For the Soup questions, get an inspector on an instance of a SoupTag, start sending it messages it understands, and see what you get. Trial and error. Or read the Python Soup docs, as the commands probably have an equivalent in the Smalltalk library. Most of this programming stuff is reading, doing a little experiment, thinking, then trying again.
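A minimal sketch of that kind of exploration, meant for a Pharo playground. It assumes the Soup package is loaded and reuses only the `Soup fromString:`, `findAllTagsByClass:` and `text` messages already shown in this thread. The HTML string is a made-up stand-in for a downloaded page, and everything else is plain Pharo reflection rather than anything specific to Soup.

```smalltalk
| source tag |
"Hypothetical HTML fragment standing in for a fetched page."
source := Soup fromString: '<div><p class="ProfileTweet-text">Hello <b>Pharo</b></p></div>'.
tag := (source findAllTagsByClass: 'ProfileTweet-text') first.

"Open an inspector to browse the tag's instance variables."
tag inspect.

"List the messages the tag's own class implements, sorted by name."
tag class selectors asSortedCollection.

"Check that the tag understands a message before sending it."
(tag respondsTo: #text) ifTrue: [ tag text ].
```

Evaluating the expressions one at a time with "Print it" or "Inspect it" shows what each message returns. `selectors` only lists methods defined in the class itself, so sending it to `tag class superclass` as well shows inherited behavior.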
Sven - I only showed him that SoupTag has a #text message. I'm sure you're busy and had forgotten that the first time he/she asked, they stated that they don't want to use the API:
http://forum.world.st/Data-scrapping-in-pharo-Extracting-tweets-contents-td4817746.html
and provided the download code in a ws.stfx.eu snippet.

Hope this helps

Paul
> On 08 Apr 2015, at 15:29, Paul DeBruicker <[hidden email]> wrote:
>
> Sven - I only showed him that SoupTag has a #text message. I'm sure you're
> busy and had forgotten that the first time he/she asked they stated that
> they don't want to use the api:
> http://forum.world.st/Data-scrapping-in-pharo-Extracting-tweets-contents-td4817746.html
> and provided the download code in an ws.stfx.eu snippet.

Paul, I know you understand, we're on the same page.

Sven

> --
> View this message in context: http://forum.world.st/ZnClient-getting-more-that-19-tweet-for-data-scrapping-tp4818162p4818361.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
In reply to this post by Paul DeBruicker
Hi Paul and Sven,
I will try the Twitter API if necessary, but I'm not trying to get support here on how to violate Twitter's ToS. I'm well aware of them, but surely there are exceptions. The mail I shared with you (the one Paul points to) about why I would prefer scraping over the API doesn't go into detail. It just said that people who don't have a Twitter account, or who haven't signed the ToS, should be able to get some Twitter info. That special kind of info is the one regarding public political discourse and politicians' discourse, and my idea of scraping *public* and *specific* data from Twitter comes from a project for citizen oversight of political issues empowered by ICT. The project is discussed in detail here:

https://www.newschallenge.org/challenge/elections/entries/datapolis-data-narratives-visualizations-for-citizen-oversight-of-politicians-discourses-and-public-contracts-in-social-media-and-the-web

So, for the moment, I will get proper account permissions for getting Twitter data. My long-term conviction, though, is that public political discourse (among others), even when it circulates on private networks like Twitter or Facebook, should be governed by constitutional terms (like free speech and wide political participation) and not by the restricted ones of Twitter or Facebook. This is a sensitive issue and surely needs more talk, maybe another time, but I will follow your advice, extract data from the Twitter API, and come back here with questions about it.

Thanks for all your help and support,

Offray

On 08/04/15 at 08:29, Paul DeBruicker wrote:
> [...]