I've been trying to find a library for Pharo/Squeak that handles GET/POST requests, manages cookies, and works with HTTPS servers.

The HTTPSocket that's included in Pharo has no cookie support. Searching for a library that handles cookies turned up CurlPlugin and SWHTTPClient. The SWHTTPClient page has a broken link to the source code (http://map.squeak.org/package/15f42ec1-e93e-4bcf-ab2b-6746ae9d413f). The CurlPlugin package for Win32 that I found on the main project page fails most of its tests and can't retrieve any HTTP data. I also found the WebClient/WebServer library at http://www.squeaksource.com/@QY3MLGU4hU3c8qcE/2xQek_iM, which also fails most of its tests after installation.

I wonder what people in the Smalltalk community use when they need to do web scraping while keeping a session in cookies? What would be the best library to invest time in (I am very new to Smalltalk)?

Thank you,
Andrei

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Welcome to the wonderful world of Squeak:

(Robin Williams genie voice on)

Vast, Cosmic Programming Powers!!!!

poorlymaintainedlibraries

On 7/29/10 9:06 AM, Andrei Stebakov wrote:
> I wonder what people in the Smalltalk community use when they need
> to do web scraping while keeping a session in cookies? |
In reply to this post by Andrei Stebakov
Hi Andrei,
WebClient is probably the way to go. It is primarily developed for Squeak but I'm currently using it with Pharo 1.1.

http://squeakingalong.wordpress.com/2010/05/05/webclient-and-webserver-for-squeak/

The tutorial in the Help section will help you to start.

Francois

On 29/07/10 18:06, Andrei Stebakov wrote:
> I wonder what people in the Smalltalk community use when they need
> to do web scraping while keeping a session in cookies?

--
[hidden email]
http://www.agilitic.com
+32 (0)484/580.322 |
In reply to this post by LawsonEnglish
> Welcome to the wonderful world of squeak:
>
> (Robin Williams genii voice on)
>
> Vast, Cosmic Programming Powers!!!!
>
> poorlymaintainedlibraries

Even if you are right, nothing prevents people from
- loading
- fixing
- enhancing
libraries.

We are working on books, chapters, ... and any help is welcome. People often mention that Python has good documentation, but documentation and tests do not get written magically. People write them.

With Metacello we will make sure that libraries get archived and stay loadable, since Metacello will help us build distributions.

Kent Beck told me recently that we often forget that everything that is done, even something simple, no longer needs to be done, and that each time you do something you make progress. This is an important vision behind Pharo. Do something small and do it a lot. Everybody can do something small.

Stef |
In reply to this post by Andrei Stebakov
I found WebClient reasonably helpful. In Pharo 1.1, WebClient loaded with Metacello passes all tests (green) on Linux. In a more recent version, some tests related to WebSockets fail.

I found that WebClient lacks multi-domain cookie support and has response encoding problems, but I am working on that right now.

On Thu, Jul 29, 2010 at 20:06, Andrei Stebakov <[hidden email]> wrote:
> I wonder what people in the Smalltalk community use when they need
> to do web scraping while keeping a session in cookies? |
Andrey,

Do you know if the tests also fail in Squeak? Otherwise, what are the problems?

Stef

On Aug 1, 2010, at 3:03 AM, Andrey Larionov wrote:
> In Pharo 1.1, WebClient loaded with Metacello passes all tests green
> on Linux. In a more recent version, some tests related to WebSockets fail. |
On 01.08.2010 09:39, Stéphane Ducasse wrote:
> andrey
>
> do you know if they tests are failing also in squeak?
> Else what are the problems?

#squeakToUtf8, #utf8ToSqueak concatenating Strings and Integers.
I sent a patch to Andreas, got WebSockets running in Seaside now :-)

Cheers
Philippe |
Cool!
Thanks. I think that this is important to get good infrastructure.

On Aug 1, 2010, at 3:43 PM, Philippe Marschall wrote:
> #squeakToUtf8, #utf8ToSqueak concatenating Strings and Integers.
> I sent a patch to Andreas, got WebSockets running in Seaside now :-) |
Pharo 1.1 doesn't contain these methods.

On Sun, Aug 1, 2010 at 21:36, Stéphane Ducasse <[hidden email]> wrote:
> On Aug 1, 2010, at 3:43 PM, Philippe Marschall wrote:
>> #squeakToUtf8, #utf8ToSqueak concatenating Strings and Integers.
>> I sent a patch to Andreas, got WebSockets running in Seaside now :-) |
On 01.08.2010 21:07, Andrey Larionov wrote:
> Pharo 1.1 doesn't contain these methods

Yes, that's why the tests failed.

Cheers
Philippe |
In reply to this post by Andrey Larionov
I also found that cookies were not sent correctly.

Every cookie was sent with its own "Cookie: " header, which is not correct. They should be sent in a single header of the form "Cookie: name1=value1; name2=value2". I temporarily fixed it in my current image, and I'll try to send out the patch if it hasn't already been fixed by someone else.

Also, cookie collection is too restrictive about the domain. Say your request goes to www.domain.com and the cookies in the response are set for domain.com. Those cookies won't be collected, since the current algorithm requires the domain to match from the start of the string (it should probably only match the end of the string).

Andrei

On Sat, Jul 31, 2010 at 9:03 PM, Andrey Larionov <[hidden email]> wrote:
> I found that WebClient lacks multi-domain cookie support and has response
> encoding problems, but I am working on that right now. |
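For illustration, the single-header form described in this post can be sketched as follows. This is a minimal Python sketch rather than WebClient code; the function name `cookie_header` and the list-of-pairs input are assumptions made only for the example.

```python
def cookie_header(cookies):
    """Join (name, value) pairs into one Cookie request header line.

    `cookies` is a hypothetical list of (name, value) tuples; a real
    client would take them from its cookie store.
    """
    pairs = "; ".join(f"{name}={value}" for name, value in cookies)
    return f"Cookie: {pairs}"


if __name__ == "__main__":
    # Two cookies end up in a single header instead of two "Cookie: " lines.
    print(cookie_header([("name1", "value1"), ("name2", "value2")]))
```

The alternative under discussion is emitting one "Cookie: " line per pair, which is what some servers apparently fail to handle.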
Hi Andrei, excellent :)

BTW, for HTTP Client you should cc Andreas Raab or the Squeak mailing list....

On Wed, Aug 4, 2010 at 6:08 PM, Andrei Stebakov <[hidden email]> wrote:
> I also found that cookies were not correctly sent. |
On 8/4/2010 9:57 AM, Mariano Martinez Peck wrote:
> Hi Adrei, excellent :)
>
> BTW, for HTTP Client you should cc Andreas Raab or squeak mailing list....

Squeak-dev please (http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/squeak-dev).

> On Wed, Aug 4, 2010 at 6:08 PM, Andrei Stebakov <[hidden email]> wrote:
>
> I also found that cookies were not correctly sent.
> Every cookie was sent with its own "Cookie: " header which is not
> correct.

I'm curious, why do you think that's incorrect? My understanding is that RFC 2616 explicitly allows that:

"Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one 'field-name: field-value' pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma."

And the condition appears to be satisfied in RFC 2109 regarding the Cookie header:

"The syntax for the header is:

cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value)
... "

> Also cookie collection is too restrictive to the domain. Let's say
> your request goes to www.domain.com and in the cookies it'll have
> domain.com.
> Those cookies won't be collected since the current algorithm requires
> it to match from the start of the string (probably should only match
> the end of the string).

Yeah, that's a silly bug. Thanks for reporting.

Cheers,
- Andreas |
Hi, Andreas. Could you also look into cross-domain cookie handling, for the case where the domain is specified with a leading period? For example: .squeak.org

This means that the cookie will be available for the squeak.org domain and for all subdomains of squeak.org.

On Wed, Aug 4, 2010 at 21:15, Andreas Raab <[hidden email]> wrote:
> Yeah, that's a silly bug. Thanks for reporting. |
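The matching behaviour asked for here and earlier in the thread (match the end of the host name, and treat a leading period as "this domain and all its subdomains") might look roughly like the sketch below. It is a simplified Python illustration under stated assumptions, not WebClient's algorithm: the function name `domain_matches` is hypothetical, and the stricter rules of the cookie specifications (for example around IP addresses or top-level domains) are deliberately left out.

```python
def domain_matches(request_host, cookie_domain):
    """Return True if a cookie set for `cookie_domain` applies to `request_host`.

    Simplified assumption: a cookie domain, with or without a leading
    period, covers the domain itself and every subdomain of it.
    """
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # Compare against the END of the host name, on a label boundary,
    # so that "notsqueak.org" does not match "squeak.org".
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    print(domain_matches("www.domain.com", "domain.com"))
    print(domain_matches("lists.squeak.org", ".squeak.org"))
    print(domain_matches("notsqueak.org", "squeak.org"))
```

Matching on a label boundary (the `"." + domain` suffix) rather than on a bare string suffix is a design choice of this sketch, so that unrelated hosts sharing a suffix are not accepted.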
In reply to this post by Andreas.Raab
Hi Andreas
I believe that the key is in (quoting):

"The syntax for the header is:

cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value)
... "

That means that when you want to send multiple cookies (cookie1 has name1 and value1, cookie2 has name2 and value2), you send:

"Cookie: name1=value1; name2=value2"

In your case you send two different headers:

"Cookie: name1=value1"
"Cookie: name2=value2"

Some servers may accept this, but the one I am working with chokes on it and skips all the "Cookie: " headers following the first one. If you take a look at "Live HTTP headers" in Firefox, you'll see that requests with cookies follow the "all-cookies-in-one-line" rule.

Regards,
Andrei

On Wed, Aug 4, 2010 at 1:15 PM, Andreas Raab <[hidden email]> wrote:
> I'm curious, why do you think that's incorrect? My understanding is that RFC
> 2616 explicitly allows that: |
In reply to this post by Andreas.Raab
On 04.08.2010 19:15, Andreas Raab wrote:
> I'm curious, why do you think that's incorrect? My understanding is that
> RFC 2616 explicitly allows that:
>
> "Multiple message-header fields with the same field-name MAY be
> present in a message if and only if the entire field-value for that
> header field is defined as a comma-separated list [i.e., #(values)]. It
> MUST be possible to combine the multiple header fields into one
> 'field-name: field-value' pair, without changing the semantics of the
> message, by appending each subsequent field-value to the first, each
> separated by a comma."

You're correct, but that doesn't mean the implementations follow the spec :-(. I can only speak for Set-Cookie, there you have to send each cookie on a new line because the expires date format includes a comma and Firefox and IE can't handle that.

Cheers
Philippe |
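The Set-Cookie problem mentioned here can be made concrete with a small sketch. The Python below is only an illustration with made-up cookie values: it folds two Set-Cookie values into one comma-separated value, as the RFC 2616 rule would permit, and then splits naively on commas, which is an assumption about how a simple parser might behave rather than a description of any particular browser.

```python
# Two hypothetical Set-Cookie values; the first has an expires date,
# whose format itself contains a comma after the weekday.
set_cookie_values = [
    "id=42; expires=Wed, 04-Aug-2010 17:15:00 GMT; path=/",
    "lang=en; path=/",
]

# Fold them into a single comma-separated header value.
folded = ", ".join(set_cookie_values)

# A naive parser that splits on commas can no longer recover the two cookies.
naive_parts = [part.strip() for part in folded.split(",")]

if __name__ == "__main__":
    for part in naive_parts:
        print(repr(part))
```

The split yields three fragments instead of two cookies, because the comma inside the date is indistinguishable from the separator. Sending each Set-Cookie on its own line avoids the ambiguity.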
Yes, that's correct for Set-Cookie (the response): each cookie has its own line.
For the request, the "Cookie: " header should carry all cookies on one line.

On Sat, Aug 7, 2010 at 8:05 AM, Philippe Marschall <[hidden email]> wrote:
> I can only speak for Set-Cookie, there you have to send each
> cookie on a new line because the expires date format includes a comma
> and Firefox and IE can't handle that. |
What's the code review process for Squeak/Pharo?

I just tried to post my changes to the WebClient Monticello repository via "Save" and it was rejected with error "401". It looks like I don't have write access to it.

On Sat, Aug 7, 2010 at 1:20 PM, Andrei Stebakov <[hidden email]> wrote:
> Yes, that's correct for Set-Cookie (the response), each has its own line.
> "Cookie: " for request should be all in one line. |