The Inbox: WebClient-HTTP-cmm.5.mcz

commits-2
Chris Muller uploaded a new version of WebClient-HTTP to project The Inbox:
http://source.squeak.org/inbox/WebClient-HTTP-cmm.5.mcz

==================== Summary ====================

Name: WebClient-HTTP-cmm.5
Author: cmm
Time: 11 September 2016, 4:59:26.542985 pm
UUID: e203b772-bc1c-4516-8bf4-4ea1bc1edb7d
Ancestors: WebClient-HTTP-cmm.4

Don't force an Accept header of 'text/html'; let clients specify that if it's appropriate.  This fixes accessing servers that demand it not be present.

=============== Diff against WebClient-HTTP-cmm.4 ===============

Item was changed:
  ----- Method: HTTPSocket class>>httpGetDocument:args:accept:request: (in category '*webclient-http') -----
  httpGetDocument: url args: args accept: mimeType request: requestString
  "Return the exact contents of a web object. Asks for the given MIME type. If mimeType is nil, use 'text/html'. An extra requestString may be submitted and must end with crlf.  The parsed header is saved. Use a proxy server if one has been registered.  tk 7/23/97 17:12"
 
  "Note: To fetch raw data, you can use the MIME type 'application/octet-stream'."
 
  | client xhdrs resp urlString progress |
  "Normalize the url"
  urlString := (Url absoluteFromText: url) asString.
 
  args ifNotNil: [
  urlString := urlString, (self argString: args)
  ].
 
  "Some raw extra headers which historically have been added"
  xhdrs := HTTPProxyCredentials,
  HTTPBlabEmail, "may be empty"
  requestString. "extra user request. Authorization"
 
  client := WebClient new.
  ^[resp := client httpGet: urlString do:[:req|
  "Add ACCEPT header"
  mimeType ifNotNil:[req headerAt: 'Accept' put: mimeType].
 
- "Always accept plain text"
- req addHeader: 'Accept' value: 'text/html'.
-
  "Add the additional headers"
  (WebUtils readHeadersFrom: xhdrs readStream)
  do:[:assoc| req addHeader: assoc key value: assoc value]].
 
  progress := [:total :amount|
  (HTTPProgress new) total: total; amount: amount; signal: 'Downloading...'
  ].
 
  "Simulate old HTTPSocket return behavior"
  (resp code between: 200 and: 299)
  ifTrue:[MIMEDocument contentType: resp contentType
  content: (resp contentWithProgress: progress) url: url]
  ifFalse:[resp asString, resp content].
  ] ensure:[client destroy].
  !


Re: The Inbox: WebClient-HTTP-cmm.5.mcz

Chris Muller-3
With this fix, I can do:

   'https://www.quandl.com/api/v3/databases.csv' asUrl retrieveContents

Without it, I can't.


Re: The Inbox: WebClient-HTTP-cmm.5.mcz

Tobias Pape

On 12.09.2016, at 00:02, Chris Muller <[hidden email]> wrote:

> With this fix, I can do:
>
>   'https://www.quandl.com/api/v3/databases.csv' asUrl retrieveContents
>
> without, I can't.


Well, shouldn't it be something like this, then?

Accept: text/html; q=1.0, */*; q=0.1

Best regards
        -Tobias


Re: The Inbox: WebClient-HTTP-cmm.5.mcz

Chris Muller-3
No, that fails too.


Re: The Inbox: WebClient-HTTP-cmm.5.mcz

Tobias Pape

On 12.09.2016, at 00:09, Chris Muller <[hidden email]> wrote:

> No, that fails too.


Strange, because this works:

curl -H"Accept: text/html; q=1.0, */*; q=0.1" -v https://www.quandl.com/api/v3/databases.csv

so there should not be any difference with WebClient then.

Btw: This also works (without your change):

==========================================================
WebClient httpGet: 'https://www.quandl.com/api/v3/databases.csv'  "

WebResponse(HTTP/1.1 200 OK
cache-control: max-age=0, private, must-revalidate
content-type: text/csv; charset=utf-8
date: Sun, 11 Sep 2016 22:31:35 GMT
etag: W/"fcd593ee1d24f34ad69d1bc7a307c655"
server: openresty
vary: Origin
x-api-version: 2015-04-09
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-rack-cors: preflight-hit; no-origin
x-ratelimit-limit: 50
x-ratelimit-remaining: 45
x-request-id: 0ddf764c-e906-49fc-80c6-8c4f64689d72
x-runtime: 0.175008
x-xss-protection: 1; mode=block
content-length: 26215
connection: keep-alive
)"
==========================================================

As does this:

==========================================================
curl -H"Accept: text/html" -v https://www.quandl.com/api/v3/databases.csv
* Adding handle: conn: 0x7fe5d4003a00
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x7fe5d4003a00) send_pipe: 1, recv_pipe: 0
* About to connect() to www.quandl.com port 443 (#0)
*   Trying 54.174.87.84...
* Connected to www.quandl.com (54.174.87.84) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: *.quandl.com
* Server certificate: Amazon
* Server certificate: Amazon Root CA 1
* Server certificate: Starfield Services Root Certificate Authority - G2
> GET /api/v3/databases.csv HTTP/1.1
> User-Agent: curl/7.30.0
> Host: www.quandl.com
> Accept: text/html
>
< HTTP/1.1 200 OK
< Cache-Control: max-age=0, private, must-revalidate
< Content-Type: text/csv; charset=utf-8
< Date: Sun, 11 Sep 2016 22:29:42 GMT
< ETag: W/"fcd593ee1d24f34ad69d1bc7a307c655"
* Server openresty is not blacklisted
< Server: openresty
< Vary: Origin
< X-API-Version: 2015-04-09
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-Rack-CORS: preflight-hit; no-origin
< X-RateLimit-Limit: 50
< X-RateLimit-Remaining: 46
< X-Request-Id: e63f62a2-37ba-4f0c-87e9-0ea91081ce83
< X-Runtime: 0.204353
< X-XSS-Protection: 1; mode=block
< Content-Length: 26215
< Connection: keep-alive
<
==========================================================


So the change is bogus. Could you please revert?

Best regards
        -Tobias



Re: The Inbox: WebClient-HTTP-cmm.5.mcz

Tobias Pape

On 12.09.2016, at 00:33, Tobias Pape <[hidden email]> wrote:

> bogus

Wrong word. I meant something like "not fixing what is to be fixed, as something else needs fixing".