A new version of WebClient-Core was added to project The Inbox:
http://source.squeak.org/inbox/WebClient-Core-tobe.120.mcz ==================== Summary ==================== Name: WebClient-Core-tobe.120 Author: tobe Time: 7 May 2020, 8:03:26.842149 pm UUID: e511b96d-6164-458e-bebd-4bb4f42cfb72 Ancestors: WebClient-Core-nice.119 If we get a 304, do not try to parse content In particular, since content-length is typically not set for a 304 response (https://tools.ietf.org/html/rfc7230#section-3.3.2) and we somehow failed to notice the connection being closed, we ended up with a timeout instead. This caused some Metacello tests to time out. =============== Diff against WebClient-Core-nice.119 =============== Item was changed: ----- Method: WebResponse>>getContent (in category 'private') ----- getContent "Do not read any content if this was a HEAD request or code is 204 (no content)" + (request method = 'HEAD' or: [code = 204 or: [code = 304]]) ifTrue:[^'']. - (request method = 'HEAD' or: [code = 204]) ifTrue:[^'']. ^super getContent! |
> On 07.05.2020, at 20:03, [hidden email] wrote: > > A new version of WebClient-Core was added to project The Inbox: > http://source.squeak.org/inbox/WebClient-Core-tobe.120.mcz > > ==================== Summary ==================== > > Name: WebClient-Core-tobe.120 > Author: tobe > Time: 7 May 2020, 8:03:26.842149 pm > UUID: e511b96d-6164-458e-bebd-4bb4f42cfb72 > Ancestors: WebClient-Core-nice.119 > > If we get a 304, do not try to parse content > > In particular, since content-length is typically not set for a 304 response (https://tools.ietf.org/html/rfc7230#section-3.3.2) and we somehow failed to notice the connection being closed, we ended up with a timeout instead. This caused some Metacello tests to time out. > > =============== Diff against WebClient-Core-nice.119 =============== > > Item was changed: > ----- Method: WebResponse>>getContent (in category 'private') ----- > getContent > "Do not read any content if this was a HEAD request or code is 204 (no content)" > + (request method = 'HEAD' or: [code = 204 or: [code = 304]]) ifTrue:[^'']. > - (request method = 'HEAD' or: [code = 204]) ifTrue:[^'']. > ^super getContent! > Hey Tom interesting catch. But we should not reach here. 304 is a redirect which ought to be handled BEFORE any getContent… Best regards -Tobias |
Hey Tobi,
we do still call `self content`, even if we go into the redirect branch, leading to the described timeout behavior: WebClient>>#sendRequest:contentBlock ... "Handle redirect if needed" (self allowRedirect and:[response isRedirect]) ifTrue:[ "Eat up the content of the previous response" response content. repeatRedirect := self redirect: request from: response. ]. ... I also felt like there should be a more general fix here. From what I read, if no content-length is given, the client should read data until the server closes the connection, however, the socket in SqueakSSL kept reporting that the socket was connected when I tried to fetch data from the GitHub API, until we timed out. This seemed like it would also constitute a correct (and also much simpler) fix, just that maybe we should make this list of codes where we do not need to read data even longer. Maybe I'm also missing another code path somewhere that should catch this case though. Best, Tom ________________________________________ From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]> Sent: Thursday, May 7, 2020 8:09:27 PM To: The general-purpose Squeak developers list Subject: Re: [squeak-dev] The Inbox: WebClient-Core-tobe.120.mcz > On 07.05.2020, at 20:03, [hidden email] wrote: > > A new version of WebClient-Core was added to project The Inbox: > http://source.squeak.org/inbox/WebClient-Core-tobe.120.mcz > > ==================== Summary ==================== > > Name: WebClient-Core-tobe.120 > Author: tobe > Time: 7 May 2020, 8:03:26.842149 pm > UUID: e511b96d-6164-458e-bebd-4bb4f42cfb72 > Ancestors: WebClient-Core-nice.119 > > If we get a 304, do not try to parse content > > In particular, since content-length is typically not set for a 304 response (https://tools.ietf.org/html/rfc7230#section-3.3.2) and we somehow failed to notice the connection being closed, we ended up with a timeout instead. This caused some Metacello tests to time out. > > =============== Diff against WebClient-Core-nice.119 =============== > > Item was changed: > ----- Method: WebResponse>>getContent (in category 'private') ----- > getContent > "Do not read any content if this was a HEAD request or code is 204 (no content)" > + (request method = 'HEAD' or: [code = 204 or: [code = 304]]) ifTrue:[^'']. > - (request method = 'HEAD' or: [code = 204]) ifTrue:[^'']. > ^super getContent! > Hey Tom interesting catch. But we should not reach here. 304 is a redirect which ought to be handled BEFORE any getContent… Best regards -Tobias |
Hi
> On 07.05.2020, at 20:24, Beckmann, Tom <[hidden email]> wrote: > > Hey Tobi, > > we do still call `self content`, even if we go into the redirect branch, leading to the described timeout behavior: > > WebClient>>#sendRequest:contentBlock > ... > > "Handle redirect if needed" > (self allowRedirect and:[response isRedirect]) ifTrue:[ > "Eat up the content of the previous response" > response content. > repeatRedirect := self redirect: request from: response. > ]. > ... > > I also felt like there should be a more general fix here. From what I read, if no content-length is given, the client should read data until the server closes the connection, however, the socket in SqueakSSL kept reporting that the socket was connected when I tried to fetch data from the GitHub API, until we timed out. This seemed like it would also constitute a correct (and also much simpler) fix, just that maybe we should make this list of codes where we do not need to read data even longer. Maybe I'm also missing another code path somewhere that should catch this case though. Yeah, in the end it is a robuster thing. Maybe we should catch all 3xx here? they ought all have no content, right? Best regards -Tobias |
Ah I found the relevant part:
" 1. Any response to a HEAD request and any response with a 1xx (Informational), 204 (No Content), or 304 (Not Modified) status code is always terminated by the first empty line after the header fields, regardless of the header fields present in the message, and thus cannot contain a message body." - https://tools.ietf.org/html/rfc7230#section-3.3.3 There appears to be no special cases for the remaining 3xx codes. In fact I believe I do remember seeing HTML content containing a link back in the days on some 301 pages. So could we change the line to (request method = 'HEAD' or: [(code between: 100 and: 199) or: [code = 204 or: [code = 304]]]) when merging to trunk (and maybe include the reference to the RFC)? Or should I create a new commit to the inbox? Best, Tom ________________________________________ From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]> Sent: Thursday, May 7, 2020 8:41:18 PM To: The general-purpose Squeak developers list Subject: Re: [squeak-dev] The Inbox: WebClient-Core-tobe.120.mcz Hi > On 07.05.2020, at 20:24, Beckmann, Tom <[hidden email]> wrote: > > Hey Tobi, > > we do still call `self content`, even if we go into the redirect branch, leading to the described timeout behavior: > > WebClient>>#sendRequest:contentBlock > ... > > "Handle redirect if needed" > (self allowRedirect and:[response isRedirect]) ifTrue:[ > "Eat up the content of the previous response" > response content. > repeatRedirect := self redirect: request from: response. > ]. > ... > > I also felt like there should be a more general fix here. From what I read, if no content-length is given, the client should read data until the server closes the connection, however, the socket in SqueakSSL kept reporting that the socket was connected when I tried to fetch data from the GitHub API, until we timed out. This seemed like it would also constitute a correct (and also much simpler) fix, just that maybe we should make this list of codes where we do not need to read data even longer. Maybe I'm also missing another code path somewhere that should catch this case though. Yeah, in the end it is a robuster thing. Maybe we should catch all 3xx here? they ought all have no content, right? Best regards -Tobias |
> On 07.05.2020, at 20:55, Beckmann, Tom <[hidden email]> wrote: > > Ah I found the relevant part: > " 1. Any response to a HEAD request and any response with a 1xx > (Informational), 204 (No Content), or 304 (Not Modified) status > code is always terminated by the first empty line after the > header fields, regardless of the header fields present in the > message, and thus cannot contain a message body." > - https://tools.ietf.org/html/rfc7230#section-3.3.3 > > There appears to be no special cases for the remaining 3xx codes. In fact I believe I do remember seeing HTML content containing a link back in the days on some 301 pages. > > So could we change the line to > > (request method = 'HEAD' or: [(code between: 100 and: 199) or: [code = 204 or: [code = 304]]]) > > when merging to trunk (and maybe include the reference to the RFC)? Or should I create a new commit to the inbox? New commit and then both to the trunk :) shipit ;) -t > > Best, > Tom > ________________________________________ > From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]> > Sent: Thursday, May 7, 2020 8:41:18 PM > To: The general-purpose Squeak developers list > Subject: Re: [squeak-dev] The Inbox: WebClient-Core-tobe.120.mcz > > Hi >> On 07.05.2020, at 20:24, Beckmann, Tom <[hidden email]> wrote: >> >> Hey Tobi, >> >> we do still call `self content`, even if we go into the redirect branch, leading to the described timeout behavior: >> >> WebClient>>#sendRequest:contentBlock >> ... >> >> "Handle redirect if needed" >> (self allowRedirect and:[response isRedirect]) ifTrue:[ >> "Eat up the content of the previous response" >> response content. >> repeatRedirect := self redirect: request from: response. >> ]. >> ... >> >> I also felt like there should be a more general fix here. From what I read, if no content-length is given, the client should read data until the server closes the connection, however, the socket in SqueakSSL kept reporting that the socket was connected when I tried to fetch data from the GitHub API, until we timed out. This seemed like it would also constitute a correct (and also much simpler) fix, just that maybe we should make this list of codes where we do not need to read data even longer. Maybe I'm also missing another code path somewhere that should catch this case though. > > Yeah, in the end it is a robuster thing. Maybe we should catch all 3xx here? they ought all have no content, right? > > Best regards > -Tobias > > > |
On Thu, May 7, 2020 at 9:11 PM Tobias Pape <[hidden email]> wrote:
> > > > On 07.05.2020, at 20:55, Beckmann, Tom <[hidden email]> wrote: > > > > Ah I found the relevant part: > > " 1. Any response to a HEAD request and any response with a 1xx > > (Informational), 204 (No Content), or 304 (Not Modified) status > > code is always terminated by the first empty line after the > > header fields, regardless of the header fields present in the > > message, and thus cannot contain a message body." > > - https://tools.ietf.org/html/rfc7230#section-3.3.3 > > > > There appears to be no special cases for the remaining 3xx codes. In fact I believe I do remember seeing HTML content containing a link back in the days on some 301 pages. > > > > So could we change the line to > > > > (request method = 'HEAD' or: [(code between: 100 and: 199) or: [code = 204 or: [code = 304]]]) > > > > when merging to trunk (and maybe include the reference to the RFC)? Or should I create a new commit to the inbox? > > New commit and then both to the trunk :) > shipit ;) +1 > > -t > > > > > Best, > > Tom > > ________________________________________ > > From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]> > > Sent: Thursday, May 7, 2020 8:41:18 PM > > To: The general-purpose Squeak developers list > > Subject: Re: [squeak-dev] The Inbox: WebClient-Core-tobe.120.mcz > > > > Hi > >> On 07.05.2020, at 20:24, Beckmann, Tom <[hidden email]> wrote: > >> > >> Hey Tobi, > >> > >> we do still call `self content`, even if we go into the redirect branch, leading to the described timeout behavior: > >> > >> WebClient>>#sendRequest:contentBlock > >> ... > >> > >> "Handle redirect if needed" > >> (self allowRedirect and:[response isRedirect]) ifTrue:[ > >> "Eat up the content of the previous response" > >> response content. > >> repeatRedirect := self redirect: request from: response. > >> ]. > >> ... > >> > >> I also felt like there should be a more general fix here. From what I read, if no content-length is given, the client should read data until the server closes the connection, however, the socket in SqueakSSL kept reporting that the socket was connected when I tried to fetch data from the GitHub API, until we timed out. This seemed like it would also constitute a correct (and also much simpler) fix, just that maybe we should make this list of codes where we do not need to read data even longer. Maybe I'm also missing another code path somewhere that should catch this case though. > > > > Yeah, in the end it is a robuster thing. Maybe we should catch all 3xx here? they ought all have no content, right? > > > > Best regards > > -Tobias > > > > > > > > > |
Free forum by Nabble | Edit this page |