The Inbox: WebClient-Core-ct.126.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

The Inbox: WebClient-Core-ct.126.mcz

commits-2
A new version of WebClient-Core was added to project The Inbox:
http://source.squeak.org/inbox/WebClient-Core-ct.126.mcz

==================== Summary ====================

Name: WebClient-Core-ct.126
Author: ct
Time: 12 October 2020, 7:05:32.190311 pm
UUID: a64ca560-814a-f940-8f64-e66a25cedc61
Ancestors: WebClient-Core-ul.123

Proposal to implement pre-authentication on WebClient.

MOTIVATION.
Until now, the authentication flow in Squeak's WebClient looks like this:
First, a request is made without trying to authenticate the user. If the request fails with an error 401 (Unauthorized) or an error 407 (Proxy Authentication Required), the authentication headers are added to the request, and the request is retried.

However, this does not work properly in some situations.
For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.
Another issue I encountered today lies in some particular servers not requesting a specific authentication method via the WWW-Authenticate header along a 401 response as specified by the protocol [2]. Concretely, I encountered this problem with the quite popular GitHub API so I think our client should be robust enough to handle this contract violation.

APPROACH.
This patch adds a new property for the #preAuthenticationMethod to the WebClient class. It can be set to a symbol indicating any authentication method that is supported by the WebClient, e.g. #basic or #bearer. (Digest access authentication, however, cannot be used at this place because it depends on a realm specified by the server.)
If this property is set, the relevant authentication headers will be added to the request already before the first attempt is made to request the resource.

In addition, the patch refactors and reformatst the methods #authenticate:from: and #sendRequest:contentBlock:.

The TESTS work as well as always (a number of them failing sporadically, but after some trials, I get a green bar again).

WHAT REMAINS TO BE DONE.
I deleted the fixme "Pre-authenticate the request if we have valid auth credentials" comment which I think was exactly what I implemented in this patch. I hope this assumption was correct? Also, another fixme comment requests to preserve the authState after following a redirect. Instead, with this patch, any specified pre-authentication method will be reused after every redirect. I did not fix this because I do not have a use-case scenario for it. Can we leave this as-is, and in a future version, could we simply delete this send to #flushAuthState?

Also, I'm not sure about whether pre-authentication maybe should be the default for every request containing a username/password specification. This would speed up every web request that uses credentials by up to the factor 2 because we could save one futile query. Also, it appears to be the state-of-the-art solution, popular tools such as curl specify the credentials in the first run already. On the contrary, it would be a breaking change, and looking at the comment in #preAuthenticationMethod, pre-authentication as an opt-out feature might break or at least slow down NTLM/Negotiate use cases. However, I never heard of this before. Are these protocols still relevant at all?

Please give this patch a careful review because still, all knowledge I have about this domain is collected from StackOverflow and Wikipedia a few hours ago.

REFERENCES.
[1] https://stackoverflow.com/a/17688080/13994294
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/401

=============== Diff against WebClient-Core-ul.123 ===============

Item was changed:
  Object subclass: #WebClient
+ instanceVariableNames: 'flags server scheme timeout stream cookies proxyServer lastScheme lastServer lastPort maxRedirect redirections userAgent authParams proxyParams accessLog debugLog preAuthenticationMethod'
- instanceVariableNames: 'flags server scheme timeout stream cookies proxyServer lastScheme lastServer lastPort maxRedirect redirections userAgent authParams proxyParams accessLog debugLog'
  classVariableNames: 'DebugLog FlagAcceptCookies FlagAllowAuth FlagAllowRedirect ProxyHandler'
  poolDictionaries: ''
  category: 'WebClient-Core'!
 
  !WebClient commentStamp: 'ar 5/4/2010 13:17' prior: 0!
  WebClient provides a simple yet complete HTTP client implementation.
 
  To view the documentation evaluate:
 
  HelpBrowser openOn: WebClientHelp.
  !

Item was added:
+ ----- Method: WebClient>>authProcess:from:header:params: (in category 'authentication') -----
+ authProcess: request from: response header: authHeader params: params
+ "Process an authentication header.
+ Answer true if an authentication response could be generated; otherwise, false."
+
+ self
+ authDispatch: request
+ from: response
+ header: authHeader
+ params: params.
+
+ params at: #authResponse ifAbsent: [^ false].
+
+ "If we generated an authentication response for the header use it"
+ request
+ headerAt: ((response ifNotNil: [response code = 401] ifNil: [true])
+ ifTrue: ['Authorization']
+ ifFalse: ['Proxy-Authorization'])
+ put: (params at: #authMethod), ' ', (params at: #authResponse).
+
+ ^ true!

Item was changed:
  ----- Method: WebClient>>authenticate:from: (in category 'sending') -----
  authenticate: request from: response
  "Authenticate after having received a 401/407 response.
  Returns true if we should retry, false if we fail here."
 
+ | headers params |
+
- "NOTE: The first time through we do NOT ask for credentials right away.
- Some authentication mechanisms (NTLM/Negotiate) can use the credentials
- of the currently logged on user. Consequently we only ask for credentials
- if we're unable to do so without asking. Methods that require credentials
- (basic, digest) test for their existence explicitly."
-
- | headers authHeader params |
-
  "Pick the right set of parameters"
+ response code = 401
+ ifTrue: [
+ params := authParams.
+ headers := response headersAt: 'WWW-Authenticate'.
+ "If the connection was closed, we need to flush the
+ proxy params or we won't pick up prior credentials."
+ self isConnected
+ ifFalse: [self flushAuthState: proxyParams]]
+ ifFalse: [
+ params := proxyParams.
+ headers := response headersAt: 'Proxy-Authenticate'].
+
- response code = 401 ifTrue:[
- params := authParams.
- headers := response headersAt: 'WWW-Authenticate'.
- "If the connection was closed, we need to flush the
- proxy params or we won't pick up prior credentials."
- self isConnected
- ifFalse:[self flushAuthState: proxyParams].
- ] ifFalse:[
- params := proxyParams.
- headers := response headersAt: 'Proxy-Authenticate'.
- ].
-
  "Remove any old response"
+ params removeKey: #authResponse ifAbsent: [].
+
- params removeKey: #authResponse ifAbsent:[].
-
  "Process the authentication header(s)"
+ headers
+ detect: [:authHeader |
+ self
+ authProcess: request
+ from: response
+ header: authHeader
+ params: params]
+ ifFound: [:authHeader | ^ true].
+
+ "If we fall through here this can have two reasons: One is that we don't have a suitable authentication method. Check for that first."
+ params at: #authMethod ifAbsent: [^ false].
+
+ "The other possibility is that the credentials are wrong. Clean out the previous auth state and go ask for credentials."
- 1 to: headers size do:[:i|
- authHeader := headers at: i.
- self authDispatch: request from: response header: authHeader params: params.
- "If we generated an authentication response for the header use it"
- params at: #authResponse ifPresent:[:resp|
- request headerAt: (response code = 401
- ifTrue:['Authorization']
- ifFalse:['Proxy-Authorization'])
- put: (params at: #authMethod), ' ', resp.
- ^true].
- ].
-
- "If we fall through here this can have two reasons: One is that we don't have
- a suitable authentication method. Check for that first."
- params at: #authMethod ifAbsent:[^false].
-
- "The other possibility is that the credentials are wrong.
- Clean out the previous auth state and go ask for credentials."
  self flushAuthState: params.
+
-
  "Clean out old authentication headers"
  response code = 401
+ ifTrue: [request removeHeader: 'Authorization'].
- ifTrue:[request removeHeader: 'Authorization'].
  "Always clean out the proxy auth header since we don't support pre-authentication"
  request removeHeader: 'Proxy-Authorization'.
+
-
  "Signal WebAuthRequired"
  (WebAuthRequired client: self request: request response: response)
+ signal == true ifFalse: [^ false].
+
- signal == true ifFalse:[^false].
-
  "And retry with the new credentials"
+ ^ self authenticate: request from: response!
- ^self authenticate: request from: response!

Item was added:
+ ----- Method: WebClient>>preAuthenticationMethod (in category 'accessing') -----
+ preAuthenticationMethod
+ "The authentication method to be used for initial requests. Symbol, e.g. #basic or #bearer. If nil, no authentication will be used until the server requests an authentication.
+
+ NOTE: Some authentication mechanisms (NTLM/Negotiate) can use the credentials of the currently logged on user. Consequently, by default we only ask for credentials if we're unable to do so without asking."
+
+ ^ preAuthenticationMethod!

Item was added:
+ ----- Method: WebClient>>preAuthenticationMethod: (in category 'accessing') -----
+ preAuthenticationMethod: aSymbol
+ "The authentication method to be used for initial requests. See #preAuthenticationMethod."
+
+ preAuthenticationMethod := aSymbol!

Item was changed:
  ----- Method: WebClient>>sendRequest:contentBlock: (in category 'sending') -----
  sendRequest: request contentBlock: contentBlock
  "Send an http request"
 
  |  response repeatRedirect repeatAuth |
-
- "XXXX: Fixme. Pre-authenticate the request if we have valid auth credentials"
-
  redirections := Dictionary new.
 
  ["The outer loop handles redirections"
+ repeatRedirect := false.
+
+ "Always update the host header due to redirect"
+ request headerAt: 'Host' put: server.
+
+ self preAuthenticationMethod ifNotNil: [:authMethod |
+ self
+ authProcess: request
+ from: nil
+ header: authMethod asString capitalized
+ params: authParams].
+
- repeatRedirect := false.
-
- "Always update the host header due to redirect"
- request headerAt: 'Host' put: server.
-
  ["The inner loop handles authentication"
+ repeatAuth := false.
+
+ "Connect can fail if SSL proxy CONNECT is involved"
+ self connect ifNotNil: [:resp| ^ resp].
+
+ "Write the request to the debugLog if present"
+ debugLog ifNotNil: [self writeRequest: request on: debugLog].
+
+ "Send the request itself"
+ self writeRequest: request on: stream.
+ contentBlock value: stream.
+
+ response := request newResponse readFrom: stream.
+ response url: scheme, '://', server, request rawUrl.
+
+ debugLog ifNotNil: [
+ response writeOn: debugLog.
+ debugLog flush].
+ response setCookiesDo: [:cookie|
+ self acceptCookie: cookie host: self serverUrlName path: request url].
+ accessLog ifNotNil: [
+ WebUtils logRequest: request response: response on: accessLog].
+ "Handle authentication if needed"
+ (self allowAuth and: [response code = 401 or: [response code = 407]]) ifTrue: [
+ "Eat up the content of the previous response"
+ response content.
+ repeatAuth := self authenticate: request from: response].
+
+ repeatAuth
+ ] whileTrue.
- repeatAuth := false.
-
- "Connect can fail if SSL proxy CONNECT is involved"
- self connect ifNotNil:[:resp| ^resp].
 
+ "Flush previous authState.
+ XXXX: Fixme. authState must be preserved for pre-authentication of requests."
+ self flushAuthState.
+
+ "Handle redirect if needed"
+ (self allowRedirect and: [response isRedirect]) ifTrue:[
- "Write the request to the debugLog if present"
- debugLog ifNotNil:[self writeRequest: request on: debugLog].
-
- "Send the request itself"
- self writeRequest: request on: stream.
- contentBlock value: stream.
-
- response := request newResponse readFrom: stream.
- response url: (scheme, '://', server, request rawUrl).
-
- debugLog ifNotNil:[
- response writeOn: debugLog.
- debugLog flush.
- ].
- response setCookiesDo:[:cookie|
- self acceptCookie: cookie host: self serverUrlName path: request url.
- ].
- accessLog ifNotNil:[
- WebUtils logRequest: request response: response on: accessLog
- ].
- "Handle authentication if needed"
- (self allowAuth and:[response code = 401 or:[response code = 407]]) ifTrue:[
  "Eat up the content of the previous response"
  response content.
+ repeatRedirect := self redirect: request from: response].
+ repeatRedirect
+ ] whileTrue: [
- repeatAuth := self authenticate: request from: response.
- ].
-
- repeatAuth] whileTrue.
-
- "Flush previous authState.
- XXXX: Fixme. authState must be preserved for pre-authentication of requests."
- self flushAuthState.
-
- "Handle redirect if needed"
- (self allowRedirect and:[response isRedirect]) ifTrue:[
- "Eat up the content of the previous response"
- response content.
- repeatRedirect := self redirect: request from: response.
- ].
- repeatRedirect] whileTrue:[
  "When redirecting, remove authentication headers"
  request removeHeader: 'Authorization'.
  request removeHeader: 'Proxy-Authorization'.
  ].
+
-
  "If the response is not a success, eat up its content"
+ (response isSuccess or: [response isInformational]) ifFalse: [
+ response content].
+
+ ^ response!
- (response isSuccess or:[response isInformational]) ifFalse:[response content].
-
- ^response!


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

For the case we want to make pre-authentication the default behavior, the attached changeset activates this.


What about tests? Do you see the need to add some for this patch? :-)


Best,

Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von [hidden email] <[hidden email]>
Gesendet: Montag, 12. Oktober 2020 19:05:34
An: [hidden email]
Betreff: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
A new version of WebClient-Core was added to project The Inbox:
http://source.squeak.org/inbox/WebClient-Core-ct.126.mcz

==================== Summary ====================

Name: WebClient-Core-ct.126
Author: ct
Time: 12 October 2020, 7:05:32.190311 pm
UUID: a64ca560-814a-f940-8f64-e66a25cedc61
Ancestors: WebClient-Core-ul.123

Proposal to implement pre-authentication on WebClient.

MOTIVATION.
Until now, the authentication flow in Squeak's WebClient looks like this:
First, a request is made without trying to authenticate the user. If the request fails with an error 401 (Unauthorized) or an error 407 (Proxy Authentication Required), the authentication headers are added to the request, and the request is retried.

However, this does not work properly in some situations.
For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.
Another issue I encountered today lies in some particular servers not requesting a specific authentication method via the WWW-Authenticate header along a 401 response as specified by the protocol [2]. Concretely, I encountered this problem with the quite popular GitHub API so I think our client should be robust enough to handle this contract violation.

APPROACH.
This patch adds a new property for the #preAuthenticationMethod to the WebClient class. It can be set to a symbol indicating any authentication method that is supported by the WebClient, e.g. #basic or #bearer. (Digest access authentication, however, cannot be used at this place because it depends on a realm specified by the server.)
If this property is set, the relevant authentication headers will be added to the request already before the first attempt is made to request the resource.

In addition, the patch refactors and reformatst the methods #authenticate:from: and #sendRequest:contentBlock:.

The TESTS work as well as always (a number of them failing sporadically, but after some trials, I get a green bar again).

WHAT REMAINS TO BE DONE.
I deleted the fixme "Pre-authenticate the request if we have valid auth credentials" comment which I think was exactly what I implemented in this patch. I hope this assumption was correct? Also, another fixme comment requests to preserve the authState after following a redirect. Instead, with this patch, any specified pre-authentication method will be reused after every redirect. I did not fix this because I do not have a use-case scenario for it. Can we leave this as-is, and in a future version, could we simply delete this send to #flushAuthState?

Also, I'm not sure about whether pre-authentication maybe should be the default for every request containing a username/password specification. This would speed up every web request that uses credentials by up to the factor 2 because we could save one futile query. Also, it appears to be the state-of-the-art solution, popular tools such as curl specify the credentials in the first run already. On the contrary, it would be a breaking change, and looking at the comment in #preAuthenticationMethod, pre-authentication as an opt-out feature might break or at least slow down NTLM/Negotiate use cases. However, I never heard of this before. Are these protocols still relevant at all?

Please give this patch a careful review because still, all knowledge I have about this domain is collected from StackOverflow and Wikipedia a few hours ago.

REFERENCES.
[1] https://stackoverflow.com/a/17688080/13994294
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/401

=============== Diff against WebClient-Core-ul.123 ===============

Item was changed:
  Object subclass: #WebClient
+        instanceVariableNames: 'flags server scheme timeout stream cookies proxyServer lastScheme lastServer lastPort maxRedirect redirections userAgent authParams proxyParams accessLog debugLog preAuthenticationMethod'
-        instanceVariableNames: 'flags server scheme timeout stream cookies proxyServer lastScheme lastServer lastPort maxRedirect redirections userAgent authParams proxyParams accessLog debugLog'
         classVariableNames: 'DebugLog FlagAcceptCookies FlagAllowAuth FlagAllowRedirect ProxyHandler'
         poolDictionaries: ''
         category: 'WebClient-Core'!
 
  !WebClient commentStamp: 'ar 5/4/2010 13:17' prior: 0!
  WebClient provides a simple yet complete HTTP client implementation.
 
  To view the documentation evaluate:
 
         HelpBrowser openOn: WebClientHelp.
  !

Item was added:
+ ----- Method: WebClient>>authProcess:from:header:params: (in category 'authentication') -----
+ authProcess: request from: response header: authHeader params: params
+        "Process an authentication header.
+        Answer true if an authentication response could be generated; otherwise, false."
+
+        self
+                authDispatch: request
+                from: response
+                header: authHeader
+                params: params.
+       
+        params at: #authResponse ifAbsent: [^ false].
+       
+        "If we generated an authentication response for the header use it"
+        request
+                headerAt: ((response ifNotNil: [response code = 401] ifNil: [true])
+                        ifTrue: ['Authorization']
+                        ifFalse: ['Proxy-Authorization'])
+                put: (params at: #authMethod), ' ', (params at: #authResponse).
+       
+        ^ true!

Item was changed:
  ----- Method: WebClient>>authenticate:from: (in category 'sending') -----
  authenticate: request from: response
         "Authenticate after having received a 401/407 response.
         Returns true if we should retry, false if we fail here."
 
+        | headers params |
+       
-        "NOTE: The first time through we do NOT ask for credentials right away.
-        Some authentication mechanisms (NTLM/Negotiate) can use the credentials
-        of the currently logged on user. Consequently we only ask for credentials
-        if we're unable to do so without asking. Methods that require credentials
-        (basic, digest) test for their existence explicitly."
-
-        | headers authHeader params |
-
         "Pick the right set of parameters"
+        response code = 401
+                ifTrue: [
+                        params := authParams.
+                        headers := response headersAt: 'WWW-Authenticate'.
+                        "If the connection was closed, we need to flush the
+                        proxy params or we won't pick up prior credentials."
+                        self isConnected
+                                ifFalse: [self flushAuthState: proxyParams]]
+                ifFalse: [
+                        params := proxyParams.
+                        headers := response headersAt: 'Proxy-Authenticate'].
+       
-        response code = 401 ifTrue:[
-                params := authParams.
-                headers := response headersAt: 'WWW-Authenticate'.
-                "If the connection was closed, we need to flush the
-                proxy params or we won't pick up prior credentials."
-                self isConnected
-                        ifFalse:[self flushAuthState: proxyParams].
-        ] ifFalse:[
-                params := proxyParams.
-                headers := response headersAt: 'Proxy-Authenticate'.
-        ].
-
         "Remove any old response"
+        params removeKey: #authResponse ifAbsent: [].
+       
-        params removeKey: #authResponse ifAbsent:[].
-
         "Process the authentication header(s)"
+        headers
+                detect: [:authHeader |
+                        self
+                                authProcess: request
+                                from: response
+                                header: authHeader
+                                params: params]
+                ifFound: [:authHeader | ^ true].
+       
+        "If we fall through here this can have two reasons: One is that we don't have a suitable authentication method. Check for that first."
+        params at: #authMethod ifAbsent: [^ false].
+       
+        "The other possibility is that the credentials are wrong. Clean out the previous auth state and go ask for credentials."
-        1 to: headers size do:[:i|
-                authHeader := headers at: i.
-                self authDispatch: request from: response header: authHeader params: params.
-                "If we generated an authentication response for the header use it"
-                params at: #authResponse ifPresent:[:resp|
-                        request headerAt: (response code = 401
-                                                                ifTrue:['Authorization']
-                                                                ifFalse:['Proxy-Authorization'])
-                                        put: (params at: #authMethod), ' ', resp.
-                        ^true].
-        ].
-
-        "If we fall through here this can have two reasons: One is that we don't have
-        a suitable authentication method. Check for that first."
-        params at: #authMethod ifAbsent:[^false].
-
-        "The other possibility is that the credentials are wrong.
-        Clean out the previous auth state and go ask for credentials."
         self flushAuthState: params.
+       
-
         "Clean out old authentication headers"
         response code = 401
+                ifTrue: [request removeHeader: 'Authorization'].
-                ifTrue:[request removeHeader: 'Authorization'].
         "Always clean out the proxy auth header since we don't support pre-authentication"
         request removeHeader: 'Proxy-Authorization'.
+       
-
         "Signal WebAuthRequired"
         (WebAuthRequired client: self request: request response: response)
+                signal == true ifFalse: [^ false].
+       
-                signal == true ifFalse:[^false].
-
         "And retry with the new credentials"
+        ^ self authenticate: request from: response!
-        ^self authenticate: request from: response!

Item was added:
+ ----- Method: WebClient>>preAuthenticationMethod (in category 'accessing') -----
+ preAuthenticationMethod
+        "The authentication method to be used for initial requests. Symbol, e.g. #basic or #bearer. If nil, no authentication will be used until the server requests an authentication.
+       
+        NOTE: Some authentication mechanisms (NTLM/Negotiate) can use the credentials of the currently logged on user. Consequently, by default we only ask for credentials if we're unable to do so without asking."
+
+        ^ preAuthenticationMethod!

Item was added:
+ ----- Method: WebClient>>preAuthenticationMethod: (in category 'accessing') -----
+ preAuthenticationMethod: aSymbol
+        "The authentication method to be used for initial requests. See #preAuthenticationMethod."
+
+        preAuthenticationMethod := aSymbol!

Item was changed:
  ----- Method: WebClient>>sendRequest:contentBlock: (in category 'sending') -----
  sendRequest: request contentBlock: contentBlock
         "Send an http request"
 
         |  response repeatRedirect repeatAuth |
-
-        "XXXX: Fixme. Pre-authenticate the request if we have valid auth credentials"
-
         redirections := Dictionary new.
        
         ["The outer loop handles redirections"
+                repeatRedirect := false.
+               
+                "Always update the host header due to redirect"
+                request headerAt: 'Host' put: server.
+               
+                self preAuthenticationMethod ifNotNil: [:authMethod |
+                        self
+                                authProcess: request
+                                from: nil
+                                header: authMethod asString capitalized
+                                params: authParams].
+               
-        repeatRedirect := false.
-
-        "Always update the host header due to redirect"
-        request headerAt: 'Host' put: server.
-
                 ["The inner loop handles authentication"
+                        repeatAuth := false.
+                       
+                        "Connect can fail if SSL proxy CONNECT is involved"
+                        self connect ifNotNil: [:resp| ^ resp].
+                       
+                        "Write the request to the debugLog if present"
+                        debugLog ifNotNil: [self writeRequest: request on: debugLog].
+                       
+                        "Send the request itself"
+                        self writeRequest: request on: stream.
+                        contentBlock value: stream.
+                       
+                        response := request newResponse readFrom: stream.
+                        response url: scheme, '://', server, request rawUrl.
+                       
+                        debugLog ifNotNil: [
+                                response writeOn: debugLog.
+                                debugLog flush].
+                        response setCookiesDo: [:cookie|
+                                self acceptCookie: cookie host: self serverUrlName path: request url].
+                        accessLog ifNotNil: [
+                                WebUtils logRequest: request response: response on: accessLog].
+                        "Handle authentication if needed"
+                        (self allowAuth and: [response code = 401 or: [response code = 407]]) ifTrue: [
+                                "Eat up the content of the previous response"
+                                response content.
+                                repeatAuth := self authenticate: request from: response].
+                       
+                        repeatAuth
+                ] whileTrue.
-                repeatAuth := false.
-
-                "Connect can fail if SSL proxy CONNECT is involved"
-                self connect ifNotNil:[:resp| ^resp].
                
+                "Flush previous authState.
+                XXXX: Fixme. authState must be preserved for pre-authentication of requests."
+                self flushAuthState.
+               
+                "Handle redirect if needed"
+                (self allowRedirect and: [response isRedirect]) ifTrue:[
-                "Write the request to the debugLog if present"
-                debugLog ifNotNil:[self writeRequest: request on: debugLog].
-
-                "Send the request itself"
-                self writeRequest: request on: stream.
-                contentBlock value: stream.
-
-                response := request newResponse readFrom: stream.
-                response url: (scheme, '://', server, request rawUrl).
-
-                debugLog ifNotNil:[
-                        response writeOn: debugLog.
-                        debugLog flush.
-                ].
-                response setCookiesDo:[:cookie|
-                        self acceptCookie: cookie host: self serverUrlName path: request url.
-                ].
-                accessLog ifNotNil:[
-                        WebUtils logRequest: request response: response on: accessLog
-                ].
-                "Handle authentication if needed"
-                (self allowAuth and:[response code = 401 or:[response code = 407]]) ifTrue:[
                         "Eat up the content of the previous response"
                         response content.
+                        repeatRedirect := self redirect: request from: response].
+                repeatRedirect
+        ] whileTrue: [
-                        repeatAuth := self authenticate: request from: response.
-                ].
-
-                repeatAuth] whileTrue.
-
-        "Flush previous authState.
-        XXXX: Fixme. authState must be preserved for pre-authentication of requests."
-        self flushAuthState.
-
-        "Handle redirect if needed"
-        (self allowRedirect and:[response isRedirect]) ifTrue:[
-                "Eat up the content of the previous response"
-                response content.
-                repeatRedirect := self redirect: request from: response.
-        ].
-        repeatRedirect] whileTrue:[
                 "When redirecting, remove authentication headers"
                 request removeHeader: 'Authorization'.
                 request removeHeader: 'Proxy-Authorization'.
         ].
+       
-
         "If the response is not a success, eat up its content"
+        (response isSuccess or: [response isInformational]) ifFalse: [
+                response content].
+       
+        ^ response!
-        (response isSuccess or:[response isInformational]) ifFalse:[response content].
-
-        ^response!





WebClient-preauth-default.1.cs (1K) Download Attachment
Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape
Hi

> On 12.10.2020, at 20:47, Thiede, Christoph <[hidden email]> wrote:
>
> For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.


No, thats not what the link says:

Q: "1. How to deal with unauthorized requests?

I'm intending to respond to requests with the following codes:

        • Is the resource open and found? 200 OK
        • Do you need to be authenticated to access the resources? 401 Unauthorized
        • Don't you have access to a category of resources? 403 Forbidden
        • Do you have access to a category of resources, but not to this specific resource? 404 Not Found to prevent people from getting to know the existance of a resource they do not have access to.
        • Doesn't the resource exist? 404 Not Found

"
A: "How to deal with unauthorized requests?

The way you described it is pretty much the recommended way for a RESTful service. As far as I can see there is absolutely nothing wrong with that."


That means: "Do you need to be authenticated to access the resources? 401 Unauthorized"

I do not support preemtive authentication, especially in non-SSL circumstances.


=-=-=-=


It is also hard to see the differences because you reformatted them method :/

Best regards
        -Tobias

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tobias!


I don't understand your objection. As the link says, some REST APIs will return a 404 Not Found error rather than a 401 Unauthorized error to make sure that unauthorized users cannot know the list of private objects that exist for the owning user.

For example, the GitHub API will respond with a 404 error if you try to access the following URL without authorization: https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master

However, I sweat that this repository actually exists, and I will get a 202 OK response if I authenticate before!

It's the same pattern as not telling you whether your email or your password is wrong on any web service to protect the knowledge about registered users from possible attackers.


I do not support preemtive authentication, especially in non-SSL circumstances.


Why not? However, I would not dislike a warning being raised every time you try to authenticate if the scheme is not https.

It is also hard to see the differences because you reformatted them method :/

That's truly a problem ... I found the original method version suboptimal to read. Is the prettyDiffs an option for you? Otherwise, I can split up this inbox version.

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
Gesendet: Montag, 12. Oktober 2020 21:06:21
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi

> On 12.10.2020, at 20:47, Thiede, Christoph <[hidden email]> wrote:
>
> For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.


No, thats not what the link says:

Q: "1. How to deal with unauthorized requests?

I'm intending to respond to requests with the following codes:

        • Is the resource open and found? 200 OK
        • Do you need to be authenticated to access the resources? 401 Unauthorized
        • Don't you have access to a category of resources? 403 Forbidden
        • Do you have access to a category of resources, but not to this specific resource? 404 Not Found to prevent people from getting to know the existance of a resource they do not have access to.
        • Doesn't the resource exist? 404 Not Found

"
A: "How to deal with unauthorized requests?

The way you described it is pretty much the recommended way for a RESTful service. As far as I can see there is absolutely nothing wrong with that."


That means: "Do you need to be authenticated to access the resources? 401 Unauthorized"

I do not support preemtive authentication, especially in non-SSL circumstances.


=-=-=-=


It is also hard to see the differences because you reformatted them method :/

Best regards
        -Tobias



Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape
Hey Christoph

> On 12.10.2020, at 21:25, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias!
>
> I don't understand your objection. As the link says, some REST APIs will return a 404 Not Found error rather than a 401 Unauthorized error to make sure that unauthorized users cannot know the list of private objects that exist for the owning user.

You're conflating authentication and authorization.
(That said, I think the header name is a misnomer in HTTP…, but I might be wrong)


> For example, the GitHub API will respond with a 404 error if you try to access the following URL without authorization: https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master
> However, I sweat that this repository actually exists, and I will get a 202 OK response if I authenticate before!
> It's the same pattern as not telling you whether your email or your password is wrong on any web service to protect the knowledge about registered users from possible attackers.


No its different.
Look: if you go to
        https://api.github.com/
One of the json entries is:

        authorizations_url: "https://api.github.com/authorizations"

If you go to

        "https://api.github.com/authorizations"

You'll get the 401.

From then on, all HTTP Clients will continue with authorization headers.

No preauth required.

If I'm authenticated that way, I as krono will get the 404 on your url, but you as LinqLover will get 200.


And yes, Github essentially says "F$#^#$ Standards":

https://docs.github.com/en/free-pro-team@latest/rest/overview/other-authentication-methods
        The API supports Basic Authentication as defined in RFC2617 with a few slight differences. The main difference is that the RFC requires unauthenticated requests to be answered with 401 Unauthorized responses. In many places, this would disclose the existence of user data. Instead, the GitHub API responds with 404 Not Found. This may cause problems for HTTP libraries that assume a 401 Unauthorized response. The solution is to manually craft the Authorization header.

And I hate it.
They're trading privacy for security. That's a dick move.

NB: For 2FA, you need the https://api.github.com/authorizations endpoint anyways: https://docs.github.com/en/free-pro-team@latest/rest/overview/other-authentication-methods#working-with-two-factor-authentication

So maybe, first going to "https://api.github.com/authorizations" is a good idea in the first place?

:D

It's not your fault, I understand you just want to get things going.
GitHub is a too big player and we have to scratch standards for that, sigh…


> > I do not support preemtive authentication, especially in non-SSL circumstances.
>
> Why not?

It leaks credentials unnecessarily.

> However, I would not dislike a warning being raised every time you try to authenticate if the scheme is not https.

WebClient is pretty lean for a HTTP client, I'd rather have it continue that way.

That said, you can simply do

WebClient
        httpGet: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master'
        do: [:req | req headerAt: 'Authorization' put: 'WHATEVER-NECESSARY']

And you have your pre-auth.

>
> > It is also hard to see the differences because you reformatted them method :/
>
> That's truly a problem ... I found the original method version suboptimal to read. Is the prettyDiffs an option for you? Otherwise, I can split up this inbox version.

I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.


Best regards
        -Tobias

>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Montag, 12. Oktober 2020 21:06:21
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
>  
> Hi
>
> > On 12.10.2020, at 20:47, Thiede, Christoph <[hidden email]> wrote:
> >
> > For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.
>
>
> No, thats not what the link says:
>
> Q: "1. How to deal with unauthorized requests?
>
> I'm intending to respond to requests with the following codes:
>
>         • Is the resource open and found? 200 OK
>         • Do you need to be authenticated to access the resources? 401 Unauthorized
>         • Don't you have access to a category of resources? 403 Forbidden
>         • Do you have access to a category of resources, but not to this specific resource? 404 Not Found to prevent people from getting to know the existance of a resource they do not have access to.
>         • Doesn't the resource exist? 404 Not Found
>
> "
> A: "How to deal with unauthorized requests?
>
> The way you described it is pretty much the recommended way for a RESTful service. As far as I can see there is absolutely nothing wrong with that."
>
>
> That means: "Do you need to be authenticated to access the resources? 401 Unauthorized"
>
> I do not support preemtive authentication, especially in non-SSL circumstances.
>
>
> =-=-=-=
>
>
> It is also hard to see the differences because you reformatted them method :/
>
> Best regards
>         -Tobias



Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tobias,


okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:

First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.

Is this correct?


However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello: https://github.com/Metacello/metacello/pull/534

IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.


> Why not?

> It leaks credentials unnecessarily.

Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art. And speed is another point, given the fact that internet connections in Squeak are really slow ...
Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway. If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...

I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.

Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
Gesendet: Montag, 12. Oktober 2020 21:55:39
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hey Christoph

> On 12.10.2020, at 21:25, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias!
>
> I don't understand your objection. As the link says, some REST APIs will return a 404 Not Found error rather than a 401 Unauthorized error to make sure that unauthorized users cannot know the list of private objects that exist for the owning user.

You're conflating authentication and authorization.
(That said, I think the header name is a misnomer in HTTP…, but I might be wrong)


> For example, the GitHub API will respond with a 404 error if you try to access the following URL without authorization: https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master
> However, I sweat that this repository actually exists, and I will get a 202 OK response if I authenticate before!
> It's the same pattern as not telling you whether your email or your password is wrong on any web service to protect the knowledge about registered users from possible attackers.


No its different.
Look: if you go to
        https://api.github.com/
One of the json entries is:

        authorizations_url: "https://api.github.com/authorizations"

If you go to

        "https://api.github.com/authorizations"

You'll get the 401.

From then on, all HTTP Clients will continue with authorization headers.

No preauth required.

If I'm authenticated that way, I as krono will get the 404 on your url, but you as LinqLover will get 200.


And yes, Github essentially says "F$#^#$ Standards":

https://docs.github.com/en/free-pro-team@latest/rest/overview/other-authentication-methods
        The API supports Basic Authentication as defined in RFC2617 with a few slight differences. The main difference is that the RFC requires unauthenticated requests to be answered with 401 Unauthorized responses. In many places, this would disclose the existence of user data. Instead, the GitHub API responds with 404 Not Found. This may cause problems for HTTP libraries that assume a 401 Unauthorized response. The solution is to manually craft the Authorization header.

And I hate it.
They're trading privacy for security. That's a dick move.

NB: For 2FA, you need the https://api.github.com/authorizations endpoint anyways: https://docs.github.com/en/free-pro-team@latest/rest/overview/other-authentication-methods#working-with-two-factor-authentication

So maybe, first going to "https://api.github.com/authorizations" is a good idea in the first place?

:D

It's not your fault, I understand you just want to get things going.
GitHub is a too big player and we have to scratch standards for that, sigh…


> > I do not support preemtive authentication, especially in non-SSL circumstances.
>
> Why not?

It leaks credentials unnecessarily.

> However, I would not dislike a warning being raised every time you try to authenticate if the scheme is not https.

WebClient is pretty lean for a HTTP client, I'd rather have it continue that way.

That said, you can simply do

WebClient
        httpGet: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master'
        do: [:req | req headerAt: 'Authorization' put: 'WHATEVER-NECESSARY']

And you have your pre-auth.

>
> > It is also hard to see the differences because you reformatted them method :/
>
> That's truly a problem ... I found the original method version suboptimal to read. Is the prettyDiffs an option for you? Otherwise, I can split up this inbox version.

I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.


Best regards
        -Tobias

>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Montag, 12. Oktober 2020 21:06:21
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 12.10.2020, at 20:47, Thiede, Christoph <[hidden email]> wrote:
> >
> > For example, many modern REST APIs use to return an error 404 if an attempt is made to access a private resource without authenticating before [1] which currrently makes it impossible to authenticate to these APIs using the WebClient.
>
>
> No, thats not what the link says:
>
> Q: "1. How to deal with unauthorized requests?
>
> I'm intending to respond to requests with the following codes:
>
>         • Is the resource open and found? 200 OK
>         • Do you need to be authenticated to access the resources? 401 Unauthorized
>         • Don't you have access to a category of resources? 403 Forbidden
>         • Do you have access to a category of resources, but not to this specific resource? 404 Not Found to prevent people from getting to know the existance of a resource they do not have access to.
>         • Doesn't the resource exist? 404 Not Found
>
> "
> A: "How to deal with unauthorized requests?
>
> The way you described it is pretty much the recommended way for a RESTful service. As far as I can see there is absolutely nothing wrong with that."
>
>
> That means: "Do you need to be authenticated to access the resources? 401 Unauthorized"
>
> I do not support preemtive authentication, especially in non-SSL circumstances.
>
>
> =-=-=-=
>
>
> It is also hard to see the differences because you reformatted them method :/
>
> Best regards
>         -Tobias





Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape
Hi

> On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> Is this correct?

Yes. However, the second one is the one GitHub "recommends"


>
> However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello: https://github.com/Metacello/metacello/pull/534
> IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.

There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
If you cannot know beforehand, no single-request stuff is gonna help. No dice.
 

>
> > > Why not?
> >
> > It leaks credentials unnecessarily.
>
> Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.

No thats wrong. Curl will only send auth data if you provide it.

Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually


The sequence is split manually:
```
$ curl https://api.github.com/repos/krono/debug/zipball/master
{
  "message": "Not Found",
  "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
}
# Well, I'm left to guess. Maybe exists, maybe not.
$ curl -u krono https://api.github.com/repos/krono/debug/zipball/master

```
(In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)

The point is, you instruct Curl to _provide credentials unconditionally_.
The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.

Look:

```
$ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
$
# Well, no response?
$ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 401
< Date: Tue, 13 Oct 2020 07:43:04 GMT
< Server: nginx/1.14.2
< Content-Length: 0
< WWW-Authenticate: Basic realm="SwaSource - XP aware"
<
* Connection #0 to host www.hpi.uni-potsdam.de left intact
```

Thats the 401 we're looking for. We have found that the resource needs authentication.

Sidenote: Curl can do the roundtrip (man curl, search anyauth):

```
$ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
Enter host password for user 'topa':
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 401
< Date: Tue, 13 Oct 2020 07:46:05 GMT
< Server: nginx/1.14.2
< Content-Length: 0
< WWW-Authenticate: Basic realm="SwaSource - XP aware"
<
* Connection #0 to host www.hpi.uni-potsdam.de left intact
* Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
* Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
* Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
* Server auth using Basic with user 'topa'
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> Authorization: Basic *******************
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200
< Date: Tue, 13 Oct 2020 07:46:05 GMT
< Server: nginx/1.14.2
< Content-Type: text/html
< Content-Length: 15131
< Vary: Accept-Encoding
```

And in this case it does _not_ send auth in the first request but only in the second.

Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:

```
$ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
Enter host password for user 'topa':
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200
< Date: Tue, 13 Oct 2020 07:46:56 GMT
< Server: nginx/1.14.2
< Content-Type: text/html
< Content-Length: 75860
< Vary: Accept-Encoding
```





> And speed is another point, given the fact that internet connections in Squeak are really slow ...
> Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.

So you always know beforehand which resources need authentication?
Neat, I dont :D

> If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...

What happens here is that we're bending over backwards because github decided to be a bad player.

I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.

If you want preemtive auth, do it with WebClient httpGet:do:.



>
> > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
>
> Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.

I personally find prettydiffs useless, but that's just me.

Best regards
        -Tobias

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tobias,


sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...



No thats wrong. Curl will only send auth data if you provide it.


Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually

The point is, you instruct Curl to _provide credentials unconditionally_.

I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man). If I do
WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.

Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:

ZnClient new
url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
username: 'LinqLover' password: 'mytoken';
downloadTo: 'foo.zip'

So you always know beforehand which resources need authentication?
> Neat, I dont :D

I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?


In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run. It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
Do you dislike this feature in general, even when turned off by default? I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
Gesendet: Dienstag, 13. Oktober 2020 10:04:23
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi

> On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> Is this correct?

Yes. However, the second one is the one GitHub "recommends"


>
> However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello: https://github.com/Metacello/metacello/pull/534
> IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.

There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
If you cannot know beforehand, no single-request stuff is gonna help. No dice.
 

>
> > > Why not?
> >
> > It leaks credentials unnecessarily.
>
> Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.

No thats wrong. Curl will only send auth data if you provide it.

Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually


The sequence is split manually:
```
$ curl https://api.github.com/repos/krono/debug/zipball/master
{
  "message": "Not Found",
  "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
}
# Well, I'm left to guess. Maybe exists, maybe not.
$ curl -u krono https://api.github.com/repos/krono/debug/zipball/master

```
(In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)

The point is, you instruct Curl to _provide credentials unconditionally_.
The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.

Look:

```
$ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
$
# Well, no response?
$ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 401
< Date: Tue, 13 Oct 2020 07:43:04 GMT
< Server: nginx/1.14.2
< Content-Length: 0
< WWW-Authenticate: Basic realm="SwaSource - XP aware"
<
* Connection #0 to host www.hpi.uni-potsdam.de left intact
```

Thats the 401 we're looking for. We have found that the resource needs authentication.

Sidenote: Curl can do the roundtrip (man curl, search anyauth):

```
$ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
Enter host password for user 'topa':
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 401
< Date: Tue, 13 Oct 2020 07:46:05 GMT
< Server: nginx/1.14.2
< Content-Length: 0
< WWW-Authenticate: Basic realm="SwaSource - XP aware"
<
* Connection #0 to host www.hpi.uni-potsdam.de left intact
* Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
* Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
* Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
* Server auth using Basic with user 'topa'
> GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> Authorization: Basic *******************
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200
< Date: Tue, 13 Oct 2020 07:46:05 GMT
< Server: nginx/1.14.2
< Content-Type: text/html
< Content-Length: 15131
< Vary: Accept-Encoding
```

And in this case it does _not_ send auth in the first request but only in the second.

Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:

```
$ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
Enter host password for user 'topa':
*   Trying 2001:638:807:204::8d59:e178...
* TCP_NODELAY set
* Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> Host: www.hpi.uni-potsdam.de
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200
< Date: Tue, 13 Oct 2020 07:46:56 GMT
< Server: nginx/1.14.2
< Content-Type: text/html
< Content-Length: 75860
< Vary: Accept-Encoding
```





> And speed is another point, given the fact that internet connections in Squeak are really slow ...
> Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.

So you always know beforehand which resources need authentication?
Neat, I dont :D

> If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...

What happens here is that we're bending over backwards because github decided to be a bad player.

I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.

If you want preemtive auth, do it with WebClient httpGet:do:.



>
> > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
>
> Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.

I personally find prettydiffs useless, but that's just me.

Best regards
        -Tobias



Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape
Hi

> On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
>
>
> > No thats wrong. Curl will only send auth data if you provide it.
>
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
> > The point is, you instruct Curl to _provide credentials unconditionally_.
>
> I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).

Yes, that means _unconditionally_, even if the source does not need it.
This is called an information leak.


> If I do
> WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
>

Nope.

> Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
>
> ZnClient new
> url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> username: 'LinqLover' password: 'mytoken';
> downloadTo: 'foo.zip'
>
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
>
> I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?

No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.

> If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?

Because preemtive auth is wrong.

Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
Yes, it is "one request more". Yes, it is right thing to do.

Just because it is convenient and just because people are doing it, it does not mean it is good.
In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".

>
>
> In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.

Except when you use  "https://api.github.com/authorizations" first, which 401s.


> It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.

> Do you dislike this feature in general, even when turned off by default?

Yes. WebClient is not just an API-consumer. It ought to be safe.
Otherwise, encode it in the URL?

> I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.

I don't know.

I think there has been too little input from others here.
Don't rely on just my "judgement" ;)

Best regards
        -Tobias


>
> Best,
> Christoph
>
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
>  
> Hi
>
> > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > Is this correct?
>
> Yes. However, the second one is the one GitHub "recommends"
>
>
> >
> > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
>
> There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> If you cannot know beforehand, no single-request stuff is gonna help. No dice.
>  
>
> >
> > > > Why not?
> > >
> > > It leaks credentials unnecessarily.
> >
> > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
>
> No thats wrong. Curl will only send auth data if you provide it.
>
> Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
>
> The sequence is split manually:
> ```
> $ curl https://api.github.com/repos/krono/debug/zipball/master
> {
>   "message": "Not Found",
>   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> }
> # Well, I'm left to guess. Maybe exists, maybe not.
> $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
>
> ```
> (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
>
> The point is, you instruct Curl to _provide credentials unconditionally_.
> The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
>
> Look:
>
> ```
> $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> $
> # Well, no response?
> $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:43:04 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> ```
>
> Thats the 401 we're looking for. We have found that the resource needs authentication.
>
> Sidenote: Curl can do the roundtrip (man curl, search anyauth):
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> * Server auth using Basic with user 'topa'
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > Authorization: Basic *******************
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 15131
> < Vary: Accept-Encoding
> ```
>
> And in this case it does _not_ send auth in the first request but only in the second.
>
> Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:56 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 75860
> < Vary: Accept-Encoding
> ```
>
>
>
>
>
> > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
>
> So you always know beforehand which resources need authentication?
> Neat, I dont :D
>
> > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
>
> What happens here is that we're bending over backwards because github decided to be a bad player.
>
> I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
>
> If you want preemtive auth, do it with WebClient httpGet:do:.
>
>
>
> >
> > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> >
> > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
>
> I personally find prettydiffs useless, but that's just me.
>
> Best regards
>         -Tobias



Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tobias,


The API should stick to the HTTP RFCs and use 401 to say: You need authentication.


Hm, but this would be an information leak on the server-side. :-)


Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.


Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

Except when you use  "https://api.github.com/authorizations" first, which 401s.

True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Otherwise, encode it in the URL?

Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

Don't rely on just my "judgement" ;)

Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
Gesendet: Sonntag, 25. Oktober 2020 22:19:49
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi

> On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
>
>
> > No thats wrong. Curl will only send auth data if you provide it.
>
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
> > The point is, you instruct Curl to _provide credentials unconditionally_.
>
> I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).

Yes, that means _unconditionally_, even if the source does not need it.
This is called an information leak.


> If I do
> WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
>

Nope.

> Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
>
> ZnClient new
> url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> username: 'LinqLover' password: 'mytoken';
> downloadTo: 'foo.zip'
>
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
>
> I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?

No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.

> If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?

Because preemtive auth is wrong.

Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
Yes, it is "one request more". Yes, it is right thing to do.

Just because it is convenient and just because people are doing it, it does not mean it is good.
In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".

>
>
> In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.

Except when you use  "https://api.github.com/authorizations" first, which 401s.


> It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.

> Do you dislike this feature in general, even when turned off by default?

Yes. WebClient is not just an API-consumer. It ought to be safe.
Otherwise, encode it in the URL?

> I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.

I don't know.

I think there has been too little input from others here.
Don't rely on just my "judgement" ;)

Best regards
        -Tobias


>
> Best,
> Christoph
>
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > Is this correct?
>
> Yes. However, the second one is the one GitHub "recommends"
>
>
> >
> > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
>
> There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> If you cannot know beforehand, no single-request stuff is gonna help. No dice.

>
> >
> > > > Why not?
> > >
> > > It leaks credentials unnecessarily.
> >
> > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
>
> No thats wrong. Curl will only send auth data if you provide it.
>
> Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
>
> The sequence is split manually:
> ```
> $ curl https://api.github.com/repos/krono/debug/zipball/master
> {
>   "message": "Not Found",
>   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> }
> # Well, I'm left to guess. Maybe exists, maybe not.
> $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
>
> ```
> (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
>
> The point is, you instruct Curl to _provide credentials unconditionally_.
> The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
>
> Look:
>
> ```
> $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> $
> # Well, no response?
> $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:43:04 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> ```
>
> Thats the 401 we're looking for. We have found that the resource needs authentication.
>
> Sidenote: Curl can do the roundtrip (man curl, search anyauth):
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> * Server auth using Basic with user 'topa'
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > Authorization: Basic *******************
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 15131
> < Vary: Accept-Encoding
> ```
>
> And in this case it does _not_ send auth in the first request but only in the second.
>
> Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:56 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 75860
> < Vary: Accept-Encoding
> ```
>
>
>
>
>
> > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
>
> So you always know beforehand which resources need authentication?
> Neat, I dont :D
>
> > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
>
> What happens here is that we're bending over backwards because github decided to be a bad player.
>
> I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
>
> If you want preemtive auth, do it with WebClient httpGet:do:.
>
>
>
> >
> > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> >
> > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
>
> I personally find prettydiffs useless, but that's just me.
>
> Best regards
>         -Tobias





Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tom Beckmann
Hey everyone,

to sum up my understanding: the goal of this change is to change this pattern

WebClient new
    do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]

to

WebClient new
    preAuthenticateMethod: #basic;
    user: 'myuser' password: 'mypasswd';

I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

So the question boils down to whether we want to explicitly support the pattern or require users to have some low-level understanding of the Authorization header to be able to produce it themselves.
Most HTTP client libraries that I looked at (httpie, curl, axios, Zinc) do provide this convenience to the user, typically even making it even "easier" by eagerly assuming Basic auth as soon as a username and password are provided.

Now for my personal take: I think Christoph's proposed change is minimally invasive (the diff is still not perfectly cleaned up, potentially making it appear larger than it is) and adds value for the user for a (common?) pattern, while making it very explicit what happens with their credentials and when. So it's a +0.5 from me, and it would be a +1 if we could get a "perfectly clean" diff for reviewing the code (I only looked at it from the conceptual POV for now). Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)

Best,
Tom

On Sun, Oct 25, 2020 at 11:00 PM Thiede, Christoph <[hidden email]> wrote:

Hi Tobias,


The API should stick to the HTTP RFCs and use 401 to say: You need authentication.


Hm, but this would be an information leak on the server-side. :-)


Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.


Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

Except when you use  "https://api.github.com/authorizations" first, which 401s.

True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Otherwise, encode it in the URL?

Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

Don't rely on just my "judgement" ;)

Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)

Best,
Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
Gesendet: Sonntag, 25. Oktober 2020 22:19:49
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi

> On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
>
>
> > No thats wrong. Curl will only send auth data if you provide it.
>
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
> > The point is, you instruct Curl to _provide credentials unconditionally_.
>
> I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).

Yes, that means _unconditionally_, even if the source does not need it.
This is called an information leak.


> If I do
> WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
>

Nope.

> Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
>
> ZnClient new
> url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> username: 'LinqLover' password: 'mytoken';
> downloadTo: 'foo.zip'
>
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
>
> I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?

No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.

> If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?

Because preemtive auth is wrong.

Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
Yes, it is "one request more". Yes, it is right thing to do.

Just because it is convenient and just because people are doing it, it does not mean it is good.
In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".

>
>
> In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.

Except when you use  "https://api.github.com/authorizations" first, which 401s.


> It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.

> Do you dislike this feature in general, even when turned off by default?

Yes. WebClient is not just an API-consumer. It ought to be safe.
Otherwise, encode it in the URL?

> I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.

I don't know.

I think there has been too little input from others here.
Don't rely on just my "judgement" ;)

Best regards
        -Tobias


>
> Best,
> Christoph
>
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > Is this correct?
>
> Yes. However, the second one is the one GitHub "recommends"
>
>
> >
> > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
>
> There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> If you cannot know beforehand, no single-request stuff is gonna help. No dice.

>
> >
> > > > Why not?
> > >
> > > It leaks credentials unnecessarily.
> >
> > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
>
> No thats wrong. Curl will only send auth data if you provide it.
>
> Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
>
>
> The sequence is split manually:
> ```
> $ curl https://api.github.com/repos/krono/debug/zipball/master
> {
>   "message": "Not Found",
>   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> }
> # Well, I'm left to guess. Maybe exists, maybe not.
> $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
>
> ```
> (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
>
> The point is, you instruct Curl to _provide credentials unconditionally_.
> The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
>
> Look:
>
> ```
> $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> $
> # Well, no response?
> $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:43:04 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> ```
>
> Thats the 401 we're looking for. We have found that the resource needs authentication.
>
> Sidenote: Curl can do the roundtrip (man curl, search anyauth):
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 401
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Length: 0
> < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> <
> * Connection #0 to host www.hpi.uni-potsdam.de left intact
> * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> * Server auth using Basic with user 'topa'
> > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > Authorization: Basic *******************
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:05 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 15131
> < Vary: Accept-Encoding
> ```
>
> And in this case it does _not_ send auth in the first request but only in the second.
>
> Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
>
> ```
> $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> Enter host password for user 'topa':
> *   Trying 2001:638:807:204::8d59:e178...
> * TCP_NODELAY set
> * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > Host: www.hpi.uni-potsdam.de
> > User-Agent: curl/7.54.0
> > Accept: */*
> >
> < HTTP/1.1 200
> < Date: Tue, 13 Oct 2020 07:46:56 GMT
> < Server: nginx/1.14.2
> < Content-Type: text/html
> < Content-Length: 75860
> < Vary: Accept-Encoding
> ```
>
>
>
>
>
> > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
>
> So you always know beforehand which resources need authentication?
> Neat, I dont :D
>
> > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
>
> What happens here is that we're bending over backwards because github decided to be a bad player.
>
> I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
>
> If you want preemtive auth, do it with WebClient httpGet:do:.
>
>
>
> >
> > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> >
> > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
>
> I personally find prettydiffs useless, but that's just me.
>
> Best regards
>         -Tobias






Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape

> On 26.10.2020, at 08:39, Tom Beckmann <[hidden email]> wrote:
>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it wont work for me anyways, Since I have 2FA.
:D

And It won't work in 4 weeks time anyways:

https://docs.github.com/en/free-pro-team@latest/rest/overview/other-authentication-methods

"""
Via username and password

Deprecation Notice: GitHub will discontinue password authentication to the API. You must now authenticate to the GitHub API with an API token, such as an OAuth access token, GitHub App installation access token, or personal access token, depending on what you need to do with the token. Password authentication to the API will be removed on November 13, 2020. For more information, including scheduled brownouts, see the blog post.
"""


Gah :D


>
> So the question boils down to whether we want to explicitly support the pattern or require users to have some low-level understanding of the Authorization header to be able to produce it themselves.
> Most HTTP client libraries that I looked at (httpie, curl, axios, Zinc) do provide this convenience to the user, typically even making it even "easier" by eagerly assuming  Basic auth as soon as a username and password are provided.
>
> Now for my personal take: I think Christoph's proposed change is minimally invasive (the diff is still not perfectly cleaned up, potentially making it appear larger than it is) and adds value for the user for a (common?) pattern, while making it very explicit what happens with their credentials and when. So it's a +0.5 from me, and it would be a +1 if we could get a "perfectly clean" diff for reviewing the code (I only looked at it from the conceptual POV for now). Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)
>
> Best,
> Tom
>
> On Sun, Oct 25, 2020 at 11:00 PM Thiede, Christoph <[hidden email]> wrote:
> Hi Tobias,
>
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
>
>
> Hm, but this would be an information leak on the server-side. :-)
>
>
>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.
>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.
>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...
>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)
>
> Best,
> Christoph
>
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
>  
> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> >  
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> >  
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias
>
>
>
>
>



Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tobias Pape
In reply to this post by Christoph Thiede

> On 25.10.2020, at 23:00, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for PW on 401, but sent it anyway if given in the URL.


=-=-=
Think of it that way. You do not present your passport at every door you ring a bell, but only those you're asked for it. ;)



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias

>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
>  
> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> >  
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> >  
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias



Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tom Beckmann
Hi Tobias,

> On 26.10.2020, at 08:39, Tom Beckmann <[hidden email]> wrote:

>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it wont work for me anyways, Since I have 2FA.
:D

Even with 2FA, `'Basic ', 'myuser:myPersonalAccessToken' base64Encoded` would currently give you the likely easiest way to access the endpoint using most clients. The way it reads [1] you are correct in that even this will no longer work and we would need to switch to `'token myPersonalAccessToken'` as the Authorization header. We should investigate soon if this would also affect the GitBrowser.

Best,
Tom



On Mon, Oct 26, 2020 at 10:37 AM Tobias Pape <[hidden email]> wrote:

> On 25.10.2020, at 23:00, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for PW on 401, but sent it anyway if given in the URL.


=-=-=
Think of it that way. You do not present your passport at every door you ring a bell, but only those you're asked for it. ;)



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias
>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> > 
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> > 
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias





Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tom, Hi Tobias,


Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)


Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

And It won't work in 4 weeks time anyways

You can still use an access token as a password - actually, I never trust any Squeak image with my real GitHub password because Squeak does not store the password protected.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

Still an edge case. Does this read nice to you?

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Tom Beckmann <[hidden email]>
Gesendet: Montag, 26. Oktober 2020 10:59:24
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi Tobias,

> On 26.10.2020, at 08:39, Tom Beckmann <[hidden email]> wrote:
>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it wont work for me anyways, Since I have 2FA.
:D

Even with 2FA, `'Basic ', 'myuser:myPersonalAccessToken' base64Encoded` would currently give you the likely easiest way to access the endpoint using most clients. The way it reads [1] you are correct in that even this will no longer work and we would need to switch to `'token myPersonalAccessToken'` as the Authorization header. We should investigate soon if this would also affect the GitBrowser.

Best,
Tom



On Mon, Oct 26, 2020 at 10:37 AM Tobias Pape <[hidden email]> wrote:

> On 25.10.2020, at 23:00, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for PW on 401, but sent it anyway if given in the URL.


=-=-=
Think of it that way. You do not present your passport at every door you ring a bell, but only those you're asked for it. ;)



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias
>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> > 
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> > 
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias





Carpe Squeak!
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Tom Beckmann
Hi Christoph,

> > Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)
>
> Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

I assume it's this, right? [1] There are still a number of changed empty lines and e.g. the change from `1 to: do: ` to use `detect:`, which are changes I agree with, but also make the diff unnecessarily large. Hope I'm not mistaken about these not being integral for your proposed change :)

Best,
Tom


On Mon, Oct 26, 2020 at 2:25 PM Thiede, Christoph <[hidden email]> wrote:

Hi Tom, Hi Tobias,


Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)


Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

And It won't work in 4 weeks time anyways

You can still use an access token as a password - actually, I never trust any Squeak image with my real GitHub password because Squeak does not store the password protected.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

Still an edge case. Does this read nice to you?

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Tom Beckmann <[hidden email]>
Gesendet: Montag, 26. Oktober 2020 10:59:24
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi Tobias,

> On 26.10.2020, at 08:39, Tom Beckmann <[hidden email]> wrote:
>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it wont work for me anyways, Since I have 2FA.
:D

Even with 2FA, `'Basic ', 'myuser:myPersonalAccessToken' base64Encoded` would currently give you the likely easiest way to access the endpoint using most clients. The way it reads [1] you are correct in that even this will no longer work and we would need to switch to `'token myPersonalAccessToken'` as the Authorization header. We should investigate soon if this would also affect the GitBrowser.

Best,
Tom



On Mon, Oct 26, 2020 at 10:37 AM Tobias Pape <[hidden email]> wrote:

> On 25.10.2020, at 23:00, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for PW on 401, but sent it anyway if given in the URL.


=-=-=
Think of it that way. You do not present your passport at every door you ring a bell, but only those you're asked for it. ;)



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias
>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> > 
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> > 
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias






Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: WebClient-Core-ct.126.mcz

Christoph Thiede

Hi Tom,


ouch, these "empty changes" are not displayed in Monticello, that's why I missed them ... The #detect: refactoring could still be left out, but it's an integral change to send the new message #authProcess:from:header:params: at this place.


Hm ... producing minimal diffs seems to be really challenging :-)


Best,

Christoph


Von: Squeak-dev <[hidden email]> im Auftrag von Tom Beckmann <[hidden email]>
Gesendet: Montag, 26. Oktober 2020 14:33:31
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi Christoph,

> > Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)
>
> Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

I assume it's this, right? [1] There are still a number of changed empty lines and e.g. the change from `1 to: do: ` to use `detect:`, which are changes I agree with, but also make the diff unnecessarily large. Hope I'm not mistaken about these not being integral for your proposed change :)

Best,
Tom


On Mon, Oct 26, 2020 at 2:25 PM Thiede, Christoph <[hidden email]> wrote:

Hi Tom, Hi Tobias,


Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)


Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

And It won't work in 4 weeks time anyways

You can still use an access token as a password - actually, I never trust any Squeak image with my real GitHub password because Squeak does not store the password protected.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

Still an edge case. Does this read nice to you?

Best,
Christoph

Von: Squeak-dev <[hidden email]> im Auftrag von Tom Beckmann <[hidden email]>
Gesendet: Montag, 26. Oktober 2020 10:59:24
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
 
Hi Tobias,

> On 26.10.2020, at 08:39, Tom Beckmann <[hidden email]> wrote:
>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it wont work for me anyways, Since I have 2FA.
:D

Even with 2FA, `'Basic ', 'myuser:myPersonalAccessToken' base64Encoded` would currently give you the likely easiest way to access the endpoint using most clients. The way it reads [1] you are correct in that even this will no longer work and we would need to switch to `'token myPersonalAccessToken'` as the Authorization header. We should investigate soon if this would also affect the GitBrowser.

Best,
Tom



On Mon, Oct 26, 2020 at 10:37 AM Tobias Pape <[hidden email]> wrote:

> On 25.10.2020, at 23:00, Thiede, Christoph <[hidden email]> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@...? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for PW on 401, but sent it anyway if given in the URL.


=-=-=
Think of it that way. You do not present your passport at every door you ring a bell, but only those you're asked for it. ;)



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias
>
> Best,
> Christoph
> Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> Gesendet: Sonntag, 25. Oktober 2020 22:19:49
> An: The general-purpose Squeak developers list
> Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <[hidden email]> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No thats wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication"' (cited from the man).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same for me. Otherwise, #username: and #password: better might be renamed into #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I dont :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemtive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is right thing to do.
>
> Just because it is convenient and just because people are doing it, it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare just second to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either, the header must contain the password even in the first run.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to some portion on unhandsome spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even requiring to make this an opt-out feature, this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > Von: Squeak-dev <[hidden email]> im Auftrag von Tobias Pape <[hidden email]>
> > Gesendet: Dienstag, 13. Oktober 2020 10:04:23
> > An: The general-purpose Squeak developers list
> > Betreff: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> > 
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <[hidden email]> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack of the RESTful-typical simplicity of "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello:https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> > 
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No thats wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which makes single-request REST to _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > Thats the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I dont :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because github decided to be a bad player.
> >
> > I mean, on most sited you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitely_ click on a login link.
> >
> > If you want preemtive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider to use prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias






Carpe Squeak!