Handling of $+ in URLs

Sven Van Caekenberghe-2
Hi,

Johan Brichau reported an issue a couple of days ago concerning the handling of $+ in ZnUrl (Pharo 3's URL class) and in Seaside's WAUrl. #bleedingEdge of Zinc HTTP Components fixes the issue, as far as I can see. I want to explain the problem and the solution.

Before October 24 of last year, ZnUrl used a 'better safe than sorry' safe set when doing percent encoding of unsafe characters. However, the URL spec defines different allowed characters per URL part. Support for these per-part safe sets was then added to Zinc-Resource-Meta-Core, ZnUrl's package.

Soon after that, a discussion with Jan van de Sandt led to a first small change: since ZnUrl interprets the query part of a URL as key-value pairs, it is necessary to treat $= and $& as unsafe, even though they are not according to the URL spec (which doesn't concern itself with how the query part is interpreted).
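To make the point about $= and $& concrete, here is a minimal workspace sketch. It assumes a current Zinc image; the URL and the value 'a=b&c' are made up for illustration, and the exact printed form is left unstated on purpose.

```smalltalk
| url roundTripped |
"Store a value that itself contains the key-value delimiters."
url := 'http://localhost/test' asZnUrl.
url queryAt: 'q' put: 'a=b&c'.

"Printing has to percent-encode $= and $& inside the value,
 otherwise parsing the printed form would see extra fields."
roundTripped := url printString asZnUrl.

"The value should come back unchanged as one single field."
roundTripped queryAt: 'q'.
```

If $= and $& were left in the safe set, the printed query would be ambiguous and the value could not be recovered as a single field.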

All that time, $+ kept on being interpreted as a space, independent of the safe set. As Johan reported, this conflicted with $+ being a safe character, which eventually led to the functional problem of not being able to enter a + in an input field in Seaside.

Why only in Seaside? Because ZnZincServerAdaptor>>#requestUrlFor: was implemented by printing the interpreted incoming ZnUrl and parsing it again. There, the escaping of $+ disappeared and it became an unintended space.
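The print-and-reparse effect can be simulated in a workspace. This is only a sketch of the mechanism, not the adaptor's actual code: it assumes a Zinc version that has the #decodePercentForQuery: selector listed further down, assumes it is a class-side method of ZnResourceMetaUtils, and uses the made-up value 'a%2Bb'.

```smalltalk
| received reparsed |
"A browser submits a literal plus as %2B; query decoding yields the intended value."
received := ZnResourceMetaUtils decodePercentForQuery: 'a%2Bb'.
"received = 'a+b'"

"Old behaviour, simulated: the interpreted value was written out again with $+
 treated as safe (so unescaped), and the result was parsed a second time."
reparsed := ZnResourceMetaUtils decodePercentForQuery: received.
"reparsed = 'a b', the plus has turned into an unintended space"
```

Decoding an already interpreted value a second time is what loses the plus, which is why building the WAUrl directly from the interpreted parts avoids the problem.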

This situation is now fixed by the following changes.

Changes to ZnPercentEncoder:
- adding a #decodePlusAsSpace boolean option

Changes to ZnResourceMetaUtils:
- #decodePercent: no longer decodes plus as space
- #decodePercentForQuery: does plus as space decoding
- #queryKeyValueSafeSet no longer includes $+
- #parseQueryFrom: now uses #decodePercentForQuery:
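
The difference between the two decoders can be tried in a workspace. This is a minimal sketch, assuming a Zinc version that includes the changes listed above and assuming both selectors are class-side utility methods on ZnResourceMetaUtils; the input strings are made up for illustration.

```smalltalk
"Generic percent decoding: $+ is left alone, only %XX sequences are decoded."
ZnResourceMetaUtils decodePercent: 'foo+bar%20baz'.
"'foo+bar baz'"

"Query decoding: $+ is additionally translated to a space."
ZnResourceMetaUtils decodePercentForQuery: 'foo+bar%20baz'.
"'foo bar baz'"

"An encoded plus survives query decoding, since %2B decodes to $+ itself."
ZnResourceMetaUtils decodePercentForQuery: 'foo%2Bbar'.
"'foo+bar'"
```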

Added ZnDefaultServerDelegate>>#formTest1: to test simple form submit encoding handling

Modified ZnZincServerAdaptor>>#requestUrlFor: to build a WAUrl explicitly from the interpreted parts of the incoming ZnUrl instead of going via printing and parsing

Added new unit tests
- ZnUrlTests>>#testPlusHandling
- ZnServerTests>>#testFormTest1

I think WAUrl should best be changed as well, but that is not my call.

In code, this summarises the implemented behaviour:

ZnUrlTests>>#testPlusHandling
  "While percent decoding, a + is translated as a space only in the context of
   application/x-www-form-urlencoded get/post requests:
   http://en.wikipedia.org/wiki/Percent-encoding#The_application.2Fx-www-form-urlencoded_type
   ZnUrl interprets its query part as key value pairs where this translation is applicable,
   even though strictly speaking + (and =, &) are plain unreserved characters in the query"
       
  "$+ is not special in the path part of the URL and it remains itself"
  self
    assert: 'http://localhost/foo+bar' asZnUrl firstPathSegment
    equals: 'foo+bar'.
  self
    assert: 'http://localhost/foo+bar' asZnUrl printString
    equals: 'http://localhost/foo+bar'.
  "$+ gets decoded to space in the interpreted query part of the URL,
   and becomes an encoded space if needed"
  self
    assert: ('http://localhost/test?q=foo+bar' asZnUrl queryAt: #q)
    equals: 'foo bar'.
  self
    assert: 'http://localhost/test?q=foo+bar' asZnUrl printString
    equals: 'http://localhost/test?q=foo%20bar'.
  "to pass $+ as $+ in a query, it has to be encoded"
  self
    assert: 'http://localhost/test?q=foo%2Bbar' asZnUrl printString
    equals: 'http://localhost/test?q=foo%2Bbar'

I hope this is a good and correct solution. In any case, it fixes the functional problem that $+ disappeared in WAUrlEncodingFunctionalTest, which I carried over into ZnDefaultServerDelegate>>#formTest1:

Thanks, Johan, for the whole discussion!

Sven

_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: Handling of $+ in URLs

Johan Brichau-2
Sven, once again many thanks for addressing this in such a swift manner!

On 15 Feb 2014, at 23:21, Sven Van Caekenberghe <[hidden email]> wrote:

> [...]


Re: Handling of $+ in URLs

Philippe Marschall
In reply to this post by Sven Van Caekenberghe-2
On Sat, Feb 15, 2014 at 11:21 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> [...]

Wow, thanks. I am looking into this, and WAUrl will likely need to change.

Cheers
Philippe