All,
I'm hitting an interesting issue with XMLHTMLParser and I'm not even sure if this is a bug or intended behaviour. Given an HTML entity in a String, it's resolved or quoted depending on the tag (header or section tag):

doc := XMLHTMLParser parse: '<html><head><title>Ü</title></head><body>Ü</body></html>'.
(doc findElementNamed: 'title') contentString. "'Ü'"
(doc findElementNamed: 'body') contentString. "'Ü'"

In my understanding, and according to https://www.w3.org/TR/html401/struct/global.html#h-7.4.2, entities in the title tag are allowed and should IMHO be resolved.

So both should return 'Ü' in this case.

Any pointers?

CU,

Udo
This should be fixed now. Thanks for the bug report.
> Sent: Wednesday, May 03, 2017 at 4:44 PM
> From: "Udo Schneider" <[hidden email]>
> To: [hidden email]
> Subject: [Pharo-users] XMLHTMLParser Entity Handling oddity
>
> All,
>
> I'm hitting an interesting issue with XMLHTMLParser and I'm not even
> sure if this is a bug or intended behaviour. Given an HTML Entity in a
> String it's resolved or quoted depending on the tag (header or section tag):
>
> doc := XMLHTMLParser parse:
> '<html><head><title>Ü</title></head><body>Ü</body></html>'.
> (doc findElementNamed: 'title') contentString. "'Ü'"
> (doc findElementNamed: 'body') contentString. "'Ü'"
>
> In my understanding and according to
> https://www.w3.org/TR/html401/struct/global.html#h-7.4.2 Entities in the
> title tag are allowed and should IMHO be resolved.
>
> So both should return 'Ü' in this case.
>
> Any pointers?
>
> CU,
>
> Udo
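For anyone wanting to check the fix in their own image, here is a minimal Playground sketch based on the snippet from the report. It only uses the messages that appear in the report (parse:, findElementNamed:, contentString). One assumption to flag: the archive shows the entity already decoded to Ü, so the sketch writes it out as the named reference &Uuml;, which may not be exactly what the original message contained.

```smalltalk
"Assumption: the entity is written as the named reference &Uuml;
 the archived message shows it already decoded, so this is a guess."
| doc titleText bodyText |
doc := XMLHTMLParser parse:
	'<html><head><title>&Uuml;</title></head><body>&Uuml;</body></html>'.

"Read the text content of both elements, exactly as in the report."
titleText := (doc findElementNamed: 'title') contentString.
bodyText := (doc findElementNamed: 'body') contentString.

"If entities are handled the same way in both elements,
 the two strings are equal and this evaluates to true."
titleText = bodyText
```

Printing the last expression gives a quick yes/no answer; inspecting titleText and bodyText separately shows which of the two elements, if any, still holds the unresolved entity.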
Tx monty for the fix!

On Fri, May 5, 2017 at 7:28 PM, monty <[hidden email]> wrote:
> This should be fixed now. Thanks for the bug report.
Perfect! Thank you very very much!
On 05/05/17 at 19:28, monty wrote:
> This should be fixed now. Thanks for the bug report.
Hi guys

It would be supercool to have a chapter on the XML package. Does any of you have the knowledge to do it? I do not have it.

Stef

On Sat, May 6, 2017 at 9:51 AM, Udo Schneider <[hidden email]> wrote:
> Perfect! Thank you very very much!
Yes, but at this point it will probably be a booklet, like the Glorp and Smacc ones you posted.
> Sent: Saturday, May 06, 2017 at 6:19 AM
> From: "Stephane Ducasse" <[hidden email]>
> To: "Any question about pharo is welcome" <[hidden email]>
> Subject: Re: [Pharo-users] XMLHTMLParser Entity Handling oddity
>
> Hi guys
>
> It would be supercool to have a chapter on the XML package.
> Does any of you have the knowledge to do it?
> I do not have it.
>
> Stef
Hi monty

Yes, I would love to have a booklet, and I can help with reading, reviewing, and producing it. Tell me what you want.

What you can also do is start by writing short (2-3 page) blog posts, and we turn them into a booklet.

Stef

On Sun, May 7, 2017 at 1:37 AM, monty <[hidden email]> wrote:
> Yes, but at this point it will probably be a booklet, like the Glorp and Smacc ones you posted.