Hi there,
I have an issue with file upload. It concerns files with accents in their names, for example: hellò.doc.
In the debugger I see that the WAFile filename is hellò7.doc, and it comes from ZnMimePart.

It's clearly an encoding problem, but I don't know where to look.

Can you help me, please?
TIA
Dave
Hi Dave,
On 08 Oct 2014, at 15:25, Dave <[hidden email]> wrote:

> Hi there,
> I've an issue with file upload. It's about files with accents in name for
> example: hellò.doc.
> On the debugger I see the WAFile filename is hellò7.doc and it comes from
> ZnMimePart
>
> It's clearly an encoding problem but I don't know where to look at.

Yes, there seems to be an issue here. It can be observed in WAUploadFunctionalTest.

ZnZincServerAdaptor>>#convertMultipartFileField: creates the WAFile instance by falling back on ZnMimePart>>#fileName and ZnMimePart>>#contents. It seems that these are UTF-8 encoded; this could be fixed easily, I guess. The problem is that I am not 100% sure that this is always the case (i.e. part of the spec) and thus safe to do by default.

Any opinions?

Sven

PS: Zinc-HTTP-SvenVanCaekenberghe.412 contains a new ZnDefaultServerDelegate>>#formTest3: that deals successfully with this issue.

> Can you help me, please?
> TIA
> Dave
>
> --
> View this message in context: http://forum.world.st/File-upload-encoding-issue-tp4783446.html
> Sent from the Seaside General mailing list archive at Nabble.com.

_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
On Wed, Oct 8, 2014 at 5:28 PM, Sven Van Caekenberghe <[hidden email]> wrote:
> ZnZincServerAdaptor>>#convertMultipartFileField: creates the WAFile instance by falling back on ZnMimePart>>#fileName and ZnMimePart>>#contents. It seems that these are UTF-8 encoded, this could be fixed easily I guess.

Do you have information in the request header that suggests UTF-8?

Cheers
Philippe
On 08 Oct 2014, at 18:08, Philippe Marschall <[hidden email]> wrote:

> Do you have information in the request header that suggests UTF-8?

Not that I can see, there is no charset=utf-8 anywhere (but one could assume it is the default):

Cheers
Right, I also can't find where UTF-8 is set. Any idea how I can change the charset?

Cheers
Dave
On 09 Oct 2014, at 08:46, Dave <[hidden email]> wrote:

> Right, I also can't find where utf-8 is set. Any idea on how can I change
> the charset?

Well, there is an accept-charset="utf-8" in the form, but it does not appear in the submitted form (I only checked one browser). Like I said, I need an informed opinion to help me make a decision here.

As a quick workaround, you can convert the strings you get using

    (GRCodec forEncoding: 'utf-8') decode: 'your string'

I will keep this on my todo list. I hope to come up with a better solution.

Sven
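For readers who want to see the effect outside the image: the workaround above treats each character of the received string as one raw byte and decodes those bytes as UTF-8. The following is a minimal Python sketch of the same idea, not Seaside or Zinc code; the function name `repair_utf8_read_as_latin1` is hypothetical, and it assumes the garbling came from reading UTF-8 bytes as ISO-8859-1.

```python
def repair_utf8_read_as_latin1(garbled: str) -> str:
    """Undo the mistake of reading UTF-8 bytes as ISO-8859-1 characters.

    Assumption: every character in `garbled` stands for exactly one
    original byte, which is what an ISO-8859-1 misreading produces.
    """
    raw_bytes = garbled.encode("latin-1")  # characters back to the raw bytes
    return raw_bytes.decode("utf-8")       # interpret those bytes as UTF-8


original = "hell\u00f2.doc"  # hellò.doc
# Simulate the bug: the browser sends UTF-8, the server reads it as ISO-8859-1.
garbled = original.encode("utf-8").decode("latin-1")

print(garbled)                              # hellÃ².doc
print(repair_utf8_read_as_latin1(garbled))  # hellò.doc
```

The repair only works when the wrong decoding was lossless, which holds for ISO-8859-1 because every byte value maps to a character.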
Fine, I'll convert the string, thanks.
Dave
In reply to this post by Sven Van Caekenberghe-2
On Thu, Oct 9, 2014 at 9:30 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> Well, there is an accept-charset="utf-8" in the form, but it does not appear in the submitted form (I only checked one browser). Like I said, I need an informed opinion to help me make a decision here.

The codec on the server adaptor should do the trick. It should match the page encoding and the accept-charset. Seaside always sets them to the same value; I did not test which takes precedence in which browser. I did a quick test and could verify it with UTF-8 and ISO-8859-1 on Firefox. You can either use the codec on the server adaptor or ask the codec for the name and do it with the Zinc adaptors.

Weird things happen in ISO-8859-1 when using code points that do not fit. E.g. Mac OS X uses NFD, so German umlauts are two code points, with the second one outside of ISO-8859-1. I did not test UTF-16 or Shift_JIS.

Cheers
Philippe
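The NFD remark can be checked with a few lines of Python. This is only an illustrative sketch: the helper `fits_latin1` is a hypothetical name, and the file name is made up for the example.

```python
import unicodedata


def fits_latin1(text: str) -> bool:
    """Return True if every code point of `text` can be encoded in ISO-8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


nfc_name = "\u00fcber.txt"                         # precomposed u-umlaut (NFC)
nfd_name = unicodedata.normalize("NFD", nfc_name)  # 'u' followed by U+0308

print(len(nfc_name), fits_latin1(nfc_name))
print(len(nfd_name), fits_latin1(nfd_name))
# Normalizing back to NFC before encoding restores a representable string.
print(fits_latin1(unicodedata.normalize("NFC", nfd_name)))
```

The decomposed form carries the combining diaeresis U+0308, which lies above U+00FF and therefore has no ISO-8859-1 representation.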
Hi All,
I’m setting up a local SmalltalkHub for my team, to help with package distribution and to learn about Kaliningrad and Amber. I’m testing by connecting directly to the Pharo VM from the browser - I don’t know if there’s a need for WebDAV or not.

It appears to be working, but I think I have missed something fundamental :)

1. I register a new user with the Seaside UI and it appears in the recently registered list.

2. The UI still says “0 registered users”:

[Error] TypeError: undefined is not an object (evaluating 'a.length')
    each (jquery-1.7.1.min.js, line 2)
    widget (jQueryUi.js, line 5)
    (anonymous function) (jQueryUi.js, line 5)
    global code (jQueryUi.js, line 5)
[Error] Failed to load resource: the server responded with a status of 500 (Internal Server Error) (users, line 0)
http://localhost:8080/hub/projects/count

3. Login fails - "Oops invalid username or password":

[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (login, line 0)
http://localhost:8080/hub/login

I tried:

    (ShUser selectOne: [ :each | each username = 'jupiter' ]) validatePassword: 'myPassword'

…and it returned false until I changed ShUser>>#validatePassword: from:

    validatePassword: aString
        ^self password asInteger = (GRPlatform current secureHashFor: aString)

to:

    validatePassword: aString
        ^self password = (GRPlatform current secureHashFor: aString) asString

However, login still fails with NotFound. I tried to put a halt in the login handler, but it’s not halting when I hit hub/login. Could something be caching the old method so my halt is not being seen?

Before I start breaking more things, are there any docs for SmalltalkHub?

Is there a version that runs in GemStone rather than using Mongo? (just interested)

And finally, for Amber development, is there a defined way to load development and popupHelios()?

Any advice would be much appreciated.
I also noticed from the Mongo log that every connection appears to remain open:

2014-10-10T08:44:18.918+1100 [initandlisten] connection accepted from 127.0.0.1:59924 #1 (1 connection now open)
2014-10-10T08:44:18.922+1100 [initandlisten] connection accepted from 127.0.0.1:59925 #2 (2 connections now open)
2014-10-10T08:44:18.923+1100 [initandlisten] connection accepted from 127.0.0.1:59926 #3 (3 connections now open)
2014-10-10T08:44:18.924+1100 [initandlisten] connection accepted from 127.0.0.1:59927 #4 (4 connections now open)
2014-10-10T08:44:18.992+1100 [initandlisten] connection accepted from 127.0.0.1:59928 #5 (5 connections now open)
2014-10-10T08:44:19.010+1100 [initandlisten] connection accepted from 127.0.0.1:59930 #6 (6 connections now open)
2014-10-10T08:44:19.098+1100 [initandlisten] connection accepted from 127.0.0.1:59931 #7 (7 connections now open)
2014-10-10T08:44:19.182+1100 [initandlisten] connection accepted from 127.0.0.1:59932 #8 (8 connections now open)
2014-10-10T08:44:19.266+1100 [initandlisten] connection accepted from 127.0.0.1:59933 #9 (9 connections now open)
2014-10-10T08:44:19.349+1100 [initandlisten] connection accepted from 127.0.0.1:59934 #10 (10 connections now open)
etc.

Is this correct? After 5 minutes playing around I had hundreds of connections “now open”.

Thanks for your time.

Cheers,

J
In reply to this post by Philippe Marschall
Hi Philippe, Dave,
I made a couple of changes to Zinc to handle the problem (which basically is: mime parts such as uploaded files embedded in multipart/form-data do not have a charset parameter on their mime types, hence the encoding is not known with absolute certainty) and I think I fixed it (for Zn itself, the default encoding now is UTF-8). I added a specific test (ZnServerTests>>#testFormTest3Unspecified) for this case. Additionally, the filename is now also assumed to be UTF-8 encoded (like a file path).

For the Zn Seaside adaptor, the story was a bit different. The adaptor uses a special Zn option to read everything binary, as Seaside wants to do its own conversions. That option did not extend to mime parts in multipart/form-data. This is now added and the adaptor now works, without altering ZnZincServerAdaptor>>#convertMultipartFileField:

IMHO though, WAUploadFunctionalTest is wrong. Basically, the use of ISO-8859-1 is questionable and should be replaced with UTF-8 for current browsers (in the methods #renderDownloadLinksOn: and #renderFileContentsOn:). Then those tests pass for uploaded text files that have non-ASCII contents.

The comment in #renderDownloadLinksOn: suggests that this problem (as described in the 1st paragraph) was noted before; the solution or fallback is wrong though, IMHO.

The codec set in the adaptor could indeed be a fallback. I don't know if this can be accessed in regular Seaside code (like in the functional test).

On the other hand, I can't see (and would love an example) where it makes sense, in the 21st century, to not use UTF-8 as a fallback (in case nothing was specified).

In any case, thanks for raising this issue, it helped to improve the code.

Sven

PS: BTW, are there no unit tests that actually stress the functional tests?
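The fallback discussed in this message (use the charset parameter of a part's Content-Type when it is present, otherwise assume UTF-8) can be sketched in Python as follows. This is not Zinc's implementation; `part_charset` and `decode_part` are hypothetical helper names, and the sample bodies are invented for the illustration.

```python
from email.message import Message


def part_charset(content_type: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or `default`."""
    header = Message()
    header["Content-Type"] = content_type
    # get_param returns `failobj` when the parameter is absent.
    return header.get_param("charset", failobj=default)


def decode_part(body: bytes, content_type: str) -> str:
    """Decode a part body using its declared charset, falling back to UTF-8."""
    return body.decode(part_charset(content_type))


# An uploaded part typically arrives without any charset parameter.
print(part_charset("application/msword"))
print(part_charset("text/plain; charset=ISO-8859-1"))
print(decode_part("hell\u00f2".encode("utf-8"), "text/plain"))
```

Keeping the default as a parameter leaves room for the other option raised in the thread, namely passing in the codec configured on the server adaptor instead of hard-coding UTF-8.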
On Fri, Oct 17, 2014 at 11:25 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> On the other hand, I can't see (and would love an example) where it makes sense, in the 21st century, to not use UTF-8 as a fallback (in case nothing was specified).

I'll have a look.

> PS: BTW, are there no unit tests that actually stress the functional tests ?

No, unfortunately there are not. I assume you don't mean unit tests but functional tests with Selenium or similar.

Cheers
Philippe
On 17 Oct 2014, at 15:57, Philippe Marschall <[hidden email]> wrote:

>> PS: BTW, are there no unit tests that actually stress the functional tests ?
>
> No unfortunately there are not. I assume you don't mean unit tests but
> functional tests with Selenium or similar.

Well, driving a web browser is one thing, and of course necessary for JavaScript interaction - that is quite complex I guess (I never did it).

But just for rendering and functionality like what we are discussing here, a web client like ZnClient and an XML parser like XMLDOMParser are enough. I did this in my HP-35 tutorial, where web buttons are 'clicked' and the 'display' is read.
In reply to this post by Philippe Marschall
On 17 Oct 2014, at 15:57, Philippe Marschall <[hidden email]> wrote:

> PS: BTW, are there no unit tests that actually stress the functional tests ?

Work started on that: Seaside-Tests-Webdriver-JohanBrichau.1

Johan
In reply to this post by Sven Van Caekenberghe-2
On 17 Oct 2014, at 16:05, Sven Van Caekenberghe <[hidden email]> wrote:

> Well, driving a web browser is one thing, and of course necessary for JavaScript interaction - that is quite complex I guess (I never did it).

With Parasol it’s not complex at all.
The old testing tool (SeasideTesting) does roughly the same as what you describe: parsing. But IMHO, it’s actually a much more difficult codebase.

Johan
In reply to this post by Sven Van Caekenberghe-2
On Fri, Oct 17, 2014 at 11:25 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> For the Zn Seaside adaptor, the story was a bit different. The adaptor uses a special Zn option to read everything binary, as Seaside wants to do its own conversions.

Not really. Seaside wants a WARequest object (or a subtype). The adapters in the Seaside repository all do the conversion, but that's because these servers don't support conversion. That is out of necessity, not by contract. Seaside should work totally fine if you came up with a WARequest object that is built from an already parsed object. The same goes for WAUrl and WAFile. You don't have to use the class-side parse methods. If you already have parsed objects, it is totally fine for an adapter to build WAUrl instances with #new and #addAllToPath: and friends.

> IMHO though, WAUploadFunctionTest is wrong. Basically, the use of ISO-8859-1 is questionable and should be replaced with UTF-8 for current browsers (in the methods #renderDownloadLinksOn: and #renderFileContentsOn:). Then those tests pass for uploaded text files that have non-ascii contents.

#renderDownloadLinksOn: could probably be fixed if we always use #rawContents.

#renderFileContentsOn: is trickier because we need to know what the on-disk encoding of the file was. That could have been the operating system default encoding (UTF-8 on Mac OS and modern Linux, maybe UTF-16 on Windows) or something else. We could look for a UTF-16 BOM and, if it's missing, default to UTF-8.

Cheers
Philippe
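The last suggestion (look for a UTF-16 BOM, otherwise default to UTF-8) is easy to prototype. The sketch below is in Python and is not Seaside code; `decode_upload` is a hypothetical name, and it deliberately ignores other encodings such as UTF-32, whose little-endian BOM starts with the same two bytes.

```python
import codecs


def decode_upload(data: bytes) -> str:
    """Decode uploaded text: UTF-16 if a UTF-16 BOM is present, else UTF-8."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        return data.decode("utf-16")
    return data.decode("utf-8")


text = "h\u00e9llo"
print(decode_upload(text.encode("utf-8")))
print(decode_upload(text.encode("utf-16")))  # encoder prepends a BOM
print(decode_upload(codecs.BOM_UTF16_BE + text.encode("utf-16-be")))
```

A file in some other legacy encoding would still raise a decoding error here, so a real implementation would need to decide how to report or recover from that case.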