Issue 6160 in pharo: Monticello: Zipping Wide Characters


Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo
Status: Accepted
Owner: [hidden email]
CC: [hidden email],  marianopeck
Labels: Type-Bug Milestone-2.0

New issue 6160 by [hidden email]: Monticello: Zipping Wide Characters
http://code.google.com/p/pharo/issues/detail?id=6160

Pharo2.0a
Latest update: #20160

If a method contains wide character literals, then when Monticello saves the
package, the source.st file inside the mcz file will only be valid if
unzipped inside Squeak/Pharo.

This is because Pharo writes zip files in 4096-byte chunks. Each chunk is
written as a ByteString unless it contains wide characters, in which case it
is written as a WideString. Thus, source.st ends up as a mashup of such
4096-byte stretches in two different encodings.
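
To make the failure mode concrete, here is a rough simulation in Python (not Pharo code, and not Monticello's actual implementation). It assumes a ByteString chunk comes out at one byte per character and a WideString chunk at four bytes per character; the function name mixed_width_dump is made up for this illustration.

```python
CHUNK = 4096  # chunk size named in the report


def mixed_width_dump(source: str, chunk: int = CHUNK) -> bytes:
    """Write each chunk at 1 byte/char if it fits, else at 4 bytes/char.

    Assumption: this mimics ByteString vs. WideString chunks; the real
    byte layout produced by Pharo may differ.
    """
    out = bytearray()
    for start in range(0, len(source), chunk):
        piece = source[start:start + chunk]
        if all(ord(ch) < 256 for ch in piece):
            out += piece.encode("latin-1")    # stand-in for a ByteString chunk
        else:
            out += piece.encode("utf-32-be")  # stand-in for a WideString chunk
    return bytes(out)


# One full chunk of plain text, then a short chunk holding a wide character.
source = "a" * 4096 + "x := '\u0416'." + "b" * 100
dump = mixed_width_dump(source)
```

The first 4096 characters are written at one byte each, while the whole second chunk, plain characters included, is written at four bytes each. An external tool that expects a single encoding for the file cannot recover the text, which matches the "only valid inside Squeak/Pharo" symptom.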

See the following for possibly related info:
* http://forum.world.st/Monticello-mcz-files-write-their-DataStream-and-a-portion-of-their-chunk-files-as-WideStrings-which--td2294551.html
* https://code.google.com/p/pharo/issues/detail?id=2697
* http://code.google.com/p/pharo/issues/detail?id=6143
* https://code.google.com/p/pharo/issues/detail?id=830

Squeak/Pharo encoding in general
* http://www.visoracle.com/squeak/faq/unicode-utf8byte.html
* http://forum.world.st/squeak-dev-WideString-UTF-8-UTF-32-UCS2-td74243.html
* http://forum.world.st/WideString-performance-td4077404.html


_______________________________________________
Pharo-bugtracker mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-bugtracker

Re: Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo

Comment #1 on issue 6160 by [hidden email]: Monticello: Zipping Wide  
Characters
http://code.google.com/p/pharo/issues/detail?id=6160

6143 is unrelated.
There are 2 separate issues really:
- GemStone may or may not be able to read/write WideStrings from/to
the .bin part of the Monticello file, which means that if squeaksource3 tries to
parse/write .mcz files with WideStrings in them, it may fail.
I sent Dale a proposed fix for reading them correctly at ESUG 2 years ago, and
asked whether it would be useful if I implemented writing as well.
After resending it at ESUG last year, I've heard nothing more and have no idea
whether he's found time to look at it.
http://forum.world.st/Monticello-mcz-files-write-their-DataStream-and-a-portion-of-their-chunk-files-as-WideStrings-which--td2294551.html#a2294602

http://forum.world.st/XML-Parser-Monticello-and-unicode-td2315039.html
See also http://bugs.squeak.org/view.php?id=5996

- The .st file in the .mcz is not written as BOM-marked UTF-8 when it contains
characters outside the ASCII range, the way normal .cs and .st files are.
I started fixing this with Bert Freudenberg at ESUG 2 years ago, as I thought
it was the reason GemStone had problems, but I never polished the fix up for
general consumption after getting sidetracked by the issue above.

Others have made similar efforts:
http://bugs.squeak.org/view.php?id=5996



Re: Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo

Comment #2 on issue 6160 by [hidden email]: Monticello: Zipping Wide  
Characters
http://code.google.com/p/pharo/issues/detail?id=6160

6143 is unrelated.
There are 2 separate issues really:
- GemStone may or may not be able to read/write WideStrings from/to
the .bin part of the Monticello file, which means that if squeaksource3 tries to
parse/write .mcz files with WideStrings in them, it may fail.
I sent Dale a proposed fix for reading them correctly at ESUG 2 years ago, and
asked whether it would be useful if I implemented writing as well.
After resending it at ESUG last year, I've heard nothing more and have no idea
whether he's found time to look at it.

Threads from back then:
http://forum.world.st/Monticello-mcz-files-write-their-DataStream-and-a-portion-of-their-chunk-files-as-WideStrings-which--td2294551.html#a2294602

http://forum.world.st/XML-Parser-Monticello-and-unicode-td2315039.html

- The .st file in the .mcz is not written as BOM-marked UTF-8 when it contains
characters outside the ASCII range, the way normal .cs and .st files are.
I started fixing this with Bert Freudenberg at ESUG 2 years ago, as I thought
it was the reason GemStone had problems, but I never polished the fix up for
general consumption after getting sidetracked by the issue above.
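
For readers unfamiliar with the BOM-marked convention mentioned in the second point, here is a minimal sketch in Python (not Pharo code). The helper names encode_source and decode_source are hypothetical, and the Latin-1 fallback for unmarked files is an assumption made for this illustration, not something confirmed for Monticello or for .cs/.st fileouts.

```python
import codecs


def encode_source(text: str) -> bytes:
    """Leave pure ASCII untouched; otherwise emit BOM-marked UTF-8."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return codecs.BOM_UTF8 + text.encode("utf-8")


def decode_source(data: bytes) -> str:
    """Decode as UTF-8 when the BOM is present.

    Assumption: unmarked data is treated as a legacy single-byte
    encoding (Latin-1 here).
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode("latin-1")


ascii_bytes = encode_source("foo := 1.")
wide_bytes = encode_source("x := '\u0416'.")
```

With a marker like this, a reader outside the image can tell from the first three bytes how the whole file is encoded, instead of guessing per chunk.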

Others have made similar efforts:
http://bugs.squeak.org/view.php?id=5996



Re: Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo

Comment #3 on issue 6160 by [hidden email]: Monticello: Zipping Wide  
Characters
http://code.google.com/p/pharo/issues/detail?id=6160

Issue 2697 has been merged into this issue.



Re: Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo
Updates:
        Labels: -Milestone-2.0

Comment #4 on issue 6160 by [hidden email]: Monticello: Zipping Wide  
Characters
http://code.google.com/p/pharo/issues/detail?id=6160

This has been a problem for a looooong time... I think we can declare this a
non-show-stopping bug.



Re: Issue 6160 in pharo: Monticello: Zipping Wide Characters

pharo

Comment #5 on issue 6160 by [hidden email]: Monticello: Zipping Wide  
Characters
http://code.google.com/p/pharo/issues/detail?id=6160

Yes to 3.0

Stef

