[Glass] Files and UTF8 but not using String #encodeAsUTF8


GLASS mailing list
Hi guys,

I am trying to implement the solution provided by Dale for exporting and importing large objects with SIXX: https://github.com/glassdb/SIXX?files=1

In his example he does:

 strm := WriteStream on: String new.
 #( 1 2 3) sixxOn: strm persistentRoot: (UserGlobals at: #'MY_SIXX_ROOT_ARRAY')

That stream 'strm' is in memory. I need files, and I want those files to be encoded as UTF8. In addition, I try to use GsFile as much as possible, since it was much faster than other classes when I tested it. So far I have been using the following approach to write a UTF8 file:

file := GsFile openWrite: aFilename.
file nextPutAll: aString encodeAsUTF8.

However, I cannot use that approach in the SIXX scenario. Why? Because I cannot easily hook into the places where SIXX gets the string of an object and writes it to the stream. So I need to create the file stream as UTF8 from the beginning.

I do have UTF8TextConverter, but GsFile does not understand #converter:. I tried:

| stream |
stream := MultiByteBinaryOrTextStream on: (GsFile openWrite: aFilename).
stream converter: UTF8TextConverter new.
stream text.
MCPlatformSupport commitOnAlmostOutOfMemoryDuring: [
    UserGlobals at: #'MY_SIXX_ROOT_ARRAY' put: Array new.
    #( 1 2 3) sixxOn: stream persistentRoot: (UserGlobals at: #'MY_SIXX_ROOT_ARRAY')
].
stream close.

But it doesn't work. I did see GsFile >> contentsAsUtf8, so I could write the whole file first, then grab the contents as UTF8 and do what I did above (a new file doing #nextPutAll: of the UTF8). But since I am writing all this code because the object graph I am trying to serialize is big, I am afraid I will run out of memory while holding all the contents as UTF8. So I would really like the "streaming" possibility.

Any ideas on how I can do that?

Thanks, 



_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
Mariano,


Hmmm, this is a bit of a sticky wicket ....

I'm afraid that the best way to solve this one is to make a major change to SIXX and force all output and input to be utf8 ... but if you are using SIXX to move data between Pharo and GemStone, then you'll have to make sure that SIXX on the Pharo side will properly decode utf8 ...

Alternatively, we could try porting the whole TextConverter scheme to GemStone ...

I suppose it's about time we did something in this area ...

Dale


Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list


On Thu, Feb 26, 2015 at 2:31 PM, Dale Henrichs via Glass <[hidden email]> wrote:
Alternatively, we could try porting the whole TextConverter scheme to GemStone ...


Hi Dale,

This is already ported; Johan did it. I just don't know how to use the MultiByteBinaryOrTextStream (on which I can set a #converter:) together with a GsFile backend. But I guess Johan did something, because I cannot imagine he did everything in memory, right?

 
Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
What package is that class located in?

Dale

Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list


On Thu, Feb 26, 2015 at 2:46 PM, Dale Henrichs <[hidden email]> wrote:
What package is that class located in?


spec package: 'Multilingual-TextConversion' with: [
    spec repository: 'github://glassdb/PharoCompatibility:master/repository'
].
spec package: 'Multilingual-Tests' with: [
    spec repository: 'github://glassdb/PharoCompatibility:master/repository'.
    spec requires: #('Multilingual-TextConversion')
].
 
Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
It’s in the PharoCompatibility project on github.

Though, I must say it merits some love.
It works (using it for years already) but we have sometimes done some hacks to make it work with the Stream classes of GS. Or at least that’s what I remember right off the top of my head.

But I have not used this with SIXX. My last experience with SIXX was when we moved the Vooruit database from Pharo+GOODS to GemStone and I used the commitOnAlmostOutOfMemory trick.

Johan

Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
Hi Johan,

But do you have any example of using a UTF8TextConverter with files? I found no way :(

Thanks!

Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
For UTF8, I think you are better off using Grease's GRUtf8CodecStream.
Let me see if I can extract something from what we did, because we do read files in several encodings in GemStone.
There are just stream wrappers all over the place… I’m looking into it right now.

Johan

Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
Mariano,

Does this help you?

|codec stream|
codec := GRCodec forEncoding: 'utf8'.
stream := codec encoderFor: (GsFile open: 'bla.xml' mode: 'r' onClient: false).
stream binary.
stream contents

I guess you can convert this to writing easily.

Johan
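
A write-side variant of the snippet above might look like the following untested sketch. It only combines selectors that already appear in this thread (GRCodec forEncoding:, #encoderFor:, GsFile openWrite:, #sixxOn:persistentRoot:, commitOnAlmostOutOfMemoryDuring:). It assumes that the codec stream passes the encoded bytes through to the GsFile as SIXX writes, so nothing has to be held in memory; aFilename is a placeholder, as in the original post. Whether the writer also needs #binary, as the reading example does, has not been checked.

```smalltalk
| codec file stream |
"Assumption: the Grease codec stream wraps the GsFile and encodes on the fly."
codec := GRCodec forEncoding: 'utf8'.
file := GsFile openWrite: aFilename.
stream := codec encoderFor: file.
MCPlatformSupport commitOnAlmostOutOfMemoryDuring: [
    UserGlobals at: #'MY_SIXX_ROOT_ARRAY' put: Array new.
    "SIXX writes to the codec stream, which forwards UTF8 to the file."
    #( 1 2 3) sixxOn: stream persistentRoot: (UserGlobals at: #'MY_SIXX_ROOT_ARRAY')
].
"Close the underlying file once SIXX has finished writing."
file close.
```

The point of wrapping the GsFile rather than a String is that encoding happens per write, which is what the streaming requirement in the original question asks for.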

On 26 Feb 2015, at 21:22, Johan Brichau <[hidden email]> wrote:

For UTF8, I think you can better use Grease GRUtf8CodecStream
Let me see if I can extract something from what we did because we do read files in several encodings in Gemstone. 
There’s just stream wrappers all over the place… I’m looking into it right now.

Johan

On 26 Feb 2015, at 20:56, Mariano Martinez Peck <[hidden email]> wrote:

Hi Johan,

But do you have any example of using a UTF8TextConvertor with files? because I found no way :(

Thanks!

On Thu, Feb 26, 2015 at 4:45 PM, Johan Brichau <[hidden email]> wrote:
It’s in the PharoCompatibility project on github.

Though, I must say it merits some love.
It works (using it for years already) but we have sometimes done some hacks to make it work with the Stream classes of GS. Or at least that’s what I remember right off the top of my head.

But I have not used this with SIXX. My last experience with SIXX was when we moved the Vooruit database from Pharo+GOODS to Gemstone and I used the commitOnAlmostOutOfMemory trick.

Johan

On 26 Feb 2015, at 18:46, Dale Henrichs <[hidden email]> wrote:

What package is that class located in?

Dale

On 2/26/15 9:42 AM, Mariano Martinez Peck wrote:


On Thu, Feb 26, 2015 at 2:31 PM, Dale Henrichs via Glass <[hidden email]> wrote:
Mariano,


Hmmm, this is a bit of a sticky wicket ....

I'm afraid that the best way to solve this on is make a major change to SIXX and force all output and input to be utf8 ... but if you are using SIXX to move data between pharo and gemstone, then you'll have to make sure thatSIXX on the pharo size will properly decode utf8 ...

Alternatively, we could try porting the the whole TextConverter scheme to GemStone ...


Hi Dale,

This is already ported, Johan did it. I just don't know how to use the MultiByteBinaryOrTextStream (to which I can set a #converter:) together with a GsFile backend.. but i guess Johan did something because I cannot imagine he did everything in memory, right?

 
I suppose it's about time we did something in this area ...

Dale



_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass


Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list


On Thu, Feb 26, 2015 at 5:38 PM, Johan Brichau <[hidden email]> wrote:
Mariano,

Does this help you?

|codec stream|
codec := GRCodec forEncoding: 'utf8'.
stream := codec encoderFor: (GsFile open: 'bla.xml' mode: 'r' onClient: false).
stream binary.
stream contents

I guess you can convert this to writing easily.


Hi Johan,

Yes, it did help. However, I am still getting errors. Your code above works correctly for me only if I send #contents. If I send #next, for example, it fails. In my case I cannot send #contents; instead I pass the stream to SIXX, and SIXX reads the stream and materializes the objects. So SIXX, for example, sends #next:. The problem is that the UTF8 magritte readers expect the stream to answer a number (ASCII value) to #next. However, GsFile answers an instance of Character. See the attached screenshot.

I know GsFile understands #nextByte, which does answer the ASCII value, but as you can see in the stack, I don't control which messages are sent, so #next: is sent.

Reproducing the error is very easy:

| stream codec  |
codec := GRCodec forEncoding: 'utf8'.
stream := codec encoderFor: (GsFile open: '/Users/mariano/test.txt' mode: 'w' onClient: false).
stream text. 
stream nextPutAll: '<mariano>'.
stream flush.

codec := GRCodec forEncoding: 'utf8'.
stream := codec decoderFor: (GsFile open: '/Users/mariano/test.txt' mode: 'r' onClient: false).
stream next   

There you will get the DNU. 

Any ideas how to work around this?

Thanks in advance, 



 
Johan

On 26 Feb 2015, at 21:22, Johan Brichau <[hidden email]> wrote:

For UTF8, I think you are better off using Grease's GRUtf8CodecStream.
Let me see if I can extract something from what we did, because we do read files in several encodings in GemStone.
There are just stream wrappers all over the place… I’m looking into it right now.

Johan

On 26 Feb 2015, at 20:56, Mariano Martinez Peck <[hidden email]> wrote:

Hi Johan,

But do you have any example of using a UTF8TextConverter with files? Because I found no way :(

Thanks!

On Thu, Feb 26, 2015 at 4:45 PM, Johan Brichau <[hidden email]> wrote:
It’s in the PharoCompatibility project on github.

Though, I must say it merits some love.
It works (using it for years already) but we have sometimes done some hacks to make it work with the Stream classes of GS. Or at least that’s what I remember right off the top of my head.

But I have not used this with SIXX. My last experience with SIXX was when we moved the Vooruit database from Pharo+GOODS to Gemstone and I used the commitOnAlmostOutOfMemory trick.

Johan

On 26 Feb 2015, at 18:46, Dale Henrichs <[hidden email]> wrote:

What package is that class located in?

Dale

Screen Shot 2015-03-02 at 10.56.45 AM.png (216K)

Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list


On Mon, Mar 2, 2015 at 11:08 AM, Mariano Martinez Peck <[hidden email]> wrote:


Any ideas how to work around this?


I forgot to say... I tried passing 'r+b' or 'w+b' as the open mode on my OS X machine, and it didn't work.

If I replace the reading with this:

stream := codec decoderFor: (FileStream oldFileNamed: aFilename) binary.

it works correctly, because FileStream #next answers the ASCII value (if I send #binary).

What a mess... So does that mean GRUtf8CodecStream expects me to set the stream to binary even when the content is text?

Thanks in advance for any clarification!

Best, 

 
 
Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list


On Mon, Mar 2, 2015 at 11:58 AM, Richard Sargent <[hidden email]> wrote:

Mariano,
UTF-8 *is* a binary encoding. Yes, it is 8-bit, but binary. (It uses byte values in the 128-255 range to encode the characters outside ASCII.)

Indeed, you are right. But even so, GsFile does not work while FileStream does. So I guess my problem is that I have no way to force GsFile to open as binary. I tried 'r+b' and 'w+b' as follows:

(GsFile open: '/Users/mariano/test.txt' mode: 'r+b' onClient: false).

But GsFile #next always answers a Character. By contrast, (FileStream oldFileNamed: aFilename) binary does answer a number when sent #next. With FileStream, at least, I can force binary mode with #binary, which I cannot do (or which doesn't work) with GsFile.
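A small worked example may help show why a UTF-8 decoder wants numbers rather than Characters from the underlying stream. The sketch below encodes one code point by hand, using only integer arithmetic that should be available in any Smalltalk; the variable names are illustrative and nothing here is taken from Grease or GemStone.

```smalltalk
"Encode code point 233 (é) as UTF-8 by hand."
| codePoint lead continuation |
codePoint := 233.
"Code points from 128 to 2047 take two bytes: 110xxxxx 10xxxxxx."
lead := 192 bitOr: (codePoint bitShift: -6).          "192 = 2r11000000; top bits of the code point"
continuation := 128 bitOr: (codePoint bitAnd: 63).    "128 = 2r10000000; low six bits of the code point"
Array with: lead with: continuation                   "answers an Array holding 195 and 169"
```

Both resulting bytes are above 127, so a decoder has to do arithmetic on them, which is why it expects #next to answer Integers.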

Best, 








 
Re: [Glass] Files and UTF8 but not using String #encodeAsUTF8

GLASS mailing list
In reply to this post by GLASS mailing list
Mariano,

I think you can wrap the stream and convert the messages, no?
At least, that's what we do all over the place in this area, but I need to check whether we cover your use case.

I did at least identify that I should publish these wrappers together with the TextConverters code, because they are needed. I am looking into that too, but it will take some more time.

Johan
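
The wrapping idea could look roughly like the sketch below. It is not Johan's code: GsFileByteAdaptor and all of its selectors are hypothetical names invented for illustration, the class definition assumes GemStone 3.x-style syntax, and the methods are shown in the usual Class >> selector notation rather than as a file-in. It relies on GsFile >> nextByte (mentioned earlier in the thread) and assumes that GsFile also answers #atEnd and #close as ordinary streams do.

```smalltalk
"Hypothetical adaptor: makes a GsFile answer Integers to #next and #next:."
Object subclass: 'GsFileByteAdaptor'
    instVarNames: #( 'file' )
    classVars: #()
    classInstVars: #()
    poolDictionaries: #()
    inDictionary: UserGlobals.

GsFileByteAdaptor class >> on: aGsFile
    ^ self new setFile: aGsFile

GsFileByteAdaptor >> setFile: aGsFile
    file := aGsFile

GsFileByteAdaptor >> next
    "Answer the next byte as an Integer; GsFile >> next would answer a Character."
    ^ file nextByte

GsFileByteAdaptor >> next: count
    "Answer a ByteArray with up to count bytes, stopping early at end of file."
    | out |
    out := WriteStream on: ByteArray new.
    count timesRepeat: [
        file atEnd ifTrue: [ ^ out contents ].
        out nextPut: file nextByte ].
    ^ out contents

GsFileByteAdaptor >> atEnd
    ^ file atEnd

GsFileByteAdaptor >> close
    ^ file close
```

With such a class in place, the failing read from the reproduction would become something like `codec decoderFor: (GsFileByteAdaptor on: (GsFile open: '/Users/mariano/test.txt' mode: 'r' onClient: false))`. Whether that is enough depends on which other messages the decoder and SIXX send to the underlying stream (for example #peek or #skip:); those would need forwarding in the same way.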
