Ston feature idea?


Re: Ston feature idea?

Sven Van Caekenberghe-2

> On 19 May 2017, at 10:39, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
>
> Hi,
>
> Thanks for this interesting conversation. I had a similar problem with Grafoscopio notebooks being too big when I ran the default serializer, because it stores data about the Text object and some other info about runs. What I did was flatten the tree so that it contains only the info I need. I solved this with two methods (listed below). Maybe this can be helpful:
>
> GrafoscopioNotebook>>exportAsSton: aNotebook on: aFileStream
>    | stonPrettyString |
>    aNotebook flatten.
>    stonPrettyString := String streamContents: [ :stream |
>        (STON writer on: stream)
>            newLine: String crlf;
>            prettyPrint: true;
>            keepNewLines: true;
>            nextPut: aNotebook children].
>    aFileStream nextPutAll: stonPrettyString

Any reason why you first write to a string and then immediately put that string on aFileStream?

You can write STON directly on aFileStream, which is more memory-efficient, as in

  (STON writer on: aFileStream)

If speed is important, it is best to wrap a ZnBufferedWriteStream around it.
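
To make both suggestions concrete, here is a minimal sketch that can be pasted into a Playground. It assumes that ZnBufferedWriteStream class>>on:do: is available in the image (it flushes the buffer when the block ends). The in-memory target stream only stands in for aFileStream so the snippet is self-contained, and the names children, target, buffered and ston are illustrative, not part of Grafoscopio.

```smalltalk
"Sketch only: the in-memory stream stands in for aFileStream."
| children ston |
"Hypothetical stand-in for 'aNotebook children' after flattening."
children := OrderedCollection with: 'first node body' with: 'second node body'.
ston := String streamContents: [ :target |
	"Wrap the target stream in a buffer; on:do: flushes when the block ends."
	ZnBufferedWriteStream on: target do: [ :buffered |
		"Write STON straight to the buffered stream, with no intermediate string."
		(STON writer on: buffered)
			newLine: String crlf;
			prettyPrint: true;
			keepNewLines: true;
			nextPut: children ] ].
```

In the real method, the block would receive a buffer wrapped around aFileStream instead of target, and the stonPrettyString temporary would no longer be needed.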

> GrafoscopioNode>>flatten
>    "I traverse the tree looking for node bodies containing 'Text' objects and transform them to
>    their string content, so space is saved and storage format is DVCS friendly while serializing
>    them to STON"
>
>    (self preorderTraversal) do: [ :eachNode |
>            (eachNode body class = Text)
>                ifTrue: [eachNode body: (eachNode body asString)]]
>
>
> Cheers,
>
> Offray
>
> On 17/05/17 09:15, Cyril Ferlicot D. wrote:
>> On 17/05/2017 16:02, Sven Van Caekenberghe wrote:
>>> OK, that is an understandable example.
>>>
>>> But why exactly do you want Currency to be serialised differently ? You don't want too many instances ? You want all instances to be #== ? Is Currency too big ? You want to allow humans to edit the STON file ? What ? Why ?
>>>
>>> You can easily make it Currency['Euro'] or even Currency[#Euro] like it is already done for a number of built-in classes.
>>>
>>> The last example that I gave solved the 'too many instances', make them #== issue.
>>>
>>> If it is size, that got covered too, but you must solve the problem of how the receiving end will resolve the reference then.
>>>
>>> But since that is all no good, there must still be another requirement.
>> Here, it is just a simple example to show what I want.
>>
>> The real use case is that in Moose, entities are really connected to
>> their model, and if one of our classes contains an entity it would
>> produce a really really really big ston file since it would export
>> millions of entities.
>>
>> But there is a way to retrieve an entity of this model from its moose name.
>>
>> Thus, I would like to export only the moose name of the entities, and then,
>> while reading the STON, get the corresponding entity from a Model the user
>> has loaded beforehand.
>>



Re: Ston feature idea?

Offray Vladimir Luna Cárdenas-2
Hi,


On 19/05/17 16:38, Sven Van Caekenberghe wrote:

>> On 19 May 2017, at 10:39, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
>>
>> Hi,
>>
>> Thanks for this interesting conversation. I had a similar problem with Grafoscopio notebooks being too big when I ran the default serializer, because it stores data about the Text object and some other info about runs. What I did was flatten the tree so that it contains only the info I need. I solved this with two methods (listed below). Maybe this can be helpful:
>>
>> GrafoscopioNotebook>>exportAsSton: aNotebook on: aFileStream
>>    | stonPrettyString |
>>    aNotebook flatten.
>>    stonPrettyString := String streamContents: [ :stream |
>>        (STON writer on: stream)
>>            newLine: String crlf;
>>            prettyPrint: true;
>>            keepNewLines: true;
>>            nextPut: aNotebook children].
>>    aFileStream nextPutAll: stonPrettyString
> Any reason why you first write to a string and then immediately put that string on aFileStream?
>
> You can write STON directly on aFileStream, which is more memory-efficient, as in
>
>   (STON writer on: aFileStream)

No particular reason, except that I was a newbie at the time, learning
Pharo, objects and so on by myself. Grafoscopio was the project I
made to learn Pharo and programming properly, so there are still rough
edges here and there. So thanks, Sven. Code reading and comments are
important for self-learners[*]. New versions include that improvement:

======
Name: Grafoscopio-OffrayLuna.280
Author: OffrayLuna
Time: 26 May 2017, 8:51:57.125071 am
UUID: 737baf60-cf0c-0d00-924a-23f409cbd238
Ancestors: Grafoscopio-OffrayLuna.279

Improving STON exporting, as recomended by Sven.
======

> If speed is important, it is best to wrap a ZnBufferedWriteStream around it.

Flattening the notebook tree before exporting gives me adequate speed for
now, but I will take this into account.

Cheers,

Offray

[*] It is really sad that Google changed the GSoC rules midway,
precluding the participation of PhD students who, like me, proposed their
own research project for GSoC, under the assumption that that would give us
double funding for our research (!). In Latin America it is not uncommon to
have zero funding and to self-fund our PhDs (as in my case). But
Google, in its "infinite wisdom", thought that it knows better than the
organizations it trusted to run the GSoC, and said that if organizations
choose something different from Google's expectations, their future
selection for funding will be in danger... which says a lot about the "Google
way" (TM) of trust and support, at least for the GSoC. This is my last
year as a student, so I will not have the mentoring that would have helped me
a lot in improving my code and skills... :-/. Anyway, the community is still
here and life goes on.

