Reflecting on data (literal) object syntax


Re: Reflecting on data (literal) object syntax

Christian Haider

Yes, Value (the superclass) is used for, uh :-), values, i.e. immutable literal objects, which help me write in a more functional programming style.

The literal printing of Objects is possible for all objects, since the facilities are defined in Object (#asSource).

 

And yes, it is the same as #storeString, in spirit and as a fallback. But instead of using #instVarAt:put:, #asSource uses the constructor form.

(I did not want to extend #storeString, so as not to interfere with the system's many uses of it.)

 

So, instead of writing the information (class names, selectors and arguments) in a new syntax, why not write it as Smalltalk source?

You just need to define a print method #printvalueWith: and implement the corresponding constructor(s).

 

My question about the purpose was more from the user view.

Is it for persisting settings or configurations outside an image?

Or exchanging data between images / dialects?

Or for interfacing with external systems?

For these, I could recommend Values :-).

 

Cheers

 

P.S. are you at ESUG? I could show you what I mean…

 

From: Pharo-dev [mailto:[hidden email]] On behalf of Norbert Hartl
Sent: Tuesday, 4 July 2017 09:20
To: Pharo Dev <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Hi Christian,

 

thanks for the explanation. I see that Values serve a different purpose. We are looking for a compact form of object literal that is not bound to the external interface and does not need to be subclassed.

I'm interested how the printing of Values is different to #storeString.

 

Norbert

 

On 04.07.2017 at 09:05, Christian Haider <[hidden email]> wrote:

 

Hi Norbert,

 

yes, that is the point: it is a normal Smalltalk expression.

It is trivial and every Smalltalk understands it.

No need for special parsers; you just have class names and constructor methods.

I believe this is a very compact and simple representation of Smalltalk objects.
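
To illustrate the "no special parser" point: a constructor-form string can be turned back into an object by the ordinary compiler. This is only a sketch for a Pharo playground, not the Values implementation; `Smalltalk compiler evaluate:` is Pharo's entry point, and other dialects such as VisualWorks expose evaluation through a different API.

```smalltalk
"Pharo playground sketch: rebuild an object from its constructor-form source."
| source point |
source := '(Point x: 10 y: 20)'.
"The regular Smalltalk compiler does all the parsing."
point := Smalltalk compiler evaluate: source.
point x + point y "30"
```

Note that evaluating source runs arbitrary code, so this approach is only appropriate for strings from a trusted origin.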

 

The Values package contains mainly the machinery to print an object (aka Value) as String so that it can be reconstructed from it.

Exactly like Smalltalk prints literal objects like Array, Integer, Character, String, Symbol etc.

The superclass Value makes sure that the object tree (remember – no cycles or references) will print properly (under consideration of namespaces).

Also, many “simple” base objects can be turned into Values by implementing constructors and the printer (like Date, Timestamp, Point, Rectangle, ColorValue etc.).

 

Additionally, Values allow declaring optional instVars, so that the representation does not get too cluttered with boilerplate arguments.

In the dev tools, there is a generator which produces the proper constructors (2**<number of defaults>).

 

There is really not much to it, technically. Maybe it is just too simple.

I would be interested in the differences to the other proposals – what should be achieved and what may be missing?

 

HTH,

                Christian

 

From: Pharo-dev [[hidden email]] On behalf of Norbert Hartl
Sent: Tuesday, 4 July 2017 08:14
To: Pharo Development List <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Hi Christian,


On 03.07.2017 at 11:06, Christian Haider <[hidden email]> wrote:

I solved this with Values[1] for VW and I am very happy with it (using it intensively/routinely).

Your example would look like:

 

(PointCollection points: (Array

                with: (Point x: 10 y: 20) 

                with: (Point x: 5 y: 8)

))

 

I do not understand your point. The above is a normal Smalltalk expression. I skimmed through the Values document and I do not get how this is related to a discussion about compact object literal syntax.

Can you elaborate a bit more? How do Values help here?

 

Norbert


As you see, it is the same except that lots of noise is gone.

Drawbacks: only literal objects (like Values) are allowed; i.e. no cyclic structures (same with JSON etc.)

and the order of arguments is fixed, unlike JSON (but you can add constructors for every permutation :-)).

 

I think this is very clear and direct (no magic from parsers etc.). I like it most for configurations (see everything at a glance) and interface data.

 

Best,

                Christian

 

 

From: Pharo-dev [[hidden email]] On behalf of Norbert Hartl
Sent: Monday, 3 July 2017 10:28
To: Pharo Dev <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Eliot,

 

On 01.07.2017 at 20:22, Eliot Miranda <[hidden email]> wrote:

 

Hi Norbert,





On Jul 1, 2017, at 7:36 AM, Norbert Hartl <[hidden email]> wrote:





On 30.06.2017 at 21:14, Stephane Ducasse <[hidden email]> wrote:

But what is DataFrame?


the new collection done by alesnedr from Lviv. It is really nice but
does not solve the problem of the compact syntax.





STON fromString: 'Point[10,20]'

Same goes for JSON.




We were brainstorming with marcus and we could have a first nice extension:

{ 'x' -> 10 .'y' -> 20 } asObjectOf: #Point.



10@20


Now in addition I think that there is a value in having an object
literal syntax.

I pasted the old mail of igor on object literals because I like the
idea since it does not add any change in the parser.
Do you remember what problems were raised by this solution (besides
the fact that it had too many # and that the order, as in STON, was not
explicit)?

I would love to have another pass on the idea of Igor.


What I don't like about it is that the object literal exposes the internal implementation of the object. Everything is based on index. So it could suffer the same problem as fuel. When you don't have the exact same code the deserialization fails.


Indeed this is why
{ 'x' -> 10 .'y' -> 20 } asObjectOf: #Point.
could be more robust.
We could extend the object literal syntax to use associations for
non-collections.

I think it is more robust and more explicit. I do not know what are the semantics of detecting something as #Point being a class name. Is it then forbidden to use symbols with uppercase letters? I think something like

{ #x -> 10 . #y -> 20} asObjectOf: #Point

is handling the format with implicit knowledge of type. While the explicit version would be

{ #Point -> { #x -> 10 . #y -> 20}} asObject

And then nested objects are as easy as

{#PointCollection -> {
  #points -> { {#Point -> { #x -> 10 . #y -> 20} }.
                {#Point -> { #x -> 5 . #y -> 8} } } } }  asObject
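
For readers who want to try the association-based forms, here is a minimal Pharo playground sketch. It is an assumption about how `asObjectOf:` and `asObject` might behave, not the implementation discussed in the thread; the block names `fill`, `asObjectOf` and `asObject` are made up for illustration, and slots are set reflectively with `instVarNamed:put:`.

```smalltalk
"Pharo playground sketch: build objects from slot-name associations."
| fill asObjectOf asObject |
"Write each association's value into the slot named by its key."
fill := [ :object :associations |
	associations do: [ :each | object instVarNamed: each key put: each value ].
	object ].
"Implicit form: the class name is passed separately."
asObjectOf := [ :associations :className |
	fill value: (Smalltalk globals at: className) basicNew value: associations ].
"Explicit form: a one-element array holding className -> slot associations."
asObject := [ :array | | spec |
	spec := array first.
	asObjectOf value: (spec value) value: (spec key) ].
asObjectOf value: { #x -> 10 . #y -> 20 } value: #Point.
asObject value: { #Point -> { #x -> 10 . #y -> 20 } }
```

Because slots are addressed by name rather than by index, reordering the instance variables of a class does not break such a description, which is the robustness argument made above. Nested specs as in the PointCollection example would need a recursive step that this sketch leaves out.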


The -> messages are just noise and add additional processing for nothing.  This is just as effective:

{ #Point. { #x. 10 . #y. 20}} asObject

{#PointCollection. {   #points. { {#Point. { #x. 10 . #y. 20} }.
              {#Point. { #x. 5 . #y. 8} } } } }  asObject

So an object is a pair of a class name and an array of slot specifier pairs, and a slot specifier is a pair of a slot name and a value.  And of course that means that many object specs can be literal, which is useful for storing in pragmas etc:

   #(PointCollection
       (points ((Point (x 10 y 20))
                (Point (x 5 y 8)))))  asObject
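
A reader of such a literal-array spec could be sketched as follows for a Pharo playground. This is only an illustration under stated assumptions, not Eliot's or Stef's implementation: a spec is taken to be a two-element Array of a class-name Symbol and an Array of alternating slot names and values, and the block names `isSpec`, `resolve` and `build` are invented here. Rectangle is used because its `origin` and `corner` slots exist in a stock image.

```smalltalk
"Pharo playground sketch: instantiate a tree of objects from a literal array."
| isSpec build resolve |
"Heuristic: a two-element Array of (class-name Symbol, slot Array) is a spec."
isSpec := [ :x |
	x isArray and: [ x size = 2 and: [ x first isSymbol and: [ x last isArray ] ] ] ].
"A value is a nested spec, a non-empty Array of specs, or a plain value."
resolve := [ :value |
	(isSpec value: value)
		ifTrue: [ build value: value ]
		ifFalse: [
			(value isArray and: [ value notEmpty and: [ value allSatisfy: isSpec ] ])
				ifTrue: [ value collect: build ]
				ifFalse: [ value ] ] ].
"Instantiate the class and fill its slots by name."
build := [ :spec | | object slots |
	object := (Smalltalk globals at: spec first) basicNew.
	slots := spec last.
	1 to: slots size by: 2 do: [ :i |
		object instVarNamed: (slots at: i) put: (resolve value: (slots at: i + 1)) ].
	object ].
build value: #(Rectangle (origin (Point (x 0 y 0)) corner (Point (x 10 y 20))))
```

The `isSpec` heuristic shows the ambiguity of a pairs-only format: a plain Array value that happens to look like (Symbol, Array) would be mistaken for an object spec, so a real reader would need type information from the slots or an explicit marker.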



Agreed. My first impression was it should be something like an S-expression, which your example is. I was misled by the idea it should be closer to the programming syntax. But a parser does not care if implemented properly, that's right. I like the compactness of that format but still find it a bit hard to read if there are only pairs. As this object literal syntax is meant to be written in code, it is important that it reads well even if there are noisy characters.




would give a PointCollection of two point objects. My future wish would be that there is an object literal parser that takes all of the information from the format, and then an object literal parser that is aware of slot information, meaning that the type information can be gathered from the object class instead of having to write it in the format. In the PointCollection, the slot for points would have the type information #Point attached. The format then becomes

{ #points -> {
  { #x -> 10 . #y -> 20 }.
  { #x -> 5 . #y -> 8 } } }

which would then be the equivalent of something like this JSON

{ "points" : [
    { "x" : 10, "y" : 20 },
    { "x" : 5, "y" : 8 } ] }

What I don't know is how to resolve the difference between a dictionary and a collection of associations.


That's incidental to the format, internal to the parser.  If the parser chooses to build a dictionary as it parses so be it.  The point is that the output is as you specify; a tree of objects.

The thing to think about is how to introduce labels so that sub objects can be shared in the graph, the naïve deepCopy vs deepCopyUsing: issue.

 

I'm not sure this is necessary. It should be a format to easily instantiate a small tree of objects. Making it build a graph instead of a tree makes everything much more complicated. Either we decide that STON can do the full set, in which case it is probably less valuable to have a simple syntax to write in code, or we need to break the pairs rule. In that case an object definition can have an optional third argument, which would be the label for the object. The drawback is that the label needs to come before the array of slots:

 

{ :v1 #ValueHolder { 'contents' . { ValueHolder . { 'contents' . @v1 }}}} 

 

Or something like this. It would in theory be closer to STON, using the @ reference. The difference is that STON has indexed object access, while this variant would be based on labels.

 

Norbert

 






Norbert






As a dictionary is both an array of associations and a key-value store, it works perfectly there. But for other objects I have doubts, especially as in a lot of contexts you need to have a mapping of internal state to external representation. It can be applied afterwards, but I'm not sure that can work all the time.


Yes, after that we should focus on the frequent cases. And maybe having a
literal syntax for dictionaries would be good enough.

I will do another version of igor's proposal with associations to see
how it feels.





my 2 cents,

Norbert





Stef




---------- Forwarded message ----------
From: Igor Stasenko <[hidden email]>
Date: Fri, Oct 19, 2012 at 1:09 PM
Subject: [Pharo-project] Yet another Notation format: Object literals
To: Pharo Development <[hidden email]>


Hi,
as I promised before, here is the simple Smalltalk-based literal format.
It is based on Smalltalk syntax and so, unlike JSON, it doesn't need
a separate parser (the normal Smalltalk parser is used for that).

The idea is quite simple:
you can tell any object to represent itself as an 'object literal' ,
for example:

(1@3) asObjectLiteral
-->  #(#Point 1 3)

{ 1@2.  3@4. true. false . nil  } asObjectLiteral

-> #(#Array #(#Point 1 2) #(#Point 3 4) true false nil)

(Dictionary newFromPairs: { 1->#(1 2 3) . 'foo' -> 'bar' }) asObjectLiteral
->
#(#Dictionary 1 #(#Array 1 2 3) 'foo' 'bar')

Next thing, you can 'pretty-print' it (kinda):

#(#Dictionary 1 #(#Array 1 2 3) 'foo' 'bar') printObjectLiteral

'#(#Dictionary
    1
    (#Array 1 2 3)
    ''foo'' ''bar'')'


and sure thing, you can do reverse conversion:

'#(#Dictionary
    1
    (#Array 1 2 3)
    ''foo'' ''bar'')'  parseAsObjectLiteral

a Dictionary('foo'->'bar' 1->#(1 2 3) )

Initially, I thought that it could be generic (by implementing a default
Object>>#asObjectLiteral),
but then, after discussing it with others, we decided to leave
Object>>#asObjectLiteral as a subclass responsibility.
So, potentially the format allows representing any object(s) as
literals, except for objects with circular references, of course.

The implementation is fairly simple, as you may guess and contains no
new classes, but just extension methods here and there.

Take it with a grain of salt, since it is just a small proof of
concept. (And if doing it for real, it may need some changes etc.)
Since I am far from the areas where it could be used right now, I don't
want to pursue it further or advocate that this is the right way to do
things.
Nor do I have a public repository for this project.

So, if there is anyone willing to pick it up and pursue the idea
further, please feel free to do so and make a public repository for
the project.

 


Re: Reflecting on data (literal) object syntax

Stephane Ducasse-3
In reply to this post by NorbertHartl
Yes, I agree with Norbert. I will implement Eliot's proposal
because it should be a slight variation of what I have already implemented.

Christian we have immutable objects in Pharo and we plan to clean the
system to use them more and more :)
I hope that people will join to port your pdf framework from Gemstone
to Pharo :).

Stef


On Tue, Jul 4, 2017 at 9:20 AM, Norbert Hartl <[hidden email]> wrote:

> Hi Christian,
>
> thanks for the explanation. I see that Values serve a different purpose. We
> are looking for a compact form of object literal that is not bound to the
> external interface and does not need to be subclassed.
> I'm interested how the printing of Values is different to #storeString.
>
> Norbert
>
> [1] https://wiki.pdftalk.de/doku.php?id=complexvalues
