Re: Reflecting on data (literal) object syntax - porting PDF

Christian Haider

Hi Ben,

at ESUG I will hopefully show the new version of the library with a new PDF type implementation which allows the classes to be freely (re)named.

This should make a port a lot easier.

Currently I am working on the port for Gemstone. If that works, it could be more than half the way towards a Pharo/Squeak version.

cheers

 

From: Pharo-dev [mailto:[hidden email]] On Behalf Of Ben Coman
Sent: Monday, 3 July 2017 18:54
To: Pharo Development List <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Thanks Christian for the open release of pdf4smalltalk.  Sorry I aborted the two times I started to port it to Pharo.  The task of dealing with both file format and namespace differences to synchronise with VW was too big for me.  Now with Iceberg, if VisualWorks might work with git that would knock down one barrier and I might have another go.

 

On Mon, Jul 3, 2017 at 5:45 PM, Christian Haider <[hidden email]> wrote:

I did

 

From: Pharo-dev [mailto:[hidden email]] On Behalf Of Serge Stinckwich
Sent: Monday, 3 July 2017 11:40
To: Pharo Development List <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Hi Christian,

 

interesting ! Maybe you can release your software with an MIT licence ?

 

Regards,

 

On Mon, Jul 3, 2017 at 10:06 AM, Christian Haider <[hidden email]> wrote:

I solved this with Values[1] for VW and I am very happy with it (using it intensively/routinely).

Your example would look like:

 

(PointCollection points: (Array

                with: (Point x: 10 y: 20)

                with: (Point x: 5 y: 8)

))

 

And I remember you saying pdf4smalltalk relied heavily on this, so having Values integrated (if it was broadly useful) would also be a minor step in porting pdf4smalltalk to Pharo. 

 

cheers -ben

 

As you see, it is the same except that lots of noise is gone.

Drawbacks: only literal objects (like Values) are allowed; i.e. no cyclic structures (same with JSON etc.)

and the order of arguments is fixed, unlike JSON (but you can add constructors for every permutation :-) ).

 

I think this is very clear and direct (no magic from parsers etc.). I like it most for configurations (see everything at a glance) and interface data.

 

Best,

                Christian

 

[1] https://wiki.pdftalk.de/doku.php?id=complexvalues

 

From: Pharo-dev [mailto:[hidden email]] On Behalf Of Norbert Hartl
Sent: Monday, 3 July 2017 10:28
To: Pharo Dev <[hidden email]>
Subject: Re: [Pharo-dev] Reflecting on data (literal) object syntax

 

Eliot,

 

On 01.07.2017 at 20:22, Eliot Miranda <[hidden email]> wrote:

 

Hi Norbert,

On Jul 1, 2017, at 7:36 AM, Norbert Hartl <[hidden email]> wrote:

On 30.06.2017 at 21:14, Stephane Ducasse <[hidden email]> wrote:

But what is DataFrame?


the new collection done by alesnedr from Lviv. It is really nice but
does not solve the problem of the compact syntax.


STON fromString: 'Point[10,20]'

Same goes for JSON.

We were brainstorming with marcus and we could have a first nice extension:

{ 'x' -> 10 .'y' -> 20 } asObjectOf: #Point.

10@20
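
For readers who want to try the idea: a minimal playground sketch of what asObjectOf: might do, assuming it simply fills instance variables by name. The block name asObjectOf is hypothetical (no such method exists in a stock Pharo image); only Smalltalk at:, basicNew and instVarNamed:put: are existing calls.

```smalltalk
| asObjectOf point |
"Hypothetical stand-in for the proposed asObjectOf: message."
asObjectOf := [ :associations :className |
    | object |
    "Look the class up by name and allocate an instance without running initialize."
    object := (Smalltalk at: className) basicNew.
    "Fill each named instance variable reflectively."
    associations do: [ :each |
        object instVarNamed: each key put: each value ].
    object ].
point := asObjectOf value: { 'x' -> 10 . 'y' -> 20 } value: #Point.
point
```

Because the slots are addressed by name, the order of the associations does not matter, which is the robustness argument made in this thread. The sketch bypasses any constructor or validation the class may have.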


Now in addition I think that there is a value in having an object
literal syntax.

I pasted the old mail of igor on object literals because I like the
idea since it does not add any change in the parser.
Do you remember what were the problem raised by this solution (beside
the fact that it had too many # and the order was like in ston not
explicit).

I would love to have another pass on the idea of Igor.


What I don't like about it is that the object literal exposes the internal implementation of the object. Everything is based on index. So it could suffer the same problem as fuel. When you don't have the exact same code the deserialization fails.


Indeed this is why
{ 'x' -> 10 .'y' -> 20 } asObjectOf: #Point.
could be more robust.
We could extend the object literal syntax to use association for non
collection.

I think it is more robust and more explicit. I do not know what are the semantics of detecting something as #Point being a class name. Is it then forbidden to use symbols with uppercase letters? I think something like

{ #x -> 10 . #y -> 20} asObjectOf: #Point

is handling the format with implicit knowledge of type. While the explicit version would be

{ #Point -> { #x -> 10 . #y -> 20}} asObject

And then nested objects are as easy as

{#PointCollection -> { 
  #points -> { {#Point -> { #x -> 10 . #y -> 20} }.
                {#Point -> { #x -> 5 . #y -> 8} } } } }  asObject


The -> messages are just noise and add additional processing for nothing.  This is just as effective:

{ #Point. { #x. 10 . #y. 20}} asObject

{#PointCollection. {   #points. { {#Point. { #x. 10 . #y. 20} }.
              {#Point. { #x. 5 . #y. 8} } } } }  asObject

So an object is a pair of a class name and an array of slot specifier pairs, and a slot specifier is a pair of a slot name and a value.  And of course that means that many object specs can be literal, which is useful for storing in pragmas etc:

   #(PointCollection
       (points ((Point (x 10 y 20))
                (Point (x 5 y 8)))))  asObject
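
A rough playground sketch of how such a pair-based spec could be interpreted, for illustration only. The block name build and the rule used to tell an object spec from a plain collection (a two-element array whose first element is a symbol and whose second is an array) are assumptions, not part of the proposal. Rectangle and Point are used because PointCollection does not exist in a stock image.

```smalltalk
| build rect points |
"Hypothetical interpreter for the (ClassName (slot value slot value ...)) format."
build := [ :node |
    (node isArray and: [ node size = 2 and: [ node first isSymbol and: [ node last isArray ] ] ])
        ifTrue: [
            | object |
            "An object spec: class name followed by a flat array of slot/value pairs."
            object := (Smalltalk at: node first) basicNew.
            node last pairsDo: [ :slotName :value |
                object instVarNamed: slotName put: (build value: value) ].
            object ]
        ifFalse: [
            "Any other array is a collection of nested specs; anything else is a plain value."
            node isArray
                ifTrue: [ node collect: [ :each | build value: each ] ]
                ifFalse: [ node ] ] ].
rect := build value: #(Rectangle (origin (Point (x 0 y 0)) corner (Point (x 10 y 20)))).
points := build value: #((Point (x 10 y 20)) (Point (x 5 y 8))).
```

The detection rule is ambiguous for a genuine two-element collection that happens to start with a symbol, which is one reason a real implementation would probably need an explicit marker.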

Agreed. My first impression was it should be something like an S-expression, which your example is. I was misled by the idea it should be closer to the programming syntax. But a parser does not care if implemented properly, that's right. I like the compactness of that format but still find it a bit hard to read if there are only pairs. As this object literal syntax is meant to be written in code it is important that it reads well even if there are noisy characters.


would give a PointCollection of two point objects. My future wish would be that there is an object literal parser that takes all of the information from the format, and then an object literal parser that is aware of slot information, meaning that the type information can be gathered from the object class instead of having to write it in the format. In the PointCollection the slot for points would have the type information #Point attached. The format then becomes

{ #points -> {
  { #x -> 10 . #y -> 20 }.
      { #x -> 5 . #y -> 8 } } }

which would then be the equivalent of something like JSON

{ "points" : [
    { "x" : 10, "y" : 20 },
    { "x" : 5, "y" : 8 } ] }
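
A small sketch of the slot-aware variant, under the assumption that type information is kept in a side table keyed by class and slot name. The table slotTypes and the block build are hypothetical; in a real implementation the information would presumably come from the slot definitions of the class. Rectangle is used in place of PointCollection so the snippet runs in a stock image.

```smalltalk
| slotTypes build rect |
"Hypothetical type table: class name -> (slot name -> class name of the slot value)."
slotTypes := Dictionary new.
slotTypes
    at: #Rectangle
    put: (Dictionary newFrom: { #origin -> #Point . #corner -> #Point }).
build := [ :className :associations |
    | object types |
    object := (Smalltalk at: className) basicNew.
    types := slotTypes at: className ifAbsent: [ Dictionary new ].
    associations do: [ :each |
        | value |
        "Typed slots are built recursively; untyped slots take the value as written."
        value := (types includesKey: each key)
            ifTrue: [ build value: (types at: each key) value: each value ]
            ifFalse: [ each value ].
        object instVarNamed: each key put: value ].
    object ].
rect := build
    value: #Rectangle
    value: { #origin -> { #x -> 0 . #y -> 0 } . #corner -> { #x -> 10 . #y -> 20 } }.
```

Only the root class has to be named by the caller; the nested specs carry no class names, just as in the JSON form.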

What I don't know is how to solve the difference between a dictionary and a collection of associations.


That's incidental to the format, internal to the parser.  If the parser chooses to build a dictionary as it parses so be it.  The point is that the output is as you specify; a tree of objects.

The thing to think about is how to introduce labels so that sub objects can be shared in the graph, the naïve deepCopy vs deepCopyUsing: issue.

 

I'm not sure this is necessary. It should be a format to easily instantiate a small tree of objects. Making it build a graph instead of a tree makes everything much more complicated. Either we decide that STON can do the full set and in that case it is probably less valuable to have a simple syntax to write in code. Or we need to break the pairs rule. In that case an object definition can have an optional third argument which would be the label for the object. The drawback is that the label needs to be before the array of slots.

 

{ :v1 #ValueHolder { 'contents' . { ValueHolder . { 'contents' . @v1 }}}} 

 

Or something like this. It would be in theory closer to STON using the @ reference. The difference is that STON has indexed object access, whereas this variant would be based on labels.

 

Norbert

 

 


Norbert


As a dictionary is both an array of associations and a key-value store, it works perfectly there. But for other objects I have doubts, especially as in a lot of contexts you need to have a mapping of internal state to external representation. It can be applied afterwards but I'm not sure that can work all the time.


Yes after we should focus on the frequent cases. And may be having a
literal syntax for dictionary would be good enough.

I will do another version of igor's proposal with associations to see
how it feels.


my 2 cents,

Norbert


Stef




---------- Forwarded message ----------
From: Igor Stasenko <[hidden email]>
Date: Fri, Oct 19, 2012 at 1:09 PM
Subject: [Pharo-project] Yet another Notation format: Object literals
To: Pharo Development <[hidden email]>


Hi,
as i promised before, here is the simple smalltalk-based literal format.
It is based on smalltalk syntax, and so, unlike JSON, it doesn't need to
have a separate parser (the normal smalltalk parser is used for that).

The idea is quite simple:
you can tell any object to represent itself as an 'object literal' ,
for example:

(1@3) asObjectLiteral
-->  #(#Point 1 3)

{ 1@2.  3@4. true. false . nil  } asObjectLiteral

-> #(#Array #(#Point 1 2) #(#Point 3 4) true false nil)

(Dictionary newFrom: { 1->#(1 2 3) . 'foo' -> 'bar' }) asObjectLiteral
->
#(#Dictionary 1 #(#Array 1 2 3) 'foo' 'bar')

Next thing, you can 'pretty-print' it (kinda):

#(#Dictionary 1 #(#Array 1 2 3) 'foo' 'bar') printObjectLiteral

'#(#Dictionary
    1
    (#Array 1 2 3)
    ''foo'' ''bar'')'


and sure thing, you can do reverse conversion:

'#(#Dictionary
    1
    (#Array 1 2 3)
    ''foo'' ''bar'')'  parseAsObjectLiteral

a Dictionary('foo'->'bar' 1->#(1 2 3) )
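
To illustrate the "no separate parser" point: a minimal sketch, assuming a Pharo image, in which the standard compiler reads the literal array and a hand-written step rebuilds the object. The decoding shown handles only the #(#Point x y) shape from the first example and is an assumption, not the proof-of-concept code itself, which was never published.

```smalltalk
| literal point |
"The standard compiler reads the literal array; no dedicated parser is involved."
literal := Smalltalk compiler evaluate: '#(#Point 1 3)'.
"Hypothetical decoding step for the Point case only: class name first, then x and y."
point := (Smalltalk at: literal first) x: (literal at: 2) y: (literal at: 3).
point
```

Note that evaluate: runs arbitrary Smalltalk code, so this approach only suits trusted input; a reader restricted to literal arrays would be the safer choice for data coming from outside.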

Initially, i thought that it could be generic (by implementing default
Object>>#asObjectLiteral),
but then after discussing it with others, we decided to leave

Object>>#asObjectLiteral to be a subclass responsibility.
So, potentially the format allows to represent any object(s) as
literals, except from circular referencing objects, of course.

The implementation is fairly simple, as you may guess and contains no
new classes, but just extension methods here and there.

Take it with a grain of salt, since it is just a small proof of
concept. (And if doing it for real it may need some changes etc).
Since i am far from areas right now, where it can be used, i don't
want to pursue it further or advocate if this is the right way to do
things.
Nor do i have a public repository for this project.

So, if there is anyone willing to pick it up and pursue the idea
further, please feel free to do so and make a public repository for the
project.

 



 

--

Serge Stinckwich
UCN & UMI UMMISCO 209 (IRD/UPMC)
Every DSL ends up being Smalltalk
http://www.doesnotunderstand.org/