Hey!
Ok, so I hacked up that little thingy to use as the Delta file format, and just wrote a lengthy blog article about it:

http://goran.krampe.se/blog/Squeak/Tirade.rdoc

Let me know what you think!

regards, Göran
Hi Goran,
Two remarks:

1) closing the language:

I know you want control and security. But are you sure the language is open enough for future delta extensions? I am thinking of arbitrary method attributes, for example, which might require more complex objects than just strings. Imagine that you need to enter rich text attributes and specify RGB colors #(1.0 0.0 0.0)... You could eventually enhance the syntax to allow a Float literal, but not easily a Fraction (if ever needed...).

2) securing the language:

You'll definitely have to black/white list messages...

    become: {'#hacker inspect'}.
    evalStrings.

2009/3/16 Göran Krampe <[hidden email]>
> Hey!
Hi!
Nicolas Cellier wrote:
> Hi Goran,
> Two remarks:
>
> 1) closing the language:
>
> I know you want control and security.

Well, I want to maintain simplicity mainly. Currently it is so simple that it is "intuitively safe". Safety is not necessarily about exploits but rather about not inviting developers to do "smart stuff" that breaks tools etc.

> But are you sure the language is open enough for future delta extensions?

Can't really see what you mean.

> I think of arbitrary method attributes for example, that might require more
> complex objects than just strings.
> Imagine for example you need to enter rich text attributes and specify RGB
> colors #(1.0 0.0 0.0)...

Well, first of all you can always fall back on String representations in Tirade. Like say: "rgb: 'FFEA99'". The builder can always do some interpretation of the Strings passed in. A more obvious example is TimeStamps or Dates. We just decide on a suitable String representation and use that.

> You could eventually enhance syntax to allow a Float literal, but not easily
> a Fraction (if ever needed...).

Allowing richer Numbers is something I discussed today on IRC. Right now Tirade only does positive and negative Integers. I started looking at it but realized that Number class>>readFrom: and friends are complex beasts indeed! It is fairly obvious though that we can make Tirade either use the logic in Squeak "straight off" or something a tad simpler/faster.

There are basically two situations I can see that make "." in Floats tricky:

- Mistakenly thinking the "." is a message final period.
- Mistakenly thinking the "." is a brace array separator.

The first case is actually not hard at all in Tirade because it is impossible to have a digit *after* a message final period. In regular Smalltalk it is not impossible though, if the next statement starts with a literal number as receiver. In Tirade the next character must be either whitespace, " or a letter starting a new keyword/unary message.
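The first case above comes down to a one-character lookahead. Here is a minimal sketch of that rule in Python; it is not Tirade's parser code, and the function name `period_ends_message` is made up for illustration.

```python
def period_ends_message(text: str, i: int) -> bool:
    """Return True if the '.' at text[i] can be a message-final period.

    Rule taken from the paragraph above: after a message-final period the
    next character must be whitespace, a double quote, or a letter starting
    the next message. End of input is also accepted here (an assumption).
    """
    if text[i] != ".":
        raise ValueError("text[i] must be a period")
    nxt = text[i + 1] if i + 1 < len(text) else ""
    return nxt == "" or nxt.isspace() or nxt == '"' or nxt.isalpha()


# In "x: 1.5." the period at index 4 is followed by a digit, so it must
# belong to the number; the one at index 6 ends the message.
inside_number = period_ends_message("x: 1.5.", 4)
ends_message = period_ends_message("x: 1.5.", 6)
```

A period followed by a digit can therefore never be a terminator, which is what makes Floats safe at message level.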
In a brace array (without having looked at the Squeak Parser in detail) it seems that the parser "eagerly" parses Floats, so that a digit followed by a "." followed by a digit turns into a Float and yet another "." turns into a separator. So in short - if you start using Floats in a brace array then whitespace counts! :)

> 2) securing the language:
>
> You'll definitely have to black/white list messages...
>
> become: {'#hacker inspect'}.
> evalStrings.

This is done by the reader today: it sends #isSelectorAllowed: to the builder and only does the perform: if it answers true. This is not hardwired into the Parser. The same goes for the "who is the receiver" logic; you control this in your subclass of TiradeParser.

In other words, these two things are not defined in the Tirade "language". :)

regards, Göran
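The "ask the builder before performing" check can be sketched like this in Python. It only models the idea; `RecordingBuilder`, `Reader` and the method names are hypothetical stand-ins, not the Tirade classes.

```python
class RecordingBuilder:
    """Hypothetical builder that only accepts two selectors."""

    ALLOWED = {"name:", "description:"}

    def __init__(self):
        self.received = []

    def is_selector_allowed(self, selector):
        # Plays the role of #isSelectorAllowed: in the text above.
        return selector in self.ALLOWED

    def perform(self, selector, args):
        self.received.append((selector, args))
        return self


class Reader:
    """Asks the builder first and only then performs the message."""

    def __init__(self, builder):
        self.builder = builder

    def process_message(self, selector, args):
        if not self.builder.is_selector_allowed(selector):
            raise PermissionError("selector not allowed: " + selector)
        return self.builder.perform(selector, args)
```

With this split, an input such as "become: {...}" is rejected by the builder's policy rather than by the syntax, which is the point made above.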
2009/3/16 Göran Krampe <[hidden email]>
> Hi!

The fact that you need a special interpretation of -> makes me suspicious. Can you tell you won't ever need 1/2 for example?

I mean representing arbitrary deep object trees (not speaking of general graphs). The example you are presenting seems flat. Could you expose how you would build a deeper object?

Nicolas
Hi!
Nicolas Cellier wrote:
> 2009/3/16 Göran Krampe <[hidden email]>
>> Hi!
>>
>> Nicolas Cellier wrote:
>>> Hi Goran,
>>> Two remarks:
>>>
>>> 1) closing the language:
>>>
>>> I know you want control and security.
>>
>> Well, I want to maintain simplicity mainly. Currently it is so simple that
>> it is "intuitively safe". Safety is not necessarily about exploits but
>> rather about not inviting developers to do "smart stuff" that breaks tools
>> etc.
>>
>>> But are you sure the language is open enough for future delta
>>> extensions?
>>
>> Can't really see what you mean.
>
> The fact that you need a special interpretation of -> makes me suspicious.

The reason #-> is handled differently is that #-> is in fact implemented in Smalltalk, in Object:

    -> anObject
        "Answer an Association between self and anObject"
        ^Association basicNew key: self value: anObject

...it is thus not a literal syntax - which some may mistakenly believe!

BUT... in Tirade we don't allow expressions (messages to objects and using their results), but we really want a syntax to create Associations (and thus Arrays of Associations which easily can be turned into a Dictionary by the builder). So this led me to implement the syntax:

    <something> "->" <somethingelse>

...in the TiradeParser, "mimicking" Smalltalk. This is btw also a reason for choosing "brace arrays" because regular Array syntax in Smalltalk does not evaluate any expressions inside the Array, leading to this:

    #('key'->'value') ==> #('key' #'->' 'value')

...but {'key'->'value'} works fine.

> Can you tell you won't ever need 1/2 for example?

Yes, I can tell you that. :) You can create tons of "nice things to have" but if you really think about it there are a lot of ways already in Tirade to avoid adding support for "expressions" like that, for example, simply make sure to send a message that knows that the argument is a mathematical expression that the builder can evaluate:

    mathematicalExpression: '1/2'

> I mean representing arbitrary deep object trees (not speaking of general
> graphs).
> The example you are presenting seems flat. Could you expose how you would
> build a deeper object?

Sure, either you build depth in "nested data":

    structure: {'key'-> {
        'a'->12.
        'b'->13.
        'c'-> {
            'd'->{123. 345. 567}.
            'e'->12}}}

...and let the builder create whatever objects it likes given that data.

Or more likely you build depth by using a message protocol that shows what you are doing, for example using a stack protocol or whatever:

    createInstanceOf: #Animal.
        name: 'Tiger'.
        description: 'Striped animal'.
        addInstanceOf: #Leg.
            name: 'left front'.
        endInstance.
        addInstanceOf: #Leg.
            name: 'left front'.
        endInstance.
        addInstanceOf: #Leg.
            name: 'left front'.
        endInstance.
        addInstanceOf: #Leg.
            name: 'left front'.
        endInstance.
    endInstance.

(this was one reason I added indentation support in the writer, to give hints about structure and depth that actually is not there syntactically, but only semantically)

...then we can implement a builder for this:

    createInstanceOf: aClassName
        "Create an instance of given class and put on top of stack."
        stack push: (Smalltalk at: aClassName) new

    endInstance
        "Pop current object."
        stack pop

    addInstanceOf: aClassName
        | obj |
        obj := (Smalltalk at: aClassName) new.
        stack top add: obj. "add the Leg to the Animal"
        stack push: obj "push Leg on stack"

    name: aString
        "Yeah, we can use DNU to cover these."
        stack top name: aString

    description: aString
        "Yeah, we can use DNU to cover these."
        stack top description: aString

...Also, if any of the above messages return another object instead of the builder (self) - that will be the receiver of the next Tirade message. But in order for the #endInstance stuff to work we can't return a domain object as the receiver of the next Tirade message (to receive #name: and #description: directly) because it will not understand #endInstance and it does not have access to the builder object nor the stack etc.

Mmmmm, but... well, we could actually let the *TiradeReader* implement some Tirade messages to maintain the stack of objects that are meant to receive the Tirade messages. Aaaahhh. Then we let the reader implement a stack of receivers instead of just "let the result of this Tirade message be the receiver of the next Tirade message". Cool, definitely a useful kind of "reader".

Ok, I will add some of these examples - but to answer your question - it is quite easily done. :)

regards, Göran
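For readers who want to try the stack protocol outside Squeak, here is a rough Python transliteration of the builder above. It is a sketch under assumptions: the `Animal` and `Leg` classes, the `CLASSES` lookup table (standing in for "Smalltalk at: aClassName") and the `root` attribute that remembers the finished object are all invented for illustration, and selectors are written as Python method names.

```python
class Leg:
    def __init__(self):
        self.name = None


class Animal:
    def __init__(self):
        self.name = None
        self.description = None
        self.parts = []

    def add(self, part):
        self.parts.append(part)


# Stand-in for the class lookup "Smalltalk at: aClassName".
CLASSES = {"Animal": Animal, "Leg": Leg}


class StackBuilder:
    def __init__(self):
        self.stack = []
        self.root = None  # assumption: keep the outermost object for the caller

    def create_instance_of(self, class_name):
        obj = CLASSES[class_name]()
        self.root = obj
        self.stack.append(obj)

    def add_instance_of(self, class_name):
        obj = CLASSES[class_name]()
        self.stack[-1].add(obj)   # add the Leg to the Animal
        self.stack.append(obj)    # the Leg receives the following messages

    def end_instance(self):
        self.stack.pop()

    def name(self, value):
        self.stack[-1].name = value

    def description(self, value):
        self.stack[-1].description = value


def build(messages):
    """Send each (selector, *args) tuple to a fresh builder, return the root."""
    builder = StackBuilder()
    for selector, *args in messages:
        getattr(builder, selector)(*args)
    return builder.root
```

Note that the depth lives entirely in the order of the messages; the builder stack is the only place where nesting is tracked.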
2009/3/17 Göran Krampe <[hidden email]>
> Hi!

First, thank you for sharing this work and your thoughts, that's interesting.
I understand you added a syntax for Associations because you need some Associations NOW in delta...
.. Or maybe for imitating JSON structures...

But to me, the above sentence is heavy (pathological?) for such a simple thing as a Fraction. It's like writing some C code:

    Cat aCat; /* this is a cat */

;) Of course, Fraction will probably not be a problem for deltas for the next few years, but I want to explore other possible applications of your ideas :)
Sure, VW GUI specs did (do?) use this kind of scheme... You have a sort of JSON in Smalltalk Array Literal, maybe more powerful because you can even intermix unary messages.

    {'MyObject' 'a'->0. 'beAGoodObject'}.

It might even be possible to use n-ary. I guess you omitted class names in the above hierarchy by accident...

But then it's kind of troubling to have a Date string syntax in JSON syntax in Tirade syntax in Smalltalk. Doesn't that deserve more thought?
Yes, but nothing above tells to store the Leg object in the leg slot of the Animal.
With DNU and other tricks, this might be doable, I trust your talent, and I am impatient to read your next solution.

What I can tell you is that my first scheme for saving object trees was built on such a stack pattern...
...But I then switched to a syntax with file scope variables support to get:

1) more power (ability to save arbitrary graphs)
2) readable code (I don't consider a tree spanning over several pages readable)
3) plenty of deprecated messages for handling stack (they don't add value to the API, do they ?)
4) no more bytecode limitations
5) a scheme that could be used to transcript (log) user graphical actions

From this time on, the application lived twenty years with constant upgrades without file format problems.

Again, this is probably too much for deltas, and does not meet all your requirements. Sure, a collection of flat objects might be more than enough. But it's worth thinking twice for possible evolutions or other apps.

Nicolas
Hi!
(snipping a bit)

Nicolas Cellier wrote:
>>> Can you tell you won't ever need 1/2 for example?
>>
>> Yes, I can tell you that. :) You can create tons of "nice things to have"
>> but if you really think about it there are a lot of ways already in Tirade
>> to avoid adding support for "expressions" like that, for example, simply
>> make sure to send a message that knows that the argument is a mathematical
>> expression that the builder can evaluate:
>>
>> mathematicalExpression: '1/2'
>
> I understand you added a syntax for Associations because you need some
> Associations NOW in delta...
> .. Or maybe for imitating JSON structures...

More for that, I was not particularly aiming at anything in Deltas.

> But to me, above sentence is heavy (pathological?) for such a simple thing
> as a Fraction.
> It's like writing some C code: Cat aCat; /* this is a cat */ ;)
> Of course, Fraction might probably not be a problem for deltas for few next
> years, but i want to explore other possible applications of your ideas :)

Sure, but... a Fraction? :) It is not that often that you sit around with a Fraction in your hand so to speak. Btw, I just modified Tirade to now support the full Squeak Number syntax. So you can do radix, exponents, floats etc. And I hacked up that TiradeStackReader too. :)

Also, do recall that JSON is wildly successful and it only does numbers and strings. Definitely no Fractions there. :)

>>> I mean representing arbitrary deep object trees (not speaking of general
>>> graphs).
>>> The example you are presenting seems flat. Could you expose how you would
>>> build a deeper object?
>>
>> Sure, either you build depth in "nested data":
>>
>> structure: {'key'-> {
>>     'a'->12.
>>     'b'->13.
>>     'c'-> {
>>         'd'->{123. 345. 567}.
>>         'e'->12}}}
>>
>> ...and let the builder create whatever objects it likes given that data.
>
> Sure, VW GUI specs did (do?) use this kind of scheme...
> You have sort of JSON in Smalltalk Array Literal, maybe more powerful
> because you can even intermix unary messages.

Yes, the supported syntax for "data" should cover JSON. But "intermix unary messages"? Not inside the data. Tirade is strictly a sequence of messages with 0-n arguments where each argument is more or less like a JSON doc:

    keyword1: <data> keyword2: <data>.
    unary.

> {'MyObject' 'a'->0. 'beAGoodObject'}.
> It might even be possible to use n-ary.
> I guess you omitted class names in above hierarchy by accident...

You mean in my little "structure"? No, it is up to the builder to "know" what to create. Sure, you can embed class names or other Strings as hints to what classes the builder should use, but it is not always needed.

> But then it's kind of troubling to have a Date string syntax in JSON syntax
> in Tirade syntax in Smalltalk.
> Doesn't that deserve more thoughts ?

What deserves more thought, Dates? Tirade does not have "JSON syntax" - it just happens to more or less match JSON in capability, when it comes to literal syntax.

> Yes, but nothing above tells to store the Leg object in the leg slot of the
> Animal.

This may be one of our "differences" in view. I do not intend to use Tirade as a strict "serialization mechanism". I will not iterate over ivars and write all state out including info about which concrete class it was etc. And while reading I do not expect to get specific concrete class names and ivar names and all their contents.

Instead I aim to write and read a series of messages that is "just enough" to recreate objects. So the above was actually a rather bad example, I was just pseudo coding from the hip.

> With DNU and other tricks, this might be doable, I trust your talent, and i
> am impatient to read your next solution.

I added TiradeStackReader, it seems quite nice.

> What I can tell you is that my first scheme for saving objects trees was
> built on such a stack pattern...
> ...But I then switched to a syntax with file scope variables support to get:
> 1) more power (ability to save arbitrary graphs)
> 2) readable code (I don't consider a tree spanning over several pages
> readable)
> 3) plenty of deprecated messages for handling stack (they don't add value to
> the API, do they ?)
> 4) no more bytecode limitations
> 5) a scheme that could be used to transcript (log) user graphical actions
> From this time on, the application lived twenty years with constant upgrades
> without file format problems.
>
> Again, this is probably too much for deltas, and does not meet all your
> requirements.
> Sure, a collection of flat objects might be more than enough.
> But it's worth thinking twice for possible evolutions or other apps.

It might be worth noting that I want something simple that people can use with very little effort. This means cutting some corners.

But I also don't want to do "plain serialization", instead I want to create a *custom* builder (serializers have a generic builder for all kinds of Smalltalk objects) that works in *tandem* with the domain objects it constructs (by utilizing real instance creation methods taking arguments and not just "basicNew" followed by stuffing ivars) and that can be driven by Tirade messages.

The above paragraph catches it quite well I think. So construction looks like this perhaps:

    "tirade message sequence"
        => TiradeParser (parses)
        => TiradeStackReader (security checks and maintaining stack of receivers)
        => DeltaBuilder
        => "delta domain objects"

Now, a very small example:

Tirade input:
---------------
createDelta: 'Name of delta'.
    addRenameClass: #OldClassName.
        newClassName: #NewClassName.
    end.
    addRenameClass: #AnotherOldClassName.
        newClassName: #AnotherNewClassName.
    end.
end.
-----------------------------

    DeltaBuilder>>createDelta: aName
        ^DSDelta named: aName

    DSDelta>>addRenameClass: oldClassName
        "Here we return the DSClassRenameChange instance.
        Thus it will be the stacked receiver for Tirade messages."
        ^self addChange: (DSClassRenameChange from: oldClassName)

    DSClassRenameChange>>newClassName: newClassName
        newName := newClassName

...ok, so the DeltaBuilder creates a Delta and returns it as the next receiver on the stack.

Then comes another message to add a "rename class" change. (A Delta is a sequence of Changes basically). We do that BUT we also return this new DSClassRenameChange object so that it will be the next receiver.

#newClassName: is thus now sent to the DSClassRenameChange object, setting one of its attributes. It returns self so it will still be the receiver for more messages. If it returned nil it would cause the TiradeStackReader to pop it, thus an object can actually "pop itself".

More likely the Tirade input knows when we are done setting attributes so it sends #end. This is a message that TiradeStackReader intercepts and that causes it to pop the stack. The final #end pops the Delta too.

Note that we are not naming ivars in the Tirade input, we aren't even naming the class DSDelta! Everything is a message with data as arguments.

regards, Göran
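The receiver-stack rules described in this message (a new object returned by a message is pushed, self keeps the receiver, nil pops it, and "end" is intercepted by the reader) can be sketched in Python as follows. This is a model of the idea only, not the TiradeStackReader source; the class names mirror the example, selectors are written as Python method names, and storing the delta on the builder is an assumption made so the result can be inspected.

```python
class ClassRenameChange:
    def __init__(self, old_name):
        self.old_name = old_name
        self.new_name = None

    def new_class_name(self, new_name):
        self.new_name = new_name
        return self  # returning self keeps this object as the receiver


class Delta:
    def __init__(self, name):
        self.name = name
        self.changes = []

    def add_rename_class(self, old_name):
        change = ClassRenameChange(old_name)
        self.changes.append(change)
        return change  # a new object: the reader pushes it as next receiver


class DeltaBuilder:
    def __init__(self):
        self.delta = None  # assumption: remember the delta for later lookup

    def create_delta(self, name):
        self.delta = Delta(name)
        return self.delta


class StackReader:
    def __init__(self, builder):
        self.stack = [builder]

    def process_message(self, selector, args):
        if selector == "end":          # intercepted by the reader itself
            self.stack.pop()
            return
        receiver = self.stack[-1]
        result = getattr(receiver, selector)(*args)
        if result is None:             # nil: the receiver "pops itself"
            self.stack.pop()
        elif result is not receiver:   # another object: push it
            self.stack.append(result)


def run(messages):
    builder = DeltaBuilder()
    reader = StackReader(builder)
    for selector, *args in messages:
        reader.process_message(selector, args)
    return builder, reader
```

The security check is left out here to keep the stack logic visible; in the pipeline above it sits in the same reader step.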
2009/3/17 Göran Krampe <[hidden email]>
Hi! me too
Agree. My "serialization" was customized to use higher level messages to re-build the objects (a public API)... ...rather than lower level inst var description. When necessary, with help of a builder (stored in a predefined file scope variable).
OK, this is cute, both syntax and implementation are simple, efficient and readable.

I was fearing this would apply only to a collection of flat objects.
... and any deeper level would require another strategy, like using a JSON like literal array...
... unless, each object acts as the builder for its next level in hierarchy.

Here DSDelta acts as the builder for building a DSClassRenameChange. This is certainly a good pattern. Thank you for explaining.

Nicolas
Hi!
(this thread is probably done now, but one last "devil in the details"-post for the insatiably interested...)

Nicolas Cellier wrote:
> I wrote:
>> It might be worth noting that I want something simple that people can use
>> with very little effort. This means cutting some corners.
>>
>> But I also don't want to do "plain serialization", instead I want to create
>> a *custom* builder (serializers have a generic builder for all kinds of
>> Smalltalk objects) that works in *tandem* with the domain objects it
>> constructs (by utilizing real instance creation methods taking arguments and
>> not just "basicNew" followed by stuffing ivars) and that can be driven by
>> Tirade messages.
>>
>> The above paragraph catches it quite well I think. So construction looks
>> like this perhaps:
>
> Agree.
> My "serialization" was customized to use higher level messages to re-build
> the objects (a public API)...
> ...rather than lower level inst var description.
> When necessary, with help of a builder (stored in a predefined file scope
> variable).

Ok, so in fact a bit similar in philosophy.

>> "tirade message sequence" => TiradeParser (parses) => TiradeStackReader
>> (security checks and maintaining stack of receivers) => DeltaBuilder =>
>> "delta domain objects"
>>
>> Now, a very small example:
>>
>> Tirade input:
>> ---------------
>> createDelta: 'Name of delta'.
>>     addRenameClass: #OldClassName.
>>         newClassName: #NewClassName.
>>     end.
>>     addRenameClass: #AnotherOldClassName.
>>         newClassName: #AnotherNewClassName.
>>     end.
>> end.
>> -----------------------------
>>
>> DeltaBuilder>>createDelta: aName
>>     ^DSDelta named: aName
>>
>> DSDelta>>addRenameClass: oldClassName
>>     "Here we return the DSClassRenameChange instance.
>>     Thus it will be the stacked receiver for Tirade messages."
>>     ^self addChange: (DSClassRenameChange from: oldClassName)
>>
>> DSClassRenameChange>>newClassName: newClassName
>>     newName := newClassName
>>
>>
>> ...ok, so the DeltaBuilder creates a Delta and returns it as the next
>> receiver on the stack.
>>
>> Then comes another message to add a "rename class" change. (A Delta is a
>> sequence of Changes basically). We do that BUT we also return this new
>> DSClassRenameChange object so that it will be the next receiver.
>>
>> #newClassName: is thus now sent to the DSClassRenameChange object, setting
>> one of its attributes. It returns self so it will still be the receiver for
>> more messages. If it returned nil it would cause the TiradeStackReader to
>> pop it, thus an object can actually "pop itself".
>>
>> More likely the Tirade input knows when we are done setting attributes so
>> it sends #end. This message is a message that TiradeStackReader intercepts
>> and causes it to pop the stack. The final #end pops the Delta too.
>>
>> Note that we are not naming ivars in the Tirade input, we aren't even
>> naming the class DSDelta! Everything is a message with data as arguments.
>>
>> regards, Göran
>
> OK, this is cute, both syntax and implementation are simple, efficient and
> readable.

Thanks! Yeah, I like it so far, although the stack bits aren't set in stone yet - see below.

> I was fearing this would apply only to a collection of flat objects.
> ... and any deeper level would require another strategy, like using a JSON
> like literal array...

No, definitely not - in that case it would be rather ... dull. :)

> ... unless, each object acts as the builder for its next level in hierarchy.
> Here DSDelta acts as the builder for building a DSClassRenameChange.
> This is certainly a good pattern.
Yes, but I don't want to "hardcode" this pattern into Tirade, which is why I separate some things from the TiradeParser (language) into TiradeReader (subclass of parser that actually "does" something) and also into the builder object(s). My current view on the responsibilities:

TiradeParser: Parses an input stream of Tirade messages. Completely defines the syntax of Tirade. If you use TiradeParser on a stream you will get printouts in Transcript because it doesn't really *do* anything with all the messages it reads. TiradeParser knows nothing about receivers, it just parses an endless sequence of messages.

The reader: A reader is a subclass of TiradeParser and implements #processMessage. It could do whatever it likes with the messages: printing them, constructing an object - beep, whatever. :)

TiradeReader: The only included reader class in the Tirade package. It deals with three things:

1. Figuring out the receiver for the message.
2. Making security checks on the message to be sent.
3. How to actually send the message.

Figuring out the receiver is done by using a stack of receivers and following some conventions around that stack.

The security check right now is based on checking the method category and verifying that it beginsWith: 'tirade'. Otherwise there is an error. If there is no implementation (then the receiver probably has implemented doesNotUnderstand:) there is no check, we presume it does its own checking. There is also a "whitelist" of allowed messages that is empty by default. Using this you can allow receivers like blocks etc (allowing #value:) or such.

How to send the message. This is interesting, I added a Set of "controlMessages" in the reader in which the builder can register messages that *it* wants to receive and not the current receiver on the stack. This makes it possible for the builder to pre-register for example stack manipulation messages and when it receives them it can manipulate the reader "from the outside" regardless of the receiver stack. This is very nice because then the builder is "in full control" and can decide how to deal with Tirade input. Even if the input stream says "end." the builder can decide to not pop the current receiver.

Anyway, I find Tirade to be quite nice so far, and encourage anyone interested in "file formats" or similar to give feedback. The SS repo is open for writes :)

regards, Göran
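The two mechanisms from this last post, the category-based security check and the builder-registered control messages, can be modelled roughly like this in Python. It is a sketch under assumptions and not the TiradeReader code: Python has no method categories, so a hypothetical `tirade` decorator marks allowed methods instead, and the `Item` and `Builder` classes are invented for the demo.

```python
def tirade(func):
    """Mark a method as callable from Tirade input (stand-in for a
    method category whose name begins with 'tirade')."""
    func.category = "tirade-api"
    return func


class Reader:
    def __init__(self, builder):
        self.builder = builder
        self.stack = [builder]
        self.control_messages = set()  # routed to the builder, not the stack top
        self.whitelist = set()         # empty by default, as in the text

    def is_allowed(self, receiver, selector):
        if selector in self.whitelist:
            return True
        method = getattr(type(receiver), selector, None)
        if method is None:
            return True  # no implementation: receiver is trusted to check itself
        return getattr(method, "category", "").startswith("tirade")

    def process_message(self, selector, args):
        on_control = selector in self.control_messages
        receiver = self.builder if on_control else self.stack[-1]
        if not self.is_allowed(receiver, selector):
            raise PermissionError("message not allowed: " + selector)
        return getattr(receiver, selector)(*args)


class Item:
    def __init__(self):
        self.label = None

    @tirade
    def name(self, value):
        self.label = value

    def destroy(self):  # not marked, so the reader refuses to send it
        self.label = "destroyed"


class Builder:
    def __init__(self):
        self.reader = None
        self.items = []

    def attach(self, reader):
        self.reader = reader
        reader.control_messages.add("end")  # builder wants to see every "end"

    @tirade
    def start_item(self):
        item = Item()
        self.items.append(item)
        self.reader.stack.append(item)

    @tirade
    def end(self):
        # The builder decides whether to pop; it never pops itself here.
        if len(self.reader.stack) > 1:
            self.reader.stack.pop()
```

Because "end" is registered as a control message, it reaches the builder even while an Item is on top of the stack, and the builder stays free to ignore it.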