feedback on using JSON to specify baselines for git repository


Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
Stef,

Let me start by acknowledging that I understand that I am having a hard time getting my point across to you (and apparently Camillo). The following message may sound harsh, but I am merely trying to establish a basic foundation from which I am drawing conclusions. While it is clear to me that you are not understanding my points, it is not at all clear to me what you don't understand... Consequently I am going to try to say the same things I have said in my previous emails in a different form and see if we can make some progress.

POINT 1 "Executable specifications introduce a SECURITY HOLE":

The big sticking point that we appear to have is your contention that an "executable specification is just fine" and my contention that an "executable specification is not".

You do understand that when you crack open an mcz file, the serialized Monticello meta data is in a parsable file format called DataStream, right? When you deserialize that meta data, a stream is parsed and data is read off of that stream to reconstruct the Monticello meta data without executing any code that wasn't already loaded in the image (i.e., no SECURITY PROBLEMS).

So the first thing that I need from you Stef is an acknowledgement that you understand this is a weakness of an executable specification. In other words, this type of situation is unacceptable:

  (Node
        name: #baseline
        son: (Node
                        name: 'ProjectVersion')).
  SmalltalkImage current shutdown.

So please acknowledge that an executable specification like the above is not acceptable.

It is pointless to discuss any further if you don't understand why the above is bad.

POINT 2 "A parser is needed for a non-execution (SECURE) specification":

If we agree that an executable specification format is not a good thing, then you must agree that in order to avoid trojan horse attacks like `SmalltalkImage current shutdown.`, we must have a parser for our specification and our specification must also be readable by human beings.

So the second point of acknowledgement that we must agree on is that we need to have a parser for whatever format we cook up. If you do not agree that we need a parser then you haven't agreed with the first point, because the only way to avoid a trojan horse is to only allow statements that conform to the agreed-upon format.
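
As a rough illustration of POINT 2 (a Python sketch, not Metacello code): the specification text is only parsed, never evaluated, and anything outside the agreed-upon format is rejected. The key names (`baseline`, `projectVersion`, `url`) and the `read_spec` helper are hypothetical, chosen just for this example.

```python
import json

# Hypothetical whitelist of keys that the agreed-upon format allows.
ALLOWED_KEYS = {"baseline", "projectVersion", "url"}


def read_spec(text):
    """Parse a JSON specification and reject anything outside the format.

    json.loads only builds dicts, lists, strings, numbers, booleans and None;
    it never executes anything contained in the text.
    """
    spec = json.loads(text)
    if not isinstance(spec, dict):
        raise ValueError("specification must be a JSON object")
    unknown = set(spec) - ALLOWED_KEYS
    if unknown:
        raise ValueError("unexpected keys: %s" % sorted(unknown))
    for key, value in spec.items():
        if not isinstance(value, str):
            raise ValueError("value for %r must be a string" % key)
    return spec


good = '{"baseline": "common", "projectVersion": "3.6"}'
print(read_spec(good))

# A smuggled statement is just data here, and it fails validation.
bad = '{"baseline": "common", "doit": "SmalltalkImage current shutdown."}'
try:
    read_spec(bad)
except ValueError as error:
    print("rejected:", error)
```

The point of the sketch is that the hostile line can at worst become a rejected string; it has no path to being run.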

POINT 3 "A Smalltalk parser for JSON exists"

One class, 28 methods, and you have a JSON parser ... If I weren't writing emails right now, I would have a JSON to Metacello reader completed (I already have a Metacello to JSON writer written).
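
To make the reader/writer pairing concrete, here is a minimal Python sketch of a round trip. It is an illustration only: `PackageSpec`, `spec_to_json` and `spec_from_json` are hypothetical names, and the real Metacello data structures are certainly richer than this.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PackageSpec:
    """Hypothetical stand-in for a Metacello package specification."""
    name: str
    requires: list


def spec_to_json(package):
    # Writer: plain data out, no code is generated.
    return json.dumps(asdict(package), indent=2, sort_keys=True)


def spec_from_json(text):
    # Reader: plain data in, then the object is constructed explicitly.
    data = json.loads(text)
    return PackageSpec(name=data["name"], requires=list(data["requires"]))


original = PackageSpec(name="MonticelloFileTree-Core", requires=[])
text = spec_to_json(original)
print(text)
print(spec_from_json(text) == original)
```

Because the reader builds the object itself from parsed data, several specifications can be held in memory at once without installing any class for them.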

POINT 4 "JSON is a readable format"

POINT 5 "I am not married to JSON"

If you can provide me with a portable parser (must run on Pharo, GemStone and Squeak) that parses whatever format you'd like to suggest and that allows me to write code to use the output of that parser to produce the Metacello data structures, then I will be happy to use your parser and format.

If you recall, I started my work on Metacello using an EXECUTABLE literal array format that was roundly criticized as being too difficult to read and was replaced by the ConfigurationOf specifications which we use now.

If you tell me that you PREFER the literal array format, then I won't complain: as long as you provide me with an independent, portable parser for the literal array format, I'll be happy...

Dale


----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Monday, February 6, 2012 11:00:22 AM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
|
| On Feb 6, 2012, at 6:47 PM, Dale Henrichs wrote:
|
| > Stef,
| >
| > 1. The JSON parser used by Seaside is one class with 28 methods. I
| > intend to copy
| >   the class and use it for Metacello, with possible mods to support
| >   YAML. For
| >   the very short term, I will do my "one week experiment" using
| >   JSON.
| >
| > 2. I am looking at JSON as a means to get away from the fact that
| > Metacello does
| >   not have a non-executable specification format and that is a
| >   GIANT SECURITY HOLE.
| >
| >   I would love for a site like Bibliocello or SqueakSource to
| >   reason about Metacello
| >   versions, but with only an executable format to work with, a
| >   server site CANNOT
| >   create Metacello objects because of the security risks to do so.
|
| I still do not understand what you mean.
|
| #( #baseline
| #( projectVersion '3.6' )
| #( url 'http://www.'))
|
| Why is this not enough?
| I do not get it.
|
| Then the smalltalk message syntax is excellent to represent
| declarative structure
|
| (Node
| name: #baseline
| son: (Node
| name: 'ProjectVersion'))
|
| I still do not get why you need a different syntax
|
| >   If we have a machine parsable and human readable format, then part
| >   of the problem is
| >   solved.
| >
| > 3. If I want to do true merges of configurations, I must be able to
| > arrange to have 2-3
| >   versions of the ConfigurationOfXXX class installed in the image
| >   at one point in time.
| >   If I want to preserve any changes that have been made to the
| >   original configuration, I
| >   am in trouble.
| >
| >   With a non-executable specification for a configuration, I can
| >   PARSE 3 different files
| >   and easily create the 3 instances of MetacelloProject that I need
| >   to do the merge.
| >
| >   For mcz files, I can arrange to embed the JSON representation of
| >   the spec in a separate
| >   zip file directory…
|
| To me this is totally orthogonal to git usage or not. It depends on
| the specification of your inputs.
| And this is why I said that the problem is the fact that you need to
| execute method to get objects
| but not the technology used.
|
| > 4. In a file-based repository where the entire directory structure
| > is versioned instead of
| >   the individual files, the class-based configuration does not make
| >   a lot of sense. In fact
| >   I only need a single baseline specification. So I have the choice
| >   of continuing with
| >   the executable specification or creating a parsable
| >   representation of the Metacello
| >   specifications
|
| As I said, experiment with a really declarative specification but in
| plain Smalltalk syntax.
| Look at VisualWorks UI specs: this is exactly the same problem (see
| the end of this mail).
|
| Look at MSE format for Moose this is a declarative spec for model of
| source code.
| Ask Camillo, because he is an expert in Git and SCM; we discussed it and he
| agreed with me. :)
|
| > 5. Since JSON has an existing (compact) parser for Smalltalk and
| > JSON is a pretty readable
| >   format, I am proposing that JSON be used for the parsable
| >   Metacello specifications.
| >
| > 6. The parsable format (JSON) and executable format
| > (ConfigurationOfXXX) can be used
| >   interchangeably. For the foreseeable future the vast majority of
| >   developers will continue
| >   to use the executable format to create their specifications.
|
| I hope not. It would be good to have one format for all.
|
| > If they happen to be using
| >   a file-based repository (git or svn), then they will have the
| >   option of creating/editing
| >   the parsable format if they prefer.
| >
| > Dale
| >
| >
| >
| windowSpec
| "UIPainter new openOnClass: self andSelector: #windowSpec"
|
| <resource: #canvas>
| ^#(#{UI.FullSpec}
| #window:
| #(#{UI.WindowSpec}
| #label: #(#{Kernel.UserMessage} #key: #UnlabeledCanvas
| #defaultString: 'Unlabeled Canvas' #catalogID: #labels)
| #bounds: #(#{Graphics.Rectangle} 512 384 858 635 ) )
| #component:
| #(#{UI.SpecCollection}
| #collection: #(
| #(#{UI.TextEditorSpec}
| #layout: #(#{Graphics.LayoutFrame} 10 0 10 0 -10 1 -10 1 )
| #name: #textEditor
| #model: #textHolder
| #isReadOnly: true
| #tabRequiresControl: true ) ) ) )
|
|

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
In reply to this post by stephane ducasse-2


----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Monday, February 6, 2012 11:11:22 AM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
|
| On Feb 6, 2012, at 7:12 PM, Dale Henrichs wrote:
|
| > Stef,
| >
| > As far as I am concerned the beauty of this whole project is that
| > the majority of the functionality of Monticello will be preserved.
| > Monticello packages will still be defined and used, the Monticello
| > Browser will still be used. etc.
| >
| > The work that Camillo, Max, and Otto have done is to create an
| > alternate mechanism for storing Monticello packages in a
| > repository.
|
| Yes, I discussed with Camillo: mapping one method per file and
| one protocol per folder.
|
| > The mcz format is based on creating a zip repository and
| > serializing the Monticello meta model into that format. Each mcz
| > file contains a different version of the package.
|
| Yes
| >
| > The FileTree (Otto) and FSGit(Camillo and Max) projects simply map
| > the package structure onto a directory structure and use embedded
| > .st files for storing the source code. A single .st file contains
| > the class definition, there are separate .st files for each of the
| > methods, etc. When a package is loaded from a FileTree or FSGit
| > repository, you have an honest to goodness Monticello meta model
| > (snapshot) that can be used like any other snapshot.
|
| Yes I asked camillo during lunch about it.
|
| > The big difference is the ancestry information needs to be handled
| > differently. For git and svn the ancestry information is not used
| > for doing merges since svn and git have their own representation
| > of the history of the repository. So when working strictly with
| > git or svn, the merge tools for those repositories can and should
| > be used, especially when it comes to the Metacello meta model.
|
| Yes so this means no merging tool in Smalltalk anymore. Just plain
| unix one.

NO ... read the next sentence....
|
| > When using git and svn one will still be able to use the Monticello
| > 3 way merge code for merging packages ... everything works the
| > same at the level of source code ...
|
| Yes, but I'm not sure, because people will use Unix/GitHub tools and
| maybe this is better, but it means that we
| cannot change the behavior if we do not like it.

If the unix/github tools are BETTER then they will be used ... Is this really a bad thing?

If the unix/github tools can be improved then they can be improved in Smalltalk.

If you want to implement your own git merge capability in Smalltalk, then I think that you can certainly do so ... the git API gives you full access. If you do not like the behavior, you can certainly change it...


|
| > The only remaining work to make git/svn truly usable from Smalltalk
| > is to adapt Metacello so that it can be used to load the projects
| > that are stored in git/svn ...
| >
| > And that is the work that I am doing right now .... taking a
| > serious look at what changes need to be made to Metacello to make
| > this integration as seamless as possible ...
| >
| > To move to git/svn does not require a revolution, it can be done in
| > an evolutionary manner ... I intended Metacello to be flexible
| > from the very beginning, since MC2 was on the horizon, so it isn't
| > as big a stretch to incorporate git/svn as I had been thinking
| > only a week ago.
| >
| > I don't think you should be sad, Stef. I really think that a number
| > of the things that you want for Pharo can be done much more
| > effectively using git/svn than trying to build them from scratch
| > in Smalltalk …
|
| I know. Now I would like to get some stable ground. Because we will
| put in production (we got an engineer and a postdoc working) on
| jenkins rules checking and distribution construction and for that we
| need a robust metacello. This is why I'm thinking that having a
| declarative syntax and a fixed semantics is probably a good move to
| have.

Let me be very clear. I am not changing the "declarative syntax" of Metacello. With JSON I am proposing an additional format for specifications, not a change to the syntax.

I am looking very seriously at supporting directory-based SCMs (as one or more new Monticello repository types), but it is very important that the directory-based SCMs do not change the existing semantics of Metacello.

| I guess that by now we have nearly all the cases, and I do not understand
| why a Metacello configuration could not be handled as pure data on the class
| side or in the manifest of a specific package (this is what we are
| doing for rule checking positive): a package will have a manifest
| class with only meta data.

It is very important to me that you understand why we can't just go hanging methods off of Configuration classes ... I have been maintaining for a long time that the Configuration class is just a "file format" like XML and I have discouraged any attempts that would make the specifications DEPENDENT on an executable specification.

Hopefully, my other message in this thread will allow us to focus on reaching an understanding on this point ...

I don't think that you quite understand what my concerns are and until you understand my concerns it is difficult to move forward.

|
| Now people piss on Monticello regularly, but when Monticello arrived
| it was one of the first SCMs doing three-way merge.
| And instead of pissing on it, people should have improved it. It is
| boring to hear MC is bad after a while.

If Monticello was so good, why did Metacello have to be invented?

I will repeat that I don't want to throw Monticello away ... being able to use the Monticello packages in git/svn repositories is a good thing as it continues to leverage the good things about Monticello!

Monticello does not manage the versions of projects, it manages the versions of files and that is a limitation of Monticello.

Git/svn manage versions of projects and that is a good thing ... it is why people don't argue about WHY they are using SVN or GIT. They only argue about whether they should use SVN or GIT ... the WHY is a given.

|
| A real problem with MC is
| - the ancestor history is stored in the image 4mb now
| - does not handle well non code elements.

Directory-based repositories address the history issue immediately and directory-based repositories do a wonderful job of handling non-code elements ...

Again the beauty of this approach is that we can keep our Monticello and have our non-code elements as well...

|
| Now even if I prefer the git model over SVN, Git is too arcane for
| me. It adds a good level of complexity and I'm not blind.
| I'm ok to learn it too. But having tool in your language is also
| quite cool.

I will agree that some seemingly standard git operations require some really odd command incantations to perform... To me, GitHub makes it worthwhile learning the incantations...

|
| Stef
|
|
| > Dale
| > ----- Original Message -----
| > | From: "stephane ducasse" <[hidden email]>
| > | To: [hidden email]
| > | Sent: Sunday, February 5, 2012 12:19:28 PM
| > | Subject: Re: [Metacello] feedback on using JSON to specify
| > | baselines for git repository
| > |
| > | > Yes. In the beginning, the git (or SVN) tools that support the
| > | > merge would be used to do the source level merges. Over time,
| > | > if
| > | > it is successful I imagine that image-level tools for doing the merge
| > | > would be developed…
| > |
| > | does it mean dropping all the merge tools from the image? and the
| > | three way merge?
| > |
| > | > | can you give one example?
| > | > | Because we want to have a real code representation exchange
| > | > | format
| > | > | for logging changes:
| > | > | having method, and class is not enough.
| > | > | Now we have ring so we could expect that a new logger would
| > | > | log
| > | > | better the actions we are doing
| > | >
| > | > The JSON-based format is for Metacello specifications only.
| > |
| > | ah for Metacello so I'm even more confused. So not spec package:
| > | requires: []… anymore.
| > | So we will have to write Metacello specs in JSON?
| > |
| > | So in addition the spec in metacello we will have to load a JSON
| > | or
| > | YAML parser?
| > |
| > | > Using a class for ConfigurationOfXXX has been a convenient and
| > | > expedient mechanism for defining configurations in Metacello,
| > | > but
| > | > using executable code to define a specification is not ideal:
| > | >
| > | >  - one cannot extract specification information without
| > | >  executing
| > | >    code, which is bad from a security point of view
| > | >  - tools have to create classes and generate methods in order
| > | >  to
| > | >    write a specification, again not an ideal situation.
| > |
| > | Sure, but you do not have to execute code to use the Smalltalk
| > | syntax
| > | as a declarative one.
| > | You do not need a JSON syntax to get a tree structure.
| > |
| > | The problem of metacello is not the execution but the fact that
| > | it is
| > | not used declaratively.
| > | Because
| > | if you have a tree in JSON
| > |
| > | [
| > |   {"baseline" : [
| > | {"common" :
| > | [
| > |   {"package" : {
| > |   "name" : "MonticelloFileTree-Core"}},
| > |
| > | then you can have
| > | Baseline named: 'Common'
| > | addElement:
| > | Package named: 'name'
| > |
| > | You can use the smalltalk syntax as a declarative one. This is
| > | totally orthogonal.
| > | So I really do not understand why JSON is better except that you
| > | will
| > | need yet another parser.
| > |
| > |
| > | >
| > | > With the file based SCM systems (SVN and GIT) the Metacello
| > | > configuration does not have to become the data base for all
| > | > versions of the project.
| > | DNU
| > |
| > | > It is only necessary for Metacello to specify required projects
| > | > and
| > | > package dependencies, basically all we need is a baseline
| > | > specification.
| > |
| > | I do not see why.
| > | Because if I want to rely on specific versions or latest dev….
| > |
| > |
| > | > For a single baseline specification, one would need only one
| > | > method
| > | > in the current class-based system so including a class in the
| > | > mix
| > | > is basically overkill. I considered using a doit to define the
| > | > specification, but thought that now would be a good time to go
| > | > for
| > | > the structured file to specify the baseline.
| > |
| > | you are losing me and I feel immensely sad because I have the
| > | impression that this is the revolution in metacello
| > | and that we will have to reinvent the wheel and without a map
| > | packages are nothing …
| > | So you killed my energy.
| > |
| > | >
| > | > |
| > | > | That I do not know. For me I do not see problems beside
| > | > | resources
| > | > | management. I do not see the difference
| > | > | for the rest because I can merge and the merge works quite
| > | > | well.
| > | > | What I know is that we technical people are excited by
| > | > | technology.
| > | > | People are telling me that git is wonderful
| > | > | and that I should stop using svn. Well so far Git just
| > | > | alienates
| > | > | me
| > | > | and I need a bad UI to manage simple pull requests.
| > | >
| > | > You are talking about merging individual mcz files working well
| > | > and
| > | > I agree. Monticello is a very nice system for dealing with
| > | > versioned files.
| > | >
| > | > As it stands today, there is no project merge capability. There
| > | > is
| > | > really no way for developers to create a branch at the project
| > | > level, work for hours, days or weeks independently and then
| > | > come
| > | > back and merge their configuration changes and package changes.
| > | > If
| > | > you figured out which sets of packages changed, you would be
| > | > able
| > | > to merge the smalltalk code, but if you had made changes to the
| > | > baselines and versions there is no merge capability…
| > |
| > | ah that.
| > | But I do not see how Git can provide magic there.
| > | To me git deals with text, or if it does not then it means that
| > | the
| > | logic can be applied to Metacello specs as well.
| > |
| > |
| > | > When I started out with Metacello, I imagined that merging
| > | > would be
| > | > straightforward source code comparison. Today I don't think so.
| > | > If
| > | > you recall last spring Alexandre's style of using Metacello was
| > | > to
| > | > create a new project version for each commit. The basic problem
| > | > wasn't that it was "wrong" to do so, it just made doing any
| > | > sort
| > | > of source-based merge impossible.
| > |
| > | I do not get it.
| > | You have two spec trees and you merge them. So I do not see why a
| > | git
| > | textual merge would succeed and, if it succeeds,
| > | why a merge based on objects = Metacello spec would not work.
| > |
| > |
| > |
| > | > In order to merge two branches of a Metacello project that
| > | > involve
| > | > changes to more than just the same version, one must create two
| > | > or
| > | > three version specifications in memory at the same time. For an
| > | > mcz file, you can keep multiple definitions because the mcz
| > | > file
| > | > is a serialized version of the meta data. For metacello, we
| > | > have
| > | > to load a class and execute a method for each of the three
| > | > versions and if the version of the class in the initial image
| > | > is
| > | > dirty? What do we do?
| > |
| > | I think that git is not the solution to the problem. The solution
| > | to
| > | the problem is to rethink what is a metacello spec.
| > | Personally I see a metacello spec as a tree of declarations. Now
| > | if
| > | the tree is generated then indeed this is a problem
| > | because you have to execute the program but I do not see why the
| > | solution that would solve the problem by using git would not
| > | solve
| > | the problem without git. I think that the solution looks
| > | orthogonal
| > | to the software artifact used but may be I'm wrong.
| > |
| > | > Now technically there is a solution to this merge problem, but
| > | > it
| > | > involves at a minimum creating a completely different file
| > | > format
| > | > for configurations, not to mention a whole set of new tools ….
| > |
| > | Why not.
| > | Because we have a declarative Smalltalk syntax.
| > | Personally I hate that I have to define methods on a
| > | configuration
| > | like loadLatest load….
| > | a ConfigurationOf is about data not execution.
| > |
| > | > As for SVN and GIT, I think that the jury is still out on
| > | > whether
| > | > SVN is better or worse than GIT ... there isn't really a clear
| > | > winner there.
| > | >
| > | > The bulk of the work that I have and will be doing will be
| > | > equally
| > | > applicable to SVN and GIT ... we use SVN within GemStone and I
| > | > intend to use this new work for managing the GemStone GLASS
| > | > source
| > | > in SVN directly rather than versioning a directory of mcz files
| > | > ....
| > | >
| > | > I believe that SVN and GIT can be used interchangeably
| > | > especially
| > | > with the initial versions of my work, since I will be relying
| > | > on
| > | > the developers using the SVN/GIT tools for managing their
| > | > repositories.
| > | >
| > | > On the other hand GitHub has a lot of features that we don't
| > | > have
| > | > in SqueakSource that would be real useful to have ... so
| > | > because
| > | > of GitHub, I am interested in making a push directed at git.
| > |
| > | Ok but I think that relying on an external logic to merge
| > | metacello
| > | configuration looks like dropping the ball on the floor
| > | while now we could revisit and really simplify metacello and
| > | freeze
| > | in declaration the configuration.
| > | May be I'm wrong but this is what you want to do with your git
| > | approach.
| > |
| > | So may be I got it totally wrong but to me. The problem is in the
| > | executability of the DSL you built and the fact that you
| > | need to interpret the data to get it. while with a structural
| > | spec
| > | and a fixed semantics I think that merge should be possible.
| > |
| > | > I should amend my statement to include git and svn ... being
| > | > able
| > | > to manage projects at the source level where one can create
| > | > project level branches and have a chance for merging the
| > | > changes
| > | > between two arbitrary branches will be a major step forward.
| > | >
| > | > With mcz files the best one can do is manage a version of the
| > | > Monticello directory, but all of the source is completely
| > | > opaque.
| > | >
| > | > In the file-based Monticello the meta data is stored in plain
| > | > files
| > | > and directory structure and the ancestry is taken care of by
| > | > the
| > | > underlying SCM (git or svn). For the FileTree package
| > | > structure,
| > | > the ancestry will be kept around only to make it possible to
| > | > move
| > | > the package into and out of a file-based repository without
| > | > losing
| > | > the ancestry (because it is required for Monticello-level
| > | > merging).
| > |
| > | You lost me again.
| > | I thought you wanted to put only metacello in git.
| > |
| > | Do you mean that you would version method as git objects and use
| > | the
| > | git ancestry management to merge?
| > |
| > |
| > |
| > |
|
|

Re: feedback on using JSON to specify baselines for git repository

Ben Coman-2
In reply to this post by Dale Henrichs

Dale Henrichs wrote:
> Stef,
>
> Let me start by acknowledging that I understand that I am having a hard time getting my point across to you (and apparently Camillo). The following message may sound harsh, but I am merely trying to establish a basic foundation from which I am drawing conclusions. While it is clear to me that you are not understanding my points, it is not at all clear to me what you don't understand... Consequently I am going to try to say the same things I have said in my previous emails in a different form and see if we can make some progress.
>
> POINT 1 "Executable specifications introduce a SECURITY HOLE":
>  
Stef,

From long experience in business systems administration I must fully
support Dale on this.  Security is difficult enough to get right even
when you are trying hard.  Perpetuating a known security hole is _not_
_good_.  Assuming the best, that usage of Pharo increases, it will
come under increasing scrutiny in this regard.  Security should be
tightened up wherever possible.  In today's business IT environment,
security generally overrides most other considerations.

This issue of a non-executable format is orthogonal to the use of git.  
I think it was a good idea to separate this into different threads.

> The big sticking point that we appear to have is your contention that an "executable specification is just fine" and my contention that an "executable specification is not".
>
> You do understand that when you crack open an mcz file, the serialized Monticello meta data is in a parsable file format called DataStream, right? When you deserialize that meta data, a stream is parsed and data is read off of that stream to reconstruct the Monticello meta data without executing any code that wasn't already loaded in the image (i.e., no SECURITY PROBLEMS).
>
> So the first thing that I need from you Stef is an acknowledgement that you understand this is a weakness of an executable specification. In other words, this type of situation is unacceptable:
>
>   (Node
>         name: #baseline
>         son: (Node
>                         name: 'ProjectVersion')).
>   SmalltalkImage current shutdown.
>  
That last line would just be annoying.  A more subversive example would
be installation of a remote logger followed by self-modification of the
Configuration to remove the line so it is no longer visible.

> So please acknowledge that an executable specification like the above is not acceptable.
>
> It is pointless to discuss any further if you don't understand why the above is bad.
>
> POINT 2 "A parser is needed for a non-execution (SECURE) specification":
>
> If we agree that an executable specification format is not a good thing, then you must agree that in order to avoid trojan horse attacks like `SmalltalkImage current shutdown.`, we must have a parser for our specification and our specification must also be readable by human beings.
>
> So the second point of acknowledgement that we must agree on is that we need to have a parser for whatever format we cook up. If you do not agree that we need a parser then you haven't agreed with the first point, because the only way to avoid a trojan horse is to only allow statements that conform to the agreed-upon format.
>
> POINT 3 "A Smalltalk parser for JSON exists"
>
> One class, 28 methods, and you have a JSON parser ... If I weren't writing emails right now, I would have a JSON to Metacello reader completed (I already have a Metacello to JSON writer written).
>
> POINT 4 "JSON is a readable format"
>
> POINT 5 "I am not married to JSON"
>
> If you can provide me with a portable parser (must run on Pharo, GemStone and Squeak) that parses whatever format you'd like to suggest and that allows me to write code to use the output of that parser to produce the Metacello data structures, then I will be happy to use your parser and format.
>  
My proposal to use Smalltalk syntax had implied that it would only use
the parser part and _not_ be executed - but perhaps it is not simple to
separate these concerns in existing code.  Perhaps the later temptation
"to be clever" with some form of execution would also be too close.  At a
minimum, use of "half" the Smalltalk compiler might leave the door open
for unexpected exploits.  Upon reflection, this may justify a
parser other than the inbuilt Smalltalk one.
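
The "parser part only" idea has a close analogue in Python, used here purely as an illustration and not as a statement about any Smalltalk compiler: `ast.literal_eval` parses the host language's literal syntax and refuses anything that is not a literal, so nothing in the specification is ever run. The spec contents below are hypothetical.

```python
import ast

# A declarative spec written in the host language's own literal syntax.
spec_text = "('baseline', ('projectVersion', '3.6'), ('url', 'http://www.'))"

# literal_eval accepts only literals (strings, numbers, tuples, lists, ...).
spec = ast.literal_eval(spec_text)
print(spec)

# The same kind of text with a call smuggled in is rejected, not run.
hostile_text = "('baseline', __import__('os').getcwd())"
try:
    ast.literal_eval(hostile_text)
except ValueError as error:
    print("rejected:", error)
```

Even so, a literal-only reader still shares its front end with the full compiler, which is the residual risk that argues for a separate parser.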

Regarding the addition of "more" code when Pharo's goal is to strip down
to a smaller core: I am guessing that it is more the package
dependencies that you want to break rather than the total amount of
code.  In this case, the JSON implementation might be implemented
privately to Metacello and not generically for reuse.  The little bit of
code duplication of another generic JSON package might be justified in
this case.

> If you recall, I started my work on Metacello using an EXECUTABLE literal array format that was roundly criticized as being too difficult to read and was replaced by the ConfigurationOf specifications which we use now.
>
> If you tell me that you PREFER the literal array format, then I won't complain: as long as you provide me with an independent, portable parser for the literal array format, I'll be happy...
>
> Dale
>
>
> ----- Original Message -----
> | From: "stephane ducasse" <[hidden email]>
> | To: [hidden email]
> | Sent: Monday, February 6, 2012 11:00:22 AM
> | Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
> |
> |
> | On Feb 6, 2012, at 6:47 PM, Dale Henrichs wrote:
> |
> | > Stef,
> | >
> | > 1. The JSON parser used by Seaside is one class with 28 methods. I
> | > intend to copy
> | >   the class and use it for Metacello, with possible mods to support
> | >   YAML. For
> | >   the very short term, I will do my "one week experiment" using
> | >   JSON.
> | >
> | > 2. I am looking at JSON as a means to get away from the fact that
> | > Metacello does
> | >   not have a non-executable specification format and that is a
> | >   GIANT SECURITY HOLE.
> | >
> | >   I would love for a site like Bibliocello or SqueakSource to
> | >   reason about Metacello
> | >   versions, but with only an executable format to work with, a
> | >   server site CANNOT
> | >   create Metacello objects because of the security risks to do so.
> |
> | I still do not understand what you mean.
> |
> | #( #baseline
> | #( projectVersion '3.6' )
> | #( url 'http://www.'))
> |
> | Why is this not enough?
> | I do not get it.
> |
> | Besides, Smalltalk message syntax is excellent for representing
> | declarative structure:
> |
> | (Node
> | name: #baseline
> | son: (Node
> | name: 'ProjectVersion'))
> |
> | I still do not get why you need a different syntax.
> |
> | >   If we have a machine parsable and human readable format, the part
> | >   of the problem is
> | >   solved.
> | >
> | > 3. If I want to do true merges of configurations, I must be able to
> | > arrange to have 2-3
> | >   versions of the ConfigurationOfXXX class installed in the image
> | >   at one point in time.
> | >   If I want to preserve any changes that have been made to the
> | >   original configuration, I
> | >   am in trouble.
> | >
> | >   With a non-executable specification for a configuration, I can
> | >   PARSE 3 different files
> | >   and easily create the 3 instances of MetacelloProject that I need
> | >   to do the merge.
> | >
> | >   For mcz files, I can arrange to embed the JSON representation of
> | >   the spec in a separate
> | >   zip file directory…
> |
> | To me this is totally orthogonal to whether git is used or not. It depends on
> | the specification of your inputs.
> | This is why I said that the problem is the fact that you need to
> | execute a method to get objects,
> | not the technology used.
> |
> | > 4. In a file-based repository where the entire directory structure
> | > is versioned instead of
> | >   the individual files, the class-based configuration does not make
> | >   a lot of sense. In fact
> | >   I only need a single baseline specification. So I have the choice
> | >   of continuing with
> | >   the executable specification or creating a parsable
> | >   representation of the Metacello
> | >   specifications
> |
> | As I said, experiment with a truly declarative specification, but in
> | plain Smalltalk syntax.
> | Look at the VisualWorks UI specs; this is exactly the same problem (see
> | the end of this mail).
> |
> | Look at the MSE format for Moose; it is a declarative spec for models of
> | source code.
> | Ask Camillo, because he is an expert in Git and SCM; we discussed this and he
> | agreed with me. :)
> |
> | > 5. Since JSON has an existing (compact) parser for Smalltalk and
> | > JSON is a pretty readable
> | >   format, I am proposing that JSON be used for the parsable
> | >   Metacello specifications.
> | >
> | > 6. The parsable format (JSON) and executable format
> | > (ConfigurationOfXXX) can be used
> | >   interchangeably. For the foreseeable future the vast majority of
> | >   developers will continue
> | >   to use the executable format to create their specifications.
> |
> | I hope not. It would be good to have one format for all.
> |
> | > If they happen to be using
> | >   a file-based repository (git or svn), then they will have the
> | >   option of creating/editing
> | >   the parsable format if they prefer.
> | >
> | > Dale
> | >
> | >
> | >
> | windowSpec
> | "UIPainter new openOnClass: self andSelector: #windowSpec"
> |
> | <resource: #canvas>
> | ^#(#{UI.FullSpec}
> | #window:
> | #(#{UI.WindowSpec}
> | #label: #(#{Kernel.UserMessage} #key: #UnlabeledCanvas
> | #defaultString: 'Unlabeled Canvas' #catalogID: #labels)
> | #bounds: #(#{Graphics.Rectangle} 512 384 858 635 ) )
> | #component:
> | #(#{UI.SpecCollection}
> | #collection: #(
> | #(#{UI.TextEditorSpec}
> | #layout: #(#{Graphics.LayoutFrame} 10 0 10 0 -10 1 -10 1 )
> | #name: #textEditor
> | #model: #textHolder
> | #isReadOnly: true
> | #tabRequiresControl: true ) ) ) )
> |
> |
>
>  


Re: feedback on using JSON to specify baselines for git repository

Ben Coman-2
In reply to this post by Dale Henrichs
Dale Henrichs wrote:
> | > | >
> | > | > On the other hand GitHub has a lot of features that we don't
> | > | > have
> | > | > in SqueakSource that would be real useful to have ... so
> | > | > because
> | > | > of GitHub, I am interested in making a push directed at git.
> | > |
You've mentioned GitHub a few times.  Notwithstanding that any features
"could" be implemented in SqueakSource,
how do you think GitHub would benefit Pharo?


Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
The advantage of using GitHub is that we don't have to wait for the full set of features available in GitHub to be implemented for git in Smalltalk (as opposed to attempting to duplicate git functionality and GitHub features in SqueakSource).

If we are using file-based repositories there is no reason that we shouldn't just leverage the existing support for file-based repositories.

This doesn't preclude doing a Smalltalk implementation of GitHub; it just means that we don't have to wait for one to become available ...

This is no different than the rationale that I used when first starting Metacello. I chose an executable specification because I could leverage the existing SystemBrowser for creating configurations without also having to invent a tool for all three target platforms, BEFORE anyone could use it.

Dale

----- Original Message -----
| From: "Ben Coman" <[hidden email]>
| To: [hidden email]
| Sent: Tuesday, February 7, 2012 6:53:27 AM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| Dale Henrichs wrote:
| > | > | >
| > | > | > On the other hand GitHub has a lot of features that we
| > | > | > don't
| > | > | > have
| > | > | > in SqueakSource that would be real useful to have ... so
| > | > | > because
| > | > | > of GitHub, I am interested in making a push directed at
| > | > | > git.
| > | > |
| You've mentioned GitHub a few times.  Notwithstanding that any
| features
| "could" be implemented in SqueakSource,
| how do you think GitHub would benefit Pharo?
|
|

Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2
In reply to this post by Dale Henrichs
Hi dale,

> Stef,
>
> Let me start by acknowledging that I understand that I am having a hard time getting my point across to you (and apparently Camillo). The following message may sound harsh, but I am merely trying to establish a basic foundation from which I am drawing conclusions. While it is clear to me that you are not understanding my points, it is not at all clear to me what you don't understand... Consequently I am going to try to say the same things I have said in my previous emails in a different form and see if we can make some progress.
>
> POINT 1 "Executable specifications introduce a SECURITY HOLE":
>
> The big sticking point that we appear to have is your contention that an "executable specification is just fine" and my contention that an "executable specification is not".
>
> You do understand that when you crack open an mcz file, the serialized Monticello meta data is in a parsable file format called DataStream, right? When you deserialize that code, a stream is parsed and data is read off of that stream to reconstruct the Monticello meta data without executing any code that wasn't already loaded in the image (i.e., no SECURITY PROBLEMS).
>
> So the first thing that I need from you Stef is an acknowledgement that you understand this is a weakness of an executable specification. In other words, this type of situation is unacceptable:
>
>  (Node
>        name: #baseline
>        son: (Node
>                        name: 'ProjectVersion')).
>  SmalltalkImage current shutdown.

Two points:
        - If you use a literal array you do not have this problem.
        - I do not really understand why this worries you. Do you want to protect your server?
Because if you load the code, nothing prevents the package's initialize from doing SmalltalkImage current shutdown.
So in the end the security hole is loading packages. No?

> So please acknowledge that an executable specification like the above is not acceptable.

I do not see the problem you are trying to avoid, or how it is related to the first problem you mentioned.
As I recall, you said that a Git specification is good because you will be able to scale and to merge
configurations, and I still do not understand how having a JSON syntax will help you with that.
So maybe you have multiple goals, but this is not clear to me.

> It  is pointless to discuss any further if you don't understand why the above is bad.
>
> POINT 2 "A parser is needed for a non-execution (SECURE) specification":
>
> If we agree that an executable specification format is not a good thing, then you must agree that in order to avoid trojan horse attacks like `SmalltalkImage current shutdown.`, we must have a parser for our specification and our specification must also be readable by human beings.

But again, as I say, are literal arrays not equivalent to JSON?
With literal arrays you only get
        strings, floats, numbers and symbols, so…

>
> So the second point of acknowledgement that we must agree on is that we need to have a parser for whatever format we cook up. If you do not agree that we need a parser then you haven't agreed with the first point, because the only way to avoid a trojan horse is to only allow statements that conform to the agreed upon format.
>
> POINT 3 "A Smalltalk parser for JSON exists"
>
> One class 28 methods and you have a JSON parser ... If I wasn't writing emails right now, I would have a JSON to Metacello reader completed (I already have a Metacello to JSON writer written).
>
> POINT 4 "JSON is a readable format"
>
> POINT 5 "I am not married to JSON"
>
> If you can provide me with a portable parser (must run on Pharo, GemStone and Squeak) that parses whatever format you'd like to suggest and that allows me to write code to use the output of that parser to produce the Metacello data structures, then I will be happy to use your parser and format.
>
> If you recall I started my work on Metacello using an EXECUTABLE literal array format that was roundly criticized as being too difficult to read and was replaced by the ConfigurationOf specifications which we use now.
>
> If you tell me that you PREFER the literal array format then I won't complain as long as you provide me with an independent, portable parser for the literal array format, then I'll be happy…

Is that not our scanner?
The first implementation of MSE, the Moose format, just used the plain default VW parser.



Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs


----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 8:17:13 AM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| Hi dale,
|
| > Stef,
| >
| > Let me start by acknowledging that I understand that I am having a
| > hard time getting my point across to you (and apparently Camillo).
| > The following message may sound harsh, but I am merely trying to
| > establish a basic foundation from which I am drawing conclusions.
| > While it is clear to me that you are not understanding my points,
| > it is not at all clear to me what you don't understand...
| > Consequently I am going to try to say the same things I have said
| > in my previous emails in a different form and see if we can make
| > some progress.
| >
| > POINT 1 "Executable specifications introduce a SECURITY HOLE":
| >
| > The big sticking point that we appear to have is your contention
| > that an "executable specification is just fine" and my contention
| > that an "executable specification is not".
| >
| > You do understand that when you crack open an mcz file, the
| > serialized Monticello meta data is in a parsable file format
| > called DataStream, right? When you deserialize that code, a stream
| > is parsed and data is read off of that stream to reconstruct the
| > Monticello meta data without executing any code that wasn't
| > already loaded in the image (i.e., no SECURITY PROBLEMS).
| >
| > So the first thing that I need from you Stef is an acknowledgement
| > that you understand this is a weakness of an executable
| > specification. In other words, this type of situation is
| > unacceptable:
| >
| >  (Node
| >        name: #baseline
| >        son: (Node
| >                        name: 'ProjectVersion')).
| >  SmalltalkImage current shutdown.
|
| Two points
| - if you use a literal array you do not have this problem.

If I expect to execute an expression using the literal array then it is a problem.

  SmalltalkImage current shutdown.
  ^#()

is still a problem if one is expected to EXECUTE the statements whether it be a doit or a method or a string that is compiled and executed ... The security hole exists if one is expected to EXECUTE a Smalltalk expression, because there is no way to guarantee that the expression is safe.

You do understand that the current mechanism we use in Metacello is to send the #project message to a class:

  (Smalltalk at: #ConfigurationOfXXX) project

and there are ZERO guarantees that the #project message won't do something nasty ... One cannot even load the code from an arbitrary mcz file on a server, to create the class before getting a chance to send the #project message, because the #initialize message is AUTOMATICALLY executed. As it stands today, there is no (safe) way for code executing on a server to reason about a Metacello configuration without executing arbitrary Smalltalk code.

Executing arbitrary Smalltalk expressions in a server is a security hole. It is the moral equivalent of giving a hacker a login on your system and your server is not safe. I cannot run a server exposed to the outside world in a VMWare data center that has this security hole.

If you would just acknowledge that this is true I will read the rest of your email message ...

If you don't acknowledge this point, there is no point in trying to talk about anything else on this subject.

Dale
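
The distinction drawn above, between parsing a specification and executing one, can be sketched as follows. This is only an illustration in Python, used here as a neutral stand-in; it is not Metacello or Seaside code, and the spec keys and the hostile payload are assumptions made up for the example. The point it shows is that a data parser either returns plain data or rejects the input, and never runs it.

```python
import json

# Hypothetical baseline spec written as JSON text (key names are illustrative).
SPEC_TEXT = '{"baseline": {"projectVersion": "3.6", "url": "http://www."}}'

# A hostile "spec": valid Python source, but not valid JSON.
HOSTILE_TEXT = '__import__("os").system("echo pwned")'


def load_spec(text):
    """Parse spec text into plain dicts, lists, strings and numbers.

    json.loads only builds data; it has no notion of a statement or a
    message send, so nothing contained in `text` is ever run.
    """
    return json.loads(text)


def try_load(text):
    """Return the parsed spec, or None when the text is not valid JSON."""
    try:
        return load_spec(text)
    except json.JSONDecodeError:
        return None


spec = try_load(SPEC_TEXT)          # a nested dict
rejected = try_load(HOSTILE_TEXT)   # None: the payload was refused, not run
print(spec["baseline"]["projectVersion"])
print(rejected)
```

Passing the same hostile text to an evaluator instead of a parser would run it, which is the situation described above for doits, methods and compiled strings.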

Re: feedback on using JSON to specify baselines for git repository

otto
Hi,

I understand that the security hole is created by evaluating a text
string, thereby compiling and executing the expression.

So, we want to avoid embedding Smalltalk methods in the Metacello
specification. This could even be a problem with a JSON implementation
if we parse the spec and then perform certain elements in the parsed
spec. OTOH, this is not as big a risk as evaluating more general
expressions.

Surely it is possible to parse a literal array syntax instead of a
JSON syntax, without dynamically executing the literal array. The
advantage of this approach is that a Smalltalk scanner / parser can be
used, which comes with the image. Well, this assumes a portable or
standard scanner / parser exists between GemStone and Pharo and what
not.

I kind of like the idea of using the Smalltalk tools; I could even use
a method with auto formatting to create the spec. The tools could then
verify if my spec makes sense.

HTH
Otto
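
The remaining risk mentioned above, performing elements taken from the parsed spec, can be narrowed with an explicit allowlist. The sketch below is in Python and purely illustrative: `BaselineBuilder`, `HANDLERS` and `apply_spec` are hypothetical names, not part of Metacello. Only keys listed in the table are ever turned into a method call; anything else is refused.

```python
class BaselineBuilder:
    """Hypothetical stand-in for the object that receives spec entries."""

    def __init__(self):
        self.settings = {}

    def project_version(self, value):
        self.settings["projectVersion"] = value

    def url(self, value):
        self.settings["url"] = value


# Explicit allowlist: spec key -> name of the builder method to call.
# A key that is missing here can never be turned into a call.
HANDLERS = {
    "projectVersion": "project_version",
    "url": "url",
}


def apply_spec(entries, builder):
    """Feed parsed (key, value) pairs to the builder via the allowlist."""
    for key, value in entries:
        if key not in HANDLERS:
            raise ValueError("unknown spec key: %r" % (key,))
        getattr(builder, HANDLERS[key])(value)
    return builder


builder = apply_spec(
    [("projectVersion", "3.6"), ("url", "http://www.")],
    BaselineBuilder(),
)
print(builder.settings)
```

The design choice is that the spec never names a method directly. It names a key, and the reader decides which call, if any, that key maps to.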

On 08 Feb 2012, at 19:18, Dale Henrichs <[hidden email]> wrote:

>
>
> | Two points
> |    - if you use a literal array you do not have this problem.
>
> If I expect to execute an expression using the literal array then it is a problem.
>
>  SmalltalkImage current shutdown.
>  ^#()
>
> is still a problem if one is expected to EXECUTE the statements whether it be a doit or a method or a string that is compiled and executed ... The security hole exists if one is expected to EXECUTE a Smalltalk expression, because there is no way to guarantee that the expression is safe.
>
> You do understand that the current mechanism we use in Metacello is to send the #project message to a class:
>
>  (Smalltalk at: #ConfigurationOfXXX) project
>
> and there are ZERO guarantees that the #project message won't do something nasty ... One cannot even load the code from an arbitrary mcz file on a server, to create the class before getting a chance to send the #project message, because the #initialize message is AUTOMATICALLY executed. As it stands today, there is no (safe) way for code executing on a server to reason about a Metacello configuration without executing arbitrary Smalltalk code.
>
> Executing arbitrary Smalltalk expressions in a server is a security hole. It is the moral equivalent of giving a hacker a login on your system and your server is not safe. I cannot run a server exposed to the outside world in a VMWare data center that has this security hole.
>
> If you would just acknowledge that this is true I will read the rest of your email message ...
>
> If you don't acknowledge this point, there is no point in trying to talk about anything else on this subject.
>
> Dale

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
Otto,

Security isn't an issue with the current JSON parser: it creates Arrays and Dictionaries from JSON, and I then interpret the structure by sending messages to an instance of a known class. No strings are evaluated, so badly formed JSON input will only result in walkbacks, not a trojan horse. BTW, in JavaScript, JSON strings apparently CAN be executed (e.g. via eval), much like literal arrays, but the security issues with code that executes in a browser aren't as great as those for a server.

I have nothing against alternate parsing/creation schemes; however, I am not excited about writing a new parser, nor am I interested in requiring something like AST when we've got a perfectly good 28-method JSON parser :).

Right now I'm continuing to move forward using JSON, with the expectation that over time the format of the file will change and when it changes it will be for the better.

Yesterday, I was able to use Metacello to load from a fileTree repository using the JSON specification format. So the following Metacello script is functional:

  Metacello new
    project: 'Sample';
    filetree: '/foos1/users/dhenrich/smalltalk/sample/';
    load.

and I am continuing work on the Metacello scripting API as a whole.

As I work through issues, it is beginning to look like you will be able to write a baseline specification using the familiar method-based format, and when you save the project to a filetree repository, I will convert the baseline specification into JSON. I am juggling a number of different requirements as I move forward, so we'll see how it goes.

As far as validation ... that is needed no matter what ... writing a Smalltalk method doesn't guarantee that the spec is valid or correct, so the specs created via JSON will be validated as well ...

To be honest if we are going to write a parser for Metacello specs, then we should parse the current method-based format, then there will be only one format for folks to worry about ... While this would be ideal, I just don't have the extra cycles to spend writing a parser.

Dale
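
The validation step mentioned above can be sketched in the same spirit: parse first, then check the resulting structure and report problems, so that malformed input produces an error report rather than anything being run. This is a Python illustration under stated assumptions; the key names and the rule that values are strings are hypothetical and are not taken from Metacello's actual JSON format.

```python
import json

# Hypothetical schema for a baseline entry; not Metacello's real format.
REQUIRED_KEYS = {"projectVersion"}
OPTIONAL_KEYS = {"url"}


def validate_baseline(text):
    """Return a list of problems found in the spec text (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: %s" % exc.msg]

    if not isinstance(data, dict) or not isinstance(data.get("baseline"), dict):
        return ["top level must be an object with a 'baseline' object"]

    baseline = data["baseline"]
    present = set(baseline)
    problems = []
    for key in sorted(REQUIRED_KEYS - present):
        problems.append("missing key: " + key)
    for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append("unknown key: " + key)
    for key in sorted(present):
        if not isinstance(baseline[key], str):
            problems.append("value of %s must be a string" % key)
    return problems


print(validate_baseline('{"baseline": {"projectVersion": "3.6"}}'))
print(validate_baseline('{"baseline": {"shutdown": "now"}}'))
```

A spec that fails validation is simply rejected with a list of reasons, which is the "walkback, not a trojan horse" outcome described above.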

----- Original Message -----
| From: "Otto Behrens" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 10:18:02 AM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| Hi,
|
| I understand that the security hole is created by evaluating a text
| string, thereby compiling and executing the expression.
|
| So, we want to avoid embedding Smalltalk methods in the metacello
| specification. This could even be a problem with a JSON
| implementation
| if we parse the spec and then perform certain elements in the parsed
| spec. Otoh, this is not as big a risk as evaluating more general
| expressions.
|
| Surely it is possible to parse a literal array syntax instead of a
| json syntax, without dynamically executing the literal array. The
| advantage of this approach is that a Smalltalk scanner / parser can
| be
| used, which comes with the image. Well, this assumes a portable or
| standard scanner / parser exists between gemstone and pharo and what
| not.
|
| I kind of like the idea of using the Smalltalk tools; I could even use
| a method with auto formatting to create the spec. The tools could
| then
| verify if my spec makes sense.
|
| HTH
| Otto
|

Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2
In reply to this post by Dale Henrichs
>
> If I expect to execute an expression using the literal array then it is a problem.
>
>  SmalltalkImage current shutdown.
>  ^#()
>
> is still a problem if one is expected to EXECUTE the statements whether it be a doit or a method or a string that is compiled and executed ... The security hole exists if one is expected to EXECUTE a Smalltalk expression, because there is no way to guarantee that the expression is safe.
>
> You do understand that the current mechanism we use in Metacello is to send the #project message to a class:
>
>  (Smalltalk at: #ConfigurationOfXXX) project
>
> and there are ZERO guarantees that the #project message won't do something nasty ... One cannot even load the code from an arbitrary mcz file on a server, to create the class before getting a chance to send the #project message, because the #initialize message is AUTOMATICALLY executed. As it stands today, there is no (safe) way for code executing on a server to reason about a Metacello configuration without executing arbitrary Smalltalk code.
>
> Executing arbitrary Smalltalk expressions in a server is a security hole. It is the moral equivalent of giving a hacker a login on your system and your server is not safe. I cannot run a server exposed to the outside world in a VMWare data center that has this security hole.

Yes, that I understand.

But tell me why returning a JSON string would be any different from returning a literal array?
The problem is not the literal array or the JSON but the way you get your hands on it.

If your literal array parser parses a string, then you should get a stream of literals.
Parsing
        #(#baseline
                #(project '3.6')
                )

will return a stream composed of literals.
I think that this is exactly the same as for a JSON expression.
Now if a hacker adds

        SmalltalkImage killtherest
        #(#baseline
                #(project '3.6')
                )

then your semantic analysis will see that this is not a valid element: first, it is not an array; second, SmalltalkImage is not
an authorized keyword like baseline, repository or whatever.

So the parsing is not the problem, or there is something I do not understand.
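
The semantic check described above can be sketched as follows. This is only an illustration in Python, not Metacello code: the keyword set and the `validate` function are hypothetical, and nested tuples stand in for the literal arrays that a literal parser would return.

```python
# Hypothetical keyword set; the message above only names "baseline" and
# "repository" as examples of authorized keywords.
ALLOWED_KEYWORDS = {"baseline", "project", "version", "repository"}


def validate(node):
    """Return True if node is a (keyword, arg, ...) tuple that uses only
    allowed keywords, with plain strings or nested valid tuples as arguments."""
    if not isinstance(node, tuple) or not node:
        return False  # "first, it is not an array"
    head, *args = node
    if not isinstance(head, str) or head not in ALLOWED_KEYWORDS:
        return False  # "second, it is not an authorized keyword"
    return all(isinstance(arg, str) or validate(arg) for arg in args)


good = ("baseline", ("project", "3.6"))
bad = ("SmalltalkImage", "killtherest")
```

Under these assumptions `validate(good)` accepts the spec, while `validate(bad)` and a bare `validate("SmalltalkImage")` are both rejected without anything being executed.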




> If you would just acknowledge that this is true I will read the rest of your email message ...
>
> If you don't acknowledge this point, there is no point in trying to talk about anything else on this subject.
>
> Dale


Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2
In reply to this post by otto
Thanks Otto, this is what I'm trying to explain.
The semantic interpretation of the stream of elements is what can produce a hole, not the syntax.

Stef


On Feb 8, 2012, at 7:18 PM, Otto Behrens wrote:

> Hi,
>
> I understand that the security hole is created by evaluating a text
> string, thereby compiling and executing the expression.
>
> So, we want to avoid embedding Smalltalk methods in the metacello
> specification. This could even be a problem with a JSON implementation
> if we parse the spec and then perform certain elements in the parsed
> spec. Otoh, this is not as big a risk as evaluating more general
> expressions.
>
> Surely it is possible to parse a literal array syntax in stead of a
> json syntax, without dynamically executing the literal array. The
> advantage of this approach is that a Smalltalk scanner / parser can be
> used, which comes with the image. Well, this assumes a portable or
> standard scanner / parser exists between gemstone and pharo and what
> not.
>
> I kindof like the idea of using the Smalltalk tools; I could even use
> a method with auto formatting to create the spec. The tools could then
> verify if my spec makes sense.
>
> HTH
> Otto


Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2
In reply to this post by Dale Henrichs

On Feb 8, 2012, at 8:13 PM, Dale Henrichs wrote:

> Otto,
>
> Security isn't an issue with the current JSON parser, it creates Arrays and Dictionaries from JSON, then I interpret the structure by sending messages to a known class instance and no strings are evaluated, so badly formed JSON input will only result in walkbacks, not a trojan horse. BTW, in Javascript, JSON strings apparently ARE executed like literal arrays, but the security issues with code that executes in a Browser aren't as great as those for a server.

But you can get exactly the same with a literal array. It is the semantic analysis of the structure (or the absence of it) that makes it vulnerable.
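
The approach in the quoted paragraph (parse into arrays and dictionaries, then send messages to a known class, never evaluating strings) might look roughly like this. It is a Python sketch under stated assumptions, not the Metacello implementation: `BaselineBuilder`, `HANDLERS`, `load_baseline` and the key names are all hypothetical.

```python
import json


class BaselineBuilder:
    """Hypothetical 'known class' that receives the interpreted spec."""

    def __init__(self):
        self.project = None
        self.repository = None

    def set_project(self, value):
        self.project = str(value)

    def set_repository(self, value):
        self.repository = str(value)


# Fixed mapping from JSON keys to methods. Nothing in the input can extend
# it, so the input can never choose which code runs.
HANDLERS = {
    "project": BaselineBuilder.set_project,
    "repository": BaselineBuilder.set_repository,
}


def load_baseline(text):
    """Parse the JSON text into plain data, then interpret it key by key."""
    data = json.loads(text)  # malformed input raises an error, nothing more
    if not isinstance(data, dict):
        raise ValueError("baseline spec must be a JSON object")
    builder = BaselineBuilder()
    for key, value in data.items():
        handler = HANDLERS.get(key)
        if handler is None:
            raise ValueError(f"unknown key: {key!r}")
        handler(builder, value)
    return builder
```

The same structure works whether the data comes from JSON or from a literal array reader. What keeps it safe in this sketch is the fixed handler table, which is the semantic analysis step.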


>
> I have nothing against alternate parsing/creation schemes, however, I am not excited about writing a new parser nor am I interested in requiring something like AST when we've got a perfectly good 28 method JSON parser:).

But what I do not understand is this: if you get
        '#(baseline
                #(version ))))'

the literal parser parses that expression for you in one line,
and it does not execute anything, because it is a literal parser.

Parser new scanTokens: 'SmalltalkImage'

Guess what you get...
Tada:
#SmalltalkImage (a symbol), not the class.

Parser new scanTokens: '#(baseline (version 3.6))'

and here
tada
an array.

#(#(#baseline #(#version 3.6)))

So ?


>
> Right now I'm continuing to move forward using JSON, with the expectation that over time the format of the file will change and when it changes it will be for the better.
>
> Yesterday, I was able to use Metacello to load from a fileTree repository using the JSON specification format. So the following Metacello script is functional:
>
>  Metacello new
>    project: 'Sample';
>    filetree: '/foos1/users/dhenrich/smalltalk/sample/';
>    load.
>
> and I am continuing work on the Metacello scripting API as a whole.
>
> As I work through issues, it is beginning to look like you will be able to write a baseline specification using the familiar method-based format and when you save the project to a filetree repository, I will convert the baseline specification into JSON, but we'll see how it goes. I am juggling a number of different requirements as I move forward, so we'll see what happens.
>
> As far as validation ... that is needed no matter what ... writing a Smalltalk method doesn't guarantee that the spec is valid or correct, so the specs created via JSON will be validated as well ...
>
> To be honest if we are going to write a parser for Metacello specs, then we should parse the current method-based format, then there will be only one format for folks to worry about ... While this would be ideal, I just don't have the extra cycles to spend writing a parser.
>
> Dale


Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2
In reply to this post by Dale Henrichs
BTW Dale, I still do not understand how this declarative syntax can help you merge configurations.
It helps you avoid the server security hole, but in my first email about git I asked, and you told me that it was about merging configurations, and since then I have been confused.

Stef




Re: feedback on using JSON to specify baselines for git repository

abergel
> BTW dale I still do not understand how this declarative syntax can help you merging configuration.
> It helps you to avoid server security hole but in my first email about git I ask and you told me that it was about merging configurations and since then I'm confused.

One positive thing that I see with a declaration instead of an AST is reflection. In the tool we are building, we need to add a new dependency to a particular version. Currently, I believe Dale has to touch the AST, which is far from convenient.

Cheers,
Alexandre



--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.






Re: feedback on using JSON to specify baselines for git repository

stephane ducasse-2

On Feb 8, 2012, at 9:39 PM, Alexandre Bergel wrote:

>> BTW dale I still do not understand how this declarative syntax can help you merging configuration.
>> It helps you to avoid server security hole but in my first email about git I ask and you told me that it was about merging configurations and since then I'm confused.
>
> One positive thing that I see with a declaration instead of an AST, is about reflection. In the tool we are building, we need to add a new dependency to a particular version. Currently, Dale has to touch the AST I believe, which is far from being convenient.

Ahhhh, if you mean that adding an array to an array is easier than adding a pragma node, that I understand :)
Still, I do not think that it can really help with merging. It helps with manipulating the representation of the information.

Because in the end, for merging you want to have two or more structures and compare them; whether you get them by executing a program or by parsing a file is orthogonal.

Stef

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
In reply to this post by stephane ducasse-2
Stef,

The only thing I can think is that you must be _assuming_ that the literal array is consumed by a parser that does not accept standard Smalltalk syntax, but is written to parse only literal arrays of strings and literal arrays.

If you expect the Smalltalk compiler to compile and evaluate an expression to produce an array from the literal array string, then you are open to the security hole.

If you postulate that someone were to write a portable literal array parser, then we have come to an understanding...and I will patiently wait for one to appear.
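
As a rough illustration of what such a non-evaluating reader could look like, here is a minimal sketch. It is written in Python only to keep it self-contained; it is not an existing Metacello, Pharo, or GemStone API. The function name `read_literal_array` and the accepted subset of the literal array syntax (nested arrays, symbols, quoted strings, numbers) are assumptions made for the example. The input is only tokenized and assembled into nested lists; nothing is compiled or evaluated, so anything outside a single top-level `#( ... )` is rejected.

```python
import re

# One token of the assumed subset: array open/close, 'string', number, or symbol.
TOKEN = re.compile(
    r"\s*(?:"
    r"(?P<open>#?\()"
    r"|(?P<close>\))"
    r"|(?P<string>'(?:[^']|'')*')"
    r"|(?P<number>-?\d+(?:\.\d+)?)"
    r"|(?P<symbol>#?[A-Za-z_][A-Za-z0-9_:]*)"
    r")"
)


def read_literal_array(text):
    """Turn '#(a #(b 1))' into nested lists without evaluating any code."""
    text = text.strip()
    pos, stack, result = 0, [], None
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if result is not None:
            raise ValueError("input continues after the top-level array")
        kind, token = match.lastgroup, match.group(match.lastgroup)
        pos = match.end()
        if kind == "open":
            if not stack and not token.startswith("#"):
                raise ValueError("top-level array must start with '#('")
            stack.append([])
        elif not stack:
            # Anything outside '#( ... )' (e.g. a message send) is rejected.
            raise ValueError(f"only a literal array is allowed, got {token!r}")
        elif kind == "close":
            finished = stack.pop()
            if stack:
                stack[-1].append(finished)
            else:
                result = finished
        elif kind == "string":
            stack[-1].append(token[1:-1].replace("''", "'"))
        elif kind == "number":
            stack[-1].append(float(token) if "." in token else int(token))
        else:  # symbol, with or without the leading '#'
            stack[-1].append(token.lstrip("#"))
    if stack or result is None:
        raise ValueError("unterminated or empty input")
    return result


print(read_literal_array("#(#baseline #(project '3.6'))"))
```

Input such as `SmalltalkImage killtherest #(#baseline)` raises a `ValueError` here instead of being executed, which is the property under discussion. Whether the resulting structure is a valid specification still has to be checked separately.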

Dale

----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 12:12:01 PM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| >
| > If I expect to execute an expression using the literal array then
| > it is a problem.
| >
| >  SmalltalkImage current shutdown.
| >  ^#()
| >
| > is still a problem if one is expected to EXECUTE the statements
| > whether it be a doit or a method or a string that is compiled and
| > executed ... The security hole exists if one is expected to
| > EXECUTE a Smalltalk expression, because there is no way to
| > guarantee that the expression is safe.
| >
| > You do understand that the current mechanism we use in Metacello is
| > to send the #project message to a class:
| >
| >  (Smalltalk at: #ConfigurationOfXXX) project
| >
| > and there are ZERO guarantees that the #project message won't do
| > something nasty ... One cannot even load the code from an
| > arbitrary mcz file on a server, to create the class before getting
| > a chance to send the #project message, because the #initialize
| > message is AUTOMATICALLY executed. As it stands today, there is no
| > (safe) way for code executing on a server to reason about a
| > Metacello configuration without executing arbitrary Smalltalk code.
| >
| > Executing arbitrary Smalltalk expressions in a server is a security
| > hole. It is the moral equivalent of giving a hacker a login on
| > your system and your server is not safe. I cannot run a server
| > exposed to the outside world in a VMWare data center that has this
| > security hole.
|
| yes that I understand.
|
| but tell me why returning a JSON string would be any different from
| a literal array?
| because the problem is not the literal array or the JSON but the way
| you get your hands on it.
|
| If your literal array parser parses a string then you should get a
| stream of literals.
| parsing
| #(#baseline
| #(project '3.6')
| )
|
| will return a stream composed of literals.
| I think that this is exactly the same as for a JSON expression.
| Now if a hacker adds
|
| SmalltalkImage killtherest
| #(#baseline
| #(project '3.6')
| )
|
| then your semantic analysis will see that this is not a valid element:
| first it is not an array, second SmalltalkImage is not
| an authorized keyword like baseline, repository or whatever.
|
| So the parsing is not the problem or there is something I do not
| understand.
|
|
|
|
| > If you would just acknowledge that this is true I will read the
| > rest of your email message ...
| >
| > If you don't acknowledge this point, there is no point in trying to
| > talk about anything else on this subject.
| >
| > Dale
|
|

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
In reply to this post by stephane ducasse-2
Stef,

I think I agree with you:)

If you provide me with a portable literal array parser I will be glad to use literal array strings for specifications.

Dale

----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 12:13:33 PM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| Thanks Otto, this is what I'm trying to explain.
| The semantic interpretation of the stream of elements is what can
| produce a hole, not the syntax.
|
| Stef
|
|
| On Feb 8, 2012, at 7:18 PM, Otto Behrens wrote:
|
| > Hi,
| >
| > I understand that the security hole is created by evaluating a text
| > string, thereby compiling and executing the expression.
| >
| > So, we want to avoid embedding Smalltalk methods in the metacello
| > specification. This could even be a problem with a JSON
| > implementation
| > if we parse the spec and then perform certain elements in the
| > parsed
| > spec. Otoh, this is not as big a risk as evaluating more general
| > expressions.
| >
| > Surely it is possible to parse a literal array syntax instead of a
| > json syntax, without dynamically executing the literal array. The
| > advantage of this approach is that a Smalltalk scanner / parser can
| > be
| > used, which comes with the image. Well, this assumes a portable or
| > standard scanner / parser exists between gemstone and pharo and
| > what
| > not.
| >
| > I kind of like the idea of using the Smalltalk tools; I could even
| > use
| > a method with auto formatting to create the spec. The tools could
| > then
| > verify if my spec makes sense.
| >
| > HTH
| > Otto
| >
| > On 08 Feb 2012, at 19:18, Dale Henrichs <[hidden email]>
| > wrote:
| >
| >>
| >>
| >> ----- Original Message -----
| >> | From: "stephane ducasse" <[hidden email]>
| >> | To: [hidden email]
| >> | Sent: Wednesday, February 8, 2012 8:17:13 AM
| >> | Subject: Re: [Metacello] feedback on using JSON to specify
| >> | baselines for git repository
| >> |
| >> | Hi dale,
| >> |
| >> | > Stef,
| >> | >
| >> | > Let me start by acknowledging that I understand that I am
| >> | > having a
| >> | > hard time getting my point across to you (and apparently
| >> | > Camillo).
| >> | > The following message may sound harsh, but I am merely trying
| >> | > to
| >> | > establish a basic foundation from which I am drawing
| >> | > conclusions.
| >> | > While it is clear to me that you are not understanding my
| >> | > points,
| >> | > it is not at all clear to me what you don't understand...
| >> | > Consequently I am going to try to say the same things I have
| >> | > said
| >> | > in my previous emails in a different form and see if we can
| >> | > make
| >> | > some progress.
| >> | >
| >> | > POINT 1 "Executable specifications introduce a SECURITY HOLE":
| >> | >
| >> | > The big sticking point that we appear to have is your
| >> | > contention
| >> | > that an "executable specification is just fine" and my
| >> | > contention
| >> | > that an "executable specification is not".
| >> | >
| >> | > You do understand that when you crack open an mcz file
| >> | > that the
| >> | > serialized Monticello meta data is in a parsable file format
| >> | > called DataStream, right? When you deserialize that code, a
| >> | > stream
| >> | > is parsed and data is read off of that stream to reconstruct
| >> | > the
| >> | > Monticello meta data without executing any code that wasn't
| >> | > already loaded in the image (i.e., no SECURITY PROBLEMS).
| >> | >
| >> | > So the first thing that I need from you Stef is an
| >> | > acknowledgement
| >> | > that you understand this is a weakness of an executable
| >> | > specification. In other words, this type of situation is
| >> | > unacceptable:
| >> | >
| >> | >  (Node
| >> | >        name: #baseline
| >> | >        son: (Node
| >> | >                        name: 'ProjectVersion')).
| >> | >  SmalltalkImage current shutdown.
| >> |
| >> | Two points
| >> |    - if you use a literal array you do not have this problem.
| >>
| >> If I expect to execute an expression using the literal array then
| >> it is a problem.
| >>
| >> SmalltalkImage current shutdown.
| >> ^#()
| >>
| >> is still a problem if one is expected to EXECUTE the statements
| >> whether it be a doit or a method or a string that is compiled and
| >> executed ... The security hole exists if one is expected to
| >> EXECUTE a Smalltalk expression, because there is no way to
| >> guarantee that the expression is safe.
| >>
| >> You do understand that the current mechanism we use in Metacello
| >> is to send the #project message to a class:
| >>
| >> (Smalltalk at: #ConfigurationOfXXX) project
| >>
| >> and there are ZERO guarantees that the #project message won't do
| >> something nasty ... One cannot even load the code from an
| >> arbitrary mcz file on a server, to create the class before
| >> getting a chance to send the #project message, because the
| >> #initialize message is AUTOMATICALLY executed. As it stands
| >> today, there is no (safe) way for code executing on a server to
| >> reason about a Metacello configuration without executing arbitrary
| >> Smalltalk code.
| >>
| >> Executing arbitrary Smalltalk expressions in a server is a
| >> security hole. It is the moral equivalent of giving a hacker a
| >> login on your system and your server is not safe. I cannot run a
| >> server exposed to the outside world in a VMWare data center that
| >> has this security hole.
| >>
| >> If you would just acknowledge that this is true I will read the
| >> rest of your email message ...
| >>
| >> If you don't acknowledge this point, there is no point in trying
| >> to talk about anything else on this subject.
| >>
| >> Dale
|
|

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
In reply to this post by stephane ducasse-2
So Stef,

Parser is Squeak/Pharo specific. There is no Parser class in GemStone ... You are assuming that a portable parser exists for literal arrays and one does not exist.

If you provide me with a portable parser that runs on Squeak/Pharo/GemStone, then I will use it...

Dale

----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 12:22:44 PM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
|
| On Feb 8, 2012, at 8:13 PM, Dale Henrichs wrote:
|
| > Otto,
| >
| > Security isn't an issue with the current JSON parser, it creates
| > Arrays and Dictionaries from JSON, then I interpret the structure
| > by sending messages to a known class instance and no strings are
| > evaluated, so badly formed JSON input will only result in
| > walkbacks, not a trojan horse. BTW, in Javascript, JSON strings
| > apparently ARE executed like literal arrays, but the security
| > issues with code that executes in a Browser aren't as great as
| > those for a server.
|
| but you can get exactly the same with a literal array. It is the
| semantic analysis of the structure (or its absence) that makes it
| vulnerable.
|
|
| >
| > I have nothing against alternate parsing/creation schemes, however,
| > I am not excited about writing a new parser nor am I interested in
| > requiring something like AST when we've got a perfectly good 28
| > method JSON parser:).
|
| But what I do not understand is if you get
| '#(baseline
| #(version ))))'
|
| the literal parser parses that expression for you in one line,
| and it does not execute anything because it is a literal parser.
|
| Parser new scanTokens: 'SmalltalkImage'
|
| guess what you get…..
| tada
| #SmalltalkImage not the class
|
| Parser new scanTokens: '#(baseline (version 3.6))'
|
| and here
| tada
| an array.
|
| #(#(#baseline #(#version 3.6)))
|
| So ?
|
|
| >
| > Right now I'm continuing to move forward using JSON, with the
| > expectation that over time the format of the file will change and
| > when it changes it will be for the better.
| >
| > Yesterday, I was able to use Metacello to load from a fileTree
| > repository using the JSON specification format. So the following
| > Metacello script is functional:
| >
| >  Metacello new
| >    project: 'Sample';
| >    filetree: '/foos1/users/dhenrich/smalltalk/sample/';
| >    load.
| >
| > and I am continuing work on the Metacello scripting API as a whole.
| >
| > As I work through issues, it is beginning to look like you will be
| > able to write a baseline specification using the familiar
| > method-based format and when you save the project to a filetree
| > repository, I will convert the baseline specification into JSON,
| > but we'll see how it goes. I am juggling a number of different
| > requirements as I move forward, so we'll see what happens.
| >
| > As far as validation ... that is needed no matter what ... writing
| > a Smalltalk method doesn't guarantee that the spec is valid or
| > correct, so the specs created via JSON will be validated as well
| > ...
| >
| > To be honest if we are going to write a parser for Metacello specs,
| > then we should parse the current method-based format, then there
| > will be only one format for folks to worry about ... While this
| > would be ideal, I just don't have the extra cycles to spend
| > writing a parser.
| >
| > Dale
| >
| > ----- Original Message -----
| > | From: "Otto Behrens" <[hidden email]>
| > | To: [hidden email]
| > | Sent: Wednesday, February 8, 2012 10:18:02 AM
| > | Subject: Re: [Metacello] feedback on using JSON to specify
| > | baselines for git repository
| > |
| > | Hi,
| > |
| > | I understand that the security hole is created by evaluating a
| > | text
| > | string, thereby compiling and executing the expression.
| > |
| > | So, we want to avoid embedding Smalltalk methods in the metacello
| > | specification. This could even be a problem with a JSON
| > | implementation
| > | if we parse the spec and then perform certain elements in the
| > | parsed
| > | spec. Otoh, this is not as big a risk as evaluating more general
| > | expressions.
| > |
| > | Surely it is possible to parse a literal array syntax instead of
| > | a
| > | json syntax, without dynamically executing the literal array. The
| > | advantage of this approach is that a Smalltalk scanner / parser
| > | can
| > | be
| > | used, which comes with the image. Well, this assumes a portable
| > | or
| > | standard scanner / parser exists between gemstone and pharo and
| > | what
| > | not.
| > |
| > | I kind of like the idea of using the Smalltalk tools; I could even
| > | use
| > | a method with auto formatting to create the spec. The tools could
| > | then
| > | verify if my spec makes sense.
| > |
| > | HTH
| > | Otto
| > |
| > | On 08 Feb 2012, at 19:18, Dale Henrichs <[hidden email]>
| > | wrote:
| > |
| > | >
| > | >
| > | > ----- Original Message -----
| > | > | From: "stephane ducasse" <[hidden email]>
| > | > | To: [hidden email]
| > | > | Sent: Wednesday, February 8, 2012 8:17:13 AM
| > | > | Subject: Re: [Metacello] feedback on using JSON to specify
| > | > | baselines for git repository
| > | > |
| > | > | Hi dale,
| > | > |
| > | > | > Stef,
| > | > | >
| > | > | > Let me start by acknowledging that I understand that I am
| > | > | > having a
| > | > | > hard time getting my point across to you (and apparently
| > | > | > Camillo).
| > | > | > The following message may sound harsh, but I am merely
| > | > | > trying
| > | > | > to
| > | > | > establish a basic foundation from which I am drawing
| > | > | > conclusions.
| > | > | > While it is clear to me that you are not understanding my
| > | > | > points,
| > | > | > it is not at all clear to me what you don't understand...
| > | > | > Consequently I am going to try to say the same things I
| > | > | > have
| > | > | > said
| > | > | > in my previous emails in a different form and see if we can
| > | > | > make
| > | > | > some progress.
| > | > | >
| > | > | > POINT 1 "Executable specifications introduce a SECURITY
| > | > | > HOLE":
| > | > | >
| > | > | > The big sticking point that we appear to have is your
| > | > | > contention
| > | > | > that an "executable specification is just fine" and my
| > | > | > contention
| > | > | > that an "executable specification is not".
| > | > | >
| > | > | > You do understand that when you crack open an mcz file
| > | > | > that
| > | > | > the
| > | > | > serialized Monticello meta data is in a parsable file
| > | > | > format
| > | > | > called DataStream, right? When you deserialize that code, a
| > | > | > stream
| > | > | > is parsed and data is read off of that stream to
| > | > | > reconstruct
| > | > | > the
| > | > | > Monticello meta data without executing any code that wasn't
| > | > | > already loaded in the image (i.e., no SECURITY PROBLEMS).
| > | > | >
| > | > | > So the first thing that I need from you Stef is an
| > | > | > acknowledgement
| > | > | > that you understand this is a weakness of an executable
| > | > | > specification. In other words, this type of situation is
| > | > | > unacceptable:
| > | > | >
| > | > | >  (Node
| > | > | >        name: #baseline
| > | > | >        son: (Node
| > | > | >                        name: 'ProjectVersion')).
| > | > | >  SmalltalkImage current shutdown.
| > | > |
| > | > | Two points
| > | > |    - if you use a literal array you do not have this problem.
| > | >
| > | > If I expect to execute an expression using the literal array
| > | > then
| > | > it is a problem.
| > | >
| > | >  SmalltalkImage current shutdown.
| > | >  ^#()
| > | >
| > | > is still a problem if one is expected to EXECUTE the statements
| > | > whether it be a doit or a method or a string that is compiled
| > | > and
| > | > executed ... The security hole exists if one is expected to
| > | > EXECUTE a Smalltalk expression, because there is no way to
| > | > guarantee that the expression is safe.
| > | >
| > | > You do understand that the current mechanism we use in
| > | > Metacello is
| > | > to send the #project message to a class:
| > | >
| > | >  (Smalltalk at: #ConfigurationOfXXX) project
| > | >
| > | > and there are ZERO guarantees that the #project message won't
| > | > do
| > | > something nasty ... One cannot even load the code from an
| > | > arbitrary mcz file on a server, to create the class before
| > | > getting
| > | > a chance to send the #project message, because the #initialize
| > | > message is AUTOMATICALLY executed. As it stands today, there is
| > | > no
| > | > (safe) way for code executing on a server to reason about a
| > | > Metacello configuration without executing arbitrary Smalltalk
| > | > code.
| > | >
| > | > Executing arbitrary Smalltalk expressions in a server is a
| > | > security
| > | > hole. It is the moral equivalent of giving a hacker a login on
| > | > your system and your server is not safe. I cannot run a server
| > | > exposed to the outside world in a VMWare data center that has
| > | > this
| > | > security hole.
| > | >
| > | > If you would just acknowledge that this is true I will read the
| > | > rest of your email message ...
| > | >
| > | > If you don't acknowledge this point, there is no point in
| > | > trying to
| > | > talk about anything else on this subject.
| > | >
| > | > Dale
| > |
|
|

Re: feedback on using JSON to specify baselines for git repository

Frank Shearar-3
In reply to this post by Dale Henrichs
On 8 February 2012 21:23, Dale Henrichs <[hidden email]> wrote:
> Stef,
>
> The only thing I can think is that you must be _assuming_ that the literal array is consumed by a parser that does not accept standard Smalltalk syntax, but is written to parse only literal arrays of strings and literal arrays.
>
> If you expect the Smalltalk compiler to compile and evaluate an expression to produce an array from the literal array string, then you are open to the security hole.
>
> If you postulate that someone were to write a portable literal array parser, then we have come to an understanding...and I will patiently wait for one to appear.

There's another advantage to JSON over a Smalltalk literal array:
loads of other languages have JSON parsers, so they can parse these specs
and do interesting things.

(I do this with ruby and Maven's pom.xml files. It's really, really handy.)
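
As a small sketch of that idea, the following Python snippet reads a baseline spec with the standard `json` module and reports requirements that no package declares. The JSON layout (`baseline`, `packages`, `name`, `requires`) and the package names are invented for the example and are not taken from the actual metacello.json format; the helper name `undeclared_requirements` is likewise hypothetical.

```python
import json

# Hypothetical spec layout, invented for this example; the real
# metacello.json format may look quite different.
SPEC = """
{
  "baseline": {
    "packages": [
      {"name": "Sample-Core", "requires": []},
      {"name": "Sample-Tests", "requires": ["Sample-Core", "Sample-Mocks"]}
    ]
  }
}
"""


def undeclared_requirements(spec_text):
    """Return required package names that the spec never declares."""
    packages = json.loads(spec_text)["baseline"]["packages"]
    declared = {package["name"] for package in packages}
    required = {name for package in packages for name in package.get("requires", [])}
    return sorted(required - declared)


print(undeclared_requirements(SPEC))
```

Nothing here needs a Smalltalk image: the spec is plain data, so a check like this could run in any tool chain that has a JSON parser.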

frank

> Dale
>
> ----- Original Message -----
> | From: "stephane ducasse" <[hidden email]>
> | To: [hidden email]
> | Sent: Wednesday, February 8, 2012 12:12:01 PM
> | Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
> |
> | >
> | > If I expect to execute an expression using the literal array then
> | > it is a problem.
> | >
> | >  SmalltalkImage current shutdown.
> | >  ^#()
> | >
> | > is still a problem if one is expected to EXECUTE the statements
> | > whether it be a doit or a method or a string that is compiled and
> | > executed ... The security hole exists if one is expected to
> | > EXECUTE a Smalltalk expression, because there is no way to
> | > guarantee that the expression is safe.
> | >
> | > You do understand that the current mechanism we use in Metacello is
> | > to send the #project message to a class:
> | >
> | >  (Smalltalk at: #ConfigurationOfXXX) project
> | >
> | > and there are ZERO guarantees that the #project message won't do
> | > something nasty ... One cannot even load the code from an
> | > arbitrary mcz file on a server, to create the class before getting
> | > a chance to send the #project message, because the #initialize
> | > message is AUTOMATICALLY executed. As it stands today, there is no
> | > (safe) way for code executing on a server to reason about a
> | > Metacello configuration without executing arbitrary Smalltalk code.
> | >
> | > Executing arbitrary Smalltalk expressions in a server is a security
> | > hole. It is the moral equivalent of giving a hacker a login on
> | > your system and your server is not safe. I cannot run a server
> | > exposed to the outside world in a VMWare data center that has this
> | > security hole.
> |
> | yes that I understand.
> |
> | but tell me why returning a JSON string would be any different from
> | a literal array?
> | because the problem is not the literal array or the JSON but the way
> | you get your hands on it.
> |
> | If your literal array parser parses a string then you should get a
> | stream of literals.
> | parsing
> |       #(#baseline
> |               #(project '3.6')
> |               )
> |
> | will return a stream composed of literals.
> | I think that this is exactly the same as for a JSON expression.
> | Now if a hacker adds
> |
> |       SmalltalkImage killtherest
> |       #(#baseline
> |               #(project '3.6')
> |               )
> |
> | then your semantic analysis will see that this is not a valid element:
> | first it is not an array, second SmalltalkImage is not
> | an authorized keyword like baseline, repository or whatever.
> |
> | So the parsing is not the problem or there is something I do not
> | understand.
> |
> |
> |
> |
> | > If you would just acknowledge that this is true I will read the
> | > rest of your email message ...
> | >
> | > If you don't acknowledge this point, there is no point in trying to
> | > talk about anything else on this subject.
> | >
> | > Dale
> |
> |

Re: feedback on using JSON to specify baselines for git repository

Dale Henrichs
In reply to this post by stephane ducasse-2
Stef,

BTW, I consider the current method-based format to be declarative syntax as well, but I digress...

To start with, I covered most of the reasons for considering a structured file format (as opposed to a Smalltalk source format) in my first post, and it was the sum of all of these issues that led me to consider JSON:

  1. The security issue, of course.

  2. Last Friday, I hit the 255 literal limit for a method
     in the Seaside30 3.0.6 version method. A year ago I hit
     the literal limit for the 3.0.0-alpha5.5-baseline
     specification. The implication is that using method-
     based or doit-based specifications will not scale and
     in the case of Seaside30, has already hit the limits.

  3. When doing a merge of a configuration it is necessary
     to have the Metacello project information for three
     different versions of the ConfigurationOfXXX class:
     common ancestor, configuration version A, and
     configuration version B (just like Monticello).
     Unfortunately, with the current system that necessitates
     loading each version of the ConfigurationOfXXX class,
     executing the #project method and then stashing the result
     while going about loading the other versions of the class.
     If we are LOADING classes what happens if the class has
     been modified in the current image, how does one get the
     other two versions of the classes needed to perform the
     merge?

     Monticello doesn't have this problem, because it stores
     the serialized meta model. It is trivial to have as many
     different versions of the meta model for a package as one
     needs.

     With a structured file format like JSON, it becomes trivial
     to load the Metacello version information for multiple
     project versions into memory at once.

     For mcz-based ConfigurationOfXXX files, it is still an open
     issue as to where to stash the structured (PARSABLE) data
     file, but having a structured file format gives us one less
     problem to solve in this area.

  4. In a directory-based SCM (git or SVN) the entire directory
     is versioned. Only the Metacello baseline specification
     applies to a single version of the directory. The directory
     will contain the correct version of each package for the
     project, so there is no need to specify any package
     versions - only the name is needed and the name is already
     in the baseline. You need package dependencies and required
     projects, but these are also specified in the baseline. If
     you add or remove a package, you will edit the baseline to
     reflect the changes to the structure of the project, then
     when you commit there is no need to keep the old baseline
     version around. So the net-net is that a single baseline
     specification is all that is needed as part of the directory.

     It is true, though, that there is information like the symbolic
     version information in the ConfigurationOf file that needs to
     be versioned separately from the project packages. This is the
     meta information that tells you which git (or SVN) version you
     should use for GemStone or Pharo1.1, or Squeak. This meta
     information will still be kept in the new ConfigurationOf that
     is versioned on a separate cycle from the embedded baseline
     information.

     In the end then, I will be using the JSON (or literal array)
     format only to store the per project version baseline information
     in a file currently called metacello.json[1].
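
To illustrate the merge scenario from point 3, here is a minimal sketch in Python (chosen only to keep the example self-contained). It assumes each version of the spec has been reduced to a flat JSON object mapping package names to their required packages; that layout, the package names, and the function `three_way_merge` are all hypothetical and much simpler than a real Metacello baseline. The point is that all three versions can be parsed into memory side by side, with no class loading and no code execution.

```python
import json

_MISSING = object()  # marks "package not present in this version"

# Hypothetical, heavily simplified specs: package name -> required packages.
ANCESTOR = '{"Sample-Core": [], "Sample-Tests": ["Sample-Core"]}'
VERSION_A = ('{"Sample-Core": [], "Sample-Tests": ["Sample-Core"],'
             ' "Sample-Help": ["Sample-Core"]}')
VERSION_B = '{"Sample-Core": ["Sample-Platform"], "Sample-Tests": ["Sample-Core"]}'


def three_way_merge(base, ours, theirs):
    """Merge per package; return (merged, conflicting package names)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:        # both sides agree (including both deleted it)
            winner = o
        elif o == b:      # only "theirs" changed this package
            winner = t
        elif t == b:      # only "ours" changed this package
            winner = o
        else:             # both changed it differently
            conflicts.append(key)
            continue
        if winner is not _MISSING:
            merged[key] = winner
    return merged, conflicts


# All three versions live in memory at once as plain data.
merged, conflicts = three_way_merge(
    json.loads(ANCESTOR), json.loads(VERSION_A), json.loads(VERSION_B)
)
print(merged)
print(conflicts)
```

In this example version A adds a package and version B changes the requirements of another, so both changes end up in the merged result and no conflict is reported. A real merge would need to work on the full nested structure and present conflicts to the user.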

Does this help clear things up?

Dale

[1] https://github.com/dalehenrich/metacello/blob/MetacelloGitProject/metacello.json

----- Original Message -----
| From: "stephane ducasse" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, February 8, 2012 12:24:35 PM
| Subject: Re: [Metacello] feedback on using JSON to specify baselines for git repository
|
| BTW Dale, I still do not understand how this declarative syntax can
| help you merge configurations.
| It helps you avoid the server security hole, but in my first email
| about git I asked and you told me that it was about merging
| configurations, and since then I'm confused.
|
| Stef
|
|
| On Feb 8, 2012, at 8:13 PM, Dale Henrichs wrote:
|
| > Otto,
| >
| > Security isn't an issue with the current JSON parser, it creates
| > Arrays and Dictionaries from JSON, then I interpret the structure
| > by sending messages to a known class instance and no strings are
| > evaluated, so badly formed JSON input will only result in
| > walkbacks, not a trojan horse. BTW, in Javascript, JSON strings
| > apparently ARE executed like literal arrays, but the security
| > issues with code that executes in a Browser aren't as great as
| > those for a server.
| >
| > I have nothing against alternate parsing/creation schemes, however,
| > I am not excited about writing a new parser nor am I interested in
| > requiring something like AST when we've got a perfectly good 28
| > method JSON parser:).
| >
| > Right now I'm continuing to move forward using JSON, with the
| > expectation that over time the format of the file will change and
| > when it changes it will be for the better.
| >
| > Yesterday, I was able to use Metacello to load from a fileTree
| > repository using the JSON specification format. So the following
| > Metacello script is functional:
| >
| >  Metacello new
| >    project: 'Sample';
| >    filetree: '/foos1/users/dhenrich/smalltalk/sample/';
| >    load.
| >
| > and I am continuing work on the Metacello scripting API as a whole.
| >
| > As I work through issues, it is beginning to look like you will be
| > able to write a baseline specification using the familiar
| > method-based format and when you save the project to a filetree
| > repository, I will convert the baseline specification into JSON,
| > but we'll see how it goes. I am juggling a number of different
| > requirements as I move forward, so we'll see what happens.
| >
| > As far as validation ... that is needed no matter what ... writing
| > a Smalltalk method doesn't guarantee that the spec is valid or
| > correct, so the specs created via JSON will be validated as well
| > ...
| >
| > To be honest if we are going to write a parser for Metacello specs,
| > then we should parse the current method-based format, then there
| > will be only one format for folks to worry about ... While this
| > would be ideal, I just don't have the extra cycles to spend
| > writing a parser.
| >
| > Dale
| >
| > ----- Original Message -----
| > | From: "Otto Behrens" <[hidden email]>
| > | To: [hidden email]
| > | Sent: Wednesday, February 8, 2012 10:18:02 AM
| > | Subject: Re: [Metacello] feedback on using JSON to specify
| > | baselines for git repository
| > |
| > | Hi,
| > |
| > | I understand that the security hole is created by evaluating a
| > | text
| > | string, thereby compiling and executing the expression.
| > |
| > | So, we want to avoid embedding Smalltalk methods in the metacello
| > | specification. This could even be a problem with a JSON
| > | implementation
| > | if we parse the spec and then perform certain elements in the
| > | parsed
| > | spec. Otoh, this is not as big a risk as evaluating more general
| > | expressions.
| > |
| > | Surely it is possible to parse a literal array syntax instead of
| > | a
| > | json syntax, without dynamically executing the literal array. The
| > | advantage of this approach is that a Smalltalk scanner / parser
| > | can
| > | be
| > | used, which comes with the image. Well, this assumes a portable
| > | or
| > | standard scanner / parser exists between gemstone and pharo and
| > | what
| > | not.
| > |
| > | I kind of like the idea of using the Smalltalk tools; I could even
| > | use
| > | a method with auto formatting to create the spec. The tools could
| > | then
| > | verify if my spec makes sense.
| > |
| > | HTH
| > | Otto
| > |
| > | On 08 Feb 2012, at 19:18, Dale Henrichs <[hidden email]>
| > | wrote:
| > |
| > | >
| > | >
| > | > ----- Original Message -----
| > | > | From: "stephane ducasse" <[hidden email]>
| > | > | To: [hidden email]
| > | > | Sent: Wednesday, February 8, 2012 8:17:13 AM
| > | > | Subject: Re: [Metacello] feedback on using JSON to specify
| > | > | baselines for git repository
| > | > |
| > | > | Hi dale,
| > | > |
| > | > | > Stef,
| > | > | >
| > | > | > Let me start by acknowledging that I understand that I am
| > | > | > having a
| > | > | > hard time getting my point across to you (and apparently
| > | > | > Camillo).
| > | > | > The following message may sound harsh, but I am merely
| > | > | > trying
| > | > | > to
| > | > | > establish a basic foundation from which I am drawing
| > | > | > conclusions.
| > | > | > While it is clear to me that you are not understanding my
| > | > | > points,
| > | > | > it is not at all clear to me what you don't understand...
| > | > | > Consequently I am going to try to say the same things I
| > | > | > have
| > | > | > said
| > | > | > in my previous emails in a different form and see if we can
| > | > | > make
| > | > | > some progress.
| > | > | >
| > | > | > POINT 1 "Executable specifications introduce a SECURITY
| > | > | > HOLE":
| > | > | >
| > | > | > The big sticking point that we appear to have is your
| > | > | > contention
| > | > | > that an "executable specification is just fine" and my
| > | > | > contention
| > | > | > that an "executable specification is not".
| > | > | >
| > | > | > You do understand that when you crack open an mcz file
| > | > | > that
| > | > | > the
| > | > | > serialized Monticello meta data is in a parsable file
| > | > | > format
| > | > | > called DataStream, right? When you deserialize that code, a
| > | > | > stream
| > | > | > is parsed and data is read off of that stream to
| > | > | > reconstruct
| > | > | > the
| > | > | > Monticello meta data without executing any code that wasn't
| > | > | > already loaded in the image (i.e., no SECURITY PROBLEMS).
| > | > | >
| > | > | > So the first thing that I need from you Stef is an
| > | > | > acknowledgement
| > | > | > that you understand this is a weakness of an executable
| > | > | > specification. In other words, this type of situation is
| > | > | > unacceptable:
| > | > | >
| > | > | >  (Node
| > | > | >        name: #baseline
| > | > | >        son: (Node
| > | > | >                        name: 'ProjectVersion')).
| > | > | >  SmalltalkImage current shutdown.
| > | > |
| > | > | Two points
| > | > |    - if you use a literal array you do not have this problem.
| > | >
| > | > If I expect to execute an expression using the literal array
| > | > then
| > | > it is a problem.
| > | >
| > | >  SmalltalkImage current shutdown.
| > | >  ^#()
| > | >
| > | > is still a problem if one is expected to EXECUTE the statements
| > | > whether it be a doit or a method or a string that is compiled
| > | > and
| > | > executed ... The security hole exists if one is expected to
| > | > EXECUTE a Smalltalk expression, because there is no way to
| > | > guarantee that the expression is safe.
| > | >
| > | > You do understand that the current mechanism we use in
| > | > Metacello is
| > | > to send the #project message to a class:
| > | >
| > | >  (Smalltalk at: #ConfigurationOfXXX) project
| > | >
| > | > and there are ZERO guarantees that the #project message won't
| > | > do
| > | > something nasty ... One cannot even load the code from an
| > | > arbitrary mcz file on a server, to create the class before
| > | > getting
| > | > a chance to send the #project message, because the #initialize
| > | > message is AUTOMATICALLY executed. As it stands today, there is
| > | > no
| > | > (safe) way for code executing on a server to reason about a
| > | > Metacello configuration without executing arbitrary Smalltalk
| > | > code.
| > | >
| > | > Executing arbitrary Smalltalk expressions in a server is a
| > | > security
| > | > hole. It is the moral equivalent of giving a hacker a login on
| > | > your system and your server is not safe. I cannot run a server
| > | > exposed to the outside world in a VMWare data center that has
| > | > this
| > | > security hole.
| > | >
| > | > If you would just acknowledge that this is true I will read the
| > | > rest of your email message ...
| > | >
| > | > If you don't acknowledge this point, there is no point in
| > | > trying to
| > | > talk about anything else on this subject.
| > | >
| > | > Dale
| > |
|
|
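The distinction argued over in the quoted messages (parsing a literal-only specification versus executing one) can be illustrated with a minimal sketch. It is written in Python rather than Smalltalk purely as an analogue and is not Metacello code: `load_spec`, `SAFE_SPEC` and `HOSTILE_SPEC` are hypothetical names, and `ast.literal_eval` stands in for a literal-array parser that reads data without compiling or running anything.

```python
import ast

# Hypothetical spec text made only of literals (analogue of a literal array).
SAFE_SPEC = "['baseline', ['ProjectVersion']]"

# Hypothetical hostile text: harmless as a string, dangerous only if evaluated.
HOSTILE_SPEC = "__import__('os').system('echo pwned') or []"


def load_spec(text):
    """Parse a literal-only spec; the text is never compiled or executed."""
    # literal_eval accepts only literals (strings, numbers, lists, ...)
    # and raises ValueError for calls, names, operators and the like.
    return ast.literal_eval(text)


if __name__ == "__main__":
    print(load_spec(SAFE_SPEC))
    try:
        load_spec(HOSTILE_SPEC)
    except (ValueError, SyntaxError):
        # The payload is rejected as malformed data; it was never run.
        print("hostile spec rejected without executing it")
```

Passing `HOSTILE_SPEC` to `eval` instead would run the embedded command, which is the situation the thread describes as unacceptable on a server.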