Re: [Pharo-dev] [squeak-dev] Re: The Dilemma: Building a Futuristic GUI for Ephestos

Eliot Miranda-2

Re: [Pharo-dev] [squeak-dev] Re: The Dilemma: Building a Futuristic GUI for Ephestos

Hi Torsten,

wow, what a great summary. Thanks. See below.

On Wed, Sep 17, 2014 at 12:09 PM, Torsten Bergmann <[hidden email]> wrote:

Eliot wrote:
>We're getting there with fast. Tiny needs more definition, but the core Cog/Spur VM on Mac minus plugins and GUI
>code is 568k, 506k code, 63k data; the Newspeak VM which has fewer primitives but support for two bytecode sets is
>453k, 386k code, 68k data. That includes the interpreter, JIT, core primitives and memory manager/GC. Add on
>several k for the FFI, C library et al and a VM in a megabyte is realistic. Is that in the right ball park?

Yes, that whets my appetite :)

Tiny means easy to download and use for scripting, no other dependencies
for the out-of-the-box REPL experience. But still powerful enough to build one thing on another
and scale up to large programs.

Initially you provide just a simple VM executable for a "tiny smalltalk" (ts), for simple
things like:

c:\myscripts\ts.exe -e 3+4
7

where ts.exe is the VM in a few kB including a small initial kernel image (as part of
the PE DATA section). It should be able to run scripts:

c:\myscripts\ts.exe automate.st

It should be possible to easily build components (either with or without the sources):

ts.exe sunit.st -o sunit.sll
ts.exe sunit.st -noSource -o sunit.sll

So you can either include other scripts

ts.exe -i sunit.st mytests.st

or link the previously built binary components to form something new.

ts.exe -l sunit.sll mytests.st

But it should be possible to build another vm+image easily:

ts.exe -i mykillerapp.st -o killerapp.exe
or ts.exe -l mykillerapp.sll -o killerapp.exe

by writing a new PE file for Windows either including a new kernel image or a combined
one based on the already built-in initial kernel of the previous one.
So I can deploy a killerapp or reassemble it to form the next language version.

SmallScript was very close to that and it was a nice thing to work with.
Especially since Smalltalk usually throws things out at the end - but
one never builds a house by carving it out of a rock. We need bricks to assemble.

+1000.

With the above scheme you could build components, some of them portable, others
explicitly non-portable as the code includes native calls. The latter have to
be rebuilt on the other platform - so if you have good design abstractions you
can implement a UI lib on Windows and one on Mac.

Some of the components are with the source code packaged in for debugging
(but not changeable), for closed source components you leave it out.

And still it should support pinnable objects (not moved by GC for callouts),
sandboxing, Classes with UUIDs to allow for side by side loading/migration
and an extensible meta-scheme as this is where Smalltalk is still the hero of all
and metaprogramming will be our future.

So far I have pinning. But I'd love to hear more detail on sandboxing. I'm not sure about UUIDs. What about other mechanisms, namespaces, ClassBoxes? UUID sounds too much like an implementation to me. No one is proposing qualifying class names with UUIDs (Array:c5c09212-0c7d-44f3-81b7-18ae5d7f14d9 new: 5 ????). How about reiterating this more abstractly?

Also the VM should be available as a shared component - so it can run in-process
in other programs like a web browser. Always wished to write:

<script type="text/tinyscript">
BrowserLog write: 'HelloWorld'.
</script>

directly in HTML.

And this is only the beginning of the wish list...

Please, I'd love to have you write a fuller list. We could add it to the list at Cog Projects or add it to a "Directions" page or some such. But capturing these ideas is important.

>Yep. But personally I think Fuel is better and just as fast. Certainly that's the parcels experience.
> - but still with the ability to bootstrap up to a full saveable image

Yes, Fuel would be an option. At least it would be platform independent. Not sure
about other options (quick loading by mapping component using memory mapped files etc.)

Well, mapping finished images a la VW's PermSpace is probably easier. The memory mapping in .Nt is extremely complex. But I get the requirement; ultra-fast start-up of small components.

>This we'll leave to the web framework designers, but it seems eminently doable no?

For sure.

Bye
Torsten

thanks again. And give serious thought to carefully enunciating these requirements/this design sketch?

--
best,

Eliot

ccrraaiigg

re: deployment components (was "The Dilemma: Building a Futuristic GUI for Ephestos")

Hi Eliot!

Torsten writes:

> Classes with UUIDs to allow for side by side loading/migration and an
> extensible meta-scheme as this is where Smalltalk is still the hero
> of all and metaprogramming will be our future.

You respond:

> So so far I have pinning. But I'd love to hear more detail on
> sandboxing. I'm not sure about UUIDs. What about other mechanisms,
> namespaces, ClassBoxes? UUID sounds too much like an implementation
> to me. No one is proposing qualifying class names with UUIDs
> (Array:c5c09212-0c7d-44f3-81b7-18ae5d7f14d9 new: 5 ????). How about
> reiterating this more abstractly?

Right, there's no need for anyone to use UUIDs when referring to
classes in Smalltalk expressions. We can keep using class names as we
always have. Instead, we can do the following:

- Add an ID field to each class.

- Make each class responsible for its own name literal. The "name"
field of each class would hold the method literal used by methods
that refer to that class, rather than just a name key into the
traditional system dictionary.

This allows each class to have any name. When a developer attempts
to compile source that refers to a particular class name, the system can
present a list of all the classes with that name from which to choose.
To aid in the decision, the system can present additional information,
such as author, release date, module membership, etc. When displaying
method source after compilation, the system can emphasize class names
which correspond to multiple classes, and provide quick means for
getting at that additional information.
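A minimal sketch of the lookup described above, written in Python only for illustration. The names ClassRecord, Registry, candidates and resolve are hypothetical and are not part of Naiad or any Smalltalk system; the point is merely that names may collide while IDs do not, and that a tool can ask the developer to choose when a name is ambiguous.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ClassRecord:
    """Hypothetical stand-in for a class: a free-form name plus a unique ID."""
    name: str
    author: str
    module: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class Registry:
    """Hypothetical registry keyed by ID rather than by name."""

    def __init__(self):
        self._by_id = {}

    def add(self, record):
        self._by_id[record.id] = record
        return record

    def candidates(self, name):
        # Every class that currently carries this name.
        return [r for r in self._by_id.values() if r.name == name]

    def resolve(self, name, choose):
        # 'choose' models the developer picking from a list shown by the tools.
        matches = self.candidates(name)
        if not matches:
            raise KeyError(name)
        if len(matches) == 1:
            return matches[0]
        return choose(matches)
```

In a real system the choose callback would be a dialog showing author, release date and module membership; here it is any function that picks one record from the list.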

We could keep that information in a live memory of objects
describing edits made to the system, with which we can interact via
remote messaging. Let's call it a history memory. Like the traditional
changes/sources system, it could record the history of our development
activity, but with all the reflective power of an object system for
answering our queries (and synchronizing with other systems on our behalf).

This history memory could supplant the traditional system
dictionary and changes/sources files. The key to making this practical
is having a minimal object memory and a module system with which to grow it.

I've done all this. Will you collaborate with me?

thanks!

-C

[1] http://thiscontext.com/a-detailed-naiad-description

(I'm currently working on a GitHub-based installer. It interacts with
GitHub's web services API, reminiscent of the WebDAV support I wrote for
interacting with the live system from Your Favorite Text Editor.)

--
Craig Latta
netjam.org
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)

Torsten Bergmann

re: deployment components (was "The Dilemma: Building a Futuristic GUI for Ephestos")

Craig Latta wrote:
>Right, there's no need for anyone to use UUIDs when referring to
>classes in Smalltalk expressions.

Yes, exactly this. We still use regular names for the classes - but for the tools
they have a distinguishable ID.
Best is to use a UUID for the ID as it is unique from the beginning. So if you
create a class "Foo" and I create one both are unique by ID - even when using
the same name.

The "rest" is implementing tooling support. When you write "Foo" in a workspace
then the workspace either already has a naming context to resolve the name (see [1])
or could provide you with a selection if there is more than one. Internally you could
bind to the right class by using the ID.

I hate prefixing the classes as we now do, I want to name the classes "Instruction"
not "IRInstruction" in Opal or "AJInstruction" in AsmJIT, especially since "IR" and "AJ"
do not really tell you anything. Smalltalk names should be understandable. Additionally
we should also have meaningful (and unique) names for namespaces.

Also the current three/two letter prefixing of classes that is used in Squeak or Pharo
only scales well for small communities...

When there are things I like in Java then it is the fact to use reverse domains
for namespaces. It is easy, understandable and I could also easily check if there
is a web location.

This way a namespace "org.apache.maven" gives you something meaningful, I can even
visit http://maven.apache.org/. Allows the community to easily scale with thousands of
classes without having clashes.

Think of "org.pharo.asmjit.Instruction" vs "org.pharo.opal.Instruction"
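As a rough illustration of the reverse-domain idea, here is a small Python sketch. The Namespace class and its methods are hypothetical, and the qualified names are just the two examples above; nothing here reflects how Squeak or Pharo actually bind names.

```python
class Namespace:
    """Hypothetical table binding fully qualified names to objects."""

    def __init__(self):
        self._bindings = {}

    def define(self, qualified_name, obj):
        # A fully qualified name may be bound only once.
        if qualified_name in self._bindings:
            raise ValueError("already defined: " + qualified_name)
        self._bindings[qualified_name] = obj

    def lookup(self, qualified_name):
        return self._bindings[qualified_name]

    def matches(self, short_name):
        # All qualified names whose last component is the given short name.
        return sorted(q for q in self._bindings
                      if q.rsplit(".", 1)[-1] == short_name)
```

Two classes can then both be called Instruction without a prefix, and a tool can still list every Instruction it knows about.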

So you additionally need better and meaningful namespaces THAT SCALE AND ARE EASY TO UNDERSTAND
combined with real packages (as objects, as Pharo already has).

Then we also need ABIA (around, before, inner, after messages), multiple method
categories, selector namespaces, possibility for visibility (like private methods, see [2]),
unified annotations for classes/methods/variables and a few other ingredients from my
personal wishlist and it may shape a different new future...

The interesting part is not to add all kind of stuff to Smalltalk - the interesting
part is what is the most minimalistic system to allow for something extensible
and still flexible (even within the meta-system) so that one can bootstrap things on top.

Bye
T.

[1] http://vimeo.com/75391990
[2] http://smalltalkhub.com/#!/~CamilleTeruel/PrivateMethods

Frank Shearar-3

re: deployment components (was "The Dilemma: Building a Futuristic GUI for Ephestos")

On 24 September 2014 09:16, Torsten Bergmann <[hidden email]> wrote:

>
> Craig Latta wrote:
>>Right, there's no need for anyone to use UUIDs when referring to
>>classes in Smalltalk expressions.
>
> Yes, exactly this. We still use regular names for the classes - but for the tools
> they have a distinguishable ID.
> Best is to use a UUID for the ID as it is unique from the beginning. So if you
> create a class "Foo" and I create one both are unique by ID - even when using
> the same name.

The problem I have with UUIDs is that, in a distributed setting, you
need to rely on everyone to generate UUIDs correctly. Now IF everyone
implements UUID generation correctly, and IF everyone plays the game
in good faith, you will almost certainly not get a UUID clash. But
when you say your code depends on package/class/object/method with
guid SOME_GUID, and I as a bad actor inject my own class with guid
SOME_GUID, how are you to know that you've just been 0wned?

Wouldn't it be better, instead of using a key-value store (key K maps
to value V), we have a store that uses function_of(V) maps to value V?
Like in git: a commit is uniquely identified by its hash and, because
the hash is a checksum, you can't - short of engineering a SHA-1
collision (good luck with that) - forge part of the system.
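A minimal sketch of such a store, in Python for illustration. The ContentStore class is hypothetical, and it uses SHA-256 from the standard library rather than the SHA-1 that git uses; that substitution is an assumption made for the example.

```python
import hashlib


class ContentStore:
    """Hypothetical store where the key is a hash of the value itself."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on the way out: a substituted value cannot match its key.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its key")
        return data
```

Because the key is derived from the value, injecting a different value under an existing key is detectable by anyone who re-hashes it.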

> The "rest" is implementing tooling support. When you write "Foo" in a workspace
> then the workspace either already has a naming context to resolve the name (see [1])
> or could provide you with a selection if there is more than one. Internally you could
> bind to the right class by using the ID.

Yes, that "rest" is most of the work. But you know this because of the
scare quotes :)

> I hate prefixing the classes as we now do, I want to name the classes "Instruction"
> not "IRInstruction" in Opal or "AJInstruction" in AsmJIT, especially since "IR" and "AJ"
> do not really tell you anything. Smalltalk names should be understandable. Additionally
> we should also have meaningful (and unique) names for namespaces.
>
> Also the current three/two letter prefixing of classes that is used in Squeak or Pharo
> only scales well for small communities...
>
> When there are things I like in Java then it is the fact to use reverse domains
> for namespaces. It is easy, understandable and I could also easily check if there
> is a web location.
>
> This way a namespace "org.apache.maven" gives you something meaningful, I can even
> visit http://maven.apache.org/. Allows the community to easily scale with thousands of
> classes without having clashes.
>
> Think of "org.pharo.asmjit.Instruction" vs "org.pharo.opal.Instruction"
>
> So you additionally need better and meaningful namespaces THAT SCALE AND ARE EASY TO UNDERSTAND
> combined with real packages (as objects, as Pharo already has).

We already have this, or the basis at least, in Squeak in the form of
Environments.

> Then we also need ABIA (around, before, inner, after messages), multiple method
> categories, selector namespaces, possibility for visibility (like private methods, see [2]),
> unified annotations for classes/methods/variables and a few other ingredients from my
> personal wishlist and it may shape a different new future...

Squeak hasn't done this yet, but it should be possible to use
Environments to implement selector namespaces. We haven't done it yet
because Colin Putney's been really busy lately, and no one else has
had the time/energy to Just Do It. And if you have selector
namespaces, you have private methods, don't you?

frank

> The interesting part is not to add all kind of stuff to Smalltalk - the interesting
> part is what is the most minimalistic system to allow for something extensible
> and still flexible (even within the meta-system) so that one can bootstrap things on top.
>
> Bye
> T.
>
> [1] http://vimeo.com/75391990
> [2] http://smalltalkhub.com/#!/~CamilleTeruel/PrivateMethods

ccrraaiigg

re: deployment components

Hi Frank--

I wrote:

> Right, there's no need for anyone to use UUIDs when referring to
> classes in Smalltalk expressions.

Torsten responded:

> Yes, exactly this. We still use regular names for the classes - but
> for the tools they have a distinguishable ID. Best is to use a UUID
> for the ID as it is unique from the beginning. So if you create a
> class "Foo" and I create one both are unique by ID - even when using
> the same name.

You responded to Torsten:

> The problem I have with UUIDs is that...

Hooray! Discussion! :)

> ...in a distributed setting, you need to rely on everyone to generate
> UUIDs correctly. Now IF everyone implements UUID generation
> correctly...

This is not hard, if there are common implementations at the
Smalltalk level (like we have common collection classes), or at the
virtual machine level (like we have common plugins).

> ...and IF everyone plays the game in good faith...

Whoa, if you're going to invoke that concern, then I wonder how you
can tolerate what we've already been doing. Namely:

- Compiling source code with no way to trust its authorship.

- Compiling source code with no way to trust that the code does what
the authors say it does (trusting authorship would help somewhat).

- Compiling source code with no way to trust that the target
compilation environment is sufficiently like the one in which the
source code was developed.

> ...you will almost certainly not get a UUID clash. But when you say
> your code depends on package/class/object/method with guid SOME_GUID,
> and I as a bad actor inject my own class with guid SOME_GUID, how are
> you to know that you've just been 0wned?

Again, how would you know today that the filein or Monticello
package you just loaded has 0wned you? When would you know? What would
you do about it?

I think we have to concede that a motivated insider will always be
able to do bad things for a while before being caught, probably
involving social mechanisms. I'm interested in how to minimize this
time, and how to undo the bad things. With the Naiad module system, I
want to identify, as soon as possible, the human interactions required
to establish trust, and facilitate them.

Naiad modules are synchronized live between systems (without
recompiling source code; only authors compile source code). You could
know you're likely to be 0wned if you proceed with a module
synchronization because the cryptographic signature of a module is
incorrect for its purported authors. It would be similar to knowing that
the PGP-signed message you just read isn't really from the person whose
key you verified earlier.
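The check before synchronizing can be sketched roughly as follows, in Python for illustration. This is not Naiad's mechanism: the function names are hypothetical, and the sketch uses an HMAC with a shared secret from the standard library as a stand-in for a real public-key signature, which is a deliberate simplification.

```python
import hashlib
import hmac


def sign_module(author_key: bytes, module_bytes: bytes) -> str:
    # Stand-in for a public-key signature: an HMAC over the module contents.
    return hmac.new(author_key, module_bytes, hashlib.sha256).hexdigest()


def safe_to_synchronize(author_key: bytes, module_bytes: bytes,
                        claimed_signature: str) -> bool:
    # Refuse to proceed if the signature does not match the purported author.
    expected = sign_module(author_key, module_bytes)
    return hmac.compare_digest(expected, claimed_signature)
```

With real public-key cryptography the verifier would hold only the author's public key; the control flow (verify first, synchronize only on success) would be the same.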

> Wouldn't it be better, instead of using a key-value store (key K maps
> to value V), we have a store that uses function_of(V) maps to value V?
> Like in git: a commit is uniquely identified by its hash and, because
> the hash is a checksum, you can't - short of engineering a SHA-1
> collision (good luck with that) - forge part of the system.

The things we want to identify have state that changes over time;
their hashes would also change over time. This is fine for things like a
module signature during a synchronization conversation (in fact, I'd
like to leverage the actual existing public-key cryptography
infrastructure for Naiad). It's inappropriate for referring to a
changing class over the course of years. UUIDs are a straightforward and
well-known way of generating IDs for things which change.
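To make the distinction concrete, here is a small Python sketch. TrackedClass and its methods are hypothetical; the only claim is that a hash of the state changes with every edit while a UUID assigned at creation does not.

```python
import hashlib
import uuid


class TrackedClass:
    """Hypothetical class object with a stable ID and mutable state."""

    def __init__(self, name):
        self.id = uuid.uuid4()   # assigned once, never modified
        self.name = name
        self.methods = {}

    def define_method(self, selector, source):
        self.methods[selector] = source

    def state_hash(self):
        # Hash of the current state; changes whenever a method changes.
        h = hashlib.sha256()
        h.update(self.name.encode("utf-8"))
        for selector in sorted(self.methods):
            h.update(b"\0" + selector.encode("utf-8"))
            h.update(b"\0" + self.methods[selector].encode("utf-8"))
        return h.hexdigest()
```

The hash suits a one-off check during a synchronization conversation; the ID suits references that must survive years of edits.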

Anyway, there it is, both UUIDs and hashing are involved.

-C

ccrraaiigg

re: deployment components

In reply to this post by Torsten Bergmann

Hi Torsten--

> The interesting part is not to add all kind of stuff to Smalltalk -
> the interesting part is what is the most minimalistic system to allow
> for something extensible and still flexible (even within the
> meta-system) so that one can bootstrap things on top.

Yes, that's my interest precisely.

-C

Eliot Miranda-2

re: deployment components (was "The Dilemma: Building a Futuristic GUI for Ephestos")

In reply to this post by Torsten Bergmann

Hi Torsten,

On Wed, Sep 24, 2014 at 1:16 AM, Torsten Bergmann <[hidden email]> wrote:

Craig Latta wrote:
>Right, there's no need for anyone to use UUIDs when referring to
>classes in Smalltalk expressions.

Yes, exactly this. We still use regular names for the classes - but for the tools
they have a distinguishable ID.
Best is to use a UUID for the ID as it is unique from the beginning. So if you
create a class "Foo" and I create one both are unique by ID - even when using
the same name.

But this doesn't cope with the fact that I can take a copy of your class, modify my copy by adding or changing a method, etc, and hence derive two different versions of the class with the same UUID. If a class is defined by what it is (its methods, instance variables, superclass, etc) then it doesn't need a UUID. Its own form uniquely (and reliably) identifies it. If you use UUIDs then is the UUID modified every time some modification is made to it?

The "rest" is implementing tooling support. When you write "Foo" in a workspace
then the workspace either already has a naming context to resolve the name (see [1])
or could provide you with a selection if there is more than one. Internally you could
bind to the right class by using the ID.

That's what namespaces do.

I hate prefixing the classes as we now do, I want to name the classes "Instruction"
not "IRInstruction" in Opal or "AJInstruction" in AsmJIT, especially since "IR" and "AJ"
do not really tell you anything. Smalltalk names should be understandable. Additionally
we should also have meaningful (and unique) names for namespaces.

Agreed.

Also the current three/two letter prefixing of classes that is used in Squeak or Pharo
only scales well for small communities...

It's horrible. Crappy. But we're in the process of transitioning to namespaces and that will change things.

When there are things I like in Java then it is the fact to use reverse domains
for namespaces. It is easy, understandable and I could also easily check if there
is a web location.

This way a namespace "org.apache.maven" gives you something meaningful, I can even
visit http://maven.apache.org/. Allows the community to easily scale with thousands of
classes without having clashes.

Think of "org.pharo.asmjit.Instruction" vs "org.pharo.opal.Instruction"

You should read the Newspeak stuff on namespaces. This explicit wiring is horrible. It doesn't allow us to mix and match as we want to.

So you additionally need better and meaningful namespaces THAT SCALE AND ARE EASY TO UNDERSTAND
combined with real packages (as objects, as Pharo already has).

+1.

Then we also need ABIA (around, before, inner, after messages),

I don't see what this has to do with distribution and you haven't given any rationale for it. For me super is enough. One can synthesize all the others using it (except I guess inner; can you give me an example of inner?). In any case one can synthesise code via editing without needing new VM machinery. So I'm most dubious about this.
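A rough sketch of the point about super, in Python for illustration. Account and AuditedAccount are hypothetical; the subclass gets before, after and around behaviour purely by overriding and calling super, with no extra machinery. Inner is not covered, matching the caveat above.

```python
class Account:
    """Hypothetical base class with the core behaviour."""

    def __init__(self):
        self.balance = 0
        self.log = []

    def deposit(self, amount):
        self.balance += amount
        self.log.append("deposit")
        return self.balance


class AuditedAccount(Account):
    """Adds before/after/around behaviour using only an override and super."""

    def deposit(self, amount):
        self.log.append("before")            # before: runs ahead of the core method
        if amount <= 0:                      # around: may skip the core method entirely
            self.log.append("skipped")
            return self.balance
        result = super().deposit(amount)     # the core method itself
        self.log.append("after")             # after: runs once the core method returns
        return result
```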

I *do* recall discussing and sketching with David Leibs something of this flavour for loading packages. For example deploying some Smalltalk component/service might require modification of some core method, e.g. add/remove dependents, and that the edit would be better expressed as a pattern where the existing method is filled out in an element of a template of the replacement to be loaded. This is to do with being able to define finer-grained overrides, and have them removed after. IMO this is more general, powerful and easier to manage than using new VM machinery.

multiple method
categories,

What's the rationale here? Arguably we have this now but there are bugs in ClassOrganizer and the tools. But it is something you could write tests for and fix now. But it seems kind of trivial and beside the point of deployment. Can you explain why it is important?

selector namespaces, possibility for visibility (like private methods, see [2]),

agreed. These, at least in theory, are a very nice way of insulating extensions.

unified annotations for classes/methods/variables and a few other ingredients from my
personal wishlist and it may shape a different new future...

Annotations indeed. But I think first-class slots are more important no? And the Pharo folks have a working system which we can use.

The interesting part is not to add all kind of stuff to Smalltalk - the interesting
part is what is the most minimalistic system to allow for something extensible
and still flexible (even within the meta-system) so that one can bootstrap things on top.

Right. So ABIA seems over the top. What does one *really* need? You talk of "my personal wishlist" and at the same time of "the most minimalistic system". This reads to me as "my personal grab bag" and "the most minimalistic system". Decide :-).

Bye
T.

[1] http://vimeo.com/75391990
[2] http://smalltalkhub.com/#!/~CamilleTeruel/PrivateMethods

--
best,

Eliot

ccrraaiigg

re: deployment components

Hi Eliot--

I write:

> Right, there's no need for anyone to use UUIDs when referring to
> classes in Smalltalk expressions.

Torsten responds:

> Yes, exactly this. We still use regular names for the classes - but
> for the tools they have a distinguishable ID. Best is to use a UUID
> for the ID as it is unique from the beginning. So if you create a
> class "Foo" and I create one both are unique by ID - even when using
> the same name.

You respond:

> But this doesn't cope with the fact that I can take a copy of your
> class, modify my copy by adding or changing a method, etc, and hence
> derive two different versions of the class with the same UUID.

A newly-created class copy would give a new ID to itself, as part
of its post-copy behavior. The UUID of a class corresponds to its object
identity. Sure, you could manually overwrite the ID field of any class
with something mischievous, just as, traditionally, you can manually
overwrite any field of any object you like. We have the means to deal
with that broader issue, if we want (e.g., we can implement object
immutability).
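The post-copy behaviour can be sketched like this, in Python for illustration. ClassObject is hypothetical and Python's __copy__ hook stands in for a Smalltalk post-copy method; it is not taken from Naiad.

```python
import copy
import uuid


class ClassObject:
    """Hypothetical class object whose copies receive a fresh ID."""

    def __init__(self, name):
        self.id = uuid.uuid4()
        self.name = name
        self.methods = {}

    def __copy__(self):
        duplicate = type(self).__new__(type(self))
        duplicate.__dict__.update(self.__dict__)
        duplicate.methods = dict(self.methods)  # the copy edits its own methods
        duplicate.id = uuid.uuid4()             # post-copy step: new identity
        return duplicate
```

The copy keeps the name but not the ID, so modifying it yields a second class with its own identity rather than two versions sharing one UUID.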

> If a class is defined by what it is (its methods, instance variables,
> superclass, etc) then it doesn't need a UUID. Its own form uniquely
> (and reliably) identifies it. If you use UUIDs then is the UUID
> modified every time some modification is made to it?

No, the UUID is never modified, and no, a class is not defined by
its state (I suggest). For our purposes here, a class is defined only by
its object identity, and we only find it interesting to keep track of
that identity because that class object prescribes behavior. (Typically,
that corresponds to the class object being a subinstance of Behavior,
but this need not be so.)

By establishing distributed identity, we are free to give whatever
name we want to each class object. Every class name is thus
unconstrained; every class effectively has its own namespace. This is
the concept of "Name And Identity Are Distinct" (NAIAD). To implement
it, we express object identity in terms that are usable in a distributed
system, using a commonly-known algorithm (UUIDs).

-C
