Falsehoods programmers believe about Smalltalk

hernanmd
Hi there,

I just created a GitHub repo to collect myths around Smalltalk-based technologies: Pharo, Squeak, VW, VAST, Smalltalk/X, GNU/ST, etc. in the spirit of the Falsehoods lists [1-4].

This is just a draft now but please feel free to add falsehoods based on your own experiences. Examples are greatly appreciated.


Cheers,

Hernán




Re: Falsehoods programmers believe about Smalltalk

Eliot Miranda-2
Hi Hernán,

On Sun, Jan 20, 2019 at 2:31 PM Hernán Morales Durand <[hidden email]> wrote:
> Hi there,
>
> I just created a GitHub repo to collect myths around Smalltalk-based technologies: Pharo, Squeak, VW, VAST, Smalltalk/X, GNU/ST, etc. in the spirit of the Falsehoods lists [1-4].
>
> This is just a draft now but please feel free to add falsehoods based on your own experiences. Examples are greatly appreciated.

You want pull requests?  If not, would you give me write permission?  I'd love to add to the "Smalltalk is obsolete" section...
 



--
_,,,^..^,,,_
best, Eliot



Re: Falsehoods programmers believe about Smalltalk

hernanmd

Done.

I have some possible myths, but I'd like to confirm or reject:

- All Smalltalk bytecode sets are stack-based. (?)
- Bytecodes are always fixed-size. (?)
- Most of the time spent by a VM is in the instruction interpreter. (actually it's in the GC right?)
- You cannot serialize objects containing blocks. (IIRC one can use MessageSends)
- Image cannot be bootstrapped. (This is possible in ST/X and now in Pharo I think).
- All Smalltalks include UI classes. (GemStone doesn't have AFAIK).
- All implementations use direct pointers. (GST?)
- All implementations use green threads. (VAST? MT?)

I'm sure people on this list have heard many more myths at conferences, in forums, videos, talks, etc., like the guy who said Smalltalk was dead. So if you have done something that may have gone unnoticed publicly, please don't hesitate to reply or ping me to be added as a collaborator.

Cheers,

Hernán






Re: Falsehoods programmers believe about Smalltalk

Jecel Assumpcao Jr
Hernán Morales Durand wrote on Mon, 21 Jan 2019 18:13:45 -0300
> I have some possible myths, but I'd like to confirm or reject:
>
> - All Smalltalk bytecode sets are stack-based. (?)
> - Bytecodes are always fixed-size. (?)

SOAR (Smalltalk On A RISC, now renamed as RISC-III) used a 32-bit
register-based instruction set and Smalltalk sources were translated to
that. They did regret dumping bytecodes later in their project as some
things became more complicated and the increase in memory use was very
expensive at that time.

https://apps.dtic.mil/dtic/tr/fulltext/u2/a172800.pdf

SOM (Simple Object Machine) is a set of Smalltalk VMs where some of them
represent code as abstract syntax trees instead of bytecodes.

http://som-st.github.io/
http://www.hpi.uni-potsdam.de/hirschfeld/projects/som/

> - Most of the time spent by a VM is in the instruction interpreter. (actually it's in the GC right?)

That will vary from one VM to another. On page 179 of the "green book"
you can see a nice graph of the space and time used by different parts of
the Apple Smalltalk (from which Squeak evolved) and on page 177 you can
find the numbers used to create the chart.

http://sdmeta.gforge.inria.fr/FreeBooks/BitsOfHistory/

10.2% of the time in the fetch loop, 16.0% in the bytecode interpreter,
39.2% in sends and returns, 22.6% in the memory management and 10% in
primitives.

Adding the first 3 numbers you get 10.2+16.0+39.2 = 65.4% in the
instruction interpreter while the GC is part of the 22.6% which is the
memory management. That said, in Squeak gcc tricks really helped with
the fetch loop and the stack VM greatly reduced the send/return
overhead. So it might be the case that the GC dominates performance. Or
not - we have to measure and see.
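
As a quick check of the arithmetic above, here is a minimal Python sketch that uses only the five percentages quoted from the green book. The dictionary keys are just illustrative labels, not names taken from the book.

```python
# Percentages of execution time quoted above for Apple Smalltalk.
breakdown = {
    "fetch loop": 10.2,
    "bytecode interpreter": 16.0,
    "sends and returns": 39.2,
    "memory management": 22.6,
    "primitives": 10.0,
}

# The "instruction interpreter" share is the sum of the first three entries.
interpreter_parts = ("fetch loop", "bytecode interpreter", "sends and returns")
interpreter_share = round(sum(breakdown[k] for k in interpreter_parts), 1)
total = round(sum(breakdown.values()), 1)

print(interpreter_share)  # 65.4
print(total)              # 98.0
```

The five quoted figures add up to 98%, so roughly 2% of the time is not accounted for in this summary of the chart.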

> - You cannot serialize objects containing blocks. (IIRC one can use MessageSends)

Given that the image contains blocks, that can't be true. Obviously
serializing a subset of objects is a harder problem than just dumping
memory, but I consider images a proof of existence.
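
The MessageSend idea from the question can be illustrated outside Smalltalk. The sketch below is Python, and its `MessageSend` class is a hypothetical stand-in, not the API of any Smalltalk serializer. Instead of storing a closure, it stores a receiver, a selector name and the arguments, all of which are plain data that an ordinary serializer can write out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class MessageSend:
    """Hypothetical stand-in for a MessageSend: a deferred call kept as data."""
    receiver: str          # restricted to a string here so JSON can store it
    selector: str
    arguments: tuple = ()

    def value(self):
        # Look the method up by name at evaluation time, then call it.
        return getattr(self.receiver, self.selector)(*self.arguments)


send = MessageSend("smalltalk", "replace", ("small", "big"))
wire = json.dumps(asdict(send))          # plain text, safe to store or transmit
data = json.loads(wire)
restored = MessageSend(data["receiver"], data["selector"], tuple(data["arguments"]))
print(restored.value())  # bigtalk

# The equivalent closure is rejected by the same serializer.
try:
    json.dumps(lambda: "smalltalk".replace("small", "big"))
    closure_serialized = True
except TypeError:
    closure_serialized = False
print(closure_serialized)  # False
```

This only shows why a data-only description of a call is easy to serialize. It says nothing about whether a given Smalltalk serializer can also handle blocks directly.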

> - Image cannot be bootstrapped. (This is possible in ST/X and now in Pharo I think).

Little Smalltalk is a good example of taking a textual representation
and bootstrapping an image from it. GNU Smalltalk didn't even use images
the last time I looked at it. I consider Self to be a Smalltalk (just
not a Smalltalk-80) and it can start with either a snapshot (its name
for image) or with an empty world and load text files (possible because
the source to bytecode compiler is included in the VM).

> - All Smalltalks include UI classes. (GemStone doesn't have AFAIK)

The MS-DOS port of Squeak had no GUI, just a command line prompt. That
was also the case for GNU Smalltalk and Little Smalltalk.

> - All implementations use direct pointers. (GST?)

The RoarVM for Squeak uses object tables. In fact, the lack of direct
pointers in early implementations is what led to the use of #become:
which complicated the adoption of direct pointers. VisualWorks has an
indirection pointer in the header - see slide 7 of

https://www.slideshare.net/esug/spur-a-new-object-representation-for-cog
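
The link between object tables and #become: can be sketched with a toy model. This is Python with hypothetical names (`ObjectTable`, `allocate`, `become`). It is not the RoarVM or Xerox design, only an illustration of the indirection. References are table indices, so swapping two table entries redirects every reference at once. With direct pointers, every referring slot would have to be found and updated instead.

```python
class ObjectTable:
    """Toy object table: references are indices (oops), not addresses."""

    def __init__(self):
        self.entries = []

    def allocate(self, body):
        self.entries.append(body)
        return len(self.entries) - 1  # the oop handed out to referrers

    def fetch(self, oop):
        return self.entries[oop]

    def become(self, oop_a, oop_b):
        # Two-way become: swap the bodies behind the two oops in constant time.
        self.entries[oop_a], self.entries[oop_b] = (
            self.entries[oop_b],
            self.entries[oop_a],
        )


table = ObjectTable()
a = table.allocate("small array")
b = table.allocate("grown array")
holders = [a, a, a]  # three referrers that all hold the same oop

table.become(a, b)
print([table.fetch(oop) for oop in holders])
# ['grown array', 'grown array', 'grown array']
print(table.fetch(b))  # small array
```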

> - All implementations use green threads. (VAST? MT?)

I would say this was a side effect of patching the original Smalltalk,
which was its own operating system (and so the idea of green threads
doesn't apply), to run on top of Unix on commercial workstations. All the
old code assumed a mix of cooperative and preemptive multithreading
that breaks down if you have multiple native threads.

Some from-scratch Smalltalks copied this model while others (I am pretty
sure it was the case for MT, as you mentioned) had their libraries
written with native threads in mind.

-- Jecel


Re: Falsehoods programmers believe about Smalltalk

hernanmd
Hi Jecel,

On Mon, Jan 21, 2019 at 22:27, Jecel Assumpcao Jr. (<[hidden email]>) wrote:
> > - Image cannot be bootstrapped. (This is possible in ST/X and now in Pharo I think).
>
> Little Smalltalk is a good example of taking a textual representation
> and bootstrapping an image from it.

Yes, sadly some of the LS implementations have been lost over time and need to be tracked down. http://www.littlesmalltalk.org now hosts an unrelated Chinese site, and PDST and Parla, which were based on Little Smalltalk 3, are only accessible through the archive: http://web.archive.org/web/20051025043437/http://www.copyleft.de/Parla/Parla.html. There is another implementation now, https://github.com/0x7CFE/llst, but I haven't checked it yet.
 
> GNU Smalltalk didn't even use images
> the last time I looked at it. I consider Self to be a Smalltalk (just
> not a Smalltalk-80) and it can start with either a snapshot (its name
> for image) or with an empty world and load text files (possible because
> the source to bytecode compiler is included in the VM).


I have always wondered how much performance is gained by moving all the compiler infrastructure into the VM.

> > - All Smalltalks include UI classes. (GemStone doesn't have AFAIK)
>
> The MS-DOS port of Squeak had no GUI, just a command line prompt. That
> was also the case for GNU Smalltalk and Little Smalltalk.


I didn't know there was a DOS-only Squeak. Is there a link out there to try it?

For GST I should note there is an interesting project using GTK which provides a gst-browser, although I cannot say whether it is now part of GNU Smalltalk.
 
> > - All implementations use direct pointers. (GST?)
>
> The RoarVM for Squeak uses object tables. In fact, the lack of direct
> pointers in early implementations is what led to the use of #become:
> which complicated the adoption of direct pointers. VisualWorks has an
> indirection pointer in the header - see slide 7 of
>
> https://www.slideshare.net/esug/spur-a-new-object-representation-for-cog


Thank you for the pointer, really informative presentation.
 


Thank you, added with credits to https://github.com/hernanmd/falsehoods_smalltalk

Cheers,

Hernán




Little Smalltalk Re: Falsehoods programmers believe about Smalltalk

Edgar De Cleene


On 22/01/2019, 02:03, "Hernán Morales Durand" <[hidden email]> wrote:


> > Little Smalltalk is a good example of taking a textual representation
> > and bootstrapping an image from it.
>
> Yes, sadly some of the LS implementations have been lost over time and need to be tracked down. http://www.littlesmalltalk.org now hosts an unrelated Chinese site, and PDST and Parla, which were based on Little Smalltalk 3, are only accessible through the archive: http://web.archive.org/web/20051025043437/http://www.copyleft.de/Parla/Parla.html. There is another implementation now, https://github.com/0x7CFE/llst, but I haven't checked it yet.

Hernán
https://github.com/kyle-github/littlesmalltalk works well on the Mac, with the class browser view running on localhost in Firefox.

It would be a good starting point for a WebAssembly port, but I'm afraid I lack the skills to make one.

Edgar
@morplenauta






SmallWorld (was Re: Little Smalltalk)

Tony Garnock-Jones-5
I was pleased to discover the other day "SmallWorld", Tim Budd's own
derivative of Little Smalltalk. Russell Allen has recently been keeping
it running. It looks like SmallWorld was written in 2004; Russell
Allen's version was updated most recently in 2015.

 - Tim Budd's (not completely functional) page on SmallWorld:
   http://web.engr.oregonstate.edu/~budd/SmallWorld/ReadMe.html

 - Russell Allen's SmallWorld:
   https://github.com/russellallen/SmallWorld

I've been experimenting with using high-level languages to write tiny
VMs (direct interpreter, JIT, and partial-evaluating JIT) compatible
with SmallWorld images.

Cheers,
  Tony





Re: Falsehoods programmers believe about Smalltalk

Jecel Assumpcao Jr
In reply to this post by hernanmd
Hernán Morales Durand wrote on Tue, 22 Jan 2019 02:03:01 -0300
> > [Little Smalltalk and bootstrapping an image]
>
> Yes, sadly some of the LS implementations were lost in time and
> need to be tracked now, http://www.littlesmalltalk.org is now
> chinese thing, PDST and Parla were based in LittleSmalltalk 3 but
> only accessible through archive: http://web.archive.org/web/20051025043437/http://www.copyleft.de/Parla/Parla.html.
> There is another implementation now : https://github.com/0x7CFE/llst
> however didn't checked yet.

For those who don't know about Little Smalltalk, version 1 described in
the book has some important differences from Smalltalk-80. One of them
is that an object is split into pieces corresponding to the superclasses
so that the offset of a variable can be the same in both a class and its
subclasses. This avoids having to do a lot of recompilation when a class
changes.

I don't remember any details about LST2, but LST3 was a lot more like
Smalltalk-80 but had all classes be instances of Class like in
Smalltalk-76. That was changed in LST4 to have metaclasses like
Smalltalk-80. For some reason many people prefer the simpler LST3 and
use that as the starting point of their forks, of which there have been
many over the years.

You asked about bytecodes having a fixed size. I am not sure what you
mean by that. While most Smalltalk-80 bytecodes are 1 byte long, a few
are "extended", and in the case of the closure bytecode in Squeak they can
be up to 4 bytes long (bytecode 143 - push closure num copied num args
blocksize).
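
A sketch of decoding such a multi-byte instruction, in Python. The byte layout is an assumption based on how this bytecode is commonly documented: the opcode byte 143, then one byte holding num copied in its high nibble and num args in its low nibble, then a two-byte big-endian block size. The function name and the sample bytes are made up for illustration.

```python
def decode_push_closure(code, pc):
    """Decode the 4-byte push-closure instruction starting at code[pc].

    Assumed layout: 143, (num_copied << 4 | num_args), size_hi, size_lo.
    Returns (num_copied, num_args, block_size, next_pc).
    """
    if code[pc] != 143:
        raise ValueError("not a push-closure bytecode")
    num_copied = code[pc + 1] >> 4       # high nibble
    num_args = code[pc + 1] & 0x0F       # low nibble
    block_size = (code[pc + 2] << 8) | code[pc + 3]
    return num_copied, num_args, block_size, pc + 4


sample = bytes([143, 0x21, 0x00, 0x0A])
print(decode_push_closure(sample, 0))  # (2, 1, 10, 4)
```

The decoder has to return the next program counter because instructions are not all the same length, which is exactly what "not fixed-size" means in practice.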

The Little Smalltalk bytecodes (also the Self ones) use an operand
extension bytecode like in the old Inmos Transputer. Each bytecode has a
4 bit op code and a 4 bit value for stuff that needs one argument (like
push literal) or uses the 4 bit value as its op code for stuff that
needs zero arguments (like return top of stack). A special extension
bytecode will combine its value with the value of the next bytecode when
an 8 bit argument is needed. Or two extensions allow a normal bytecode
to have a 12 bit argument. While not really fixed size, this is more
regular than the Smalltalk-80 scheme.
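
The operand-extension scheme described above can be sketched as a small decoder. It follows the description in this message rather than any particular Little Smalltalk or Self source. The opcode numbers (`EXTEND = 0`, `PUSH_LITERAL = 4`) are hypothetical, and the zero-argument case, where the low nibble selects the operation, is left out.

```python
EXTEND = 0        # hypothetical opcode number for the extension bytecode
PUSH_LITERAL = 4  # hypothetical opcode number for a one-argument bytecode


def decode(code):
    """Yield (opcode, operand) pairs from a sequence of nibble-packed bytes."""
    operand = 0
    for byte in code:
        opcode, low = byte >> 4, byte & 0x0F
        operand = (operand << 4) | low   # each byte contributes 4 more bits
        if opcode == EXTEND:
            continue                     # keep accumulating for the next bytecode
        yield opcode, operand
        operand = 0


print(list(decode(bytes([0x45]))))              # [(4, 5)]    4-bit argument
print(list(decode(bytes([0x01, 0x42]))))        # [(4, 18)]   8-bit argument
print(list(decode(bytes([0x01, 0x02, 0x43]))))  # [(4, 291)]  12-bit argument
```

Every byte has the same two-nibble shape, which is the sense in which this scheme is more regular than the Smalltalk-80 one, even though an instruction can still span one, two or three bytes.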

> > [Self source to bytecode compiler in the VM]
>
> I always wondered about how much performance is gained moving
> the all the Compiler infrastructure into the VM.

There should not be any performance difference if the VM compiles
instead of interpreting. And the source to bytecode translation is not
in the critical path anyway, since only short methods are compiled at any
one time, while interacting with a user who won't be able to tell the
difference between 3 ms and 12 ms.
 
> Didn't knew there was a DOS-only based Squeak. Any link out there to try?

> https://web.archive.org/web/20050217200230/http://www.unicavia.com:80/Squeak/Downloads.php

Note that most Smalltalks that use a command line have a few helper
methods defined to make things easier, but this is just Squeak without
any GUI. So navigating around and doing stuff can be really awkward. I
didn't test the download link to see if the zip is actually at
archive.org, but I am sure I have a copy of the binary here if needed.

> For GST I should note there is an interesting project using GTK which
> provides a gst-browser, although cannot say if now is part of GNU
> Smalltalk.

There were optional GUIs for Little Smalltalk as well and a friend added
a web browser based GUI to LST4.
  
-- Jecel


Re: Falsehoods programmers believe about Smalltalk

Eliot Miranda-2
In reply to this post by hernanmd


On Mon, Jan 21, 2019 at 1:13 PM Hernán Morales Durand <[hidden email]> wrote:

> Done.
>
> I have some possible myths, but I'd like to confirm or reject:
>
> - All Smalltalk bytecode sets are stack-based. (?)

While there might be some implementations that use a register-based bytecode set, I've never heard of one.  I do know of a few implementations that don't use bytecode at all.

> - Bytecodes are always fixed-size. (?)

False.

> - Most of the time spent by a VM is in the instruction interpreter. (actually it's in the GC right?)

Neither.  In a JIT VM most time is spent executing Smalltalk code.  GC overheads vary depending on workload, from around 1% up to 50% (the high end typically only for specially constructed benchmarks written to stress the GC).

> - You cannot serialize objects containing blocks. (IIRC one can use MessageSends)

False.

> - Image cannot be bootstrapped. (This is possible in ST/X and now in Pharo I think).

False.

> - All Smalltalks include UI classes. (GemStone doesn't have AFAIK).

False.

> - All implementations use direct pointers. (GST?)

False.  VisualWorks uses indirection pointers.  Xerox Smalltalk-80 implementations used indirection.  But many modern implementations use direct pointers.

> - All implementations use green threads. (VAST? MT?)

False: SmalltalkMT is a counterexample.  But it is still mostly true.  VAST and VW have green threads and a threaded FFI.


--
_,,,^..^,,,_
best, Eliot



Re: [Glass] [Pharo-users] Falsehoods programmers believe about Smalltalk

hernanmd
In reply to this post by hernanmd

Thank you for your notes Richard, it's really interesting to see not so well-known projects.
I would love to have more people sharing their wisdom and knocking down more myths.

Cheers,

Hernán


On Sun, Jan 27, 2019 at 1:17, Richard O'Keefe via Glass (<[hidden email]>) wrote:
I have my own Smalltalk system implemented as a batch compiler via C.
This was originally just going to be a baseline for a student wanting
to work on JIT, but he went elsewhere and I found the system surprisingly
useful.  I also wanted something that hewed closely to the ANSI Smalltalk
standard, but could diverge in other matters (like not having dynamic code
modification).

> - All Smalltalk bytecode sets are stack-based. (?)

My system has no bytecodes.  Smalltalk=>C=>native code.

> - Bytecodes are always fixed-size. (?)

Already false back in the Blue Book.  Why does it matter anyway?

> - Most of the time spent by a VM is in the instruction interpreter. (actually it's in the GC right?)

There is no interpreter in my system, and many modern systems use a JIT.
That or they generate JavaScript or JVM instructions or .NET or something,
and then _that_ gets turned into native code.

> - You cannot serialize objects containing blocks. (IIRC one can use MessageSends)

True in my system, but that's because blocks contain pointers to native code and may
contain pointers into the C stack.  I have plans to work around this, but it has not been
a priority.  Something I don't ever plan to deal with is objects containing references to
external objects (memory-mapped segments, file descriptors, sockets, ...), as it is not
at all clear to me what the semantics should be.

BinaryObjectStorage in VisualWorks has no trouble with blocks.

I meant to try this in Pharo 7.0.  The image I just installed via the launcher has
no DataStream, ReferenceStream, or SmartRefStream, but the class comment
for MCDataStream begins
"This is the save-to-disk facility.
 A DataStream can store one or more objects in a persistent form.

 To handle objects with sharing and cycles, you must use a ReferenceStream
 instead of a DataStream.  (Or SmartRefStream.)  ReferenceStream is typically
 faster and produces smaller files because it doesn't repeatedly write the same Symbols."

This was also the case as far back as Pharo 2.0.  What *is* the persistence scheme in Pharo these days?

> - Image cannot be bootstrapped. (This is possible in ST/X and now in Pharo I think).

There are no images in my system.

> - All Smalltalks include UI classes. (GemStone doesn't have AFAIK).

It depends on what you mean by "include".  GNU Smalltalk *has* UI classes but they
are not loaded by default.

> - All implementations use direct pointers. (GST?)

True in my case, but that's because I'm lazy and using the Boehm collector.

> - All implementations use green threads. (VAST? MT?)

False in my case.  A Process is a POSIX (red) thread and no green threads exist.
This meant having to keep the interface fairly lean, but honestly it wasn't that hard,
since the Boehm collector handled the hard stuff.




Re: Falsehoods programmers believe about Smalltalk

marcel.taeumel
In reply to this post by hernanmd
Hi Hernán,

maybe you could add more references to the readme.md? Like the ones you
listed here at the beginning for inspiration. That would make the repo on
GitHub more self-contained and informative.

Best,
Marcel



--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html