oops

Paul Sheldon-2

oops

Googling:
oops "signed integers" smalltalk

gave me some wisdom.

I think oops are the wider bit fields used to address objects in
object-oriented programming.

Google indexes the bodies of PDFs, and one result looked promising for
context:

http://portal.acm.org/ft_gateway.cfm?id=38839&type=pdf&coll=&dl=acm&CFID=15151515&CFTOKEN=6184618

which I cite merely to establish the context:

"Oops are either 31 bit signed integers or 32 bit addresses ..."

The 1 MB file is coming in with the title p354-miranda.pdf. It is a 1987
document, but daunting nonetheless, since I have glimpsed how much goes
into making a compiler in an elementary systems programming course
that only went as far as making a parser.

Bert was more or less defining oops in context; I wanted to check.

Les Howell

Re: oops

There is another, longer-term issue with pointers. Using integers as
pointers means losing portability. As systems are upgraded, pointers
naturally grow to the correct size for the system, while integer
sizes are fixed by the language definition. In addition, because
integers are signed, the arithmetic will occasionally yield unexpected
results and lose the correct memory mapping. That becomes a memory leak
which can be exploited to gain inappropriate access, and so a
possible security problem.
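
To make the two failure modes above concrete, here is a minimal C sketch. The helper names (fits_in_32_bits, widen_signed, round_trips, pointer_width_demo) are made up for illustration, and the addresses are plain numbers rather than real pointers. It assumes a platform that provides uintptr_t, which the C standard makes optional but most systems have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: does a 64-bit address survive being stored
   in a 32-bit integer field? */
static int fits_in_32_bits(uint64_t address)
{
    uint32_t narrowed = (uint32_t)address;   /* keeps only the low 32 bits */
    return (uint64_t)narrowed == address;
}

/* Widening a signed 32-bit value sign-extends it, so a stored value
   with the top bit set becomes a very different 64-bit number. */
static uint64_t widen_signed(int32_t stored)
{
    return (uint64_t)(int64_t)stored;
}

/* The portable route: uintptr_t can hold a converted void pointer
   and convert back to an equal pointer. */
static int round_trips(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}

/* Small self-check that exercises the three helpers. */
static void pointer_width_demo(void)
{
    int local = 0;
    assert(fits_in_32_bits(UINT64_C(0x7fffffff)));
    assert(!fits_in_32_bits(UINT64_C(0x100000000)));   /* truncated to 0 */
    assert(widen_signed(INT32_MIN) == UINT64_C(0xFFFFFFFF80000000));
    assert(round_trips(&local));
}
```

Calling pointer_width_demo() from main runs the checks. The point is only that a fixed-width signed integer can both truncate and sign-extend an address, while a pointer-sized integer type does not.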

I have had experience with this issue on a large project, where an OS
update invalidated much of the existing code. Correcting the integers
used as pointers would have required tens of thousands of man-hours. It
is very difficult, since a reference via an integer may have lots of
consequences, from argument passing to the math results anticipated,
and finding these problems almost mandates troubleshooting the entire
project by hand. In the case I am aware of, the whole project was
auto-translated, all integers were converted to 64 bits (which I guess
will buy them some time), and then everything was recompiled. Leaks and
bad links had to be hand-massaged. I left before the project was
complete.

My final thought on integers as pointers... Don't. It is very poor
programming and unprofessional. Viva full checking of arguments and
types.

Regards,
Les H

On Mon, 2007-04-02 at 08:53 -0800, [hidden email] wrote:

> googling :
> oops "signed integers" smalltalk
>
> gave me some wisdom.
>
> I think they are longer bits for addressing object oriented programming
> objects.
>
> The bodies of pdf's are picked up by google and a context seemed hopeful in
> :
>
> http://portal.acm.org/ft_gateway.cfm?
> id=38839&type=pdf&coll=&dl=acm&CFID=15151515&CFTOKEN=6184618
>
> which I cite merely for credibility of context :
>
> "Oops are either 31 bit signed integers or 32 bit addresses ..."
>
> the 1 meg file is coming in with title p354-miranda.pdf, a 1987 document,
> daunting nonetheless, since I have seen the world that goes
> into making a compiler in an elementary system programming course
> that only went so far as making a parser.
>
> Bert was sort of defining oops in context, I wanted to check.
>

Bert Freudenberg

Re: oops

On Apr 2, 2007, at 21:50, Les wrote:

> There is another longer term issue on pointers. Using integers as
> pointers means losing portability. As the systems upgrade, the
> pointers will naturally upgrade to the correct size for the system,
> while integer definitions are a result of language definition.

Well, we *define* our language, and our virtual machine. It was
decided that an oop is 32 bits, which is just fine. This gives us 4
billion distinct objects, which seems enough for the immediate
future. Preliminary work exists for 64-bit oops, but the jury is
still out on whether this will be an overall improvement or not. Anyway,
this is the theoretical model which is manifested in the image.

Note this decision is orthogonal to how the mapping occurs between
oops and memory addresses. This is purely an optimization issue. The
original Squeak VM directly encodes the memory address of an object
in the high 30 bits of an oop. The lowest bit of an oop is used to
signify an immediate object (one that does not take space in the
object memory). If that LSB is 1, the upper 31 bits are used to
encode an immediate instance.
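
A rough C sketch of the tagging scheme just described, not the actual VM source. The names (oop_t, is_immediate, encode_small_int and so on) are hypothetical. It assumes the immediate payload is a 31-bit signed integer, as in the paper quoted earlier in the thread, and that object addresses are word-aligned and stored unshifted, so their two low bits are zero.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t oop_t;   /* hypothetical name for a 32-bit oop */

/* LSB set means the oop is an immediate value, not an object address. */
static int is_immediate(oop_t oop)
{
    return (oop & 1u) != 0;
}

/* Pack a 31-bit signed integer into the upper 31 bits, tag bit = 1. */
static oop_t encode_small_int(int32_t value)
{
    assert(value >= -(1 << 30) && value < (1 << 30));
    return ((uint32_t)value << 1) | 1u;
}

/* Recover the signed value; the sign extension is done by hand so the
   code does not rely on how the compiler shifts negative numbers. */
static int32_t decode_small_int(oop_t oop)
{
    assert(is_immediate(oop));
    uint32_t bits = oop >> 1;                 /* 31-bit pattern */
    if (bits & 0x40000000u)                   /* sign bit of the 31-bit value */
        return (int32_t)((int64_t)bits - (INT64_C(1) << 31));
    return (int32_t)bits;
}

/* Assumption: a word-aligned 32-bit address is used directly as the oop. */
static oop_t address_to_oop(uint32_t address)
{
    assert((address & 3u) == 0);
    return address;
}

static uint32_t oop_to_address(oop_t oop)
{
    assert(!is_immediate(oop));
    return oop & ~3u;
}
```

For example, encode_small_int(5) yields the oop 11 (binary 1011), and decoding it gives 5 back. An aligned address such as 0x1000 passes through unchanged and is never mistaken for an immediate.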

This optimization (direct address encoding) reduces the number of
usable oops. 30 bits would only give 1 billion objects anyway. But if
an average object had 32 bytes, then only 1/8 of the potential oops
would actually be usable: 128 million objects. Now, even that would take
Squeak's garbage collector a long time to clean up, and it is an
unrealistic case for today's apps anyway, which is why the trade-offs
still work well (Squeak's interpreter is faster than most other
purely interpreted languages). Err, I digress ...
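
The arithmetic behind those numbers, as a small C sketch. The function names are made up, and the 32-byte average object size is the assumption from the paragraph above, not a measured figure.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values a field of the given bit width can hold. */
static uint64_t distinct_values(unsigned bits)
{
    assert(bits < 64);
    return UINT64_C(1) << bits;
}

/* How many objects of a given average size fit in an address space. */
static uint64_t max_objects(uint64_t address_space_bytes,
                            uint64_t average_object_bytes)
{
    assert(average_object_bytes != 0);
    return address_space_bytes / average_object_bytes;
}
```

With these, distinct_values(30) is 1,073,741,824 (the "1 billion"), and max_objects(distinct_values(32), 32) is 134,217,728, which is 128 * 2^20 (the "128 million"). The ratio between the two is exactly 8, hence the 1/8.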

My point is that defining a fixed bit size for our oops is actually
valid. An oop is *not* a pointer per se. If we rewrote the Squeak
VM to use an object table (which would have many advantages), then an
oop would simply be an index into the object table. The image format
would not even have to change for that (actually it may, but not for
this reason).
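
To illustrate the object-table idea, a toy C sketch follows. It is not how the Squeak VM is written. The table size and the names (object_table, register_object, resolve, move_object) are hypothetical, and there is no garbage collection or slot reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t oop_t;            /* here an oop is just a table index */

enum { TABLE_SIZE = 16 };          /* tiny on purpose; illustrative only */

static void *object_table[TABLE_SIZE];
static oop_t next_free = 0;

/* Hand out the next index and remember where the object body lives. */
static oop_t register_object(void *body)
{
    assert(next_free < TABLE_SIZE);
    object_table[next_free] = body;
    return next_free++;
}

/* Every access goes through the table: one extra indirection. */
static void *resolve(oop_t oop)
{
    assert(oop < next_free);
    return object_table[oop];
}

/* Moving an object updates one table slot; oops held elsewhere stay valid. */
static void move_object(oop_t oop, void *new_body)
{
    assert(oop < next_free);
    object_table[oop] = new_body;
}
```

The oop stays 32 bits wide no matter how wide the machine's pointers are, because only the table entries hold real addresses. The cost is the extra lookup in resolve on every access.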

The *problem* is that the VM generation process is sloppy, presumably
because in 1996 there was no easy way to test this. It just uses C as
an intermediate language and never strived to produce "good" C code.
It might still pay off to do so because modern C compilers take type
annotation as hints for compiling good machine code. But we don't
need any type checking on that intermediate stuff because it is
generated code. If there is any problem, the translator needs to be
fixed. And there are problems, so help is appreciated :)

If you're really interested in this stuff, the squeak-dev and vm-dev
lists are a fine place for discussing it.

- Bert -

Alan Grimes-2

Re: oops

> The *problem* is that the VM generation process is sloppy, presumably
> because in 1996 there was no easy way to test this. It just uses C as an
> intermediate language and never strived to produce "good" C code. It
> might still pay off to do so because modern C compilers take type
> annotation as hints for compiling good machine code. But we don't need
> any type checking on that intermediate stuff because it is generated
> code. If there is any problem, the translator needs to be fixed. And
> there are problems, so help is appreciated :)

When I tried to parallelize the VM I came across the same issue: the C
translator. The problem is that the interface between the C translator
and the (Smalltalk) compiler was very opaque to me, and I couldn't figure
out how to better exploit the compiler to make better C code. One of the
things I wanted to do was scope reduction, where C variables would
only be defined within the block in which they are used, which presumably
would save stack space... I couldn't make much headway on that. =(
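
For reference, this is roughly what block-scoped output could look like in hand-written C; it is not output of the actual translator. Standard C lets a variable be declared inside any compound statement, and C99 also allows it in a for-loop header. The function name sum_of_squares is made up. Whether narrower scopes really save stack space depends on the compiler, so treat that part as a guess.

```c
#include <assert.h>

static int sum_of_squares(const int *values, int count)
{
    int total = 0;

    for (int i = 0; i < count; i++) {          /* i exists only in the loop */
        int square = values[i] * values[i];    /* square exists only in the body */
        total += square;
    }

    {
        /* A bare pair of braces opens a scope too; checked is not
           visible after the closing brace. */
        int checked = total;
        assert(checked >= 0);
    }

    return total;
}
```

Calling sum_of_squares on the array {1, 2, 3} returns 14. Neither i, square, nor checked can be referenced outside the braces that declare it.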

--
Opera: Sing it loud! :o( )>-<

Les Howell

Re: oops

On Tue, 2007-04-03 at 08:40 -0500, Alan Grimes wrote:

> > The *problem* is that the VM generation process is sloppy, presumably
> > because in 1996 there was no easy way to test this. It just uses C as an
> > intermediate language and never strived to produce "good" C code. It
> > might still pay off to do so because modern C compilers take type
> > annotation as hints for compiling good machine code. But we don't need
> > any type checking on that intermediate stuff because it is generated
> > code. If there is any problem, the translator needs to be fixed. And
> > there are problems, so help is appreciated :)
>
> when I tried to parallelize the VM I came across the same issue, the C
> translator. The problem is that the interface between the C translator
> and the (smalltalk) compiler was very opaque to me and I couldn't figure
> out how to better exploit the compiler to make better C code. One of the
> things I wanted to do was to do scope reductions where C variables would
> only be defined within the block they are used, -- presumably would save
> stack space... I couldn't make much headway on that. =(
>
>

The C definition makes scope only global and local. Block usage is
generally accomplished using a global pointer with block mallocs.
However, this requires judicious memory management. If the system
crashes, you can end up eating up memory. Some programmers use memory
chains as doubly linked lists, where each entry is allocated as a block
of memory. This simplifies the bookkeeping a bit, but again requires a
means of cleaning up memory. Generally this is handled by an error trap
that cleans up memory prior to exit. Some operating systems, like Solaris,
manage allocated memory on a per-PID basis and free all allocated memory
upon exit. This works well as long as the system crash hasn't
overwritten the control tree (typically restricted-access memory).
Dealing with memory is a real issue best dealt with offline. If you
have more questions that I might be able to help with, please email me
personally.
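
A minimal sketch of the doubly linked "memory chain" bookkeeping described above, assuming C11 (for max_align_t). The names tracked_alloc, tracked_free and free_all_tracked are hypothetical, and the sketch is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block. The max_align_t member pads the
   header so the payload that follows it stays suitably aligned. */
typedef union block_header {
    struct { union block_header *prev, *next; } link;
    max_align_t align;
} block_header;

static block_header *live_blocks = NULL;   /* head of the chain */
static size_t live_count = 0;

/* Allocate a block and push its header onto the front of the chain. */
static void *tracked_alloc(size_t size)
{
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->link.prev = NULL;
    h->link.next = live_blocks;
    if (live_blocks != NULL)
        live_blocks->link.prev = h;
    live_blocks = h;
    live_count++;
    return h + 1;                           /* payload starts after the header */
}

/* Unlink one block from the chain and release it. */
static void tracked_free(void *payload)
{
    assert(payload != NULL && live_count > 0);
    block_header *h = (block_header *)payload - 1;
    if (h->link.prev != NULL)
        h->link.prev->link.next = h->link.next;
    else
        live_blocks = h->link.next;
    if (h->link.next != NULL)
        h->link.next->link.prev = h->link.prev;
    live_count--;
    free(h);
}

/* Cleanup pass: walk the chain and release whatever is still allocated. */
static void free_all_tracked(void)
{
    while (live_blocks != NULL) {
        block_header *next = live_blocks->link.next;
        free(live_blocks);
        live_blocks = next;
    }
    live_count = 0;
}
```

One way to get the "clean up prior to exit" behaviour is to register the cleanup once with atexit(free_all_tracked) from the standard library. That covers normal exits only; it does nothing if the process is killed or the chain itself has been overwritten.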

Regards,
Les H