Smalltalk › Squeak › Croquet › Croquet - Dev

basic tea-time questions

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

12 messages Options

Aemon Cannon

basic tea-time questions

Hey there, I've got a couple questions about the tea-time protocol.

1) It's my understanding that the router is the sole authority on Time, and
that internal future-messages are not executed until an external message with
later timestamp is received. However, I was wondering, if my controller
receives external message e1, and internal messages i1, i2 and i3 are already
in the queue (with lesser timestamps), are i1,i2 & i3 executed
simultaneously(with respect to island time)? Is island time incremented with
the offset of each internal message, as it's executed?

I am confused because each of i1,i2,i3 may have a different time-offset, but
because they all fall between the same two external messages, they are all
executed at the same time (with respect to the user's experience).
Is this correct?

This seems to have some implications for animation, which leads me to my second
question..

2) Disregarding frame-rate(how quickly the view is redrawn), is the update-rate
(the ticking of the simulation) limited by the router's heart-beat rate? in
other words, as far as the user's experience is concerned, is the simulation
ever updated BETWEEN external messages?
If not, what are the implications for the user-experience on real-world WANs ?

It's probably apparent that I need to roll up my sleeves and dig into the
Smalltalk code, but I wanted to consult the authorities first, rather than just
rely on my reading ability :)

Thanks!

Aemon

David A. Smith-3

Re: basic tea-time questions

Aemon Cannon wrote:

> Hey there, I've got a couple questions about the tea-time protocol.
>
> 1) It's my understanding that the router is the sole authority on Time, and
> that internal future-messages are not executed until an external message with
> later timestamp is received. However, I was wondering, if my controller
> receives external message e1, and internal messages i1, i2 and i3 are already
> in the queue (with lesser timestamps), are i1,i2 & i3 executed
> simultaneously(with respect to island time)? Is island time incremented with
> the offset of each internal message, as it's executed?
>
> I am confused because each of i1,i2,i3 may have a different time-offset, but
> because they all fall between the same two external messages, they are all
> executed at the same time (with respect to the user's experience).
> Is this correct?
>
> This seems to have some implications for animation, which leads me to my second
> question..
>
>

i1,i2, and i3 will be executed in order upon receipt of e1. Then e1 will
be executed. The virtual time at the time of execution will be the
specified future: time. Currently, future messages are executed
independently of rendering.

> 2) Disregarding frame-rate(how quickly the view is redrawn), is the update-rate
> (the ticking of the simulation) limited by the router's heart-beat rate? in
> other words, as far as the user's experience is concerned, is the simulation
> ever updated BETWEEN external messages?
> If not, what are the implications for the user-experience on real-world WANs ?
>
> It's probably apparent that I need to roll up my sleeves and dig into the
> Smalltalk code, but I wanted to consult the authorities first, rather than just
> rely on my reading ability :)
>
>

Yes - the heart-beat or any other external message triggers the local
execution. Heart-beats are artificially constructed external messages
that are generated simply to move internal future message execution
along. These can be specified when you create a new Router. Typical
numbers are every 20 or 50 msecs.
> Thanks!
>
> Aemon
>
>

Andreas.Raab

Re: basic tea-time questions

David Reed was just making a good point about the nature of the router
which is worth repeating: It is important to keep in mind that the
router is our way of dodging some of the harder problems of messaging in
replicated p2p environments and that the router shouldn't be taken as
the only possible solution.

The deeper problem is that in a replicated environment all peers need to
agree on two primary factors: The time base they operate on (so that
simulations can run at the "same" virtual time slots) and the order of
messages (so that message execution has the same results on all
replicas). Our way of solving these problems is by utilizing a central
authority (the router) which makes the problem trivial: The router can
both, order the messages and tell each client what time it is now, and
the clients can then execute messages deterministically and at the right
points in time.

Now, it is possible for the client to "interpolate" between time slots
to run simulations more smoothly and largely independent from the
centralized time base. However, that has the risk that a client might
get an external message that is logically already past (e.g., the client
has simulated past that point in virtual time already). To be able to
execute such a past message would require the client to be able to
efficiently roll back to that point in time when it should execute that
message, then execute the pending message and then continue the simulation.

Unfortunately, although we discussed this problem in excruciating
detail, we were never able to find an efficient solution to the rollback
problem (at least not in our current implementation; perhaps this will
get easier with transactional memory). And because of this, *any* past
message is fatal. And because any past message is fatal, we try hard to
avoid them ;-) Therefore we don't execute pending internal messages up
until we got notice that this particular time is already past and that
it is safe to simulate up to this point in time. The router basically
guarantees that there will be no messages with earlier time stamps and
we then execute pending internal messages to update the environment as
close as possible to real-time.

But this is by no means the only way to do it. Originally, we discussed
various alternatives, for example that of having a sliding window for
how far a client may simulate ahead locally, which would depend on its
average latency (a closer look at real latency variations will very
quickly cure you of the idea ;-) Because as long as you can make sure
that you won't get past messages, you can update your local simulation
at your leisure. And if you can roll back you don't have to worry about
past messages at all.

Cheers,
- Andreas

Howard Stearns

rollback [Re: basic tea-time questions]

All is well. Fortunately, other issues seem to dominate at the current state of
development (in my opinion). So it isn't really a problem to be dependent on
clock ticks from the router in order to move animation along. I don't have any
opinion on when it will matter.

But it's still interesting to look at rollback.

Brie uses dependency-directed backtracking to provide an efficient means of
rollback. The motivation was not optimistic execution, though -- it was to allow
safe composition of separately written behaviors in an unpredictable
event-driven environment. But a nice result is that arbitrary events can be
undone out-of-order.

I would like to exploit this to provide efficient object-oriented "undo." [User
selects any arbitrary combination of one or more objects to obtain a merged
timeline of events. User can slide back and forth to any point on any *branch*
of the timeline, with all dependencies unwound (even in objects not selected).]

Note that there's no need to limit things to "reversible" behaviors.

Given this, I think optimistic execution becomes pretty easy, no?

Of course, this only works where everything that effects the semantics of
simulation (e.g., not including rendering) is handled as a Briehavior, and
that's not really acceptable at the basic core Croquet infrastructure level.

Alas, I'm not convinced that there is absolutely no way to create a
history-dependent undo state. That's not a problem for replicated user-initiated
undoing -- the worst that can happen is that all users get an unexpected result.
But it's fatal for non-replicated internal corrections for late-arriving past
messages.

Andreas Raab wrote:
> ...
> Unfortunately, although we discussed this problem in excruciating
> detail, we were never able to find an efficient solution to the rollback
> problem (at least not in our current implementation; perhaps this will
> get easier with transactional memory).
> ...
--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
mailto:[hidden email]
jabber:[hidden email]
office:+1-608-262-3724
mobile:+1-608-658-2419

David P. Reed

Re: rollback [Re: basic tea-time questions]

The solution Andreas was referring to as being inefficient was rollback
at the object level. What is efficient enough is of course debatable -
partly depending on the frequency of rollback needed and the cost
overhead incurred when running forward. Andreas and I have a dispute as
to what is "efficient enough", while we share a view that modifying the
Smalltalk VM in radical and complex ways is something that we should
undertake with great care and as a last resort (especially since a new
VM implementation paradigm - Coke - is in process that would make a
better place to try out a rollback scheme).

Letting multiple flowers bloom at the bottom layer of the system is
interesting for research exploration, but the short-term plan for
Croquet has been focused on achieving deployment at scale, not
continuous low-level re-design and bug introduction, with no progress on
UI or collaboration experience. So I agree with the current focus of
efforts on building a useful, scalable system, with improvements to be
done later. To preserve the opportunity to drop the "router", just
don't build in unnecessary dependencies on having a "router" (in other
words, adopt the core "end-to-end argument" which says if there is any
way to avoid adding function to central resources like the router, do
so, even if it seems cheaper to toss it into the router*).

* Those of you who think the "end-to-end argument" is a four letter word
(the number who respond viciously to my mentioning it as a good thing
increases daily) can ignore the last sentence. Or not - I' like arguing. :-)

Howard Stearns wrote:

> All is well. Fortunately, other issues seem to dominate at the current
> state of development (in my opinion). So it isn't really a problem to
> be dependent on clock ticks from the router in order to move animation
> along. I don't have any opinion on when it will matter.
>
> But it's still interesting to look at rollback.
>
> Brie uses dependency-directed backtracking to provide an efficient
> means of rollback. The motivation was not optimistic execution, though
> -- it was to allow safe composition of separately written behaviors in
> an unpredictable event-driven environment. But a nice result is that
> arbitrary events can be undone out-of-order.
>
> I would like to exploit this to provide efficient object-oriented
> "undo." [User selects any arbitrary combination of one or more objects
> to obtain a merged timeline of events. User can slide back and forth
> to any point on any *branch* of the timeline, with all dependencies
> unwound (even in objects not selected).]
>
> Note that there's no need to limit things to "reversible" behaviors.
>
> Given this, I think optimistic execution becomes pretty easy, no?
>
> Of course, this only works where everything that effects the semantics
> of simulation (e.g., not including rendering) is handled as a
> Briehavior, and that's not really acceptable at the basic core Croquet
> infrastructure level.
>
> Alas, I'm not convinced that there is absolutely no way to create a
> history-dependent undo state. That's not a problem for replicated
> user-initiated undoing -- the worst that can happen is that all users
> get an unexpected result. But it's fatal for non-replicated internal
> corrections for late-arriving past messages.
>
>
> Andreas Raab wrote:
>> ...
>> Unfortunately, although we discussed this problem in excruciating
>> detail, we were never able to find an efficient solution to the
>> rollback problem (at least not in our current implementation; perhaps
>> this will get easier with transactional memory). ...

Howard Stearns

Re: rollback [Re: basic tea-time questions]

Complete agreement. Brie is a layer of user-visible behavior objects, not a
replacement for stuff below. I.e., it's not briehaviors all the way down. (I'll
leave that to Ian.) But it does provide efficient rollback at the Brie-object level.

off topic: I blog at a friend's site that features a media-watchdog lawyer. I
find it interesting that he's constantly finding applications in his field for
end-to-end.
http://www.google.com/search?q=site%3Awetmachine.com+%22Harold+Feld%22+%22end+to+end%22

David P. Reed wrote:

> The solution Andreas was referring to as being inefficient was rollback
> at the object level. What is efficient enough is of course debatable -
> partly depending on the frequency of rollback needed and the cost
> overhead incurred when running forward. Andreas and I have a dispute as
> to what is "efficient enough", while we share a view that modifying the
> Smalltalk VM in radical and complex ways is something that we should
> undertake with great care and as a last resort (especially since a new
> VM implementation paradigm - Coke - is in process that would make a
> better place to try out a rollback scheme).
>
> Letting multiple flowers bloom at the bottom layer of the system is
> interesting for research exploration, but the short-term plan for
> Croquet has been focused on achieving deployment at scale, not
> continuous low-level re-design and bug introduction, with no progress on
> UI or collaboration experience. So I agree with the current focus of
> efforts on building a useful, scalable system, with improvements to be
> done later. To preserve the opportunity to drop the "router", just
> don't build in unnecessary dependencies on having a "router" (in other
> words, adopt the core "end-to-end argument" which says if there is any
> way to avoid adding function to central resources like the router, do
> so, even if it seems cheaper to toss it into the router*).
>
> * Those of you who think the "end-to-end argument" is a four letter word
> (the number who respond viciously to my mentioning it as a good thing
> increases daily) can ignore the last sentence. Or not - I' like arguing.
> :-)
>
> Howard Stearns wrote:
>> All is well. Fortunately, other issues seem to dominate at the current
>> state of development (in my opinion). So it isn't really a problem to
>> be dependent on clock ticks from the router in order to move animation
>> along. I don't have any opinion on when it will matter.
>>
>> But it's still interesting to look at rollback.
>>
>> Brie uses dependency-directed backtracking to provide an efficient
>> means of rollback. The motivation was not optimistic execution, though
>> -- it was to allow safe composition of separately written behaviors in
>> an unpredictable event-driven environment. But a nice result is that
>> arbitrary events can be undone out-of-order.
>>
>> I would like to exploit this to provide efficient object-oriented
>> "undo." [User selects any arbitrary combination of one or more objects
>> to obtain a merged timeline of events. User can slide back and forth
>> to any point on any *branch* of the timeline, with all dependencies
>> unwound (even in objects not selected).]
>>
>> Note that there's no need to limit things to "reversible" behaviors.
>>
>> Given this, I think optimistic execution becomes pretty easy, no?
>>
>> Of course, this only works where everything that effects the semantics
>> of simulation (e.g., not including rendering) is handled as a
>> Briehavior, and that's not really acceptable at the basic core Croquet
>> infrastructure level.
>>
>> Alas, I'm not convinced that there is absolutely no way to create a
>> history-dependent undo state. That's not a problem for replicated
>> user-initiated undoing -- the worst that can happen is that all users
>> get an unexpected result. But it's fatal for non-replicated internal
>> corrections for late-arriving past messages.
>>
>>
>> Andreas Raab wrote:
>>> ...
>>> Unfortunately, although we discussed this problem in excruciating
>>> detail, we were never able to find an efficient solution to the
>>> rollback problem (at least not in our current implementation; perhaps
>>> this will get easier with transactional memory). ...

--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
mailto:[hidden email]
jabber:[hidden email]
office:+1-608-262-3724
mobile:+1-608-658-2419

obrien-2

Re: basic tea-time questions

In reply to this post by Andreas.Raab

> Unfortunately, although we discussed this problem in excruciating
> detail, we were never able to find an efficient solution to the rollback
> problem (at least not in our current implementation; perhaps this will
> get easier with transactional memory). And because of this, *any* past
> message is fatal. And because any past message is fatal, we try hard to
> avoid them ;-)

Henry Sowizral and Dave Jefferson were working on this
very problem years ago when I was at Rand. They came up with
a solution, called Timewarps, which does exactly this: roll back
efficiently.

You might be able to dig up their work. It's not my field,
though, so it might not apply.

Mike O'Brien

David P. Reed

Re: basic tea-time questions

It's painful to recall such concerns, but my original work on this
predated Jefferson's stuff (my thesis was completed in 1978, and I did a
world interview tour talking about it a year earlier, including a talk
at USC where Jefferson was in the audience). In any case, the issue
isn't about the conceptual framework, where I agree that my work and
Jefferson's are the core ideas. The issue is about "efficiency" which
involves "write barriers" "incremental garbage collection" and other
ideas that make it efficient in the instruction cycles and memory
references per operation so that one might claim that "Croquet is as
fast as compiled C code within 10%" or some such silly metric. o(N)
vs. o(N^2) is not the issue being debated. We are talking about constant
factors.)

Mike O'Brien wrote:

>> Unfortunately, although we discussed this problem in excruciating
>> detail, we were never able to find an efficient solution to the rollback
>> problem (at least not in our current implementation; perhaps this will
>> get easier with transactional memory). And because of this, *any* past
>> message is fatal. And because any past message is fatal, we try hard to
>> avoid them ;-)
>>
>
> Henry Sowizral and Dave Jefferson were working on this
> very problem years ago when I was at Rand. They came up with
> a solution, called Timewarps, which does exactly this: roll back
> efficiently.
>
> You might be able to dig up their work. It's not my field,
> though, so it might not apply.
>
> Mike O'Brien
>
>

Andreas.Raab

Re: basic tea-time questions

In reply to this post by obrien-2

Mike O'Brien wrote:

>> Unfortunately, although we discussed this problem in excruciating
>> detail, we were never able to find an efficient solution to the rollback
>> problem (at least not in our current implementation; perhaps this will
>> get easier with transactional memory). And because of this, *any* past
>> message is fatal. And because any past message is fatal, we try hard to
>> avoid them ;-)
>
> Henry Sowizral and Dave Jefferson were working on this
> very problem years ago when I was at Rand. They came up with
> a solution, called Timewarps, which does exactly this: roll back
> efficiently.

Well, there is efficient and then there is efficient ;-) What we were
looking for is a mechanism that can roll back an island of potentially a
couple of megabytes in size within milliseconds, undoing potentially
many thousands of messages executed ahead. Oh, and did I mention that
you're not allowed to cheat (e.g., change the object model) to make this
easier? ;-)

> You might be able to dig up their work. It's not my field,
> though, so it might not apply.

Some of the higher level bits most definitely apply but from the
implementation point of view it would probably be completely useless.
Simply because that kind of stuff depends heavily on one's object model
and the memory subsystem (state backups, write barriers etc. all need to
be handled at that level) and I doubt that their object model had any
similarity to that of Squeak ;-)

Cheers,
- Andreas

Ralph Johnson

Re: basic tea-time questions

On 3/23/07, Andreas Raab <[hidden email]> wrote:
> Well, there is efficient and then there is efficient ;-) What we were
> looking for is a mechanism that can roll back an island of potentially a
> couple of megabytes in size within milliseconds, undoing potentially
> many thousands of messages executed ahead. Oh, and did I mention that
> you're not allowed to cheat (e.g., change the object model) to make this
> easier? ;-)

What do you mean by "change the object model"? Do you mean "must run
on a stock Squeak VM"? In my experience, major improvements in
efficiency usually involve creativity, i.e. changing the basic
assumptions, i.e. cheating.

For example, suppose we change the VM so that it could make snapshots
of memory and rollback to a snapshot. By a snapshot, I mean just
copying the entire contents of object memory. Suppose you had a 10 MB
image. You should be able to copy 10 MB in 10 milliseconds. Even
better, supposed you tied it into a generation scavenging GC so that
whenever you copied newspace, you made a separate copy of it, and you
also made a log of the original version of all old objects that you
changed. So, when you wanted to roll back, you'd do it to a point in
time when you did a GC, but you'd only have to save new space after a
GC, which would be a couple percent of your entire image size. I bet
you could do this by just adding a few primitives to Squeak ("enable
GC snapshots", "get list of snapshots", "rollback to particular
snapshot") without doing much damage to Squeak, so that the default
Squeak VM could support it.

If we could get this to work, and it didn't impact normal Squeak
performance enough to prevent it from becoming part of the standard
VM, would you consider this "changing the object model"?

This is related to another idea that I have, which is making Croquet
so it can make use of multicores. I wondered whether it would be
possible to do this by running each island on its own image. The
tricky part would be rendering, because the main image would have to
ask each island image to render on a particular part of the screen.
(The alternative, making the main image reach into the island image
for rendering would probably be too slow) Other communication, such
as from the routers, could be mediated by the main image, so that
island images would not need to keep track of external resources, and
it would be easy to roll them back.

Do you think it is feaible to partition Croquet into several images,
the "master image" and a set of "island images"?

-Ralph

David P. Reed

Re: basic tea-time questions

No doubt the isolation possible due to the TeaTime model would map well
to multicores.

Once upon a time (when Croquet research was being supported, rather than
just development) we were trying to get some experimentation along those
lines started. It's still a good idea if anyone can get support to
pursue it.

Ralph Johnson wrote:

> On 3/23/07, Andreas Raab <[hidden email]> wrote:
>> Well, there is efficient and then there is efficient ;-) What we were
>> looking for is a mechanism that can roll back an island of potentially a
>> couple of megabytes in size within milliseconds, undoing potentially
>> many thousands of messages executed ahead. Oh, and did I mention that
>> you're not allowed to cheat (e.g., change the object model) to make this
>> easier? ;-)
>
> What do you mean by "change the object model"? Do you mean "must run
> on a stock Squeak VM"? In my experience, major improvements in
> efficiency usually involve creativity, i.e. changing the basic
> assumptions, i.e. cheating.
>
> For example, suppose we change the VM so that it could make snapshots
> of memory and rollback to a snapshot. By a snapshot, I mean just
> copying the entire contents of object memory. Suppose you had a 10 MB
> image. You should be able to copy 10 MB in 10 milliseconds. Even
> better, supposed you tied it into a generation scavenging GC so that
> whenever you copied newspace, you made a separate copy of it, and you
> also made a log of the original version of all old objects that you
> changed. So, when you wanted to roll back, you'd do it to a point in
> time when you did a GC, but you'd only have to save new space after a
> GC, which would be a couple percent of your entire image size. I bet
> you could do this by just adding a few primitives to Squeak ("enable
> GC snapshots", "get list of snapshots", "rollback to particular
> snapshot") without doing much damage to Squeak, so that the default
> Squeak VM could support it.
>
> If we could get this to work, and it didn't impact normal Squeak
> performance enough to prevent it from becoming part of the standard
> VM, would you consider this "changing the object model"?
>
> This is related to another idea that I have, which is making Croquet
> so it can make use of multicores. I wondered whether it would be
> possible to do this by running each island on its own image. The
> tricky part would be rendering, because the main image would have to
> ask each island image to render on a particular part of the screen.
> (The alternative, making the main image reach into the island image
> for rendering would probably be too slow) Other communication, such
> as from the routers, could be mediated by the main image, so that
> island images would not need to keep track of external resources, and
> it would be easy to roll them back.
>
> Do you think it is feaible to partition Croquet into several images,
> the "master image" and a set of "island images"?
>
> -Ralph
>

Andreas.Raab

Re: basic tea-time questions

In reply to this post by Ralph Johnson

Ralph Johnson wrote:
> What do you mean by "change the object model"? Do you mean "must run
> on a stock Squeak VM"? In my experience, major improvements in
> efficiency usually involve creativity, i.e. changing the basic
> assumptions, i.e. cheating.

Oh sure, I perfectly understand that. I was describing the concrete
constraints we were operating under for Hedgehog. Changes at this level
are usually quite disruptive and we simply didn't have the time to make
modifications at this level. Which had an impact on a number of
decisions, including that we didn't go with rollback.

Cheers,
- Andreas