Hi all,
I'm trying around with Croquet since a short while and stumbled over a bug in TPromise, but after a first fix there I also found that I could lock up my squeak (3.8.18beta1U on os x 10.5) as simply as this: |s b c| s := Semaphore new. b := false. [b := true. "forgotten/crashed therefore no: s signal." ] fork. Delay timeoutSemaphore: s afterMSecs: 1. s wait. c := b. Transcript show: c; cr. This shows the use case. Shorter even: |s| s := Semaphore new. Delay timeoutSemaphore: s afterMSecs: 1. s wait. Transcript show: 'doesnt get here - not printed'; cr. I hope I'm doing something wrong here! I pasted the fixes from 6576 and 6588 to no avail. I have this fix for my use-case: Avoid Delay>>timeoutSempaphore:afterMSecs: |s| s := Semaphore new. [(Delay forSeconds:1) wait. s signal.] fork. s wait. Transcript show: 'Now gets here - line will be printed'; cr. Resource intense, I know. I would love to use your Delay mechanism! Cheers, Georg |
On Jan 4, 2008 2:58 PM, Georg Köster <[hidden email]> wrote:
> Delay timeoutSemaphore: s afterMSecs: 1. Despite the comment in the source code, Delay class>>timeoutSemaphore:afterMSecs: doesn't signal the given semaphore; in fact, it never uses that parameter. That method should be deprecated, if not simply removed. I've filed a bug report: http://bugs.squeak.org/view.php?id=6834 > I have this fix for my use-case: Avoid Delay>>timeoutSempaphore:afterMSecs: > |s| > s := Semaphore new. > [(Delay forSeconds:1) wait. s signal.] fork. > s wait. > Transcript show: 'Now gets here - line will be printed'; cr. > Resource intense, I know. Why; because you used fork? Squeak processes don't work the way you might think. I would recommend a solution along those exact lines. Hope this helps! --Tom Phoenix |
Hi Tom,
Thanks for the quick response! I thought the semaphore gets set as the semaphore that is signaled later... well the code is complicated. I have another question: Semaphore>>waitTimeoutMSecs: | d | d := Delay timeoutSemaphore: self afterMSecs: (anInteger max: 0). self wait. "Signaling by the Delay here would over-signal the Sem." d unschedule. A fix would need a sem for the sem I think... is that method usable? Cheers again, Georg On Jan 5, 2008 12:37 AM, Tom Phoenix <[hidden email]> wrote:
|
On Jan 4, 2008 6:57 PM, Georg Köster <[hidden email]> wrote:
> I have another question: > Semaphore>>waitTimeoutMSecs: > | d | > d := Delay timeoutSemaphore: self afterMSecs: (anInteger max: 0). > self wait. > "Signaling by the Delay here would over-signal the Sem." > d unschedule. > > A fix would need a sem for the sem I think... is that method usable? Is that the source in your image for that method? Here's what I have: waitTimeoutMSecs: anInteger "Wait on this semaphore for up to the given number of milliseconds, then timeout. It is up to the sender to determine the difference between the expected event and a timeout." | waitingProcess wakeupProcess | waitingProcess _ Processor activeProcess. wakeupProcess _ [(Delay forMilliseconds: (anInteger max: 0)) wait. self resumeProcess: waitingProcess] fork. self wait. wakeupProcess terminate. Cheers! --Tom Phoenix |
In reply to this post by Georg Köster
Georg Köster wrote:
> I'm trying around with Croquet since a short while and stumbled over a > bug in TPromise, but after a first fix there I also found that I could > lock up my squeak (3.8.18beta1U on os x 10.5) as simply as this: > > |s b c| > s := Semaphore new. > b := false. > [b := true. "forgotten/crashed therefore no: s signal." ] fork. > Delay timeoutSemaphore: s afterMSecs: 1. > s wait. > c := b. > Transcript show: c; cr. I am not sure what you expect this to print. In a Croquet SDK image it prints "true" which I presume is correct? > This shows the use case. Shorter even: > |s| > s := Semaphore new. > Delay timeoutSemaphore: s afterMSecs: 1. > s wait. > Transcript show: 'doesnt get here - not printed'; cr. > > I hope I'm doing something wrong here! You must be. This works fine in a Croquet SDK image. > I pasted the fixes from 6576 and 6588 to no avail. None of these would even affect this behavior. > Resource intense, I know. I would love to use your Delay mechanism! And you should be able to. All of the above seem to work fine with the Croquet images that I've tried - what images are you using it with? Cheers, - Andreas |
In reply to this post by Tom Phoenix
Tom Phoenix wrote:
> Is that the source in your image for that method? Here's what I have: > > waitTimeoutMSecs: anInteger > "Wait on this semaphore for up to the given number of milliseconds, > then timeout. It is up to the sender to determine the difference > between the expected event and a timeout." > > | waitingProcess wakeupProcess | > waitingProcess _ Processor activeProcess. > wakeupProcess _ > [(Delay forMilliseconds: (anInteger max: 0)) wait. > self resumeProcess: waitingProcess] fork. > > self wait. > wakeupProcess terminate. How odd. It seems that this version is only present in 3.9 - all other images have a variant of the (much simpler) code: waitTimeoutMSecs: anInteger "Wait on this semaphore for up to the given number of milliseconds, then timeout. It is up to the sender to determine the difference between the expected event and a timeout." | d | d := Delay timeoutSemaphore: self afterMSecs: (anInteger max: 0). [self wait] ensure:[d unschedule]. The 3.9 version is tagged "jf 12/2/2003" but there is no indication what it is trying to address (if anything). Cheers, - Andreas |
In reply to this post by Andreas.Raab
Andreas Raab schrieb:
> Georg Köster wrote: >> This shows the use case. Shorter even: >> |s| >> s := Semaphore new. >> Delay timeoutSemaphore: s afterMSecs: 1. >> s wait. >> Transcript show: 'doesnt get here - not printed'; cr. >> >> I hope I'm doing something wrong here! > > You must be. This works fine in a Croquet SDK image. introduced this bug. I guess it got included in 3.9 without anyone else looking at it. As I mention in [2] the Delay class side method should read: timeoutSemaphore: aSemaphore afterMSecs: anInteger "...comment..." ^ (self new setDelay: anInteger forSemaphore: aSemaphore) schedule I guess trying to refactor too much made me change the method. I'm sorry that I did not double check it more carefully in the first place. Alex [1] http://bugs.squeak.org/view.php?id=854 [2] http://bugs.squeak.org/view.php?id=6834 |
Hi Alex,
thanks for the note - but in my case I'm on the Croquet image and on the 3.8.18beta1U and my code looks like it should! Georg On Jan 5, 2008 1:47 PM, Alexander Lazarević <
[hidden email]> wrote: Andreas Raab schrieb: |
Georg Köster wrote:
> Hi Alex, > > thanks for the note - but in my case I'm on the Croquet image and on the > 3.8.18beta1U and my code looks like it should! So then what exactly is your problem? Like I was saying earlier, the code you posted appears to work fine in the Croquet SDK image. Cheers, - Andreas > > Georg > > On Jan 5, 2008 1:47 PM, Alexander Lazarević < [hidden email] > <mailto:[hidden email]>> wrote: > > Andreas Raab schrieb: > > Georg Köster wrote: > >> This shows the use case. Shorter even: > >> |s| > >> s := Semaphore new. > >> Delay timeoutSemaphore: s afterMSecs: 1. > >> s wait. > >> Transcript show: 'doesnt get here - not printed'; cr. > >> > >> I hope I'm doing something wrong here! > > > > You must be. This works fine in a Croquet SDK image. > Unfortunately it seems that a ChangeSet of mine addressing [1] > introduced this bug. I guess it got included in 3.9 without anyone else > looking at it. As I mention in [2] the Delay class side method > should read: > > timeoutSemaphore: aSemaphore afterMSecs: anInteger > "...comment..." > > ^ (self new setDelay: anInteger forSemaphore: aSemaphore) schedule > > I guess trying to refactor too much made me change the method. I'm sorry > that I did not double check it more carefully in the first place. > > Alex > > [1] http://bugs.squeak.org/view.php?id=854 > [2] http://bugs.squeak.org/view.php?id=6834 > <http://bugs.squeak.org/view.php?id=6834> > > > > ------------------------------------------------------------------------ > > |
In reply to this post by Andreas.Raab
Hey I prefer Tom's version! I shouldn't be racy if I read the resumeProcess and resume code correctly. The resumeProcess message has no effect on non-waiting processes!
> | waitingProcess wakeupProcess | > waitingProcess _ Processor activeProcess. > wakeupProcess _ > [(Delay forMilliseconds: (anInteger max: 0)) wait. > self resumeProcess: waitingProcess] fork. > > self wait. "preempting here and getting the resumeProcess message sent would have no effect - therefore no race!" "in comparison having a semaphore getting signaled here would cause an excess signal on the sem: bad" > wakeupProcess terminate. Cheers! Georg On Jan 5, 2008 9:43 AM, Andreas Raab <[hidden email]> wrote:
|
Georg Köster wrote:
> Hey I prefer Tom's version! I shouldn't be racy if I read the > resumeProcess and resume code correctly. The resumeProcess message has > no effect on non-waiting processes! To the contrary. This code *introduces* a race condition when manipulating the Semaphore's list of processes without protecting it against concurrent modifications. Worse, the code *cannot be protected* against concurrent modification since the VM manipulates that list on its own. It may actually explain why some people have reported issues on 3.9 that do not seem to appear on 3.8 or 3.10 variants. > > > | waitingProcess wakeupProcess | > > waitingProcess _ Processor activeProcess. > > wakeupProcess _ > > [(Delay forMilliseconds: (anInteger max: 0)) wait. > > self resumeProcess: waitingProcess] fork. > > > > self wait. > "preempting here and getting the resumeProcess message sent would > have no effect - therefore no race!" What are you talking about? With the original code there was no race condition whatsoever. If you think there is a race condition somewhere, please explain in detail where you think that race condition is. > "in comparison having a semaphore getting signaled here would cause > an excess signal on the sem: bad" First, the original code was guarded with a call to #unschedule which would avoid a second signal if the "real event" occurs long enough before the "timeout event". On the other hand, neither implementation prevents a double-signal if the real event occurs within an epsilon of the timeout event. And due to the asynchronous nature of some semaphore signals it is impossible to have a light-weight implementation like the one provided in Semaphore handle this correctly. It's possible to handle this but this would require to separate the semaphore and the delay signal and use a critical section to decide which one came first and how to deal with the other. That, on the other hand, seems overkill for the 99% of the cases in which the simplistic implementation in 3.8/3.10 is all that's needed. Cheers, - Andreas |
Hi Andreas, all,
first of all: Sorry, it also works in my fresh Croquet image. The backup I always used is apparently not as fresh as I thought :-( mince. And I didn't see that email :-( Double sorry. Newbie qestion: Why are images restartable? Why not label them differently (for example as 'core dump') if they don't work? To the race discussion. I think we have a point: § double signaling is unavoidable if a critical section is to be avoided Therefore there is a race in the first implementation. The second implementation has clearly other troubles (I didn't see that it's just using a LinkedList that's apparently not thread safe), but I explained before that I believe at least the double signaling is dealt with. I would recommend at least adding this problem to the documentation of wait*, that if one of the timed waits is used the wait (without timeout) contract is violated in that it might return even if no user process sent a signal. Best regards and thanks for considering my comments in the first place! Georg On Jan 5, 2008 3:05 PM, Andreas Raab <[hidden email]
> wrote:
|
Georg Köster wrote:
> The second implementation has clearly other troubles (I didn't see that > it's just using a LinkedList that's apparently not thread safe), but I > explained before that I believe at least the double signaling is dealt > with. It most definitely does not prevent double signaling! What makes you think it does? It is easiest to see in the case where we first get a timeout signal, and then -while being inside #resumeProcess:- the "real" event. Obviously there is nothing in the implementation that prevents the second signal from occurring. > I would recommend at least adding this problem to the documentation of > wait*, that if one of the timed waits is used the wait (without timeout) > contract is violated in that it might return even if no user process > sent a signal. Well, duh. It is the very *definition* of waitTimeoutMSecs: that it may return although the event hasn't occurred. That's what the method does. The comment is explicit, too noting that "It is up to the sender to determine the difference between the expected event and a timeout". Cheers, - Andreas |
Hi Andreas,
ok, I didn't explain it thorough enough. That's why meeting can be nice sometimes! I'm keeping my two statements up. To the first: The doubled signal and its resume would occur, but it would resume a process that is already resumed, and therefore have no effect (I assume that the process doesn't skip a wait because of a resume received while running). The process can only be waiting again and therefore vulnerable to a doubled signal if he passes the termination of the interrupting thread - and past this termination no signaling from the dead process can occur. So yes, double signaling may occur - but it's dealt with. To the second: Yes, definition of waitTimeoutMSecs:, but not wait:! That's what I meant when I said 'if one of the timed waits [waitTimeoutMSecs or -Seconds] is used the wait (without timeout) contract is violated...' It is very normal that the caller needs to check the result after using a timed wait, but nothing in the doc states that also the processes calling a forever-blocking wait will now need to check the result. That should be added. Something along the lines of: Semaphore>>wait "... The calling process is required to check if it was signaled by some external means as the signaling might occur bogusly." ... Isn't this the consequence of the possible double signaling? Cheers, Georg On Jan 5, 2008 4:28 PM, Andreas Raab <[hidden email]> wrote:
|
In reply to this post by Georg Köster
Hi all. While playing around with my DeltaStreamBrowser, I see
that it takes four seconds for the UI to update after I click on a change. And I have no idea why. I've often wondered why the browser is so slow too, and now I have the need to find out. So, How does one go about debugging this? I know about MessageTally and TimeProfileBrowser, but I've only seen them used on well-defined code, not in the wild in the Squeak UI. And I know about world > debug > Message Tally, but I really have no idea what the numbers they report mean, or what to look for in the reports they return, or how to even know what messages were sent while actually processing my click vs. in between while I move the mouse around. What I would really like to happen is, I tell squeak: "Pop-up a debugger on the next click". Then, on the next click, the UI process is suspended and shown in a debugger that has it's own private UI loop. Is anything like this possible? I am sure the Morphic event loop was not created without the help of the debugger, but I really have no idea how to trace a UI event from post to screen-update. So, that is what I have to work with, and how I don't know what to do with it. So, how do I track down the latency problem? -- Matthew Fulmer -- http://mtfulmer.wordpress.com/ Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808 |
On Jan 25, 2008 10:25 PM, Matthew Fulmer <[hidden email]> wrote:
> I've often wondered why the browser is so slow too, and now I > have the need to find out. So, How does one go about debugging > this? Have you tried interrupting it? While you're waiting for Squeak to respond, use the Cmd-period (Alt-period) interrupt to bring up a debugger (and a new main event loop, if needed). From there, you may still have a slow UI, but it should be fairly straightforward to see what method is taking a long time to respond. (My money is on #hash, but you never know.) Good luck with it! --Tom Phoenix |
In reply to this post by Tapple Gao
Real profiling: Why is my UI so slow?
Hi Matthew, I have seen some interesting problems with the image and browser speed. The slowness seems to come when the browser is updating itself and that can be triggered by other morphs overlapping (e.g. other browsers). See the mantis report: http://bugs.squeak.org/view.php?id=2084 0002084: Each overlaping browser slows down screen updates considerably. Caching the image would speed things up. This is mostly external observations. At the time I reported this I hadn't learned to use the tally tools usefully. (I still haven't. They give too much information. I want something that will point me quickly to the culprit.) My observation is that caching images would be one way to speed things up. I haven't tried to tackle the problem directly. (Except to make 3.9 workspaces start out smaller. The less things overlap the smaller the problem.) Animated morphs are a good way to aggrivate the problem. By stepping and invalidating their bounds. Hmm, I wonder if a one pixel rectangle would be good for pinpointing the worst update areas?.Invalidate the rectangle each step. Move it around the screen to see which pixel causes the greatest slowdown. The step code could note the clock interval between steps. Color a map of the screen with the results. Alright, we have a crazy idea. Lets see where it gets us. I'll post something to mantis when it gets interesting. Cheers. Yours in curiosity and service, --Jerome Peace *** >Matthew Fulmer tapplek at gmail.com >Sat Jan 26 06:25:35 UTC 2008 > > >Hi all. While playing around with my DeltaStreamBrowser, I see >that it takes four seconds for the UI to update after I click on >a change. And I have no idea why. > >I've often wondered why the browser is so slow too, and now I >have the need to find out. So, How does one go about debugging >this? > >I know about MessageTally and TimeProfileBrowser, but I've only >seen them used on well-defined code, not in the wild in the >Squeak UI. And I know about world > debug > Message Tally, but I >really have no idea what the numbers they report mean, or what >to look for in the reports they return, or how to even know what >messages were sent while actually processing my click vs. in >between while I move the mouse around. > >What I would really like to happen is, I tell squeak: "Pop-up a >debugger on the next click". Then, on the next click, the UI >process is suspended and shown in a debugger that has it's own >private UI loop. Is anything like this possible? I am sure the >Morphic event loop was not created without the help of the >debugger, but I really have no idea how to trace a UI event from >post to screen-update. > >So, that is what I have to work with, and how I don't know what >to do with it. So, how do I track down the latency problem? *** ____________________________________________________________________________________ Looking for last minute shopping deals? Find them fast with Yahoo! Search. http://tools.search.yahoo.com/newsearch/category.php?category=shopping |
On 26-Jan-08, at 10:07 PM, Jerome Peace wrote: > > > http://bugs.squeak.org/view.php?id=2084 > 0002084: Each overlaping browser slows down screen > updates considerably. Caching the image would speed > things up. It appears to be not just 'considerably' but something along the lines of O(N^2) or worse. Run a loop that opens browsers and watch as the first few are really fast, then a few slow then it gets really sl o w tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Strange OpCodes: RCM: Randomly Corrupt Microcode |
On 27/01/2008, tim Rowledge <[hidden email]> wrote:
> > On 26-Jan-08, at 10:07 PM, Jerome Peace wrote: > > > > > > http://bugs.squeak.org/view.php?id=2084 > > 0002084: Each overlaping browser slows down screen > > updates considerably. Caching the image would speed > > things up. > It appears to be not just 'considerably' but something along the lines > of O(N^2) or worse. Run a loop that opens browsers and watch as the > first few are really fast, then a few slow then > > it > > gets > > really > > > > > sl > > > o > > > w > No, it's O(N) , but when you having 10 browser windows on screen, full world update could take about 2-3 seconds. I'm using this code for testing draw speed: MessageTally spyOn: [ World fullDrawOn: World assuredCanvas ] In my image, most time is spent in string drawing primitive. -- Best regards, Igor Stasenko AKA sig. |
In reply to this post by Jerome Peace
On Sat, Jan 26, 2008 at 10:07:03PM -0800, Jerome Peace wrote:
> This is mostly external observations. At the time I > reported this I hadn't learned to use the tally tools > usefully. (I still haven't. They give too much > information. I want something that will point me > quickly to the culprit.) I talked to Bryce about this on #squeak, and he gave several suggestions: - In MessageTally reports, look for big drops in time usage from a parent to child method. That usually points toward the culprit - Insert logging statements in code. In Keith's Logging framework (http://wiki.squeak.org/squeak/3706), I added a method just to make logging profiles easy. It helps to see when a method is getting called way too often. I still don't see how to use the debugger effectively to profile UI's, since events will be posted from the debugger thread, but screen updates happen in the UI thread, which is impossible to get a useful debugger on (as far as I can tell) -- Matthew Fulmer -- http://mtfulmer.wordpress.com/ Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808 |
Free forum by Nabble | Edit this page |