Hi guys
In Squeak, Andreas introduced the idea of a test timeout. Do you think this is interesting?

Stef

SUnit
-----
All test cases now have an associated timeout after which the test is considered failed. The purpose of the timeout is to catch issues like infinite loops, unexpected user input, etc. in automated test environments. Timeouts can be set on an individual test basis using the <timeout: seconds> tag or for an entire test case by implementing the #defaultTimeout method.
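For reference, the two hooks named in the announcement are used roughly as follows. This is only a sketch: the <timeout:> pragma and the #defaultTimeout method come from the release note above, while the test class, the method bodies, and the assumption that #defaultTimeout is an instance-side method answering a number of seconds are illustrative.

    MyServiceTest >> testSlowQuery
        "This single test may run for up to 30 seconds."
        <timeout: 30>
        self assert: self runSlowQuery notNil

    MyServiceTest >> defaultTimeout
        "Answer the timeout, in seconds, applied to every test of this
        case that carries no <timeout:> annotation of its own."
        ^ 10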
Yes, I think it's a good idea. I'm not sure what granularity is required, though.

mike
Stef,
Time to expose some of my ignorance (don't worry, I have plenty more waiting where I found this): what is the tag concept? That sounds very Tweak-ish, and I am a real believer in doing things "with the language, not TO the language" whenever possible. That is not to say that frameworks are bad; in fact, it means that frameworks are good, while language extensions are anywhere from suspect to evil.

I have some code that I am still porting, but the basic idea is to be able to write

    [ "code that might not complete" ]
        tryForSeconds: 10
        onTimeOut: [ ].

With a robust capability to do such things, it is probably not necessary (or even appropriate) for TestCase to enforce timeouts. The timeout block can simply raise an exception or assert false, and there is no need to disable timeouts where they do not belong.

Bill
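A minimal sketch of how the extension Bill proposes could be written, assuming the #valueWithin:onTimeout: block protocol present in recent Squeak and Pharo images; the selector #tryForSeconds:onTimeOut: itself is his proposal, not an existing method.

    BlockClosure >> tryForSeconds: aNumber onTimeOut: timeoutBlock
        "Evaluate the receiver; if it has not completed after aNumber
        seconds, give up and answer the value of timeoutBlock instead."
        ^ self
            valueWithin: aNumber seconds
            onTimeout: timeoutBlock

A test could then read: self assert: ([ self longComputation ] tryForSeconds: 10 onTimeOut: [ false ]), with #longComputation standing for whatever the test exercises.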
On May 29, 2010, at 4:43 PM, Schwab,Wilhelm K wrote:

> Time to expose some of my ignorance (don't worry, I have plenty more waiting where I found this): what is the tag concept?

Probably a pragma.

> With a robust capability to do such things, it is probably not necessary (or even appropriate) for TestCase to enforce timeouts. The timeout block can simply raise an exception or assert false, and there is no need to disable timeouts where they do not belong.

Yes, this could be a nice extension. What is nice with the tag (I imagine, I did not check) is that it is orthogonal to the code: it is just an indication for the test runner, so it is not intrusive.
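To illustrate the "orthogonal to the code" point: a runner can discover the annotation purely by reflection, so the test body itself never mentions timeouts. A sketch, assuming the standard Pragma reflection protocol; the helper name #timeoutFor:in: and the 5-second default are made up for the example.

    timeoutFor: aSelector in: aTestClass
        "Answer the timeout in seconds declared by a <timeout:> pragma on
        the test method, or a default when there is no such annotation."
        | method pragma |
        method := aTestClass lookupSelector: aSelector.
        pragma := method pragmas
            detect: [ :each | each keyword = #timeout: ]
            ifNone: [ ^ 5 ].
        ^ pragma argumentAt: 1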
We already have #should:notTakeMoreThan: and friends in TestCase. The complete TestCase can be protected by overriding #runCase, and an individual test by wrapping the body of the test method.

Lukas

--
Lukas Renggli
www.lukas-renggli.ch
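A sketch of the two approaches Lukas mentions. #should:notTakeMoreThan: is the existing SUnit assertion he refers to; the test class, the bodies, and the assumption that it accepts a Duration argument are illustrative.

    MyServiceTest >> testParserIsFast
        "Fail this one test if parsing takes longer than two seconds."
        self
            should: [ self parseLargeInput ]
            notTakeMoreThan: 2 seconds

    MyServiceTest >> runCase
        "Alternatively, give every test of this case the same budget."
        self should: [ super runCase ] notTakeMoreThan: 60 seconds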
For me the point is that I have the impression this is an interesting feature to have once we have a test server: it lets you make sure that you do not have tests stuck in infinite recursion. Now, it is also true that on a test server you do not really care whether your tests take 30 s or 2 min.

Stef
Exactly... my thought is running these automatically and headless. You don't necessarily need it built into the test framework itself: you can just have another process that monitors the run and kills it after a timeout. This is why I commented on the granularity; the feature Andreas describes would be used more comprehensively than this, but I'm not so fussed by that.

cheers,
Mike
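A sketch of the coarse-grained watchdog Mike describes, as it might appear in a headless batch script. MyPackageTests and the one-hour budget are illustrative; the rest is ordinary process and Delay machinery.

    | worker |
    worker := [ MyPackageTests suite run ] fork.
    (Delay forSeconds: 3600) wait.          "overall budget for the whole run"
    worker isTerminated ifFalse: [
        worker terminate.                   "the suite hung; give up"
        Transcript show: 'Nightly test run timed out'; cr.
        Smalltalk snapshot: false andQuit: true ]

A real script would also want to save the TestResult and exit as soon as the suite finishes, instead of always sleeping for the full budget, but the simple form shows the idea.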
(Copying squeak-dev too).
I'm not sold on the whole test timeout thing. When I run tests, I want to know the answer to the question, "is the software working?"

Putting a timeout on tests trades a slower, but definitive, "yes" or "no" for a supposedly faster "maybe". But is getting a "maybe" back really faster? I've just incurred the cost of running a test suite, but left without my answer. If I get a "maybe", what am I supposed to do next? Find a faster machine? Hack into the code to fiddle with a timeout pragma? That's not faster.

But the reason given for the change was not for running tests interactively (the 99% case); rather, all tests from the beginning of time are now saddled with a timeout for the 1% case:

"The purpose of the timeout is to catch issues like infinite loops, unexpected user input etc. in automated test environments."

If tests are supposed to be quick (and deterministic) anyway, wouldn't an infinite loop or user input be caught the first time the test was run (interactively)? Seriously, when we make software changes, we run the tests interactively first, and then the purpose of the night-time automated test environment is to catch regressions on the merged code. In that case, the high-level test controller which spits out the results could and should be responsible for handling "unexpected user input" and/or putting in a timeout, not each and every last test method.

IMO, we want short tests, so let's just write them to be short. If they're too long, then the encouragement to shorten them comes from our own impatience at running them interactively. Running them in batch at night requires no patience, because we're sleeping, and besides, the batch processor should take responsibility for handling those rare scenarios at a higher level.

Regards,
Chris
On May 30, 2010, at 8:52 PM, Chris Muller wrote:

> Putting a timeout on tests trades a slower, but definitive, "yes" or
> "no" for a supposedly faster "maybe". But is getting a "maybe" back
> really faster?

Thanks, this is a really good point.

> If tests are supposed to be quick (and deterministic) anyway, wouldn't
> an infinite loop or user input be caught the first time the test was
> run (interactively)?

Yes, this is what I was also implying in my previous mail. If we have a test server, a timeout does not really help, and I wonder about the infinite loop case because it may be really rare.

> IMO, we want short tests, so let's just write them to be short. [...]
> the batch processor should take responsibility for handling those
> rare scenarios at a higher level.

I agree. Thanks for sharing your thoughts. So the issue is done. :)
2010/5/30 Stéphane Ducasse <[hidden email]>:

> Yes, this is what I was also implying in my previous mail. If we have
> a test server, a timeout does not really help, and I wonder about the
> infinite loop case because it may be really rare.

My opinion differs here. Every test should run in a short time frame, with a few exceptions. So it seems reasonable to just specify a default timeout for your architecture and some specific timeouts for a few specific tests (or test classes).

Your main argument is that manual tuning will always be better than automated default behaviour, and we can only agree on that one.

But there are two pragmatic cases you don't take into account:
- the case where community-supplied test cases do not comply with these rather implicit requirements, and the image integrator does not have time to dig into each case and do the fine tuning for the rest of the community;
- the case where automated tests are used for exploratory package testing.

In the first case, the integrator just puts in a threshold that will reject some tests, and it is up to the rest of the community to invest more time in solving the problem.

Maybe Andreas was also addressing the case of network-in-the-loop tests. There the timeout can be seen as a quick hack for bypassing the low-level timeout and retry count (which are not always that easily accessible in some APIs...).

Concerning the occurrence of infinite loops, some are produced by incompatible packages. So when you automate compatible-package exploration, a timeout might help, because I doubt you will have explored each case interactively. Of course, you can always put a timeout at an upper level, in your bash script or something, but it would not be particularly fine-grained, would it?

One typical case I often bump into is classes defining printOn: by sending storeOn: and vice versa, and likewise printString and printOn:. If the core happens to change between two releases and you have a subclass defined in your package, the probability of running into one of these infinite loops increases.

Nicolas
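The printOn:/storeOn: trap Nicolas describes looks like this in its simplest form (OverridingThing is a made-up class): neither method ever writes anything itself, so sending #printString recurses until the stack gives out.

    OverridingThing >> printOn: aStream
        "Fine while the inherited storeOn: did the real work..."
        self storeOn: aStream

    OverridingThing >> storeOn: aStream
        "...but after a core change this now delegates straight back."
        self printOn: aStream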
2010/5/31 Nicolas Cellier <[hidden email]>:

> In the first case, the integrator just puts in a threshold that will
> reject some tests, and it is up to the rest of the community to invest
> more time in solving the problem.

In other words, the timeout has the advantage of turning a rather implicit requirement into an explicit one. It is then up to test producers to fine-tune their tests with respect to this requirement, or to use the available hooks in the case of long tests, rather than letting the integrator guess.

Nicolas
Oh, and I see one more advantage: only quick tests would be run interactively by default from your TestRunner. We could automate the separation between long tests and quick tests based on the timeout, provided the information is discoverable (either in method annotations or in a class-side query, I don't care which). Of course, as a user I would want to distinguish failures, errors and timeouts, so the user interface would have to change.

Nicolas
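A sketch of the separation Nicolas suggests, treating the declared timeout as the quick-versus-long criterion. TestSuite, #testSelectors, #selector: and #addTest: are existing SUnit protocol; #timeoutFor:in: is the made-up helper sketched earlier, and the 5-second threshold is arbitrary.

    quickSuiteFor: aTestClass
        "Answer a suite of only those tests whose declared timeout is at
        most 5 seconds; longer ones are left to the batch run."
        | suite |
        suite := TestSuite named: aTestClass name asString , ' (quick)'.
        aTestClass testSelectors do: [ :each |
            (self timeoutFor: each in: aTestClass) <= 5
                ifTrue: [ suite addTest: (aTestClass selector: each) ] ].
        ^ suite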
Ok, thanks.

Maybe the test runner should be adapted; we will see. For now I will just remove the item from the urgent list, and if the need emerges we know that it is there.

Stef
> In other words, the timeout has the advantage of turning a rather
> implicit requirement into an explicit one. It is then up to test
> producers to fine-tune their tests with respect to this requirement,
> or to use the available hooks in the case of long tests, rather than
> letting the integrator guess.

Yes. For that we can probably use what Lukas pointed out: "We already have #should:notTakeMoreThan: and friends in TestCase. The complete TestCase can be protected by overriding #runCase, and an individual test by wrapping the body of the test method."
On May 31, 2010, at 12:29 PM, Nicolas Cellier wrote:

> Oh, and I see one more advantage: only quick tests would be run
> interactively by default from your TestRunner. We could automate the
> separation between long tests and quick tests based on the timeout,
> provided the information is discoverable (either in method annotations
> or in a class-side query, I don't care which).

But do we have to tag that with a timeout? Jorge already sorted tests into unit = fast and integration = slow. In interactive mode we could run only the fast ones. I think we already have a lot and we do not use it enough, so we will learn and see.

Stef
> In other words, the timeout has the advantage of turning a rather
> implicit requirement into an explicit one.

I can almost understand it for this, especially in the context of development within an open-source community. Unfortunately, it neuters the deterministic property. Did the test time out because it was stuck in a loop, or because the integrator was watching a video while the tests were running, causing a couple of them to time out? That's an obfuscation that is not quick and easy to see through. And why will someone, someday, have to work their way through this sort of opaque obfuscation? Merely because we imposed a design-time constraint on ourselves, like working in a static language. The result of working through it will most likely be a new package version that merely bumps the timeout, hurling us back toward the present, because the test suite is allowed to run longer and longer.

> It is then up to test producers to fine-tune their tests with respect
> to this requirement, or to use the available hooks in the case of long
> tests, rather than letting the integrator guess.

It's very possible this is another case of me just not seeing or understanding the key point. Exploratory package testing? The integrator doesn't have time to "fine tune"? I must admit, my feelings about it are more visceral than logical in nature. However, this timeout breaks all long-running legacy tests. I intend to make the default timeout a global preference in Squeak. From there I will wait and see quietly.

Regards,
Chris
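What "a global preference" might look like, sketched only; this is not the actual Squeak change. The DefaultTimeout class variable and the accessors are made up, and the sketch assumes the per-test lookup falls back to this value when neither a <timeout:> pragma nor a #defaultTimeout override is present.

    TestCase class >> defaultTimeout
        "Answer the image-wide default timeout in seconds.
        DefaultTimeout is assumed to be a class variable of TestCase."
        ^ DefaultTimeout ifNil: [ 5 ]

    TestCase class >> defaultTimeout: aNumberOfSeconds
        "Let a site raise (or effectively disable) the default, e.g.
        TestCase defaultTimeout: 3600."
        DefaultTimeout := aNumberOfSeconds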
Hi Chris -
Let me comment on this from a more general point of view first, before going into the specifics. I've spent the last five years building a distributed system and during this time I've learned a couple of things about the value of timeouts :-) One thing that I've come to understand is that *no* operation is unbounded. We may leisurely talk about "just wait until it's done" but the reality is that regardless of what the operation is we never actually wait forever. At some point we *will* give up no matter what you may think. This is THE fundamental point here. Everything else is basically haggling about what the right timeout is. For the right timeout the second fundamental thing to understand is that if there's a question of whether the operation "maybe" completed, then your timeout is too short. Period. The timeout's value is not to indicate that "maybe" the operation completed, it is there to say unequivocally that something caused it to not complete and that it DID fail. Obviously, introducing timeouts will create some initial false positives. But it may be interesting to be a bit more precise on what we're talking about. To do this I attributed TestRunner to measure the time it takes to run each test and then ran all the tests in 4.2 to see where that leads us. As you might expect, the distribution is extremely uneven. Out of 2681 tests run 2588 execute in < 500 msecs (approx. 1800 execute with no measurable time); 2630 execute in less than one second, leaving a total of 51 that take more than a second and only three tests actually take longer than 5 seconds and they are all tagged as such. As you can see the vast majority of tests have a "safety margin" of 10x or more between the time the test usually takes and its timeout value. Generally speaking, this margin is sufficient to compensate for "other" effects that might rightfully delay the completion of the test in time. If you have tests that commonly vary by 10x I'd be interested in finding out more about what makes them so unpredictable. So if your question is "are my timeouts to tight" one thing we could do is to introduce the 10x as a more or less general guideline for executing tests, and perhaps add a transcript notifier if we ever come closer than 1/3rd of the specified timeout value (i.e., indicating that something in the nature of the test has changed that should be reflected in its timeout). This would give you ample warning that you need to adjust your test even if it isn't (yet) failing on the timeout. That said, a couple of concrete comments to your post: On 5/30/2010 11:52 AM, Chris Muller wrote: > (Copying squeak-dev too). > > I'm not sold on the whole test timeout thing. When I run tests, I > want to know the answer to the question, "is the software working?" Correct. > Putting a timeout on tests trades a slower, but definitive, "yes" or > "no" for a supposedly-faster "maybe". But is getting a "maybe" back > really faster? I've just incurred the cost of running a test suite, > but left without my answer. I get a "maybe", what am I supposed to do > next? Find a faster machine? Hack into the code to fiddle with a > timeout pragma? That's not faster.. See above. If you're thinking "maybe", then the timeout is too short. > But, the reason given for the change was not for running tests > interactively (the 99% case), rather, all tests form the beginning of > time are now saddled with a timeout for the 1% case: As the data shows, this is already the case. 
It may be interesting to note that so far there were a total of 5 (five) places that had to be adjusted in Squeak. One was a general place (the default timeout for the decompiler tests) and four were individual methods. Considering that computers usually don't become slower over time, it seems unlikely that further adjustments will be necessary here. So the bottom line is that the changes required aren't exactly excessive.

> "The purpose of the timeout is to catch issues like infinite loops,
> unexpected user input etc. in automated test environments."
>
> If tests are supposed to be quick (and deterministic) anyway, wouldn't
> an infinite loop or user-input be caught the first time the test was
> run (interactively)? Seriously, when you make software changes, we
> run the tests interactively first, and then the purpose of night-time
> automated test environment is to catch regressions on the merged
> code.

These changes are largely intended for automated integration testing. I am hoping to automate the tests for community supported packages to a point where there will be no user in front of the system. Even if there were, it's not clear whether that person can fix the issue immediately, or whether the entire process is stuck because someone can momentarily not fix the problem at hand and the tests will never run to completion and produce any useful result.

So the idea here is not that unit tests are *only* to catch regressions in previously manually tested (combinations of) code. The idea is to catch interactions and integration bugs, and to be able to produce a result even if there is no user to watch the particular combination of packages being loaded together in this particular form.

Perhaps that is our problem here? It seems to me that you're taking a view that says unit tests are exclusively for regression testing, and consequently there is no way a previously successful test would suddenly become unsuccessful in a way that makes it time out ... but you know, having written this sentence, it makes no sense to me. If we knew beforehand that tests fail only in particular known ways, we wouldn't have to run them to begin with. The whole idea of running the tests is to catch *unexpected* situations, and as a consequence there is value in capturing these situations instead of hanging and producing no useful result.

> In that case, the high-level test-controller which spits out the
> results could and should be responsible for handling "unexpected user
> input" and/or putting in a timeout, not each and every last test
> method..

Do you have such a "high-level test-controller"? Or do you mean a human being spending their time watching the tests run to completion? If the former, I'm curious as to how it would differ from what I did. If the latter, are you volunteering? ;-)

> IMO, we want short tests, so let's just write them to be short. If
> they're too long, then the encouragement to shorten them comes from
> our own impatience of running them interactively. Running them in
> batch at night requires no patience, because we're sleeping, and
> besides, the batch processor should take responsibility for handling
> those rare scenarios at a higher-level..

The goal of the timeouts is *not* to cause you to write shorter tests. If you're looking at it this way, you're looking at it from the wrong angle. Up your timeout to whatever you feel is sensible to have trust in the results of the tests.
As I said earlier, I'm quite happy to discuss the default timeout; it's simply that with some 95% coverage on a 10x safety margin, it feels to me that we're playing it safe enough for the remaining cases to have explicit timeouts.

Cheers,
  - Andreas

_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
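For concreteness, here is roughly what the two knobs discussed above look like in test code: a per-class default via #defaultTimeout and a per-method override via the <timeout:> pragma, both taking seconds as described earlier in the thread. The class and method names below are invented for illustration (ExampleSlowTest is assumed to be a TestCase subclass), and the numbers are arbitrary; this is only a sketch of how the 10x-margin guideline might be applied, not the exact Squeak changes.

ExampleSlowTest >> defaultTimeout
	"Every test in this class gets a generous bound, in seconds,
	 roughly 10x the slowest observed run."
	^ 30

ExampleSlowTest >> testVeryLargeInput
	<timeout: 120>
	"This single test is known to take much longer than its siblings,
	 so it is tagged explicitly rather than raising the class default."
	self assert: ((1 to: 1000000) inject: 0 into: [ :sum :each | sum + each ]) = 500000500000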
Thanks for clarifying your goals w.r.t. introducing the timeout. I
think that's important because, as I've said, legacy tests that live in external packages are affected. I read your whole note a few times, and one part in particular stuck out to me as a potentially useful use-case for test-case timeout:

> These changes are largely intended for automated integration testing. I am
> hoping to automate the tests for community supported packages to a point
> where there will be no user in front of the system.

If, by this, you mean you want to simply have a headless running squeak image which:

  [ true ] whileTrue: [ loadLatestPackageCombinations. runTestSuite. mailResultsToSqueakDev ]

THEN, that brings us down to only haggling about the default timeout, although I would still prefer to handle the timeout at a higher level.. If, however, this isn't the goal, then I still don't seem to have grasped, what I sense is, some key point.. or that my own concerns were properly understood. If so, let me try one more time. :)

> done" but the reality is that regardless of what the operation is we never
> actually wait forever. At some point we *will* give up no matter what you
> may think. This is THE fundamental point here. Everything else is basically
> haggling about what the right timeout is.

Of course we would "give up" after an unreasonable amount of time. In either case, there is something to interrogate: either a live looping test-runner machine, or a static report of test results with one or more that say "timed out". In the former case, we have a bevy of useful information (e.g., which test is it trying to run? How much memory is the test image using right now? Can I Alt+. interrupt it and get even more information?). In the latter case, there is no choice but to start at square 1: try to recreate the problem. (What if it works?) Personally, I would always prefer to deal with the former case than the latter..

> For the right timeout the second fundamental thing to understand is that if
> there's a question of whether the operation "maybe" completed, then your
> timeout is too short. Period. The timeout's value is not to indicate that
> "maybe" the operation completed, it is there to say unequivocally that
> something caused it to not complete and that it DID fail.

I didn't understand this. There is no question about "maybe completed". We know that if a test times out then it _didn't_ complete. The "maybe" I referred to was about the core question: whether the underlying software being tested can be used or not. "Maybe" it could; then again, maybe it shouldn't. It sounds like we agree: a timeout would *have* to be regarded as a failure.

> Obviously, introducing timeouts will create some initial false positives.

You mean false negatives? If we are saying that we must treat a timeout as failure, and failure is "negative", then a timeout would be either a false negative or a true negative....?

> But it may be interesting to be a bit more precise on what we're talking
> about. To do this I instrumented TestRunner to measure the time it takes to
> run each test and then ran all the tests in 4.2 to see where that leads us.
> As you might expect, the distribution is extremely uneven. Out of 2681 tests
> run 2588 execute in < 500 msecs (approx. 1800 execute with no measurable
> time); 2630 execute in less than one second, leaving a total of 51 that
> take more than a second and only three tests actually take longer than 5
> seconds and they are all tagged as such.

That's fine for the 4.2 tests, but there are hundreds of tests in external packages.
With a mere 5-second default, many will need to be updated with a pragma. But then we're talking about a branch in the package, because that won't be backward compatible with 3.9, will it?

> As you can see the vast majority of tests have a "safety margin" of 10x or
> more between the time the test usually takes and its timeout value.
> Generally speaking, this margin is sufficient to compensate for "other"
> effects that might rightfully delay the completion of the test in time.

I can see that jacking up the timeout may tend to reduce the number of false negatives (at the expense of potentially longer wait times!), but when they do occur, we have no useful information whatsoever. Not even certainty about whether the underlying software is usable or not, because it could be a false negative.

> If
> you have tests that commonly vary by 10x I'd be interested in finding out
> more about what makes them so unpredictable.

Well, again, it's not just about randomness in the tests but also about external factors: CPU speed, current system load, etc.

> So if your question is "are my timeouts too tight" one thing we could do is
> to introduce the 10x as a more or less general guideline for executing
> tests,

Ok, with that kind of margin, the message I'm getting from you is that it isn't about making a human have to wait. We just want to make sure we "get some kind of report"?

>> But, the reason given for the change was not for running tests
>> interactively (the 99% case), rather, all tests from the beginning of
>> time are now saddled with a timeout for the 1% case:
>
> As the data shows, this is already the case. It may be interesting to note
> that so far there were a total of 5 (five) places that had to be adjusted in
> Squeak.

I'm not worried about the built-in tests; recall I acknowledged that I can "almost understand" a forced timeout in the context of an open-source project where people are all contributing their portions and no one else wants to be "held up" because of one person's tests looping. My concern is more about the impact on legacy external packages..

> One was a general place (the default timeout for the decompiler
> tests) and four were individual methods. Considering that computers usually
> don't become slower over time, it seems unlikely that further adjustments
> will be necessary here.

Well, they do.. It's not just a function of time, but of who's running it, and on which machine. We all have different machines. Maybe someone wants to test on an iPhone that might be considerably slower than the original desktop on which the timeout was specified...

> So the bottom line is that the changes required
> aren't exactly excessive.

That depends on how many test methods I have, whether I want the package to be included as a Community Supported Package, whether I also want it to run in 3.9, and whether, to do that, I have to put in a pragma.. (unless I'm mistaken about pragmas working in 3.9).

Bottom line: Today Magma runs on 3.9 - 4.2 + Pharo. Some of Magma's tests necessarily take several minutes. Question: Can Magma be a CSP and still retain this wide compatibility?

> These changes are largely intended for automated integration testing. I am
> hoping to automate the tests for community supported packages to a point
> where there will be no user in front of the system.
> Even if there were, it's
> not clear whether that person can fix the issue immediately or whether the
> entire process is stuck because someone can momentarily not fix the problem
> at hand and the tests will never run to completion and produce any useful
> result.

Who is "that person" and what is their role?

> begin with. The whole idea of running the tests to catch *unexpected*
> situations and as a consequence there is value of capturing these situations
> instead of hanging and producing no useful result.

To me, "timed out" is what is not useful. To find a hanging machine that can be interrogated is much more useful.

>> In that case, the high-level test-controller which spits out the
>> results could and should be responsible for handling "unexpected user
>> input" and/or putting in a timeout, not each and every last test
>> method..
>
> Do you have such a "high-level test-controller"? Or do you mean a human
> being spending their time watching the tests run to completion? If the
> former, I'm curious as to how it would differ from what I did. If the
> latter, are you volunteering? ;-)

I meant the former. It differs from what you did in that it preserves legacy compatibility, and the legacy deterministic property of testing. To handle an automated test server, I would handle the on-timeout: from a much higher place, and therefore it would not be for individual tests, but for the whole suite. Information about the last running test would be sufficient for me, especially considering all of the other disadvantages I've mentioned for fine-grained timeouts..

- Chris

_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
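To make the "handle it at the suite level" alternative concrete, here is a minimal sketch of bounding a whole run from the outside instead of tagging each test. This is not an existing SUnit API, only an illustration: ExampleSlowTest is the hypothetical test class from the earlier sketch, the one-hour bound is arbitrary, and it assumes Semaphore>>waitTimeoutSeconds: with its usual true-on-timeout answer.

| suite result done timedOut |
suite := ExampleSlowTest suite.
result := TestResult new.
done := Semaphore new.
"Run the whole suite in a background process; signal when it finishes."
[ suite run: result. done signal ] fork.
"waitTimeoutSeconds: answers true if the wait timed out."
timedOut := done waitTimeoutSeconds: 3600.
timedOut
	ifTrue: [ Transcript show: 'Suite timed out; partial result: ', result printString; cr ]
	ifFalse: [ Transcript show: result printString; cr ]

Note that on timeout the forked run is still alive, so the image can still be interrogated (which test is stuck, how much memory it uses, and so on), matching the point made above about preferring a live machine over a static "timed out" report.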
In reply to this post by Stéphane Ducasse
I don't think it is necessary... I mean, if you are doing TDD, you write the test and you run it immediately; if it takes too much time to run (that is, more than 2 seconds :-)) then you press ctrl+. and problem solved...
I'm saying this because #should:notTakeMoreThan: has some interesting implementation details, like running the test in another process and synchronizing them, etc... so running all the tests would create a process for each test, etc., taking more time unnecessarily in most cases... unless the implementation takes care of only doing this special "feature" when needed (for example, when you have that pragma). But then, if you have to write the pragma, why not just send the message #should:notTakeMoreThan: and problem solved?
On Sat, May 29, 2010 at 3:42 PM, Stéphane Ducasse <[hidden email]> wrote:
For me the point is that I have the impression that this is an interesting feature to have

_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
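For reference, the message-based alternative mentioned above would look something like this inside a test method. The class and method names and the two-second bound are made up, and it is assumed here that #should:notTakeMoreThan: expects a Duration as its argument:

ExampleSlowTest >> testSummationIsFast
	"Bound just this block rather than the whole test method."
	self
		should: [ (1 to: 1000000) inject: 0 into: [ :sum :each | sum + each ] ]
		notTakeMoreThan: 2 seconds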
In reply to this post by Chris Muller-3
I completely agree
On Sun, May 30, 2010 at 2:52 PM, Chris Muller <[hidden email]> wrote:
(Copying squeak-dev too).

_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |