I would agree that grey is better than red, but I personally think we’re being too pedantic about this, particularly when doing TDD and coding in the debugger. If I’m writing straightforward tests and like to see a red failure (either by deliberately returning false, or a subclass responsibility, or -1) and then correct that failure in the debugger to return the correct result, then it’s tedious to have to run the test again (in fact it feels odd). For the rare times I get it wrong, the mistake will show up when I run all tests in the next phase.
I accept that others may see this the other way around, but I’m a more optimistic guy. That said, maybe we make it an option (or an easy code switch); I’d default it to the optimistic TDD mode personally. My CI server will give me the full lowdown.

Tim

> On 15 Nov 2017, at 21:50, Sean P. DeNigris <[hidden email]> wrote:
>
> Richard Sargent wrote
>> I would go a little further. Any method modified by the developer during
>> the course of running a test voids the ability to claim the test
>> succeeded.
>> Likewise, for any object edited in an inspector.
>
> That makes sense to me.
>
> -----
> Cheers,
> Sean
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
While we are discussing colors, what should we do about a test that does not make any assertions at all?
A couple of years ago, a smart student who was working on a testing dialect for me decided that such tests should be in a new category all of their own. Later I simplified things and just made these tests fail. I am pretty convinced now that this is the right behaviour. It certainly suits my purposes (teaching TDD); if the student omits to make any assertions, they have a failing test. The message is: "Failure: test made no assertions".

Currently, such a test is green in Pharo. In 2013, I filed a bug report on a bunch of tests of #printOn: on collections that made no assertions. (The test writer didn’t understand how streams worked, and made assertions for each element of the empty string.) I was reminded of this just this week, because those tests have yet to be fixed. I suspect that if they had been yellow, rather than green, then they would have been fixed before now.

I plan to fix those tests on Friday, but I also wonder about changing the behaviour of the testing framework. What do you think?

Andrew
A test that asserts nothing has only asserted that the code ran through without throwing an error. :-) I like your proposal that such a test is an inherently failing test. (Of course, that will result in the student adding a single assertion at the end, of the form "self assert: true"!)

On Tue, Nov 21, 2017 at 4:07 PM, Prof. Andrew P. Black <[hidden email]> wrote:

While we are discussing colors, what should we do about a test that does not make any assertions at all?
Hi Andrew,

I like your idea. It is fun, and at least we can spot such tests. Sometimes we will have to write self assert: true, because simply running some code can also be considered a test. What I do not like is empty test methods, because they are green.

stef

On Tue, Nov 21, 2017 at 10:07 PM, Prof. Andrew P. Black <[hidden email]> wrote:
> While we are discussing colors, what should we do about a test that does not make any assertions at all?
>
> A couple of years ago, a smart student who was working on a testing dialect for me decided that such tests should be in a new category all of their own. Later I simplified things and just made these tests fail. I am pretty convinced now that this is the right behaviour. It certainly suits my purposes (teaching TDD); if the student omits to make any assertions, they have a failing test. The message is: "Failure: test made no assertions".
>
> Currently, such a test is green in Pharo. In 2013, I filed a bug report on a bunch of tests of #printOn: on collections that made no assertions. (The test writer didn’t understand how streams worked, and made assertions for each element of the empty string.) I was reminded of this just this week, because those tests have yet to be fixed. I suspect that if they had been yellow, rather than green, then they would have been fixed before now.
>
> I plan to fix those tests on Friday, but I also wonder about changing the behaviour of the testing framework. What do you think?
>
> Andrew
BTW I like grey for a method that was edited in the debugger.
Because red is not good. The proof is that when I rerun, my test is green. Whereas saying "I do not know, you have to rerun" is a good solution. I like it. I always got frustrated with the red.

On Fri, Nov 24, 2017 at 12:05 AM, Stephane Ducasse <[hidden email]> wrote:
> Hi Andrew
>
> I like your idea. It is fun, and at least we can spot such tests.
> Sometimes we will have to write self assert: true, because simply
> running some code can also be considered a test.
> What I do not like is empty test methods, because they are green.
>
> stef
>
> On Tue, Nov 21, 2017 at 10:07 PM, Prof. Andrew P. Black
> <[hidden email]> wrote:
>> While we are discussing colors, what should we do about a test that does not make any assertions at all?
>>
>> A couple of years ago, a smart student who was working on a testing dialect for me decided that such tests should be in a new category all of their own. Later I simplified things and just made these tests fail. I am pretty convinced now that this is the right behaviour. It certainly suits my purposes (teaching TDD); if the student omits to make any assertions, they have a failing test. The message is: "Failure: test made no assertions".
>>
>> Currently, such a test is green in Pharo. In 2013, I filed a bug report on a bunch of tests of #printOn: on collections that made no assertions. (The test writer didn’t understand how streams worked, and made assertions for each element of the empty string.) I was reminded of this just this week, because those tests have yet to be fixed. I suspect that if they had been yellow, rather than green, then they would have been fixed before now.
>>
>> I plan to fix those tests on Friday, but I also wonder about changing the behaviour of the testing framework. What do you think?
>>
>> Andrew
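One way to picture the grey idea: cache each test's last verdict together with the methods it executed, and downgrade the verdict to grey whenever one of those methods is edited. This is a hypothetical sketch in Python, not the Pharo mechanism (all names invented; a real implementation would hook method recompilation in the image):

```python
class ResultCache:
    """Sketch: per-test verdicts that go stale when covered code changes."""

    GREEN, RED, GREY = "green", "red", "grey"

    def __init__(self):
        self._verdict = {}   # test name -> colour of last run
        self._covers = {}    # test name -> set of methods it executed

    def record(self, test, passed, methods_executed):
        """Store the outcome of a completed run and what it touched."""
        self._verdict[test] = self.GREEN if passed else self.RED
        self._covers[test] = set(methods_executed)

    def method_edited(self, method):
        """A method changed (e.g. in the debugger): any test whose last
        run executed it no longer has a trustworthy verdict, so show it
        grey until it is rerun."""
        for test, methods in self._covers.items():
            if method in methods:
                self._verdict[test] = self.GREY

    def colour(self, test):
        # Never-run tests are also grey: "I do not know".
        return self._verdict.get(test, self.GREY)
```

The point of the design is exactly the one made in this thread: grey claims nothing, whereas leaving the old green or red in place claims knowledge the runner no longer has.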
On 16 November 2017 at 00:20, Richard Sargent <[hidden email]> wrote:
So to summarise the viewpoints as I understand them, consider an interrupted test that later runs to completion and is then...

A. incorrectly marked red (or grey) [false negative]
   ==> frequent during TDD
   ==> need to manually rerun the test *every* time
   ==> large extra effort for the developer

B. incorrectly marked green [false positive]
   ==> infrequent (presumed)
   ==> error picked up any time the test is run again, or when the test group is run
   ==> small extra effort for the developer
   ==> the developer may be aware when they make suspect changes which undermine the test result, and can judge whether to run it again

C. test is automatically rerun a second time
   ==> infrequently, some tests run too long for this to be practical

A and B are like the philosophical difference between engineers and scientists, i.e. engineers deal with approximations** that make the design process more efficient. For C, I still don't fully understand the concrete problem. How much time do such tests take? Anyway, I prefer early efficiency with late-bound correctness, so my vote would be to avoid A.
There are costs associated with perfect correctness, i.e. costs associated with both false negatives and false positives. The alternative approach is to consider the consequences of a temporary false positive. I'd vote for making things as simple and *efficient* as possible.

cheers -ben

** A mathematician, a scientist, and an engineer are given the task of finding out how high a particular red rubber ball will bounce when dropped from a given height onto a given surface. The mathematician derives the elasticity of the ball from its chemical makeup, derives the equations to determine how high it will bounce, and calculates it. The physicist takes the ball into the lab, measures its elasticity, and plugs the variables into a formula. The engineer looks it up in his red rubber ball book.
Ben, I think you understand the dilemma.

My philosophy is to avoid claims that are not provably true, i.e. don't colour it green, because the test runner cannot claim that knowledge. I am agnostic about whether it should be left uncoloured or coloured red to document the fact that it failed (as in, it was unable to run to completion without a problem). I truly don't care.

Our profession has a tendency to make claims which aren't provably true. It is one of the many reasons we have poor software. So I am adamant about avoiding that particular mistake. And I am adamant about dissuading others from the same kinds of mistakes. Write software which conveys exactly what it can claim without deus ex machina intervention.

The test failed. It was unable to run to completion without an error. It doesn't matter whether that particular error was corrected or not. The software running the test is incapable of determining whether the change that allowed the test to run to completion was code, data editing, or something else, and it is incapable of knowing whether whatever allowed the test to finish is a correct fix for the error.

Be adamant about what your software can know to be true. Equally important, be adamant about your own approach to problems and solutions. Gödel had some good advice.

On Nov 23, 2017 19:31, "Ben Coman" <[hidden email]> wrote: