Squeak comes with a large set of SUnit tests. Unfortunately, some of
them don't work. As far as I can tell, there is NO recent version of
Squeak in which all the tests work.

This is a sign that something is wrong. The main purpose of shipping
tests with code is so that people making changes can tell when they
break things. If the tests don't work then people will not run them.
If they don't run the tests then the tests are useless. The current
set of tests is useless because of the bad tests. Nobody complains
about them, which tells me that nobody runs them. So, it is all a
waste of time.

If the tests worked then it would be easy to make a new version.
Every bug fix would have to come with a test that illustrates the bug
and shows that it has been fixed. The group that makes a new version
would check that all tests continue to work after the bug fix.

An easy way to make all the tests run is to delete the ones that don't
work. There are thousands of working tests and, depending on the
version, dozens of non-working tests. Perhaps the non-working tests
indicate bugs; perhaps they indicate bad tests. It seems a shame to
delete tests that illustrate bugs. But if these tests don't work, they
keep the other tests from being useful. Programmers need to know that
all the tests worked in the virgin image, and that if the tests quit
working, it is their own fault.

No development image should ever be shipped with any failing tests.

-Ralph Johnson
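To make the bug-fix-plus-test practice concrete: the test that
"illustrates the bug" is just an ordinary SUnit method. A minimal
sketch, with a purely hypothetical bug and class chosen only for
illustration:

    IntervalBugTest >> testEmptyDescendingIntervalHasNoElements
        "Hypothetical bug report: an interval whose stop lies below its
        start must be empty. The test documents the bug and, once the
        fix is in, guards against regressions."
        self assert: (1 to: 0) isEmpty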
Hi Ralph,
Of course you're right, this has been an issue for quite a while. I
think the problem is that tests have diverse domains of validity, and
there are neither abstractions nor infrastructure in place to support
them.

In theory (and often in practice) you run "the" test suite every few
minutes, and a test fails iff some code is broken. Wonderful!
Unfortunately, in a large-scale, distributed, diverse effort like
Squeak, things are more complicated. Examples:

- Platform-specific tests.
- Very long-running tests, which for most people don't give enough
  value for their machine time.
- Non-self-contained tests, for example ones that require external
  files to be present.
- Performance tests (only valid on reasonably fast machines, and this
  might change over time...).

All of these have some value in some context, but some cannot be
expected to be always green, and some aren't even worth running most
of the time. And the problem is that our choice about "where/when
should this test run" is currently binary - everywhere, or nowhere.
You say we should be more aggressive in making this binary decision,
but the reason this isn't happening is that sometimes neither option
is quite right. The community has moved back and forth between
extracting some/all tests into an optional package, but in practice
that just means they never get run.

Do you know of some set of abstractions/practices/framework to deal
with this problem?

Daniel Vainsencher
Daniel Vainsencher wrote:
> ...
> unfortunately, in a large-scale, distributed, diverse effort like
> Squeak, things are more complicated.
>
> Examples: ...
>
> And the problem is that our choice about "where/when should this test
> run" is currently binary - everywhere, or nowhere. ...
>
> Do you know of some set of abstractions/practices/framework to deal
> with this problem?

In the Lisp community (which is also based on image+package), the
abstraction for a software package (called a "system") encompasses
version, dependencies, and operation, where operation is generally
considered to include test.

A system is defined as a set of modules with a type tag, including
"test", as well as "foreign libraries", "documentation", Lisp "source
code", and other systems recursively. Modules define other metadata,
including various kinds of dependencies.

You perform an operation on a system, such as "load" or "test". The
machinery collects all the dependencies based on the operation and any
operations that the specified operation requires. This collection is
based on knowledge of what has already been successfully performed in
the current running image and what is still valid (e.g., that source
hasn't changed). The resulting partial orderings of dependencies are
then topologically sorted to produce a total ordering of operations on
modules.

The more general such system tools allow developers to define their
own operation and module types, without having to re-engineer the
system tools themselves. The result is that developers can pretty
readily test any combination of systems in a meaningful way. The code
for doing all this was really quite small and understandable. I was
part of a group who used it for planning manufacturing operations in a
factory.

Alas, every Lisp organization and nearly every programmer has written
his own version of this general mechanism, so no standard emerged (as
of my last experience with this, circa '99). I don't know whether this
means that the model wasn't quite right, or that Lisp programmers are
perverse.

References:
http://www.google.com/search?q=lisp+defsystem
http://www.google.com/search?q=lisp+define-system
http://www.google.com/search?q=lisp+waters+regression+test

--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
voice:+1-608-262-3724
In reply to this post by Daniel Vainsencher-6
On 11/1/06, Daniel Vainsencher <[hidden email]> wrote:
> Do you know of some set of abstractions/practices/framework to deal
> with this problem?

Yes. TestSuites can be used to group tests. That's what I was trying
to say in:
http://lists.squeakfoundation.org/pipermail/squeak-dev/2006-October/110461.html

...but I think that the mail was lost among all the mail that comes to
the list :(
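For anyone who hasn't used them directly, grouping with a TestSuite is
only a few lines; a sketch (the test-case class names here are just
examples, not a proposal for which tests belong together):

    | suite |
    suite := TestSuite named: 'Fast, always-green tests'.
    suite addTests: IntervalTest suite tests.
    suite addTests: StringTest suite tests.
    suite run    "answers a TestResult with passes, failures and errors"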
In reply to this post by Daniel Vainsencher-6
On 11/1/06, Daniel Vainsencher <[hidden email]> wrote:
> Do you know of some set of abstractions/practices/framework to deal
> with this problem?

Sure. Mostly practices. I think we have enough abstractions and
frameworks already, though they could be better.

Divide tests into the ones that you expect to be run by people
developing other packages, and those run by developers of your
package. The first are going to be included with your package, and the
second will be in a separate package in MC.

There is a base image that includes some set of packages. Other
packages are said to "work with the base image". The tests in the base
image all run on all platforms. There might be platform-specific
tests, but they are not in the base image. If a package works with the
base image then when you load it, all of the original tests in the
image will work, and the tests with the package will work. Presumably,
the private tests for the package will work, too, but that is up to
the developer of the package.

Of course, just because two packages work with the base image does not
mean that they will work with each other. It makes sense to have a
"universe" in which any combination of packages in the universe will
work with each other, as well as with the base image. This takes more
testing and certification, and there has to be a "universe maintainer"
who does this. In theory it is easy, in practice it is a lot of work.
But much of the work can be automated. We can worry about this after
we have a base image in which all tests work.

The first priority is to create a world in which developers can assume
that any broken tests are their fault. The current set of tests in
Squeak is not too bad. I think they will take on the order of half an
hour on a fast machine, so it is possible for a developer to run them
all before releasing code.

People do not run all the tests every time they make a little change,
no matter what the books say. People tend to pick the most relevant
test suites and run them after each little change, so those tests will
run in just a few minutes. They don't run long tests very often. So, I
do not think that speed is the problem.

The problem is that all the tests in a release image should work.
Period. Either fix them or remove them. From then on, if someone
offers a patch and the patch breaks some tests, reject the patch.
Never release an image with broken tests. Don't accept code that
breaks tests unless you are trying to help the author and plan to fix
the tests yourself.

-Ralph
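As a point of reference, running everything in one go needs nothing
beyond what SUnit already ships; a sketch of the kind of doit a
developer (or release team) could run before publishing an image:

    "Build one suite from every TestCase subclass in the image and run it.
     On a release candidate the result should report no failures and no errors."
    | result |
    result := TestCase buildSuite run.
    Transcript show: result printString; cr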
In reply to this post by Diego Fernández
Hello all,
I just published a new version of ICal (well, a couple; ignore the
first jbjohns) that now supports querying which occurrences of an
event are described by a recurrence rule. The public API consists of
the following 6 methods:

ICEvent>>occurrences
ICEvent>>occurrencesAfter: aTimeSpan
ICEvent>>occurrencesBetween: aStartTimeSpan and: anEndTimeSpan
ICEvent>>occurrences: aNumber
ICEvent>>occurrences: aNumber after: aTimeSpan
ICEvent>>occurrences: aNumber between: aStartTimeSpan and: anEndTimeSpan

The first two require the rule to have a count or until directive,
since otherwise the set would be infinite (I will consider infinite
sets later :) ). All methods are constrained by the rule (i.e. if the
rule has a count directive of 4, then occurrences: 6 will still return
only 4). The ICEvent>>isValidForDate: method was also changed, so that
it checks if the given date is in the set.

Things to be aware of:

Right now the API only works for monthly recurrence rules, but I plan
to put in more soon (I will be focusing on Weekly and above). The rest
will spit out some "does not understand" messages for the occurrence
methods, but otherwise, everything works as before.

Right now the occurrence methods just return an ordered list of dates.
I haven't decided yet what should be returned (just a DateAndTime, or
maybe a complete event representing that day?), so I have just
deferred for now. Let me know what would be the most useful to you.
The classes won't change and the API listed above won't change, but
the methods in the ICFrequency classes will be moved around some.

BUG: If your ICEvent uses multiple rules and they have common dates
between them, they will all be in the returned set (i.e. there can be
multiples of the same date). I am thinking of using some other data
structure than OrderedCollection to fix this problem, and remove the
need for sorting to happen in various spots throughout.

ExclusionRules are not handled at the moment.

The methods are designed for TimeSpan resolution, but right now some
of the lower-level methods work on Dates. This should mostly be
transparent, except that the sets returned are Dates instead of
DateAndTimes like they should be. :)

Thanks. Hope this is useful to someone. :)

Jason
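A usage sketch, in case it helps (anEvent stands for an ICEvent
already configured with a monthly recurrence rule; only the selectors
listed above are used, and the setup itself is elided because it
depends on how the rule was built):

    anEvent occurrences.                             "every date the rule generates (needs count or until)"
    anEvent occurrences: 4.                          "at most four, still bounded by the rule"
    anEvent occurrencesBetween: aStart and: anEnd.   "only the occurrences inside the span"
    anEvent isValidForDate: aDate                    "true iff aDate is in the occurrence set"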
In reply to this post by Ralph Johnson
On 11/1/06, Ralph Johnson <[hidden email]> wrote:
> No development image should ever be shipped with any failing tests.

... except if those tests are really customer/functional tests. They
aren't expected to run "green" 100% of the time. I don't know if that
is the case with these tests (I suspect that it is not). What would be
useful is to package up tests into real unit/developer tests (as has
been suggested) and functional tests.

--
Jason Rogers

"Where there is no vision, the people perish..."
    Proverbs 29:18
In reply to this post by Ralph Johnson
Daniel wrote:
> I think the problem is that tests have diverse domains of validity,
> and there are neither abstractions nor infrastructure in place to
> support them.
>
> Examples:
> - Platform specific tests.
> - Very long running tests, which for most people don't give enough
>   value for their machine time.
> - Non-self-contained tests, for example ones that require external
>   files to be present.
> - Performance tests (only valid on reasonably fast machines. And this
>   might change over time...)
> ...
> Do you know of some set of abstractions/practices/framework to deal
> with this problem?

Yes! Or at least a step in the right direction ;-)

SSpec 0.13, which I have recently ported to Squeak, is hardwired to
define a suite of tests using the method category 'specs', whereas, as
you know, SUnit effectively hardwires the definition of a suite of
tests by methods beginning with 'test*'. In order to integrate the two
with the same TestRunner, I have made steps to combine the two with a
more generic and flexible solution.

For many years I have been adding code to SUnit that allows a TestCase
to publish a number of different test suites for different contexts,
as suggested above. I have now added and extended this feature in
SUnit in the hope that it may be adopted in 3.9+. I have also extended
TestRunner to provide a UI for selecting which published suite(s) to
run.

All that is needed is to define some conventions for naming suites
that the community may find useful, including I presume: tests that
should always pass in an image release, tests that are specific to a
particular release, tests that illustrate bugs to be addressed, tests
that highlight when certain fixes have not been loaded or the external
environment is/is not as expected, long tests, and test suites
associated with particular products or specialist releases.

I have also taken the liberty of reorganizing the class categories
from SUnit in order to integrate more nicely with SSpec. I adopted the
following top-level categories, which I put forward as a suggestion
for 3.10 if others are agreeable:

Testing-SUnit
Testing-SSpec
Testing-Common
Testing Tests-SUnit
Testing Tests-SSpec

How the scheme works. A TestCase (or SpecContext) class defines
#publishedSuites such that

TestCase class>>publishedSuites
    ^#( #allStandardTests #longTests #squeak39release #knownBugs #allStandardAndLongTests)

Each of the nominated published suites defines a match string which
defines the particular test suite. The match string supports '|' for
'or', so as to support multiple matches. The match string also matches
against both the method name and the method category name together:

<method match string>@<category match string>

TestCase class>>allStandardTests
    ^ '*test|*@tests*'

TestCase class>>longTests
    ^ '*longtest'

TestCase class>>allStandardAndLongTests
    ^ 'test*|longTest*'

The new API for building suites is based upon #suite: . E.g.
(myTestCase suite: '*@mytests') would return a test suite consisting
of all the test methods in the category 'mytests'. The TestRunner can
build a single suite across multiple classes by using the class-side
#suite:addTo: (i.e. suite: <match> addTo: <suite>), together with the
information gathered from #publishedSuites.

You will find the code and the test runner in
http://www.squeaksource.com/Testing

I haven't finished the SSpec integration with TestRunner, though SSpec
can be used with the TestRunner.

Enjoy, and do let me know what you think.

Keith
In reply to this post by Ralph Johnson
+1
>From: "Ralph Johnson" <[hidden email]> >Reply-To: The general-purpose Squeak developers >list<[hidden email]> >To: "The general-purpose Squeak developers >list"<[hidden email]> >Subject: Tests and software process >Date: Wed, 1 Nov 2006 07:30:30 -0600 > >Squeak comes with a large set of SUnit tests. Unfortunately, some of >them don't work. As far as I can tell, there is NO recent version of >Squak in which all the tests work. > >This is a sign that something is wrong. The main purpose of shipping >tests with code is so that people making changes can tell when they >break things. If the tests don't work then people will not run them. >If they don't run the tests then the tests are useless. The current >set of tests are useless because of the bad tests. Nobody complains >about them, which tells me that nobody runs them. So, it is all a >waste of time. > >If the tests worked then it would be easy to make a new version. >Every bug fix would have to come with a test that illustrates the bug >and shows that it has been fixed. The group that makes a new version >would check that all tests continue to work after the bug fix. > >An easy way to make all the tests run is to delete the ones that don't >work. There are thousands of working tests and, depending on the >version, dozens of non-working tests. Perhaps the non-working tests >indicate bugs, perhaps they indicate bad tests. It seems a shame to >delete tests that are illustrating bugs. But if these tests don't >work, they keep the other tests from being useful. Programmers need >to know that all the tests worked in the virgin image, and that if the >tests quit working, it is there own fault. > >No development image should ever be shipped with any failing tests. > >-Ralph Johnson > _________________________________________________________________ Find a local pizza place, music store, museum and more then map the best route! http://local.live.com?FORM=MGA001 |
In reply to this post by keith1y
Errata in previous message:
> The new API for building suites is based upon #suite: . E.g.
> (myTestCase suite: '*@mytests') would return a test suite consisting
> of all the test methods in the category 'mytests'.

I had forgotten that the most recent incarnation of this API already
works in conjunction with #publishedSuites, so you obtain a suite with
a call to #suite:, supplying the publishedSuite selector:

myTestCase suite: #allStandardTests

or

myTestCase suite: #longTests

best regards

Keith
In reply to this post by J J-6
-1

No development image should ever be shipped with any failing tests
without some context to explain why they are failing.

There may be tests that are intended to fail, particularly those that
are intended to validate the external environment for a product. Of
course the most annoying case is the example failure-raising test case
in SUnit that is placed there for beginners to see what a failure
looks like.

So, better to say that no development image should ever be shipped
with any failing tests that are associated with the domain of
'ensuring that the development image works as expected'. There may be
other domains, such as 'known bugs still to be fixed'.

Keith
On 11/1/06, Keith Hodges <[hidden email]> wrote:
> > No development image should ever be shipped with any failing tests.
> >
> > -Ralph Johnson
>
> -1
>
> No development image should ever be shipped with any failing tests
> without some context to explain why they are failing.
>
> There may be tests that are intended to fail, particularly those that
> are intended to validate the external environment for a product. Of
> course the most annoying case is the example failure-raising test
> case in SUnit that is placed there for beginners to see what a
> failure looks like.
>
> So, better to say that no development image should ever be shipped
> with any failing tests that are associated with the domain of
> 'ensuring that the development image works as expected'. There may be
> other domains, such as 'known bugs still to be fixed'.

The first thing I do when I start working with a new image is to
delete the SUnit tests. The one that always fails is especially
annoying. It is useful for people porting SUnit to a new platform, but
it is not useful to most people.

"Known bugs still to be fixed" should be a separate package that you
can load if you are going to fix the bugs. There should not be any
SUnit tests like that in a released, stable image. Not even in an
alpha or a beta image.

One of the main purposes of tests is to let you know when you broke
something. They will not have this function as long as some of them
fail. If tests fail then either 1) delete them, 2) fix the code so
they no longer fail, or 3) move them to a package on MC, or anywhere
not in the image.

-Ralph Johnson
> "Known bugs still to be fixed" should be a separate package that you > can load if you are going to fix the bugs. There should not be any > SUnit tests like that in a released, stable image. Not even in an > alpha or a beta image. This might be an ideal, but I don't think that it is practical, since an individual bug to fix test may well (should) exist in the context of the other tests for that package/set of functionality. Making a separate change-set or package for such a test just seems too much. Once the fix is made then the individual bug-test method from the separate-bugs-to-fix-package would then have to be integrated etc etc. I would prefer to have the full information available to me in order to assess the state of an image, whether declared stable or not. For me the 50 or so tests that fail in the 3.9 image would be ok if they were in a "tests we expect to fail" category. With the scheme I propose, you select the "tests for release 3.10" category of tests, and hit run, if all the tests pass then great. The existence of miscellaneous snippets of code in the image that are not part of that category is simply an ignorable artefact for the purpose of validating that release. Those snippets might be named #bugtestMyBug or #release39Test or #extraLongTest. my 2p Keith ___________________________________________________________ All new Yahoo! Mail "The new Interface is stunning in its simplicity and ease of use." - PC Magazine http://uk.docs.yahoo.com/nowyoucan.html |
In reply to this post by Ralph Johnson
Ralph Johnson wrote:
> Squeak comes with a large set of SUnit tests. Unfortunately, some of
> them don't work. As far as I can tell, there is NO recent version of
> Squeak in which all the tests work.
>
> This is a sign that something is wrong.

Yup. To strengthen the upcoming trend of "do something", I have
investigated all the failing test cases in a 3.9-RC3-7066 image. The
results are at http://wiki.squeak.org/5889 - please feel free to
comment.

Incidentally, there are very few classes of problems which are
responsible for most failing cases:

One (which causes half of the failures and errors) is missing features
in the MVC implementation of ToolBuilder. In my opinion, the
MVCToolBuilderTests should simply clamp these down by overriding the
test cases which cannot possibly work in MVC with empty methods.

Then there are a number of FloatMathPlugin tests which just test
whether a sequence of floating point operations on a huge number of
pseudorandom floats exactly yields a specified result. In one case,
the result on my machine is not equal to the result specified in the
test, but in more cases the pseudorandom inputs are simply not
applicable to the mathematical functions under test. This indicates a
problem with the test and not with the plugin.

There are a small number of SqueakMap and Monticello tests which I
don't understand. These should be checked by the developers.

One test (ReleaseTest>>#testUnimplementedNonPrimitiveCalls) should
simply not be a unit test. This is a lint test which may be valuable
as far as it concerns your own code, but unless we want a very rigid
release regime this does not make sense here.

What's left is a very short list of genuine bugs. Some are simple to
fix, others probably require intensive debugging.

Expect less than 10 failing unit tests in 3.9 by the end of this week.

Cheers,
Hans-Martin
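For clarity, the "clamping down" would amount to overriding each
inapplicable inherited test in MVCToolBuilderTests with an empty
method, roughly like this (the selector is illustrative; the real
candidates are whichever inherited tests exercise Morphic-only
features):

    MVCToolBuilderTests >> testSetScrollBarVisibility
        "Not applicable under MVC; intentionally left empty so the
        inherited test no longer counts as a failure."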
>> Squeak comes with a large set of SUnit tests. Unfortunately, some of
>> them don't work. As far as I can tell, there is NO recent version of
>> Squeak in which all the tests work.
>>
>> This is a sign that something is wrong.

agreed.

> Yup. To strengthen the upcoming trend of "do something", I have
> investigated all the failing test cases in a 3.9-RC3-7066 image. The
> results are at http://wiki.squeak.org/5889 - please feel free to
> comment.
>
> Expect less than 10 failing unit tests in 3.9 by the end of this week.
>
> Cheers,
> Hans-Martin

Keith
In reply to this post by Hans-Martin Mosner
Hans-Martin Mosner wrote:
> Then there are a number of FloatMathPlugin tests which just test
> whether a sequence of floating point operations on a huge number of
> pseudorandom floats exactly yields a specified result. In one case,
> the result on my machine is not equal to the result specified in the
> test, but in more cases the pseudorandom inputs are simply not
> applicable to the mathematical functions under test. This indicates a
> problem with the test and not with the plugin.

I wrote those tests to make sure we have consistent (bit-identical)
results for various floating point functions across different Croquet
VMs. How these tests ended up in 3.9 I have no idea - they are part of
Croquet, for sure, and in the context of Croquet they make perfect
sense (and they pass if you use a Croquet VM and they fail if you
don't - which is exactly what they should do).

To me, it points out more a problem with the selection of code being
put into the base image rather than any failing of the test itself.
The test is meaningful in the context it was designed for.

Cheers,
  - Andreas
Andreas Raab wrote:
> I wrote those tests to make sure we have consistent (bit-identical)
> results for various floating point functions across different Croquet
> VMs. How these tests ended up in 3.9 I have no idea - they are part
> of Croquet, for sure, and in the context of Croquet they make perfect
> sense (and they pass if you use a Croquet VM and they fail if you
> don't - which is exactly what they should do).

That's good. Just out of curiosity: Does the FloatMathPlugin used in
Croquet fail on any invalid inputs (e.g. numbers outside the range
-1..1 for arcCos) or does it return NaN? I'd guess it returns NaN,
because otherwise some of the tests could not possibly succeed.

> To me, it points out more a problem with the selection of code being
> put into the base image rather than any failing of the test itself.
> The test is meaningful in the context it was designed for.

Agreed. As far as I remember, the FloatMathPlugin for Croquet uses a
software implementation for some operations to achieve the goal of
bit-identical computation on all platforms. This probably means that
the functions are quite a bit slower, so including this in an
environment where the requirement is not present does not make much
sense. So removing these tests from the general Squeak image seems
like the reasonable thing to do, right?

BTW, I will try to run the tests with a Croquet VM soon, so then I
will know the answer to my first question :-)

Cheers,
Hans-Martin
Hans-Martin Mosner wrote:
> Andreas Raab wrote:
>> I wrote those tests to make sure we have consistent (bit-identical)
>> results for various floating point functions across different
>> Croquet VMs. How these tests ended up in 3.9 I have no idea - they
>> are part of Croquet, for sure, and in the context of Croquet they
>> make perfect sense (and they pass if you use a Croquet VM and they
>> fail if you don't - which is exactly what they should do).
> That's good. Just out of curiosity: Does the FloatMathPlugin used in
> Croquet fail on any invalid inputs (e.g. numbers outside the range
> -1..1 for arcCos) or does it return NaN? I'd guess it returns NaN,
> because otherwise some of the tests could not possibly succeed.

Actually, this is also not the latest version of these tests. In
Croquet we use the CroquetVMTests suite, which includes these and
other tests. And yes, the plugin fails (I added that when noticing
that - thanks to IEEE 754 - different platforms would report different
bit-patterns for NaN, all in compliance with the spec!) and the
exception is handled by simply resuming with NaN so that the test can
successfully complete.

>> To me, it points out more a problem with the selection of code being
>> put into the base image rather than any failing of the test itself.
>> The test is meaningful in the context it was designed for.
> Agreed. As far as I remember, the FloatMathPlugin for Croquet uses a
> software implementation for some operations to achieve the goal of
> bit-identical computation on all platforms. This probably means that
> the functions are quite a bit slower, so including this in an
> environment where the requirement is not present does not make much
> sense. So removing these tests from the general Squeak image seems
> like the reasonable thing to do, right?

Yes. (Although it's not as slow as one may think, as long as you can
use a "native" sqrt instruction, which is fortunately the _one_
instruction that the FPUs seem to agree upon.)

> BTW, I will try to run the tests with a Croquet VM soon, so then I
> will know the answer to my first question :-)

Run the CroquetVMTests instead. Those are really the relevant ones.

Cheers,
  - Andreas
In reply to this post by Ralph Johnson
"Ralph Johnson" <[hidden email]> writes:
> Squeak comes with a large set of SUnit tests. Unfortunately, some of
> them don't work. As far as I can tell, there is NO recent version of
> Squeak in which all the tests work.

Known-failing tests should be marked in some way or another. Thus far
people have proposed putting them in separate packages so that you can
simply unload them.

That is not a bad solution. However, it would be better if you could
load a known-failing test without having the tools bother you. Then
you can see the tests and mess with them. To achieve this, however,
you need to have a way to mark this in the image.

The simplest way I can think of is to rename the methods for
known-failing tests. Right now, a method named testFoo is a unit test.
We could change that so that pendtestFoo is a pending unit test that
is known not to work.

-Lex
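A rough sketch of how that convention could be queried, assuming
nothing beyond the renaming itself (the selector below is an invented
helper, not existing SUnit API):

    TestCase class >> pendingTestSelectors
        "Answer the known-failing tests, marked by renaming testFoo to
        pendtestFoo. Because they no longer begin with 'test', the
        normal suite-building machinery skips them automatically; tools
        that want to show or run them can collect them here."
        ^ self selectors select: [:each | each beginsWith: 'pendtest']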
I apologize in advance for going off again. (See my earlier response
to the call for info/best-practices.) But I can't help but think about
"those who fail to learn from history..." Here we are with all these
wonderful and practical ideas about doing just "one more thing" to
make stuff really work, and I'm unhelpfully being abstract. Sorry. And
I don't mean to discourage anyone. Just giving a heads-up.

What's fundamentally at issue with tests and with packages? At their
core, these are both declarative-style collections of things to be
operated on. The "win" is that the user doesn't need to manually carry
out the operations, and that the system can manage the combination of
collections and operations. (E.g., interleave complex combinations.)

The key point I'm trying to make is to recognize that we're dealing
with collections of stuff that can be combined, and with operations
that can be combined. As I understand them, SUnit and MC don't really
handle combinations of collections, nor do they interact with each
other for combinations of operations. (Monticello configurations are a
step in this direction, and before/after scripts in MC can be used to
procedurally achieve some manual combination of operations, but then
you're losing the power of the declarative definitions.)

So my feeling is that the various improvements to MC and SUnit are
just messing with the margins. Half a loaf.

[This community will quite rightly shout, "Go ahead and do it!" I'll
answer in advance that I'm working on other issues. This isn't on my
critical path. Besides, my meta-point is that this whole area is an
already-solved problem. If I had a student to throw at this...]

-H

[Another possible objection to the idea of combination is that what
Lisp did sounds like Mark Twain's comments on smoking: "It's easy to
quit smoking. I've done it hundreds of times!" The fact that this
problem has "been solved" so many times might mean that it hasn't. My
personal view is that it HAS been solved, but that there are other
(social) issues that have caused it to be repeatedly solved in the
Lisp community.]

Lex Spoon wrote:
> Known-failing tests should be marked in some way or another. Thus far
> people have proposed putting them in separate packages so that you
> can simply unload them.
>
> [...]
>
> The simplest way I can think of is to rename the methods for
> known-failing tests. Right now, a method named testFoo is a unit
> test. We could change that so that pendtestFoo is a pending unit test
> that is known not to work.
>
> -Lex

--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
mailto:[hidden email]
jabber:[hidden email]
voice:+1-608-262-3724