While I've been enjoying the fantastic performance improvements we've seen from Cog onward, one thing I've been less excited about is some of the stability/functionality issues I've been running into. They are not numerous (maybe half a dozen or so major ones in the last 5 years) but they are getting quite tedious to isolate and replicate. Recent examples that come to mind include the 64-bit primHighResClock truncation and the 'could not grow remembered set' issues. (My current joy is a case where I have an #ifTrue: block that doesn't get executed unless I convert it to an #ifTrue:ifFalse: with a no-op for the ifFalse:. I'll provide a reproducible test case as soon as I'm able. The specific issue isn't the issue; rather, it's that I keep hitting things like this that seem fundamental yet edge-casey at the same time.)
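(To illustrate the shape of the workaround described above -- a minimal sketch only, since the actual failing code hasn't been published yet; #doTheWork and someCondition are hypothetical stand-ins:)

    "Original form: on the affected VM the guarded block is silently skipped."
    someCondition ifTrue: [self doTheWork].

    "Workaround form: adding a no-op ifFalse: arm compiles to a different
     bytecode sequence, and the ifTrue: block then runs as expected."
    someCondition
        ifTrue: [self doTheWork]
        ifFalse: [].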
I don't expect perfection, as a phenomenal amount of progress is being made by a small group of people, but I am beginning to wonder whether the existing unit tests are sufficient to adequately exercise the VM, i.e. so that the VM developers are aware when a recent change may have broken something. Or are the existing tests mainly oriented towards image and bytecode VM development? Just some food for thought; I wanted to see whether it's just me having these sorts of issues...

Thanks,
Phil
Hi Phil,

That's probably right. Here is what we're up against:

1) There is a lack of smoke tests. The problems encountered so far include:
* work in progress in the core VM or plugins
* wrong configuration of Pharo target directories or credentials
* failure to build a library due to tool changes at AppVeyor/Travis (this was the case for most of 2017 but is fortunately fixed now)
* stale or intermittent links (URLs): for example, the build downloads things from the network (like cygwin updates) that sometimes fail
The introduction of new bugs could be prevented if the feedback were correct (no false alarms), but that hasn't really been the case until now (lots of parasites).

2) We run after too many hares, that is, a combination of:
* as said above, we build too many configurations
* Pharo has introduced a lot of dependencies on external libraries; this leads to either long build times or the use of caches that delay detection of new failures
I certainly forgot threaded FFI in the above list, plus the register-efficient JIT variants... Someone has to do the work (or pay for it)...

3) We all know that dev branches (feature branches) would help a lot with some of the above problems, but a lot of the changes required for SISTA, 64 bits and the JIT variants are competing, and parallel branches would create conflicts and would not work without regular syncing. That explains why all the branches are gathered into one giant and complex branch today... It's still possible to generate code for a single plugin (if non-concurrent), but that prevents working in parallel branches as soon as the core code generation is changed in VMMaker.

4) Build status feedback is very slooowww.

In recent posts, I saw brilliant young people under-estimating a bit the work involved and the complexity of the task. But maybe the current state is at the limit of sustainability. And maybe it's time to drop some drag.
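(To make the "smoke tests" idea in point 1 concrete, a minimal SUnit sketch, assuming a Squeak-style class definition; the class name, category, and test bodies are illustrative, not an existing suite:)

    TestCase subclass: #VMSmokeTest
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'VMMaker-SmokeTests'.

    VMSmokeTest >> testConditionalBranch
        "Guard against miscompiled conditional jumps like the #ifTrue: case Phil reports."
        | ran |
        ran := false.
        true ifTrue: [ran := true].
        self assert: ran

    VMSmokeTest >> testFullGCPreservesObjects
        "Exercise the garbage collector directly; a crash or corruption here
         points at the VM rather than at image-level code."
        | anchor |
        anchor := (1 to: 1000) asArray.
        Smalltalk garbageCollect.
        self assert: anchor size = 1000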
Hi Phil,

> On Mar 30, 2018, at 1:35 PM, Phil B <[hidden email]> wrote:
> [full message quoted above, snipped]

Part of the problem is in creating test frameworks that are stable enough and complex enough. It's a lot of work. Consider the most unstable part of Spur for the past year, the new compactor, which took a year to become fully reliable (touch wood). The last case that showed the last bug I fixed required a really large image, a snapshot, and a load of that snapshot followed by a GC to show the bug.

In fact, what this shows is that writing regression tests is easy but writing adequate stress tests is hard. In my experience it's more effective to let the community provide the stress tests and to try to be as responsive as possible in fixing the bugs as soon as they appear. So knowing how to create reproducible cases, knowing the right channel to report a bug, etc., is important. And if I'm right here, then this points to the need for a workflow where VMs are built and tested automatically from tip.

I don't properly understand the issue, but I'm frustrated that the current Pharo VM is way behind that compactor bug fix. I think the issue is that the Pharo VM has more than one tip: there is the execution engine/GC/FFI tip that Clément, Nicolas and I take responsibility for; then there are the various library extensions (for git, fonts, imaging) that are a significant weight on Esteban's shoulders; and then there's SSL support from Tobias, etc. So perhaps we need a two-tier VM code base, so that we can decouple these various tips and advance each one to "the stable branch" when appropriate. That in turn requires a CI infrastructure which allows the developers of each tip to test their changes in the context of an otherwise stable code base.
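(A minimal sketch, not Eliot's actual reproduction, of the grow-snapshot-reload-GC shape of stress test he describes; the sizes, the global name, and the reliance on Smalltalk snapshot:andQuit: and garbageCollect are illustrative assumptions:)

    "Step 1: inflate the heap so compaction has real work to do, keeping the
     objects reachable across the snapshot via a global (name is hypothetical)."
    Smalltalk at: #CompactorStressFiller put:
        ((1 to: 500000) collect: [:i | Array new: 64]).

    "Step 2: write the now-large image to disk and quit."
    Smalltalk snapshot: true andQuit: true.

    "Step 3: after restarting from that snapshot, force a full GC so the
     compactor has to walk the reloaded heap."
    Smalltalk garbageCollect.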