Hi all, I have restored (I think) the green status of the minheadless build. We could eventually allow all those builds to fail, or restrict the number of flavours that we build (it takes a really loooong time before we get CI feedback). But if we abandon all of those now, I fear that we will never catch up; it's a one-way ticket. IMO there are still interesting ideas to take from them, even if development has continued in the Pharo fork... Now the next failing build on Travis is squeak.cog.v3. Did some incompatible VM change take place?
|
Hi Nicolas, On Oct 26, 2020, at 6:59 AM, Nicolas Cellier <[hidden email]> wrote:
Could these test failures have nothing to do with the VM, and instead be due to the (growing) divergence between trunk/Spur and Squeak 4.6? _,,,^..^,,,_ (phone) |
On Mon, Oct 26, 2020 at 15:11, Eliot Miranda <[hidden email]> wrote:
+1, but I do not know how to configure the CI like that...
No idea... I've downloaded a Squeak4.6-15102.image, compiled a squeak.cog.v3 on Windows 10, and the tests pass... I will retry on other OSes this evening (the tests fail on Linux...). |
On Mon, Oct 26, 2020 at 03:22:39PM +0100, Nicolas Cellier wrote:
> On Mon, Oct 26, 2020 at 15:11, Eliot Miranda <[hidden email]> wrote:
>
> > Hi Nicolas,
> >
> > On Oct 26, 2020, at 6:59 AM, Nicolas Cellier <[hidden email]> wrote:
> >
> > Hi all,
> > I have restored (I think) the green status of minheadless build.
> >
> > Thank you so much!
> >
> > We could eventually authorize failure of all those builds, or restrict the
> > number of flavours that we build (it takes really loooong time before we
> > get CI feedback).
> > But if we abandon all those now, I fear that we never catch up; it's a one
> > way ticket. IMO there are still interesting ideas to take, even if
> > development has continued in Pharo fork...
> >
> > One approach might be to split them into “essential” and “nice to have” so
> > we get faster feedback from the “essential” set.
>
> +1, but I do not know how to configure the CI like that...
>
> > Now the next failing build on travis is about squeak.cog.v3. Did some
> > incompatible VM change took place?
> >
> > For example
> > https://travis-ci.org/github/OpenSmalltalk/opensmalltalk-vm/jobs/738831149
> >
> > ######################################################
> > # Squeak-4.6 on Travis CI (2278.23)                  #
> > # 3401 Tests with 5 Failures and 0 Errors in 112.13s #
> > ######################################################
> >
> > #########################
> > # 5 tests did not pass: #
> > #########################
> >
> > SUnitToolBuilderTests
> >  ✗ #testHandlingNotification (10023ms)
> >
> > TestValueWithinFix
> >  ✗ #testValueWithinTimingBasic (1005ms)
> >  ✗ #testValueWithinTimingNestedInner (1001ms)
> >  ✗ #testValueWithinTimingNestedOuter (1002ms)
> >  ✗ #testValueWithinTimingRepeat (3004ms)
> >
> > Executed 3401 Tests with 5 Failures and 0 Errors in 112.13s.
> >
> > To reproduce the failed build locally, download smalltalkCI, and try to run something like:
> >
> > bin/smalltalkci -s Squeak-4.6 --headfull /path/to/.smalltalk.ston
> >
> > Could these test failures be nothing to do with the VM but instead to do
> > with the (growing) divergence between trunk/spur and Squeak 4.6?
> >
> > _,,,^..^,,,_ (phone)
>
> No idea...
> I've downloaded a Squeak4.6-15102.image, compiled a squeak.cog.v3 on
> windows10, and the test pass...
> I will retry on other OSes this evening (the test fails on linux...).

I do not know if it is related to the CI issues, but here is one clue:

I am maintaining a V3 "trunk" image that attempts to keep in sync with trunk except for changes that pertain to Spur object format (immediate characters, etc). As of Monticello-ul.728 I get failures in SSL connection to e.g. source.squeak.org. The failures happen with both Cog and the interpreter VM (although the specific error symptoms are different).

I can bypass the errors by reverting MCHttpRepository>>httpGet:arguments: to its prior version (stamped ul 9/20/2019).

I have not had time to dig into the underlying cause of the problem, but it seems likely that any CI test for Cog that requires the use of secure sockets will fail as of Monticello-ul.728 or later. This would include any test that needs to access the source.squeak.org repository.

Dave
|
In reply to this post by Nicolas Cellier
On Oct 26, 2020, at 7:22 AM, Nicolas Cellier <[hidden email]> wrote:
At least one of those failures is a timeout. So it may just be that the CI box is slow. We might cure the issues by lengthening the timeouts. Or we could make them expected failures, especially if we can find out some way to identify that we’re running on a CI box.
|
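The two options Eliot mentions (longer timeouts, or expected failures) can be sketched with standard SUnit hooks. This is only an illustration in file-in (chunk) format: the class SlowBoxExampleTest, its category and its selectors are hypothetical, and it assumes the image's SUnit honours the <timeout:> method pragma and the expectedFailures hook, as recent Squeak versions do as far as I know.

```smalltalk
"Hypothetical example class; not part of any Squeak package."
TestCase subclass: #SlowBoxExampleTest
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'CI-Examples'!

!SlowBoxExampleTest methodsFor: 'tests'!
testTenShortWaits
	"Nominally about 2 s; allow 30 s before SUnit reports a timeout."
	<timeout: 30>
	10 timesRepeat: [(Delay forMilliseconds: 200) wait]!

testTimingSensitive
	"Deliberately strict timing check that a slow box may fail."
	| ms |
	ms := Time millisecondsToRun: [(Delay forMilliseconds: 200) wait].
	self assert: ms < 250! !

!SlowBoxExampleTest methodsFor: 'failures'!
expectedFailures
	"Selectors listed here are treated as expected failures."
	^ #(#testTimingSensitive)! !
```

One caveat, if I remember the SUnit behaviour correctly: a test listed in expectedFailures that passes is reported as an unexpected pass, so listing timing tests unconditionally would turn fast machines red. That is presumably why the expected-failure route only makes sense once the image can tell that it is running on a CI box.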
On Mon, Oct 26, 2020 at 16:26, Eliot Miranda <[hidden email]> wrote:
There's something fishy... Only the Linux flavour times out. The test waits 10 times 200 ms, so it should last about 2 s. [SUnitToolBuilderTests new testHandlingNotification] timeToRun. answers something around 2048 on the macOS flavour, but 7706 on Linux (???). The build.itimerheartbeat flavour completes the test in about 2020 ms, so that one is OK. squeak.cog.spur/build/squeak also takes about 7700 ms for the test in an updated trunk image... So it's not just squeak.cog.v3 here... I can only run Linux through a VM (Parallels), so it would help if someone could confirm the behavior on native Linux. |
On Oct 26, 2020, at 2:53 PM, Nicolas Cellier <[hidden email]> wrote:
Ah! Is the kernel older than 2.6.4 (IIRC)? If so, the heartbeat thread doesn’t work properly because the vm doesn’t have permission or ability to set the heartbeat thread’s priority.
|
In reply to this post by David T. Lewis
Hi David

On Mon, 26 Oct 2020, David T. Lewis wrote:

> I do not know if it is related to the CI issues, but here is one clue:
>
> I am maintaining a V3 "trunk" image that attempts to keep in sync with
> trunk except for changes that pertain to Spur object format (immediate
> characters, etc). As of Monticello-ul.728 I get failures in SSL connection
> to e.g. source.squeak.org. The failures happen with both Cog and the
> interpreter VM (although the specific error symptoms are different).
>
> I can bypass the errors by reverting MCHttpRepository>>httpGet:arguments:
> to its prior version (stamped ul 9/20/2019).

That version adds some rewrite rules, so some http urls are converted to https. If that doesn't work, then there is a problem with the SqueakSSL plugin in your VM. Does the following print true?

    | response |
    response := WebClient httpGet: 'https://source.squeak.org'.
    response code = 200

Instead of reverting #httpGet:arguments: you can remove the individual http->https rewrite rules with the following snippet:

    MCHttpRepository urlRewriteRules in: [ :rules |
        rules size // 3 timesRepeat: [
            | set |
            set := rules removeFirst: 3.
            (set second beginsWith: 'https') ifFalse: [
                rules addAll: set ] ] ]

But it would be better to use https for those repositories.

Levente

> I have not had time to dig into the underlying cause of the problem,
> but it seems likely that any CI test for Cog that requires the use
> of secure sockets will fail as of Monticello-ul.728 or later. This
> would include any test that needs to access the source.squeak.org
> repository.
>
> Dave
|
In reply to this post by Nicolas Cellier
On Mon, 26 Oct 2020 at 21:53, Nicolas Cellier <[hidden email]> wrote:
|
In reply to this post by Levente Uzonyi
Hi Levente,

On Tue, Oct 27, 2020 at 12:59:55AM +0100, Levente Uzonyi wrote:
> Hi David
>
> On Mon, 26 Oct 2020, David T. Lewis wrote:
>
> >I do not know if it is related to the CI issues, but here is one clue:
> >
> >I am maintaining a V3 "trunk" image that attempts to keep in sync with
> >trunk except for changes that pertain to Spur object format (immediate
> >characters, etc). As of Monticello-ul.728 I get failures in SSL connection
> >to e.g. source.squeak.org. The failures happen with both Cog and the
> >interpreter VM (although the specific error symptoms are different).
> >
> >I can bypass the errors by reverting MCHttpRepository>>httpGet:arguments:
> >to its prior version (stamped ul 9/20/2019).
>
> That version adds some rewrite rules, so some http urls are converted to
> https.
> If that doesn't work, then there is a problem with the SqueakSSL plugin in
> your VM. Does the following print true?
>
> | response |
> response := WebClient httpGet: 'https://source.squeak.org'.
> response code = 200

With an interpreter VM, I get a failure from primitiveSSLCreate.

With Cog (squeak.cog.v3_linux32x86_202010010729.tar.gz) I also get a failure on the same primitive call.

So yes it is probably an issue with the SqueakSSL plugin.

Sorry I don't have time to follow up on this properly, I just wanted to mention that I saw an issue in case it relates to the CI problems.

Dave

> Instead of reverting #httpGet:arguments: you can remove the individual
> http->https rewrite rules with the following snippet:
>
> MCHttpRepository urlRewriteRules in: [ :rules |
>     rules size // 3 timesRepeat: [
>         | set |
>         set := rules removeFirst: 3.
>         (set second beginsWith: 'https') ifFalse: [
>             rules addAll: set ] ] ]
>
> But it would be better to use https for those repositories.
>
> Levente
>
> >I have not had time to dig into the underlying cause of the problem,
> >but it seems likely that any CI test for Cog that requires the use
> >of secure sockets will fail as of Monticello-ul.728 or later. This
> >would include any test that needs to access the source.squeak.org
> >repository.
> >
> >Dave
|
> On 2020-10-26, at 6:02 PM, David T. Lewis <[hidden email]> wrote:
>>
>> | response |
>> response := WebClient httpGet: 'https://source.squeak.org'.
>> response code = 200
>
> With an interpreter VM, I get a failure from primitiveSSLCreate.
>
> With Cog (squeak.cog.v3_linux32x86_202010010729.tar.gz) I also get a
> failure on the same primitive call.
>
> So yes it is probably an issue with the SqueakSSL plugin.

Just as a datapoint, it works on my Pi 4 squeak.cog.spur VM

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Never do card tricks for the group you play poker with.
|
In reply to this post by David T. Lewis
On Mon, Oct 26, 2020 at 09:02:40PM -0400, David T. Lewis wrote:
> With an interpreter VM, I get a failure from primitiveSSLCreate.
>
> With Cog (squeak.cog.v3_linux32x86_202010010729.tar.gz) I also get a
> failure on the same primitive call.
>
> So yes it is probably an issue with the SqueakSSL plugin.
>
> Sorry I don't have time to follow up on this properly, I just wanted
> to mention that I saw an issue in case it relates to the CI problems.

With Cog (squeak.cog.v3_linux32x86_202010010729.tar.gz) I get the failure on primitiveSSLCreate. Basically "SqueakSSL new" is failing.

With an interpreter VM, SqueakSSL class>>new succeeds, and the failure happens later with a "Host name mismatch" on the certificate. The SecureSocketStream thinks that the host name is source.squeak.org, and the SqueakSSL thinks that its peerName is squeak.org.

Very likely this is a plugin problem also, since the interpreter VM would be using an outdated version of the plugin.

Dave

SqueakSSLCertificateError.png (68K)
|
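The two symptoms Dave describes (plugin creation failing on Cog, certificate check failing on the interpreter VM) can be told apart before any network traffic is involved. A minimal workspace sketch, assuming SqueakSSL>>destroy and SmalltalkImage>>listLoadedModules are present in the image, as they are in the Squeak versions I know:

```smalltalk
"Workspace sketch: does the SqueakSSL primitive layer come up at all?"
| ssl |
ssl := [SqueakSSL new]
	on: Error
	do: [:e |
		Transcript show: 'SqueakSSL new failed: ', e description; cr.
		nil].
ssl ifNotNil: [
	Transcript show: 'SqueakSSL new succeeded'; cr.
	ssl destroy "release the platform handle again"].

"Show which SqueakSSL module, if any, the VM has actually loaded."
Transcript
	show: (Smalltalk listLoadedModules
		select: [:each | each beginsWith: 'SqueakSSL']) printString;
	cr.
```

If the first step already fails, the WebClient test quoted above cannot succeed, and the problem lies in the plugin or the libraries it links against rather than in Monticello's rewrite rules.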
On October 27, 2020 02:34:37 CET, "David T. Lewis" <[hidden email]> wrote: > Hi, this looks an awful lot like a very old or broken plugin: SNI / SAN seems not to work. But on the other hand, is WebClient outdated? The host name seems not to be sent to the plugin... -t -- Sent from a mobile device |
Hi On 27.10.2020, at 07:04, Tobias <[hidden email]> wrote: I've had a look, but I cannot reproduce :( See screenshot. Please tell me the openssl version used. Best regards -Tobias |
In reply to this post by Eliot Miranda-2
On Tue, Oct 27, 2020 at 00:00, Eliot Miranda <[hidden email]> wrote:
No, it's an Ubuntu 16:

$ uname -a
Linux ubuntu 4.4.0-193-generic #224-Ubuntu SMP Tue Oct 6 17:15:28 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

I hopefully have the correct setup for authorizing thread priority:

$ cat /etc/security/limits.d/squeak.conf
* hard rtprio 2
* soft rtprio 2

I get no warning in the console saying "pthread_setschedparam failed"... So I don't know what's going on. Can someone reproduce?
|
In reply to this post by Tobias Pape
On Tue, Oct 27, 2020 at 07:04:20AM +0100, Tobias wrote:
> On October 27, 2020 02:34:37 CET, "David T. Lewis" <[hidden email]> wrote:
>
> Hi
> this looks an awful lot like a very old or brown plugin: SNI / sAN seems not to work.
>
> but on the other hand, is webclient outdated? the host name seems not to be send to the plug in...

The plugin in my interpreter VM is outdated, yes. But it's the Cog VM that would be of interest here. With Cog, I see a primitive failure in SqueakSSL>>primitiveSSLCreate.

Here is the VM I am running:

Virtual Machine
---------------
/usr/local/lib/squeak/4.5-202010010729/squeak
Open Smalltalk Cog[SqueakV3] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2831]
Unix built on Oct 1 2020 07:59:33 Compiler: 5.4.0 20160609
platform sources revision VM: 202010010729 https://github.com/OpenSmalltalk/opensmalltalk-vm.git Date: Thu Oct 1 09:29:33 2020 CommitHash: f679770 Plugins: 202010010729 https://github.com/OpenSmalltalk/opensmalltalk-vm.git
CoInterpreter VMMaker.oscog-eem.2831 uuid: f4fed277-6b62-4e79-9147-c5405589f2b1 Oct 1 2020
StackToRegisterMappingCogit VMMaker.oscog-eem.2824 uuid: 8f091e5b-fc0f-4b4b-ab5e-e90e598f75ee Oct 1 2020

To Build A Similar Virtual Machine
----------------------------------
Visit https://github.com/OpenSmalltalk/opensmalltalk-vm; follow the "Clone or download" instructions, then read the top-level README.md and HowToBuild files in the top-level build directory for your platform(s), build.macos64x64/HowToBuild, build.win32x86/HowToBuild, etc.

Loaded VM Modules
-----------------
AioPlugin VMConstruction-Plugins-AioPlugin.oscog-eem.25 (i)
B2DPlugin VMMaker.oscog-eem.2805 (i)
BitBltPlugin VMMaker.oscog-eem.2821 (i)
CroquetPlugin VMMaker.oscog-eem.2744 (i)
FilePlugin VMMaker.oscog-eem.2795 (i)
FloatArrayPlugin VMMaker.oscog-eem.2759 (i)
LargeIntegers v2.0 VMMaker.oscog-eem.2821 (i)
Matrix2x3Plugin VMMaker.oscog-eem.2780 (i)
MiscPrimitivePlugin VMMaker.oscog-eem.2761 (i)
SecurityPlugin VMMaker.oscog-eem.2790 (i)
SocketPlugin VMMaker.oscog-eem.2823 (i)
SqueakSSL VMMaker.oscog-eem.2805 (e)
UnixOSProcessPlugin VMConstruction-Plugins-OSProcessPlugin.oscog-eem.69 (e)
|
In reply to this post by Nicolas Cellier
> On 2020-10-27, at 1:33 AM, Nicolas Cellier <[hidden email]> wrote:
>
> No, it's an ubuntu 16:
>
> $ uname -a
> Linux ubuntu 4.4.0-193-generic #224-Ubuntu SMP Tue Oct 6 17:15:28 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> I hopefully have the correct setup for authorizing thread priority
>
> $ cat /etc/security/limits.d/squeak.conf
> * hard rtprio 2
> * soft rtprio 2
>
> I have no warning in the console telling that "pthread_setschedparam failed"...
>
> So I don't know what's going on. Can someone reproduce?

Actually yeah, kinda. I have a Linux i7 box I mostly use to run an ancient VW product on. It's a recent Ubuntu 18.04 download with that horrible xfce UI stuff (really, what were they thinking). Aside from many other faults, it (mis)works with VNC in annoying ways that mean it frequently stops sharing the copy/paste buffer.

Anyway. uname -a tells me (and I can't paste it, of course!) 4.15.0-101-generic #102-Ubuntu SMP etc etc

I have run the limits.d setup several times and the file is OK. And yes, I have rebooted plenty of times. It *never* works to make Squeak happy.

Did I mention that Ubuntu is not very good?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
A computer's attention span is only as long as its extension cord.
|
On Tue, Oct 27, 2020 at 19:02, tim Rowledge <[hidden email]> wrote:
Hi Tim, what is the result on that box, with the threaded heartbeat VM, of: [SUnitToolBuilderTests new testHandlingNotification] timeToRun. If it is more than 5000 ms, then you would confirm the problem that I encounter. testHandlingNotification loops 10 times, waiting on a 200 ms delay each time. The problem I see with the threaded heartbeat VM is that the first 5 delays run normally, but the next 5 last 1 second each instead of 200 ms... This is probably what happens on the CI server too (the test times out and CI fails).
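To see the pattern Nicolas describes without the rest of the test (progress bar, notifications), the ten waits can be timed one by one in a workspace. This is only a sketch of the isolation step, not the test itself; it assumes nothing beyond Delay, Time and Transcript from a stock image.

```smalltalk
"Workspace sketch: time ten consecutive 200 ms waits individually."
| timings total |
timings := (1 to: 10) collect: [:i |
	Time millisecondsToRun: [(Delay forMilliseconds: 200) wait]].
total := timings inject: 0 into: [:sum :each | sum + each].
Transcript show: 'per-wait ms: ', timings printString; cr.
Transcript show: 'total ms: ', total printString; cr.
```

On a healthy VM every entry should be close to 200. If the later entries jump to about 1000, five normal and five slow waits add up to roughly 5 * 200 + 5 * 1000 = 6000 ms, which is above the 5000 ms threshold mentioned above and points at delay scheduling rather than at the test's UI code.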
Hi

> On 27.10.2020, at 23:39, Nicolas Cellier <[hidden email]> wrote:
>
> Hi Tim,
> what is the result on that box with threaded heartbeat VM of:
>
> [SUnitToolBuilderTests new testHandlingNotification] timeToRun.
>
> If it is more than 5000 ms, then you would confirm the problem that I encounter.
>
> This testHandlingNotification is repeating 10 loops with a 200ms delay wait.
> The problem I've got with threaded heartbeat VM is that first 5 delays run normally, but next 5 will last for 1 second instead of 200ms...
> This is probably what happens on the CI server too (the test times out and CI fails).

Also note that the CI test builds both VMs but uses the last one built (because it overwrites the first), and that happens to be the itimer one, not the threaded one.

The code above runs in around ~2000 ms on my machine (~2015, with Ubuntu 18.04).

Things that happened:
- I just ran the test suite in the DEBUG itimer headful and headless variants and it passes.
- I just ran the test suite in the DEBUG threaded headful and headless variants and it passes.
- I ran the RELEASE itimer headful and headless variants and it passes.
- I ran the RELEASE threaded headless variant and it FAILED as on the CI.
- I ran the RELEASE threaded headful variant and it FAILED LESS.
  I mean: testHandlingNotification passed, and so did testValueWithinTimingBasic and testValueWithinTimingNestedInner,
  but testValueWithinTimingNestedOuter and testValueWithinTimingRepeat still fail!

So there are discrepancies between
  debug and release, and
  headful and headless (at least for threaded release).

TL;DR: The linux x86_32 cog v3 threaded release VM has a timing problem ...

Does that help anyone?

Best regards
  -Tobias

BTW: Eliot, the VM spits out a "aioDisable: epoll_ctl: Bad file descriptor". Is that expected?
|
In reply to this post by Nicolas Cellier
Hi Tobi, Hi Levente,

> On Oct 28, 2020, at 7:06 AM, Tobias Pape <[hidden email]> wrote:
>
> Hi
>
>> On 27.10.2020, at 23:39, Nicolas Cellier <[hidden email]> wrote:
>> Hi Tim,
>> what is the result on that box with threaded heartbeat VM of:
>> [SUnitToolBuilderTests new testHandlingNotification] timeToRun.
>> If it is more than 5000 ms, then you would confirm the problem that I encounter.
>> This testHandlingNotification is repeating 10 loops with a 200ms delay wait.
>> The problem I've got with threaded heartbeat VM is that first 5 delays run normally, but next 5 will last for 1 second instead of 200ms...
>> This is probably what happens on the CI server too (the test times out and CI fails).
>
> Also note that the CI test builds both vms but uses the last one built (because it overwrites the first), and that happens to be the itimer one, not th ethreaded.
>
> The code above runs in around ~2000 ms on my machine (~2015, with ubuntu 18.04)
>
> Things that happened:
> - I just ran the test suite in the DEBUG itimer headful and headless variant and it passes.
> - I just ran the test suite in the DEBUG threaded headful and headless variant and it passes.
> - I ran the RELEASE itimer headful and headless variant and it passes
> - I ran the RELEASE threaded headless variant and it FAILED as on the CI
> - I ran the RELEASE threaded headful variant and it FAILED LESS
> I mean: testHandlingNotification passed, and so did testValueWithinTimingBasic and testValueWithinTimingNestedInner
> but testValueWithinTimingNestedOuter testValueWithinTimingRepeat still fail!
>
> So there are discrepancies between
> debug and release and
> headful and headless (at least for threaded release)
>
> TL;DR: The linux x86_32 cog v3 threaded release vm has a timing problem ...
>
> Does that help anyone?

If you add code to extract the number of ioProcessEvents calls etc. (see the VM parameters tab of About Squeak for the relevant info), does that tell us more? IIRC one available VM parameter is the number of heartbeats. So we should be able to see if it is the heartbeat itself that is failing or if it is further upstream.

> Best regards
> -Tobias
>
> BTW: Eliot, the VM spits out a "aioDisable: epoll_ctl: Bad file descriptor". Is that expected?

It’s not expected, but may be harmless. Levente, is this noise? Or is it that the input file descriptor is being shut down? Presumably this is just noise to do with running headless.
|
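Eliot's suggestion could be tried along these lines: sample a few VM counters before and after the slow test and compare the deltas between a good and a bad VM. The parameter indices below are assumptions taken from my reading of the comment in SmalltalkImage>>vmParameterAt: on Cog images (56 process switches, 57 ioProcessEvents calls, 58 forceInterruptCheck calls, 59 check-event calls); please verify them in your image. I am not sure which index, if any, reports the heartbeat count, so it is left out.

```smalltalk
"Workspace sketch: VM counter deltas around the slow test.
 Indices are assumptions; check SmalltalkImage>>vmParameterAt: in your image."
| indices before after elapsed |
indices := #(56 57 58 59).
before := indices collect: [:i | Smalltalk vmParameterAt: i].
elapsed := [SUnitToolBuilderTests new testHandlingNotification] timeToRun.
after := indices collect: [:i | Smalltalk vmParameterAt: i].
Transcript show: 'elapsed ms: ', elapsed printString; cr.
indices withIndexDo: [:param :k |
	Transcript
		show: 'vmParameter ', param printString, ' delta: ',
			((after at: k) - (before at: k)) printString;
		cr].
```

Running the same snippet on the itimer and the threaded release VMs should show whether the slow runs see far fewer interrupt checks or event polls per second, which would point at the heartbeat rather than at the image-side Delay code.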