VMMaker CI under opensmalltalk-vm (was Re: OSProcess fork issue with Debian built VM)


VMMaker CI under opensmalltalk-vm (was Re: OSProcess fork issue with Debian built VM)

Ben Coman
 

On Sun, Jun 4, 2017 at 1:10 AM, Eliot Miranda <[hidden email]> wrote:

On Jun 3, 2017, at 2:22 AM, Ben Coman <[hidden email]> wrote:
On Sat, Jun 3, 2017 at 5:57 AM, Eliot Miranda <[hidden email]> wrote:

On Fri, Jun 2, 2017 at 9:54 AM, K K Subbu <[hidden email]> wrote:

On Friday 02 June 2017 04:50 PM, Holger Freyther wrote:

+1. I would go further and propose that plugins, other than those
which are integral to VM[1], be moved into separate projects and
built in their own tree with no references to vm source paths. They
should be free to organize their platform-specific and
platform-independent code and makefiles.

why? How many plugins are used that don't ship with the pharo-vm (or
squeakvm)? I have to say I like the pharo-vm approach a lot where the
VMMaker* packages and generated sources sit next to each other.

My proposal is to loosen build time coupling, not packaging. Nothing prevents plugins built out of tree from being built, installed and shipped together. Supporting out of tree plugins allows others to extend a VM even long after its release.

Eliot has already responded to your example in detail, so I will not dwell upon it further, except to point out that we need a command line tool to generate a .c file directly from the plugin's mcz file. Then we can create a Make rule to compile a .o and get rid of .c file. e.g.

 FooPlugin.o : FooPlugin.mcz
        squeak -headless vmmaker.image gencc.st FooPlugin.mcz
        cc -c ... FooPlugin.c
        rm -f FooPlugin.c

Over my dead body.  The make must *NEVER* be done like this.  The problems are many:
- the VM so produced is undebuggable.  There is no source for FooPlugin.c; it has been deleted
- the build is slow; building a VMMaker image (the necessary prerequisite for st2c) takes a long time, much longer than the st2c step
- unless FooPlugin.mcz is versioned, the VM so produced is unversioned; and if FooPlugin.mcz /is/ versioned, then there's no reason why the plugin can't be produced using the normal workflow: generate C source, check it in to opensmalltalk/vm, and build. That way we have a versioned entity and break the dependency of builds on VMMaker


And I remember it mentioned a while ago, the bug is sometimes in the generation-code, so you want historical access to both mcz and generated-C to help debug that. 
+1


Look, I just spent several years persuading the Pharo community that generating source and then building was a really bad idea and they should shelve it and now someone is trying to bring it back.  No, no, and thrice times, no.


I think Pharo has not completely abandoned generating C-sources, although it's now a sideline activity...

It's a CI activity to test the current state of the VMMaker package.  This is useful.

So we should do this under the OpenSmalltalk-VM project?


On Wed, May 31, 2017 at 2:48 PM, Esteban Lorenzano <[hidden email]> wrote:

In any case, on the “pharo side” of things, we keep a parallel CI process that does this: 

1) there is a development branch that keeps all the sources we need: the osvm *and* all Monticello packages (filetree format), acting as a mirror that merges changes and executes the whole build process (including generating sources) and testing (we run all Pharo tests as a way to test the VM). This is done just to be sure the CI is green; no artifact is created. This is how I have been able to report problems in VM generation early.
2) in parallel, the build follows the “normal” process on opensmalltalk-vm and everything is built there. This generates a “latest” build which is published on the Pharo servers so people can test (but this binary could be wrong, as it is not something declared as “stable”).

What I do wonder from time to time is why the C-generation can't be done as part of an opensmalltalk-vm CI process.  The manual process is a bit opaque, and a CI process would be better than documentation.  One possibility of doing it via CI could be generating from both Squeak and Pharo and checking the output is identical (assuming that might be useful).
Yes this would be useful.  Also there's still at least one place in the CoInterpreter source where instability in the type inference code flips the type of a variable between sqInt & usqInt, and automated generation might help track that down.
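
The sqInt/usqInt flip mentioned above could, in principle, be spotted by comparing the declared types of same-named variables across two generation runs. The following is only a minimal sketch: it assumes declarations appear one per line in the form `sqInt name;` or `usqInt name;`, and the function names and sample text are hypothetical, not taken from the generated sources. It also ignores function scope, so a real check would need to compare per function.

```python
import re

# Assumed declaration shape: "sqInt foo;" or "usqInt foo;" on its own line.
# Real generated C may declare variables in other forms.
DECL = re.compile(r"^\s*(sqInt|usqInt)\s+(\w+)\s*;", re.MULTILINE)

def declared_types(c_source):
    """Map each variable name to the set of types it is declared with."""
    types = {}
    for ctype, name in DECL.findall(c_source):
        types.setdefault(name, set()).add(ctype)
    return types

def flipped_variables(old_source, new_source):
    """Return names whose declared type set differs between two runs."""
    old, new = declared_types(old_source), declared_types(new_source)
    return sorted(n for n in old.keys() & new.keys() if old[n] != new[n])

# Hypothetical sample input standing in for two generation runs.
old_run = "sqInt index;\nusqInt limit;\n"
new_run = "usqInt index;\nusqInt limit;\n"
print(flipped_variables(old_run, new_run))
```

Run after each generation, a report like this would list candidate variables for a human to inspect; it does not explain why the inference changed its mind.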

IIRC Eliot didn't want the generated-C updated in the Cog branch for every VMMaker commit. 

Right.  No point doing so for changes to simulation only code, or to experimental code.  Or on every commit to a package that contains a translated primitive, a form of primitives we need to eliminate by rewriting as conventional ones.

But the real issue is in not committing changes to generated code that hasn't changed.  See scripts/revertUnchangedPlugins.

If there is, say, a Slang change that has limited effects (only a subset of plugins, or only the interpreter file, etc) we don't want to commit new versions of the essentially unchanged code because it obscures which commit leads to a bug.

Plus there's your observation about Slang bugs.  I want to eyeball the changes in at least one representative generated file to look for unintended consequences and assure myself that source generation is sane.

A way to conform to this would be the CI updating the generated C in a side branch, and only when deemed worthy it could be integrated into the Cog branch via a simple merge rather than a one-time manual generation run.  

Eliot,  If you're receptive to this idea, I'd be interested in working on this.  What concerns would you have with it?

1. that the process identifies files that change only in their version stamp (i.e. are essentially unchanged) and does /not/ commit this subset.  Right now that happens with a script for plugins (cuz they're relatively simple) and through my VMMaker image, which tracks commits and only produces source for classes whose timestamps have changed

2. that the process maintains the separation between generation followed by check-in, and build from checked-in source.

3. that people who work on the Smalltalk side of the business not get lazy in eyeballing the generation process cuz Slang issues can be subtle and tedious to track down.  IME, human supervision remains useful here
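
Concern 1 amounts to asking whether two versions of a generated file are equal once their version stamps are masked out. A minimal sketch of that test follows; the stamp format in the regular expression is an assumption made for illustration and should be checked against the headers the generator actually writes. It is not a description of what scripts/revertUnchangedPlugins does.

```python
import re

# Assumed stamp shape: "<package>-<initials>.<number> uuid: <hex-and-dashes>".
# This is a guess for illustration, not the generator's documented format.
STAMP = re.compile(r"\b[\w.]+-\w+\.\d+ uuid: [0-9a-f-]+")

def strip_stamps(text):
    """Replace every version stamp with a fixed placeholder."""
    return STAMP.sub("<stamp>", text)

def only_stamp_changed(old_text, new_text):
    """True when two versions differ, but only inside version stamps."""
    return old_text != new_text and strip_stamps(old_text) == strip_stamps(new_text)

# Made-up file contents: same body, different stamp.
old = "/* FooPlugin VMMaker.oscog-xyz.1 uuid: 1a2b-3c4d */\nint f(void) { return 1; }\n"
new = "/* FooPlugin VMMaker.oscog-xyz.2 uuid: 5e6f-7a8b */\nint f(void) { return 1; }\n"
print(only_stamp_changed(old, new))
```

Files for which this returns True would be left out of the commit, which keeps the history pointing at the commits that really changed the code.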

One extra thing, there feels like an explosion of *src* folders in the root directory.  This could be an opportunity to clean these up into a subdirectory (and a side benefit is I can experiment without touching existing srcs)

So I'd like to run a possible structure past you...
opensmalltalk-vm/
   VMMaker/
       mc-mirror/    #versioned in filetree format
       generated/   #versioned - initial generation checked into a separate branch until manual review and merge.    
           src/
           stacksrc/
           nsspursrc/    
           nsspur64src/  
           nsspurstacksrc/    
           nsspurstack64src/  
           spursrc/       
           spur64src/         
           spurstack64src/  
           spurstacksrc/    
           spurlowcodesrc/         
           spurlowcodestacksrc/  
           spurlowcode64src/  
           spurlowcodestack64src/  
           spursistasrc/  
           spursista64src/       
       build/
           Makefile
           CI-build-scripts.sh
           generated.squeak/    #not versioned, build host only
               src/
               spursrc/       
               spur64src/         
               ...etc
           generated.pharo/    #not versioned, build host only
               src/
               spursrc/       
               spur64src/         
               ...etc
 
When http://source.squeak.org is updated, this would trigger the CI to copy it into "mc-mirror"  
then build both "generated.squeak/" & "generated.pharo/".   The two would be cross compared and if identical, 
then "generated/" would be updated with substantive changes only.
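
The cross-comparison step could be as simple as a byte-for-byte comparison of the two trees. A rough sketch, assuming both generation runs write into sibling directories with the same layout; the file name `interp.c` and the helper names are hypothetical:

```python
import os
import tempfile

def tree_contents(root):
    """Map each file's path relative to root to its bytes."""
    contents = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                contents[os.path.relpath(full, root)] = fh.read()
    return contents

def differing_files(squeak_root, pharo_root):
    """Relative paths missing from one tree or with different contents."""
    a, b = tree_contents(squeak_root), tree_contents(pharo_root)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))

def write(root, rel, text):
    """Create a small file for the demonstration below."""
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(text)

# Demonstration on throwaway directories with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    squeak = os.path.join(tmp, "generated.squeak")
    pharo = os.path.join(tmp, "generated.pharo")
    for root in (squeak, pharo):
        write(root, os.path.join("spursrc", "interp.c"), "int x;\n")
    write(pharo, os.path.join("spur64src", "interp.c"), "int y;\n")
    result = differing_files(squeak, pharo)
    print(result)
```

An empty result would be the CI's signal to go on and update "generated/"; anything else would stop the run and be reported.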

Both "mc-mirror/" and "generated/" would be checked into a "source-generation" branch
that would kick off a regular CI vm build and test (only if there were substantive changes).  
The advantage being that github comments can be attached to individual lines of code to facilitate public review and discussion.  

After manual review, the source-generation branch can be merged via PR to perform the regular CI test run inclusive of any platform changes.

cheers -ben


Re: VMMaker CI under opensmalltalk-vm (was Re: OSProcess fork issue with Debian built VM)

K K Subbu
 
On Tuesday 13 June 2017 11:09 AM, Ben Coman wrote:
>
> One extra thing, there feels like an explosion of *src* folders in the
> root directory.  This could be an opportunity to clean these up into a
> subdirectory (and a side benefit is I can experiment without touching
> existing srcs)

This is due to all variations sharing the same branch ("Cog"). Now that
all sources are in git, these variations can be moved into their own
branches:
    master (Sista?), Spur, Cog, Stack?

and use Tags to mark releases like Squeak5, Pharo6 etc. This will reduce
clutter and make it easy for CI builds and tests.

Regards .. Subbu

Re: VMMaker CI under opensmalltalk-vm (was Re: OSProcess fork issue with Debian built VM)

Ben Coman
 


On Wed, Jun 14, 2017 at 1:57 AM, K K Subbu <[hidden email]> wrote:

On Tuesday 13 June 2017 11:09 AM, Ben Coman wrote:

One extra thing, there feels like an explosion of *src* folders in the
root directory.  This could be an opportunity to clean these up into a
subdirectory (and a side benefit is I can experiment without touching
existing srcs)

This is due to all variations sharing the same branch ("Cog"). Now that all sources are in git, these variations can be moved into their own branches:
   master (Sista?), Spur, Cog, Stack?

I expect there will be strong resistance to that.  It would be hell to track.  Those folders are generated from the same Smalltalk code base, just with different options.  We want the CI to build all variations in one run to verify changes don't break any of them.  Technically "Stack" is outside "Cog", but I guess it's more that the branch name means "this is where we are concentrating on developing Cog", with any changes to "Stack"-only code being a side effect.

While I believe it would be good to have more short-lived branches (e.g. not a long-lived "Sista" branch, but a "this-new-feature-for-Sista" branch that goes away once CI tests pass and the PR is integrated), it's hard to change habits, and while the existing approach might not scale, it's working for the number of main collaborators we currently have.

 
and use Tags to mark releases like Squeak5, Pharo6 etc. This will reduce clutter and make it easy for CI builds and tests.

Since tags don't show up in the graph view...
I believe we need branches to mark major versions and tags to mark point releases,
but just for human tracking, not having any impact on CI builds.

cheers -ben




Re: VMMaker CI under opensmalltalk-vm (was Re: OSProcess fork issue with Debian built VM)

Ben Coman
In reply to this post by Ben Coman
 


On Tue, Jun 13, 2017 at 1:39 PM, Ben Coman <[hidden email]> wrote:
 

When http://source.squeak.org is updated, this would trigger the CI to copy it into "mc-mirror"  
then build both "generated.squeak/" & "generated.pharo/".   The two would be cross compared and if identical, 
then "generated/" would be updated with substantive changes only.

btw, I'd set this up so that updating "generated/" is a separate script from the one that updates "mc-mirror/" and "generated.squeak/" together,
to help manually investigate differences between "generated/" and a new "generated.squeak/".
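
For that manual investigation, a per-file unified diff between the checked-in copy and the fresh one is probably enough to start with. A minimal sketch using Python's difflib; the path `spursrc/foo.c` and the function name are hypothetical:

```python
import difflib

def review_diff(old_text, new_text, rel_path):
    """Unified diff of one file between generated/ and generated.squeak/."""
    return "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="generated/" + rel_path,
        tofile="generated.squeak/" + rel_path,
    ))

# Made-up contents standing in for the checked-in and fresh versions.
old = "sqInt f(void)\n{\n    return 1;\n}\n"
new = "sqInt f(void)\n{\n    return 2;\n}\n"
report = review_diff(old, new, "spursrc/foo.c")
print(report)
```

Identical inputs give an empty string, so the script that updates "generated/" could skip any file with an empty report.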
 
cheers -ben

