Howdy!
I want to build Pharo images using Nix. On the one hand this should look basically like the Jenkins scripts, but on the other hand the build must run inside a sandbox that is isolated from the network. So I need to download all of the code that I need ahead of time, because I am not able to contact Github, SmalltalkHub, etc, once the build is running.

I am not sure how to approach this problem yet, but I have a preliminary question. Is there an easy way that I could "recursively" download all the Smalltalk code (etc) required for an application onto the local filesystem, and then load this into a Pharo image that does not have network access?

Concrete example: how could I write a Pharo script that downloads all the software required to install Moose in a standard Pharo 6.0 image? And then how could I load that into an image off the local filesystem? I feel like there should be a solution here, since Monticello seems to be able to cache packages locally on the filesystem, but I don't have enough experience to quite know how to approach it.

If this is possible, then the "utopian" vision would be to recursively download every Pharo package that exists, in every relevant version, and make them available in the Nix universe. This is what Nix does with Python, R, Emacs, etc. packages.

Or, if this turns out to be difficult for some reason, then I could perhaps build the Pharo image externally - e.g. on the Inria Jenkins - and then import that into Nix as an opaque binary. This is probably the simplest solution, but it is not very appealing from my Nix-centric viewpoint because I won't have visibility into what has changed between one image build and the next.

Feedback would be much appreciated!

-Luke

P.S. Screenshot from the Pharo part of my application that is starting to come together, visualizing LuaJIT compiler IR code to make it more understandable than disassembly listings: https://github.com/raptorjit/raptorjit/pull/63#issuecomment-310138536
On Fri, Jun 23, 2017 at 4:20 PM, Luke Gorrie <[hidden email]> wrote:
I guess if you pre-seed the package-cache, Pharo won't try to go to the network to find the package. And with git repos, Pharo would be built from a local clone.

cheers -ben
Hi Ben,
On 23 June 2017 at 15:20, Ben Coman <[hidden email]> wrote:
That looks really promising, thanks for the link!

I suppose that in the simplest case I could simply download every mcz file into the package cache, and then I should be able to load any package without going over the network?

Then the more sophisticated step would be to understand the dependency relationships, so that I know which subset of the packages I need in order to install a given package, e.g. Moose depends on Roassal depends on Trachel, etc. Is there an easy way to dump this dependency relationship, e.g. to a text file?

Is there also an easy way to determine the complete set of mcz files in the relevant universe? (The link above only covers projects hosted on SmalltalkHub, right? So I would also need to know where to find all the potentially referenced projects on Github, etc, too?)
Hi again Ben,
I have an idea for a hacky way to import all of the Pharo packages into my universe. Shooting holes in this idea would be welcome :).

Suppose that for each package "P" in the catalog I want to make a list of the recursive dependencies (mcz files) and their sha256 hashes. I would do this using a script that runs Pharo outside the sandbox and with access to the network:

- Start Pharo with an empty package cache.
- Download "P" and all of its dependencies automatically with Gofer.
- Inspect the package cache to see every mcz file that was downloaded and its sha256 hash.

Then I could use this process to bootstrap a package repository:

- Have a copy of all the .mcz files in a common ftp directory.
- Have a mapping of filename -> sha256 to provide to Nix.
- Have a mapping from package name ("P") to all .mcz dependencies.

This package repository could then be used from Nix, such that I say which packages I want and it automatically downloads all of the required .mcz files, validates them with sha256 to make sure the build is reproducible, and then loads them all into an image and dumps the result ready for use. So on the Nix side I would write something like:

  nix-build -E 'pharo6 { packages = [ roassal2 neojson ... ]; }'

and it would automatically create an image for me based on the package repository definition that it finds in Git.

Sane? Workable? Fatal flaws?
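P.S. For the "inspect the package cache" step, I imagine something as simple as this (untested sketch; the path is a placeholder):

  "List every mcz file that ended up in a freshly populated package cache."
  ('/tmp/pharo-crawl/package-cache' asFileReference children
      select: [ :each | each extension = 'mcz' ])
      do: [ :each | Transcript show: each basename; cr ].

The sha256 of each file could then be computed outside the image afterwards (e.g. with nix-hash), so that Nix can verify the downloads later.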
On Fri, Jun 23, 2017 at 11:59 PM, Luke Gorrie <[hidden email]> wrote:
That is exactly the approach I was thinking of also. But btw, note that IIUC the bootstrap hasn't got as far as programmatically creating an empty file and writing nil, true, false to it yet (if ever?). You will always be starting with a minimal.image rather than an empty.image.

cheers -ben
In reply to this post by Luke Gorrie
Hi Luke,
On Fri, Jun 23, 2017 at 8:59 AM, Luke Gorrie <[hidden email]> wrote:
Workable? Yes. Fatal flaws? None. Sane? Why introduce your own ftp directory? All you're doing is replicating all the repositories out there.

Basically you want to seed a package-cache with any and all versions of the relevant packages. The ftp directory looks like a superfluous step. What you need to construct is the repository crawler that locates as many versions as you can find. The list of versions should be in each package; it's part of its history. You then have to find out where they are. As packages move from repository to repository (a sad fact of life), tracking down these versions becomes difficult.

_,,,^..^,,,_
best, Eliot
On Sat, Jun 24, 2017 at 12:58 AM, Eliot Miranda <[hidden email]> wrote:
btw, you may find World > System > System Reporter > MC Working Copies interesting, which is coded at SystemReporter>>reportWorkingCopies:
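Roughly the same list is available from a script, something like this (untested sketch):

  "Print every Monticello working copy currently in the image."
  MCWorkingCopy allManagers do: [ :each |
      Transcript show: each printString; cr ].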
I guess it's an attempt to shortcut that pain in five years, when some Linux distribution wants to recreate a particular historical release.

cheers -ben
In reply to this post by Luke Gorrie
For those projects described with Metacello baselines or configurations, you can issue a #fetch command to download all the Monticello packages needed to load the project into your image, e.g.

  Metacello new
      configuration: 'Seaside3';
      repository: 'http://www.smalltalkhub.com/mc/Seaside/MetacelloConfigurations/main';
      version: #stable;
      fetch.

or

      fetch: #('Core' 'Tests')

etc.

It does load the project's configuration/baseline and all of the dependency projects/baselines in order to analyze them and find what needs to be downloaded, but it doesn't load any of the packages' code.

Does this help?

Paul
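P.S. If you also want to control where the downloaded files end up (so that, say, Nix can fingerprint the directory afterwards), I believe you can point the fetch at an explicit cache, something like this (untested sketch; the path is a placeholder):

  "Fetch the packages into a cache directory of our choosing."
  Metacello new
      configuration: 'Seaside3';
      repository: 'http://www.smalltalkhub.com/mc/Seaside/MetacelloConfigurations/main';
      version: #stable;
      cacheRepository: '/tmp/seaside-package-cache';
      fetch.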
In reply to this post by Luke Gorrie
On 23/06/17 17:59, Luke Gorrie wrote:
> Hi again Ben,
>
> I have an idea for a hacky way to import all of the Pharo packages into
> my universe. Shooting holes in this idea would be welcome :).
> ...
> Sane? Workable? Fatal flaws?

You might want to look at StephanEggermont/DeprecationFinder on sth. That tries downloading the latest versions of all packages in all projects of sth. It misses the team projects. It needs an old Moose (or upgrading).

Another experiment I did was StephanEggermont/MonticelloProjects. That combines all code from all the mcz's in a project and can recreate the separate versions. The deduplication is incorrect when there are inconsistent histories.

http://forum.world.st/Compact-representation-of-source-code-history-td4866993.html
http://forum.world.st/Working-with-a-compressed-Fuel-file-td4869105.html#a4869318

Stephan
In reply to this post by Eliot Miranda-2
On 23 June 2017 at 18:58, Eliot Miranda <[hidden email]> wrote:
Good question... Let me try a new iteration of the idea, please.

Here is what I _think_ I want:

- For each package in the Catalog, the list of mcz files it requires to install.
- For each mcz file, its expected sha256 hash and a basically stable download URL.

So I would use this algorithm:

  for each package P:
    empty the package cache
    install P
    inspect the package cache to discover dependencies

Then I should be able to work out every mcz file required by each package, and I should be able to calculate the sha256 hashes of those mcz files too, but will I be able to determine a suitable download URL? I don't immediately see how, so one alternative would be to manufacture a URL by mirroring the file somewhere.
In reply to this post by Paul DeBruicker
Hi Paul,
On 23 June 2017 at 22:11, Paul DeBruicker <[hidden email]> wrote:
> For those projects described with Metacello baselines or configurations you

Thanks for this idea! Could be that a "high-brow" approach based on Metacello metadata is an interesting alternative to a "low-brow" approach based on simply scraping the package-cache/ directory after installation.

Here is what I would really like (restating from another mail): to make a dump of all available packages (restricting to ones with Metacello configurations is probably fine) and, for each package, to make a list of:

- The mcz files required for installation (and any other downloaded resources, if there are some...?)
- The sha256 hash of each mcz file (assume I can get this easily once the file is downloaded)
- The URL where the file was downloaded from / should be downloaded from in the future

Do you think this can be determined in a straightforward way from Metacello metadata?
In reply to this post by Ben Coman
On 23 June 2017 at 18:56, Ben Coman <[hidden email]> wrote:
This should be fine for me. I imagine importing some images as binaries and then writing a Nix expression to customize those, e.g. adding packages, running some code, etc.

Just for my own purposes I am not too concerned about bloat or bootstrapping. I would be fine to build on the standard Pharo 6 or Moose 6.1 image. I mostly want to add my own packages, and their dependencies, and make minor customizations (e.g. by default have the Spotter, etc, only show the application domain and not the Smalltalk universe).
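Roughly the kind of customization script I have in mind (an untested sketch; 'MyApp' and the filetree path are placeholders, and in a sandboxed build the repository would be a local clone rather than a network URL):

  "Re-load my own code on top of the imported base image."
  Metacello new
      baseline: 'MyApp';
      repository: 'filetree:///build/src/myapp/repository';
      load.
  "...apply settings tweaks here..."
  Smalltalk saveAs: 'myapp'.
  Smalltalk exitSuccess.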
In reply to this post by Luke Gorrie
Hi Luke,

I think it is straightforward, if not already supported. I think you want the data that is contained in the instance of MetacelloVersionLoadDirective that is created by Metacello's record command (https://github.com/dalehenrich/metacello-work/blob/master/docs/MetacelloScriptingAPI.md#recording). But I'm not sure where to look to get the URL from which it's loading the package.

So, continuing with the Seaside example I posted, replace the #fetch with a #record and inspect the output of the command.

Hope this helps

Paul
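P.S. In other words, something roughly like this (untested sketch):

  "Record the load plan without downloading or loading any package code,
   then inspect what comes back."
  | result |
  result := Metacello new
      configuration: 'Seaside3';
      repository: 'http://www.smalltalkhub.com/mc/Seaside/MetacelloConfigurations/main';
      version: #stable;
      record.
  result inspect.

and then dig through the load directives for the package names and repositories.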
In reply to this post by Luke Gorrie
I have done some work on packaging up Pharo projects for Nix so that images can be created in the sandboxed build environment. I am a bit stuck, though.
I have a script that takes a project name and creates a specification of all the required mcz files, recursing into dependencies. Here is the spec for a small project, NeoJSON:

  [ { name = "ConfigurationOfNeoJSON-SvenVanCaekenberghe.20";
      package = "ConfigurationOfNeoJSON-SvenVanCaekenberghe.20";
      sha256 = "0y5yqmb2hjfya2s4a8pkrz0q6pdm01vrfhvfj5knk6slgsy8c2my"; }
    { name = "Neo-JSON-Core-SvenVanCaekenberghe.37";
      package = "Neo-JSON-Core-SvenVanCaekenberghe.37";
      sha256 = "054hcp1lcriby5a8gpc54gcrd9agbpjv6f0n81bk7f091n9m01z0"; }
    { name = "Neo-JSON-Tests-SvenVanCaekenberghe.36";
      package = "Neo-JSON-Tests-SvenVanCaekenberghe.36";
      sha256 = "1c3nxi09l785npjypjmzcn5bhy5sswp1agj7pfqs640vg7l80l4k"; } ]

Nix then uses this to construct a package cache directory with a local copy of each mcz file:

  $ ls -1 /nix/store/41jbwrsdpcvh684ll3qqk9lcfrrvxy59-pharo-package-cache/
  ConfigurationOfNeoJSON-SvenVanCaekenberghe.20.mcz
  Neo-JSON-Core-SvenVanCaekenberghe.37.mcz
  Neo-JSON-Tests-SvenVanCaekenberghe.36.mcz

So this looks promising: I have all the code and should be able to load it without using the network.

In practice, however, I get an error when I try to do this. I guess Metacello wants to access the network for some reason I have not anticipated. Here is my script:

  Metacello new
      configuration: 'NeoJSON';
      cacheRepository: '/nix/store/rdy7wp3cx4z86wy537idkhjb0q70xzjp-pharo-package-cache/';
      ignoreImage;
      load.
  Smalltalk saveAs: 'pharo'.
  Smalltalk exitSuccess.

and my Metacello error is here: https://gist.github.com/lukego/ab105c257457e7b9a53f6378425df9a8

Any advice to make this work? (Could I perhaps bypass Metacello entirely and just load the mcz files directly? Guessing not...?)
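For reference, the kind of "bypass Metacello" load I have in mind would be roughly this (untested sketch; I have not worked out whether hand-ordering versions like this scales to bigger projects):

  "Load exact package versions straight from the local cache directory."
  Gofer it
      directory: '/nix/store/41jbwrsdpcvh684ll3qqk9lcfrrvxy59-pharo-package-cache';
      version: 'Neo-JSON-Core-SvenVanCaekenberghe.37';
      version: 'Neo-JSON-Tests-SvenVanCaekenberghe.36';
      load.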
JFYI: My conclusion is that there is too much friction when building Pharo images under Nix. Nix and Pharo seem to make opposite trade-offs on the reproducibility-vs-convenience axis. Pharo hackers seem to expect a lot of freedom during builds, like unrestricted internet access, and that doesn't really fit with Nix's goal of referentially transparent builds. Damien Cassou tried to tell me that I was on the wrong track from the beginning, but I was too stubborn to listen :).

So I reckon that I will fall back to using two parallel build/CI systems. I'll build the Pharo images using Jenkins and then import them into Nix as binary blobs. This should work fine. It just means that I depend on two CI systems instead of one, and that I'll need to maintain my image builds using different tools than the ones I use for my other software builds. (I will stick with Nix for building the VM for now...)

On 30 June 2017 at 14:27, Luke Gorrie <[hidden email]> wrote:
In reply to this post by Eliot Miranda-2
On 23 June 2017 at 18:58, Eliot Miranda <[hidden email]> wrote:
There did seem to be a fatal flaw: Metacello doesn't seem to want to run without network access. Pre-populating the package cache with .mcz files doesn't seem to be sufficient. I'm sure it's possible to achieve, but it was beyond my skills and patience.

I have found a reasonably comfortable compromise now, though. I build my image in two steps. First, a "base" image is built in the Jenkins universe using Metacello and the stable versions du jour of all the dependencies. Second, the base image is imported into the Nix universe and extended using very low-brow Smalltalk scripts to force a reload of my own packages and set up the settings that I want.

This seems fine. The base image can be updated by Jenkins every month or so: just when I need to add a dependency or pull fixes to an existing one. The derived image can then be built many times per day within the same workflow as the rest of my application.

Just in case anybody is curious, here is the Nix code that I'm using to import the base image, add my extensions, and create a startup script that runs the image under vncserver+ratpoison for easy remote access to the GUI: