On Thursday, July 26, 2012 4:43:41 AM UTC+1, Sean DeNigris wrote:
Sure - the main thing on my wishlist is image-wide diffs between builds; that would make it a lot easier for me to follow what's going on. How do people follow development in general? Do they look at every slice via the MC browser? That seems like hard work compared to lazily reading diffs via a mailing list, web page, github, etc.
In reply to this post by Patrik Sundberg
----- Original Message -----
| From: "Patrik Sundberg" <[hidden email]>
| To: [hidden email]
| Sent: Wednesday, July 25, 2012 9:32:31 AM
| Subject: Re: [Metacello] Re: GitHub Sample project WTF?
|
| On Wednesday, July 25, 2012 4:33:22 PM UTC+1, Dale wrote:
|
| comments embedded...
|
| ----- Original Message -----
| | To: [hidden email]
| | Sent: Tuesday, July 24, 2012 3:18:02 PM
| | Subject: [Metacello] Re: GitHub Sample project WTF?
| |
| | I did some more playing with this and have some more comments/questions below
| |
| | On Tuesday, June 19, 2012 2:00:35 PM UTC+1, Dale wrote:
| |
| | The Sample project was downloaded from GitHub without using the standard Monticello HTTP repository. Instead a zipfile of the repository contents is downloaded and unzipped into the git-cache directory (in parallel with package-cache). In the MonticelloBrowser you will see a number of github: repositories. WTF?
| |
| | I see the directory is called github-cache in my local setup after running the commands - I like git-cache better since I for example maintain some private projects in my own repo (fronted by gitlab, great project if you haven't seen it).
|
| The github-cache is only used to cache downloads from github. Metacello doesn't automatically clone projects from github (right now). For cloned git projects you would use a filetree repository and the clone can be located anywhere on disk.
|
| The short version of my point was more that it seems unnecessary to special case github. Why not include it in a more general scheme so that any provider supporting the same API can be used, i.e. gitlab is a way to run your own private github so to say and it is (mostly) API compatible.

My short response is that I'm within 24 hours of releasing the preview so I'm not going to make any more changes until after I push this son-of-a-gun out the door.
| | The other thing I don't quite get is the "intermediate" directory at the end called something like:
| | xxx/github-cache/dalehenrich/sample/cc46e15368c6b7cf3667427faaf6167d35044f78/dalehenrich-sample-cc46e15
| |
| | I get all the bits up to the last part - why is that last part needed and what does it signify?
|
| The things in the github cache are downloads of the zipped contents of the directories, not cloned git repositories. The 'dalehenrich/sample/cc46e15368c6b7cf3667427faaf6167d35044f78' part of the path comes directly from the repository spec from the project reference in the configuration. The 'dalehenrich-sample-cc46e15' is the root directory in the zip archive ...
|
| I understand that today these are just "plain" files transferred via the github API's ability to throw a zip together for a specific revision. However, I don't see why we'd need to limit ourselves to that special case of file transfer in terms of how things are laid out and named etc. My thinking was to make it a bit less special case in terms of design of layout etc, but for now just support the specific file transfer case.

To be honest I haven't even considered that Metacello might be auto-cloning repositories. My plans have been to get the preview in the hands of developers to start collecting feedback. Then after the dust has settled (say September) to start thinking about how git would be integrated into Metacello:)

| So in that line of thinking it seems unnecessary to me to have that last "zip file random bit" in the path. A little bit more generic conventions and it'll be future proof for working with general git repos and transfer methods via proper URIs.
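A minimal sketch of how the cache path discussed above is put together, written in Python purely for illustration (Metacello itself is Smalltalk). The function name is hypothetical, and the rule that the zip root directory uses the first 7 characters of the SHA is an assumption read off the example path in the thread, not a statement about Metacello's actual code.

```python
from pathlib import PurePosixPath


def github_cache_path(cache_root, user, project, sha):
    """Compose <root>/<user>/<project>/<sha>/<zip root dir>.

    The first three components come from the repository spec; the last one
    is the root directory found inside the downloaded zip archive.
    """
    # Assumption: the zip root is "<user>-<project>-<first 7 chars of SHA>",
    # matching 'dalehenrich-sample-cc46e15' in the example above.
    zip_root = f"{user}-{project}-{sha[:7]}"
    return PurePosixPath(cache_root) / user / project / sha / zip_root


if __name__ == "__main__":
    print(github_cache_path(
        "xxx/github-cache",
        "dalehenrich",
        "sample",
        "cc46e15368c6b7cf3667427faaf6167d35044f78",
    ))
```

Seen this way, the final component adds no information beyond what the earlier components already carry, which is the redundancy being questioned.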
| | A scheme for making it more generic would be:
| | xxx/git-cache/URI/branch|tag|SHA/the_part_i_dont_yet_get (if the last part is needed)
| |
| | the URI would be e.g.:
| | http://github.com/sundbp/reponame.git
| | https://github.com/sundbp/reponame.git
| | git://github.com/sundbp/reponame.git
| | [hidden email]:sundbp/reponame.git
| |
| | Obviously to include it in the filename we'd need to encode it to be a valid filename somehow. I do think that being future proof is worth it and then thinking in general git URIs makes sense. Also, it may be worth taking a look at how bundler organizes its git cache (given it's relying on git for fetching etc and it'd be good to go that way later even if just supporting http get for now).
|
| Since I'm not that familiar with Bundler...Does bundler automatically clone git repositories from github?
|
| Yes. The ruby community uses nothing but git for most things they can.

The Smalltalk community is just being introduced to git ... before I can jump into git with both feet I have to feel comfortable that the Smalltalk community will see the same advantages that I see ... I am optimistic, but I realize that providing a smooth path between the two universes is the most important thing right now ...

| The corresponding terms as I see it are:
|
| ruby - pharo
| rubygem - monticello package
| bundler Gemfile - metacello config
| bundler - metacello
| git - git
|
| In some ways metacello is more powerful in building the graph of dependencies relying on other metacello configs, bundler can only rely on gems, not other bundles (although gems have dependency info as well). But bundler is more mature in terms of integrating with rubygems living in git repos and other such things.

Agreed. I am very interested in learning from the Bundler experience and don't have the bandwidth to learn firsthand:) ...
So I greatly appreciate this information and the time that you are spending.

| I just checked and what bundler does is clone git repos (from anywhere via any protocol (http, https, git, ssh), not just github) into a "bundler" directory in the normal rubygems cache directory structure. Inside there it creates versions like "rubygemname-shortSHA", so a very flat structure relying only on SHA and not involving the repo URL.

I now see why you are harping on this point .... what I am doing with the straight downloads (no git metadata) looks similar to what bundler is doing with clones of git repositories ... I've grabbed a ruby guy that's here in the office (Peter McLain) and now that you've opened my eyes I am asking him more pointed questions about Bundler:)

| I guess that's enough since it'll be unique. So if you depend on say the rubygem foo located in the "stable" branch of a specified git repo, it'll go and ask the repo what the latest revision on stable is, get the SHA, create a directory for that named for example "foo-5f9317f0f789" and clone the repo on that revision in there, and in the bundle we're with it'll lock the foo gem to that location. If another bundle relies on a different revision it'll lock to a different dir for that (I guess the various bundles can be thought of as analogous to images, i.e. things are specified and loaded with dependencies in a consistent way per bundle (i.e. a running process has a 1-to-1 mapping with a bundle, which is why metacello and bundler aren't quite the same since a bundle can't rely on other bundles)).

I had also ruled out cloning git repositories for each SHA because it is much faster to download the bits for a SHA than to download the whole git repository ... I have seen minutes worth of download time for some git repositories ... But I do understand that the download time is a function of the size of the repo...
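The two naming ideas that have come up so far (encoding the full repository URI into the path, versus the flat "name-shortSHA" convention described for bundler) can be compared side by side. This is a sketch in Python under stated assumptions: both function names are hypothetical, percent-encoding is just one possible way to make a URI filename-safe, and the 12-character short SHA is taken from the "foo-5f9317f0f789" example rather than from bundler's source.

```python
from urllib.parse import quote


def uri_cache_dir(uri, ref):
    """URI-based scheme: <percent-encoded URI>/<branch|tag|SHA>."""
    # safe="" also encodes "/" and ":" so the URI becomes one path component.
    return f"{quote(uri, safe='')}/{ref}"


def flat_cache_dir(name, sha, short_len=12):
    """Flat scheme as described for bundler: <name>-<short SHA>."""
    return f"{name}-{sha[:short_len]}"


if __name__ == "__main__":
    # Hypothetical full SHA, padded only so the example has 40 characters.
    sha = "5f9317f0f789" + "0" * 28
    print(uri_cache_dir("https://github.com/sundbp/reponame.git", "stable"))
    print(flat_cache_dir("foo", sha))
```

The flat form is shorter but relies on the SHA alone for uniqueness, which only identifies the source when the repository itself is available to answer questions about it.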
I see that the advantage to downloading the repo is that one has the opportunity to immediately work on fixing bugs/adding features if you have the repo already downloaded whereas there'd have to be an extra step involved if we only downloaded the bits ...

I agree with your sentiment that eventually Metacello will want to support a mode of operation where clones of repositories are downloaded from a wide variety of sources and I agree that I need to spend some time thinking through how this will be done ... Unfortunately we've baked in the notion that `http:` signifies a web-based monticello package (mcz) repository so I need to think more about this...there are alternate possibilities...

| Given that I'd revise what I wrote above to simpler URIs involving just repo name and SHA. The SHA would be good enough - and if later we have real repos the repo itself knows all meta info about itself so no use having it in the naming conventions. Based on that I'd suggest instead making:
|
| xxx/github-cache/dalehenrich/sample/cc46e15368c6b7cf3667427faaf6167d35044f78/dalehenrich-sample-cc46e15
|
| into:
|
| xxx/git-cache/sample/cc46e15368c6b7cf3667427faaf6167d35044f78/ (and in there are the plain files for now, at some point could be a git repo).

To pick at some fine points...the dalehenrich (github username) is necessary to disambiguate between the sample project coming from different users.... the redundant dalehenrich-sample-cc46e15 at the end of the path is simply an implementation detail - the dalehenrich-sample-cc46e15 is the root directory that is in the zip file downloaded from github and I am making my life a bit easier by not having to munge around with the contents of the zip directory ... The `github-cache` is not a permanent structure so I don't feel the need to fine tune this implementation detail at the present time ...
when we download git clones (note I haven't said if:) they will be located in a different directory structure from the one that I am using currently so I will be free to change that detail in the future... The specification used in the configurations/baselines does need to be thought through again, though.

| btw, bundler just shells out to use the normal git installation of the machine it's running on - no native ruby implementation of git protocols etc, no need to when the command line tool gives you all you need + you get things like config etc consistent across all usage of git. And reusability: if git adds a new transfer protocol or storage format then no need to update anything, just continue to use the git command as usual. So the same way you currently do a curl and zip you could do a git fetch/clone. I'm not saying it makes sense to focus on that, just pointing out how in principle one can start migrating at some point.
|
| For the Smalltalk community at the moment, I can't demand that folks install git on their computers. I'm trying to make the barrier to using git/github as low as possible.
|
| I agree with that - since I'm very comfortable with git/github my point is more to not lock into conventions based on the special case of zip file transfer only. If possible let's use conventions that are less special case.
|
| I want the use of github repositories to be transparent to folks who are simply consuming the bits from a project and zip downloads give me that.
|
| Yes, right approach. Just being able to follow projects on git will make it so much easier for people to participate in a lazy way.
|
| If a developer is going to do development on a project, then it makes sense to require git (but with Monticello even git wouldn't be required).
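To make the "curl and unzip today, git fetch/clone later" comparison concrete, here is a rough sketch of the two sets of external commands as argument lists. It is Python for illustration only; the function names are hypothetical, and the zipball URL shape is an assumption about how GitHub serves archives, not something stated in this thread. Nothing is executed here; each list could be handed to `subprocess.run(cmd, check=True)`.

```python
def zip_download_commands(user, project, ref, dest):
    """Commands for the zip-based approach: fetch an archive, then unpack it."""
    # Assumption: GitHub serves an archive of <ref> at this URL.
    url = f"https://github.com/{user}/{project}/zipball/{ref}"
    archive = f"{dest}/{project}.zip"
    return [
        ["curl", "-L", "-o", archive, url],     # -L follows redirects, -o names the output file
        ["unzip", "-q", archive, "-d", dest],   # -d extracts into the cache directory
    ]


def git_clone_commands(repo_url, ref, dest):
    """Commands for the git-based approach: clone, then check out the ref."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", ref],   # -C runs git inside the clone
    ]


if __name__ == "__main__":
    for cmd in zip_download_commands("dalehenrich", "sample", "master", "github-cache"):
        print(" ".join(cmd))
    for cmd in git_clone_commands("https://github.com/dalehenrich/sample.git", "master", "git-cache/sample"):
        print(" ".join(cmd))
```

The zip route needs only curl and unzip on the machine, which is the low barrier argued for above; the clone route needs git installed but leaves a working repository behind.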
| There is work ongoing with providing an in-image interface to git/github and I anticipate that when that is released we'll be able to support direct git manipulation from image-based tools.
|
| Yep, at some point having a 2-way workflow would be sweet. It's not clear to me yet how one would go about it - given that one usually has 1 active part of a project, which one wants 2-way activity on, and several dependencies that are passive and where one would only ever read. That's a later question to crack though.
|
| | The ConfigurationOfSample has versions 0.8 and 0.9, but when you look at the spec methods, you just see a #baseline: reference and no mcz files. WTF?
| |
| | So the branch|tag|SHA is given in the repository string. It'd probably be sweeter to have #branch, #tag, #sha messages explicitly and the repository is used to just point to the git repository, that'd be the more natural way to specify it with git, and that way we can have the baseline specify the repo, with the versions specifying the branch|tag|SHA.
|
| I understand what you're getting at, but we're not talking about directly monkeying with git versions right now. Remember the github repository reference is a download of the directory structure for a particular branch|tag|sha. It's pretty static on disk so a static string is appropriate.
|
| When and if we start auto-cloning git repositories, then I will agree that providing separate messages will make sense.
|
| Fair enough - however it sounds like at that stage one would need to update the URI structure completely. I guess if one keeps github://xxx for the current scheme and uses the more general git://xxx in the future that's no harm. Only bit I can think of is that it could be nice to keep github:// for the general case (as a shortcut) and use a variation for the special case of now, perhaps github-zip:// or something like that?
this is a very good suggestion ...

| | I have not thought about how that integrates with non-git repos in the above, just what the natural git way of doing it would be. It may not mesh well with non-git.
|
| Yes, the current github: specification is a non-git repo.
|
| Get that, trying to think of ways where it won't interfere or cause problems when they could be git repos.
|
| | The BaselineOfSample has a single baseline: method and it should look like a familiar baseline version spec, just in a separate class. WTF?
| |
| | no surprises to me.
| |
| | Git isn't involved for downloading and installing projects from GitHub. Curl and unzip are being used and are called using OSProcess. WTF?
| |
| | As noted above, I'd vote for making the underlying naming etc more general to be future proof, but for now just support github and http/https that we can do without using git at all and just fail for any other case for now. However, laying the groundwork will pay off the day we want to have complete two-way support to also push from within the image using git+ssh etc (having the git-cache holding "real" git repo clones).
|
| As for future proofing I expect that the specification for git repository will be different (in some way) and the current convention will always mean "download the bits for that SHA".
|
| yes - but may be worth saving the really nice names and conventions for the more general case.
|
| | Version 0.8 references a specific git commit and will not change ... so it's a lot like a traditional released version for a Metacello configuration...only the unchanging part is enforced by git.
| |
| | Logical, my only input is the above regarding separating repo and version (in the general sense of branch|tag|SHA) if possible.
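The idea of separating the repository from the branch|tag|SHA can be sketched as a small data structure, so that a baseline could hold the repository and each version could hold only the ref. This is Python for illustration, not Metacello's API; the `RepoRef` type, both function names, and the combined `scheme://user/project:ref` string format are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoRef:
    repository: str  # where the project lives; would sit in the baseline
    ref: str         # branch, tag, or SHA; would sit in each version


def split_spec(spec):
    """Split a combined 'scheme://location:ref' string (hypothetical format)."""
    scheme, sep, rest = spec.partition("://")
    if not sep:
        raise ValueError(f"missing scheme in {spec!r}")
    # Split on the last ":" so the ref is whatever follows it.
    location, sep, ref = rest.rpartition(":")
    if not sep or not location or not ref:
        raise ValueError(f"missing ref in {spec!r}")
    return RepoRef(f"{scheme}://{location}", ref)


def join_spec(repo_ref):
    """Rebuild the single static string from its two parts."""
    return f"{repo_ref.repository}:{repo_ref.ref}"


if __name__ == "__main__":
    pinned = split_spec("github://dalehenrich/sample:cc46e15368c6b7cf3667427faaf6167d35044f78")
    moving = RepoRef(pinned.repository, "master")
    print(pinned)
    print(join_spec(moving))
```

With the parts held separately, pointing a version at a different ref changes one field rather than rewriting the whole string; with zip downloads only, the single static string carries the same information.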
| | Version 0.9 references the HEAD of the master branch...If you load version 0.9 (which is blessed as #development BTW) you will get the #bleedingEdge version of the project ...
| |
| | ConfigurationOfSample is managed in the configuration branch of the Sample project[2]. The only package on the configuration branch is the ConfigurationOfSample. WTF?
| |
| | We talked about this elsewhere, I'm pretty comfy with it being in a separate branch having chewed on it a bit.
| |
| | Drawing some inspiration from bundler again I guess the way it works there is the packages of the project itself are implicitly the same as that of the checked out bundle file. Drawing an analogy would be to have the ConfigOfXXX in the normal branch, but implicitly use the same version as the ConfigOfXXX for the project's own packages. That however doesn't jibe with the metacello layer on top of the git layer, so I think you're right a separate config branch is the natural place to put it.
| |
| | BaselineOfSample is managed along with the rest of the packages in the master branch of the Sample project. It is embedded with packages and changes whenever new packages or projects or groups are added or removed ... otherwise it is left untouched.
| |
| | I could go either way on this one. No issues with that one in the same way as ConfigOf and could be kept in the project branch(es). It could also go in the config branch. If in the project branches then we can have for example a pharo2.0.x branch with a different baseline spec than gemstone branch, but both pinned down by the same ConfigOf. However, could also use the inbuilt metacello features for managing the platform specific stuff, and then I'm more thinking it goes with the ConfigOf in the configuration branch to group all version and package management info in one branch.
| The BaselineOf really needs to be embedded with the rest of the packages ... if a new package is added or removed, or there is a new dependency in a particular version of the project then the BaselineOf must be changed in the same git commit as the other changes since the two must be in sync ... The BaselineOf is the equivalent of a makefile (do you have these in ruby?) and it is tightly coupled to the files present in the project, so it doesn't make sense to put the BaselineOf in a separate branch.
|
| Ah, I see your thinking. I guess trying to make every git commit always consistent in principle is a worthy goal. I was implicitly assuming a 2 commit step - thanks for pointing it out more clearly.
|
| No makefiles in ruby, the same kind of meta info comes from the rubygem spec or bundler Gemfile depending on the setting. And yes, one would commit a change where the rubygem in development depends on another gem by updating the gemspec at the same time as the code using it. Which is why initially it felt wrong on gut feel to have the ConfigOf on the outside. The same issue of ConfigOf never arises in the ruby/bundler parallel. I'll contrast that a bit more in detail once I've thought it through.
|
| The ConfigurationOf has to be in a separate branch because it is the meta layer over the platform branches.
|
| How does that reasoning stack up with yours?
|
| | Nowhere do you see fully qualified mcz file names...except in the MonticelloBrowser ... so mcz versions are being maintained, it's just that for a git/github project you don't have to constantly change them, like you do in a traditional Metacello project.
| |
| | At the moment, the MetacelloPreview is functional, but only the `load` command has been fleshed out. Not looking for scripting API WTF, but if you've got some ...
| | fire away:)
| |
| | Scratch your head, wrap your brain around the concept of BaselineOf, ask questions, yell, now is a good time:)
| |
| | The Sample project is the future of Metacello...
| |
| | Indeed - I see a lot of things I like here!
| |
| | Let me know if there are specific things you want tried out or otherwise looked at.
|
| I am looking forward to more of your feedback ... I think that moving forward we need to gain experience using Metacello/git/github in real work scenarios ...
|
| Btw, do you have it written down somewhere how your current workflow looks from the developer side? (not just as a consumer of things from git/github but publishing to it as well). I'd be interested to know.
|
| Currently I am accommodating git without embracing it, because I think that trying to include git in this round would be biting off more than we can chew.
|
| My head has fully embraced it for several years so I'm bound to think more aggressively about it :)

Despite picking over the fine points I really appreciate your willingness to share your knowledge of Bundler ... you have given me wonderful food for thought and just as you are digesting Smalltalk I am digesting the information about Bundler (I will also pick Peter's mind about Bundler details) ... continue to be aggressive in pointing things out ... I reserve the right to disagree with you at the detail level, but I understand that you are trying to make Metacello better and I want things to be questioned ... as I've said before, this is exactly what the preview is for and I will be making changes based on what you have said ...

| P.S.
| My first dive into 2.0 and metacello didn't go so well, the metacello bootstrap is broken and I got frustrated about the combination of lack of knowledge to effectively dig into it, and lack of across-build diff availability to make it easier to find the starting point (Sean seemed to have narrowed it down in another thread but without image-wide-diff I'm not sure how to start digging further).

They are making a number of significant changes in Pharo-2.0 so it will be unstable for some time ... Pharo-1.4 is a better choice as a starting point because the rate of change for Pharo-1.4 is much less.

Image wide diffs is something that is sorely lacking from the Smalltalk experience ... what I expect in the future is that there will be a git repository with all of the base (and optional base) packages that will then make it feasible for doing image-wide diffs .... as things stand today you have to do things by hand...

Dale
On Thursday, July 26, 2012 7:57:39 PM UTC+1, Dale wrote:
Right approach - treat all my points as suggestions for the future and devil's-advocating the preview. Some usage will tease out a lot more info than postulations based on extrapolation from other similar systems. If it sounds like I have strong opinions on the details it's just an illusion - my only strong opinion is that smalltalk+monticello+metacello+filetree+git+github is the future one way or another :)

Cool - since I'm fully immersed in git I'm jumping 2 steps in 1. Hopefully it'll be useful.

Yes - smooth path is important (although I'm naturally not very conservative towards big-bang changes so feel free to haircut me a bit :)). My thinking around naming conventions and directory structure was with a future smooth path in mind, i.e. if you have a cache with packages in it already based on this scheme and then you change it a lot you'll have an obstacle to change since one either needs to make sure users' caches are wiped (may not be popular), or handle both in parallel (tends to be messy).

I'll do my best to fill in info that seems relevant and am happy to answer any questions as well. Their code isn't too tricky and I've dug in it before so even things I don't know aren't that hard for me to dig up.

| | I guess that's enough since it'll be

I'm not 100% sure here but there may be some way of doing shallow clones using fetch rather than clone (and possibly that's what bundler does), I'll dig into that more. A quick google suggests I'm right on that being possible: http://stackoverflow.com/questions/6941889/is-git-clone-depth-1-shallow-clone-more-useful-than-it-makes-out

| I see that the advantage to downloading the repo is that one has the opportunity to immediately work on fixing bugs/adding features if you have the repo already downloaded whereas there'd have to be an extra step involved if we only downloaded the bits ...
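On the shallow-clone point raised above: git does accept a `--depth` option on both `clone` and `fetch`, which limits how much history is transferred. A small sketch of the command lines, in Python for illustration with hypothetical function names; whether bundler actually does this is not established in this thread, and the remote name `origin` is assumed.

```python
def shallow_clone_command(repo_url, branch, dest, depth=1):
    """git clone limited to the most recent `depth` commits of one branch or tag."""
    # --branch takes a branch or tag name here, not an arbitrary SHA.
    return ["git", "clone", "--depth", str(depth), "--branch", branch, repo_url, dest]


def shallow_fetch_command(branch, depth=1, remote="origin"):
    """git fetch variant, for refreshing an existing shallow clone."""
    return ["git", "fetch", "--depth", str(depth), remote, branch]


if __name__ == "__main__":
    print(" ".join(shallow_clone_command(
        "https://github.com/dalehenrich/sample.git", "master", "git-cache/sample")))
    print(" ".join(shallow_fetch_command("master")))
```

A depth-1 clone transfers roughly one snapshot plus git metadata, which narrows the download-time gap with the zip approach while still leaving a repository on disk.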
Yes, it's really not obvious to me either yet how 2-way works since one may want to lock down dependencies as read only while working on one's own packages etc. How the workflow would look is anything but clear yet, but I have a feeling there's the possibility of good things.

Yeah, without "real" git knowledge the SHA itself isn't a complete description (since the repo info is not available). Agree.

| The `github-cache` is not a permanent structure so I don't feel the need to fine tune this implementation detail at the present time ... when we download git clones (note I haven't said if:) they will be located in a different directory structure from the one that I am using currently so I will be free to change that detail in the future...

I was thinking of the problem of getting existing caches as mentioned above - agree it's an implementation detail. A minor thing to ease/avoid a bad user upgrade experience (wiped cache) or supporting multiple schemes in a tricky way (hence thinking github-zip:// instead of github:// to be able to leave the current one when/if moving on).

| The specification used in the configurations/baselines does need to be thought through again, though.

I think if we nail that bit to not lock into using naming schemes we'd like to reuse for more ambitious future schemes then the implementation details between git-zip-balls matter a lot less since it won't clash and will be easy to maintain in parallel.

And please be sceptical of my crazy input until I'm more at ease with living in the image :)

| I reserve the right to disagree with you at the detail level, but I understand that you are trying to make Metacello better and I want things to be questioned ... as I've said before, this is exactly what the preview is for and I will be making changes based on what you have said ...

Yeah, a few things like SystemAnnouncements and custom slots appeal to my vision for my first project and my impression was that they're 2.0 features, hence giving 2.0 a go.
I may try postponing custom slots and go with the older events system for the time being and dabble with both 1.4 and 2.0.

| Image wide diffs is something that is sorely lacking from the Smalltalk experience ... what I expect in the future is that there will be a git repository with all of the base (and optional base) packages that will then make it feasible for doing image-wide diffs .... as things stand today you have to do things by hand...

Yes, it does seem that filetree+git is the best bet to get image-wide diffs. I think people underestimate how useful that would be. Before one has gotten used to following a lot of projects on github and getting nice complete diffs of everything as RSS feeds etc it's hard to see exactly how much it improves the experience.

Eager to play with the preview!

Patrik