Cleaning changes file?


Cleaning changes file?

laura
Hi all,

Can/should the changes file be somehow cleaned? 
How do you recommend doing so?

Love,
Laura

Re: Cleaning changes file?

Sven Van Caekenberghe-2
Laura,

You can do that using the expression:

  PharoChangesCondenser condense

Note that this also means you lose the version history of the methods inside your image. It is not done often: perhaps for a production deployment, or for a release of Pharo itself.

Sven

> On 10 Mar 2015, at 05:49, Laura Risani <[hidden email]> wrote:
>
> Hi all,
>
> Can/should the changes file be somehow cleaned?
> How do you recommend doing so?
>
> Love,
> Laura



Re: Cleaning changes file?

Marcus Denker-4

> On 10 Mar 2015, at 07:36, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Laura,
>
> You can do that using the expression:
>
>  PharoChangesCondenser condense
>
> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>
>
Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.

The current system has multiple problems:

 - Why have them *in addition* in a file on disk of everyone even though almost nobody will ever look at them?
 - We have to maintain two history mechanisms. Hard to explain: “Yes, version of a method does not show you the version you committed when you use a new image”. WTF?
 - It does not scale. We clear the history between releases as the .changes get too large. So the practical use of the history is not very high.
 - Handling .image, .changes and .sources is complex and hard to explain.
 - It is *SLOW*. Reading code from .sources/.changes is amazingly slow
 - The code implementing all this is amazingly bad. Just look at the references to SourceFiles: 44 references, all over the place.


        Marcus

Re: Cleaning changes file?

Luc Fabresse
Hi Marcus,

Speaking of the .changes and .sources files, I remember a mid-term goal of moving them inside the image.
This would imply only one file (.image) and not much wasted space; IIRC there was an experiment (done by Camillo?) where the source was zipped on the fly.
I am just wondering if this is still a goal?

Luc


2015-03-10 7:58 GMT+01:00 Marcus Denker <[hidden email]>:
>
>> On 10 Mar 2015, at 07:36, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> Laura,
>>
>> You can do that using the expression:
>>
>>  PharoChangesCondenser condense
>>
>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>
> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.
>
> The current system has multiple problems:
>
>  - Why have them *in addition* in a file on disk of everyone even though almost nobody will ever look at them?
>  - We have to maintain two history mechanisms. Hard to explain: “Yes, version of a method does not show you the version you committed when you use a new image”. WTF?
>  - It does not scale. We clear the history between releases as the .changes get too large. So the practical use of the history is not very high
>  - handling .image, .changes and .sources is complex and hard to explain
>  - It is *SLOW*. Reading code from .sources/.changes is amazingly slow
>  - The code implementing all this is amazingly bad. Just look at the references to SourceFiles: 44 references, all over the place.
>
>         Marcus


Re: Cleaning changes file?

Thierry Goubier
In reply to this post by Marcus Denker-4


2015-03-10 7:58 GMT+01:00 Marcus Denker <[hidden email]>:
>
>> On 10 Mar 2015, at 07:36, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> Laura,
>>
>> You can do that using the expression:
>>
>>  PharoChangesCondenser condense
>>
>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>
> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.
>
> The current system has multiple problems:
>
>  - Why have them *in addition* in a file on disk of everyone even though almost nobody will ever look at them?
>  - We have to maintain two history mechanisms. Hard to explain: “Yes, version of a method does not show you the version you committed when you use a new image”. WTF?
>  - It does not scale. We clear the history between releases as the .changes get too large. So the practical use of the history is not very high

And getting lower and lower with repositories where one can query for the history of a method, such as gitfiletree.

>  - handling .image, .changes and .sources is complex and hard to explain
>  - It is *SLOW*. Reading code from .sources/.changes is amazingly slow
>  - The code implementing all this is amazingly bad. Just look at the references to SourceFiles: 44 references, all over the place.

And where it could be reused, such as for doIts, a new mechanism with on-disk files is created to store source code: the GTPlayground cache :(

But still, the changes allow you a recovery in case of a crash. It has some valuable side effects.

Thierry

Re: Cleaning changes file?

Marcus Denker-4
In reply to this post by Luc Fabresse

> On 10 Mar 2015, at 09:21, Luc Fabresse <[hidden email]> wrote:
>
> Hi Marcus,
>
> Speaking of .changes and .source files, I remember a mid-term goal of introducing them inside the image.
> This would imply only one file (.image) and not so much space wasted IIRC an experiment (done by Camillo?) where it was zipped on the fly.
> Am just wondering if this is still a goal?
>
Yes, but the devil is in the details… e.g. just compressing per method is not good enough, we need to share the compression
dictionary. Another interesting thing is that we have the symbols already in the system, thus we do not need to store all those
strings again.
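
To illustrate the shared-dictionary point above, here is a minimal sketch in Python (not Pharo code, to avoid guessing at image APIs). It uses zlib's preset-dictionary support (`zdict`). The method sources, the contents of `SHARED_DICT` and the helper names are all made up for illustration; a real system would derive the dictionary from the whole corpus, or perhaps from the symbols already in the image.

```python
import zlib

# Hypothetical method sources, standing in for the many short methods in an image.
METHODS = [
    b"printOn: aStream\n\tsuper printOn: aStream.\n\taStream nextPutAll: name",
    b"do: aBlock\n\titems do: [:each | aBlock value: each]",
    b"isEmpty\n\t^ items isEmpty",
]

# Assumed shared dictionary: hand-picked fragments that recur across methods.
# Every method is compressed against this same dictionary.
SHARED_DICT = b"".join([
    b"printOn: aStream\n\t",
    b"super printOn: aStream.\n\t",
    b"aStream nextPutAll: ",
    b"do: aBlock\n\t",
    b" do: [:each | aBlock value: each]",
    b"isEmpty\n\t^ ",
    b"items isEmpty",
])


def compress_alone(source: bytes) -> bytes:
    """Compress one method on its own, with no shared context."""
    return zlib.compress(source)


def compress_shared(source: bytes) -> bytes:
    """Compress one method against the shared preset dictionary."""
    compressor = zlib.compressobj(zdict=SHARED_DICT)
    return compressor.compress(source) + compressor.flush()


def decompress_shared(data: bytes) -> bytes:
    """Decompress a method; the same dictionary must be supplied again."""
    decompressor = zlib.decompressobj(zdict=SHARED_DICT)
    return decompressor.decompress(data) + decompressor.flush()


if __name__ == "__main__":
    for source in METHODS:
        alone = len(compress_alone(source))
        shared = len(compress_shared(source))
        print(f"{len(source):3d} bytes raw, {alone:3d} alone, {shared:3d} shared")
```

Each method stays independently decompressible, which matters for fetching the source of a single method on the fly, but the dictionary itself has to be stored once and kept stable for as long as any compressed method depends on it.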

Of course my dream would be some AST-like representation that is a) compressed (decompressed on the fly) and b) good enough
to present a text view to the human.

        Marcus



Re: Cleaning changes file?

Marcus Denker-4
In reply to this post by Thierry Goubier
>
> But still, the changes allow you a recovery in case of a crash. It has some valuable side effects.
>

Yes, it was genius to merge both source storage and transaction log 30 years ago… today we would
not do that.

        Marcus

Re: Cleaning changes file?

Thierry Goubier


2015-03-10 9:28 GMT+01:00 Marcus Denker <[hidden email]>:
>>
>> But still, the changes allow you a recovery in case of a crash. It has some valuable side effects.
>>
>
> Yes, it was genius to merge both source storage and transaction log 30 years ago… today we would
> not do that.

Wouldn't we? Or would we do a more compact log format, optimizing by the knowledge of what is in the image and what is inside the package repositories?

I.e., I'm not sure that the fundamental idea is wrong. Just that the environment is a bit different.

I toyed a long time ago with the idea of using git as a replacement for the changes. On every method change, you commit... It has its own way of compacting the changes. Largely overkill :)

Now, I could also see a scheme based on MC, where the source of a method is fetched from the package (with an ast cache, if needed). Not very difficult to do.

But it's not very grandiose. Just a bit of software engineering. Not interesting ;)

Thierry

Re: Cleaning changes file?

Marcus Denker-4

> On 10 Mar 2015, at 09:43, Thierry Goubier <[hidden email]> wrote:
>
> 2015-03-10 9:28 GMT+01:00 Marcus Denker <[hidden email]>:
>>>
>>> But still, the changes allow you a recovery in case of a crash. It has some valuable side effects.
>>>
>>
>> Yes, it was genius to merge both source storage and transaction log 30 years ago… today we would
>> not do that.
>
> Wouldn't we? Or would we do a more compact log format, optimizing by the knowledge of what is in the image and what is inside the package repositories?

Right now we throw away the transaction log with the image. So we are not talking about a lot of data.

> i.e. I'm not sure that the fundamental is wrong. Just that the environment is a bit different.
>
> I toyed a long time ago with the idea of having a git as a replacement of the changes. On every method change, you commit... Has a way of compacting by itself the changes. Largely overkill :)
>
> Now, I could also see a scheme based on MC, where the source of a method is fetched from the package (with an ast cache, if needed). Not very difficult to do.
>
> But it's not very grandiose. Just a bit of software engineering. Not interesting ;)

Everything is like that. It is just your *goal* that decides whether typing “do:” is part of something interesting or not.

Marcus

Re: Cleaning changes file?

Thierry Goubier


2015-03-10 9:48 GMT+01:00 Marcus Denker <[hidden email]>:
>
>> On 10 Mar 2015, at 09:43, Thierry Goubier <[hidden email]> wrote:
>>
>> 2015-03-10 9:28 GMT+01:00 Marcus Denker <[hidden email]>:
>>>>
>>>> But still, the changes allow you a recovery in case of a crash. It has some valuable side effects.
>>>>
>>>
>>> Yes, it was genius to merge both source storage and transaction log 30 years ago… today we would
>>> not do that.
>>
>> Wouldn't we? Or would we do a more compact log format, optimizing by the knowledge of what is in the image and what is inside the package repositories?
>
> right now we throw away the transaction log with the image. So we are not talking about a lot of data.

Hum. The .sources is big and duplicated in the packages.

>> i.e. I'm not sure that the fundamental is wrong. Just that the environment is a bit different.
>>
>> I toyed a long time ago with the idea of having a git as a replacement of the changes. On every method change, you commit... Has a way of compacting by itself the changes. Largely overkill :)
>>
>> Now, I could also see a scheme based on MC, where the source of a method is fetched from the package (with an ast cache, if needed). Not very difficult to do.
>>
>> But it's not very grandiose. Just a bit of software engineering. Not interesting ;)
>
> Everything is like that. Just your *goal* changes if typing “do:” is part of something interesting or not.

It also depends on a very significant factor. Will this do: be used or not?

If it won't, because the next best thing will take preeminence as soon as it is ready in a few years, then you don't invest in the do:.

Thierry

Re: Cleaning changes file?

jfabry
In reply to this post by Marcus Denker-4

> On Mar 10, 2015, at 03:58, Marcus Denker <[hidden email]> wrote:
>
>>
>> You can do that using the expression:
>>
>> PharoChangesCondenser condense
>>
>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>
>>
> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.

OK so a silly and not-so-innocent question: Why not include that in the build script *right now*? It has to be done some time, and now that the discussion is open, why not make use of the opportunity?

Also, my only use of the changes file is to recover in case of a crash. With condensed changes for every daily build I download, this would make my recovery just that bit easier. One small change ...

---> Save our in-boxes! http://emailcharter.org <---

Johan Fabry   -   http://pleiad.cl/~jfabry
PLEIAD lab  -  Computer Science Department (DCC)  -  University of Chile



Re: Cleaning changes file?

Marcus Denker-4

> On 10 Mar 2015, at 14:17, Johan Fabry <[hidden email]> wrote:
>
>> On Mar 10, 2015, at 03:58, Marcus Denker <[hidden email]> wrote:
>>
>>> You can do that using the expression:
>>>
>>> PharoChangesCondenser condense
>>>
>>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>
>> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
>> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.
>
> OK so a silly and not-so-innocent question: Why not include that in the build script *right now* ? It has to be done some time, and now the discussion is open why not make use of the opportunity?
>
> Also, my use of the changes file is uniquely to recover in case of a crash. With condensed changes for every daily build I download this would make my recovery just that bit easier. One small change …

From a size point of view it would make sense: a “Pharo4.sources” is even a bit smaller than the current .changes (which itself would be empty). Compressed it would be 15,7 instead of 13,7 due to the .changes being better compressible (it contains many useless things even by that metric…)

The problem is that we would need to see how it plays with get.pharo.org, where the VM download contains all of the Pharo1, Pharo2 and Pharo3 sources…

This is a real mess… I am looking forward to a time when we just have one .pharo image.

Marcus



Re: Cleaning changes file?

stepharo


>>>> You can do that using the expression:
>>>>
>>>> PharoChangesCondenser condense
>>>>
>>>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>>
>>> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
>>> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.
>>
>> OK so a silly and not-so-innocent question: Why not include that in the build script *right now* ? It has to be done some time, and now the discussion is open why not make use of the opportunity?
>>
>> Also, my use of the changes file is uniquely to recover in case of a crash. With condensed changes for every daily build I download this would make my recovery just that bit easier. One small change …
>
> From a size point of view it would make sense: a “Pharo4.sources” is even a bit smaller than the current .changes (which itself would be empty). Compressed it would be 15,7 instead of 13,7 due to the .changes
> being better compressible (it contains many useless things even by that metric…)
>
> The problem is that we would need to see how it plays with the get.pharo.org where the vm download contains all of Pharo1, Pharo2 and Pharo3 sources…

I do not understand your stress. If we produce a condensed changes file, what is the problem?
Are you talking about the problem of having multiple .sources?

> This is a real mess… I am looking forward to a time when we just have one .pharo image.
>
> Marcus




Re: Cleaning changes file?

Marcus Denker-4

> On 13 Mar 2015, at 09:18, stepharo <[hidden email]> wrote:
>
>>>>> You can do that using the expression:
>>>>>
>>>>> PharoChangesCondenser condense
>>>>>
>>>>> Although that also means you lose the version history of methods inside your image. This is typically not done a lot, maybe for production deploy, maybe for release of Pharo itself.
>>>>
>>>> Yes, but we need to change that. The artefact on the CI server needs to be the artefact of release. We should not have *anything* of the kind “and a day before release we call X”.
>>>> Because for sure it will be broken. And we waste resources: the changes are the history of the revision control system.
>>>
>>> OK so a silly and not-so-innocent question: Why not include that in the build script *right now* ? It has to be done some time, and now the discussion is open why not make use of the opportunity?
>>>
>>> Also, my use of the changes file is uniquely to recover in case of a crash. With condensed changes for every daily build I download this would make my recovery just that bit easier. One small change …
>>
>> From a size point of view it would make sense: a “Pharo4.sources” is even a bit smaller than the current .changes (which itself would be empty). Compressed it would be 15,7 instead of 13,7 due to the .changes
>> being better compressible (it contains many useless things even by that metric…)
>>
>> The problem is that we would need to see how it plays with the get.pharo.org where the vm download contains all of Pharo1, Pharo2 and Pharo3 sources…
>
> I do not understand your stress. If we produce a condense change files what is the problem.
> You are talking about the problem of having multiple .sources?

Yes. We always download *all* sources files with the VM.

Marcus


Re: Cleaning changes file?

stepharo

>> I do not understand your stress. If we produce a condense change
>> files what is the problem.
>> You are talking about the problem of having multiple .sources?
>
> yes. we download always *all* sources files with the VM.
>

I do not get it. I download the sources file only once, and after that only the VM.
So do you mean that if we generated several sources files (4.0.1, 4.0.2, 4.0.3)
it would be a mess?


Re: Cleaning changes file?

Sven Van Caekenberghe-2

> On 13 Mar 2015, at 21:24, stepharo <[hidden email]> wrote:
>
>
>>> I do not understand your stress. If we produce a condense change files what is the problem.
>>> You are talking about the problem of having multiple .sources?
>>
>> yes. we download always *all* sources files with the VM.
>>
>
> I do not get it. I only download once the source file and after only the vm.
> So do you mean that if we would generate several 4.0.1, 4.0.2, 4.0.3 sources it would be a mess?

When you download the VM, you get the V1, V2 and V3 sources; only V3 is currently needed.