[ANN] MCInfoProxy

Chris Muller-3

Re: [ANN] MCInfoProxy

> I tried to rescue my image that got broken by choosing "flush cached versions". Manually downloaded Monticello-cmm.560.mcz, tried to merge in using a file list. Didn't help, because even merging needs access to the infos. And it's a pain in the neck to debug because just opening a debugger tries to materialize the proxy again which results in an error again.
>
> This is what I mean by "fragile" and "unneeded complexity".

Ok. Since I've become comfortable debugging proxy issues in Magma for
so long, I had trouble understanding this at first. I understand your
feelings now.

As I said yesterday, the issue you encountered was not only easy to
identify from your SqueakDebug.log, it was the issue I predicted could
happen in my "Special Notes". Although I think all ancestry should be
there, the reality is it didn't take long at all for you to find a
case where it wasn't -- and so it MUST handle that.

The improvement you suggested yesterday eliminates the expectation
that an older MCVersion need be in the repository. It was a great
idea, I think it took care of that. :) Are you experiencing any
issues at all with MC.560?

And, I'll add the separate menu item so you'll be able to continue to
flush versions from the cache and still keep the in-memory ancestry
without being required to have a network connection.

PS - It sounds like Levente's solution avoids proxies, but if it
changes the file format, there'd need to be a "commitment" because
it'd be harder to go back. The Proxy way occurs just in memory, so we
can evaluate this for a while and easily upgrade or change to
something entirely different with no side effects. My app is creating
hundreds of images, so I care about size right now.
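
The in-memory proxy idea discussed above can be sketched as follows. This is a minimal illustration in Python, not Monticello's actual Smalltalk implementation; the names `VersionInfo`, `InfoProxy`, and `load_from_repository` are hypothetical. It shows both sides of the argument: the stub keeps only a name in memory, but any other access triggers a repository lookup, which fails if that version is not stored there.

```python
class VersionInfo:
    """Full ancestry record (hypothetical stand-in, not the MC class)."""

    def __init__(self, name, message, ancestors=()):
        self.name = name
        self.message = message
        self.ancestors = list(ancestors)


class InfoProxy:
    """Stub that holds only a name and fetches the full record on demand."""

    def __init__(self, name, loader):
        self.name = name        # kept in memory, so listing names is cheap
        self._loader = loader   # assumed callable: name -> VersionInfo
        self._target = None

    def _materialize(self):
        if self._target is None:
            self._target = self._loader(self.name)
        return self._target

    def __getattr__(self, attr):
        # Invoked only for attributes the proxy itself does not define.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return getattr(self._materialize(), attr)


# Hypothetical repository holding a single old version.
repository = {"Pkg-ab.1": VersionInfo("Pkg-ab.1", "initial commit")}
loads = []


def load_from_repository(name):
    loads.append(name)
    return repository[name]     # raises KeyError if the version is absent


old = InfoProxy("Pkg-ab.1", load_from_repository)
head = VersionInfo("Pkg-ab.2", "second commit", ancestors=[old])

print(head.ancestors[0].name, loads)     # name access does not load anything
print(head.ancestors[0].message, loads)  # first real access materializes

missing = InfoProxy("Pkg-ab.0", load_from_repository)
try:
    missing.message
except KeyError as err:
    print("cannot materialize:", err)
```

The last few lines mirror the failure reported in this thread: a stub whose target is not in the repository cannot be materialized, so every code path that touches it has to cope with that error.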

Bert Freudenberg

Re: [ANN] MCInfoProxy

On 2013-08-16, at 17:00, Chris Muller <[hidden email]> wrote:

>>> Did you notice that I uploaded ALL interim versions of
>>> Monticello-cmm.[552-557]? Why would I do that when technically I only
>>> needed to upload 557?
>>
>> We try to have a continuous "trunk" of versions in the trunk repository. We named it that way, even. But we do not store copies of all branches, because MC doesn't need them, and we don't need them. So versions that got merged into trunk do not need to be in trunk themselves, and for sure not their ancestors.
>
> Bottom line -- if you want to find the diffs between two old versions
> in the ancestry, you'll need them both. For you to assert "for sure
> not their ancestors" is wrong -- you CAN'T be sure. No one knows what
> might be needed in the future.

We only store trunk versions in trunk, not the non-trunk ancestors of merged versions. Seems reasonable to me.

>>> Because MC functions depend on the ancestry model matching what's in
>>> the repositories. Keeping all versions supports incremental
>>> development and rollback. Besides that we should just maintain an MC
>>> model that is "whole" and operational rather than broken. Are you
>>> concerned about disk space?
>>
>> No, I am concerned about putting even more restrictions onto Monticello. We have gradually moved from a system with very few assumptions, over a period of non-enforced conventions, to a rigidly enforced one. Version names are an example of that. And now you're adding a requirement to have an internet connection all the time because MC can unpredictably request an ancient version. I do not see that as a good idea.
>
> You know what was restrictive about the version names before? It was
> that they were dumb Strings being treated as a multi-field object,
> from 10 different places in the code, all similar but slightly
> different, and none commented. It caused paralysis because changes
> could not be made safely. It's why it took weeks for me to dissect
> and do the surgery necessary to reify that crap.

Actually the MC code base is very careful to not assign any meaning to a version name. Only the UI would try to parse it to present multiple versions in a useful way to the user. A version name *is* just a string, nothing more. Everything meaningful in MC had its own class, but version names were just that, dumb labels, intentionally. Now that you have "reified that crap" people tend to misuse it for all sorts of things.

> Did you know, Bert, that before I did that work, we were "restricted"
> to use only FileBasedRepositories? Now we have a unified API
> between all repository types.

I did not know that. But I also don't think MCVersionName would have been necessary to achieve that goal, because, again, it's supposed to be strictly a UI thing.

> Or, we DID, until recently when you and Eliot slapped that branch-name
> in it. At least it's no longer hidden like it was before
> MCVersionName, but MC has no notion of branches anywhere in its
> domain. Guess what? Projects using your feature are now stuck back
> on only FileBasedRepositories once again.

Branches have been supported by file naming conventions since the inception of Monticello. Ask Colin.

>>>> But as soon as you use MC it needs the ancestry anyway.
>>>
>>> Not all of it. We're up to version 600+ of Morphic, when was the last
>>> time version 1 of Morphic was needed? But we continue to carry that
>>> around, in and out of the system, forever.
>>
>> It does not need to load these old versions, but it often needs their version names, and sometimes the UUID, and having the commit message is useful too at times.
>
> Dodge. Please explain the use-case where Morphic.1 would need to be
> consumed by a human or the system.

Select Morphic in the MC browser. Open the trunk repo. Done.

Actually, I couldn't try it because even in a fully updated image I get a proxy error doing just that. To make sure it's not just my image I did the same using a trunk image from the build server. Same error.

(using MC-cmm.560 in both cases)

>> You're not doing anything about that need. You're just hiding it out of sight. That's not a solution.
>
> What need? Hiding what? Huh?

I thought the actual issue was that accessing the trunk repo feels slow. Okay, you're not hiding that, I take it back. (I had a mental image of hiding problems behind a proxy, but it's not that easy to verbalize).

>>> It's a gradual decline, unsustainable.
>>> Levente and I are interested in addressing this.
>>
>> A noble goal, and I agree we need to work on it, but you're not addressing it.
>
> You obviously didn't read my note to Levente in this thread which
> explained the next-step I want to take with this.

I only saw you proposing to reduce the need for materializing your proxies by ignoring older meta data. Which has nothing to do with the actual issues, cf above.

>>> We haven't lost clarity or simplicity. That's the nice thing about
>>> this solution, it changes _nothing_ about the MC model. It's very
>>> transient, all-in-memory. There's no disaster scenario.
>>
>> Wrong. Now just about anything you do can cause a file read or network access because MC is trying to materialize a proxy that shouldn't have been stubbed out in the first place. Before, each working copy could access its full ancestry data. That is a very serious change of behavior, in my book.
>
> Look, I'm glad you at least agree it's a noble _goal_. So please give
> us a solution, won't you? Please share your wildest imagination about
> how it would be possible to achieve this goal without needing to be
> connected to a repository?

I don't have a solution for that, but then I also don't see the ancestry data in the image as a big problem. We could talk about inefficiencies with the squeaksource server, but that would be a different topic.

> Levente has an alternate solution that does not employ proxies. I
> personally like the Proxy solution because it's just a simple "one
> off" solution that makes no changes to the MC model. But realizing
> the goal is more important to me than using Proxies. Perhaps, Bert,
> you would approve of Levente's solution or propose one yourself.

Levente's idea was very different. He did not propose purging anything from memory that would then have to be loaded separately on demand. He is looking for a more efficient way to store the ancestry data.

> Until then, I'll make the purging of ancestry a separate menu item, so
> you don't have to select it and you can stay happy.

That's a good idea, to avoid running into the problem by accident. Or perhaps a preference, then you wouldn't even need a menu entry. Also useful would be a menu item (or do-it) that would restore the full meta data without going through the proxy machinery (which also could get triggered when you turn off the preference).

- Bert -

Chris Muller-3

Re: [ANN] MCInfoProxy

>> Bottom line -- if you want to find the diffs between two old versions
>> in the ancestry, you'll need them both. For you to assert "for sure
>> not their ancestors" is wrong -- you CAN'T be sure. No one knows what
>> might be needed in the future.
>
> We only store trunk versions in trunk, not the non-trunk ancestors of merged versions. Seems reasonable to me.

I always considered that once something is merged into the ancestry
tree, it's part of trunk. It sounds like what you're saying is that
diffing with previous versions existing within the trunk repository is
good enough. However, that still means that diffing from the Ancestry
list could result in a debugger.

>> You know what was restrictive about the version names before? It was
>> that they were dumb Strings being treated as a multi-field object,
>> from 10 different places in the code, all similar but slightly
>> different, and none commented. It caused paralysis because changes
>> could not be made safely. It's why it took weeks for me to dissect
>> and do the surgery necessary to reify that crap.
>
> Actually the MC code base is very careful to not assign any meaning to a version name. Only the UI would try to parse it to present multiple versions in a useful way to the user. A version name *is* just a string, nothing more. Everything meaningful in MC had its own class, but version names were just that, dumb labels, intentionally. Now that you have "reified that crap" people tend to misuse it for all sorts of things.

To say the UI would "try to parse" in a context of being "useful" in
itself gives away that there are desired behaviors here based on
structure in a version name. Other parsing had found its way into our
systems, scattered about, to accommodate our SCM processes long before
I reified it. I don't know how willy-nilly naming would ever be
helpful for anything, but this is OT.

>>>>> But as soon as you use MC it needs the ancestry anyway.
>>>>
>>>> Not all of it. We're up to version 600+ of Morphic, when was the last
>>>> time version 1 of Morphic was needed? But we continue to carry that
>>>> around, in and out of the system, forever.
>>>
>>> It does not need to load these old versions, but it often needs their version names, and sometimes the UUID, and having the commit message is useful too at times.
>>
>> Dodge. Please explain the use-case where Morphic.1 would need to be
>> consumed by a human or the system.
>
> Select Morphic in the MC browser. Open the trunk repo. Done.

I said *need* to be consumed, not consumed. Right now, the system
consumes it needlessly. What I have so far doesn't change that except
on a per-package basis for now.

> Actually, I couldn't try it because even in a fully updated image I get a proxy error doing just that. To make sure it's not just my image I did the same using a trunk image from the build server. Same error.
>
> (using MC-cmm.560 in both cases)

I guess it's because you had merged versions in your image. That's
fixed in MC.561.

>> You obviously didn't read my note to Levente in this thread which
>> explained the next-step I want to take with this.
>
> I only saw you proposing to reduce the need for materializing your proxies by ignoring older meta data. Which has nothing to do with the actual issues, cf above.

Ignoring Morphic.1 seems like a safe thing to do. By setting the
history-size preference to 999999 one could easily regain access to
Morphic.1.

And activating it via purge (vs. load) is an approach that matches up
against the use-cases very well.
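
A rough sketch of how a history-size limit applied at purge time might behave, in Python rather than Smalltalk. Everything here is an assumption made for illustration: `Info`, `Stub`, `purge`, and the exact meaning of `history_size` (the number of ancestor levels kept as full records) are hypothetical and not taken from the MC code.

```python
from dataclasses import dataclass, field


@dataclass
class Info:
    """Full ancestry record (hypothetical)."""
    name: str
    message: str
    ancestors: list = field(default_factory=list)


@dataclass
class Stub:
    """Purged record: only the name stays in memory (hypothetical)."""
    name: str


def purge(info, history_size):
    """Return a copy keeping `history_size` levels of full ancestors.

    Ancestors deeper than that are replaced by name-only stubs.
    """
    if isinstance(info, Stub):
        return info
    if history_size <= 0:
        return Info(info.name, info.message,
                    [Stub(a.name) for a in info.ancestors])
    return Info(info.name, info.message,
                [purge(a, history_size - 1) for a in info.ancestors])


# A linear history: Morphic.1 <- Morphic.2 <- Morphic.3 <- Morphic.4
v1 = Info("Morphic.1", "first")
v2 = Info("Morphic.2", "second", [v1])
v3 = Info("Morphic.3", "third", [v2])
v4 = Info("Morphic.4", "fourth", [v3])

trimmed = purge(v4, 1)
print(trimmed.ancestors[0])               # Morphic.3 kept in full
print(trimmed.ancestors[0].ancestors[0])  # Morphic.2 reduced to a stub

untouched = purge(v4, 999999)             # a very large limit keeps everything
print(untouched == v4)
```

Because the trimming happens only when purge is invoked, an image that never purges keeps its full ancestry, which is the point made above about activating it via purge rather than on load.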

>> Look, I'm glad you at least agree it's a noble _goal_. So please give
>> us a solution, won't you? Please share your wildest imagination about
>> how it would be possible to achieve this goal without needing to be
>> connected to a repository?
>
> I don't have a solution for that, but then I also don't see the ancestry data in the image as a big problem. We could talk about inefficiencies with the squeaksource server, but that would be a different topic.

Not a big problem, but a problem. And a "noble" goal. :)

> That's a good idea, to avoid running into the problem by accident. Or perhaps a preference, then you wouldn't even need a menu entry. Also useful would be a menu item (or do-it) that would restore the full meta data without going through the proxy machinery (which also could get triggered when you turn off the preference).

I chose a separate menu option in the interests of simplicity.