How to avoid corrupted deferred actions

Sebastián Sastre
Hi all,

    I have an application that stores STB files (serialized models) as its
main user file type.

    With these files the user can open and store the state of their document.

    I have now found that if I open one of these models and click on a list
header to sort a list of items that is part of the model, and then store the
model on disk, I can't restore it again.

    Debugging the error that occurs while the model is being loaded, I found
that the model deserializes OK, but immediately afterwards the system tries to
evaluate some deferred actions that contain an invalid view.

    My question is: how did that view get into the model (it happens only
after sorting by the headers), and how can I avoid undesirable deferred
actions like this?

    regards,

--
Sebastián Sastre
[hidden email]
www.seaswork.com.ar


Re: How to avoid corrupted deferred actions

Schwab,Wilhelm K
Sebastián,

>     Now I found that when opening one of these models and clicking on a list
> header to sort some list of items that are part of the model, when I store
> it on disk, then I can't restore it again.

There at least was a bug in Dolphin, such that sort blocks were
captured, causing problems on re-load after sorting.  My recollection is
that this has long since been patched, but I can't speak from
experience, because my workaround #asOrderedCollection and #list: sends
are still in place (there was really no reason to remove them).


>     Debugging the error when the model is being loaded, I found that the
> model deserializes OK but immediately after it tries to evaluate some
> deferred actions that contain an invalid view.
>
>     My question is: how did that view reach the model (only when sorting the
> headers)? and how to avoid any kind of undesirable deferred actions like
> this.

First, I could be "remembering" a fix that does not exist.  Otherwise,
are you by any chance loading the model from other than the UI process?

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: How to avoid corrupted deferred actions

Louis Sumberg-2
In reply to this post by Sebastián Sastre
Hi Sebastian,

>     Now I found that when opening one of these models and clicking on a list
> header to sort some list of items that are part of the model, when I store
> it on disk, then I can't restore it again.

As Bill noted in his reply, this does sound like the old "stb filing a
sorted collection" problem.  I don't think there ever was a system fix for
this.  Try converting the model's list to an OrderedCollection before saving
it and see if that works.
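
As a rough illustration of why copying into a plain collection helps, here is
a minimal sketch.  It is Python rather than Dolphin Smalltalk, so treat it
purely as an analogy: pickle stands in for STB, and SortedItems, view_state
and save_plain are hypothetical names.  One difference to keep in mind is that
Python refuses at save time, whereas the STB problem described in this thread
only shows up on reload.

```python
import pickle


class SortedItems:
    """Hypothetical stand-in for a sorted collection that keeps its sort callable."""

    def __init__(self, items, key):
        self.key = key                      # the "sort block" travels with the data
        self.items = sorted(items, key=key)


def save_plain(sorted_items):
    # Copy the elements into a plain list so only the data is serialized.
    return pickle.dumps(list(sorted_items.items))


view_state = {"descending": False}          # stands in for UI state the callable captured
model_list = SortedItems(["pear", "apple", "fig"],
                         key=lambda s: (view_state["descending"], s))

try:
    pickle.dumps(model_list)                # tries to drag the callable along
    direct_save_ok = True
except Exception:
    direct_save_ok = False                  # pickle cannot serialize the lambda

restored = pickle.loads(save_plain(model_list))
```

The plain copy keeps the element order but drops the sort callable, and with
it anything the callable had captured.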

-- Louis


Re: How to avoid corrupted deferred actions

Chris Uppal-3
In reply to this post by Sebastián Sastre
Sebastián Sastre wrote:

>     I had an application that stores stb files (serialized models) as the
> main user application file type.

Leaving aside the specific problems, which other people have already addressed,
I want to advise against using STB in this way at all.

It's a great way to get an application up and running quickly, and -- as
such -- is a very useful short-cut during development, but I wouldn't stick
with STB in the long run.  For instance I'd say it's a mistake to go on using STB
once you reach the point where you want to keep /real/ data in the file, rather
than just throwaway test data.

The problems are two-fold.

First is that it is actually quite inconvenient using STB in the long run.  As
the model evolves you have to do one of:
    1) convert saved data to the new format manually
    2) write STB conversion code
    3) discard all your saved data
Only the last is really what I'd call "convenient", so STB isn't really
suitable for storing valuable data.  As you may guess, I'm talking from
experience here :-(

Second is that there are quite serious security implications for the use of
STB.  This may not matter at all for your application, but it's something to
consider carefully, especially if anyone except you is going to use it.
Basically the problem is that de-STB-ing a "malicious" byte sequence can cause
Dolphin to execute almost arbitrary code.  I can create a file which will cause
any application that de-STBs it to truncate any other file that I choose
(subject to OS permissions).  It will even work on most deployed applications.
I think I know how to use the same mechanisms to perform more complicated
attacks, but I haven't bothered to try it.  I have no intention of posting
details here ;-) but, for instance, if you use the Dolphin sample program
"chat" to talk to me on the Internet, then be prepared for chaos...  That passes
STB-ed objects back and forth over the Net, and so is vulnerable.  Recent
versions of Dolphin have the class STBValidatingInFiler which, at some cost in
extra complexity, can be used to mitigate these problems by allowing you to
control which classes are acceptable in an STB file.
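
The whitelist idea behind a validating in-filer can be sketched as follows.
This uses Python's pickle as an analogy and is not the STBValidatingInFiler
API; ValidatingUnpickler, ALLOWED_CLASSES and safe_loads are names made up for
the example.  Overriding Unpickler.find_class is the hook that Python's own
documentation describes for restricting what a byte stream may load.

```python
import collections
import datetime
import io
import pickle

# Only these (module, class name) pairs may appear in a loaded file.
ALLOWED_CLASSES = {("collections", "OrderedDict")}


class ValidatingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the byte stream asks for a class or function by name.
        if (module, name) not in ALLOWED_CLASSES:
            raise pickle.UnpicklingError(f"{module}.{name} is not allowed")
        return super().find_class(module, name)


def safe_loads(data):
    return ValidatingUnpickler(io.BytesIO(data)).load()


# A whitelisted class loads normally.
accepted = safe_loads(pickle.dumps(collections.OrderedDict(a=1)))

# Anything else is refused before an instance is ever built.
try:
    safe_loads(pickle.dumps(datetime.date(2004, 5, 1)))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

The cost mentioned above is visible even in this toy: every class that can
legitimately appear in a saved file has to be listed explicitly.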

    -- chris


Re: How to avoid corrupted deferred actions

Andy Bower-3
Sebastian, Chris

> Sebastian Sastre wrote:
>
> >     I had an application that stores stb files (serialized models)
> > as the main user application file type.
>
> Leaving aside the specific problems, which other people have already
> addressed, I want to advise against using STB in this way at all.
>
> It's a great way to get an application up and running quickly, and --
> as such -- is a very useful short-cut during development, but I
> wouldn't stick with STB in the long run.  For instance I'd say it's a
> mistake to go on using STB once you reach the point where you want to
> keep real data in the file, rather than just throwaway test data.
>
> The problems are two-fold.
>
> First is that it is actually quite inconvenient using STB in the long
> run.  As the model evolves you have to do one of:
>     1) convert saved data to the new format manually
>     2) write STB conversion code
>     3) discard all your saved data
> Only the last is really what I'd call "convenient", so STB isn't
> really suitable for storing valuable data.  As you may guess, I'm
> talking from experience here :-(

While I agree with Chris that STB may well not be suitable for long
term data storage, I'm not sure I agree with the reasons.  For example, I
don't really see why 2) write STB conversion code is not "convenient";
it doesn't seem that hard or time consuming to write the conversion
methods when they become necessary. Indeed, I would say that with any
data storage method you will end up doing one of the above actions if
the object format changes so that it no longer directly maps to the
saved data.

I also agree with Chris' analysis of the security implications but this
may not be a concern if the data is not being transmitted over an
insecure connection (i.e. if it just resides on the local disk).

I would add one other disadvantage of the STB format, however. The STB
format can't be used to incrementally update the data on disk (at least
not without creating your own macro format around it). That is, you
have to save down the entire object graph in one go. If the amount of
data becomes large you will find that the STB save operation becomes
inordinately slow, i.e. it is not a linear deterioration. The STB load
mechanism does remain relatively quick though.

You might like to take a look at Omnibase or ReStore as alternative
storage mechanisms.


http://www.gorisek.com/homepage/WOB-TTVWwshUG8lc8pqa1oU5gG8X-1-3.html?action=omnibase

http://www.solutionsoft.co.uk/restore/index.htm

--
Andy Bower
Dolphin Support
www.object-arts.com


Re: How to avoid corrupted deferred actions

Chris Uppal-3
Andy,

> While I agree with Chris that STB may well not be suitable for long
> term data storage, I'm not sure I agree with the reasons.  For example, I
> don't really see why 2) write STB conversion code is not "convenient";
> it doesn't seem that hard or time consuming to write the conversion
> methods when they become necessary. Indeed, I would say that with any
> data storage method you will end up doing one of the above actions if
> the object format changes so that it no longer directly maps to the
> saved data.

I think it's because (at least for me) the physical representation of the data
changes a lot faster than the logical structure.  So while the model is
"logically" unchanged, all sorts of detailed changes to the implementation may
have happened -- what was stored as an OrderedCollection instvar in one object,
may have moved to another class altogether and now be represented as a Set.  If
each of those "small" changes requires a new STB version as well (with
associated conversion code) then they become considerably more work.
Especially if some datum has moved from one object to another, or a class has
been refactored away.

Your experience may be different, but I make such changes rather frequently.

One intermediate approach may be interesting (Mildly. To someone.  Somewhere
;-) which is to convert the internal data to some simpler form before STBing
it.  E.g. in one of my applications, the model is essentially just two lists.
So to save the data I build two Arrays with data copied from the real model,
and then STB /them/.  That has worked quite well, and has insulated the
external form from the detailed changes I've made to the internal form -- e.g.
one of the "lists" has been at times an OrderedCollection and a
SortedCollection (and is probably going to turn into a custom class in the
future), and the other list has been a LookupTable, an OrderedCollection, and
even -- for a little while -- a Set.
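
The intermediate approach can be sketched like this.  It is Python with pickle
standing in for STB, so an analogy only; Model, to_external and from_external
are hypothetical names, and the two lists are invented sample data.  The point
is that only the simple external form reaches the file, so the internal
containers can change freely.

```python
import pickle


class Model:
    def __init__(self, names=(), prices=None):
        # Internal representation: free to change between releases.
        self._names = set(names)
        self._prices = dict(prices or {})

    def to_external(self):
        # Stable external form: two plain lists copied from the real model.
        return (sorted(self._names), sorted(self._prices.items()))

    @classmethod
    def from_external(cls, external):
        names, price_pairs = external
        return cls(names, dict(price_pairs))


def save(model):
    return pickle.dumps(model.to_external())


def load(data):
    return Model.from_external(pickle.loads(data))


original = Model(["pear", "apple"], {"apple": 3})
reloaded = load(save(original))
```

If _names later becomes a list or a custom class, only to_external and
from_external need to change; files already on disk stay readable.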

(I am, however, intending to change the external form one more time -- it's
going to turn into a Zip file containing several CSV-formatted sub-files.  That
will have the useful side-effect of making the lists readable by, say, a
spreadsheet program -- I already have the ability to "export" the data to CSV
files, but it'll be nice to merge the two concepts of "saving" the data.)

    -- chris


Re: How to avoid corrupted deferred actions

Sebastián Sastre
In reply to this post by Schwab,Wilhelm K
> First, I could be "remembering" a fix that does not exist.  Otherwise,
> are you by any chance loading the model from other than the UI process?
I am not intentionally loading the model from any process other than the UI
process.  I found that the image was only patched to level 3, so I have
patched it to level 4 and it is now up to date, but that doesn't make any
difference :(

regards

Seb

>
> Have a good one,
>
> Bill
>
> --
> Wilhelm K. Schwab, Ph.D.
> [hidden email]


Re: How to avoid corrupted deferred actions

Sebastián Sastre
In reply to this post by Louis Sumberg-2
> As Bill noted in his reply, this does sound like the old "stb filing a
> sorted collection" problem.  I don't think there ever was a system fix for
> this.  Try converting the model's list to an OrderedCollection before saving
> it and see if that works.
>
> -- Louis

I think I will, thanks Louis.


Re: How to avoid corrupted deferred actions

Christopher J. Demers
In reply to this post by Andy Bower-3
"Andy Bower" <[hidden email]> wrote in message
news:40939125$[hidden email]...
>
> While I agree with Chris that STB may well not be suitable for long
> term data storage, I'm not sure I agree with the reasons.  For example, I
> don't really see why 2) write STB conversion code is not "convenient";
> it doesn't seem that hard or time consuming to write the conversion
> methods when they become necessary. Indeed, I would say that with any
> data storage method you will end up doing one of the above actions if
> the object format changes so that it no longer directly maps to the
> saved data.

Given that Blair said:
"Schema migration can be very tricky with STB, particularly when modifying a
class with subclasses that add further instance variables. This is just one
of the reasons we are intending to replace STB in the next release." in a
post on 2003-01-03 16:09:29 PST.  I am curious if this is still part of the
plan for Dolphin 6.0?

I really like the idea of something like STB, but I appreciate Chris's
concerns.  I think that with a little work STB could be enhanced to know a
little more about the structure of the data.  Just optionally or
selectively including the variable names would make it easier to migrate STB
data.  Persistence of data is an important part of Dolphin itself, and will
probably be relevant to most Dolphin users.  I would like to see STB evolve
into (or be replaced by) something with better migration support.

BTW: I would be happy to look at a Dolphin 6.0 beta if at some point such a
thing should exist. ;)

Chris


Re: How to avoid corrupted deferred actions

Chris Uppal-3
Christopher J. Demers wrote:

> I think that with a little work STB could be enhanced to know a
> little more about the structure of the data.  Just optionally or
> selectively including the variable names would make it easier to migrate
> STB data.

I hope this is a direction that Andy and Blair are moving in, too.

As I wrote my last post, I remembered that only a few days before I'd posted in
comp.lang.java.programmer that "serialization" (Java's version of STB) was
/under/-used by the typical programmer.  Not obviously consistent with my views
as expressed here ;-)

Part of it is just to do with the language and culture -- Java is a much more
"viscous" language, so changes happen less often, and Java programmers (I'm
stereotyping recklessly here) tend to want to do everything /conceivable/ in XML
instead of looking at the available technologies on their merits.

However, the part of it that's relevant here is that Java's serialization
format works (by default) in terms of the names of variables rather than their
positions, which makes it an order of magnitude less sensitive to changes.
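
The difference between positions and names can be shown with a toy example.
This is Python and models neither STB nor Java serialization in any detail;
the field lists and the four helper functions are invented for illustration.

```python
OLD_FIELDS = ["name", "price"]
NEW_FIELDS = ["name", "sku", "price"]    # "sku" was inserted in a later release


def save_by_position(obj, fields):
    return [obj[f] for f in fields]


def load_by_position(values, fields):
    # Values are matched to variables purely by their position.
    return dict(zip(fields, values))


def save_by_name(obj, fields):
    return {f: obj[f] for f in fields}


def load_by_name(record, fields, default=None):
    # Values are matched by name; unknown variables get a default.
    return {f: record.get(f, default) for f in fields}


item = {"name": "fig", "price": 3}

# Old data read with the new layout: the price lands in the wrong variable.
positional = load_by_position(save_by_position(item, OLD_FIELDS), NEW_FIELDS)

# The same old data matched by name survives the inserted variable.
named = load_by_name(save_by_name(item, OLD_FIELDS), NEW_FIELDS)
```

Here positional ends up with the price stored under "sku" and no "price" at
all, while named keeps the price and leaves the new variable empty.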

(Of course, working in terms of names would be significantly slower without a
more complex implementation -- but that's "just code" ;-)

    -- chris


Re: How to avoid corrupted deferred actions

Schwab,Wilhelm K
Chris,

> However, the part of it that's relevant here is that Java's serialization
> format works (by default) in terms of the names of variables rather than their
> positions, which makes it an order of magnitude less sensitive to changes.

One capability I would not want to lose is incrementing a version
number purely to make type changes in one or more variables.  Ok, I
could change the instance variable name(s), but I would not want to be
forced to do that simply to trigger a conversion.
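
A version-triggered conversion of that kind might look like the sketch below.
It is Python, not Dolphin's STB conversion protocol, and the record layout,
CONVERTERS table and "tags" field are assumptions made up for the example.
The variable keeps its name; only its type changes, and the version number is
what triggers the conversion.

```python
CURRENT_VERSION = 2


def _v1_to_v2(fields):
    # Version 1 kept tags as one comma-separated string; version 2 uses a list.
    converted = dict(fields)
    converted["tags"] = fields["tags"].split(",") if fields["tags"] else []
    return converted


# One converter per old version, each stepping the data forward by one version.
CONVERTERS = {1: _v1_to_v2}


def load(record):
    version, fields = record["version"], record["fields"]
    while version < CURRENT_VERSION:
        fields = CONVERTERS[version](fields)
        version += 1
    return fields


old = load({"version": 1, "fields": {"tags": "red,ripe"}})
new = load({"version": 2, "fields": {"tags": ["green"]}})
```

Because the conversions run inside load, they need no tools or user
interaction, which matches the requirement that they work silently in a
deployed executable.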

It is also crucial to at least have the option to make conversions
transparent to the user.  As slick as it is to have something pop up a
browser or debugger on a proposed conversion method, conversions need to
work in deployed executables, and w/o bothering the user.  That's not to
say that an optional tool that does what I describe above would be bad -
far from it.  IIRC, Squeak has done that for quite some time, though I
find Dolphin's serializer to be better factored and, in general,
superior to Squeak's SmartReferenceStream.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: How to avoid corrupted deferred actions

Chris Uppal-3
Bill Schwab wrote:

> > However, the part of it that's relevant here is that Java's
> > serialization format works (by default) in terms of the names of
> > variables rather than their positions, which makes it an order of
> > magnitude less sensitive to changes.
>
> One capability I would not want to lose is incrementing a version
> number purely to make type changes in one or more variables.

Good point.

Realistically, STB (or whatever you call it) isn't a solvable problem in
general.  There is no way that a fairly simple, automated process is
going to be able to "do the right thing" every time.

I think the principal use-case for STB is View resources.  If it works well for
those then I'd say it's doing its job.  I'm not sure it's possible to handle
complex objects and object webs as straightforwardly (Views are conceptually
simple, even if the code -- sigh... -- is not).

    -- chris


Re: How to avoid corrupted deferred actions

Christopher J. Demers
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...

> Bill Schwab wrote:
>
> > > However, the part of it that's relevant here is that Java's
> > > serialization format works (by default) in terms of the names of
> > > variables rather than their positions, which makes it an order of
> > > magnitude less sensitive to changes.
> >
> > > One capability I would not want to lose is incrementing a version
> > number purely to make type changes in one or more variables.
>
> Good point.
>
> Realistically, STB (or whatever you call it) isn't a solvable problem in
> general.  There is no way that a fairly simple, automated process is
> going to be able to "do the right thing" every time.
>
> I think the principal use-case for STB is View resources.  If it works well
> for those then I'd say it's doing its job.  I'm not sure it's possible to
> handle complex objects and object webs as straightforwardly (Views are
> conceptually simple, even if the code -- sigh... -- is not).

I don't think we would have to give up the idea of a class version number if
variable names were added.  Currently variable names are (by default)
stripped from the image.  If this could be controlled on a class by class
basis then some classes could strip their variable names and others could
retain them.  I currently retain my variable names because I use ReStore and
it needs them.

I don't think STB migration could be totally automated in a generic way.  I
would just like a friendlier interface for me to write the migration code.
Currently STB version migration methods are usually quite unintuitive
looking.  However if instance variable names were used then it would make
the simple case of adding new instance variables much easier to migrate
from.  Additionally if there were instance variable names then it could
create some kind of generic composite holders based on the STB data, and one
could migrate from that even if the base class was removed or totally
refactored.
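
To make the "generic composite holder" idea a little more concrete, here is a
minimal sketch.  It is Python rather than Dolphin, and nothing in it is an
existing STB feature: REGISTRY, dump, load and the Point class are all
hypothetical.  It assumes the saved data carries the class name and the
instance variable names.

```python
from types import SimpleNamespace

REGISTRY = {}   # class name -> class currently present in the program


def dump(obj):
    # Store the class name and the instance variables by name.
    return {"class": type(obj).__name__, "vars": dict(vars(obj))}


def load(record):
    cls = REGISTRY.get(record["class"])
    if cls is None:
        # Class was removed or renamed: hand back a generic holder to migrate from.
        return SimpleNamespace(original_class=record["class"], **record["vars"])
    obj = cls.__new__(cls)
    vars(obj).update(record["vars"])
    return obj


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


REGISTRY["Point"] = Point
record = dump(Point(1, 2))
known = load(record)            # a real Point

del REGISTRY["Point"]           # simulate the class being refactored away
holder = load(record)           # generic holder with the same named variables
```

Migration code can then read holder.x and holder.y by name and build whatever
replaced the old class.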

Perhaps I can come up with some experimental code to play with this further,
if I get a chance.

Chris


Re: How to avoid corrupted deferred actions

Sebastián Sastre
Hi,

    What about using a meta-architecture schema?

    I mean, serialization ultimately stores bytes, so the "map" that says
how those bytes are to be interpreted should be flexible enough to support all
the changes you make.  I'm currently working on porting the GOODS OODB client
to Dolphin; the client's serialization is based on the meta-architecture of
the objects, and every object is serializable.

    (If you want to take a look:
http://gatekeeper.dynalias.org:8888/GoodsST/4)

    The author of the engine says that in Java (because of the lack of
architectural reflection) you need to build a hierarchy of *persistent capable
classes*, but in Smalltalk this version of the client makes Object itself
*persistent capable* in a transparent way.

    regards,
--
Sebastián Sastre
[hidden email]
www.seaswork.com.ar

> I don't think STB migration could be totally automated in a generic way.  I
> would just like a friendlier interface for me to write the migration code.
> Currently STB version migration methods are usually quite unintuitive
> looking.  However if instance variable names were used then it would make
> the simple case of adding new instance variables much easier to migrate
> from.  Additionally if there were instance variable names then it could
> create some kind of generic composite holders based on the STB data, and one
> could migrate from that even if the base class was removed or totally
> refactored.
>
> Perhaps I can come up with some experimental code to play with this further,
> if I get a chance.
>
> Chris


Re: How to avoid corrupted deferred actions

Chris Uppal-3
Sebastián Sastre wrote:

>     what about using a meta-architecture schema?

I've been thinking about this.  It seems like a promising way to go.  Though I
have to admit that I don't really have a clear idea of what it should look
like.

The reason I'm coming around to the idea (it originally struck me as too much
work and/or too inflexible) is the realisation that I have no less than four
ad-hoc metadata implementations already in my image (one of them is the
"Aspects" stuff which comes with the IDE, the other three are my own).  And
that suggests that it's something that's needed fairly often, and would benefit
from a less ad-hoc approach.

OTOH, the problem that I don't immediately see how to solve is that different
uses of metadata need different, um, data.  For instance the metadata needed to
save an object as XML or in the registry is different from that needed to
generate a reasonable-looking GUI "form" for interacting with it (in the style,
perhaps, of "Naked Objects" -- see <www.nakedobjects.org>).

Perhaps it's only me.  Does anyone else find that they tend to accumulate
little metadata implementations all over the shop?

    -- chris