Glorp delete anomaly

Glorp delete anomaly

jtuchel
Hi there, 

I've found something strange that I think should at least be explainable or, even better, solvable.
This will be a bit longer, but I'll try to explain in one sentence what is happening:

Glorp seems to handle the deletion of objects in exclusive relationships differently if you use deleteAll: as compared to delete:.

So here is the longer story:

I needed to do some corrections in a production database, as part of a bigger batch job. The model is something like a bill of materials: we have tree nodes that can be either leaves or nodes. Both are in the same table. Both have a column named parent_id, and the toManyRelationship between a parent and its children is declared #beExclusive in the descriptor.

What I had to do was filter out all elements that have no parent_id or have a parent_id that doesn't exist in the table. These records were the result of a bug we had months ago (and maybe even the result of the problem I describe in this post).

So what I did first was something like this:

recordsToDelete := ((dbSession readManyOf: MyTreeNode) select: [:ea | ea parent yourSelf isNil]) asOrderedCollection. "the Smalltalk select handles both NULL and non-existent parent-ids. Slow but TSTTCPW"
dbSession deleteAll: recordsToDelete; commitUnitOfWorkAndContinue.

This, however, didn't really delete all records, especially not all of the dependent children. I would have expected this to delete the TreeNodes that have no parent together with all their children, recursively. It didn't. Of the roughly 21,000 records that would have to be deleted, only about 2,000 were gone after the operation.
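
As a cross-check for the counts above, the set of records one would expect to disappear can be computed independently of Glorp. The following is a minimal Python sketch, not Glorp code: the function name `expected_deletions`, the `(id, parent_id)` row shape and the `valid_root_ids` parameter (ids of legitimate top-level nodes that must survive) are all assumptions made for illustration. It starts from every node without a usable parent and walks down to all descendants.

```python
from collections import defaultdict


def expected_deletions(rows, valid_root_ids=frozenset()):
    """Return the ids of all parentless nodes plus all of their descendants.

    rows: iterable of (id, parent_id) tuples; parent_id may be None.
    valid_root_ids: hypothetical set of ids that are legitimate roots.
    """
    rows = list(rows)
    ids = {node_id for node_id, _ in rows}

    # Index the children of every parent id once.
    children = defaultdict(list)
    for node_id, parent_id in rows:
        children[parent_id].append(node_id)

    # Starting points: no parent_id, or a parent_id that is not in the table.
    stack = [
        node_id
        for node_id, parent_id in rows
        if node_id not in valid_root_ids
        and (parent_id is None or parent_id not in ids)
    ]

    doomed = set()
    while stack:
        node_id = stack.pop()
        if node_id not in doomed:
            doomed.add(node_id)
            stack.extend(children[node_id])
    return doomed


rows = [(1, None), (2, 1), (10, None), (11, 10), (12, 11), (20, 99), (21, 20)]
doomed = expected_deletions(rows, valid_root_ids={1})
print(sorted(doomed))
```

Comparing the size of this set with the number of rows that actually vanished would show whether direct children or only deeper descendants were left behind.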

So I experimented a little and found something better:

recordsToDelete := ((dbSession readManyOf: MyTreeNode) select: [:ea | ea parent yourSelf isNil]) asOrderedCollection. "still the same as above"
[recordsToDelete notEmpty] whileTrue: [
  dbSession delete: recordsToDelete removeFirst;
    commitUnitOfWorkAndContinue].

This was far slower (more than 30 minutes on my development machine), but it deleted about 8,000 records.

I ended up doing it in bare SQL in a scheme similar to the second approach.

[deleteIDs := acc executeSQLString:
    'select node.id from treenode node where node.parent_id is null or node.parent_id not in (select kp.id from treenode kp);'.
 (count := deleteIDs size) = 0]
    whileFalse: [
        "the select is re-run on every pass, so the loop ends once no orphans are left"
        acc executeSQLString: 'delete from treenode node where node.parent_id is null or node.parent_id not in (select kp.id from treenode kp);'].

Digging down from a parentless container to all its children takes 7 repetitions, but each delete statement finishes in a few seconds. (Note that in this snippet I could use count(*) instead of selecting ids, but I need to do a few sanity checks on the records to delete and therefore need the ids.)
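
The repeated passes described above can be tried out in isolation. Below is a minimal sketch in Python against an in-memory SQLite table, not the production code: the function name `delete_orphans` is made up, and the `is_root` column is a hypothetical addition so that the toy table can hold a valid tree next to the orphaned ones (the statement in the post has no such column). Each pass selects the ids of the current orphans, leaves room for sanity checks, deletes them, and stops once a pass finds nothing.

```python
import sqlite3

# Orphans: no parent_id, or a parent_id that no longer exists in the table.
# is_root is a hypothetical marker for legitimate top-level nodes.
ORPHAN_IDS = """
    SELECT id FROM treenode
    WHERE is_root = 0
      AND (parent_id IS NULL
           OR parent_id NOT IN (SELECT id FROM treenode))
"""


def delete_orphans(conn):
    """Delete orphans level by level; return the rows removed in each pass."""
    removed_per_pass = []
    while True:
        ids = [row[0] for row in conn.execute(ORPHAN_IDS)]
        if not ids:
            return removed_per_pass
        # Sanity checks on ids would go here, before anything is deleted.
        conn.executemany("DELETE FROM treenode WHERE id = ?", [(i,) for i in ids])
        conn.commit()
        removed_per_pass.append(len(ids))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE treenode ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER, is_root INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO treenode (id, parent_id, is_root) VALUES (?, ?, ?)",
    [
        (1, None, 1),   # legitimate root
        (2, 1, 0),      # child of the legitimate root
        (10, None, 0),  # orphan: no parent
        (11, 10, 0),    # child of an orphan
        (12, 11, 0),    # grandchild of an orphan
        (20, 99, 0),    # orphan: parent id 99 does not exist
        (21, 20, 0),    # child of an orphan
    ],
)

removed = delete_orphans(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM treenode ORDER BY id")]
print(removed, remaining)
```

The number of passes equals the depth of the deepest orphaned subtree, which matches the observation that a fixed, small number of repetitions is enough.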

So my question is not about the speed of the three alternatives. It is clear that the first two are much slower than the third one.
What worries me is the fact that Glorp seems to fail in both of the first two approaches, and that the results differ.

The resolution of the transitive deletes differs between deleteAll: with one commit and repeated delete:'s with individual commits. And both leave thousands of records undeleted... I am not sure I like that, and maybe this problem is even part of our bug that led to so many "dead" nodes in the first place.

Does anyone have any experience with this? Any clues what could go wrong? Is it in my code or in Glorp?

Joachim





--
You received this message because you are subscribed to the Google Groups "glorp-group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/glorp-group.
For more options, visit https://groups.google.com/d/optout.

RE: Glorp delete anomaly

Maarten Mostert

It looks to me as if you're not cleaning things up correctly after your delete.

Here is how I ended up doing this, in part of the following method:

 

surpressTaskFromModel: aModel
    | toDelete pieces |
    "Copy the children first, because each recursive call removes the child from self children."
    toDelete := OrderedCollection new.
    self children do: [:each | toDelete add: each].
    toDelete do: [:each | each surpressTaskFromModel: aModel].

    pieces := Array with: self with: self parent.

    aModel getGlorpSession inUnitOfWorkDo: [
        aModel getGlorpSession registerExisting: (pieces at: 2).
        (pieces at: 2) children remove: self ifAbsent: [self error: 'absent'].
        (pieces at: 2) reorderChilds.
        self parent: nil.
        aModel getGlorpSession delete: self].

 

Using the Array pieces is of major importance here, as it ensures that we work with full identity.

The purpose of the method is to delete the receiver as well as all of its children, and it is used in a cascading way.
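
The children-first order of this cascade can be illustrated without a database. The following is a minimal Python sketch of the same idea, not a translation of the Glorp calls: `Node` and `remove_subtree` are hypothetical names, and appending to the `deleted` list stands in for the session delete. As in the method above, the children are copied before the recursion because every recursive call removes a child from its parent's collection.

```python
class Node:
    """Tiny in-memory tree node with a back pointer to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def remove_subtree(node, deleted):
    """Delete all children first, then detach and delete the node itself."""
    # Iterate over a copy: the recursive calls mutate node.children.
    for child in list(node.children):
        remove_subtree(child, deleted)
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    deleted.append(node.name)  # stand-in for the actual delete


root = Node("root")
a = Node("a", root)
b = Node("b", a)
c = Node("c", a)
d = Node("d", root)

deleted = []
remove_subtree(a, deleted)
print(deleted, [child.name for child in root.children])
```

Because each node is detached from its parent before it is deleted, no parent is ever left holding a reference to a deleted child.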

 

registerExisting is the method I use to add an object to the cache when it is still known in the image and exists in the database.

Hope this helps,

@+Maarten

 

 


Re: Glorp delete anomaly

jtuchel
Hi Maarten,

I think your suggestion is quite close to my second approach. I don't register any objects because I read all objects in the session that deletes them. The Glorp machinery does a #realObject on all children during prepareToCommit, so I think the same thing should be happening. The only thing I am not doing is explicitly removing the children of a node, since they will be deleted anyway because of #beExclusive.

The only situation in which I do a #register: is when I create new objects. 

I can see how you dig down the hierarchy and do lots of commits of single deletes on the way back up, so you don't rely on the Glorp mechanism for resolving dependent objects in an exclusive 1:m relationship. So you perfectly sail around the problem ;-) 
Or am I getting your code wrong?

I was thinking about doing that, but given my first measurements of the approaches I tried, I was quite sure my batch would run literally for hours if I delete/commit each single node. OTOH, Glorp has to read all children as well to resolve the transitive closure of a to-be-deleted node, so the performance difference may just lie in the number of commits, which shouldn't be too much of a problem in an exclusive session during app startup.
What is your experience on this? 

I am glad you posted your code here, because it gives me new ideas to think about what I could do to reach my goal. I'll give your approach a try. 

Thanks for sharing.

Joachim




Re: Glorp delete anomaly

Alan Knight-2
Hmm. That does sound like a bug. So in the first case, where it deleted only about 2000 records, was it that it deleted all the parents but not the children? Or did it delete some of the children, but not all?

I suspect either approach would be faster if you did it with shorter transactions. So, don't use commitUnitOfWorkAndContinue, just do a commit. If you did a commit, re-read or re-register the first item from the list, then delete it in a transaction, then the read gets cached and the database isn't holding open a big pile of locks. But the SQL, where you don't have to read the objects at all, will always be faster. And getting a different answer between one and two just seems wrong. I don't know why it would be happening, but I suspect that the combination of deleteAll: and exclusive relationships is not well exercised.

Aside: I think the Store garbage collector should work more or less the way your last option works. Delete the packages/bundles you don't want. Do repeated cleanup passes for anything that refers to the now missing things. I think it would be much faster and simpler.


Re: Glorp delete anomaly

jtuchel
Hi Alan,


On Monday, 7 July 2014 06:47:16 UTC+2, Alan Knight wrote:
> Hmm. That does sound like a bug. So in the first case, where it deleted only about 2000 records, was it that it deleted all the parents but not the children? Or did it delete some of the children, but not all?

There were only 432 containers that didn't know about a parent container, so the 2,000 deleted items also included children. I haven't dug deep enough to see whether only direct children were deleted and children deeper down the tree were missed. Maybe I should have.

> I suspect either approach would be faster if you did it with shorter transactions. So, don't use commitUnitOfWorkAndContinue, just do a commit. If you did a commit, re-read or re-register the first item from the list, then delete it in a transaction, then the read gets cached and the database isn't holding open a big pile of locks.

Re-registering the first element in the collection for each new transaction is an idea well worth trying; I am not sure about the performance gains, however.
But the fact that the approach with individual delete: calls also left thousands of objects in the DB doesn't really inspire much confidence ;-)

> But the SQL, where you don't have to read the objects at all, will always be faster. And getting a different answer between one and two just seems wrong. I don't know why it would be happening, but I suspect that the combination of deleteAll: and exclusive relationships is not well exercised.

Hmm. deleteAll: just iterates through the list and calls delete: for each object. So there shouldn't be a difference.

> Aside: I think the Store garbage collector should work more or less the way your last option works. Delete the packages/bundles you don't want. Do repeated cleanup passes for anything that refers to the now missing things. I think it would be much faster and simpler.

It is very much faster. Approaches 1 and 2 would run at least ten minutes, while the pure SQL approach finishes in a bit more than a minute.
 

Joachim


On Sat Jul 05 2014 at 12:47:52 AM, jtuchel <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="cBxxiHPy6AIJ" onmousedown="this.href='javascript:';return true;" onclick="this.href='javascript:';return true;">jtu...@...> wrote:
Hi Maarten,

I think your suggestion is quite close to my second approach. I don't register any objects because I read all objects in the session that deletes them. The Glorp machinery does a #realObject on all children during prepareToCommit, so I think it should be the same thing that is happening. The only thing I am not doing is explicitly removing the children of a node, because they will be deleted anyways because of the #beExclusive.

The only situation in which I do a #register: is when I create new objects. 

I can see how you dig down the hierarchy and do lots of commits of single deletes on the way back up, so you don't rely on the Glorp mechanism for resolving dependent objects in an exclusive 1:m relationship. So you perfectly sail around the problem ;-) 
Or am I getting your code wrong?

I was thinking about doing that, but given my first measurements of the approaches I tried, I was quite sure my batch would run literally for hours if I delete/commit each single node. OTOH, Glorp has to read all children as well to resolve the transitive closure of a to-be-deleted node, so the performance difference may just lie in the number of commits, which shouldn't bee too much of a problem in an exclusive session during app startup.
What is your experience on this? 

I am glad you posted your code here, because it gives me new ideas to think about what I could do to reach my goal. I'll give your approach a try. 

Thanks for sharing.

Joachim



Am Freitag, 4. Juli 2014 13:23:40 UTC+2 schrieb [hidden email]:

It looks to me that you're not cleaning up things correctly behind your delete.

You can take a look on how I end up doing this in a part of the following method:

 

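surpressTaskFromModel: aModel
    "Delete the receiver and, recursively, all of its children (excerpt)."
    | toDelete pieces |

    "Copy the children first so the collection is not modified while iterating."
    toDelete := OrderedCollection new.
    self children do: [:each | toDelete add: each].
    toDelete do: [:each | each surpressTaskFromModel: aModel].

    "Hold on to the receiver and its parent so the same instances are used throughout."
    pieces := Array with: self with: self parent.

    aModel getGlorpSession inUnitOfWorkDo: [
        aModel getGlorpSession registerExisting: (pieces at: 2).
        (pieces at: 2) children remove: self ifAbsent: [self error: 'absent'].
        (pieces at: 2) reorderChilds.
        self parent: nil.
        aModel getGlorpSession delete: self].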

 

Using the Array pieces is of major importance here, as it ensures that you work with full identity.

The purpose of the method is to delete the receiver as well as all of its children, and it is used in a cascading way.

registerExisting: is the method I use to add an object to the cache when it is still known in the image and still exists in the database.

 

Hope this helps,

 

@+Maarten,

--
You received this message because you are subscribed to the Google Groups "glorp-group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/glorp-group.
For more options, visit https://groups.google.com/d/optout.

Re: Glorp delete anormaly

Maarten Mostert

My approach is that every mapped object has a surpressFromModel method.

This means that when I delete an object, the deletion cascades through whatever proxies it holds, cleaning up anything it could be related to.

Doing so, I can delete in a consistent way, without resetting the session or doing refresh reads.

Combined with my registerExisting: method, it has proven to be very reliable.

 

Regards,

 

@+Maarten,


Re: Glorp delete anormaly

jtuchel
Hi Maarten,

I didn't want to say your approach is wrong or bad. It is obviously the most reliable way of deleting dependent objects, because it seems Glorp has some problems with exclusive relationships on deletion.

All I wanted to say is that you do by hand what Glorp should provide when you delete a node with dependent children in an exclusive relationship. 
But it doesn't, so it seems. So your way or any variation of it is the only reliable way to do it.
For mass deletes, the SQL-only approach is faster, but it circumvents Glorp's bookkeeping, so it is not an option in situations where you are in the middle of working with objects that may be deleted by the SQL.
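
For reference, the repeated SQL sweep only terminates if the orphan query is re-run after every delete pass. A minimal sketch of that loop, assuming acc is the database accessor from the first post and that executeSQLString: answers a collection of rows, as the original snippet implies. The table and column names are the ones from that post; maxPasses is a hypothetical safety limit that is not part of the original.

```smalltalk
| orphanFilter orphanIDs passes maxPasses |
"Assumption: acc is the database accessor used in the first post."
orphanFilter := 'node.parent_id is null or node.parent_id not in (select kp.id from treenode kp)'.
passes := 0.
maxPasses := 20. "hypothetical safety limit, not from the original"
[orphanIDs := acc executeSQLString: 'select node.id from treenode node where ', orphanFilter.
 orphanIDs notEmpty and: [passes < maxPasses]] whileTrue: [
    "Sanity checks on orphanIDs would go here."
    "Each pass removes the current orphans; their children become orphans for the next pass."
    acc executeSQLString: 'delete from treenode node where ', orphanFilter.
    passes := passes + 1].
```

Because this bypasses the session, any objects already read by Glorp would have to be refreshed or the session reset before working with them again.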

The deletion of dependent objects in an exclusive relationship does work in other places of our application without any problems. In all those other cases, however, the root and its children are not in the same class/table. So this is not a general bug with registerTransitiveClosureFrom: and friends. I guess it is likely a problem related to the special case of class hierarchy that I am mapping here. BUT it also is not a general problem with my mappings, because deleting just one node with its dependent children also works.

Joachim


