Migrating and Updating Databases
I would like to post some experiences I had while migrating a GemStone database.

The customer database my tests are based on has a size of about 420 GB. The database was copied to our reference system - an old ThinkPad W520 (i7-based) with 16 GB of RAM and ONE SSD - and the tests were done on this machine. The stone is working with an 8 GB shared page cache.

Between v70 and v71 of our product there were several changes to the domain model we are developing. The model is defined by 197 domain classes. In v71, 39 of these classes were changed, and these changes are the reason why 119,000,000 objects had to be migrated. One class had 66,000,000 instances, another one 49,000,000 instances, and the remaining classes have around 4,000,000 instances each.

*** The original way ***

The old, traditional way had been written in the early days of this product, when databases were not that big and migration speed was not that critical. It worked more or less the following way (shame on me):

a) Scan the repository for ONE (!) changed class.

b) For each instance, do a migration and on demand (when memory runs low) make a commit.

This was ok in the past: I could start the update process on Saturday and finish the update remotely on Sunday. Now the database has become too large, and this way of updating it would take from Thursday 11:00 to Monday afternoon - so more or less 4 days!

*** Repository Scanning ***

The next evolution of this approach was:

a) Now ONE repository scan (for ALL changed classes) is done, using fastAllInstances and GsBitmap instances.

b) For each instance, do a migration and on demand make a commit.

With this step the multiple scans of the repository have been removed, and the largest cost is now the execution of the base migration code. But for 119 million objects this still takes a long time. I did not run a full test, but an initial test over some hours suggested that this would take around 2 days.
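The gain from a single scan can be pictured outside GemStone as well. The following is an illustrative Python sketch, not GemStone code - fastAllInstances and GsBitmap do the real work inside the stone - but it shows why bucketing all changed classes in one pass beats one full scan per class:

```python
from collections import defaultdict

def scan_once(all_objects, changed_classes):
    # One pass over the whole object population, collecting the
    # instances of EVERY changed class at the same time.
    buckets = defaultdict(list)
    for obj in all_objects:              # the single expensive scan
        name = type(obj).__name__
        if name in changed_classes:
            buckets[name].append(obj)
    return buckets

# With 39 changed classes, the old approach costs 39 full repository
# scans; this costs one scan plus a cheap per-object dispatch.
buckets = scan_once([1, 2.5, "a", 3, "b"], {"int", "str"})
```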
*** Indices ***

More than satisfied with the benefits of ONE scan, I had to look at the migration code. The base migration code was generated by our code generator, and I did not want to change that (because it is general and covers all model versions), but actually knowing the specific model version I am migrating from would cut the code to be executed to 1/4 of the original code. So there would be possibilities for enhancements here.

So, what about starting multiple processes and doing step (b) in parallel? I stored the GsBitmap in page order on disc, and that file came to around 600 MB of data. Then I wrote processes to do the migration in parallel based on that GsBitmap file ... and it did not work. Commit conflicts over and over. No way to go ... speed was pretty bad. Actually, only one process was running more or less without problems - the other processes sometimes did a little work, but most of the time they did an abort transaction.

So these conflicts had to have a common cause somewhere. As a first step I decided to remove ALL indices used in the database. I was lucky that this application has an execution path to find all used indices, to remove them, to rebuild them, etc. The script to remove all indices was started before the migration (and it took at least 1-2 hours).

Then I started the parallel migration code, and now it was working. The i7 has 8 execution threads, so I started 8 of these processes and they worked without problems. The topaz scripts were started with "-t 500000", and that fit very well to the machine above: 100% usage of the available RAM and minimal swapping. The code itself uses a sliding transaction size (from 1 to a maximum of 20,000 objects between commits; this limit is adapted according to conflicts/successes) - but the logs showed that the processes were working at the upper value of 20,000 objects per commit.
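The sliding transaction size described above can be pictured as a small feedback controller. This Python sketch is my own illustration (the real code is GemStone Smalltalk, and its exact grow/shrink rules are not given above; the factors here are assumptions): grow the batch while commits succeed, back off on a conflict, clamped to the 1..20000 range from the text:

```python
class BatchSizer:
    # Sliding commit batch size: grow on successful commits, shrink
    # on commit conflicts, clamped to [1, 20000]. The grow/shrink
    # factors are illustrative assumptions.
    def __init__(self, lo=1, hi=20000):
        self.lo, self.hi, self.size = lo, hi, lo

    def on_success(self):
        self.size = min(self.hi, self.size * 2)

    def on_conflict(self):
        self.size = max(self.lo, self.size // 4)

sizer = BatchSizer()
for _ in range(20):          # with the indices gone, no conflicts...
    sizer.on_success()
# ...so the batch settles at the upper bound of 20000 objects per
# commit, matching what the logs showed.
```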
So, to summarize:

a) Scanning the objects with fastAllInstances in ONE scan (1-2 hours)
b) Removing the indices (1-2 hours)
c) Running the migration code in 8 tasks (8 hours)
d) Scanning the objects with fastAllInstances in ONE scan (1-2 hours) - to double-check
e) Cleaning the history
f) Building the indices (3 hours)

So now I am at 17 hours, and that is ok. I think that (b) and (f) could also be done in parallel execution mode.

*** Workload ***

Removing indices in concurrent tasks leads to very strange exception errors, so I gave that up. Creating indices in concurrent tasks works - so the 3 hours above can be reduced to 40 minutes, and the overall time is now 15 hours.

*** Equal Workload up to the end ***

The next point that showed up in this work was that the cost of creating the individual indices varies very much, so some tasks have much more to do than others ... and the parallelization idea is not carried through to the end (index creation tasks: 37 minutes for the longest against 11 minutes for the fastest). So rearranging this work could still improve the time needed to create the indices.

Marten
_______________________________________________
Glass mailing list
[hidden email]
https://lists.gemtalksystems.com/mailman/listinfo/glass
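Marten's closing point - rearranging the index-creation work so every task finishes at roughly the same time - is a classic multiprocessor scheduling problem. A hedged Python sketch of one standard heuristic, longest-processing-time-first, using made-up per-index cost estimates (real costs would have to be measured or estimated from collection sizes):

```python
import heapq

def lpt_assign(task_costs, n_workers):
    # Longest-processing-time-first: hand each task, largest first,
    # to the currently least-loaded worker.
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    for cost in sorted(task_costs, reverse=True):
        load, w = heapq.heappop(heap)
        assignment[w].append(cost)
        heapq.heappush(heap, (load + cost, w))
    return assignment

# Hypothetical per-index build times in minutes, spread over 3 workers:
plan = lpt_assign([37, 30, 25, 20, 15, 11], 3)
makespan = max(sum(tasks) for tasks in plan)   # the slowest worker
```

With these numbers the loads come out as 48, 45, and 45 minutes - far closer together than the 37-versus-11-minute spread reported above.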
Marten,
Thank you very much for describing your experience. At each step it seems to me like you made good choices and exhibited a strong understanding of how GemStone works and can be optimized. I appreciate that you arrived at something that is adequate but recognize that more could be done if necessary.

James

> On Dec 20, 2019, at 3:56 AM, Marten Feldtmann via Glass <[hidden email]> wrote:
> ...
Thanks for sharing your story ... impressive improvements ....
I see that you had "very strange exception errors" while attempting to concurrently remove indexes ... could you share more details about the errors you hit? In theory you should be able to do concurrent index removal without errors, so perhaps you ran into some fixable bugs?

Dale

Sent from my iPhone

> On Dec 20, 2019, at 4:01 AM, Marten Feldtmann via Glass <[hidden email]> wrote:
> ...
This is the error I get from my tasks which are removing indices. The code works when it is running alone. My working theory is that the RcConflictSet in the IndexManager instance is the reason for this failure ... but I have no proof of that.

The commitTransaction in frame 16 is executed just after I remove all indices from ONE structure. The removeIndex code is a copy of the createIndex code, so I think that all tasks are working on their own (not shared) data.
ERROR 2261 , a InternalError occurred (error 2261), The object with object ID 1480033747713 is corrupt. Reason: 'CorruptObj, FetchObjId fetch past end' (InternalError)
topaz > exec iferr 1 : where
==> 1 InternalError (AbstractException) >> _signalFromPrimitive: @6 line 15
2 DepListTable >> depListBucketFor: @1 line 1
3 DepListTable >> _add: @3 line 8
4 DepListTable (Object) >> perform:withArguments: @1 line 1
5 LogEntry >> redo @2 line 5
6 RedoLog >> _redoOperationsForEntries: @6 line 4
7 DepListTable (Object) >> _abortAndReplay: @15 line 20
8 DepListTable >> _resolveRcConflictsWith: @3 line 10
9 System class >> _resolveRcConflicts @21 line 26
10 System class >> _resolveRcConflictsForCommit: @4 line 8
11 [] in System class >> _localCommit: @28 line 42
12 ExecBlock0 (ExecBlock) >> onException:do: @2 line 66
13 System class >> _localCommit: @16 line 44
14 SessionMethodTransactionBoundaryPolicy (TransactionBoundaryDefaultPolicy) >> commit: @3 line 3
15 System class >> _commit: @8 line 16
16 System class >> commitTransaction @5 line 7
17 [] in WCATIServiceClass class >> removeAllIndices:total: @59 line 28
18 SortedCollection (Collection) >> do: @6 line 10
19 WCATIServiceClass class >> removeAllIndices:total: @35 line 24
20 Executed Code @2 line 1
21 GsNMethod class >> _gsReturnToC @1 line 1

topaz 1> commit
ERROR 2249 , a TransactionError occurred (error 2249), Further commits have been disabled for this session because: 'CorruptObj, FetchObjId fetch past end'. This session must logout. (TransactionError)

topaz > exec iferr 1 : where
==> 1 TransactionError (AbstractException) >> _signalFromPrimitive: @6 line 15
2 System class >> _primitiveCommit: @1 line 1
3 [] in System class >> _localCommit: @21 line 30
4 ExecBlock0 (ExecBlock) >> onException:do: @2 line 66
5 System class >> _localCommit: @9 line 31
6 SessionMethodTransactionBoundaryPolicy (TransactionBoundaryDefaultPolicy) >> commit: @3 line 3
7 System class >> _commit: @8 line 16
8 System class >> _gciCommit @5 line 5
9 GsNMethod class >> _gsReturnToC @1 line 1

topaz 1> doit
Great article Marten, thank you for posting. This would make a great
experience report to present at ESUG.

Norm

On 12/20/2019 3:56 AM, Marten Feldtmann via Glass wrote:
> ...
Hey Marten,

What's the actual GemStone call made to remove the indexes before you did the commit causing the error? Also, can you describe the type and paths of the various indexes on that collection?

I see that Dale has gone ahead and filed bug 48485 on this issue -- we may need this info to track down the problem.

------------------------------------------------------------------------
Bill Erickson
GemTalk Systems Engineering
15220 NW Greenbrier Parkway #240, Beaverton OR 97006
------------------------------------------------------------------------

On Fri, Dec 20, 2019 at 7:50 AM Marten Feldtmann via Glass <[hidden email]> wrote:
Marten,

Thanks for the stack ...

For completeness, could you share the code in removeAllIndices:total: with me? I would like to reproduce this problem, so the number of concurrent processes doing removal would be useful as well ... Concurrent bugs are always difficult to track down (and reproduce), so the more information I have the better.

I've submitted a bug: "48485 'CorruptObj, FetchObjId fetch past end' during concurrent index removal" to track this problem.

Finally, I am curious if you have tried `IndexManager removeAllIndexes`? It is intended for use in the case where you are removing all of the indexes in the system ... Instead of removing the individual objects participating in the indexes (which is what is done by the standard remove index code), the index data structures (btrees, etc.) are simply dropped on the floor --- the objects participating in an index are removed directly from the dependency lists, so it should be quite a bit faster than removing each individual index from its collection ...

Dale

On 12/20/19 7:45 AM, Marten Feldtmann wrote:
> ...
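The difference Dale describes can be illustrated with a toy model (plain Python, not the actual IndexManager implementation): per-index removal unregisters every participating entry one at a time, while the removeAllIndexes-style path wipes the dependency bookkeeping in a single pass and discards the index structures wholesale.

```python
def remove_index(index, dep_lists):
    # Standard path: unregister every participating object id from
    # its dependency list, one entry at a time -- cost grows with
    # the total number of index entries.
    for oid in index["oids"]:
        dep_lists[oid].discard(index["name"])

def remove_all_indexes(indexes, dep_lists):
    # removeAllIndexes-style path: clear each dependency list once
    # and drop the index structures on the floor.
    for deps in dep_lists.values():
        deps.clear()
    indexes.clear()

indexes = [{"name": "byName", "oids": [1, 2]},
           {"name": "byDate", "oids": [2, 3]}]
dep_lists = {1: {"byName"}, 2: {"byName", "byDate"}, 3: {"byDate"}}
remove_all_indexes(indexes, dep_lists)
```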