Position method for BtreePlusReadStream classes

GLASS mailing list
Hi,

aRcIdentitySet has an index on 'each.modifiedTime' and it can have a lot of instances.

To get a list of instances sorted by modifiedTime, I do:
|gsQuery|
gsQuery := GsQuery fromString: 'each.modifiedTime <= timeNow'.
gsQuery bind: 'timeNow' to: TimeStamp now.
gsQuery on: aRcIdentitySet .

Now I want to 'jump' to a given position in this stream...
Is it possible to use some kind of #position: message on aBtreePlusReadStream?
(#position: does not exist in BtreePlusReadStream.)

I could use #next to 'jump' to a given position, but the query result can be very large.
In the end it feeds a paged web page, where a user can click to get the next batch of objects.
So I do not want to send #next over a large collection.

regards,
bruno

_______________________________________________
Glass mailing list
[hidden email]
https://lists.gemtalksystems.com/mailman/listinfo/glass
Re: Position method for BtreePlusReadStream classes

GLASS mailing list

Bruno,

There is no backing collection for the BtreePlusReadStream, so being able to go to a certain position is not possible without counting....

We should be able to quickly produce a result set of the entire query results, but it would be a set, not an ordered collection :( To get results _in order_, the streaming API is the only solution... To get the kind of performance that you would want, I think it should be possible to create a primitive that would produce the result set in the form of an Array (in order) instead of a Set.

For now you would have to produce the Array yourself using:

| result |
result := {}.
gsQuery do: [:each | result add: each]

#do: uses the BtreePlusReadStream API under the covers, so the #do: elements are processed in order ...
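
A minimal sketch of how the Array built this way could then be cut into pages, assuming the whole result fits comfortably in memory and that gsQuery is the query from the original post. The names pageSize, pageNumber, first, last and page are illustrative only, and pageNumber is taken to be 1-based here:

```smalltalk
| result pageSize pageNumber first last page |
"Collect the query results in index order, as suggested above."
result := {}.
gsQuery do: [:each | result add: each].

pageSize := 20.     "hypothetical page size"
pageNumber := 3.    "hypothetical 1-based page number"

"Compute the slice for the requested page, clamped to the result size."
first := (pageNumber - 1) * pageSize + 1.
last := (first + pageSize - 1) min: result size.
page := first > result size
  ifTrue: [ #() ]
  ifFalse: [ result copyFrom: first to: last ].
```

Note that this still faults every matching object into memory before slicing, so it only suits result sets of modest size.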

Let me know if you would need a primitive for performance and I can submit a feature request ...

Dale

In reply to the post of 6/8/20 11:53 AM by smalltalk--- via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list

Dale,

What is the difference between your solution and the following?
btreePlusReadStream := gsQuery reversedReadStream.
position := 1.
[btreePlusReadStream atEnd not and: [position < collectionSize]] whileTrue: [btreePlusReadStream next. position := position + 1].

What I want is to have a very large (a million?) GsQuery result set (in index order) and go to a position (K) without faulting into memory the objects before position (K).

regards
bruno

In reply to the post of 8/6/2020 16:18 by Dale Henrichs via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list

Bruno,

Sorry, I wasn't sure what you were asking ... but now that you mention it, there is already a method that will advance the stream cursor, without accessing the object at that position (_btreeNextNoValue)

[ stream atEnd not and: [ pos < collectionSize ] ]
  whileTrue: [ 
    pos := pos + 1.
    stream _btreeNextNoValue ]

If you want to count backward from the end using a reversedReadStream, then you'd have to implement _btreePreviousNoValue:

_btreePreviousNoValue
  "Moves the stream cursor back one entry without accessing the value at that position.
 Updates the current stack for a subsequent 'next'."

  " get the index into the leaf node and see if it has reached the end "
  currentIndex == 0
    ifTrue: [ ^ self _errorEndOfStream ].
  " see if the cursor is at the end of the stream's range "
  (currentNode == endNode and: [ endIndex == currentIndex ])
    ifTrue: [ 
      currentIndex := 0.
      ^ self ].
  " see if index refers to first entry in this leaf "
  currentIndex == 1
    ifTrue: [ 
      " must look down the stack for the previous leaf node "
      self _previousLeaf ]
    ifFalse: [ currentIndex := currentIndex - currentEntrySize ].

_btreeNextNoValue and _btreePreviousNoValue both avoid faulting the values into the image; just the interior and leaf nodes would be faulted in, but that is unavoidable ...

If these would work for you, I can see adding skip: to both BtreePlusGsIndexReadStream and BtreePlusGsReversedIndexReadStream to make it official ... let me know if this is what you are looking for,
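
As a rough sketch of what such a skip: could look like on the forward stream: the method below is hypothetical (it is not taken from any GemStone release) and assumes only that atEnd and _btreeNextNoValue behave as described above. The temporary skipped and the choice to answer the number of entries actually skipped are illustrative.

```smalltalk
skip: count
  "Hypothetical method: advance the cursor by count entries, or to the end of
   the stream if fewer remain, without faulting in the skipped values.
   Answers the number of entries actually skipped."

  | skipped |
  skipped := 0.
  [ self atEnd not and: [ skipped < count ] ]
    whileTrue: [ 
      self _btreeNextNoValue.
      skipped := skipped + 1 ].
  ^ skipped
```

Answering the count lets a caller detect that the requested offset lies past the end of the results. The cost is still linear in count, since every skipped entry is visited in the leaf nodes.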

Dale

In reply to the post of 6/8/20 1:11 PM by bruno buzzi brassesco via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Dale,

That's exactly what I was looking for. A public method in the next release would be good too.
Thank you very much...

In reply to the post of Mon, 08 Jun 2020 19:35:37 -0300 (UYT) by Dale Henrichs via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Bruno,

I've submitted an internal feature request (48811), so keep your eyes
peeled.

Dale

In reply to the post of 6/9/20 9:20 AM by smalltalk--- via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list
In reply to this post by GLASS mailing list
Bruno,

Would you be able to talk about why and how you use this feature, and what the speed is like (say, if you want to go to position 500000)? Is it for paging?

I always like to discuss/support enhancements in the GsQuery structure

Marten



Re: Position method for BtreePlusReadStream classes

GLASS mailing list
In reply to this post by GLASS mailing list
Marten,

It is a REST layer in GS
(https://github.com/brunobuzzi/OrbeonPersistenceLayer) for a Java
application (www.orbeon.com).
Orbeon displays form instances as summaries
(https://doc.orbeon.com/form-builder/summary-page) sorted by modifiedTime.

The summaries have paging buttons to display the next batch of forms. If the
collection is small, it is fine to fault the entire collection into memory
and send #asSortedCollection. But if the collection is large, an index on
modifiedTime must be used.

When a user clicks the next/previous button on a summary, Orbeon calls the
REST layer with: form name, form version, page size (forms per page to
display) and page number (the index of the current page).

In GS I must be able to do something like: aBtreePlusReadStream skip:
pageSize * pageNumber, in order to read the forms in modifiedTime order.
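
A minimal sketch of that page read using only what exists today, assuming gsQuery is the query from my first post and that _btreeNextNoValue works as Dale described. The values of pageSize and pageNumber are made up, and pageNumber is 0-based so that pageSize * pageNumber is the number of forms to skip:

```smalltalk
| stream pageSize pageNumber toSkip skipped page |
pageSize := 20.     "hypothetical forms per page"
pageNumber := 2.    "hypothetical 0-based page index"

stream := gsQuery readStream.

"Skip the earlier pages without faulting in their forms."
toSkip := pageSize * pageNumber.
skipped := 0.
[ stream atEnd not and: [ skipped < toSkip ] ]
  whileTrue: [ 
    stream _btreeNextNoValue.
    skipped := skipped + 1 ].

"Read only the forms of the requested page, in modifiedTime order."
page := {}.
[ stream atEnd not and: [ page size < pageSize ] ]
  whileTrue: [ page add: stream next ].
```

Only the pageSize forms of the requested page are faulted in; the skipped entries touch just the B-tree nodes.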

Right now I do not have numbers for position 500,000, but at some
point I am going to test the project with a large quantity of forms.
I will post the results here when that is done.

regards,
bruno


In reply to the post of 9/6/2020 13:59 by Dale Henrichs via Glass

Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Bruno,

I am responding to the human factors issues that arise from large
collections, not your need for the API. I agree with the need for the API.

See below.
Edit: in retrospect, you may be using the query API poorly with that design.


GLASS mailing list wrote

> Marten,
>
> It is a REST layer in GS
> (https://github.com/brunobuzzi/OrbeonPersistenceLayer) for a Java
> application (www.orbeon.com).
> Orbeon displays form instances as summaries
> (https://doc.orbeon.com/form-builder/summary-page) sorted by modifiedTime.
>
> The summaries have paging buttons to display the next batch of forms. If the
> collection is small, it is fine to fault the entire collection into memory
> and send #asSortedCollection. But if the collection is large, an index on
> modifiedTime must be used.
>
> When a user clicks the next/previous button on a summary, Orbeon calls the
> REST layer with: form name, form version, page size (forms per page to
> display) and page number (the index of the current page).
>
> In GS I must be able to do something like: aBtreePlusReadStream skip:
> pageSize * pageNumber, in order to read the forms in modifiedTime order.

Depending on the sort direction of the index, you will encounter different
concerns. If the index is descending from the most recent time stamp, then when
paging through the collection you will skip any newly modified form once
you've gone past the first page. In ascending order that is not a
problem, as any newly modified form will come at the end of the index. But
that also means that your starting offset will exclude one form that you
haven't seen, or more than one if many modifications are occurring in the
interval.

Assuming you are reviewing "recently modified forms", you might want to use
the modifiedTime of the last one shown to start the query for the next page.

As a really simplified example, let's say that you are looking at
changes since yesterday. So you start with a query giving you everything
since yesterday at perhaps 16:00. The first page has changes ranging from
shortly after 16:00 through 17:30. When you page, you use 17:30 as the
starting time and get the next page. And so on.
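
A rough sketch of that approach, reusing the GsQuery calls from the original post. The block name pageStartingAt and its arguments are hypothetical, and the comparison is inclusive (>=), so forms sharing the boundary time stamp show up again on the next page rather than being skipped:

```smalltalk
| pageStartingAt |
"Hypothetical helper block: answers up to pageSize forms from aCollection
 whose modifiedTime is at or after startTime, in index order."
pageStartingAt := [ :aCollection :startTime :pageSize | | query stream page |
  query := GsQuery fromString: 'each.modifiedTime >= startTime'.
  query bind: 'startTime' to: startTime.
  query on: aCollection.
  stream := query readStream.
  page := {}.
  [ stream atEnd not and: [ page size < pageSize ] ]
    whileTrue: [ page add: stream next ].
  page ].
```

For the next page you would evaluate the block with the modifiedTime of the last form shown, for example pageStartingAt value: aRcIdentitySet value: lastForm modifiedTime value: 20, where lastForm is whatever form ended the previous page. No skipping is needed, so the cost does not grow with the page number.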

(The time stamps probably have much finer granularity than 1 second, so the
number of modified forms with the same time stamp is probably very low. It
/might/ be > 1, in which case you get more than just one repeat on the next
page. Millisecond or microsecond precision on the time stamps will greatly
reduce the chances of that. FYI, the upcoming GemStone 3.6 will introduce
specials for a number of common classes, such as Date and Time, as well as
DateAndTime. The latter has a range from 2001 through 2072 with microsecond
precision. "Specials" means the OOP fully encodes the value, if that wasn't
already clear.)


> Right now I do not have the numbers for position 500.000, but at some
> point I am going to test the project with a large quantity of forms.
> I will post the results here when that is done.

500k forms sounds like a horror to interact with, as a user. Years ago, I
had to deal with 12k employees in a "drop list" and 1,200 bank branches
likewise. These sizes were unusable from a user interface perspective.
Hopefully, you have already considered how to make user selection of a form
manageable when the number of forms is large.

In my cases, both tables were small enough to have in memory, so providing a
filtered search capability for the drop down list was entirely manageable.



Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Richard,

> Depending on the sort direction of the index, you will encounter different
> concerns. If the index is descending from the most recent time stamp, when
> paging through the collection, you will skip any newly modified form once
> you've gone past the first page. If ascending order, then it's not a
> problem, as any newly modified form will come at the end of the index. But,
> that also means that your starting offset will exclude one form that you
> haven't seen. More than one if many modifications are occurring in the
> interval.
The index is descending.
Yes, newly modified forms may be skipped, but not for long. A page
refresh, or the user going back one page, causes a REST call to GS and
the newly added forms are displayed. The summary is used as an entry point
(a general page that displays the most recently added forms), but the user
can also search by field values in the forms (this is a different kind of
search and it is already implemented in GS).
The query used for the general summary is:
gsQuery := GsQuery fromString: 'each.modifiedTime <= timeNow'.
gsQuery bind: 'timeNow' to: TimeStamp now.

Because it is a Java app, I do not have control over the user interface
design (it is what it is :)
To locate a specific form you need an id or something similar, and you
search by that field.
Or you can search for a group of forms, for example by the field "office
location = 'Beaverton'", and do the paging from that search.
The forms also have permissions, so a user may not see all the forms
(I omitted some details because there are too many options).
But if the forms, form permissions and field search are well designed,
the user should not have any usability problems (again, I have no
control over the interface; I'm just adding a REST layer in GS so that
Orbeon Forms can be stored natively in GS).
Orbeon Forms is a product made by Orbeon (based in San Francisco, I think;
https://www.orbeon.com/).

Your concerns are valid, because if the form designer does not take these
factors into account, the page could be very difficult to handle.
The form definition can be updated, though.
If the designed form has no search values (this is set at design time
in the Java app), there are no permissions, and there is a large number
of forms, then it is going to be trouble for the end user.

It seems that 3.6 has a lot of new features !!!
With Tonel and Jadeite alone it is going to be very interesting, and now
this improvement in time stamps as well.
Waiting for that release !!!

regards
bruno


Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Bruno,

It looks like based upon customer requirements, we will be releasing
3.6.0 before Jadeite is fully integrated into the product, so you will
have to wait a little bit longer than you've anticipated. 3.6.0 is
slated for release this fall and 3.7.0 which should have full Jadeite
integration (Dolphin-based) plus a Pharo-based IDE is slated for next
year ...

Good news/bad news: 3.6.0 would have slipped until next year anyway if
we had held it up for the IDEs, so in reality you won't be waiting any
longer than you would have anyway, but you are now waiting for 3.7.0 :)

Dale

On 6/12/20 3:34 PM, bruno buzzi brassesco via Glass wrote:
> It seems that 3.6 has a lot of new features !!!
> Just with Tonel and Jadeite is going to be very interesting and now
> this improvement in time stamps also.
> Waiting for that release !!!

Re: Position method for BtreePlusReadStream classes

GLASS mailing list
Hey Bruno,

well, I thought it would be for paging. Actually, I am also working in that area, and I am not very happy with the GemStone/S support for it. I am very surprised to see that GsQuery is so limited and that index support is also limited ... but the more users demand it, the more likely the situation is to change.

Paging is actually very difficult to do well, perhaps impossible or very expensive. I am also working with REST, so the situation is the same:

a) Between REST calls requesting a new page of data, there might be transactional changes in the viewed set, so you have no stable set but an always-changing one. In some domains this is not acceptable; in others it's OK. In our domain we do not talk about that :-)

b) We are thinking about implementing a dual approach: paging with the typical page mechanism plus additional gop help, but we have not done further work there. The idea is that we also deliver the gop of the first and the last object shown in the current page, and then on the server side we find the starting point of the current page (and then of the next page) in a predictable way. We have sets with more than 500000 elements, and paging in such a large set is not usable ... iterating to find the starting object (via the btree structure) is too expensive (time consuming), and the UI is too slow.

c) Another approach is to use a SortedCollection ... which gives you one physical sorted collection, and then you get the speed to do even buffered paging, which is the highest art in JavaScript UIs. But of course you get a linear slowdown when changing that collection on the server, or strange behavior if you are not careful when changing object data.
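
Idea b) could be sketched roughly as follows. This assumes the REST layer
has already resolved the gop sent by the client back to the form object
(anchorForm), that forms answer their time stamp to #modifiedTime, and that
the set has an index on 'each.modifiedTime'. The selector is made up for
illustration. Because the query starts at the anchor's time stamp, only
entries sharing that time stamp are stepped over, rather than everything in
front of the page.

```smalltalk
pageOf: aRcIdentitySet after: anchorForm size: pageSize
    "Answer an Array with up to pageSize forms that follow anchorForm in
     ascending modifiedTime order.
     Hypothetical method; anchorForm is the last form of the page the
     client is currently showing."

    | gsQuery stream page found |
    gsQuery := GsQuery fromString: 'each.modifiedTime >= anchorTime'.
    gsQuery bind: 'anchorTime' to: anchorForm modifiedTime.
    gsQuery on: aRcIdentitySet.
    stream := gsQuery readStream.
    "Step over entries up to and including the anchor itself (identity test)."
    found := false.
    [ found not and: [ stream atEnd not ] ]
        whileTrue: [ found := stream next == anchorForm ].
    page := {}.
    [ stream atEnd not and: [ page size < pageSize ] ]
        whileTrue: [ page add: stream next ].
    ^ page
```

If the anchor was modified or removed between calls, this starts from the
wrong place or answers an empty page, so a real implementation would need a
fallback.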


Like it or not, our users are table-oriented people ... and they like paging a lot, even though it might not be very clever to manually look through hundreds of data items.


Marten

bruno buzzi brassesco via Glass < [hidden email]> wrote on 12 June 2020 at 22:49:


Marten,

It is a REST layer in GS
Application (www.orbeon.com).
Orbeon display forms instances as summaries

The summaries have paging buttons to display the next batch of forms. If the
collection is small, it is OK to fault the entire collection into memory
and send #asSortedCollection. But if the collection is large, an index on
modifiedTime must be used.

When a user clicks a summary button (next/previous), Orbeon calls the
REST layer with: form name, form version, page size (forms per page to
display) and page number (the index of the current page).

In GS I must be able to do something like: aBtreePlusReadStream skip:
pageSize * pageNumber, in order to read the forms in modifiedTime order.

Right now I do not have the numbers for position 500.000, but at some
point I am going to test the project with a large quantity of forms.
I will post the results here when that is done.

regards,
bruno



Re: Position method for BtreePlusReadStream classes

GLASS mailing list

Marten,
(see my answers inline)
For me this point is not critical, because the user can NOT go from page 1 to page 50.000. They can only advance by 1 page or go to the last page (in which case reversedReadStream is enough). Still, it would be very good to have a primitive to go to position 50.000 without faulting the preceding "btree" structure into memory.
In my case a user usually searches by field value to find a desired form or group of forms. A very common case is to add a "state" field, where different roles are only interested in forms with a specific "state" value (say "started", "in process", "finished"). An index on that field seems to be enough. Another approach is to add an "id" field to the form, but all of these cases depend heavily on your business.

> a) Between REST calls requesting a new page of data, there might be transactional changes in the viewed set, so you have no stable set but an always-changing one. In some domains this is not acceptable; in others it's OK. In our domain we do not talk about that :-)

REST is stateless, so it is impossible to avoid this. But you can do some kind of AJAX call for a refresh at the client level. Anyway, if the user stays too long on a page, they are surely looking at an outdated list of objects.

> b) We are thinking about implementing a dual approach: paging with the typical page mechanism plus additional gop help, but we have not done further work there. The idea is that we also deliver the gop of the first and the last object shown in the current page, and then on the server side we find the starting point of the current page (and then of the next page) in a predictable way. We have sets with more than 500000 elements, and paging in such a large set is not usable ... iterating to find the starting object (via the btree structure) is too expensive (time consuming), and the UI is too slow.

In your domain, does the time stamp change often? Or, once a time stamp is assigned to an object, is it rarely changed?
And what kind of queries do you execute on these objects? Sometimes it is better to address this with your own object structure rather than with an index.

> c) Another approach is to use a SortedCollection ... which gives you one physical sorted collection, and then you get the speed to do even buffered paging, which is the highest art in JavaScript UIs. But of course you get a linear slowdown when changing that collection on the server, or strange behavior if you are not careful when changing object data.

Good to know! I had no idea about this.

> Like it or not, our users are table-oriented people ... and they like paging a lot, even though it might not be very clever to manually look through hundreds of data items.
>
> Marten

bruno buzzi brassesco via Glass < [hidden email]> wrote on 12 June 2020 at 22:49:


Marten,

It is a REST layer in GS
Application (www.orbeon.com).
Orbeon display forms instances as summaries

The summaries have paging buttons to display the next batch of forms. If the
collection is small, it is OK to fault the entire collection into memory
and send #asSortedCollection. But if the collection is large, an index on
modifiedTime must be used.

When a user clicks a summary button (next/previous), Orbeon calls the
REST layer with: form name, form version, page size (forms per page to
display) and page number (the index of the current page).

In GS I must be able to do something like: aBtreePlusReadStream skip:
pageSize * pageNumber, in order to read the forms in modifiedTime order.

Right now I do not have the numbers for position 500.000, but at some
point I am going to test the project with a large quantity of forms.
I will post the results here when that is done.

regards,
bruno


Bruno,

would you be able to talk about why and how you use this feature, and
how the speed is (if you want to go to position 500000)? Is it for paging?

I always like to discuss/support enhancements in the GsQuery structure

Marten
On 9/6/2020 13:59, Dale Henrichs via Glass wrote:
Bruno,

I've submitted an internal feature request (48811), so keep your eyes
peeled.

Dale

On 6/9/20 9:20 AM, smalltalk--- via Glass wrote:
Dale,

That's exactly what I was looking for. A public method in the next
release would be good too.
Thank you very much...

----- Original message -----
From: Dale Henrichs via Glass < [hidden email]>
Sent: Mon, 08 Jun 2020 19:35:37 -0300 (UYT)
Subject: Re: [Glass] Position method for BtreePlusReadStream classes

Bruno,

Sorry, I wasn't sure what you were asking ... but now that you mention
it, there is already a method that will advance the stream cursor,
without accessing the object at that position (_btreeNextNoValue)

     [ stream atEnd not and: [ pos < collectionSize ] ]
        whileTrue: [
          pos := pos + 1.
          stream _btreeNextNoValue ]

If you want to count backward from the end using a reversedReadStream,
then you'd have to implement _btreePreviousNoValue:

     _btreePreviousNoValue
        "Returns the next value on a stream of B-tree values and root objects.
         Updates the current stack for a subsequent 'next'."

        | val |
        " get the index into the leaf node and see if it has reached the end "
        currentIndex == 0
          ifTrue: [ ^ self _errorEndOfStream ].
        " get the leaf and the value within the leaf "
        (currentNode == endNode and: [ endIndex == currentIndex ])
          ifTrue: [
            currentIndex := 0.
            ^ self ].
        " see if index refers to first entry in this leaf "
        currentIndex == 1
          ifTrue: [
            " must look down the stack for the next leaf node "
            self _previousLeaf ]
          ifFalse: [ currentIndex := currentIndex - currentEntrySize ].

_btreeNextNoValue and _btreePreviousNoValue both avoid faulting the
values into the image; just the interior and leaf nodes would be faulted
in, but that is unavoidable ...

If these would work for you, I can see adding skip: to both
BtreePlusGsIndexReadStream and BtreePlusGsReversedIndexReadStream to
make it official ... let me know if this is what you are looking for.
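
A sketch of what such a skip: could look like on the forward stream, built only on the _btreeNextNoValue loop shown above. This is a hypothetical method, not the implementation GemTalk may ship; answering the number of entries actually skipped is an assumption made here for illustration:

     skip: count
        "Hypothetical sketch: advance the cursor by up to count entries
         without faulting in the values. Answers the number of entries
         actually skipped, which is less than count at end of stream."

        | skipped |
        skipped := 0.
        [ skipped < count and: [ self atEnd not ] ]
          whileTrue: [
            self _btreeNextNoValue.
            skipped := skipped + 1 ].
        ^ skipped

The reversed stream would need the same loop with _btreePreviousNoValue in place of _btreeNextNoValue.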

Dale

On 6/8/20 1:11 PM, bruno buzzi brassesco via Glass wrote:
Dale,

What is the difference between your solution and the following?
btreePlusReadStream := gsQuery reversedReadStream.
position := 1.
[btreePlusReadStream atEnd not and: [position < collectionSize]]
whileTrue: [btreePlusReadStream next. position := position + 1].

What I want is to have a very large (a million?) GsQuery result set
(in index order) and go to a position (K) without faulting into memory
the objects before position K.

regards
bruno

On 8/6/2020 16:18, Dale Henrichs via Glass wrote:
Bruno,

There is no backing collection for the BtreePlusReadStream, so being
able to go to a certain position is not possible without counting....

We should be able to quickly produce a result set of the entire query
results, but it would be a set not an ordered collection:( And to get
results _in order_ the streaming API is the only solution... To get
the kind of performance that you would want, I would think that it
should be possible to create a primitive that would produce the
result set in the form of an Array (in order) instead of a Set.

For now you would have to produce the Array yourself using:

     | result |
     result := {}.
     gsQuery do: [:each | result add: each]

#do: uses the BtreePlusReadStream API under the covers, so the #do:
elements are processed in order ...

Let me know if you would need a primitive for performance and I
can submit a feature request ...

Dale

On 6/8/20 11:53 AM, smalltalk--- via Glass wrote:
Hi,

aRcIdentitySet has an index on 'each.modifiedTime' and it can have a
lot of instances.

In order to get a list of sorted instances (by modifiedTime) I do:
|gsQuery|
gsQuery := GsQuery fromString: 'each.modifiedTime <= timeNow'.
gsQuery bind: 'timeNow' to: TimeStamp now.
gsQuery on: aRcIdentitySet .

Now I want to 'jump' to a given position in this stream...
Is it possible to use some kind of #position: message in
aBtreePlusReadStream?
(#position: does not exist in BtreePlusReadStream)

I could use #next to 'jump' to a given position, but the query result
can be very, very large.
In the end it shows a paged web page to a user, who can click to get
the next bunch of objects.
So I do not want to do #next over a large collection.

regards,
bruno

_______________________________________________
Glass mailing list
[hidden email]
https://lists.gemtalksystems.com/mailman/listinfo/glass