Yes, I know: indices are a world of their own. But I asked myself, what can an index bring for me. I had an IdentitySet with around 3759515 elements. The objects contained have several attributes, but I am only considering one attribute for my queries via GsQuery: the attribute instances are member of SmallDouble, no nil containing. I produced a query like: "(40,0 <= each.floatvalue <= 60,0)" and the results are around 1719852 elements (around 50 % of the total numbers of elements). I do not select the returned objects I just count them. When I define an index on this set (equalityIndex) it took around 6 seconds to count them all, without index creation the system get the job done in 2.5 seconds. (always a warm database). SQLite3 takes around 0.7s without index and 0.17s with index (total numbers are not useful here, but it shows that here an index is useful). It seems for me, that an index in Gemstone is useful, IF the number of expected elements are really small . Any comment ? Marten _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Ok, I read the documentation and I understand where the problem is: The query support creates a temporary container to held the result and then you iterate over the result container. I assume, that it takes time to create and maintain this result container (if the result size is larger). The example below can be transferred into a streamable query, which speed ups the query by at least a factor of 2. Marten Marten Feldtmann via Glass <[hidden email]> hat am 28. Januar 2017 um 16:03 geschrieben:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by GLASS mailing list
Vía phone.. Quick answer.. Small double are immediate objects so always are identical. Can you measure performance with identity index? On Jan 28, 2017 12:03 PM, "Marten Feldtmann via Glass" <[hidden email]> wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by GLASS mailing list
That's correct. With a low selectivity query (aka large result
set), most of the time can be spent constructing the result set.
The #selectAsStream: methods avoid this penalty.
Also Dale is working on a major overhaul of the indexing system in 3.4. Indexing should perform better in that release. Norm On 1/28/2017 8:24 AM, Marten Feldtmann
via Glass wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by GLASS mailing list
First streamable queries assume equality indices and I assume, that comparing numerical values (with something different than =) can not benefit from identity indices. Marten Mariano Martinez Peck <[hidden email]> hat am 28. Januar 2017 um 17:25 geschrieben: _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
But using streamable result sets pop up another interface problem: they are only allowed to use, if an equalityIndex is defined on that attribute. In the example I create an index only on larger sets (>50000 items) - that means sometimes the index is available, but sometimes not .. so to make it right, you should ask GsQuery (if this is possible at all) if the query can be executed at all and then according to the result, use the streaming interface - otherwise the do: interface. Marten
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
After testing the index support in Gemstone I come (for me) to the following point: a) try to define a query with only one predicate to get a streamable result set (and hope to reduce the size of your result set) b) if you have more than one predicate -> build your own block and then execute this block against the result set from (a). c) Defining more than one index on a set may result in strange (slower) query execution time - the query is executed faster, if no index is defined at all. Puuuh, lots of new things to learn. Perhaps it would be nice to ask GsQuery NOT to use indices at all - so you have ONE programming interface for all cases. Marten Marten Feldtmann via Glass <[hidden email]> hat am 28. Januar 2017 um 18:09 geschrieben:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass _______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by GLASS mailing list
In 3.4 identity indexes will be streamable ... On 1/28/17 8:58 AM, Marten Feldtmann
via Glass wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
In reply to this post by GLASS mailing list
Marten, I suggest that you wait for 3.4 before making up your mind about
indexing support. The work that we are doing for 3.4 should significantly speed up
the execution of queries, but I don't think that query speedup is
the only thing that can be done ... You have been making several general observations about index
performance, but unfortunately you haven't provided any examples
or test cases to back up your observations ... I am not claiming that your observations are wrong, but without
an example or test case I cannot know what the root cause may be
and whether or not the performance issues you are seeing in your
particular test cases will be addressed or not by the new work or
even what additional work that _could_ be done to improve things
for you. You have made some observations that are likely to not be
directly affected by my work, like order of predicate execution. As you have observed, currently, you can only "hope to reduce the
size of your result set". One of features we are are adding is a
counted b plus tree implementation ... with a counted b plus tree,
it should be possible to calculate the size of a result set
without executing the entire query ... depending upon the exact
nature of your queries, it may be possible to define a query
optimizer that utilizes the estimated result set sizes to
determine predicate evaluation order ... otoh, without knowing the
details of your exact use case, we may not implement the optimizer
in such a way that addresses your use case ... If you are willing, I am sure that we could make an alpha release of 3.4 available to so you could give us better feedback. In fact, I would really like to get a closer look at your use cases. Ideally I would get a copy of your development extent so that I can work with exactly the same sample data and code that you are using ... If you are unable to share extents, then a set of test cases that exhibit the same behavior as you are seeing would be much appreciated. This happens to be the perfect time to be looking at index performance issues ... I am a couple of weeks from finishing up the initial work on the new implementation and being able to get direct feedback from you would help us focus and prioritize any additional work that might be needed to round out the usability and performance of the new implementation. Dale On 1/28/17 9:27 AM, Marten Feldtmann
via Glass wrote:
_______________________________________________ Glass mailing list [hidden email] http://lists.gemtalksystems.com/mailman/listinfo/glass |
Free forum by Nabble | Edit this page |