Proxy pattern


Günther Schmidt
Hi all,

I need to get the hang of implementing a proxy pattern.

I have a rather large SQL database and I'd like to show all the entries of one table (10,000+) in a scrollable window, in a list/column view.

Now I certainly cannot load all the objects that exist in that table into memory at once, so I suppose I'll have to do a Proxy pattern thingy.

I just don't have a clue how.

Help

Günther



Re: Proxy pattern

Chris Uppal-3
Guenther Schmidt wrote:

> I have a rather large SQL database and I'd like to show all the entries
> of one table (10,000+) in a scrollable window, in a list/column view.
>
> Now I certainly cannot load all the objects that exist in that table into
> memory at once, so I suppose I'll have to do a Proxy pattern thingy.

Before talking about how to load this data in some incremental way, can I ask
whether you've tried doing it the simple way and just loading it all in and
displaying it ?

I ask because I suspect that the existing list widget will be OK with that many
rows.  I admit that I haven't tried it, but I have tried loading comparable
numbers of rows into a custom Grid View of my own and that works fine.  The
Windows control underlying Dolphin's ListView is (by default) working in a
virtual mode so I wouldn't /expect/ ListView to have problems.

    -- chris



Re: Proxy pattern

Günther Schmidt
Chris

Chris Uppal wrote:

>
> Before talking about how to load this data in some incremental way, can I ask
> whether you've tried doing it the simple way and just loading it all in and
> displaying it ?
>
> I ask because I suspect that the existing list widget will be OK with that many
> rows.  I admit that I haven't tried it, but I have tried loading comparable
> numbers of rows into a custom Grid View of my own and that works fine.  The
> Windows control underlying Dolphin's ListView is (by default) working in a
> virtual mode so I wouldn't /expect/ ListView to have problems.

No, I haven't yet. I actually also presume that 10,000+ entries don't really pose a problem; I'm asking more as a matter of general principle.

Günther



Re: Proxy pattern

Fernando Rodriguez
In reply to this post by Chris Uppal-3
On Sat, 28 May 2005 11:37:44 +0100, "Chris Uppal"
<[hidden email]> wrote:


>> Now I certainly cannot load all the objects that exist in that table into
>> memory at once, so I suppose I'll have to do a Proxy pattern thingy.
>
>Before talking about how to load this data in some incremental way, can I ask
>whether you've tried doing it the simple way and just loading it all in and
>displaying it ?
>
>I ask because I suspect that the existing list widget will be OK with that many
>rows.

AFAIK, a listbox can handle 32K entries



Re: Proxy pattern

Andy Bower-3
Fernando,

> >> Now I certainly cannot load all the objects that exist in that
> >> table into memory at once, so I suppose I'll have to do a Proxy
> >> pattern thingy.
> >
> > Before talking about how to load this data in some incremental way,
> > can I ask whether you've tried doing it the simple way and just
> > loading it all in and displaying it ?
> >
> > I ask because I suspect that the existing list widget will be OK
> > with that many rows.
>
> AFAIK, a listbox can handle 32K entries

Try:

ListPresenter showOn: (1 to: 200000).

Then for improved performance try:

ListPresenter show: 'Enhanced list view' on: (1 to: 200000).

The latter is very much faster because it uses a Windows listview control
rather than a listbox, and the listview can operate in so-called virtual
mode, where the items don't all have to be added to the control before it
can be displayed.

I'll leave it as "an exercise for the student" to determine what the
upper boundary (both theoretical and usable) is for the number of items.

Best regards,


Andy Bower
Dolphin Support
www.object-arts.com



Re: Proxy pattern

Chris Uppal-3
Andy Bower wrote:

> I'll leave it as "an exercise for the student" to determine what the
> upper boundary (both theoretical and usable) is for the number of items.

<grin/>

On this WinXP box, 100 million rows can be displayed in a ListView, one more
than that cannot.

I conjecture that the usable upper limit is somewhere in the order of 100
thousand -- beyond that point a visible window's worth of rows would correspond
to less than 1 pixel on the scrollbar, even with a full-height window, so
scrolling would become essentially useless, or at least very awkward.

    -- chris



Re: Proxy pattern

Chris Uppal-3
In reply to this post by Günther Schmidt
[NB: this post is very Dolphin-specific in its details. My apologies to anyone
reading it in C.L.S who would rather not read about such details]

Guenther,

> No, I haven't yet. I actually also presume that 10,000+ entries don't
> really pose a problem; I'm asking more as a matter of general principle.

Right.

I think there are two parts to the question.  One is how to set up a level of
indirection so that you /can/ control when and how data is fetched from the
database for display in the list.  The other is working out how best to /use/
that control once you've got it.

For the first, the way /I/ would approach it is to set up a proxy for a
DBResultSet.  That proxy would contain a reference to a DBResultSet, and
implement enough of the protocol understood by ListModel to allow it to be used
as the Model for a ListPresenter.  As far as I can tell from a quick test, the
only methods that the DBResultSetProxy has to implement in order for a simple
ListPresenter+ListView to work are:
    #at:ifAbsent:
    #size
(though I'd also implement #at: for convenience, and override #at:put: to self
shouldNotImplement for safety).   If those two methods are implemented (say,
for the time being, by just passing through to the underlying DBResultSet),
then you should be able to do something like:

    conn := ... make a DBConnection...
    query := ...
    results := conn query: query forwardOnly: false.
    model := DBResultSetProxy on: results.
    lp := ListPresenter show: 'Enhanced list view' on: model.
    lp view hasSortHeaders: false.  "important !! -- sorting would need to touch every row"

That will work -- just.  We haven't really changed anything functionally yet,
only set up a framework where we can take control of how the result set is
used.
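
In case it's useful, here is a bare-bones sketch of what that pass-through
proxy might look like (untested; the class shape and the instance-creation
and accessor selectors are just one obvious choice, not anything blessed):

    Object subclass: #DBResultSetProxy
        instanceVariableNames: 'resultSet'
        classVariableNames: ''
        poolDictionaries: ''
        classInstanceVariableNames: ''

    DBResultSetProxy class>>on: aDBResultSet
        ^self new setResultSet: aDBResultSet

    DBResultSetProxy>>setResultSet: aDBResultSet
        resultSet := aDBResultSet

    DBResultSetProxy>>size
        "Pass straight through for now."
        ^resultSet size

    DBResultSetProxy>>at: anInteger
        ^self at: anInteger ifAbsent: [self error: 'index out of bounds']

    DBResultSetProxy>>at: anInteger ifAbsent: a0Block
        "Pass straight through for now; this is where the caching will go."
        ^resultSet at: anInteger ifAbsent: a0Block

    DBResultSetProxy>>at: anInteger put: anObject
        ^self shouldNotImplement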

The next thing to think about is how you are going to fetch data from the DB.
The two approaches that I can think of are:
    1) Fetch each row only when it's needed but then
        keep it in a cache.
    2) Fetch some limited number of rows into a cache, but
        don't attempt to cache the entire table.

The advantage of the first approach (compared with just reading the whole table
into memory before showing the list) is that you can show the list immediately,
without waiting while all the data is fetched from the database.  Rows will
only be fetched from the DB when the user scrolls far enough to see them, and
they may never do that, in which case you will save on the total amount of
memory needed.  Actually, unless the table is pretty big, or the connection to
the database is pretty slow, neither of those advantages is likely to be very
significant.

To implement (1) you could create an OrderedCollection with the right capacity:
    cache := OrderedCollection new: resultSet size.
and check whether the row was in the cache in #at:ifAbsent:, something like
(untested):

    at: anInteger ifAbsent: a0Block
        (anInteger between: 1 and: resultSet size)
            ifFalse: [^ a0Block value].
        anInteger > cache size ifTrue:
            [resultSet
                from: cache size + 1
                to: anInteger
                do: [:each | cache addLast: each]].
        ^ cache at: anInteger

BTW, the use of DBResultSet>>from:to:do: is a deliberate attempt to use a
method that may be more efficient than just a loop using #at: repeatedly.
Note that, since we only ever read from the underlying result set in order, we
can open the result set in 'forward-only' mode which can be a lot more
efficient, i.e. we would open the result set with:
    results := conn query: query forwardOnly: true.
Also, we could automatically close the result set once we had copied all of
its contents into our cache, which would reduce the load on the DB.
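
For instance, something along these lines (untested; it assumes the proxy
remembers the total row count in a totalSize instance variable when it is
created, and that your DBResultSet responds to #close -- check that against
your version of the database package):

    DBResultSetProxy>>checkCacheComplete
        "Once every row has been copied into the cache the underlying
        result set is no longer needed, so release its DB resources."
        cache size = totalSize ifTrue:
            [resultSet close.
            resultSet := nil]

This would be sent at the end of #at:ifAbsent: above, which would then also
need to test against totalSize rather than sending #size to the (by then,
possibly nil) result set.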

But, if the table is very big, then you may end up consuming too much memory.
If that's a risk then it may be worth considering (2).  But be warned, it
doesn't work as well as you might hope -- more below.

To implement (2) you'd use a similar approach, but instead of keeping
everything that you'd read from the underlying connection in your cache, you
would limit the amount of data that it could hold.  You might limit it to 1000
rows, for instance.

There are a number of issues with this approach.  The first is that designing
the cache is not necessarily simple -- you want to ensure that, as the user
scrolls forwards and backwards in the list, the cache isn't constantly
throwing away data that it is quite likely to need again soon.  You might use a
sliding "window" of cached rows that moves up and down the list, or you might
divide the list into logical "pages" (of 100 rows each, say) and keep a
"least-recently-used" list of the pages that have been copied into the cache.

The second problem is that databases typically don't like random access into a
result set.  They differ in how well they support it, but there's always a
danger that the only way the database can give you (say) row 55234 is to start
at row 1 and go through all the rows until it reaches number 55234.  Which is
slow.  So if the user scrolls to the end of a 10000-row table, and you only
have the last 1000 rows in the cache, then if the user scrolls back to row
8000 the database may be very slow in providing it.

The last issue is that some ODBC drivers appear to do their own caching; the
driver for PostgreSQL 8.0 looks as if it does, whereas the Access driver
appears not to.  If the driver is caching, then your cache isn't adding very
much value to what the driver is already providing.
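
To give a feel for the bookkeeping involved, here is one way the page/LRU
idea might be organised (untested, and every name in it is my own invention;
pages would be a Dictionary mapping page index -> rows, lruList an
OrderedCollection of page indices, least-recently-used first):

    DBResultSetProxy>>at: anInteger ifAbsent: a0Block
        (anInteger between: 1 and: resultSet size)
            ifFalse: [^ a0Block value].
        ^(self pageAt: (anInteger - 1) // self pageSize + 1)
            at: (anInteger - 1) \\ self pageSize + 1

    DBResultSetProxy>>pageAt: pageIndex
        "Answer the page with the given index, fetching it if it isn't
        cached, and evicting the least-recently-used page if the cache
        is full."
        | page |
        page := pages at: pageIndex ifAbsent: [nil].
        page isNil ifTrue:
            [pages size >= self maxPages ifTrue:
                [pages removeKey: lruList removeFirst].
            page := self fetchPage: pageIndex.
            pages at: pageIndex put: page].
        lruList remove: pageIndex ifAbsent: [].
        lruList addLast: pageIndex.
        ^page

    DBResultSetProxy>>fetchPage: pageIndex
        "Read one page's worth of rows from the underlying result set."
        | first page |
        first := (pageIndex - 1) * self pageSize + 1.
        page := OrderedCollection new: self pageSize.
        resultSet
            from: first
            to: (first + self pageSize - 1 min: resultSet size)
            do: [:each | page addLast: each].
        ^page

Note that this relies on the result set having been opened with
forwardOnly: false, and #fetchPage: is exactly where the random-access cost
described above would bite.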

In short, it's not simple ;-)

    -- chris