P3 timeout

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

P3 timeout

Pierce Ng-3
Hi Sven and all,

I have a PostgreSQL database running in a resource-light Linux VM with ~4
million rows consisting of mostly text. My query executes regexp_matches() on
the text and it runs noticeably slowly.

I am encountering ConnectionTimedOut using P3 to run my query. PostgresV2 and
Python's pg8000 do work though. I see that PostgresV2's PGConnection uses
"Socket standardTimeout". I changed ZdcSimpleSocketStream class-side's
#openConnectionToHostNamed:port: bottom part to add the same timeout because
P3Client>>open calls it:

  ZdcSimpleSocketStream class>>openConnectionToHostNamed: hostName port: portNumber
    [...]
    ^ socketStream
      timeout: Socket standardTimeOut; "<== added this line"
      connectTo: hostIP port: portNumber;
      yourself

Now my query returns instead of timing out.

It does look like ZdcSimpleSocketStream's three class-side methods could use
some refactoring love.

Pierce

Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Stephane Ducasse-3
Hi Pierce

It would be cool if we could get a little success stories around what
you are doing.
A success stories does not mean that you get as rich as Steve Jobs,
just that you build
something nice successfully. ;)
Stef

On Sun, Nov 26, 2017 at 9:43 AM, Pierce Ng <[hidden email]> wrote:

> Hi Sven and all,
>
> I have a PostgreSQL database running in a resource-light Linux VM with ~4
> million rows consisting of mostly text. My query executes regexp_matches() on
> the text and it runs noticeably slowly.
>
> I am encountering ConnectionTimedOut using P3 to run my query. PostgresV2 and
> Python's pg8000 do work though. I see that PostgresV2's PGConnection uses
> "Socket standardTimeout". I changed ZdcSimpleSocketStream class-side's
> #openConnectionToHostNamed:port: bottom part to add the same timeout because
> P3Client>>open calls it:
>
>   ZdcSimpleSocketStream class>>openConnectionToHostNamed: hostName port: portNumber
>     [...]
>     ^ socketStream
>       timeout: Socket standardTimeOut; "<== added this line"
>       connectTo: hostIP port: portNumber;
>       yourself
>
> Now my query returns instead of timing out.
>
> It does look like ZdcSimpleSocketStream's three class-side methods could use
> some refactoring love.
>
> Pierce
>

Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Sven Van Caekenberghe-2
In reply to this post by Pierce Ng-3
Hi Pierce,

Thanks for the feedback. Yes, I did run into the timeout issue for heavy queries myself as well. I was already planning to add this option.

https://github.com/svenvc/P3/commit/ce68fd206b47c0411b75f0992ef4f4e119eb4ff1

 Added P3Client>>#timeout[:] option (thx Pierce Ng)
 Added P3ClientTests>>#testTimeout

 Refactored host/port/user/password/database into properties

Now you can do the following:

P3ClientTests>>#testTimeout
        "Reconnect with an explicit timeout"
        client close.
        client url: url.
        client timeout: 1. "second"
        self assert: client isWorking.
        "The following (sleeping for 0.5 seconds) should just pass successfully"
        client execute: 'SELECT pg_sleep(0.5)'.
        "While the following (sleeping for 1.5 seconds) should time out"
        self should: [ client execute: 'SELECT pg_sleep(1.5)' ] raise: ConnectionTimedOut

About Zdc streams. They have their own default timeout (5 seconds). I think it is not a good idea to have global timeout setting, these should be done per actual (client) instance. I like relatively short timeouts, so that I get faster feedback.

BTW, I am also interested in your experience using P3. How did it go ?

Sven

> On 26 Nov 2017, at 09:43, Pierce Ng <[hidden email]> wrote:
>
> Hi Sven and all,
>
> I have a PostgreSQL database running in a resource-light Linux VM with ~4
> million rows consisting of mostly text. My query executes regexp_matches() on
> the text and it runs noticeably slowly.
>
> I am encountering ConnectionTimedOut using P3 to run my query. PostgresV2 and
> Python's pg8000 do work though. I see that PostgresV2's PGConnection uses
> "Socket standardTimeout". I changed ZdcSimpleSocketStream class-side's
> #openConnectionToHostNamed:port: bottom part to add the same timeout because
> P3Client>>open calls it:
>
>  ZdcSimpleSocketStream class>>openConnectionToHostNamed: hostName port: portNumber
>    [...]
>    ^ socketStream
>      timeout: Socket standardTimeOut; "<== added this line"
>      connectTo: hostIP port: portNumber;
>      yourself
>
> Now my query returns instead of timing out.
>
> It does look like ZdcSimpleSocketStream's three class-side methods could use
> some refactoring love.
>
> Pierce
>


Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Pierce Ng-3
On Sun, Nov 26, 2017 at 07:17:51PM +0100, Sven Van Caekenberghe wrote:
>  Added P3Client>>#timeout[:] option (thx Pierce Ng)

Thank you, Sven.

> About Zdc streams. They have their own default timeout (5 seconds). I think
> it is not a good idea to have global timeout setting, these should be done
> per actual (client) instance. I like relatively short timeouts, so that I get
> faster feedback.

Agree. Mine was just a quick hack.

> BTW, I am also interested in your experience using P3. How did it go ?

Very easy to get going. I'm querying my server in 'batch' mode, and trying to
work out the best mix of Pharo and PostgreSQL code to process the data. As we
all know on this list, rapid development with Pharo is productive and fun.

Are you planning to implement the advanced query protocol with parameter binding?

Pierce


Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Pierce Ng-3
In reply to this post by Stephane Ducasse-3
On Sun, Nov 26, 2017 at 02:03:13PM +0100, Stephane Ducasse wrote:
> It would be cool if we could get a little success stories around what
> you are doing.
> A success stories does not mean that you get as rich as Steve Jobs,
> just that you build
> something nice successfully. ;)

Hi Stef,

Will do when I get somewhere with this. :-) I'm taking your lead by working on
some booklets, although for now $DAY_JOB beckons and progress has been slow.

Pierce

Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Stephane Ducasse-3
If you need help for your booklet setup let me know.

Stef

On Tue, Dec 5, 2017 at 1:02 AM, Pierce Ng <[hidden email]> wrote:

> On Sun, Nov 26, 2017 at 02:03:13PM +0100, Stephane Ducasse wrote:
>> It would be cool if we could get a little success stories around what
>> you are doing.
>> A success stories does not mean that you get as rich as Steve Jobs,
>> just that you build
>> something nice successfully. ;)
>
> Hi Stef,
>
> Will do when I get somewhere with this. :-) I'm taking your lead by working on
> some booklets, although for now $DAY_JOB beckons and progress has been slow.
>
> Pierce
>

Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Sven Van Caekenberghe-2
In reply to this post by Pierce Ng-3


> On 5 Dec 2017, at 01:00, Pierce Ng <[hidden email]> wrote:
>
> On Sun, Nov 26, 2017 at 07:17:51PM +0100, Sven Van Caekenberghe wrote:
>> Added P3Client>>#timeout[:] option (thx Pierce Ng)
>
> Thank you, Sven.
>
>> About Zdc streams. They have their own default timeout (5 seconds). I think
>> it is not a good idea to have global timeout setting, these should be done
>> per actual (client) instance. I like relatively short timeouts, so that I get
>> faster feedback.
>
> Agree. Mine was just a quick hack.
>
>> BTW, I am also interested in your experience using P3. How did it go ?
>
> Very easy to get going.

Great !

> I'm querying my server in 'batch' mode, and trying to
> work out the best mix of Pharo and PostgreSQL code to process the data. As we
> all know on this list, rapid development with Pharo is productive and fun.

Yes, it is fun.

> Are you planning to implement the advanced query protocol with parameter binding?

Not immediately. It was not needed for Glorp and I doubt a bit that it would make much performance difference. Intuitively, query execution time and data transfer time feel as the most important, binding the query with parameters much less so.

Sven

> Pierce
>
>


Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Henrik Sperre Johansen
Sven Van Caekenberghe-2 wrote
>> On 5 Dec 2017, at 01:00, Pierce Ng &lt;

> pierce@

> &gt; wrote:
>>
>> On Sun, Nov 26, 2017 at 07:17:51PM +0100, Sven Van Caekenberghe wrote:
>>> Added P3Client>>#timeout[:] option (thx Pierce Ng)
>>
>> Thank you, Sven.
>>
>>> About Zdc streams. They have their own default timeout (5 seconds). I
>>> think
>>> it is not a good idea to have global timeout setting, these should be
>>> done
>>> per actual (client) instance. I like relatively short timeouts, so that
>>> I get
>>> faster feedback.
>>
>> Agree. Mine was just a quick hack.
>>
>>> BTW, I am also interested in your experience using P3. How did it go ?
>>
>> Very easy to get going.
>
> Great !
>
>> I'm querying my server in 'batch' mode, and trying to
>> work out the best mix of Pharo and PostgreSQL code to process the data.
>> As we
>> all know on this list, rapid development with Pharo is productive and
>> fun.
>
> Yes, it is fun.
>
>> Are you planning to implement the advanced query protocol with parameter
>> binding?
>
> Not immediately. It was not needed for Glorp and I doubt a bit that it
> would make much performance difference. Intuitively, query execution time
> and data transfer time feel as the most important, binding the query with
> parameters much less so.
>
> Sven
>
>> Pierce
>>
>>

For selects, that's probably true.
For inserts however (at least in my experience with Oracle), queries
inserting moderately sized batches (say, 1k rows/transaction), will go from
glacial to blazingly fast, when switched from 1 INSERT per row  to a single
insert with bindings containing data for all 1000 rows.

Similar to using nextPutAll: instead of nextPut: when writing to unbuffered,
flushed files, really ;)

Cheers,
Henry



--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html

Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Sven Van Caekenberghe-2


> On 7 Dec 2017, at 11:38, Henrik Sperre Johansen <[hidden email]> wrote:
>
> Sven Van Caekenberghe-2 wrote
>>> On 5 Dec 2017, at 01:00, Pierce Ng &lt;
>
>> pierce@
>
>> &gt; wrote:
>>>
>>> On Sun, Nov 26, 2017 at 07:17:51PM +0100, Sven Van Caekenberghe wrote:
>>>> Added P3Client>>#timeout[:] option (thx Pierce Ng)
>>>
>>> Thank you, Sven.
>>>
>>>> About Zdc streams. They have their own default timeout (5 seconds). I
>>>> think
>>>> it is not a good idea to have global timeout setting, these should be
>>>> done
>>>> per actual (client) instance. I like relatively short timeouts, so that
>>>> I get
>>>> faster feedback.
>>>
>>> Agree. Mine was just a quick hack.
>>>
>>>> BTW, I am also interested in your experience using P3. How did it go ?
>>>
>>> Very easy to get going.
>>
>> Great !
>>
>>> I'm querying my server in 'batch' mode, and trying to
>>> work out the best mix of Pharo and PostgreSQL code to process the data.
>>> As we
>>> all know on this list, rapid development with Pharo is productive and
>>> fun.
>>
>> Yes, it is fun.
>>
>>> Are you planning to implement the advanced query protocol with parameter
>>> binding?
>>
>> Not immediately. It was not needed for Glorp and I doubt a bit that it
>> would make much performance difference. Intuitively, query execution time
>> and data transfer time feel as the most important, binding the query with
>> parameters much less so.
>>
>> Sven
>>
>>> Pierce
>>>
>>>
>
> For selects, that's probably true.
> For inserts however (at least in my experience with Oracle), queries
> inserting moderately sized batches (say, 1k rows/transaction), will go from
> glacial to blazingly fast, when switched from 1 INSERT per row  to a single
> insert with bindings containing data for all 1000 rows.
>
> Similar to using nextPutAll: instead of nextPut: when writing to unbuffered,
> flushed files, really ;)

Yes, inserting lots of data (usually during (initial) data load) is a special case which might be different. PostgreSQL has special COPY functionality for that. Still, you can put a lot of data in a single INSERT statement as well.

What I forgot to say in my previous message was that the rest of the protocol could certainly be added within the current setup, although it would probably make some aspects more complex. Right now, there is no explicit state machine, when adding all the other logic, it will be more difficult to hand code around that.

> Cheers,
> Henry
>
>
>
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html


Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Pierce Ng-3
In reply to this post by Stephane Ducasse-3
On Tue, Dec 05, 2017 at 07:54:00AM +0100, Stephane Ducasse wrote:
> If you need help for your booklet setup let me know.

Thanks. I'm already set up with Pillar and all that. I'm working on extending
Sven's Reddit.st. Next step is to make it run over TLS, which is why I'm
writing a Let's Encrypt client, which is why I'm writing the OpenSSL wrapper.

Pierce


Reply | Threaded
Open this post in threaded view
|

Re: P3 timeout

Stephane Ducasse-3
Ok this looks like a good plan :)

On Mon, Dec 11, 2017 at 2:48 PM, Pierce Ng <[hidden email]> wrote:

> On Tue, Dec 05, 2017 at 07:54:00AM +0100, Stephane Ducasse wrote:
>> If you need help for your booklet setup let me know.
>
> Thanks. I'm already set up with Pillar and all that. I'm working on extending
> Sven's Reddit.st. Next step is to make it run over TLS, which is why I'm
> writing a Let's Encrypt client, which is why I'm writing the OpenSSL wrapper.
>
> Pierce
>
>