slow data page reads?

Johan Brichau-2

slow data page reads?

Hi Gemstoners,

Is there any condition (other than a slow filesystem) that would trigger slow page reads when a gem needs to hit disk and load objects?

Here is the problem I'm trying to chase: a Seaside gem is processing a request and (according to the statmonitor output) ends up requesting pages. The page reads go terribly slowly (taking approximately 50 s), and I see only 5 to 15 pages per second being read during that period. There is no other activity at that moment, and I'm puzzled as to why the reads are so slow (other than a slow filesystem; see next).

Because iostat also shows the same low read speed and reports 100% disk utilization, my first impression was that the disk is saturated and we have a datastore problem. However, the disk read speed proves to be good when I'm doing other disk activity outside of GemStone.
Moreover, the _write_ speed is very good at all times.

So I'm currently trying to chase something that triggers slow page reads only from a GemStone topaz session.

GEM_IO_LIMIT is set to its default of 5000.

For illustration, these are some iostat samples taken while GemStone is doing read access:

Time: 06:40:21 PM
Device:  rrqm/s  wrqm/s   r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz   await   svctm  %util
sda3       0.00    0.20  6.00  0.40   0.09   0.00     30.75      1.00  166.88  156.00  99.84

Time: 06:40:26 PM
Device:  rrqm/s  wrqm/s   r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz   await   svctm  %util
sda3       0.00    0.20  8.20  0.40   0.13   0.00     31.07      1.05  119.91  115.72  99.52

Time: 06:40:31 PM
Device:  rrqm/s  wrqm/s   r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz   await   svctm  %util
sda3       0.00    0.20  5.99  0.40   0.09   0.00     30.75      1.01  157.75  156.25  99.80
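
The iostat rows above can be turned into more readable figures. What follows is a minimal Python sketch added for illustration, not something from the original post. It assumes the column order of the `iostat -x` header quoted above and that avgrq-sz is reported in 512-byte sectors, as the sysstat documentation describes. The function names are made up.

```python
# Column names in the order of the `iostat -x` header quoted above.
FIELDS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s",
          "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

SECTOR_BYTES = 512  # assumption: avgrq-sz is expressed in 512-byte sectors


def parse_iostat_row(line):
    """Split one device row into (device, {column name: float})."""
    device, *values = line.split()
    return device, dict(zip(FIELDS, map(float, values)))


def summarize(line):
    """Derive request size and request rate from one iostat device row."""
    device, stats = parse_iostat_row(line)
    request_kib = stats["avgrq-sz"] * SECTOR_BYTES / 1024
    ops_per_s = stats["r/s"] + stats["w/s"]
    return {
        "device": device,
        "request_kib": request_kib,          # average size of one request
        "ops_per_s": ops_per_s,              # reads plus writes per second
        "kib_per_s": ops_per_s * request_kib,
        "util_percent": stats["%util"],
    }


if __name__ == "__main__":
    row = "sda3 0.00 0.20 6.00 0.40 0.09 0.00 30.75 1.00 166.88 156.00 99.84"
    print(summarize(row))
```

For the first sample this works out to an average request of about 15.4 KiB at 6.4 requests per second, i.e. roughly 0.1 MB/s, while the device reports being busy almost 100% of the time.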

Johan Brichau-2

Re: slow data page reads?

Hi all,

Never mind my question below: our hosting provider has identified the problem on their SAN.
Strange behavior though...

phew ;-)
Johan

On 13 Feb 2012, at 14:05, Johan Brichau wrote:

> Hi Gemstoners,
>
> Is there any condition (other than a slow filesystem) that would trigger slow page reads when a gem needs to hit disk and load objects?

otto

Re: slow data page reads?

Hi Johan,

We had a machine hosted on a VPS, with a "state of the art" SAN, with
similar issues. We complained every so often, and the service provider
responded that they were unable to control some users on the same VPS
host doing "extremely heavy" disk I/O. We moved the client off the VPS
onto a normal machine with a SATA disk and have had joy ever since
(a 10-20x improvement over the VPS at its best).

I think that the randomness of the reads, thrown on top of other VMs on
the same host, just caused unpredictable I/O, so we prefer avoiding VMs.

Alternatively, if it can work for you, put the extents in RAM.

Otto

On 13 Feb 2012, at 20:16, Johan Brichau <[hidden email]> wrote:

> Hi all,
>
> Never mind my question below: our hosting provider has identified the problem on their SAN.
> Strange behavior though...
>
> phew ;-)
> Johan

Johan Brichau-2

Re: slow data page reads?

Well.. it turns out that we were wrong and we still experience the problem...

Dale,

What we are seeing sounds very similar to this:

http://gemstonesoup.wordpress.com/2007/10/19/scaling-seaside-with-gemstones/

" The issue with the i/o anomalies that we observed in Linux has not been as easy to resolve. I spent some time tuning GemStone/S to make sure that GemStone/S wasn't the source of the anomaly. Finally our IS guy was able to reproduce the anomaly and he ran into a few other folks on the net that have observed similar anomalies.

At this writing we haven't found a solution to the anomaly, but we are pretty optimistic that it is resolvable. We've seen different versions of Linux running on similar hardware that doesn't show the anomaly, so it is either a function of the kernel version or the settings of some of the kernel parameters. As soon as we figure it out we'll let you know."

Do you have more information on this?

Johan

On 13 Feb 2012, at 19:39, Otto Behrens wrote:

> Hi Johan,
>
> We had a machine hosted on a VPS, with a "state of the art" san, with
> similar issues. We complained every so often and the service provider
> responded with their inability to control some users on the same VPS
> host doing "extremely heavy" disk io. We got the client off the vps
> onto a normal machine with a SATA disk and have had joy ever since
> (10-20x improvement with the vps at its best).

Johan Brichau-2

Re: slow data page reads?

As mentioned in Dale's blog post, I went on to try a raw disk partition for the extent and the tranlogs and got exactly the same results: *very* low disk read speed (see below). Starting GemStone and loading the SPC (shared page cache) takes a long time.

We are pretty certain the SAN is not overloaded, because all other disk operations reach much higher speeds. For example, the copydbf operation from the extent file to the partition reached very good speeds of over 30 MB/s.

So we are only seeing this issue when GemStone is doing read access on this kind of setup. I have other servers where everything is running smoothly.

If anybody has any ideas... that would be cool ;-)

Johan

Sample read speed during gemstone page read:

Device:  rrqm/s  wrqm/s    r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda5     111.60    0.00  37.00  0.00   0.58   0.00     32.00      1.00  26.90  27.01  99.92
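
A back-of-the-envelope check on the sda5 sample, as a hedged Python sketch (added for illustration, not from the original post). It applies Little's law, throughput = average queue size / average time per request, to the avgqu-sz and await columns, and assumes avgrq-sz is in 512-byte sectors. The helper names are hypothetical.

```python
SECTOR_BYTES = 512  # assumption: avgrq-sz is expressed in 512-byte sectors


def implied_requests_per_s(avg_queue_size, await_ms):
    """Little's law: requests in flight divided by the time each one takes."""
    return avg_queue_size / (await_ms / 1000.0)


def read_mib_per_s(reads_per_s, avg_request_sectors):
    """Data rate implied by the read rate and the average request size."""
    return reads_per_s * avg_request_sectors * SECTOR_BYTES / (1024 * 1024)


if __name__ == "__main__":
    # Values copied from the sda5 sample: avgqu-sz, await, r/s, avgrq-sz.
    print(round(implied_requests_per_s(1.00, 26.90), 1))  # 37.2
    print(round(read_mib_per_s(37.00, 32.00), 2))         # 0.58
```

With about one request outstanding and roughly 27 ms per request, the device cannot complete more than about 37 requests per second, which matches the observed r/s. At 32 sectors (16 KiB) per request that is only about 0.58 MiB/s, so the low throughput is consistent with high per-request latency on small reads rather than with a bandwidth limit.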

On 13 Feb 2012, at 21:09, Johan Brichau wrote:

> Well.. it turns out that we were wrong and we still experience the problem...

Johan Brichau-2

Re: slow data page reads?

In reply to this post by otto

Otto,

I'm curious: can you remember if you saw bad io also outside of Gemstone?
And maybe even better write than read speed?

Chasing on...
Johan


otto

Re: slow data page reads?

> I'm curious: can you remember if you saw bad io also outside of Gemstone?
> And maybe even better write than read speed?

As I remember it, we had better speed when copying files around, but
still not good. I can't say that writing was specifically faster than
reading. Reading random pages was very much the worst. So we did not
conclude that it was something specific to do with GemStone. But when
you mentioned this problem you're having now, it certainly sounded
familiar, especially looking at the iostat output. I'm hunting through
the support emails and I'll forward some to you.

I don't know if you've got the async io driver enabled or not; I
remember reading that this used to be what GemStone recommended, but
it's no longer the case. I've seen complaints about slow IO devices in
the stone log before...

Sorry, best I can remember is snippets like these.

Larry Kellogg

Announcement, performance questions

In reply to this post by Johan Brichau-2

Hello,

Well, Apple approved my iPhone application, PracticeMusic (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over the weekend, so I am excited to announce that www.practicemusic.com is now deployed. ;-) All comments and suggestions are welcome.

I would like to extend a sincere thanks to all of the people on this forum who helped me out. I'll probably miss some people, but thanks are due to Dale, Nick, Norbert, Johan, Paul, and many others. You guys have been great. Keep up the good work!

A few performance questions. Although load is low right now, I have noticed a few hangs in Seaside, where a page doesn't get rendered and the URL at the top of the browser just sits there for a long while, spinning. I have no idea what is going on, but I can hit the back arrow and navigate around the website. I don't see any exceptions in the Object Log. How do I go about looking into performance issues in Seaside?

Also, where do I find commit failures? Which log file do they get written to?

Thanks again,

Larry

Johan Brichau-2

Re: Announcement, performance questions

Hi Lawrence,

Congratulations with the app! :-)

On 14 Feb 2012, at 15:45, Lawrence Kellogg wrote:

> A few performance questions. Although, load is low right now, I have noticed a few hangs in Seaside, where a page doesn't get rendered, and the url at the top of the browser just sits there for a long while, spinning. I have no idea what is going on and can back arrow and navigate around the website. I don't see any exceptions in the Object Log. How do I go about looking into performance issues in Seaside?

You might want to investigate a few things:
- Did the gem crash? Check the gem log (e.g. FastCGI_server_9001.log) for stack traces or error reports, and the frontend web server logs for errors.
- Is the gem consuming CPU cycles or waiting for I/O? You can check that with 'top': is a topaz process consuming a lot of CPU or I/O?
- You can run the statmonitor tool (see the GemStone docs) to gather statistics on what GemStone is doing.

That should be enough to get you started, because all other actions depend on whether it was a crash, long CPU time, or long I/O time.
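
For the CPU-versus-I/O check, Linux also exposes a per-process state that distinguishes "running" from "blocked on I/O". Here is a minimal Python sketch, added for illustration and assuming a Linux /proc filesystem; the function names are hypothetical and this is not a GemStone tool.

```python
# Meaning of the most common state letters in /proc/<pid>/stat (see proc(5)).
STATE_MEANING = {
    "R": "running or runnable (using CPU)",
    "D": "uninterruptible sleep (usually waiting for disk I/O)",
    "S": "interruptible sleep (idle or waiting for an event)",
}


def parse_proc_stat(stat_line):
    """Extract command name, state and CPU ticks from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')'.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    return {
        "comm": comm,
        "state": fields[0],              # field 3 of the full line
        "utime_ticks": int(fields[11]),  # field 14: user-mode CPU time
        "stime_ticks": int(fields[12]),  # field 15: kernel-mode CPU time
    }


def describe_process(pid):
    """Read /proc/<pid>/stat and attach a human-readable state description."""
    with open(f"/proc/{pid}/stat") as f:
        info = parse_proc_stat(f.read())
    info["meaning"] = STATE_MEANING.get(info["state"], "other")
    return info
```

A single snapshot can mislead, so it is worth sampling `describe_process(pid)` a few times during a hang: a gem that is mostly in state D is waiting on the disk, while one in state R with fast-growing tick counts is CPU-bound.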

Johan

Larry Kellogg

Re: Announcement, performance questions

On Feb 14, 2012, at 9:55 AM, Johan Brichau wrote:

> Hi Lawrence,
>
> Congratulations with the app! :-)

Thanks!

>
> On 14 Feb 2012, at 15:45, Lawrence Kellogg wrote:
>
>> A few performance questions. Although, load is low right now, I have noticed a few hangs in Seaside, where a page doesn't get rendered, and the url at the top of the browser just sits there for a long while, spinning. I have no idea what is going on and can back arrow and navigate around the website. I don't see any exceptions in the Object Log. How do I go about looking into performance issues in Seaside?
>
> You might want to investigate a few things:
> - did the gem crash? see the gem log (e.g. FastCGI_server_9001.log) for any stack traces or error reports and the frontend webserver logs for errors.

I don't think the gem crashed as I cannot find any stack trace, and I was able to keep navigating around after I hit the back button, so my session was still active.
It's strange.

> - is the gem consuming CPU cycles or busy waiting for io? You can check that out using 'top' -> is a topaz consuming a lot of cpu or io?
> - You can run the statmonitor (see gemstone docs) tool to gather statistics on what gemstone is doing
>

Thanks for this information. I should have invoked those tools when I saw the behavior. I will keep an eye on the system and do it next time.

Larry

> That should be ok to get you started, because all other actions depend on whether it was a crash / long cpu time / long io time
>
> Johan

Philippe Marschall

Re: Announcement, performance questions

In reply to this post by Larry Kellogg

2012/2/14 Lawrence Kellogg <[hidden email]>:
> Hello,
> Well, Apple approved my iPhone application, PracticeMusic
> (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over
> the weekend, so, I am excited to announce that www.practicemusic.com is now
> deployed. ;-) All comments and suggestions are welcome.

Great. I noted a couple of small things:

- You have several divs with id="container" and id="wrapper". You
probably want them to be class="container" and class="wrapper".

- Consider moving your stylesheet from a #style method to a file.
Otherwise it will end up in the session cache (I know this sucks, I
plan to fix this next weekend).

- I have a layout bug with Firefox 11 on Linux (screenshot).

Cheers
Philippe

[Attachment: Screenshot.png (49K)]

Larry Kellogg

Re: Announcement, performance questions

On Feb 14, 2012, at 10:32 AM, Philippe Marschall wrote:

> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>> Hello,
>> Well, Apple approved my iPhone application, PracticeMusic
>> (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over
>> the weekend, so, I am excited to announce that www.practicemusic.com is now
>> deployed. ;-) All comments and suggestions are welcome.
>
> Great. I noted a couple of small things:
>
> - You have several div with id="container" and id="wrapper". You
> probably want them to be class="container" class="wrapper".
>

Hello Philippe,
Thanks for taking a moment to look at the site. Ah, yes, I should
change those definitions to classes. I am a CSS newbie, it probably
shows. I guess I'm not clear when to use class for an AP DIV and when
to use ID.

> - Consider moving your stylesheet from a #style method to a file.

How could you tell that they were still in the #style method? ;-)
My config is not open, is it? I did password protect the thing.

> Otherwise it will end up in the session cache (I know this sucks, I
> plan to fix this the next weekend).
>

Ah, that's the problem. Ok, I will put them in the UserFileLibrary.

> - I have a layout bug with Firefox 11 on Linux (screenshot).
>

Wow, that's terrible! It looks ok in Firefox 9.01. I guess I will have to get
11 and try to figure out what is going on, although I see that 10.0.1 is the default
download for Mac.

I made a little change to the canvas size on the UserLogin. Any better?

Larry
> Cheers
> Philippe
> <Screenshot.png>

Philippe Marschall

Re: Announcement, performance questions

2012/2/14 Lawrence Kellogg <[hidden email]>:

>
> On Feb 14, 2012, at 10:32 AM, Philippe Marschall wrote:
>
>> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>>> Hello,
>>> Well, Apple approved my iPhone application, PracticeMusic
>>> (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over
>>> the weekend, so, I am excited to announce that www.practicemusic.com is now
>>> deployed. ;-) All comments and suggestions are welcome.
>>
>> Great. I noted a couple of small things:
>>
>> - You have several div with id="container" and id="wrapper". You
>> probably want them to be class="container" class="wrapper".
>>
>
> Hello Philippe,
> Thanks for taking a moment to look at the site. Ah, yes, I should
> change those definitions to classes. I am a CSS newbie, it probably
> shows. I guess I'm not clear when to use class for an AP DIV and when
> to use ID.

IDs are for things that appear only once; classes are for things that
can appear several times.
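
To make the "only once" rule concrete, here is a small Python sketch (an illustration added here, not part of Seaside) that reports id values used more than once in a page. It relies only on the standard library's html.parser; the class and function names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id attribute value occurs in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the sorted id values that appear on more than one element."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(i for i, count in collector.ids.items() if count > 1)


if __name__ == "__main__":
    page = """
    <div id="container"><div id="wrapper">one</div></div>
    <div id="container"><div class="wrapper">two</div></div>
    """
    print(duplicate_ids(page))  # ['container']
```

Any value this reports is a candidate for turning into a class, since a class may legitimately be shared by several elements.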

>> - Consider moving your stylesheet from a #style method to a file.
>
> How could you tell that they were still in the #style method? ;-)
> My config is not open, is it? I did password protect the thing.

I can tell from some of the URLs and seeing quite a few Seaside
applications in my time:

<link rel="stylesheet" type="text/css"
href="/PracticeJournalLoginTask?_s=mz7PQuhvG-iPkB-G"/>

Notice the _s? That means it goes into the session cache, which is
what happens with #style.
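
The check described here (spotting the `_s` session key in a stylesheet URL) can be sketched in Python. This is an illustration added to the thread, not Seaside code; the helper name is hypothetical, and the static path in the example is made up.

```python
from urllib.parse import parse_qs, urlsplit


def has_session_key(url, key="_s"):
    """Return True if the URL's query string carries the given session key."""
    query = urlsplit(url).query
    return key in parse_qs(query, keep_blank_values=True)


if __name__ == "__main__":
    # URL taken from the <link> element quoted above.
    print(has_session_key("/PracticeJournalLoginTask?_s=mz7PQuhvG-iPkB-G"))  # True
    # Hypothetical path for a stylesheet served as a static file.
    print(has_session_key("/static/style.css"))  # False
```

A stylesheet URL for which this returns True is being served through a session rather than as a plain static file.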

>> Otherwise it will end up in the session cache (I know this sucks, I
>> plan to fix this the next weekend).
>>
>
> Ah, that's the problem. Ok, I will put them in the UserFileLibrary.

It's best to have them on the file system and serve them with
Nginx/Apache. This way they don't hit the image at all. Probably put them
where you put your static images.

>> - I have a layout bug with Firefox 11 on Linux (screenshot).
>>
>
> Wow, that's terrible! It looks ok in Firefox 9.01. I guess I will have to get
> 11 and try to figure out what is going on, although I see that 10.0.1 is the default
> download for Mac.
>
> I made a little change to the canvas size on the UserLogin. Any better?

Nope, the easiest but ugly fix would probably be a #break

Cheers
Philippe

Larry Kellogg

Re: Announcement, performance questions

On Feb 14, 2012, at 12:42 PM, Philippe Marschall wrote:

> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>>
>> On Feb 14, 2012, at 10:32 AM, Philippe Marschall wrote:
>>
>>> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>>>> Hello,
>>>> Well, Apple approved my iPhone application, PracticeMusic
>>>> (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over
>>>> the weekend, so, I am excited to announce that www.practicemusic.com is now
>>>> deployed. ;-) All comments and suggestions are welcome.
>>>
>>> Great. I noted a couple of small things:
>>>
>>> - You have several div with id="container" and id="wrapper". You
>>> probably want them to be class="container" class="wrapper".
>>>
>>
>> Hello Philippe,
>> Thanks for taking a moment to look at the site. Ah, yes, I should
>> change those definitions to classes. I am a CSS newbie, it probably
>> shows. I guess I'm not clear when to use class for an AP DIV and when
>> to use ID.
>
> IDs are for things that appear only once; classes are for things that
> can appear several times.

At the moment, there are only two style sheets in the app, one for the
login page, and one for all of the other views. Each style sheet has
one "container" and one "wrapper" defined.

I'm not clear on the idea of appearing multiple times, in terms of
CSS. What does that mean?

>
>>> - Consider moving your stylesheet from a #style method to a file.
>>
>> How could you tell that they were still in the #style method? ;-)
>> My config is not open, is it? I did password protect the thing.
>
> I can tell from some of the URLs and seeing quite a few Seaside
> applications in my time:
>
> <link rel="stylesheet" type="text/css"
> href="/PracticeJournalLoginTask?_s=mz7PQuhvG-iPkB-G"/>
>
> Notice the _s? That means it goes into the session cache, which is
> what happens with #style.
>
>>> Otherwise it will end up in the session cache (I know this sucks, I
>>> plan to fix this the next weekend).
>>>
>>
>> Ah, that's the problem. Ok, I will put them in the UserFileLibrary.
>
> It's best to have it on the file system and serve them with
> Nginx/Apache. This way they don't hit the image at all. Probably put it
> where you put your static images.

Got it, I will fix this problem right away. It should be easy to serve them from nginx.
I just have to change updateRoot.

>
>>> - I have a layout bug with Firefox 11 on Linux (screenshot).
>>>
>>
>> Wow, that's terrible! It looks ok in Firefox 9.01. I guess I will have to get
>> 11 and try to figure out what is going on, although I see that 10.0.1 is the default
>> download for Mac.
>>
>> I made a little change to the canvas size on the UserLogin. Any better?
>
> Nope, the easiest but ugly fix would probably be a #break
>

Interesting. I downloaded Firefox 11.0 (the Beta) on my Mac and it looks fine to me.
I'm not sure what is going on. Where does the break need to go? Everything is slotted into
DIVs so it's probably not that easy.

Larry

> Cheers
> Philippe

Dale Henrichs

Re: slow data page reads?

In reply to this post by Johan Brichau-2

Johan,

We have a program that tests the performance of a system doing random page reads against an extent. Launch `$GEMSTONE/sys/pgsvrslow`, then enter the following commands at the 'PGSVR>' prompt:

'$GEMSTONE/seaside/data/extent0.dbf' opendbfnolock
<numpages> testreadrate
<numpages in block> <numsamples> testbigreadrate

The `testreadrate` command reads <numpages> random pages from the given extent. The answer you get gives random read performance.

The `testbigreadrate` command does <numsamples> reads of <numpages in block> pages from random locations in the given extent. The answer you get gives you a measure of sequential read performance.

Here's sample output from one of our desktop boxes on a standard file system (basically reading from the file buffer):

---------------------------------------------------------------------------------
% $GEMSTONE/sys/pgsvrslow
PGSVR>'extent0.dbf' opendbfnolock

PGSVR>10000 testreadrate

10000 random pages read in 16 ms
Avg random read rate: 625000.00 pages/s (1.6000e-03 ms/read)

PGSVR>100 20 testbigreadrate

2000 random pages read in 20 IO calls in 4 ms
Avg random IO rate: 5000.00 IO/s (2.0000e-01 ms/read)
Avg page read rate: 500000.00 pages/s (2.0000e-03 ms/ page read)
PGSVR>
---------------------------------------------------------------------------------

These commands can be run against the extent for a running stone... but you'll want to get measurements with a variety of configurations.
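
For readers without access to pgsvrslow, the same kind of measurement can be approximated with a short script. The following Python sketch is an assumption-laden illustration, not GemStone's implementation: it assumes a Unix system (for `os.pread`), a 16 KiB page size, and made-up function names. It times reads of page-aligned blocks at random offsets in a file.

```python
import os
import random
import time

PAGE_BYTES = 16 * 1024  # assumption: 16 KiB pages; adjust to match your extent


def rate_per_s(count, elapsed_ms):
    """Events per second, given a count and an elapsed time in milliseconds."""
    return count * 1000 / elapsed_ms


def random_read_test(path, pages_per_read, num_reads, seed=None):
    """Time num_reads reads of pages_per_read pages at random page offsets."""
    rng = random.Random(seed)
    block = PAGE_BYTES * pages_per_read
    last_start_page = os.path.getsize(path) // PAGE_BYTES - pages_per_read
    if last_start_page < 0:
        raise ValueError("file is smaller than one read block")
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(num_reads):
            offset = rng.randint(0, last_start_page) * PAGE_BYTES
            os.pread(fd, block, offset)  # read one block without seeking
        # Guard against a zero interval on very fast runs.
        elapsed_ms = max((time.perf_counter() - start) * 1000, 1e-6)
    finally:
        os.close(fd)
    pages = num_reads * pages_per_read
    return {
        "pages": pages,
        "elapsed_ms": elapsed_ms,
        "io_per_s": rate_per_s(num_reads, elapsed_ms),
        "pages_per_s": rate_per_s(pages, elapsed_ms),
    }
```

Calling `random_read_test("extent0.dbf", 1, 10000)` roughly corresponds to `10000 testreadrate`, and `random_read_test("extent0.dbf", 100, 20)` to `100 20 testbigreadrate`. The rate arithmetic is the same as in the sample output: `rate_per_s(10000, 16)` gives 625000.0 pages/s. As with the desktop numbers above, results will mostly reflect the operating system's file cache unless the file is much larger than RAM or the cache has been dropped.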

At the moment we're guessing that the SAN might be optimized for sequential reads rather than random reads (i.e., buffering issues). Also, are you sure that you aren't being throttled by your provider?

Finally, it is worth looking at a copy of the config file for the stone to see if there's anything there...

Dale

----- Original Message -----
| From: "Johan Brichau" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Sent: Tuesday, February 14, 2012 5:43:58 AM
| Subject: Re: [GS/SS Beta] slow data page reads?
|
| As mentioned in Dale's blogpost, I went on to try a raw disk
| partition for the extent and the tranlogs and got exactly the same
| results: *very* low disk read speed (see below). Starting Gemstone
| and reading the SPC takes a long time.
|
| We are pretty certain the SAN is not overloaded because all other
| disk operations can reach a lot higher speeds. For example, the
| copydbf operation from the extent file to the partition reached very
| good speeds of over 30MB/s.
|
| So we are only seeing this issue when gemstone is doing read access
| on this kind of setup. I have other servers where everything is
| running smoothly.
|
| If anybody has any ideas... that would be cool ;-)
|
| Johan
|
| Sample read speed during gemstone page read:
|
| Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
| avgrq-sz avgqu-sz await svctm %util
| sda5 111.60 0.00 37.00 0.00 0.58 0.00
| 32.00 1.00 26.90 27.01 99.92
|
|
| On 13 Feb 2012, at 21:09, Johan Brichau wrote:
|
| > Well.. it turns out that we were wrong and we still experience the
| > problem...
| >
| > Dale,
| >
| > What we are seeing sounds very similar to this:
| >
| > http://gemstonesoup.wordpress.com/2007/10/19/scaling-seaside-with-gemstones/
| >
| > " The issue with the i/o anomalies that we observed in Linux has
| > not been as easy to resolve. I spent some time tuning GemStone/S
| > to make sure that GemStone/S wasn't the source of the anomaly.
| > Finally our IS guy was able to reproduce the anomaly and he ran
| > into a few other folks on the net that have observed similar
| > anomalies.
| >
| > At this writing we haven't found a solution to the anomaly, but we
| > are pretty optimistic that it is resolvable. We've seen different
| > versions of Linux running on similar hardware that doesn't show
| > the anomaly, so it is either a function of the kernel version or
| > the settings of some of the kernel parameters. As soon as we
| > figure it out we'll let you know."
| >
| > Do you have more information on this?
| >
| > Johan
| >
| >
| > On 13 Feb 2012, at 19:39, Otto Behrens wrote:
| >
| >> Hi Johan,
| >>
| >> We had a machine hosted on a VPS, with a "state of the art" san,
| >> with similar issues. We complained every so often and the service
| >> provider responded with their inability to control some users on
| >> the same VPS host doing "extremely heavy" disk io. We got the
| >> client off the vps onto a normal machine with a SATA disk and have
| >> had joy ever since (10-20x improvement with the vps at its best).
| >>
| >> I think that the randomness of the reads thrown on top of other
| >> vms on the same host just caused unpredictable io; so we prefer
| >> avoiding vms.
| >>
| >> Alternatively, if it can work for you, put the extents in RAM.
| >>
| >> Otto
| >>
| >
|
|
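As a rough cross-check of the sda5 sample quoted above, the columns can be related to each other arithmetically. The sketch below assumes the older sysstat convention that avgrq-sz is reported in 512-byte sectors; the helper names are made up for illustration. It shows that each request averages 16 KiB, that 37 such reads per second account for the reported 0.58 rMB/s, and that 37 requests at about 27 ms service time each keep the device busy nearly 100% of the time. That pattern looks like small random reads limited by per-request latency rather than by bandwidth, which would also explain why a sequential copydbf can still reach 30MB/s.

```python
SECTOR_BYTES = 512  # assumption: avgrq-sz is expressed in 512-byte sectors


def parse_iostat_row(header, row):
    """Pair the column names of an iostat -x header with one device row."""
    names = header.split()[1:]          # drop the leading "Device:" label
    fields = row.split()
    stats = dict(zip(names, (float(v) for v in fields[1:])))
    stats["device"] = fields[0]
    return stats


def summarize(stats):
    """Derive request size, throughput and utilization from the raw columns."""
    iops = stats["r/s"] + stats["w/s"]
    avg_request_kib = stats["avgrq-sz"] * SECTOR_BYTES / 1024
    throughput_mib_s = iops * avg_request_kib / 1024
    # svctm is in milliseconds, so iops * svctm is busy-ms per second
    utilization_pct = iops * stats["svctm"] / 10
    return avg_request_kib, throughput_mib_s, utilization_pct


header = ("Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s "
          "avgrq-sz avgqu-sz await svctm %util")
row = "sda5 111.60 0.00 37.00 0.00 0.58 0.00 32.00 1.00 26.90 27.01 99.92"

stats = parse_iostat_row(header, row)
avg_kib, mib_s, util = summarize(stats)
print(f"{stats['device']}: {avg_kib:.0f} KiB/request, "
      f"{mib_s:.2f} MiB/s, {util:.1f}% busy")
```

If the repository page size is 16 KB (an assumption, not something stated in this thread), a 16 KiB average request would be consistent with one page being fetched per read.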

Larry Kellogg

Extent size explosion

Hello,
So, I run two separate Amazon instances, one called Production, and one called Staging.
My plan was to back up production twice a day and load the backups into the Staging environment.
Well, here are my extent sizes:

515899392 Feb 14 18:32 extent0.dbf - Production
4192206848 Feb 14 18:32 extent0.dbf - Staging

Why is the size of the extent in staging eight! times the extent in Production when I loaded a backup from Production
into Staging? I'm reading the Admin guide as fast as I can but I don't know what is going on.

I removedbf old tranlogs in Staging, get the disk space down to 70%, look the other way, and it
goes to 100%. Puzzling. Is there a way to shrink the extent and clean things up?

Thanks,

Larry
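For scale, the two file sizes listed above can be converted and compared directly. This is only arithmetic on the numbers in the post; the variable names are illustrative.

```python
MIB = 1024 * 1024

production_bytes = 515899392    # extent0.dbf on Production, from the listing
staging_bytes = 4192206848      # extent0.dbf on Staging, from the listing

production_mib = production_bytes / MIB
staging_mib = staging_bytes / MIB
ratio = staging_bytes / production_bytes

print(f"Production: {production_mib:.0f} MiB")
print(f"Staging:    {staging_mib:.0f} MiB")
print(f"Staging is {ratio:.2f}x the size of Production")
```

So the Staging extent is just under 4 GiB against roughly 0.5 GiB in Production, a factor of a little over eight.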

Philippe Marschall

Re: Announcement, performance questions

In reply to this post by Larry Kellogg

2012/2/14 Lawrence Kellogg <[hidden email]>:

>
> On Feb 14, 2012, at 12:42 PM, Philippe Marschall wrote:
>
>> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>>>
>>> On Feb 14, 2012, at 10:32 AM, Philippe Marschall wrote:
>>>
>>>> 2012/2/14 Lawrence Kellogg <[hidden email]>:
>>>>> Hello,
>>>>> Well, Apple approved my iPhone application, PracticeMusic
>>>>> (http://itunes.apple.com/us/app/practicemusic/id471597168?ls=1&mt=8), over
>>>>> the weekend, so, I am excited to announce that www.practicemusic.com is now
>>>>> deployed. ;-) All comments and suggestions are welcome.
>>>>
>>>> Great. I noted a couple of small things:
>>>>
>>>> - You have several div with id="container" and id="wrapper". You
>>>> probably want them to be class="container" class="wrapper".
>>>>
>>>
>>> Hello Philippe,
>>> Thanks for taking a moment to look at the site. Ah, yes, I should
>>> change those definitions to classes. I am a CSS newbie, it probably
>>> shows. I guess I'm not clear when to use class for an AP DIV and when
>>> to use ID.
>>
>> Ids are for things that only appear once, classes are for things that
>> can appear several times.
>
> At the moment, there are only two style sheets in the app, one for the
> login page, and one for all of the other views. Each style sheet has
> one "container" and one "wrapper" defined.
>
> I'm not clear on the idea of appearing multiple times, in terms of
> CSS. What does that mean?

I'm not talking about the CSS, I'm talking about the HTML. You can
have only one element with a given id:

<div id="wrapper"><div id="wrapper"></div></div>

There you have two elements with the same id. It "works" because the
browser corrects it, but it is likely not what you wanted. However, the
following is no problem:

<div class="wrapper"><div class="wrapper"></div></div>
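A quick way to spot repeated ids in generated markup is to count them. The following is a minimal sketch using only Python's standard html.parser; the class and function names are hypothetical, and it only looks at literal id attributes in the markup it is given.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count how often each id attribute value occurs in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(markup):
    """Return the sorted id values that appear more than once."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return sorted(v for v, count in collector.ids.items() if count > 1)


with_ids = '<div id="wrapper"><div id="wrapper"></div></div>'
with_classes = '<div class="wrapper"><div class="wrapper"></div></div>'

print(duplicate_ids(with_ids))      # ['wrapper']
print(duplicate_ids(with_classes))  # []
```

Feeding it the saved source of a rendered page would list every id that needs to become a class.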

>>
>>>> - Consider moving your stylesheet from a #style method to a file.
>>>
>>> How could you tell that they were still in the #style method? ;-)
>>> My config is not open, is it? I did password protect the thing.
>>
>> I can tell from some of the URLs and seeing quite a few Seaside
>> applications in my time:
>>
>> <link rel="stylesheet" type="text/css"
>> href="/PracticeJournalLoginTask?_s=mz7PQuhvG-iPkB-G"/>
>>
>> Notice the _s? That means it goes into the session cache, which is
>> what happens with #style.
>>
>>>> Otherwise it will end up in the session cache (I know this sucks, I
>>>> plan to fix this the next weekend).
>>>>
>>>
>>> Ah, that's the problem. Ok, I will put them in the UserFileLibrary.
>>
>> It's best to have it on the file system and serve them with
>> Nginx/Apache. This way they don't hit the image at all. Probably put it
>> where you put your static images.
>
> Got it, I will fix this problem right away. It should be easy to serve them from nginx.
> I just have to change updateRoot.
>
>>
>>>> - I have a layout bug with Firefox 11 on Linux (screenshot).
>>>>
>>>
>>> Wow, that's terrible! It looks ok in Firefox 9.01. I guess I will have to get
>>> 11 and try to figure out what is going on, although I see that 10.0.1 is the default
>>> download for Mac.
>>>
>>> I made a little change to the canvas size on the UserLogin. Any better?
>>
>> Nope, the easiest but ugly fix would probably be a #break
>>
>
> Interesting. I downloaded Firefox 11.0 (the Beta) on my Mac and it looks fine to me.
> I'm not sure what is going on. Where does the break need to go? Everything is slotted into
> DIVs so it's probably not that easy.

html text: 'Screen Name: '.
html textInput.

html break.

html text: 'Password:'.
html passwordInput.

Cheers
Philippe

James Foster-8

Re: Extent size explosion

In reply to this post by Larry Kellogg

Larry,

Did you start the Staging system with a clean extent ($GEMSTONE/bin/extent0.dbf) before doing the restore? Or did you restore into an extent that was already large?

James


Larry Kellogg

Re: Extent size explosion

Hello James,
Yes, I restored into an extent that was already large, I would say. How do I get a clean extent? I don't suppose it is a matter of just deleting the large extent.
I guess I could find the extent0 from the distribution and copy it in?

Larry


Larry Kellogg

Re: Announcement, performance questions

In reply to this post by Philippe Marschall

On Feb 14, 2012, at 1:56 PM, Philippe Marschall wrote:

> I'm not talking about the CSS, I'm talking about the HTML. You can
> have only one element with a given id:
>
> <div id="wrapper"><div id="wrapper"></div></div>
>
> There you have two elements with the same id. It "works" because the
> browser corrects it, but it is likely not what you wanted. However, the
> following is no problem:
>
> <div class="wrapper"><div class="wrapper"></div></div>

I think you meant "container" in the first div. OK, I will fix it, although I am giving priority to those things that are broken and don't "work". ;-)

> html text: 'Screen Name: '.
> html textInput.
>
> html break.
>
> html text: 'Password:'.
> html passwordInput.

Yeah, well, I want those items on the same line. I'm having a hard time debugging the problem when I can't see it.

Would it fix the problem if I shifted the login button to the right by, say, 50 pixels? Is your screenshot as messed up as before? I get this on Firefox 11

> Cheers
> Philippe
