my site is completely dead..


sergio_101

my site completely died today. i tried logging in with vnc, and it seems just stuck.. i can't do anything to it..

anyone have any ideas?  i really need this thing to run consistently ..

here is a screenshot of its current state:

http://db.tt/eVxJX6lr

thanks!


----
peace,
sergio
photographer, journalist, visionary

http://www.ThoseOptimizeGuys.com
http://www.CodingForHire.com
http://www.coffee-black.com
http://www.painlessfrugality.com
http://www.twitter.com/sergio_101
http://www.facebook.com/sergio101



_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: my site is completely dead..

Dennis Schetinin
Is the Pharo/Squeak process running at all? If it is, I usually download the image to my local machine for analysis. I don't know of any remote "debug" tools useful in this circumstance. I also have a simple Seaside component to start/stop VNC, but it won't help in this situation.


Best regards,
Dennis Schetinin

On Tuesday, 29 January 2013 at 9:59, sergio t. ruiz wrote:


my site completely died today. i tried logging in with vnc, and it seems just stuck.. i can't do anything to it..

anyone have any ideas? i really need this thing to run consistently ..

here is a screenshot of its current state:


thanks!



Re: my site is completely dead..

Dale Henrichs
In reply to this post by sergio_101
Sergio,

Most of my experience is from working with GemStone, which is a different animal, so take what I say with a grain of salt.

If the running image is completely frozen, then you don't have much choice but to kill it and restart ... hopefully you haven't lost any data ...

If after restart you see the problem again, then you might be able to debug the issue by copying the image to a local machine and bringing it up ... If the problem doesn't reproduce, I'd still be inclined to take a copy of the image and attempt to understand the particular problem.

It's hard to tell from the screenshot what the thread is doing or even which thread it is ... it's not likely that the thread is a Seaside application thread because those are normally forked and will sit around with an open debugger, but not necessarily affect the image itself. So I can't really guess what operation is causing trouble ...

If you're lucky you can reproduce the problem on your local machine ... If you search the Pharo bug list you might find a bug in this area and from that we might be able to figure out which thread is the bad boy and there might even be a fix ..

You mentioned stability...are you seeing this particular problem occur often or are you seeing different issues?

Dale


Re: my site is completely dead..

sebastianconcept@gmail.co
In reply to this post by sergio_101
right now, just restart the image

Nothing runs at 100% uptime, so it's honest to say that it will hang from time to time (depending on load, memory usage and maybe some other parameters)

That doesn't mean you can't work on creating the illusion of full uptime for the users (actually that's exactly what they want you for)

So you want to use some kind of process monitoring system to restart the images when they stop responding to HTTP connections while you aren't looking at them.

You have many options for monitoring; I'm using http://supervisord.org/ for now and it does its job.
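
To make the "restart when it stops answering HTTP" idea concrete, here is a minimal Python sketch of a watchdog loop. It is only an illustration under stated assumptions, not how supervisord works internally: the names `http_probe` and `watch`, the thresholds, and the URL in the usage note are all made up for the example.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if `url` answers at all with a non-5xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, just not with a 2xx/3xx; a 4xx still means the image is alive.
        return err.code < 500
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, broken response, ...
        return False


def watch(probe, restart, max_failures=3, interval=10.0, rounds=None, sleep=time.sleep):
    """Call `probe()` every `interval` seconds; call `restart()` after
    `max_failures` consecutive failed probes. Returns the number of restarts.

    `rounds=None` means run forever; a number limits the loop (handy for testing).
    """
    failures = 0
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0  # give the fresh image a clean slate
        done += 1
        sleep(interval)
    return restarts
```

In real use, `probe` could be `lambda: http_probe("http://localhost:8080/")` (hypothetical URL) and `restart` could shell out to whatever controls the image, for example `supervisorctl restart <name>` with a program name of your choosing. Requiring several consecutive failures avoids restarting the image because of one slow request. Note that a process supervisor on its own typically reacts to a process that exits; a hung image that is still running needs an HTTP-level check like this one.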



Re: my site is completely dead..

EstebanLM
In reply to this post by sergio_101
it depends on many things... as Sebastian said, nothing on earth runs forever, but you can achieve some stability anyway :)

first I would check your Pharo version and VM version, to ensure you are using a stable one.
second, the RFB server (VNC server) is known to hang your image from time to time (some sockets stay alive when they shouldn't, and that consumes your resources until nothing works :( ). Someone should fix that, but it is obviously not so easy, otherwise it would be fixed by now.

what else? ensure your image is being persisted and backed up regularly, and add a monitoring process...

Esteban

ps: it's not that it should hang every day... I have Pier images running that hang once a year or even less... but eventually they will :)


Re: my site is completely dead..

Bob Arning-2
In reply to this post by sergio_101
Are you able to see if there is a SqueakDebug.log file written?

Cheers,
Bob


Re: my site is completely dead..

sergio_101
In reply to this post by Dennis Schetinin
hi, dennis..

yes, the pharo image is running.. i can see the process. i have killed and started it several times, but with no luck..

i am gonna have to bring it local.



Re: my site is completely dead..

sergio_101
In reply to this post by Dale Henrichs
hey, dale.. it seems like lately, i am seeing this problem at least once a week. there were times when i would run problem free for months, but not lately..



Re: my site is completely dead..

sergio_101
i think i need to bring the image local, and see what's going on.. i am moving it to a new server this week anyway..

thanks!



Re: my site is completely dead..

Henrik Sperre Johansen
In reply to this post by sergio_101

On January 29, 2013, at 11:01 AM (GMT+01:00), Esteban Lorenzano <[hidden email]> wrote:

> first I would check your Pharo version and VM version, to ensure you are using a stable one.
> second, the RFB server (VNC server) is known to hang your image from time to time (some sockets stay alive when they shouldn't, and that consumes your resources until nothing works :( ). Someone should fix that, but it is obviously not so easy, otherwise it would be fixed by now.

The issue in RFB was identified and a fix was posted; I don't think any RFB package maintainers picked it up, though.

As for Pharo, the relevant issue to giving better error messages instead of freezing is
but I never got around to actually implementing it :S

Also, a new(ish) VM will happily keep leaking semaphores until it runs out of space instead of freezing, just the way the non-Cog VM used to, if I've understood Igor's latest changes correctly :)

Cheers,
Henry


Re: my site is completely dead..

EstebanLM
ok, I rescued the issue from oblivion, but it will take some time until I can work on it. If someone wants to produce a slice or a changeset (better a slice), I will be happy to review and integrate :) 

Esteban



Re: my site is completely dead..

Gastón Dall' Oglio
In reply to this post by sergio_101
Hi Sergio.

Some weeks ago I had to deal with an image that was working normally while a Seaside app inside it was not responding (until then, whenever an app stopped responding it was always because the image had hung).

I did some forensic analysis on this image :) and saw several zombie forked processes, each actually waiting on a semaphore with a very long timeout. See the screenshot: on the left side I inspected the Semaphore to see the DelayWaitTimeout of a sane forked process, and on the right side the same for a broken one. Also, note that these broken processes don't die if you stop and start the Seaside server adaptor.

To check this in your image, open a Process Browser (and turn on auto-update) and see whether there are several "ZnManagingMultiThreadedServer HTTP worker" processes. If so, terminate some of them and see if the site begins to respond. My app began to respond after I terminated several of them.

I guess that this problem occurred when I saved AND QUIT the image while those forked processes existed.

What made the image respond again was manually killing (terminating) all the "ZnManagingMultiThreadedServer HTTP worker" processes; in the future I will make sure that there are no workers running when I save the image.

I don't know if it is a bug; if we think it is, I can give more data about the context (my image, package versions, OS, ...).

Regards.





Attachment: image1.png (299K)

Re: my site is completely dead..

Sven Van Caekenberghe-2

On 29 Jan 2013, at 19:29, Gastón Dall' Oglio <[hidden email]> wrote:

> To check this in your image, open a Process Browser (and turn on auto-update) and see whether there are several "ZnManagingMultiThreadedServer HTTP worker" processes. If so, terminate some of them and see if the site begins to respond. My app began to respond after I terminated several of them.
>
> I guess that this problem occurred when I saved AND QUIT the image while those forked processes existed.

Some clarifications: ZnManagingMultiThreadedServer has one server process listening for and accepting incoming requests, forking a worker process each time. Such a worker process will loop over HTTP 1.1 request/response cycles until the other end closes or something goes wrong. There is currently no timeout as such but of course the socket connection dies eventually, so that is almost the same thing.

The 'Managing' aspect means that the server keeps track of all open connections or socket streams. When the server is stopped, all the connections will be closed. The idea is that all the worker processes using these connections (a one to one mapping) will eventually get an exception that is then handled by cleaning up and finally stopping.

This last mechanism, where closing a socket stream from another process results in an exception in the process using that connection, does not work identically or equally well on the different platforms (Mac, Windows, Linux) because these have completely different socket implementations in the VM. Saving an image interacts with this in various subtle ways.
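
For readers who want to see the shape of the mechanism described above, here is a small Python analogue. It is a sketch under assumptions, not Zinc's implementation: `ManagingServer` is a made-up name, Python threads stand in for Smalltalk processes, and a plain echo stands in for the HTTP request/response cycle. It uses `shutdown()` to wake blocked workers, since merely closing a socket from another thread is not guaranteed to do so on every platform.

```python
import socket
import threading


class ManagingServer:
    """One acceptor thread, one worker thread per connection, and a registry
    of open connections so that stop() can shut all of them down."""

    def __init__(self, host="127.0.0.1", port=0):
        self._listener = socket.create_server((host, port))
        self._listener.settimeout(0.2)  # lets the accept loop notice stop()
        self.port = self._listener.getsockname()[1]
        self._stopping = threading.Event()
        self._lock = threading.Lock()
        self._connections = set()  # the "managing" part: every open socket
        self._workers = []
        self._acceptor = threading.Thread(target=self._accept_loop, daemon=True)

    def start(self):
        self._acceptor.start()

    def _accept_loop(self):
        while not self._stopping.is_set():
            try:
                conn, _addr = self._listener.accept()
            except socket.timeout:
                continue  # no client yet; re-check the stop flag
            except OSError:
                return
            conn.settimeout(None)  # workers block until data or shutdown
            worker = threading.Thread(target=self._serve, args=(conn,), daemon=True)
            with self._lock:
                self._connections.add(conn)
                self._workers.append(worker)
            worker.start()

    def _serve(self, conn):
        try:
            while True:  # loop over request/response cycles
                data = conn.recv(4096)
                if not data:  # peer closed, or stop() shut the socket down
                    break
                conn.sendall(data)  # echo stands in for an HTTP response
        except OSError:
            pass  # connection broke; fall through to cleanup
        finally:
            with self._lock:
                self._connections.discard(conn)
            conn.close()

    def stop(self, timeout=2.0):
        """Shut down every tracked connection; return workers that did not exit."""
        self._stopping.set()
        self._acceptor.join(timeout)
        self._listener.close()
        with self._lock:
            connections = list(self._connections)
            workers = list(self._workers)
        for conn in connections:
            try:
                conn.shutdown(socket.SHUT_RDWR)  # wakes a worker blocked in recv()
            except OSError:
                pass  # already closed by its worker
        for worker in workers:
            worker.join(timeout)
        return [w for w in workers if w.is_alive()]
```

The list returned by `stop()` corresponds to the "zombie" workers discussed earlier in the thread: workers that never noticed their connection going away. In this sketch it should normally be empty.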

On my main development platform, Mac, I see no problems. In my production deploys on Linux things are fine too. But I do various things to minimise problems:

- my images hold no 'running' server(s), these are always created and started freshly using a startup script
- I never save images after that
- all the images are controlled by init.d scripts to start automatically with the machine
- all my images are controlled by monit so that they restart automatically when they stop working
- most of the time, I have multiple images under a load balancer, stateful or stateless, to improve availability and capacity
- the load balancer also functions as a sanitizer and controller of incoming requests protecting the images
- the load balancer can handle static resources directly, off loading work from the images

http://zn.stfx.eu/zn/index.html#livedemo
http://stfx.eu/pharo-server/
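
As a rough illustration of the "multiple images under a load balancer" point in the list above, here is a tiny Python sketch of round-robin selection that skips images marked as down. Everything in it is hypothetical (the class name, the backend labels); a real deployment would use an existing load balancer, and a stateful Seaside application would also need session affinity, which this sketch ignores.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in turn, skipping the ones currently marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        """Return the next healthy backend, or None if all of them are down."""
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        return None
```

A monitoring check would call `mark_down` when an image stops answering and `mark_up` once it has been restarted, so users keep being served by the remaining images in the meantime.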

Yes, like any computer program, a Smalltalk VM+image combination has limits: there is some maximum number of processes and connections that can be running and open at the same time, and there are general memory limits. I am pretty sure that with a setup like the one I described above, production systems handling hundreds to thousands of requests per second are possible.

Sven


--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill


Re: my site is completely dead..

fstephany


On 30/01/13 09:33, Sven Van Caekenberghe wrote:
> - most of the time, I have multiple images under a load balancer, statefull or stateless, to improve availability and capacity

Stateful = session affinity?

Do you start images on demand depending on the load, or do you keep X images
running all the time?

Re: my site is completely dead..

Sven Van Caekenberghe-2

On 30 Jan 2013, at 10:06, Francois Stephany <[hidden email]> wrote:

> On 30/01/13 09:33, Sven Van Caekenberghe wrote:
>> - most of the time, I have multiple images under a load balancer, statefull or stateless, to improve availability and capacity
>
> Stateful = session affinity?

Yes. With Seaside you have to; with pure Zinc you can, but it is not the default.

If you reload http://zn.stfx.eu/status a couple of times, you will see the port changing (which means another member of the cluster is handling the request). If you reload http://zn.stfx.eu/session you will stay on the same cluster member (notice the session ID ending with a Route ID).
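As an illustration of session affinity by route ID, here is a small Python sketch of how a balancer might pick a cluster member. The session ID format (`<id>.<route>`), the class name `StickyBalancer` and the backend addresses are hypothetical and are not taken from the actual setup behind zn.stfx.eu.

```python
import itertools


class StickyBalancer:
    """Round robin for new visitors, sticky routing for known sessions."""

    def __init__(self, backends):
        # backends: mapping of route id -> backend address
        self._backends = dict(backends)
        self._round_robin = itertools.cycle(sorted(self._backends))

    def pick(self, session_id=None):
        """Return (route_id, backend) for a request."""
        if session_id and "." in session_id:
            # Assumed format: the route id is the suffix after the last dot.
            route = session_id.rsplit(".", 1)[1]
            if route in self._backends:
                return route, self._backends[route]
        # No session, or its route is unknown: fall back to round robin.
        route = next(self._round_robin)
        return route, self._backends[route]
```

Requests without a session rotate over the members, like the changing port on the status page, while a request carrying a session ID keeps landing on the member named by its route suffix.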

> Do you start images on-demand depending of the load or you let X images running all the time?

Right now, I do it statically: I just run 4 of them all the time.

One vm+image uses maybe 100 MB. Current servers come with multiple GBs of RAM, which you can use for running multiple images, a db, memcached instances or other stuff.

Sven

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill


Re: my site is completely dead..

Gastón Dall' Oglio
In reply to this post by Sven Van Caekenberghe-2
Hi Sven.

Thank you for the clarifications and recommendations, they are really useful. On my servers I already do some of the things you list to minimize problems, but I pay special attention to saving the image in a "good state" every time I update a package in it, because otherwise it does not matter how many bad images you put in a cluster :)

Regards.



Re: my site is completely dead..

fstephany
In reply to this post by Sven Van Caekenberghe-2


On 30/01/13 10:15, Sven Van Caekenberghe wrote:
>
> On 30 Jan 2013, at 10:06, Francois Stephany <[hidden email]> wrote:
>
>> On 30/01/13 09:33, Sven Van Caekenberghe wrote:
>>> - most of the time, I have multiple images under a load balancer, statefull or stateless, to improve availability and capacity
>>
>> Stateful = session affinity?
>
> Yes. With Seaside you have to, with pure Zinc you can but it is not the default.

Yep, I guess that's another issue for Cloud deployment à la Heroku...

> If you reload http://zn.stfx.eu/status a couple of times, you will see the port changing (which means another member of the cluster is handling the request). If you reload http://zn.stfx.eu/session you will stay on the same cluster member (notice the session ID ending with a Route ID).
>
>> Do you start images on-demand depending of the load or you let X images running all the time?
>
> Right now, I do it statically: I just run 4 of them all the time.
>
> One vm+image uses maybe 100 Mb, with current servers you get multiple GBs of RAM, that you can use for running multiple images, a db, memcached instances or other stuff.

Thanks!
I know it's hard to answer without context, but how much load can
one image handle? I'm just looking for an order of magnitude...

Re: my site is completely dead..

Sven Van Caekenberghe-2

On 30 Jan 2013, at 13:54, Francois Stephany <[hidden email]> wrote:

> Thanks !
> I know it's not really relevant without context but how much load can one image handle? It's just to have an order of magnitude…

You would have to define 'load' and even then, it really depends too much on the application (architecture) and/or frameworks used. The only way to find out is to try ;-)

Sven