Apache Load Balancing with Seaside

jtuchel
Hi there,

we're finally at a stage where load balancing is not only needed
but also needs to be a bit more fine-grained.

So far, we've used Apache's mod_proxy_balancer with the byrequests
balancing scheme and a simple Set-Cookie for stickiness, but this does
not really distribute load evenly, because it simply does a round
robin based on requests. The way I understand it, this just
sends every nth request to another backend, unless the sticky session
cookie is there.

This seems to have multiple problems in the context of Seaside:

1. Each page render consists of dozens of requests, and at least 2 in
the case of the usual redirect/render cycle

2. So a request != a new Seaside session and is therefore not a good
basis for distributing load

3. It seems the cookie used as the stickiness marker is browser-wide.
Every new session started in the same browser gets directed to the same
Seaside image

So this starts to cause problems in our scenario because there is always
one image (the first one in the httpd.conf) that handles most of the
load, no matter how many images we add.

I think I understand that URL encoding for the load balancer would fit
much better for Seaside. What's even better is the fact that Seaside
already uses the _s parameter to identify a session.

So at first sight, I think it might be a good idea to use the _s
parameter provided by Seaside as the URL parameter that is also used for
session stickiness.

This brings up a few questions:

* Is this a good idea?
* Has anybody tried?
* How to configure this?

The most important question, however, is this: would this be any better
wrt workload distribution? I mean, how could we make sure the very first
request gets redirected to a new image? My fear is that in the end this
will still have the same problem: the initial sessions will still be
mostly created by the same image all of the time. The only difference
might finally just be the fact that we use URL parameters instead of
Cookies and spend nights testing and stuff and end up with the same
problems...

Before you suggest using squid or the like, let's think about the basic
problem: does squid do things differently? What mechanism does it have
that is so much better than counting requests, measuring bytes
transferred, or the like?

So how do people solve this problem? Any ideas, hints, experiences are
greatly appreciated.



Joachim





--
-----------------------------------------------------------------------
Objektfabrik Joachim Tuchel          mailto:[hidden email]
Fliederweg 1                         http://www.objektfabrik.de
D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1

_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: Apache Load Balancing with Seaside

Philippe Marschall
On Sat, Nov 12, 2016 at 10:01 AM, [hidden email]
<[hidden email]> wrote:

> So how do people solve this problem? Any ideas, hints, experiences are
> greatly appreciated

Unless you run on GemStone/S you need sticky sessions because there is
generally a lot of session-related state required to service requests.
That state is image local.

Making mod_proxy_balancer do sticky sessions is fairly trivial: simply
add a route to the session id. Have a look at the
WARouteHandlerTrackingStrategy hierarchy in the Seaside-Cluster [1]
package. Then you'll just need to make sure that requests without a
session are evenly distributed across images.

There are still issues left. First, you can't dynamically add and
remove images. Second, when one image has more load than another for
whatever reason, you can't tell mod_proxy_balancer to favour the less
strained images for new sessions. mod_cluster allows you to do this,
but I have only an incomplete and buggy implementation.

 [1] http://www.squeaksource.com/ajp.html
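
To make "add a route to the session id" a bit more concrete, here is a toy sketch in Python. It is not the Seaside-Cluster code: the function names, the route names and the "key.route" format are assumptions for illustration, modelled on the dot-separated suffix that the mod_proxy_balancer documentation describes for its stickysession parameter.

```python
from typing import Callable, Dict, Optional


def tag_session_key(session_key: str, route: str) -> str:
    """Image side: append the route of the image that owns the session."""
    return f"{session_key}.{route}"


def extract_route(sticky_value: Optional[str]) -> Optional[str]:
    """Balancer side: return the part after the first dot, if there is one."""
    if not sticky_value or "." not in sticky_value:
        return None
    route = sticky_value.split(".", 1)[1]
    return route or None


def choose_backend(
    sticky_value: Optional[str],
    backends: Dict[str, str],
    pick_for_new_session: Callable[[], str],
) -> str:
    """Known route: go back to that image. Otherwise let the balancer decide."""
    route = extract_route(sticky_value)
    if route in backends:
        return backends[route]
    return pick_for_new_session()


if __name__ == "__main__":
    # Hypothetical route names and addresses.
    backends = {"img1": "127.0.0.1:9090", "img2": "127.0.0.1:9091"}
    key = tag_session_key("abc123", "img2")
    print(key, "->", choose_backend(key, backends, lambda: backends["img1"]))
```

Requests that carry no route (the very first request of a session) still fall through to whatever the balancer does for new traffic, which is why the even distribution of session-less requests matters.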

Cheers
Philippe

Re: Apache Load Balancing with Seaside

Sven Van Caekenberghe-2
Hi Joachim,

Like Philippe says, you have to send all requests of one session to the same image.

What I currently do is the following in nginx:

upstream my-seaside {
  ip_hash;
  server my-seaside-2:9090;
  server my-seaside-2:9091;
  server my-seaside-3:9090;
  server my-seaside-3:9091;
}

server {
  listen 443 ssl;
  listen 80;
  server_name ... ;

  ssl_certificate ...;
  ssl_certificate_key ...;

  location /files {
    alias /home/ubuntu/static-files/;
    try_files $uri @seaside;
    gzip on;
    gzip_types application/x-javascript text/css;
    expires 30d;
  }

  location / {
    proxy_pass http://my-seaside;
    add_header X-Server Pharo;
  }

  location @seaside {
    proxy_pass http://my-seaside;
    add_header X-Server Pharo;
  }
}

This does 2 things:

(1) it load balances (upstream) over 4 images (2 on each of 2 machines), using ip_hash - this needs no further configuration; the client's IP is hashed and all requests from that client go to the same image (sticky); this is not optimal (there is some spreading, but it is not load based), but it offers good availability (failover in case a machine or image goes down).

(2) it tries to resolve all of Seaside's /files resources directly from the file system, if possible - this offloads those requests from Seaside, making it do less work. [ see #deployFiles ]
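
As a rough illustration of point (1), here is a toy Python model of IP-based stickiness. It is only a sketch: the hash function is a stand-in and not the one nginx uses, and failover is not modelled. Keying on the first three octets of an IPv4 address follows what the nginx documentation says about ip_hash.

```python
import hashlib
from typing import List

# Same upstream members as in the config above.
BACKENDS: List[str] = [
    "my-seaside-2:9090",
    "my-seaside-2:9091",
    "my-seaside-3:9090",
    "my-seaside-3:9091",
]


def pick_backend(client_ip: str, backends: List[str] = BACKENDS) -> str:
    """Map a client IPv4 address to one backend, always the same one."""
    # Use the first three octets as the hashing key.
    key = ".".join(client_ip.split(".")[:3])
    # Stand-in hash (SHA-256); any stable hash shows the same behaviour.
    digest = hashlib.sha256(key.encode("ascii")).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


if __name__ == "__main__":
    for ip in ("203.0.113.7", "203.0.113.200", "198.51.100.23"):
        print(ip, "->", pick_backend(ip))
```

The choice depends only on the client address, never on how busy an image is, which is why this spreads clients but is not load based.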

This seems to work well in practice.

Sven

Re: Apache Load Balancing with Seaside

jtuchel
In reply to this post by Philippe Marschall
On 12 Nov 2016 at 11:14, Philippe Marschall wrote:

Philippe,
>
> Unless you run on GemStone/S you need sticky sessions because there is
> generally a lot of session-related state required to service requests.
> That state is image local.
> Making mod_proxy_balancer do sticky sessions is fairly trivial, simply
> add a route to the session id. Have a look at the

That's exactly what we do, sticky sessions encoded in a Cookie, all
handled by Apache. It's surprisingly easy to configure.

However, we have the impression that the byrequests strategy favors the
first server and puts most load onto it, because byrequests is not the
same as "by new sessions". Of course, bytraffic is not really much
better, because number of bytes transferred is not necessarily related
to the workload needed to produce them.

So my question was probably just phrased far too cryptically ;-)

I was wondering if people found a scheme that works better in a Seaside
scenario without writing their own Apache module (we're way too busy to
learn how to do that right now). For a moment I thought that some of the
quite cryptic sentences in the Apache docs imply that using URL encoding
instead of cookies for carrying the session marker in the
requests/responses might work better. And what I thought the docs said
(between the lines, obviously, because in the lines they say nothing
but "usually this is done by the backend") is that the Seaside
part would therefore have to provide some URL parameter that can be used
by Apache. And instead of implementing something on my own, I thought the
_s parameter would probably be the cheapest possible solution.

I guess the WARouteHandlerTrackingStrategy you mention is exactly that.

OTOH, thinking about this a little more, there is almost no chance Apache
could do much better with URL parameters assigned by the backend as long
as the strategy for deciding where to send an initial request is as good
or bad as it is now. No matter how well a backend supports marking a
session as its own, as long as Apache decides to send two thirds of the
requests to the same image over and over again, this won't change much.

So this all possibly boils down to this question:

Has anybody found a way to teach Apache to do the balancing based on
previously unassigned sessions rather than just requests (remember, each
request for a js or css file counts in the byrequests strategy)?
I guess a simple round robin for each not yet attached session would be
almost infinitely better than what we see right now.
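
To spell out what I mean by "round robin for each not yet attached session", here is a toy model in Python. It is not an Apache configuration, all names are made up for illustration, and it assumes the balancer can see a session identifier (such as the value of _s) on every request, which is not true for the very first request of a real session.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


class SessionRoundRobin:
    """Rotate backends per new session instead of per request."""

    def __init__(self, backends: List[str]) -> None:
        self._rotation: Iterator[str] = cycle(backends)
        self._owner: Dict[str, str] = {}

    def route(self, session_id: Optional[str] = None) -> str:
        # A session we have seen before stays on its image.
        if session_id is not None and session_id in self._owner:
            return self._owner[session_id]
        # Anything else advances the rotation by exactly one step.
        backend = next(self._rotation)
        if session_id is not None:
            self._owner[session_id] = backend
        return backend


if __name__ == "__main__":
    balancer = SessionRoundRobin(["img1", "img2", "img3"])
    # Session "a" sends many requests, but only its first one moves the rotation.
    for sid in ["a", "a", "a", "b", "a", "b", "c", "c", "d"]:
        print(sid, "->", balancer.route(sid))
```

In this model the number of requests a session produces no longer influences where the next new session lands; only the number of sessions does.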






> WARouteHandlerTrackingStrategy hierarchy in the Seaside-Cluster [1]
> package. Then you'll just need to make sure that requests without a
> session are evenly distributed across images.

Exactly. That is the key.

> There are still issues left. First you can't dynamically add and
> remove images.
I haven't done any further research, but the Drain mode in
mod_proxy_balancer looks very promising, as well as the hot standby
mode. Not sure how I would change these modes programmatically though.


> Second when an image has more load than another for
> whatever reason you can't tell mod_proxy_balancer to favour the less
> strained images for new sessions.
Yeah, that would be great. Hardware is cheap, though, so I am quite sure
we could live with it for quite a while. In my naive amateur's mind,
this would probably be better handled in "real" load balancers anyways
and will be something to look into once we've taken this step and run
into problems with horizontal scaling in Apache. Seems like it won't be
any time soon.

> mod_cluster allows you to do this,
> but I have only an incomplete and buggy implementation.
>
>   [1] http://www.squeaksource.com/ajp.html

Thanks,


Joachim

Re: Apache Load Balancing with Seaside

jtuchel
In reply to this post by Sven Van Caekenberghe-2
Sven,

looks very similar to what we do in Apache, including the offloading of
/files requests from the images.

I like the hashing strategy (based on the client's ip) nginx seems to
offer. I don't remember reading about such a strategy for Apache.
Seems like we'll have to live with the imperfection of the current
solution... Maybe we'll experiment a little with the bytraffic strategy
to see if it results in better balancing (by accident).

Thanks for sharing.

Joachim


On 12 Nov 2016 at 13:29, Sven Van Caekenberghe wrote:

> Hi Joachim,
>
> Like Philippe says, you have to send all requests of one session to the same image.