Hi

To heighten everybody's mood I'll post some positive news.

After some optimizations in both Seaside and AJP I managed to break 8000 requests/sec with a single Pharo 1.3 image. Thanks to SystemProfiler I knew where to look.

This is with a single request handler that just returns a two-byte response. It doesn't involve any rendering, sessions, continuations or anything like that, but it exercises the full Seaside request-handling machinery, with a request context and everything.

I'm using WASmallRequestHandler from the Seaside-Benchmark package.

WASmallRequestHandler >> #handleFiltered: aRequestContext
    aRequestContext respond: [ :response |
        response
            binary;
            contentType: WAMimeType textHtml;
            nextPutAll: 'OK' asByteArray ]

Apache 2.2.21, mpm_worker, mod_proxy_ajp
CPU: Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
SmalltalkImage current vmVersion => 'Croquet Closure Cog VM [CoInterpreter VMMaker.oscog-eem.138]'

Attached you'll find the output of ApacheBench.

Cheers
Philippe
Philippe,
On 25 Feb 2012, at 14:35, Philippe Marschall wrote:
> After some optimizations in both Seaside and AJP I managed to break 8000 requests / sec with a single Pharo 1.3 image. [...]
> Attached you'll find the output of ApacheBench.
> <8k.txt>

Very nice, indeed.

What would you get with your setup when you increase the work a bit, from your lower-limit 2-byte response to something like this page?

http://zn.stfx.eu/dw-bench (dynamically generated by Zn)
http://caretaker.wolf359.be:8080/DW-Bench (dynamically generated by Seaside)
http://stfx.eu/static.html (static reference served by Apache)

The response should be about 8 KB.

Sven
On Feb 25, 2012, at 2:35 PM, Philippe Marschall wrote:
> Hi
>
> To heighten everybody's mood I'll post some positive news.

Ok, now I understand why the earth stopped spinning today :)

Thanks for the mail.

Stef
On 25.02.2012 16:02, Sven Van Caekenberghe wrote:
> Very nice, indeed.
>
> What would you get with your setup when you increase the work a bit, from your lower-limit 2-byte response to something like this page?
>
> http://zn.stfx.eu/dw-bench (dynamically generated by Zn)
> http://caretaker.wolf359.be:8080/DW-Bench (dynamically generated by Seaside)
> http://stfx.eu/static.html (static reference by Apache)

Drops to about 6.5k requests/sec, but throughput goes up to about 50 MB/sec. That's with a statically allocated byte array, no rendering.

It's still doing too much copying, so there is room left to improve.

Cheers
Philippe
Philippe,

That is incredibly fast. I just tried, and I can't even get plain Apache2 to serve static.html that fast over the local network!

When I have more time, I really have to try to repeat your results with your code (as well as study the code ;-)

Can you please provide the main pointers again? I remember you once explained how to set up AJP somewhere...

Sven

On 25 Feb 2012, at 17:31, Philippe Marschall wrote:
> Drops to about 6.5 k but throughput goes up to about 50 Mbytes/sec. That's with a statically allocated byte array, no rendering.
>
> It's still doing too much copying so there is space left to go.
>
> <dw-bench.txt>
On 25.02.2012 17:47, Sven Van Caekenberghe wrote:
> That is incredibly fast, I just tried and I can't even get plain apache2 to serve static.html that fast over the local network!

It's not over the network, it's on the local machine. Keep-alive makes a big difference.

> When I have more time, I really have to try to repeat your results with your code (as well as study the code ;-)
>
> Can you please provide the main pointers again?

- Get the image from [1].
- In Apache 2.2 you set up AJP the way you set up an HTTP reverse proxy; the protocol is just ajp:// instead of http://.
- Load Seaside-Benchmark from [2].
- See the class side of WASmallRequestHandler (2-byte response), WAFastRequestHandler (16k response, the seaside.st homepage) and WADwBenchHandler (dw-bench) for how to register the request handlers.

[1] http://jenkins.lukas-renggli.ch/job/Seaside%203.1/lastSuccessfulBuild/artifact/seaside31-ajp.zip
[2] http://www.squeaksource.com/Seaside31Addons

Cheers
Philippe
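[Editorial note: for anyone repeating the setup, the Apache 2.2 side of the second step might look roughly like this. This is a sketch only: the module paths, port 8009 (the conventional AJP port) and the /seaside mount point are assumptions, not details given in this thread.]

```apache
# Load mod_proxy and its AJP backend (module paths vary by distribution).
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

# Exactly like an HTTP reverse proxy, but with the ajp:// scheme.
ProxyPass /seaside ajp://127.0.0.1:8009/seaside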
On 25 Feb 2012, at 19:21, Philippe Marschall wrote:
> It's not over the network, it's on the local machine. Keep-alive makes a big difference.

Yes, I meant 127.0.0.1. And yes, keep-alive is necessary.

> - get the image from [1]
> - in Apache 2.2 you set up AJP the way you set up an HTTP reverse proxy, the protocol is just ajp:// instead of http://
> - load Seaside-Benchmark from [2]
> - see the class side of WASmallRequestHandler, WAFastRequestHandler and WADwBenchHandler to register the request handlers

OK, thanks!

Sven
On 02/25/2012 05:47 PM, Sven Van Caekenberghe wrote:
> That is incredibly fast, I just tried and I can't even get plain apache2 to serve static.html that fast over the local network!
>
> When I have more time, I really have to try to repeat your results with your code (as well as study the code ;-)

The biggest thing is probably the recycling of the response buffers: each worker thread has a response buffer that is reused. I found that request handling is very sensitive to allocation; there is a direct correlation between removing allocation and handling more requests at higher throughput. The more allocation you can remove the better, especially things like Stream >> #contents.

There is also some code for buffers that can efficiently work on both ByteArray and ByteString.

Cheers
Philippe
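[Editorial note: to make the recycling idea concrete, here is a minimal sketch in Smalltalk. The class and selectors are invented for illustration and are not the actual AJP code. The point is that each worker keeps one preallocated ByteArray and an index, resetting the index between requests instead of growing a WriteStream and paying for a fresh copy on every #contents send.]

```smalltalk
"Sketch only; invented names, not the real Seaside/AJP classes."
Object subclass: #RecycledBuffer
    instanceVariableNames: 'bytes position'
    classVariableNames: ''
    category: 'Sketch'

RecycledBuffer >> initialize
    bytes := ByteArray new: 8192.
    position := 0

RecycledBuffer >> reset
    "Reuse the same ByteArray; no allocation between requests."
    position := 0

RecycledBuffer >> nextPutAll: aByteArray
    | end |
    end := position + aByteArray size.
    bytes replaceFrom: position + 1 to: end with: aByteArray startingAt: 1.
    position := end

RecycledBuffer >> withContentsDo: aBlock
    "Hand out the filled prefix without copying, unlike Stream >> #contents."
    aBlock value: bytes value: position
```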
Hi Philippe!
Nice to see your AJP work giving results! I think Nginx has a module for AJP; it would be interesting to see if that makes a difference. :)

Are you using stock SocketStream internally, or something even more bare-bones?

regards, Göran
On 02/27/2012 10:11 AM, Göran Krampe wrote:
> Nice to see your AJP work giving results! I think Nginx has a module for AJP; it would be interesting to see if that makes a difference. :)

I don't see how this would help when the Pharo image is at 100% CPU. I don't see how event-driven IO is supposed to help for a few high-throughput connections.

> Are you using stock SocketStream internally, or something even more bare-bones?

No, I built my own buffer and go straight to Socket. AJP is packet-oriented with 8k packets, so this is easy.

Cheers
Philippe
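[Editorial note: the packet orientation Philippe mentions is what makes a fixed buffer workable. In AJP13, every packet from the web server to the container starts with the magic bytes 16r12 16r34 followed by a two-byte big-endian payload length, and payloads top out around 8K. A hedged sketch of decoding that header follows; the variable names are invented and this is not the package's actual code.]

```smalltalk
"header is a 4-byte ByteArray already read from the raw Socket."
((header at: 1) = 16r12 and: [ (header at: 2) = 16r34 ])
    ifFalse: [ self error: 'not an AJP packet' ].
payloadLength := ((header at: 3) bitShift: 8) + (header at: 4).
"payloadLength never exceeds the 8k packet size, so a single reused
 8 KB ByteArray can receive every packet body."
```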
On 02/27/2012 10:27 AM, Philippe Marschall wrote:
> I don't see how this should help when the Pharo image is at 100% CPU. I don't see how event driven IO is supposed to help for few, high throughput connections.

I agree that it sounds like it wouldn't help, but I'm still curious. I got the feeling when I messed around with SCGI that it can also matter "how" the frontend works against the backend.

regards, Göran
Hi guys,
Philippe Marschall wrote:
> Göran Krampe wrote:
>> Are you using stock SocketStream internally or anything even more bare bone?
>
> No, I built my own buffer and go straight to Socket. AJP is packet-oriented with 8k packets, so this is easy.

Reuse of the same buffer (the same ByteArray) on a raw socket is also the technique used in Swazoo, and the results are similar. I'm preparing a similar benchmark, including a comparison with VW, so that we can see how Pharo is progressing in the networking field and in general.

Best regards
Janko

--
Janko Mivšek
Aida/Web Smalltalk Web Application Server
http://www.aidaweb.si
On 02/27/2012 12:53 PM, Janko Mivšek wrote:
> Reuse of the same buffer (the same ByteArray) on a raw socket is also the technique used in Swazoo, and the results are similar. I'm preparing a similar benchmark, including a comparison with VW, so that we can see how Pharo is progressing in the networking field and in general.

One thing I noted when testing Swazoo is that it doesn't support keep-alive with HTTP 1.0. Unfortunately, ApacheBench uses exactly that.

Cheers
Philippe
Philippe Marschall wrote:
> One thing I noted when testing Swazoo is that it doesn't support keep-alive with HTTP 1.0. Unfortunately, ApacheBench uses exactly that.

Correcting this was easy. A Swazoo patch to allow keep-alive (ab -k) over HTTP 1.0:

HTTPConnection >> getAndDispatchMessages
    ...
    (self task request isHttp10
        and: [ self task request isKeepAlive not ])
            ifTrue: [ self close ].
    ...

Best regards
Janko

--
Janko Mivšek
Aida/Web Smalltalk Web Application Server
http://www.aidaweb.si
> Reuse of the same buffer (the same ByteArray) on a raw socket is also the technique used in Swazoo, and the results are similar. I'm preparing a similar benchmark, including a comparison with VW, so that we can see how Pharo is progressing in the networking field and in general.

Let us know, because this is nice to know. I also want to see the slope of progress :)

Stef