
On 20 November 2013 15:16, Tim Connors <tconnors@rather.puzzling.org> wrote:
When we have these overloads, nothing else we measure seems to be approaching any limit. The servers have plenty of CPU left, and there's no real difficulty logging into them. Anything else I should be looking at? Fork rate is tiny (1 or 2 per second). Network bandwidth is fine. Not sure that I've noticed network packet limitations (4k packets per second per host when it failed last time, generating 16000 interrupts/second total per host).
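A rough way to put per-second numbers on those counters over time is to sample /proc/stat on each host. The sketch below is illustrative only (the 5-second interval is arbitrary); it prints system-wide context-switch, interrupt and fork rates:

#!/usr/bin/env python3
# Rough sketch only: sample the monotonic counters in /proc/stat (Linux)
# and print system-wide context-switch, interrupt and fork rates per second.
import time

FIELDS = {"ctxt": "ctxsw/s", "intr": "intr/s", "processes": "forks/s"}

def read_counters():
    counters = {}
    with open("/proc/stat") as f:
        for line in f:
            parts = line.split()
            if parts[0] in FIELDS:
                # For "intr" the first number is the total across all IRQs.
                counters[parts[0]] = int(parts[1])
    return counters

def main(interval=5):
    prev = read_counters()
    while True:
        time.sleep(interval)
        cur = read_counters()
        rates = ["%s=%.0f" % (FIELDS[k], (cur[k] - prev[k]) / interval)
                 for k in sorted(FIELDS)]
        print("  ".join(rates))
        prev = cur

if __name__ == "__main__":
    main()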
What is going wrong in the "overload"?
Something hits a tipping point: the number of apache worker slots (3000-6000 depending on hardware specs) rapidly fills up, then apache stops accepting new connections and www.bom.gov.au goes dark (since this happens on all machines in the load-balanced cluster simultaneously).
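One rough way to see that tipping point coming, rather than just the aftermath, is to poll mod_status's machine-readable output and alert as BusyWorkers approaches the configured slot limit. A minimal sketch, assuming mod_status is enabled at the default /server-status URL; the host, limit and threshold below are placeholders:

#!/usr/bin/env python3
# Minimal sketch: poll Apache mod_status (?auto output) and warn as the
# busy-worker count approaches the configured slot limit.
# STATUS_URL and SLOT_LIMIT are placeholders, not real values.
import time
import urllib.request

STATUS_URL = "http://localhost/server-status?auto"  # assumes mod_status is enabled here
SLOT_LIMIT = 3000                                   # e.g. MaxClients / MaxRequestWorkers

def busy_workers():
    with urllib.request.urlopen(STATUS_URL, timeout=5) as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith("BusyWorkers:"):
                return int(line.split(":", 1)[1])
    return None

while True:
    busy = busy_workers()
    if busy is not None and busy > 0.8 * SLOT_LIMIT:
        print("WARNING: %d of %d worker slots busy" % (busy, SLOT_LIMIT))
    time.sleep(10)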
Ah, you are probably already well aware of this, so stop me if so.

In my experience there's definitely an upper bound to the number of web-serving worker threads you can run on a machine, beyond which aggregate performance starts to drop rather than gain. Three to six thousand slots sounds like a lot for one machine, to me.* Why so many? Are you not running a reverse-proxy accelerator in front of Apache (e.g. Varnish, or some configurations of nginx)?

If you were serving only static content, I'd go with something lighter-weight than Apache; and if you're serving dynamic content (i.e. the PHP you mention), then I'd definitely not do so without a good reverse proxy in front of it and a much-reduced number of Apache threads. There's a rough sketch of what I mean below.

Sorry this doesn't really help with the context-switching question, but maybe it helps with the overall issue.

-Toby

* But I'm a bit out of date; current-spec hardware is quite a bit more powerful than it was the last time I was seriously working with high-thread-count code.
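For concreteness, here is a minimal sketch of the kind of front-end being suggested: nginx listening on port 80, caching what it can and passing the rest to Apache moved to a local high port with a much smaller worker pool. The paths, ports and cache times are placeholders rather than tuned values, and the fragment sits inside nginx's http{} block:

# Minimal sketch only: nginx as a caching front-end on :80, Apache moved
# to 127.0.0.1:8080 with far fewer workers.  All values are placeholders.
proxy_cache_path /var/cache/nginx keys_zone=pagecache:100m inactive=10m;

upstream apache_backend {
    server 127.0.0.1:8080;
    keepalive 32;
}

server {
    listen 80;

    location / {
        proxy_pass http://apache_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache pagecache;
        proxy_cache_valid 200 301 1m;
    }
}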