
On Wed, 20 Nov 2013, Russell Coker wrote:
On Wed, 20 Nov 2013, Tim Connors <tconnors@rather.puzzling.org> wrote:
Does anyone know what the maximum number of context switches per core you can expect on Xeon-level hardware?
I'm trying to claim we get overloaded when we reach a little less than 10,000 cswch/s, but we've lost all the historical data.
Indeed, is there going to be a maximum for a given piece of hardware? For example: a maximum number of interrupts that can be generated per second; time spent in the interrupt handler that all has to be handled by only one CPU (which would explain why CPU system usage never looks alarming: divide by 8 on some servers, by 16 on others); or a big kernel lock somewhere in the context switch code?
When we have these overloads, nothing else we measure seems to be approaching any limit. The servers have plenty of CPU left, and there's no real difficulty logging into them. Anything else I should be looking at? Fork rate is tiny (1 or 2 per second). Network bandwidth is fine. Not sure that I've noticed network packet limitations (4k packets per second per host when it failed last time, generating 16000 interrupts/second total per host).
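With the historical data gone, one way to start recording the system-wide context switch rate again is to sample the cumulative `ctxt` counter in /proc/stat at a fixed interval. The following is only a minimal sketch under that assumption; the function names (`parse_ctxt`, `switches_per_second`, `sample`) are made up for illustration, and the figure it yields is for the whole host, so it still has to be divided by the core count to compare against a per-core number.

```python
import os
import time


def parse_ctxt(stat_text):
    """Return the cumulative context switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, seconds):
    """Convert two cumulative counter readings into a rate."""
    return (after - before) / seconds


def sample(interval=1.0, path="/proc/stat"):
    """Read the counter twice, `interval` seconds apart, and return the rate."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, interval)


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    # Host-wide rate; divide by the number of cores for a per-core figure.
    print(f"{sample():.0f} context switches/s (all cores)")
```

Logging this once a minute alongside the Apache worker count would show whether the rate really flattens out just before the worker slots fill up.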
What is going wrong in the "overload"?
Something hits a tipping point, the number of Apache worker slots (3000-6000 depending on hardware specs) rapidly fills up, then Apache stops accepting new connections and www.bom.gov.au goes dark (since this happens on all machines in the load-balanced cluster simultaneously). Whoops!
Why not just write a context switch benchmark? It should be simple to have 50+ pairs of processes and, for each pair, have them send a byte to a pipe and then wait to receive a byte from another pipe.
http://manpages.ubuntu.com/manpages/hardy/lat_ctx.8.html
From a quick Google search it seems that my above idea has already been implemented.
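For reference, the pipe ping-pong idea described above can be sketched in a few lines. This is only an illustration, not the lat_ctx implementation: it runs a single pair of processes rather than 50+, the function name `pingpong` is made up here, and Python's interpreter overhead inflates the numbers, so treat the result as a rough upper bound on the cost of a hand-off rather than a hardware limit.

```python
import os
import time


def pingpong(round_trips):
    """Bounce one byte between a parent and a forked child over two pipes.

    Returns the elapsed wall-clock seconds for `round_trips` round trips.
    """
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo every byte straight back, then exit without cleanup.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(round_trips):
                os.write(c2p_w, os.read(p2c_r, 1))
        finally:
            os._exit(0)

    # Parent: send a byte, block until the echo arrives, repeat.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(round_trips):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start

    os.close(p2c_w)
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return elapsed


if __name__ == "__main__":
    n = 20_000
    secs = pingpong(n)
    # Each round trip blocks both processes once, so count two hand-offs.
    print(f"{n} round trips in {secs:.3f}s, ~{2 * n / secs:,.0f} hand-offs/s")
```

On a multi-core machine the two processes may land on different cores, in which case this measures wake-up latency rather than a switch on one core. Pinning both to the same core (for example with taskset) gets closer to what is being asked about.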
http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html
https://github.com/tsuna/contextswitch
The above looks interesting too. A Google search for the words context, switch, and benchmark will find you other things as well.
Believe me, I searched. All the snot in my head seems to be clogging up my synapses today, unfortunately. But the blog entry looks good.

I imagine that the 140,000 cswitches/second on 16-core machines (about 8,750 per core) running httpd plus the PHP interpreter is pretty much a fundamental limit on E5410-level hardware, given that Apache is heavyweight enough that it's going to be more towards the 50,000µs end of the spectrum presented in that blog.

Now I just have to convince the powers that be that PHP is a stupid thing to rely on when you don't have to, and that it's obviously that recent change that broke the system that formerly coped with many times the amount of traffic that it now croaks on.

Now I've got some benchmarks to run. I mean, fight some fires.

--
Tim Connors