
On Thu, 5 Jul 2012, Robin Humble <robin.humble@anu.edu.au> wrote:
1MW for a couple of bugs which didn't even affect all servers!
another take on this would be that it's a shocking waste that these machines aren't using 1MW all the time - it means that they are basically idle and wasted. cloud is highly inefficient. those of us in HPC expect all machines to be running at ~90% of max power all the time. if they're not, then something is wrong.
An advantage of the Linode model or the EC2 model is that the resources may be used more efficiently. Hetzner just rents servers, so if you only need a fraction of the resources then you still get an entire server. The Hetzner servers I run are far from fully utilised, but they are still a lot cheaper than any other option for getting the same job done.
I have root on 5 Hetzner systems which includes two MySQL instances and for some reason none of them were afflicted by this.
we had 2 nodes out of ~2000 that might have been affected by the leap second. very minor.
I just had Chromium and MySQL on my workstation get afflicted. It's strange that it apparently only happened today and didn't appear to happen in the past (I ran top a couple of days ago and saw nothing unusual).
Hetzner is only one German hosting company and there's also a lot of private computer use that has mostly idle servers (EG pretty much every corporate server I've ever run).
that is shocking. machines use ~30-50% of max power when idle. they should either be off or at max power doing useful work. anything else is a total waste. I guess virtualisation doesn't work now any better than it ever has done.
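As a rough illustration of the idle-power point, here is a minimal Python sketch. It assumes that power draw rises linearly from the idle level to the maximum as utilisation goes up. The linear model, the function name and the 300 W maximum are assumptions for illustration, not measurements from this thread; only the 30-50% idle range comes from the text above.

```python
def watts_per_unit_work(utilisation, idle_fraction=0.4, max_power_w=300.0):
    """Power drawn divided by utilisation, under an assumed linear power model.

    utilisation   -- fraction of the machine doing useful work, in (0, 1]
    idle_fraction -- idle power as a fraction of max power (0.3-0.5 per the thread)
    max_power_w   -- hypothetical maximum draw of one server, in watts
    """
    if not 0.0 < utilisation <= 1.0:
        raise ValueError("utilisation must be in (0, 1]")
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be in [0, 1]")
    # Assumed model: idle draw plus a part that scales with utilisation.
    power_w = max_power_w * (idle_fraction + (1.0 - idle_fraction) * utilisation)
    return power_w / utilisation


if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9, 1.0):
        print(f"utilisation {u:.0%}: {watts_per_unit_work(u):.0f} W per unit of work")
```

Under these assumptions a machine at 10% utilisation draws several times more power per unit of useful work than one running flat out, which is the waste being complained about.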
We need more grid computing tasks like SETI@home. It's a pity that they all seem to have proprietary clients which makes them undesirable to us.
It's easy to imagine this bug as having added a few hundred MW of load to the power grid. That sort of sudden load could cause a blackout. If the systems which manage the power grid to prevent cascading failures were also hit by the same bug then it would have been particularly nasty.
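A back-of-envelope version of that estimate can be sketched in Python. Every input here is hypothetical (the server count, the affected fraction, the 300 W maximum and the 40% idle draw are not figures from this thread), and the sketch simply assumes that an afflicted machine jumps from idle to maximum power.

```python
def sudden_load_mw(servers, fraction_affected, max_power_w=300.0, idle_fraction=0.4):
    """Extra grid load in MW if affected servers jump from idle to max power.

    All parameters are illustrative assumptions, not measured values.
    """
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be in [0, 1]")
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be in [0, 1]")
    # Extra draw per afflicted server: the gap between idle and max power.
    extra_per_server_w = max_power_w * (1.0 - idle_fraction)
    return servers * fraction_affected * extra_per_server_w / 1e6


if __name__ == "__main__":
    # Hypothetical fleet sizes, to see what it takes to reach "a few hundred MW".
    for n in (100_000, 1_000_000, 10_000_000):
        print(f"{n:>10} servers, half affected: {sudden_load_mw(n, 0.5):.0f} MW")
```

With these assumed figures each afflicted server adds about 180 W, so a few hundred MW would need on the order of a couple of million afflicted machines.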
servers are a tiny proportion of the baseload. think of all the air conditioners and aluminium smelters out there. I believe servers are at the 1 to 2% level of total power used.
I believe that aluminium smelters have close arrangements with the power companies; they don't just surprise the power company by turning things on. Air conditioners are quite predictable; you won't suddenly have a few hundred MW of air-conditioning turn on at midnight!
also if power companies can't supply to the sum of their rated substations then that would be negligent of them. they strictly regulate their substations - you can't just plug one in.
It's a well known fact that in almost every part of the world the power companies can NEVER supply to the sum of their substations (*). Doing so would be simply uneconomical, as a lot of unused generating capacity would need to be built and maintained at significant expense. It's not uncommon for this to be demonstrated in summer. (*) Antarctica may be the only exception.
major blackouts usually occur because of poor maintenance and preparation (eg. the current USA storm blackouts) compounded by storms, ice storms, geomagnetic storms, or by bugs in power company software and protocols, such as those that took out the east coast of the USA a few years ago.
A few years ago power to a large portion of the Melbourne CBD was cut when the connection to Tasmania failed. It seems that in hot weather we are one power station failure away from having a massive power cut. Cutting off the CBD is a major issue.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/