
Hi Russell

We maintain, and regularly deploy, dual master setups for MySQL. Upgrades and maintenance are very easy - zero downtime for the frontend. Obviously, for scaling a web app, read-only slaves are used and extra ones can be added as needed; there too, maintenance and upgrades are not an issue provided there's sufficient capacity in the overall system (which there should be, otherwise it can't handle failures either).

Just to be clear, a master-master setup is not "active/passive" in the traditional sense; technically both masters are active.

One thing to note is that Amazon is not a good place to host such setups because a floating IP address is used, and Amazon's Elastic IP offering is charged at datacenter-external rates.

Depending on the load/activity, Amazon is not particularly economical for bigger web/db setups - so it's important to do the math. For a site that has lots of traffic for some of the day but is doing nothing the rest of the day, Amazon works well. For a site that has traffic most of the day, the cost in CPU cycles will be much higher than for, say, a setup on a set of Linode servers. Linode also has an API and other tools, so deployments can be automated. We use Puppet as well.

Depending on dataset size and load, solutions like Cassandra can be overkill. Factors to consider are whether you need RDBMS-type access with joins, grouping and aggregates (which you typically don't get with distributed systems - you have to DIY those things, it's a trade-off) and how well you know the technology. MySQL is not necessarily the optimal choice for every scenario, but it's a general purpose system and its performance, behaviour, advantages and pitfalls are well known - known factors are easier to manage.

Regards,
Arjen.

----- Original Message -----
One of my clients is proposing a project that requires good storage performance and high reliability. It's an entirely new project so there's no legacy code to deal with.
The traditional way of doing this would be to have a cluster of systems, maybe in an active/passive configuration with database replication, or with MySQL or PostgreSQL clustering. Those solutions are difficult to manage and upgrade.
http://en.wikipedia.org/wiki/Apache_Cassandra
I think that probably the best thing to do is to use something like Cassandra on a cluster of servers in a DC to run this. The Cassandra feature set seems good (including being able to add new servers at run-time) and developing for it from scratch can't be a lot harder than doing MySQL development.
Does anyone have any suggestions for planning at this early stage?
I had thought of doing something similar with the Amazon EC2 equivalent to Cassandra, but a quick scan of their web site reveals no mention of it. Did Amazon cancel their cloud key-value store service?
--
Exec.Director @ Open Query (http://openquery.com) MySQL services
Sane business strategy explorations at http://Upstarta.biz
Personal blog at http://lentz.com.au/blog/