Amazon / Google + rsync for storage/backup

Hi there,

I am looking at options for backup storage. From reviewing prices, storage is quite cheap, but I get confused by the mix-'n-match nature of some of the cloud offerings. The alternative is to spin up a VM with our current providers, but then the cost per GB is high and there are some limits on the VM disk size.

I have tried implementations (FUSE, I believe) of mounting a bucket and using rsync to copy data to it, but it was slow, and every file was transferred in full rather than just the changes - which killed the economics of the exercise because of the bandwidth cost.

From a cost/reliability point of view I would like to go with a large provider, and I would like to avoid spinning up a new VM if a direct rsync service is available.

Any ideas?

Thanks
Piers
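A minimal sketch of the FUSE-mount-plus-rsync approach described above, assuming s3fs-fuse and placeholder bucket and path names (not necessarily the exact tools used here). Because rsync sees the mount as a local filesystem it copies changed files whole, and the object store behind the mount re-uploads each object in full, so only the skipping of unchanged files saves bandwidth:

# Mount the bucket via FUSE, with credentials stored in ~/.passwd-s3fs
$ s3fs mybucket /mnt/mybucket -o passwd_file=${HOME}/.passwd-s3fs

# Copy to the mounted bucket; any changed file is re-uploaded in full
$ rsync -av /srv/data/ /mnt/mybucket/backup/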

On 23/02/16 07:05, Piers Rowan via luv-main wrote:
Hi there,
I am looking at options for backup storage. From reviewing prices, storage is quite cheap, but I get confused by the mix-'n-match nature of some of the cloud offerings. The alternative is to spin up a VM with our current providers, but then the cost per GB is high and there are some limits on the VM disk size.
Seem to be up and working with Google now. Thanks
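For reference, one way to get rsync-style behaviour against a Google Cloud Storage bucket is gsutil's rsync command. A minimal sketch, assuming an installed and authenticated gsutil and a placeholder bucket name (an illustration only, not necessarily the setup referred to above):

# Authenticate once; credentials are stored locally
$ gcloud auth login

# Sync a local directory to the bucket: -m parallelises transfers,
# -r recurses, -d deletes remote objects that no longer exist locally
$ gsutil -m rsync -r -d /srv/data gs://mybucket/data

Note that gsutil rsync skips unchanged files but still uploads changed files in full; it does not do rsync-style delta transfers.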

On Tue, 2016-02-23 at 11:10 +1000, Piers Rowan via luv-main wrote:
I am looking at options for backup storage. From reviewing prices storage is quite cheap but I get confused by the mix-'n-match nature of some of the cloud offerings. [...] Seem to be up and working with Google now.
If storing data "outside your premises" is not an issue for you (you have to trust another entity with what it does, and does not do, with your data in transit and at rest), AWS S3 is pretty simple and straightforward:

- get an AWS account, log in at the AWS Management Console
- S3: create a bucket (choose your preferred geo location, e.g. Sydney)
- IAM: create a new user account (API access only, no password)
- IAM: attach an access policy (to let this user access the bucket)
- on your Linux box: install the AWS CLI ("aws cli")
- run "aws configure" (to store the access key and secret key)

That's it. Now you can read/write/sync your data to and from the S3 bucket. Some examples below.

Output the general man pages:
$ aws help

Output the S3-specific man pages:
$ aws s3 help

List all buckets:
$ aws s3 ls

Copy file "helloworld.txt" to S3 bucket "mybucket":
$ aws s3 cp helloworld.txt s3://mybucket/

List the content of bucket "mybucket":
$ aws s3 ls s3://mybucket/

Download file "helloworld.txt" from S3 bucket "mybucket" to /tmp:
$ aws s3 cp s3://mybucket/helloworld.txt /tmp/

Synchronise directory /tmp/myfiles to S3 recursively:
$ aws s3 sync /tmp/myfiles s3://mybucket/myfiles/

List the content of directory "myfiles" of bucket "mybucket" recursively:
$ aws s3 ls s3://mybucket/myfiles/ --recursive

Further reading:
Getting Started: https://aws.amazon.com/getting-started/
AWS S3 details and pricing: https://aws.amazon.com/s3/
AWS CLI: https://aws.amazon.com/cli/
AWS CLI at GitHub: https://github.com/aws/aws-cli

I also played with "s3fs-fuse" and wrote a tutorial (see link below), but I prefer to use the "official" aws-cli way nowadays, to be honest :-)
https://schams.net/typo3-on-aws/documentation/howto/mount-an-aws-s3-bucket/

Off topic: depending on the data, you may want to consider encrypting every file at your end, prior to transferring it to AWS, Google or any other external service provider.

Cheers
Michael
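To illustrate the client-side encryption suggestion at the end of Michael's message, a minimal sketch using symmetric GPG encryption together with the same aws-cli tooling; the file and bucket names are placeholders:

# Encrypt locally with a passphrase (AES256); produces backup.tar.gpg
$ gpg --symmetric --cipher-algo AES256 backup.tar

# Upload only the encrypted copy
$ aws s3 cp backup.tar.gpg s3://mybucket/

# Later: download and decrypt with the same passphrase
$ aws s3 cp s3://mybucket/backup.tar.gpg .
$ gpg --output backup.tar --decrypt backup.tar.gpg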

I use backupninja to schedule and configure the use of duplicity to back up to AWS S3. This gets me GPG-encrypted backups on cheap storage with redundancy, with compressed incremental backups. It's not bad in terms of traffic usage for backup runs, but it has to transfer and examine a lot of metadata if you just need to recover the odd file. My main concern is to cover myself against catastrophic failure of servers.

You can use a GPG key, but I'm not convinced it adds anything for my use, so I use symmetric encryption and signing, which means that the config file for backupninja contains all I need for recovery.

Basically you install backupninja, and then set up a config file like so:

------ /etc/backup.d/90.offsite-s3.dup ------
options = --volsize 200
nicelevel = 19
testconnect = no

[gpg]
encryptkey =
signkey =
password = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

[source]
include = /etc
include = /home
include = /usr/local/*bin
include = /var/backups
include = /var/spool/cron/crontabs
include = /var/lib/dpkg/status*
include = /var/www
exclude =

[dest]
incremental = yes
increments = 30
keep = 180
keepincroffulls = 3
desturl = s3+http://YOUR_BUCKET_NAME_HERE
awsaccesskeyid = XXXXXXXXXXXXXXXXXXXX
awssecretaccesskey = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
bandwidthlimit =
-------------------------------------------------

If you want to use European buckets, you need a couple of extra options for duplicity:

-------------------------------------------------
options = --volsize 200 --s3-european-buckets --s3-use-new-style
-------------------------------------------------

I'm not sure I've got the volsize right. Probably if I'd done more recovery to date I might have formed more of an opinion on that.

Installing on Debian and Ubuntu, I seem to remember an issue with incomplete dependencies in the backupninja installation, which meant there was a Python library or two I had to add. "boto" comes to mind. There might have been something else as well.

Supposedly this approach is compatible with letting AWS automatically move old files to "Glacier" storage. For my purposes it's cheap enough without that, so I'd rather not spend time thinking through the issues (time and cost) involved in recovery from Glacier.

I also have backupninja doing database backups into /var/backups, and then those backup files get included in duplicity's run.

Also, I have some stuff with Hetzner in Germany, and I noticed this the other day: https://www.hetzner.de/ot/hosting/produktmatrix/storagebox-produktmatrix . It looks interesting and cheap. It probably doesn't have the redundancy that AWS offers, and AWS's elastic pricing is nice. On the other hand, it gives you things like rsync access and access from a mobile app, so it's worth keeping in mind.

Regards,
Andrew McNaughton
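As an illustration of the single-file recovery case Andrew mentions, a minimal sketch of restoring one file directly with duplicity, assuming the same bucket, credentials and symmetric passphrase as in the config above (the path and names are placeholders):

# duplicity reads the S3 credentials and the symmetric passphrase
# from the environment
$ export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXX
$ export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
$ export PASSPHRASE=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# Restore a single file (path relative to the backup root) from the
# most recent backup into /tmp
$ duplicity --file-to-restore etc/hosts s3+http://YOUR_BUCKET_NAME_HERE /tmp/hosts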
On 23/02/16 21:55, Michael Schams via luv-main wrote:
On Tue, 2016-02-23 at 11:10 +1000, Piers Rowan via luv-main wrote:
I am looking at options for backup storage. From reviewing prices storage is quite cheap but I get confused by the mix-'n-match nature of some of the cloud offerings. [...] Seem to be up and working with Google now.
If storing data "outside your premises" is not an issue for you (you have to trust another entity with what it does, and does not do, with your data in transit and at rest), AWS S3 is pretty simple and straightforward:
[...]