Amazon Glacier backup tools for Linux/Unix

I'm interested in moving my automated off-site backups to Amazon Glacier rather than my current system, which is a combination of Dropbox, Box, S3, and a custom Perl app.

Glacier works better with chunks of files rather than individual ones, so I need a tool that knows to make archives out of folders, and to replace the whole lot in Glacier if any files have changed -- but to leave it alone if nothing has changed. A SHA checksum over all the (ordered) files would be sufficient, I think. Software that could perform incremental backups would be better, though.

I'd also prefer to have these archives encrypted before being uploaded, preferably using public-key crypto so that I don't need to store the passphrase on the server all the time.

If I write another script, at least it'll be in Scala instead of Perl, but still... I feel like I'd just be reinventing the wheel. Surely someone has already created something that does the above?

Unfortunately, googling around just brings up GUI software, or else people with very simple one-off scripts. I want something I can stick in cron.nightly and forget about, unless it emails me to say it's failed.

Does the hivemind have any recommendations?
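The checksum-over-ordered-files idea can be sketched in a few lines. This is illustrative only -- the function name and the choice of SHA-256 are mine, not from any existing tool:

```python
import hashlib
from pathlib import Path

def folder_digest(folder):
    """SHA-256 over every file in the folder, visited in sorted order.
    If the digest matches the one recorded at the last upload, the
    archive already in Glacier can be left alone."""
    h = hashlib.sha256()
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Hash the relative name too, so renames count as changes.
            h.update(str(path.relative_to(folder)).encode())
            h.update(path.read_bytes())
    return h.hexdigest()
```

A nightly cron job would compare this digest against a stored one and only rebuild and re-upload the archive on a mismatch.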

Duplicity? http://duplicity.nongnu.org/
I've never used it, and can't vouch for it, though :(

Martin

On 21 May 2014 16:07, Toby Corkindale <toby@dryft.net> wrote:
I'm interested in moving my automated off-site backups to Amazon Glacier rather than my current system which is a combination of Dropbox, Box, S3, and a custom Perl app.
Glacier works better with chunks of files rather than individual ones, so I need a tool that knows to make archives out of folders, and to replace the whole lot in Glacier if any files have changed -- but to leave it alone if nothing has changed. A SHA checksum over all the (ordered) files would be sufficient, I think. Software that could perform incremental backups would be better, though.
I'd also prefer to have these archives encrypted before being uploaded, preferably using public-key crypto so that I don't need to store the passphrase on the server all the time.
If I write another script, at least it'll be in Scala instead of Perl, but still.. I feel like I'd just be reinventing the wheel. Surely someone has already created something that does the above?
Unfortunately, googling around just brings up GUI software, or else people with very simple one-off scripts. I want something I can stick in cron.nightly and forget about it, unless it emails me to say it's failed.
Does the hivemind have any recommendations?
_______________________________________________
luv-main mailing list
luv-main@luv.asn.au
http://lists.luv.asn.au/listinfo/luv-main
--
=================================================================
Martin Paulo, BSc.
Software Developer
Tel : +61-3-9434 2508 (Home)
Tel : 04 205 20339 (Mobile)
Site: http://www.thepaulofamily.net
"Nobody goes there any more. It's too crowded" - Yogi Berra.

On 21 May 2014 16:34, Martin Paulo <martin.paulo@gmail.com> wrote:
Duplicity? http://duplicity.nongnu.org/
I've never used it, and can't vouch for it, though :(
Thanks. Looks like it would work for S3-based backups and is almost certainly neater than my custom solution -- but it doesn't support Glacier. It's probably not hard to add support, though: as long as it's making tarball-like archives and not individual files, it'll play OK with their accounting. (Glacier encourages fewer, very large, file archives.)

On Wed, 21 May 2014 17:40:14 Toby Corkindale wrote:
Looks like it would work for S3-based backups and is almost certainly neater than my custom solution -- but doesn't support Glacier. It's probably not hard to add support though, as long as it's making tarball-like archives and not individual files it'll play OK with their accounting. (Glacier encourages fewer, very large, file archives)
Amazon has a facility for automatically copying S3 data into Glacier. So why can't anything that uses S3 support copying the data to Glacier?

Also, why do you want Glacier? Last time I looked at the pricing, the cost of storing 15TB in Glacier for a year was about equal to buying a Dell PowerEdge T110 server and 5*4TB disks, which in a RAID-Z configuration will store the same amount of data. Personally, I'd trust a ZFS server I run at a remote site more than Amazon cloud storage.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
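Russell's figure can be sanity-checked against the per-GB rate the thread itself quotes. A rough calculation, assuming the ~$0.01/GB/month Glacier rate implied by the "$10 per month means 1TB of data" comment later on (the server-side costs are not modelled here):

```python
# Assumed 2014 Glacier pricing, inferred from "$10/month means 1TB".
RATE_PER_GB_MONTH = 0.01  # USD

def glacier_yearly_cost(terabytes):
    """Yearly Glacier storage cost in USD at the assumed rate."""
    return terabytes * 1000 * RATE_PER_GB_MONTH * 12

# At that rate, 15 TB comes to $1800/year, and 1 TB to $120/year.
```

So the comparison hinges entirely on how much data you have: at 15TB the yearly Glacier bill is in the same ballpark as a modest server plus disks, while at 1TB it is far cheaper.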

On 21 May 2014 17:55, Russell Coker <russell@coker.com.au> wrote:
On Wed, 21 May 2014 17:40:14 Toby Corkindale wrote:
Looks like it would work for S3-based backups and is almost certainly neater than my custom solution -- but doesn't support Glacier. It's probably not hard to add support though, as long as it's making tarball-like archives and not individual files it'll play OK with their accounting. (Glacier encourages fewer, very large, file archives)
Amazon has a facility for automatically copying S3 data into Glacier. So why can't anything that uses S3 support copying the data to Glacier?
No reason, but it means you're paying all the S3 fees too.
Also why do you want Glacier?
Last time I looked at the pricing the cost of storing 15TB in Glacier for a year was about equal to buying a Dell PowerEdge T110 server and 5*4TB disks which in a RAID-Z configuration will store the same amount of data.
Seriously? Amazon Glacier will charge ~$10/month to store my data, with no up-front fee.

Buying a server and a pile of disks is a significant start-up cost, and then hosting will be $90/month -- and I'll have to monitor the server, spend money on replacement disks and parts as they fail, and replace the entire machine every few years. I really don't see how that is possibly cheaper than spending $10/month.

Toby
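For reference, the automatic S3-to-Glacier copying Russell mentioned above is configured as a bucket lifecycle rule. A minimal sketch of such a rule, using boto3's present-day naming (the rule ID and transition delay are illustrative):

```python
def glacier_transition_rule(days=1):
    """Build an S3 lifecycle rule dict that migrates objects to the
    GLACIER storage class after `days` days.  Passing the result to
    boto3's put_bucket_lifecycle_configuration applies it to a bucket."""
    return {
        "ID": "to-glacier",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to all objects
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }
```

As Toby notes, though, objects transitioned this way still incur S3 fees while they sit in S3, which is why uploading directly to Glacier can be cheaper.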

$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?

--
Sent from my Samsung Galaxy Note 2 with K-9 Mail.

On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.

Hi, On Wed, May 21, 2014 at 6:50 PM, Toby Corkindale <toby@dryft.net> wrote:
On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.
The lazy motivation drives efficiency and repeatability... cool admin! :-)
http://www.iwise.com/CSx01

BW

On 21 May 2014 18:50, Toby Corkindale <toby@dryft.net> wrote:
On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.
In case any current or future employers are reading this, I must point out that I'm talking about my personal archives here, not anything to do with work. I'm just lazy in my own spare time :)

Hi, On Wed, May 21, 2014 at 7:03 PM, Toby Corkindale <toby@dryft.net> wrote:
On 21 May 2014 18:50, Toby Corkindale <toby@dryft.net> wrote:
On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.
In case any current or future employers are reading this, I must point out that I'm talking about my personal archives here, not anything to do with work. I'm just lazy in my own spare time :)
I can guarantee you that you have no cause for alarm... Show an employer that you can handle a backup once with a static script, then hand it on to some drone with 5 minutes of knowledge transfer, and you will win big time :-)

BW

Toby Corkindale <toby@dryft.net> writes:
On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.
And presumably you don't care if Amazon (& their business customers, & the intelligence community) can see your bad poetry and photos of your kids[0] ;-)

[0] or whatever normal people keep on their home machines.

On 22 May 2014 17:43, Trent W. Buck <trentbuck@gmail.com> wrote:
Toby Corkindale <toby@dryft.net> writes:
On 21 May 2014 18:44, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Because I'm lazy and forgetful when it comes to the removable backup part. If I remember to update my off-site backup monthly, I'm doing well! I'd like to just script it instead.
And presumably you don't care if Amazon (& their business customers, & the intelligence community) can see your bad poetry and photos of your kids[0] ;-)
In my original post, I mentioned client-side crypto being a requirement...
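A sketch of how that client-side public-key step fits in: only the recipient's public key needs to live on the backup host, so no passphrase is stored there. The recipient ID is illustrative; this just builds the gpg invocation rather than running it:

```python
def gpg_encrypt_command(archive, recipient):
    """Build the gpg command line that encrypts `archive` to the given
    public key.  Decryption (and hence the private key / passphrase)
    only ever happens off the backup host, at restore time."""
    return [
        "gpg", "--encrypt",
        "--recipient", recipient,       # public key ID or email
        "--output", archive + ".gpg",   # encrypted output file
        archive,
    ]
```

The resulting list can be handed to subprocess.run() before the upload step of a nightly cron script.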

Hi, On Wed, May 21, 2014 at 6:44 PM, Russell Coker <russell@coker.com.au> wrote:
$10 per month means 1TB of data. Why not just buy a bunch of SATA disks and a USB SATA device for removable backup?
Why not do that?

- Upfront cost? (albeit cheap SATA disks...)
- Cost of physically swapping out the device (go to the DC, change the disk, leave the DC... time is money)
- Knowledge transfer? (How do you back up? cp -l, rsync, Bacula, Commvault, Veeam... add your personal solution here)
- What happens if the swapper/backup designer gets hit by a bus?

BW

Hi Toby,

Have you checked out http://www.tarsnap.com/? It's built by one of the main contributors to FreeBSD; it uses AWS S3, is really secure, and has a very reasonable price.

Or if you don't care about their service, you might want to have a look at their blog post http://www.daemonology.net/blog/2012-09-04-why-tarsnap-doesnt-use-glacier.ht...

HTH.

Cheers,

On Wed, May 21, 2014 at 4:07 PM, Toby Corkindale <toby@dryft.net> wrote:
I'm interested in moving my automated off-site backups to Amazon Glacier rather than my current system which is a combination of Dropbox, Box, S3, and a custom Perl app.
Glacier works better with chunks of files rather than individual ones, so I need a tool that knows to make archives out of folders, and to replace the whole lot in Glacier if any files have changed -- but to leave it alone if nothing has changed. A SHA checksum over all the (ordered) files would be sufficient, I think. Software that could perform incremental backups would be better, though.
I'd also prefer to have these archives encrypted before being uploaded, preferably using public-key crypto so that I don't need to store the passphrase on the server all the time.
If I write another script, at least it'll be in Scala instead of Perl, but still.. I feel like I'd just be reinventing the wheel. Surely someone has already created something that does the above?
Unfortunately, googling around just brings up GUI software, or else people with very simple one-off scripts. I want something I can stick in cron.nightly and forget about it, unless it emails me to say it's failed.
Does the hivemind have any recommendations?
--
simple is good
http://brucewang.net
http://twitter.com/number5

On 22 May 2014 17:56, Bruce Wang <bruce@brucewang.net> wrote:
Hi Toby,
Have you checked out http://www.tarsnap.com/?
It's built by one of the main contributors to FreeBSD; it uses AWS S3, is really secure, and has a very reasonable price.
Unfortunately my data is not going to compress or deduplicate, and while under a terabyte currently, I'm using 1000GB as the benchmark amount. With tarsnap I'd be looking at $250/month, or $3000/year, excluding transfer fees. Glacier, for the same amount, would be $10/month, or $120/year, excluding transfer fees. It's hard to beat that price!
Or if you don't care about their service, you might want to have a look at their blog post http://www.daemonology.net/blog/2012-09-04-why-tarsnap-doesnt-use-glacier.ht...
For those who haven't read the article, it's perhaps best summarised by a comment from my original post, when I said that glacier works best with archives of files, rather than individual files. Thus you need a different approach than some backup systems take.
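That "different approach" amounts to packing each folder into one big archive and uploading it whole, rather than tracking individual files. A hedged sketch of the packing step (the resulting bytes could then be handed to something like boto3's Glacier upload_archive call; names here are illustrative):

```python
import hashlib
import io
import os
import tarfile

def build_archive(folder):
    """Pack a folder into a single gzipped tarball in memory and return
    (bytes, sha256 hex digest).  One big archive per folder suits
    Glacier's per-archive accounting far better than many small files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(folder, arcname=os.path.basename(folder))
    data = buf.getvalue()
    return data, hashlib.sha256(data).hexdigest()
```

Storing the digest alongside the Glacier archive ID makes the "replace the whole lot only if something changed" bookkeeping from the original post straightforward.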
participants (6)

- Brent Wallis
- Bruce Wang
- Martin Paulo
- Russell Coker
- Toby Corkindale
- trentbuck@gmail.com