Re: reproducibility of results

On Sun, Dec 25, 2016 at 04:56:05PM +1100, Paul van den Bergen wrote:
> Funny, I was asked about exactly the same problem when I started @WEHI... only there was no attempt made to even start tackling the problem...
yeah, we were constantly getting individual academics and research groups asking us about storage, and then trying to do the best we could with minimal resources.

the unfortunate fact is that disks/storage arrays and file-servers and tape libraries etc. are expensive. You can replace a very large percentage of your up-front capital expense with skilled techs, which are an on-going cost (you're going to need them to look after expensive equipment anyway, and it has to be maintained & upgraded for 7+ years), but it's still going to cost a lot for huge data storage anyway, even if you avoid over-priced name-brand gear.
> Virtualisation of workload makes the problem a lot easier to tackle, but even so... 7 years is a long time in IT...
cheap big disks help a lot too. but you need a lot of them, plus backup - on-site and off-site.

CPU & RAM are more than adequate for pretty nearly any file-storage needs these days... could always use more of both for computational stuff.

craig

--
craig sanders <cas@taz.net.au>
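
A minimal sketch of the on-site plus off-site backup point above, in Python. The /data path, the two backup hosts, and the flat mirror layout are assumptions made purely for illustration, not anything from the thread; it also assumes rsync and working ssh access to both hosts.

#!/usr/bin/env python3
"""Push one on-site and one off-site copy of a data tree with rsync."""
import subprocess
from datetime import date

SRC = "/data/"  # hypothetical research data tree
TARGETS = [
    "backup.local:/backups/data/",          # on-site copy (hypothetical host)
    "offsite.example.org:/backups/data/",   # off-site copy (hypothetical host)
]

def replicate(src, dest):
    # --archive preserves ownership/permissions/times, --delete keeps the
    # mirror exact; versioned snapshots are left out to keep the sketch short.
    subprocess.run(
        ["rsync", "--archive", "--delete", "--compress", src, dest],
        check=True,
    )

if __name__ == "__main__":
    print("backup run", date.today().isoformat())
    for target in TARGETS:
        replicate(SRC, target)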

Virtualisation can help a lot. The ability to snapshot a server version along with the data it analysed is probably the right approach (and a relatively small footprint, since if you are doing it right, you are taking snapshots anyway). But you still have to set it up...

Research data is interesting: massive volumes, and a relatively short "live" time frame before it can be migrated to slower storage. Tiered storage solutions help a lot. Archiving to tape is viable, but it really needs a way to be accessed automatically (by users, without sysadmin involvement - less work, more likely to be used, and the users test the archiving validity constantly...)
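
One way to read the "snapshot a server version along with the data it analysed" idea is filesystem-level snapshots taken under a shared label, so the VM image and the dataset it processed can be rolled back or cloned together. A minimal sketch, assuming (purely for illustration) ZFS datasets named tank/vm/analysis01 and tank/data/project01 and the zfs command on the PATH:

#!/usr/bin/env python3
"""Snapshot an analysis VM and its data under one common run label."""
import subprocess
from datetime import datetime

# hypothetical dataset names - adjust to the real pool/dataset layout
DATASETS = ["tank/vm/analysis01", "tank/data/project01"]

def snapshot_analysis(label=None):
    label = label or datetime.now().strftime("run-%Y%m%d-%H%M%S")
    names = ["{}@{}".format(ds, label) for ds in DATASETS]
    for name in names:
        # one snapshot per dataset, all carrying the same run label
        subprocess.run(["zfs", "snapshot", name], check=True)
    return names

if __name__ == "__main__":
    for snap in snapshot_analysis():
        print("created", snap)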
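
And a rough sketch of the tiered-storage point: anything not touched for a while moves to a slower tier, with a symlink left behind so users can still open it without asking a sysadmin. The /fast and /archive paths and the 90-day cutoff are made-up placeholders, and a symlink is only a crude stand-in for the stub-and-automatic-recall that a real tape/HSM system would provide.

#!/usr/bin/env python3
"""Move idle files from a fast tier to an archive tier, leaving symlinks."""
import os
import shutil
import time

FAST = "/fast/projects"        # hypothetical "live" tier
ARCHIVE = "/archive/projects"  # hypothetical slow/cheap tier
MAX_IDLE_DAYS = 90

def tier_down(fast_root, archive_root, max_idle_days):
    cutoff = time.time() - max_idle_days * 86400
    for dirpath, _dirs, files in os.walk(fast_root):
        for fname in files:
            src = os.path.join(dirpath, fname)
            if os.path.islink(src) or os.path.getatime(src) > cutoff:
                continue  # already tiered down, or accessed recently
            dest = os.path.join(archive_root, os.path.relpath(src, fast_root))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.move(src, dest)   # copy to the slow tier, drop the original
            os.symlink(dest, src)    # leave a path users can still open

if __name__ == "__main__":
    tier_down(FAST, ARCHIVE, MAX_IDLE_DAYS)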
participants (2)
- Craig Sanders
- Paul van den Bergen