
On Tue, 26 Mar 2013, Aryan Ameri <info@ameri.me> wrote:
What's the best way to copy a large directory tree (around 3TB in total) with a combination of large and small files? The files currently reside on my NAS, which is on my LAN (connected via gigabit ethernet), and are mounted on my system as an NFS share. I would like to copy all files/directories to an external hard disk connected via USB.
As others have already suggested, I recommend rsync. Use it without the -c option (e.g. "rsync -va") and you can run it multiple times with little overhead. The -c option makes it checksum every file (i.e. read all the file data); without it only metadata is compared, and the cache on modern systems will generally cover that. One problem I've had when combining rsync with something else is getting the trailing / characters wrong and copying things twice (a source of "dir/" copies the contents of dir, while "dir" copies the directory itself).

On Tue, 26 Mar 2013, Paul Miller <paul.miller@rmit.edu.au> wrote:
Does your NAS have USB ports on it?
If so, I'd log into there and do it all there. It does seem that an awful lot of time is being wasted here with network latency.
Especially if the NAS has any USB 3 ports.
USB 3 has a theoretical speed of 5Gbit/s, or about 500MB/s. If the NAS and the drive both support it then it will be faster than GigE. If either the NAS or the drive is USB 2 then you have a theoretical speed of 480Mbit/s and a maximum that I've measured of about 35MB/s. I doubt that a file sharing protocol could need 3 times the data transfer of a filesystem on a block device to copy the same files, so the bottleneck when using GigE to copy files to a USB 2.0 device should be USB. Of course using GigE will add a little latency which will slow things down a bit, but then you could run two copies of rsync at the same time to stop that being a problem.

On Tue, 26 Mar 2013, "Trent W. Buck" <trentbuck@gmail.com> wrote:
Note -a does not do -HSX, and you're only passing -S. Both -H and -S have significant overhead, so pass them iff you need them.
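What -S (--sparse) asks of the receiving side can be sketched in a few lines of Python. This is only an illustration of the write-versus-seek idea, not rsync's actual implementation; the function name copy_sparse and the 4096-byte block size are assumptions made for the example.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, chosen only for illustration


def copy_sparse(src_path, dst_path, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zeros = bytes(block)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block)
            if not chunk:
                break
            if chunk == zeros[:len(chunk)]:
                # Zero block: skip ahead rather than write, leaving a hole
                # on filesystems that support sparse files.
                dst.seek(len(chunk), os.SEEK_CUR)
            else:
                dst.write(chunk)
        # A trailing seek does not extend the file, so fix the final length.
        dst.truncate(dst.tell())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(b"head" + bytes(1024 * 1024) + b"tail")
        copy_sparse(src, dst)
        print("logical size:", os.path.getsize(dst))
        # st_blocks is not available on every platform, hence getattr.
        print("blocks used: ", getattr(os.stat(dst), "st_blocks", "n/a"))
```

The only extra work per block is the comparison against zeros; whether the destination actually ends up using fewer disk blocks depends on the filesystem.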
Why does -S have significant overhead? If the source file doesn't have blocks of zeros then there shouldn't be any difference; if it does have zero blocks then it's just replacing a write with a seek.

On Wed, 27 Mar 2013, Jason White <jason@jasonjgw.net> wrote:
USB 2 is notoriously slow in this scenario. For details, look up Sarah Sharp's talk at LCA several years ago on USB 3, which solves the problems. As I recall, it isn't the actual data transfer rate that causes the poor performance, but I can't remember the details now.
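For a sense of scale, the figures quoted earlier in this thread (480 Mbit/s signalling for USB 2, about 35 MB/s measured, gigabit ethernet, roughly 3TB of data) can be turned into rough copy times. This is a back-of-envelope sketch in Python: it assumes decimal units, a sustained transfer rate, and no protocol overhead on the network side, and all the names are made up for the example.

```python
# Figures taken from the thread; rough values, not measured here.
TREE_BYTES = 3 * 10**12          # "around 3TB", assuming decimal terabytes
USB2_MEASURED = 35 * 10**6       # about 35 MB/s observed over USB 2
USB2_SIGNALLING = 480 * 10**6    # USB 2 signalling rate in bits per second
GIGE_RAW = 10**9 / 8             # 125 MB/s raw line rate, overhead ignored


def hours(total_bytes, bytes_per_second):
    """Time in hours to move total_bytes at a sustained rate."""
    return total_bytes / bytes_per_second / 3600


usb2_hours = hours(TREE_BYTES, USB2_MEASURED)           # about 23.8 hours
gige_hours = hours(TREE_BYTES, GIGE_RAW)                # about 6.7 hours
usb2_efficiency = USB2_MEASURED * 8 / USB2_SIGNALLING   # about 0.58
gige_vs_usb2 = GIGE_RAW / USB2_MEASURED                 # about 3.6

print(f"USB 2 at 35 MB/s: {usb2_hours:.1f} h")
print(f"GigE at line rate: {gige_hours:.1f} h")
print(f"USB 2 efficiency:  {usb2_efficiency:.0%}")
print(f"GigE / USB 2:      {gige_vs_usb2:.1f}x")
```

On these assumptions the network has about 3.6 times the headroom of the USB 2 link, so a first full copy over USB 2 is on the order of a day.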
http://en.wikipedia.org/wiki/USB_2#USB_2.0_.28High_Speed.29

Wikipedia says that USB 2 has a "maximum signaling rate of 480 Mbit/s (effective throughput up to 35 MB/s or 280 Mbit/s)". So it seems that it's about 58% efficient.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/