
On Thu, 16 Feb 2017 12:12:44 PM Bill Yang via luv-main wrote:
> I need to transfer 200+ TB of data from one storage server (Red Hat Linux based) to another (FreeBSD). I am planning to use rsync with multiple threads in a script. There are a number of suggestions on the Internet (find + xargs + rsync), but none of them has worked well so far. I also need a reliable way to check whether all files/directories from the source server have been copied to the destination server. Any suggestions/help would be appreciated.
If the files are reasonably large and can be relied on not to change file data without changing metadata, then checking is easy via a final run of rsync -va without the -c option (see the first sketch below). If the files are small then a lot of the rsync time will be taken up by seeking for metadata, so that might not be viable (e.g. before SSDs became popular you couldn't just run something like "find /" on a large mail server).

As for the multiple threads, the common way of doing this is copying by parent directory. For example, when copying a server you might copy /var and /usr in separate processes. That has the obvious problem that the sizes of the trees are often significantly different, so one process finishes long before the other. If you have lots of files in one directory you could transfer /directory/[a-k]* in one process and /directory/[l-z]* in another (see the second sketch below). This wouldn't delete directories that have been removed from the source, but that is easily fixed with a later pass of rsync -va --delete, again as long as the files are reasonably large.

Maybe it would help if you attached the scripts you tried with xargs etc. so we could see what you tried.
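Here is a minimal sketch of that final checking pass, assuming the destination is reachable over ssh; /data, user and destserver are placeholders for your actual paths and hosts:

    # Dry run (-n): itemise (-i) files that differ by size/mtime or are
    # missing/extra on the destination, without changing anything.
    rsync -navi --delete /data/ user@destserver:/data/

    # If the report looks sane, run again without -n to fix the differences.
    rsync -va --delete /data/ user@destserver:/data/

Adding -c would compare file contents by checksum instead of trusting size and mtime, but it reads every byte on both ends, which at 200+ TB is a lot of extra I/O.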
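And a sketch of the glob-split approach for parallelising a single large directory; again /data, user and destserver are placeholders, and the character ranges should be adjusted to balance the load:

    # Two rsync processes in parallel, split on the first character of the
    # file name. Note that dotfiles, digits and capitals match neither glob,
    # and a very large directory may exceed the shell's argument-length limit.
    rsync -va /data/[a-k]* user@destserver:/data/ &
    rsync -va /data/[l-z]* user@destserver:/data/ &
    wait    # block until both background transfers finish

    # Cleanup pass: copies anything the globs missed and deletes files that
    # no longer exist on the source.
    rsync -va --delete /data/ user@destserver:/data/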