
On Fri, Feb 17, 2017 at 06:25:38PM +1100, Joel W. Shea wrote:
> Are you maxing out your disk/network bandwidth already?

This is key, IMO, to whether running multiple rsyncs in parallel is worth it or not.

Almost all of the time, rsync is going to be I/O bound (disk and network) rather than CPU bound, so adding more rsync processes is just going to slow them all down even more. A single rsync process can saturate the disk I/O and network bandwidth of most common disk subsystems and network connections.

About the only time more rsync processes might help is if you're transferring between two servers with SSD storage arrays via a direct-connect 10+Gbps link, and even then only if the disk + network throughput is at least a few multiples of what a single rsync job (incl. child processes for ssh and/or compression, if any) can cope with.

Or if the source AND destination of each of the multiple rsyncs are on completely separate disks/storage-arrays, so they don't compete with each other for disk I/O. E.g. an rsync from server1/disk1 to server2/disk1 can run at the same time as an rsync from server1/disk2 to server2/disk2, especially if you can use separate network interfaces for each rsync.

Splitting up the transfer into multiple smaller rsync jobs to be run consecutively, not simultaneously, can be useful, especially if you intend to run the transfers multiple times to pick up new/changed/deleted/etc files since the last run. There's a lot of startup overhead (and RAM & CPU usage) with rsync on every run, comparing file lists and file timestamps and/or checksums to figure out what needs to be transferred. Multiple smaller transfers (e.g. of entire subdirectory trees) tend to be noticeably faster than one large transfer.

In other words, running multiple rsyncs in parallel is usually a false optimisation.

craig

--
craig sanders <cas@taz.net.au>
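
For illustration only, here is a rough sketch of the "several smaller jobs, run one after another" idea in Python. The source root, subdirectory names, destination host and both helper functions are made up for the example. `-a` and `--delete` are standard rsync options, but whether you want `--delete` depends on your setup, so it is off by default here.

```python
import subprocess
from pathlib import PurePosixPath


def build_rsync_jobs(src_root, subdirs, dest, delete=False):
    """Return one rsync argv list per subdirectory tree (hypothetical helper)."""
    flags = ["-a"]
    if delete:
        # Also remove files on the destination that vanished from the source.
        flags.append("--delete")
    jobs = []
    for name in subdirs:
        # Trailing slashes: copy the contents of src into dst.
        src = str(PurePosixPath(src_root) / name) + "/"
        dst = dest.rstrip("/") + "/" + name + "/"
        jobs.append(["rsync", *flags, src, dst])
    return jobs


def run_consecutively(jobs):
    """Run each job to completion before starting the next (no parallelism)."""
    for argv in jobs:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Assumed layout: /srv/data/{home,mail,www} mirrored to server2.
    jobs = build_rsync_jobs("/srv/data", ["home", "mail", "www"],
                            "server2:/srv/data")
    for argv in jobs:
        print(" ".join(argv))
    # run_consecutively(jobs)  # uncomment to actually start the transfers
```

Each job only has to build and compare the file list for its own subtree, and the jobs never compete with each other for disk or network I/O because only one runs at a time.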