rsync speedup

I was curious about the "speedup" number that rsync prints when an operation completes. Searching online turned up nothing, so I hope this post can be the result people find. When rsync runs on a directory, it first builds the file list and then transfers only the differences. That's what makes it so brilliant. Because it builds the file list for the entire directory, recursively, it knows the total size of that directory, and it also keeps track of how much data it actually transfers. The report may look something like this:

sent 4013337 bytes received 76 bytes 39934.46 bytes/sec
total size is 5171204259 speedup is 1288.48


rsync is telling you that it actually transferred 4013337 bytes (~4MB) from the local machine to the remote machine, and 76 bytes in the other direction. It gives you an average transfer rate, and then the big number, the size of the directory that was just synchronized: 5171204259 bytes (~5GB). The speedup number is the total size divided by the total data transferred in both directions: SPEEDUP = TOTAL SIZE / (SENT + RECEIVED). "Advantage" might be a better term for it, but the developers chose "speedup" and that's what we got. You can think of it as a ratio or multiplier: a speedup of 1 would mean you basically transferred the whole thing with no savings, whereas the example operation transferred only about 1/1288 of the total data to make up the difference.
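To check the arithmetic, here is a minimal Python sketch that applies the formula to the numbers from the report above. The function name rsync_speedup is made up for illustration and is not part of rsync.

```python
def rsync_speedup(total_size: int, sent: int, received: int) -> float:
    """Total size of the synchronized files divided by the bytes
    actually moved in both directions (hypothetical helper)."""
    return total_size / (sent + received)


# Numbers taken from the example report above.
speedup = rsync_speedup(total_size=5171204259, sent=4013337, received=76)
print(f"speedup is {speedup:.2f}")  # prints "speedup is 1288.48"
```

The result matches the figure rsync printed, which suggests the reported value is exactly this ratio rounded to two decimal places.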

Registered Linux User #370740 (http://counter.li.org)

15 comments:

Anonymous said...

Nice explanation. Thanks!

Anonymous said...

I was wondering what speedup was and googling it brought me here (btw, give the rsync --human-readable flag a try)

Anonymous said...

Yeah, I also got here via Google.
Thanks for explanation!

Greg said...

I got here by googling "rsync speedup is".

I had never used rsync before tonight, and I jumped into the deep end by mirroring getdeb and playdeb on my site, http://rsync.labby.co.uk and rsync://rsync.labby.co.uk. At the end I saw a lot that I understood, but I couldn't figure out speedup.

Your page was very informative and crystal clear.

Thanks for the heads up :)

slaton said...

Very helpful explanation of the 'speedup' statistic, thanks.

SiNGH said...

Hiya, just want to say thanx for the to-the-point explanation of the 'speedup'.

It's really nice to know how it's calculated, & that we're getting 'speedups' in the 1000's factor!

Lead Developer @ Studios.DesertBox.Org

^_^

Blog to a global audience said...

Thanks for this explanation, I have been wondering what that meant!

Mark said...

Thanks for looking into this. Speedup is a common term; rsync didn't invent it. Speedup is technically the ratio of times or speeds, so what rsync calls speedup is actually an estimated speedup. It's also hyperinflated if, for example, you are transferring data over Ethernet and all of the files need to be scanned for differences. Reading and transferring all the data wouldn't take much longer than reading all the files just to find their differences. Big gains bandwidth-wise, though.

I wonder if I should make a patch to rsync to make their fake speedup an accurate estimate of *real* speedup instead...

Anonymous said...

thanks for explanation!

Anonymous said...

It helps. Thanks!

Anonymous said...

Thanks for the explanation, I also got here on the search engine bus.

Anonymous said...

Thanks for that explanation - helped me too

Johnny said...

The speedup will also increase if you use the -z flag, which tells rsync to compress the data before sending. In this way, the total size is much greater than what is actually sent.

I have been using rsync lately to transfer files from my linux desktop to a remote server. Due to the nature of these files, I first delete the entire directory from the server, then transfer the entire directory again from my box. So while I'm not getting the "delta advantage" of rsync, I still get a "speedup" because of the compression.

Paul Nguyen said...

Thanks for the tip, Johnny!

I think your usage would benefit from the --delete flag, which will delete files present in the target that are not present in the source, effectively giving you the "delta" functionality you were looking for. Of course, if your target is writable by some process that might hit it during the transfer, those files could be overlooked.

multia said...

rsync always calculates an MD5 checksum of the source and destination files to check if the file needs to be transferred (you can also let it decide on size+mtime, but then it will compute the MD5 anyway to check that the file has been transferred correctly). On slow connections, this can greatly speed up an incremental update, but on fast connections (e.g. a 100 Mbit network) with limited CPU power (e.g. a cheap NAS), calculating the MD5 checksum might take more time than the actual transfer.
In these cases you might notice that the incremental update takes more time than the first run, when the destination file was not there yet.

The speedup value is how much you saved on the data transfer, but as you can see, this might not be a useful indication of the duration of the total transaction.
