100% CPU freeze on read of large files with --sparse
While restoring a large data backup with rsync 3.1.1, I found that adding the
--sparse option can permanently wedge the rsync processes. The backup
contained some big sparse-ish files (VMDK files, to be precise).
I performed a few basic checks while it was happening (at one point I left it
for a few days, so I suspect it can last more or less forever):
* strace of the 100% CPU processes didn't show any syscall activity, making
me suspect they were blocked in userland
* kill and kill -9 could not stop the processes, which would imply they were
blocked in kernel IO
* the processes refused to stop consuming 100% CPU until the system was
rebooted
* rebooting the system took forever on the all-process-kill timers
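
The strace and kill results point in opposite directions (userland versus
kernel), so one further check that might help separate the two is to sample
/proc/<pid>/stat twice and see whether the CPU time is accumulating in user
mode or in kernel mode. The following is only a rough sketch, assuming a Linux
/proc filesystem; the helper names parse_proc_stat and cpu_mode are made up
for illustration and are not part of rsync or any existing tool.

```python
import sys
import time


def parse_proc_stat(text):
    """Return (state, utime, stime) from the contents of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the last ')' in the line.
    rest = text[text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return rest[0], int(rest[11]), int(rest[12])


def cpu_mode(before, after):
    """Compare two samples and report where the CPU time went."""
    _, u0, s0 = parse_proc_stat(before)
    state, u1, s1 = parse_proc_stat(after)
    du, ds = u1 - u0, s1 - s0
    if du == 0 and ds == 0:
        return state, "idle"
    return state, "user" if du >= ds else "kernel"


# Usage (hypothetical script name): python3 cpumode.py <pid>
if __name__ == "__main__" and len(sys.argv) > 1:
    path = "/proc/%s/stat" % sys.argv[1]
    with open(path) as f:
        first = f.read()
    time.sleep(5)  # sampling interval, chosen arbitrarily
    with open(path) as f:
        second = f.read()
    print(cpu_mode(first, second))
```

If this reported state R with the time going to "kernel", that would suggest
the process is spinning inside a single system call. That would be consistent
with both observations: strace prints nothing until a call returns, and
signals (including SIGKILL) are only acted on when the process leaves the
kernel. A state of D would instead suggest an uninterruptible sleep, although
that normally would not show up as 100% CPU.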
I wanted to see if anybody had seen similar behavior before, or if there is
more I could do to diagnose the cause. It's the first time in many years of
use that I have ever gotten any unexplained behavior like this from rsync, so
I wasn't sure what I should check, since it defied most typical debug tools.
The behavior appeared to stop when --sparse was removed.