The --inplace is very different from the behaviour of --partial when resuming a complex case transfer.

Hongyi Zhao
Hi all,

The rsync manpage describes `--inplace' as follows:

        --inplace
          The option implies --partial (since an interrupted transfer does
          not delete the file)

So I tested `--inplace' and `--partial' for resuming a file transfer,
with the following steps:

1- rsync ftp.cn.debian.org::debian/dists/wheezy/main/binary-amd64/Packages.gz .

2- split -b 1M Packages.gz

At this point, I obtained the following eight files:

 xaa  xab  xac  xad  xae  xaf  xag  xah

3- Then I removed two of the eight files, xab and xae, deleted the
complete Packages.gz (since >> appends), and used cat to regenerate
Packages.gz from the remaining pieces:

rm -f Packages.gz
for i in xaa xac xad xaf xag xah; do
  cat "$i" >> Packages.gz
done

4- I then resumed the rsync transfer with `--inplace' and `--partial'
in turn, to update the Packages.gz file to its correct version:

$ rsync --inplace --progress -v ftp.cn.debian.org::debian/dists/wheezy/main/binary-amd64/Packages.gz .
[snipped]

Packages.gz
      7,625,173 100%  101.07kB/s    0:01:13 (xfr#1, to-chk=0/1)

sent 14,201 bytes  received 6,581,416 bytes  86,217.22 bytes/sec
total size is 7,625,173  speedup is 1.16

At this point, I deleted Packages.gz and regenerated it with the
commands:

for i in xaa xac xad xaf xag xah; do
  cat "$i" >> Packages.gz
done

And then ran the following command:

$ rsync --partial --progress -v ftp.cn.debian.org::debian/dists/wheezy/main/binary-amd64/Packages.gz .
[snipped]
Packages.gz
      7,625,173 100%  177.44kB/s    0:00:41 (xfr#1, to-chk=0/1)

sent 14,201 bytes  received 2,112,951 bytes  45,745.20 bytes/sec
total size is 7,625,173  speedup is 3.58

As you can see, in my case the Packages.gz file lacks 2M = 2097152
bytes.  With the `--inplace' method, rsync retrieves 6,581,416 bytes to
update this file to its correct version, while with the `--partial'
method it retrieves only 2,112,951 bytes.

So, I can't figure out why the rsync manual says that `--inplace'
implies `--partial'.

I also wonder why the `--inplace' method is so inefficient: as you can
see, in this case it re-transfers almost the whole file to update only
a small part of it.

The `--partial' method, by contrast, works smoothly.  Its 2,112,951
bytes is only slightly more than the data actually missing from the
file.  Of course, rsync needs some control data to do its work, so a
small overhead is expected.  So this method works just as the manual
describes.
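
The figures quoted above can be checked with a few lines of shell
arithmetic.  This is only a sketch: the variable names are made up for
illustration, the numbers are copied from the two runs above, and the
"received" totals include protocol data and whatever the [snipped]
output contained, so the overhead figure is approximate.

```shell
#!/bin/sh
# Figures copied from the two rsync runs above (variable names are illustrative).
total=7625173          # size of the complete Packages.gz
chunk=1048576          # piece size used by "split -b 1M"
recv_inplace=6581416   # "received" bytes reported with --inplace
recv_partial=2112951   # "received" bytes reported with --partial

missing=$((2 * chunk))                         # xab and xae were left out
overhead_partial=$((recv_partial - missing))   # bytes beyond the missing data
pct_inplace=$((recv_inplace * 100 / total))    # share of the file re-sent (integer %)

echo "missing=$missing overhead_partial=$overhead_partial pct_inplace=$pct_inplace%"
```

Run as written, this prints
missing=2097152 overhead_partial=15799 pct_inplace=86%
so the second run fetched about 15 kB more than the missing data, while
the first re-fetched roughly 86% of the file.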

Any hints on the above issues?

Regards
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: The --inplace is very different from the behaviour of --partial when resuming a complex case transfer.

Kevin Korb
2 things...

First, --partial has no effect in your commands.  --partial does not
mean resume.  Rsync would always attempt to verify+complete a shorter
file unless you specify --whole-file.  --partial just means don't
delete an incomplete file when rsync is aborted.  You never aborted
rsync so there was never a partial file for rsync to delete.
--inplace does imply --partial as it also won't delete an incomplete
file on abort.
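
The first point can be checked without a network connection.  The
sketch below is a local reproduction under stated assumptions, not the
original test: make_damaged_copy and literal_bytes are hypothetical
helper names, the input can be any large local file, and
--no-whole-file is needed because rsync otherwise skips the delta
algorithm for local copies.  With --stats, rsync reports "Literal
data", the bytes that could not be taken from the existing file.

```shell
#!/bin/sh
# make_damaged_copy SRC DST CHUNK_BYTES
# Rebuild DST from SRC with the 2nd and 5th CHUNK_BYTES-sized pieces left
# out, mirroring the removal of xab and xae in the original test.
make_damaged_copy() {
    src=$1 dst=$2 chunk=$3
    tmp=$(mktemp -d) || return 1
    split -b "$chunk" "$src" "$tmp/x"     # pieces are named xaa, xab, ...
    rm -f "$tmp/xab" "$tmp/xae"
    cat "$tmp"/x* > "$dst"                # glob expands in sorted order
    rm -rf "$tmp"
}

# literal_bytes [rsync options...] SRC DST
# Run a local delta transfer and print the "Literal data" byte count.
literal_bytes() {
    rsync --no-whole-file --stats "$@" |
        awk '/^Literal data:/ { gsub(",", "", $3); print $3 }'
}

# Example (file names are placeholders):
#   head -c 8000000 /dev/urandom > full.bin
#   make_damaged_copy full.bin a.bin 1048576 && cp a.bin b.bin
#   literal_bytes full.bin a.bin              # no --partial at all
#   literal_bytes --partial full.bin b.bin    # with --partial
```

Since neither run is aborted, both should report about the same amount
of literal data, a little over the 2M that was removed.  Passing
--inplace against a third damaged copy should report far more.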

Second, poking a hole in a file is a known problem with --inplace.
Normally when rsync completes an existing file it writes out an entire
new file using the parts it can from the existing file and parts it
has to transfer.  Once complete rsync deletes the old version and
renames the new one into place.  --inplace changes this behavior
(important if you have big files and can't store an extra one or are
using a CoW filesystem).  The problem with --inplace and removing a
chunk from the middle of the file is that once rsync hits a difference
it starts overwriting good data with the data that is supposed to be
at that part of the file.  So rsync never finds any more good data.
If you were to swap 2 sections of the file or replace one with bogus
data resulting in the same size file you would get the same amount of
network transfer with either option though --inplace would save you
some disk IO.
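
The hole problem can be illustrated with a toy model.  This is a
simplification and an assumption about the matching rule, not rsync's
actual code: the file is treated as a list of named 1M chunks,
count_transfers is a hypothetical function, a normal transfer may reuse
a chunk found anywhere in the old file, and an in-place transfer may
reuse a chunk only if it sits at or after the position currently being
written (anything earlier has already been overwritten).

```shell
#!/bin/sh
# count_transfers MODE "OLD CHUNKS" "NEW CHUNKS"
# MODE is "normal" or "inplace". Prints how many chunks of the new file
# must be sent because they cannot be reused from the old file.
count_transfers() {
    mode=$1 basis=$2 target=$3
    sent=0 tpos=0
    for want in $target; do
        reuse=no bpos=0
        for have in $basis; do
            if [ "$have" = "$want" ]; then
                # normal: any match is usable (the old file stays intact).
                # inplace: only a match not yet overwritten is usable.
                if [ "$mode" = normal ] || [ "$bpos" -ge "$tpos" ]; then
                    reuse=yes
                fi
            fi
            bpos=$((bpos + 1))
        done
        [ "$reuse" = yes ] || sent=$((sent + 1))
        tpos=$((tpos + 1))
    done
    echo "$sent"
}

old="xaa xac xad xaf xag xah"
new="xaa xab xac xad xae xaf xag xah"
echo "normal:  $(count_transfers normal "$old" "$new") chunks sent"
echo "inplace: $(count_transfers inplace "$old" "$new") chunks sent"
```

For the test in this thread the model gives 2 chunks for a normal
transfer and 7 for --inplace.  Seven chunks is everything after xaa:
6,576,597 bytes (six full chunks plus the 285,141-byte tail), which is
close to the 6,581,416 bytes received.  Replacing xab with same-sized
junk instead of removing it gives 1 chunk in either mode.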

On 04/14/2015 06:01 AM, Hongyi Zhao wrote:

> [snipped]

--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
        Kevin Korb Phone:    (407) 252-6853
        Systems Administrator Internet:
        FutureQuest, Inc. [hidden email]  (work)
        Orlando, Florida [hidden email] (personal)
        Web page: http://www.sanitarium.net/
        PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~