Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
I am running an rsync job that transfers about 45 million files, approximately 1.8
TB of data (a Mac OS X Time Machine backup), over a 100 Mbit/s connection.

I use rsync 3.1.1 from MacPorts (I first tried the built-in rsync,
version 2.6.9, since it has a Mac OS X-specific cache parameter, but it
ran out of memory) with the following parameters:

% rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b"
/source/ /destination/

The source is an external 3.5" HDD connected via FireWire 800. The
destination is a sparse disk image bundle mounted locally (but its
"source file" is on network storage). Initially I got good speeds, 7-9
MB/s for reasonably large files, but the longer this operation has been
running (I restarted it three days ago, see below), the slower it gets.
There are also long pauses when nothing happens, like this:

2011-01-22-070305/Macintosh HD/Library/Application
Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/Mask3.png
1.28K 100% 3.26kB/s 0:00:00 (xfr#48406, ir-chk=1050/4166332)

2016/01/16 18:26:48
Volumes/src/Backups.backupdb/mm/2011-01-22-070305/Macintosh
HD/Library/Application
Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/Mask3.png
313

2011-01-22-070305/Macintosh HD/Library/Application
Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/banner-green.jpg
32.26K 100% 0.00kB/s 0:00:00 (xfr#48407, ir-chk=1049/4166332)

2016/01/16 19:17:37
Volumes/2TB/Backups.backupdb/mm/2011-01-22-070305/Macintosh
HD/Library/Application
Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/banner-green.jpg
31279

As you can see, the first file finished at 18:26 and the second at 19:17:
almost an hour for a file that is just 32 kB.
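
Pauses like this can be located mechanically instead of by reading the log. The following is a minimal sketch, assuming the log was produced with the `--out-format="%t %f %b"` setting shown above, so that every transfer line starts with a `YYYY/MM/DD HH:MM:SS` timestamp. The function name `find_stalls`, the 10-minute threshold, and the abbreviated sample paths are illustrative choices, not anything rsync provides.

```python
from datetime import datetime

def find_stalls(log_lines, threshold_seconds=600):
    """Return (previous_time, current_time, gap_seconds) for each long gap."""
    stalls = []
    previous = None
    for line in log_lines:
        # Lines written by --out-format="%t ..." start with "YYYY/MM/DD HH:MM:SS".
        try:
            current = datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # progress lines, wrapped paths, byte counts, blank lines
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap >= threshold_seconds:
                stalls.append((previous, current, gap))
        previous = current
    return stalls

# Sample lines modelled on the excerpt above; the paths are abbreviated here.
sample = [
    "2016/01/16 18:26:48 Volumes/src/.../Mask3.png 313",
    "1.28K 100% 3.26kB/s 0:00:00 (xfr#48406, ir-chk=1050/4166332)",
    "2016/01/16 19:17:37 Volumes/2TB/.../banner-green.jpg 31279",
]
for start, end, gap in find_stalls(sample):
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}: {gap / 60:.1f} min idle")
```

For the two timestamps above, the gap comes to 3049 seconds, a little under 51 minutes.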

I don't think the transfer is CPU limited. There are some CPU spikes, but
CPU load is generally below 10%. The three rsync processes spawned
by this operation have, all in all, used almost exactly 5h of CPU time in
the 72h the transfer has been running. The computer itself idles 23h a day.

Nor is memory a problem. Memory pressure has been "green" since the
operation began.

Kernel task has accumulated quite a bit of CPU time (57h when I write
this), but on the other hand, the uptime is 25 days and all these 57h
can't have been consumed by rsync.

Some final details

* I had had this process running for a couple of days when I restarted
it three days ago to get better logging. It took nine hours before the
first file was transferred.

* I first used Finder to transfer this directory tree from the same
source to the same destination. That took 3 days, all in all. Now I have
spent 6 days and I don't think I have transferred even a third of the tree.

* I have tried transferring files between the same source and
destination outside of this operation and they go at full speed.
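
For scale, the figures quoted in this message give a rough idea of how long the transfer should take. This is a back-of-the-envelope sketch, not a measurement: it assumes "1.8 TB" means decimal terabytes, ignores protocol overhead, and treats all 45 million entries as separate files even though many are hard links. The variable names are illustrative.

```python
TOTAL_BYTES = 1.8e12      # "approximately 1.8 TB", taken as decimal terabytes
FILE_COUNT = 45_000_000   # "about 45 million files"
LINK_BITS_PER_S = 100e6   # 100 Mbit/s link, ignoring protocol overhead

def hours(seconds):
    return seconds / 3600

# Lower bound if the link were saturated the whole time.
wire_limit_h = hours(TOTAL_BYTES * 8 / LINK_BITS_PER_S)
# Duration at the 7 and 9 MB/s observed early in the transfer.
observed_h = [hours(TOTAL_BYTES / (mb * 1e6)) for mb in (7, 9)]
# Average size per directory entry.
mean_file_kb = TOTAL_BYTES / FILE_COUNT / 1e3

print(f"link-speed floor: {wire_limit_h:.0f} h")
print(f"at 7-9 MB/s: {observed_h[1]:.0f}-{observed_h[0]:.0f} h")
print(f"mean file size: {mean_file_kb:.0f} kB")
```

Under these assumptions the floor is about 40 hours, and 7-9 MB/s works out to roughly 56 to 71 hours, which is in line with the three days Finder needed. The mean of about 40 kB per entry suggests the per-file cost matters more than raw bandwidth.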

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: Why is my rsync transfer slow?

Kevin Korb
First, don't use -z on a local copy.  It will only make rsync slower
for no reason at all.

Second, 45 million files means 90 million calls to stat().  This will
take a while even if nothing needs copying.

--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
        Kevin Korb Phone:    (407) 252-6853
        Systems Administrator Internet:
        FutureQuest, Inc. [hidden email]  (work)
        Orlando, Florida [hidden email] (personal)
        Web page: http://www.sanitarium.net/
        PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-21 15:00, Kevin Korb wrote:
 > First, don't use -z on a local copy.  It will only make rsync slower
 > for no reason at all.

Thanks. I hadn't thought about that. I just copied most of the switches
from the spelled-out "archive" list. But is rsync so "stupid" that it really
applies -z to a local transfer?

 > Second, 45 million files means 90 million calls to stat().  This will
 > take a while even if nothing needs copying.

Hmm, is there a way to benchmark how long a stat() call takes?
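
One way to get a rough number is to time the calls directly. This is a minimal sketch under stated assumptions: the function name `time_lstat` and the 100,000-entry sample limit are illustrative, the timing includes the directory listing done by `os.walk` as well as the `lstat()` calls, and a second run over the same tree will look faster because the metadata is cached.

```python
import os
import time

def time_lstat(root, limit=100_000):
    """lstat() up to `limit` entries under `root`; return (count, seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            count += 1
            if count >= limit:
                return count, time.perf_counter() - start
    return count, time.perf_counter() - start

if __name__ == "__main__":
    # "." is a placeholder; point it at the tree you want to measure.
    count, seconds = time_lstat(".")
    if count:
        per_call = seconds / count
        print(f"{count} lstat() calls, {per_call * 1e6:.1f} us each")
        # 90 million is the estimate given earlier in the thread.
        print(f"90 million calls would take about {90e6 * per_call / 3600:.1f} h")
```

Running it once against the source volume and once against the mounted disk image would show whether metadata access on either side is unusually slow.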

And still, why is it so much slower than Finder? Finder is a dog when it
comes to file operations. Rsync (and cp) are usually many times faster.

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-21 09:20, [hidden email] wrote:
> I run a rsync job transferring about 45 million files/approximately 1.8
> TB data (a Mac OS X Time Machine backup) over a 100 MBit connection.
>
> I use rsync 3.1.1 from MacPorts (I first tried the built in rsync,
> version 2.6.9, since it has a Mac OS X specific cache parameter, but it
> ran out of memory) with the following parameters
>
> % rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b"
> /source/ /destination/

Well, after some examination I found at least one problem with this
transfer (that is still running): hard links are not preserved:

This is how a certain file looks on the source, where it is backed up in
several locations using hard links:

source volume:

zsh-% ls -i "/…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG"
9236871 /…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG

zsh-% ls -i "/…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG"
9236871 /…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG


destination volume:

zsh-% ls -i "/…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG"
20765913 /…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG

zsh-% ls -i "/…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG"
704428 /…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG

As you can see the inode number is the same on the source volume while
it is completely different on the destination volume.
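
The same comparison can be scripted so that many files can be checked at once. This is a minimal sketch: `same_inode` and `link_count` are illustrative helper names, and the demonstration uses a throwaway temporary directory rather than the real backup paths.

```python
import os
import shutil
import tempfile

def same_inode(path_a, path_b):
    """True if both paths are hard links to one file on the same volume."""
    a, b = os.lstat(path_a), os.lstat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

def link_count(path):
    """Number of directory entries that point at this file's inode."""
    return os.lstat(path).st_nlink

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.jpg")
    linked = os.path.join(tmp, "linked.jpg")
    copied = os.path.join(tmp, "copied.jpg")
    with open(first, "wb") as handle:
        handle.write(b"example")
    os.link(first, linked)          # second name for the same inode
    shutil.copyfile(first, copied)  # independent file with its own inode
    print(same_inode(first, linked), link_count(first))
    print(same_inode(first, copied), link_count(copied))
```

A preserved hard link shows up as a shared inode with a link count above 1; a duplicated file has its own inode, as on the destination volume above.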

Why are my hard links not preserved? I thought the purpose of -H was
to transfer the hard links rather than the file itself.

Re: Why is my rsync transfer slow?

Kevin Korb
It will, assuming it sees both links in the same rsync run.

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-23 17:50, Kevin Korb wrote:
> It will, assuming it sees both links in the same rsync run.

How does one handle interrupted transfers if one wants to preserve hard
links? Would --partial and --append-verify work?

Re: Why is my rsync transfer slow?

Kevin Korb
As long as it still sees both links it is fine.

Essentially, the way it works is that whenever rsync -H (on the
source) sees a file with a link count >1, it remembers the
inode-number-to-filename pair.  If it finds another instance of that
inode, it then links to the same file on the target.  So, if you abort
after it copies one but before it links the other, it will still handle
it correctly on the next run.

It just won't handle it if you rsync tree #1 then rsync tree #2.  It
won't see a hard link that is common to both since it wasn't analyzing
them together.
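
The bookkeeping described above can be illustrated with a short sketch. This is a simplified model of the idea, not rsync's actual implementation: `plan_hard_links` is a hypothetical function that walks one tree, remembers the first path seen for each inode with a link count above 1, and reports later paths with the same inode as links to recreate.

```python
import os

def plan_hard_links(root):
    """Walk `root` once and return (copies, links).

    copies: relative paths to copy as regular files
    links:  (new_path, existing_path) pairs to recreate as hard links
    """
    first_seen = {}  # (st_dev, st_ino) -> first relative path with that inode
    copies, links = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.lstat(full)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in first_seen:
                # Inode already met in this walk: link instead of copying.
                links.append((rel, first_seen[key]))
            else:
                if st.st_nlink > 1:
                    first_seen[key] = rel
                copies.append(rel)
    return copies, links
```

Because the map only lives for one walk, calling this on two snapshot directories separately reports every file as a copy, while calling it once on their common parent finds the shared inodes. That is the "tree #1 then tree #2" case described above.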

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-23 21:16, Kevin Korb wrote:
 > As long as it still sees both links it is fine.
 >
 > Essentially, the way it works is that whenever rsync -H (on the
 > source) sees a file with a link count >1 it remembers the
 > inode#>filename pair.  If it finds another instance of that inode it
 > then links to the same file on the target.  So, if you abort after it
 > copies one but before it links the other it will still handle it
 > correctly on the next run.
 >
 > It just won't handle it if you rsync tree #1 then rsync tree #2.  It
 > won't see a hard link that is common to both since it wasn't analyzing
 > them together.

I'm not sure I understand your answer. As you could see in my previous
message, the files that should have been linked but were duplicated are
located under the same root directory ("/backups"):

/backups/2011-06-23-040258/Pictures/DSCF0748.JPG

/backups/2010-12-18-070445/Pictures/DSCF0748.JPG

If your explanation is correct, why is rsync losing track of the links
just because the transfer was interrupted?

Re: Why is my rsync transfer slow?

Kevin Korb
What was your rsync source and target that made those?

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-23 22:02, Kevin Korb wrote:
> What was your rsync source and target that made those?

What do you mean? Filesystem is HFS (Mac OS X). Rsync version is 3.1.2
from MacPorts. Source is a regular directory/folder on an external HD,
destination is a disk image.

Re: Why is my rsync transfer slow?

Kevin Korb
I want to know what your whole command line was so I can understand
your results.

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-23 22:59, Kevin Korb wrote:
 > I want to know what your whole command line was so I can understand
 > your results.

% rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b"
/source/ /destination/

(and after the interruption I removed z)

Re: Why is my rsync transfer slow?

Kevin Korb
I need to know what the paths were so I know how they relate to the
file names you listed.

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-23 23:24, Kevin Korb wrote:
> I need to know what the paths were so I know how they relate to the
> file names you listed.

I posted the relevant parts of the path in a previous message

/Volumes/A/Backups.backupdb/mm/2011-06-23-040258/path/DSCF0748.JPG
/Volumes/B/Backups.backupdb/mm/2011-06-23-040258/path/DSCF0748.JPG

The only difference is the name of the volume, A and B, above.

Re: Why is my rsync transfer slow?

Kevin Korb
Are you rsyncing from one to the other?  Both of them to somewhere
else?  One at a time to somewhere else?  Why won't you just show your
actual command line and an ls -li of the correct source and incorrect
target?

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-24 03:51, Kevin Korb wrote:
> Are you rsyncing from one to the other?  Both of them to somewhere
> else?  One at a time to somewhere else?  Why won't you just show your
> actual command line and an ls -li of the correct source and incorrect
> target?


Are you trolling me? All the information you ask for above has been
clearly spelled out in previous messages, messages you have replied to.


Why do you need my username, drive/volume name and other personal
details from my folder hierarchy? Both paths are regular local paths
with no special characters except space. And they are identical except
for the volume name.

Re: Why is my rsync transfer slow?

Selva Nair

On Sun, Jan 24, 2016 at 12:29 PM, <[hidden email]> wrote:
> On 2016-01-24 03:51, Kevin Korb wrote:
>> Are you rsyncing from one to the other?  Both of them to somewhere
>> else?  One at a time to somewhere else?  Why won't you just show your
>> actual command line and an ls -li of the correct source and incorrect
>> target?
>
> Are you trolling me? All the information you ask for above has been clearly spelled out in previous messages, messages you have replied to.

Sorry for butting in, but hope this helps:

The command line you posted earlier reads

 % rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b" /source/ /destination/

I think Kevin is asking you to write out /source/ and /destination/ exactly as used on the command line so that one can better understand what is going on. The issues you're facing are rather unusual, so a more complete description may help figure out what's happening. Sure, you can mask username/password etc., but do not simplify the source and destination paths.

Also, the description "The destination is a sparse disk image bundle mounted locally (but its "source file" is on a network storage)" is too cryptic. What kind of network storage? How is it mounted: NFS? SMB? What kind of sparse disk image? What's a bundle? Not that I have any clue why the transfer could be so slow or why rsync is not detecting hard links in your case (it should, as Kevin initially pointed out), but someone else may be able to shed some light.

Just trying to help,

Selva

Re: Why is my rsync transfer slow?

dbonde+forum+rsync.lists.samba.org
On 2016-01-24 20:39, Selva Nair wrote:

> Sorry for butting in, but hope this helps:
>
> The command line you posted earlier reads
>
>   % rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b"
> /source/ /destination/
>
> I think Kevin is asking you write out that /source/ and /destination
> exactly as used on the command line so that one could understand what is
> going on better.

That doesn't make sense. Both the source and destination paths contain
simple alphanumeric characters, no more, no less. Why would it matter
whether the path is /abc/ or /def/ or even /123/?

> The issues you're facing are rather unusual so a more
> complete description may help figure what's going on. Sure, you can mask
> username/password etc but do not simplify source and destination paths.
>
> Also the the description "The destination is a sparse disk image bundle
> mounted locally (but its
> "source file" is on a network storage)" is too cryptic. What kind of
> network storage? How is it mounted -- NFS? SMB? What kind of sparse disk
> image? What's a bundle?

It is exactly as I wrote. A "sparse disk image bundle" (B), i.e., a type
of disk image used in OS X, is stored on a network volume (A). B is
then mounted locally (i.e., local to where rsync is run) on a computer
(C), where it appears as one of many volumes.

In other words, B is stored on A. A is mounted (using AFP) on C. C
then mounts B (= opens a file on a network volume, but instead of opening,
e.g., a spreadsheet in Excel, opening B shows a new volume on the
desktop of C). The computer where it is mounted just sees a
mounted volume; it can't distinguish between a disk image stored
remotely and one stored on the computer's internal hard drive.

I assume you are familiar with the idea of disk images?

Re: Why is my rsync transfer slow?

Selva Nair

On Sun, Jan 24, 2016 at 4:48 PM, <[hidden email]> wrote:
> That doesn't make sense. Both the source and destination path contains simple alphanumeric characters, no more no less. Why would it matter whether the path is /abc/ or /def/ or even /123/?

Hmm.. I thought you are the one who has been asking for help. It very much does matter exactly what your source and destination are.

> I assume you are familiar with the idea of disk images?

There are many different kinds of disk images, and many ways of mounting network storage. If you say "I do rsync /a /b/ and it runs too slow", you are not going to get any useful responses.

Good luck,

Selva

Re: Why is my rsync transfer slow?

Simon Hobson-2
[hidden email] wrote:

> It is exactly as I wrote. On a network volume (A) a "sparse disk image bundle" (B), i.e., a type of disk image used in OS X, is stored. B is then mounted locally (i.e., local to where rsync is run) on a computer (C) where it appears as one of many volumes.
>
> In other words, B is stored on A. A is then mounted (using AFP) on C. C then mounts B (=opens a file on a network volume, but instead of opening e.g., a spreadsheet in Excel, opening B shows a new volume on the desktop of C) stored on A.


> The computer where it is mounted just sees a mounted volume - it can't distinguish between a disk image stored remotely and one stored on the computer's internal hard drive.

I wouldn't count on that!

> I assume you are familiar with the idea of disk images?

I think most are familiar with disk images - but not so many with the specific implementations used by OS X.
OS X has the concept of a "bundle". To the user this appears as a single file with its own name and icon. Internally it's a folder tree with a number of files/folders.
As a quick test, I've just created a 100M sparse image; here are the contents before I've added any files:

> $ ls -lRh a.sparsebundle/
> total 16
> -rw-r--r--  1 simon  staff   496B 25 Jan 14:36 Info.bckup
> -rw-r--r--  1 simon  staff   496B 25 Jan 14:36 Info.plist
> drwxr-xr-x  8 simon  staff   272B 25 Jan 14:36 bands
> -rw-r--r--  1 simon  staff     0B 25 Jan 14:36 token
>
> a.sparsebundle//bands:
> total 34952
> -rw-r--r--  1 simon  staff   2.1M 25 Jan 14:37 0
> -rw-r--r--  1 simon  staff   2.4M 25 Jan 14:36 1
> -rw-r--r--  1 simon  staff   2.0M 25 Jan 14:36 2
> -rw-r--r--  1 simon  staff   912K 25 Jan 14:36 6
> -rw-r--r--  1 simon  staff   8.0M 25 Jan 14:36 b
> -rw-r--r--  1 simon  staff   1.7M 25 Jan 14:36 c
It is **NOT** the same as a Unix sparse file!
The contents are divided up into chunks, with each chunk stored in a file of its own. I suspect this may also have an impact on performance. As the disk is filled, the "bands" files grow in number and size. Once the disk is full, the bands are complete from 0 through c, with all but c being 8M.
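
The band arithmetic for the 100M test image can be checked with a small sketch. It assumes fixed 8M bands numbered from 0 in hexadecimal, as the listing suggests; the function names are made up for illustration and are not part of any Apple tool.

```shell
# Hypothetical helpers, assuming fixed-size bands as in the listing above.

# Number of band files needed once an image of size_mb is completely full.
band_count() {
    size_mb=$1
    band_mb=$2
    # Integer ceiling division: a partial last band still needs its own file.
    echo $(( (size_mb + band_mb - 1) / band_mb ))
}

# Name of the last band: bands are numbered from 0 in hexadecimal.
last_band_name() {
    printf '%x\n' $(( $(band_count "$1" "$2") - 1 ))
}

band_count 100 8      # prints 13
last_band_name 100 8  # prints c
```

Twelve full 8M bands hold 96M, which leaves 4M for band c. That is why c is the only band smaller than 8M on a full image.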

As an aside, there is also an unfortunate combination of name and Finder behaviour. If you set the Finder to show file extensions, it will show (e.g. in this case) "a.sparsebundle" - but if the name is a bit longer, it shows the beginning of the name, an ellipsis "...", and the end of the name including extension. My mother was "a little confused" when she saw a folder on my screen with several "...arsebundle"s!


There are a lot of layers in your setup - any of them (or some combination thereof) could be slowing things down.

Rsync
Filesystem on B
Loopback mount (and associated systems) for B
AFP between A and C - is the host for A an OS X machine running native AFP, or something like Linux running Netatalk?
Filesystem on A - inc sparse bundle file support
Disk subsystem on A

A few things come to mind ...

1) I am aware that AFP has some performance issues with some combinations of operations - no, I don't know if this is one of them.
2) More importantly, if you look back through the archives, there was a thread not long ago about poor performance of rsync for "very large" file counts - and 45 million is "large". I didn't pay much attention, but IIRC the originator of that thread was proposing some alterations to improve things.
3) While rsync is designed to operate efficiently over slow/high-latency links, a 100 Mbit/s connection is always going to limit throughput.
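
To put a rough number on point 3, here is a back-of-the-envelope sketch of the best-case time for the 1.8 TB and 100 Mbit figures from the original post. It assumes decimal units and a fully used link, and ignores protocol and per-file overhead; the function name is made up.

```shell
# Hypothetical helper: lower bound on transfer time in whole hours,
# ignoring protocol overhead and per-file costs.
# Sizes are decimal (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits).
min_transfer_hours() {
    size_bytes=$1
    link_mbit=$2
    bytes_per_sec=$(( link_mbit * 1000000 / 8 ))   # 100 Mbit/s -> 12500000 B/s
    echo $(( size_bytes / bytes_per_sec / 3600 ))
}

min_transfer_hours 1800000000000 100   # prints 40
```

So even with no AFP, filesystem or per-file overhead, the copy cannot finish in much under 40 hours. At the 7-9 MB/s reported for large files it would be more like 55 to 70 hours.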

As an experiment, can you mount the disk of A locally on C? Shut down the system hosting A and put it in FireWire Target Mode then connect it to C - A's disk then appears as a local FireWire disk on C. This will show whether AFP has any bearing on performance. If the computer hosting A doesn't support target mode then you're a bit stuffed - but there may be other options.
Or alternatively, connect the external disk directly to A's host rather than to C.
Either way, you can then run rsync as a local copy without the network element.


But as I write this, something far far more important comes to mind. Files on HFS disks are not like files on other filesystems (though I believe NTFS has a feature which adds similar complications). I am not sure exactly how rsync handles this - I do recall that Apple's version adds support for the triplet of "metadata + resource fork + data fork". From memory this results in many files getting re-copied every time regardless of whether they were modified or not. Memory is only vague, but I think the problem was that comparing source and dest doesn't work properly when one end is looking at the "whole file" and the other is only looking at one part.

I would suggest doing a test copy using only a small part of the tree, and do the copy again (so no files actually changed) and watch carefully what's been copied. I vaguely recall (from a looong time ago) that any file with a resource fork was re-copied each time even though it's not changed.
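
One way to make the repeat-copy check less dependent on eyeballing the output is to have rsync itemize what it does and count the files it transfers on the second pass. This is only a sketch: `count_transfers` is a made-up helper, the paths are placeholders, and it relies on rsync's `-i` (`--itemize-changes`) output, where the line for a transferred regular file begins with `<f` or `>f`.

```shell
# Hypothetical helper: count regular files that rsync actually transferred,
# reading the output of "rsync -i" (--itemize-changes) on stdin.
count_transfers() {
    # Itemized lines for transferred files start with "<f" or ">f";
    # anything else (directories, attribute-only changes) is ignored.
    # grep -c exits non-zero when the count is 0, hence "|| true".
    grep -c '^[<>]f' || true
}

# Example use (the paths are placeholders):
#   rsync -aHi /source/subtree/ /destination/subtree/ > first.log
#   rsync -aHi /source/subtree/ /destination/subtree/ > second.log
#   count_transfers < second.log    # ideally 0 if nothing changed
```

If the second count is well above zero for an unchanged subtree, the lines in the second log show which files are being re-copied, and whether they are the ones with resource forks.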

If this is the case, and I'm not misremembering, then it's possible that the combination of "rsync not handling very large file sets well" and "resource forks causing issues" could be (at least partly) behind your performance problem.


Another test I'd be inclined to try would be to copy things one restore point at a time. As you'll be aware, each restore point is its own timestamped directory - hardlinked to the previous one for files that haven't changed. Try rsyncing only the last one, then the last two, then the last 3, then the last 4, and so on. You can use --include and --exclude to do this. See how performance varies as the number of included trees increases - I suspect the time taken increases more than linearly given the work involved in tracking hard-links.
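
The include/exclude approach could look roughly like the sketch below. `snapshot_patterns` is a made-up helper and the paths are placeholders; it assumes the restore-point directory names sort chronologically (as the timestamped names do) and contain no newlines. It uses `--include-from` so the patterns never pass through shell globbing.

```shell
# Hypothetical helper: read restore-point directory names (oldest first,
# one per line) on stdin and print rsync include patterns for the last N.
snapshot_patterns() {
    n=$1
    tail -n "$n" | while IFS= read -r snap; do
        # "/name/***" matches the directory itself and everything below it;
        # the leading "/" anchors it at the root of the transfer.
        printf '%s\n' "/$snap/***"
    done
}

# Example use (the paths are placeholders):
#   cd /source/Backups.backupdb/machine
#   ls | grep '^[0-9]' | snapshot_patterns 2 > last2.txt
#   rsync -aH --include-from=last2.txt --exclude='/*' ./ /destination/machine/
```

The trailing `--exclude='/*'` drops every other top-level entry, so only the selected restore points are scanned. Repeating the run with 1, 2, 3, ... restore points and timing each one should show how the cost grows.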

