rsync: connection unexpectedly closed

Kip Warner
Hey list,

I am having problems as of late with my rsync backup. On the client
side I am using the following:

    OPTS="-avvvrz
          --compress-level=9
          --itemize-changes
          --delete
          --delete-excluded
          --human-readable
          --files-from=$FILES
          --include-from=$INCLUDES
          --exclude-from=$EXCLUDES
          --partial
          --progress 
          --owner
          --perms
          --progress
          --timeout=0
          --times
          --stats"

    sudo rsync -e "ssh -i ${IDENTITY_FILE} -v -p ${REMOTE_PORT}" $OPTS / $REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH

Note that I have temporarily disabled timeouts and added extra
verbosity. The transfer to the remote host via SSH works fine, up until
it gets to a 30+ GB file (a VM image). It gets about 90+ percent of the
way through, hangs, and then times out. On the client side I see the
following:

    ...
    rsync: connection unexpectedly closed (3542035 bytes received so far) [sender]
    rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.1]
    [sender] _exit_cleanup(code=12, file=io.c, line=226): about to call exit(255)

On the server side if I attach to the rsync process via strace, I see
the following:

    $ strace -f -p 3095

    ...
    3095  select(4, [3], [1], [1], {60, 0}) = 1 (out [1], left {59, 999971})
    3095  write(1, "\3\0\0\7\1\0\0", 7)     = 7
    3095  gettimeofday({1476036967, 673095}, NULL) = 0
    3095  select(4, [3], [1], [1], {60, 0}) = 1 (out [1], left {59, 999970})
    3095  write(1, "H\0\0\trecv_files(home/kip/.Virtual"..., 76) = 76
    3095  gettimeofday({1476036967, 680312}, NULL) = 0
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    3095  select(4, [3], [], NULL, {60, 0}) = 1 (in [3], left {40, 364402})
    3095  read(3, "B\0\0\trecv_files(home/kip/.Virtual"..., 8184) = 160
    3095  select(4, [3], [1], [1], {60, 0}) = 1 (out [1], left {59, 999973})
    3095  write(1, "B\0\0\trecv_files(home/kip/.Virtual"..., 70) = 70
    3095  gettimeofday({1476037227, 506412}, NULL) = 0
    3095  select(4, [3], [1], [1], {60, 0}) = 1 (out [1], left {59, 999971})
    3095  write(1, "V\0\0\trecv mapped home/kip/.Virtua"..., 90) = 90
    3095  gettimeofday({1476037227, 512591}, NULL) = 0
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    ... a couple hundred times or so repeats ...
    3095  select(4, [3], [], NULL, {60, 0}) = 0 (Timeout)
    3095  select(4, [3], [], NULL, {60, 0}

Note that it looks like the select() call is timing out on file
descriptor 3 (the first argument, 4, is the highest descriptor number
plus one; stdin, stdout, and stderr are 0-2), which I presume is a
regular file. This could have nothing to do with rsync at all and could
be a file system issue, but I figured I'd ask.

The server the data is being uploaded to with the strace running on it
has rsync version:

    $ rsync --version
    rsync  version 3.0.9  protocol version 30

The client reported:

    $ rsync --version
    rsync  version 3.1.1  protocol version 31

Any help appreciated.

Regards,

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com
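
The client output in this message carries two different numbers: code=12
in the _exit_cleanup line is rsync's own status, while 255 is the status
ssh uses when it fails, which rsync appears to pass through here. Below is
a minimal sketch of a lookup helper. The function name rsync_exit_meaning
is made up for illustration, and the table covers only a handful of the
values listed in rsync(1) and ssh(1), so treat it as a starting point
rather than a complete reference.

```shell
# Hypothetical helper (not part of rsync): map a few exit codes to the
# descriptions given in the EXIT VALUES section of rsync(1).
rsync_exit_meaning() {
    case "$1" in
        0)   echo "success" ;;
        10)  echo "error in socket I/O" ;;
        11)  echo "error in file I/O" ;;
        12)  echo "error in rsync protocol data stream" ;;
        23)  echo "partial transfer due to error" ;;
        30)  echo "timeout in data send/receive" ;;
        255) echo "not an rsync code; ssh(1) exits with 255 when it hits an error" ;;
        *)   echo "not in this table; see EXIT VALUES in rsync(1)" ;;
    esac
}

# The two numbers from the sender output above:
rsync_exit_meaning 12
rsync_exit_meaning 255
```

Both descriptions say only that the data stream broke, not why, so the cause
still has to be found on whichever side went away first.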
--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: rsync: connection unexpectedly closed

Henri Shustak
> I am having problems as of late with my rsync backup.

Have you tried performing a copy to a known good local device?  If a local copy fails, then I would start checking the file system of the source and also the hardware of that system.


----------------------------
HTRAX : http://www.htrax.xyz

Re: rsync: connection unexpectedly closed

Paul Slootman-5
In reply to this post by Kip Warner
On Mon 10 Oct 2016, Kip Warner wrote:

>
> The server the data is being uploaded to with the strace running on it
> has rsync version:
>
>     $ rsync --version
>     rsync  version 3.0.9  protocol version 30
>
> The client reported:
>
>     $ rsync --version
>     rsync  version 3.1.1  protocol version 31

As always it's best to first upgrade to the current version (3.1.3) if
at all possible, as there's always the chance that the cause of your
problems has already been fixed.


Paul


Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Henri Shustak
On Wed, 2016-10-12 at 13:30 +1300, Henri Shustak wrote:
> Have you tried performing a copy to a known good local device?  If a
> local copy fails, then I would start checking the file system of the
> source and also the hardware of that system.

That's a good idea. I just tried that and it copied no problem.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Paul Slootman-5
On Wed, 2016-10-12 at 08:36 +0200, Paul Slootman wrote:
> As always it's best to first upgrade to the current version (3.1.3)
> if at all possible, as there's always the chance that the cause of
> your problems has already been fixed.

Good call, but I believe I may have ruled this out. I didn't upgrade to
3.1.3, but both sides are running 3.1.1 protocol version 31 now. Same
problem. 

I think the key insight was in the strace log which showed the select()
call was timed out. If I knew what type of file descriptor it was being
fed, I might have a clue. It might have been a socket or something on
disk. I don't know.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Paul Slootman-5
On Wed 12 Oct 2016, Kip Warner wrote:
>
> I think the key insight was in the strace log which showed the select()
> call was timed out. If I knew what type of file descriptor it was being
> fed, I might have a clue. It might have been a socket or something on
> disk. I don't know.

You can use lsof -p $pid to show what files that process has opened.
On linux you can also use 'ls -l /proc/$pid/fd'.


Paul
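
A minimal sketch of the /proc variant, assuming a Linux system with /proc
mounted and a readlink command available. The helper name show_fds is made
up for illustration. It prints each open descriptor number next to what it
points at, which is enough to tell a pipe or socket from a file on disk.

```shell
# Hypothetical helper: print "fd -> target" for every descriptor of a PID.
# Linux only; it reads the symlinks under /proc/PID/fd.
show_fds() {
    pid=$1
    for link in /proc/"$pid"/fd/*; do
        # Sockets and pipes are dangling symlinks, so test with -L as well.
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

# Example: inspect the current shell itself.
show_fds "$$"
```

Inspecting another user's process this way normally needs root, the same
as with lsof.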


Re: rsync: connection unexpectedly closed

Dave Howorth
In reply to this post by Kip Warner
On 2016-10-10 17:24, Kip Warner wrote:
> Note that I have temporarily disabled timeouts and added extra
> verbosity. The transfer to the remote host via SSH works fine, up until
> it gets to a 30+ GB file (a VM image). It gets about 90+ percent of the
> way through, hangs, and then times out.

I have a similar but different problem. I make a regular download from a
site that always errors out on a particular large file. However, my
rsync error symptoms are different.

Unfortunately, the server admins seem to be the strong, silent types who
have repeatedly changed their minds about what they think is wrong and
who may or may not be attempting to solve the problem in isolation -
I've failed to get any meaningful communication going with them.
Fortunately, excluding the problem file is a reasonable workaround for me.

Have you tried excluding the problem file from the transfer?

One possibility is that the problem is not caused directly by rsync but
because of some underlying filesystem glitch. What OS & filesystems are
you using?

Cheers, Dave


Re: rsync: connection unexpectedly closed

Wayne Davison-2
In reply to this post by Kip Warner

On Wed, Oct 12, 2016 at 8:30 PM, Kip Warner <[hidden email]> wrote:
> I think the key insight was in the strace log which showed the select() call was timed out.

No, that's totally expected. While select() is waiting for I/O to arrive, it returns to rsync every 60 seconds to allow it to decide if it wants to continue waiting or do something else. You have to find the process that is exiting first to discover why the connection is closing. This assumes that it's not a network issue, where all of the programs get a closed-connection error.
..wayne..
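
One way to act on this, sketched under the assumption that each side was
traced with something like "strace -f -o FILE": filter the log down to the
lines where a process exits or is killed, so the periodic select() timeouts
drop out of view. The helper name show_exits is made up, and the sample log
is assembled from lines quoted in this thread purely for illustration.

```shell
# Hypothetical helper: show only the lines of an strace log where a traced
# process exits or is killed. The patterns cover exit_group() calls and the
# "+++ exited/killed" markers that newer strace versions print.
show_exits() {
    grep -E 'exit_group\(|\+\+\+ (exited|killed)' "$1"
}

# Sample log for illustration only, built from lines seen in this thread.
log=$(mktemp)
cat > "$log" <<'EOF'
3104  select(4, [3], [], [3], {60, 0})  = 0 (Timeout)
3104  select(4, [3], [], [3], {60, 0})  = 0 (Timeout)
30619 exit_group(19)                    = ?
EOF

show_exits "$log"
rm -f "$log"
```

Comparing the first such line on the client with the first on the server
(timestamps from strace make this easier) points at the process to
investigate.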


Re: rsync: connection unexpectedly closed

Kevin Korb
I don't remember whether or not you said you were running rsync over ssh
but if you are you can also debug the ssh layer.  You can even do it at
both ends....

On the server run a debugging sshd on an alternate port with:

    /usr/sbin/sshd -dDp222

(Note that this will only accept 1 connection, debug to the terminal,
then exit.)

Then on the client use rsync -e "ssh -vp222" to go verbose and use the
debugging port.

On 10/13/2016 08:05 PM, Kip Warner wrote:

> On Thu, 2016-10-13 at 16:58 -0700, Wayne Davison wrote:
>> No, that's totally expected. While select() is waiting for I/O to
>> arrive, it returns to rsync every 60 seconds to allow it to decide if
>> it wants to continue waiting or do something else. You have to find
>> the process that is exiting first to discover why the connection is
>> closing. This assumes that it's not a network issue, where all of the
>> programs get a closed-connection
>> error.
>
> Hey Wayne,
>
> I took Paul's advice and tried monitoring with lsof. This is what I
> saw:
>
>     $ lsof -p 3104
>     COMMAND  PID USER   FD   TYPE     DEVICE SIZE/OFF      NODE NAME
>     rsync   3104  kip  cwd    DIR       0,19     4096 214695958 /home/kip/Disk_Backups/kip-desktop/yakkety
>     rsync   3104  kip  rtd    DIR      179,2     4096         2 /
>     rsync   3104  kip  txt    REG      179,2   460584      7817 /usr/bin/rsync
>     rsync   3104  kip  mem    REG      179,2    42692      3974 /lib/arm-linux-gnueabihf/libnss_files-2.13.so
>     rsync   3104  kip  mem    REG      179,2    71624      3971 /lib/arm-linux-gnueabihf/libnsl-2.13.so
>     rsync   3104  kip  mem    REG      179,2    38608      3976 /lib/arm-linux-gnueabihf/libnss_nis-2.13.so
>     rsync   3104  kip  mem    REG      179,2    26484      3972 /lib/arm-linux-gnueabihf/libnss_compat-2.13.so
>     rsync   3104  kip  mem    REG      179,2  2953776     18967 /usr/lib/locale/locale-archive
>     rsync   3104  kip  mem    REG      179,2  1216624      3965 /lib/arm-linux-gnueabihf/libc-2.13.so
>     rsync   3104  kip  mem    REG      179,2   130448     41964 /lib/arm-linux-gnueabihf/libgcc_s.so.1
>     rsync   3104  kip  mem    REG      179,2    43024     10745 /lib/arm-linux-gnueabihf/libpopt.so.0.0.0
>     rsync   3104  kip  mem    REG      179,2    26240      1840 /lib/arm-linux-gnueabihf/libacl.so.1.1.0
>     rsync   3104  kip  mem    REG      179,2    17904      1854 /lib/arm-linux-gnueabihf/libattr.so.1.1.0
>     rsync   3104  kip  mem    REG      179,2    10170     28719 /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so
>     rsync   3104  kip  mem    REG      179,2   126236      3961 /lib/arm-linux-gnueabihf/ld-2.13.so
>     rsync   3104  kip    1w  FIFO       0,10      0t0      8455 pipe
>     rsync   3104  kip    2w  FIFO       0,10      0t0      8456 pipe
>     rsync   3104  kip    3u  unix 0xdbda86c0      0t0      8466 socket
>
> And on the strace...
>
>     ...
>     3104  select(4, [3], [1], [3], {60, 0}) = 1 (out [1], left {59, 999978})
>     3104  write(1, "h\340\0\7\0y@\312\227\241\236\255\367\312\3637\323X\206\314\250\372\362P1\202\374\"i\265c\16"..., 57452) = 57452
>     3104  select(4, [3], [], [3], {60, 0})  = 0 (Timeout)
>     (repeats above line until client (sender) reports connection timed out)
>
> Sender reports the following until it times out...
>
>     ...
>     match at 32949141504 last_match=32949141504 j=251382 len=131072 n=0
>              32.95G  96%   83.55MB/s    0:00:12
>
>
>
--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
        Kevin Korb Phone:    (407) 252-6853
        Systems Administrator Internet:
        FutureQuest, Inc. [hidden email]  (work)
        Orlando, Florida [hidden email] (personal)
        Web page: http://www.sanitarium.net/
        PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,



Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Dave Howorth
On Thu, 2016-10-13 at 10:09 +0100, Dave Howorth wrote:
> Have you tried excluding the problem file from the transfer?

Hey Dave. All the other files appear to sync, up until it gets to that
one large file. Then it stalls, and finally times out. I could tell it
to exclude that important file, but that would defeat the purpose of my
backup.

> One possibility is that the problem is not caused directly by rsync
> but  because of some underlying filesystem glitch. What OS &
> filesystems are you using?

That could well be, but how to know?

Client side:

    $ lsb_release -a
    LSB Version:        security-9.20160110ubuntu5-amd64:security-9.20160110ubuntu5-noarch
    Distributor ID:        Ubuntu
    Description:        Ubuntu 16.10
    Release:        16.10
    Codename:        yakkety

Server side:

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID:        Debian
    Description:        Debian GNU/Linux 7.11 (wheezy)
    Release:        7.11
    Codename:        wheezy

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Kevin Korb
On Thu, 2016-10-13 at 20:08 -0400, Kevin Korb wrote:
> I don't remember whether or not you said you were running rsync over
> ssh but if you are you can also debug the ssh layer.  You can even do
> it at both ends....

Yes, I'm doing this over an SSH tunnel.

> On the server run a debugging sshd on an alternate port with:
> /usr/sbin/sshd -dDp222
> (note that this will only accept 1 connection, debug to the terminal,
> then exit) then on the client use rsync -e "ssh -vp222" to go verbose
> and use the debugging port.

This is a great idea, except that I only have one port forwarded to the
server (beyond my control) and that port is used by the current SSH
session. I would need to spawn another one on some other, inaccessible
port.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Kevin Korb
On Thu, 2016-10-13 at 20:08 -0400, Kevin Korb wrote:
> I don't remember whether or not you said you were running rsync over
> ssh but if you are you can also debug the ssh layer.  You can even do
> it at both ends....
>
> On the server run a debugging sshd on an alternate port with:
> /usr/sbin/sshd -dDp222
> (note that this will only accept 1 connection, debug to the terminal,
> then exit) then on the client use rsync -e "ssh -vp222" to go verbose
> and use the debugging port.

Hey Kevin,

I managed to get a chance to try your suggestion. I noticed something
very interesting. On the server side, there are actually two rsync
processes that appear to be spawned by the client's connection.

With the debugging ssh daemon running on the server side, I notice that
long after the ssh server dies like so...

    ...
    debug1: session_by_channel: session 0 channel 0
    debug1: session_input_channel_req: session 0 req exec
    Read error from remote host <snip>: Connection reset by peer
    debug1: do_cleanup
    debug1: do_cleanup
    debug1: PAM: cleanup
    debug1: PAM: closing session
    Sessions still open, not unmounting
    debug1: PAM: deleting credentials

...and the client rsync dies...

    match at 32949010432 last_match=32949010432 j=251381 len=131072 n=0
    match at 32949141504 last_match=32949141504 j=251382 len=131072 n=0
             32.95G  96%   74.13MB/s    0:00:14  packet_write_wait: Connection to <snip> port 22223: Broken pipe

    rsync: [sender] write error: Broken pipe (32)
    rsync error: unexplained error (code 255) at io.c(820) [sender=3.1.1]
    [sender] _exit_cleanup(code=10, file=io.c, line=820): about to call exit(255)

...one of the server side rsync processes according to strace is still
alive...

    write(1, "h\340\0\7\310\376\\\233\227\241\236\255VC\31\351\323X\206\314F$\337\0321\202\374\"\320(d\271"..., 57452) = 57452
    select(4, [3], [], [3], {60, 0})        = 0 (Timeout)
    (last line repeats a couple hundred times)

...and apparently the second rsync server side process is actually
still performing reads and writes more than an hour after the
connection died...

    ...
    read(1, "\337\363\356\1^\26L\316\17\31izD\254\27\346\267\266H\343\223\v\357\252d'h\351\371\0ny"..., 262144) = 262144
    write(3, "\337\363\356\1^\26L\316\17\31izD\254\27\346\267\266H\343\223\v\357\252d'h\351\371\0ny"..., 262144) = 262144
    ...

This makes me wonder if this has something to do with memory buffers
for file I/O being exhausted on the server and this server side
activity long after the connection died is an effort to flush a file
buffer to disk. The file this always breaks on is 30+GB, but the server
hardware is a small embedded system.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Kip Warner
On Sat, 2016-10-15 at 19:54 -0700, Kip Warner wrote:

> ...and apparently the second rsync server side process is actually
> still performing reads and writes more than an hour after the
> connection died...
>
>     ...
>     read(1, "\337\363\356\1^\26L\316\17\31izD\254\27\346\267\266H\343\223\v\357\252d'h\351\371\0ny"..., 262144) = 262144
>     write(3, "\337\363\356\1^\26L\316\17\31izD\254\27\346\267\266H\343\223\v\357\252d'h\351\371\0ny"..., 262144) = 262144
>     ...

And it finally just terminated:

    ...
    select(5, [0], [4], [0], {60, 0})       = 2 (in [0], out [4], left {59, 999979})
    read(0, "\373\375\372\343\276M\337\374'\201\234o\306\34\313\255\274b\16\314\206\262\10\304\7{z-\316N\347\242"..., 16384) = 16384
    write(4, "B\0\0\trecv_files(home/kip/.Virtual"..., 160) = 160
    select(1, [0], [], [0], {60, 0})        = 1 (in [0], left {59, 999978})
    --- SIGUSR1 (User defined signal 1) @ 0 (0) ---
    rt_sigaction(SIGUSR1, {SIG_IGN, [], 0x4000000 /* SA_??? */}, NULL, 8) = 0
    rt_sigaction(SIGUSR2, {SIG_IGN, [], 0x4000000 /* SA_??? */}, NULL, 8) = 0
    close(1)                                = 0
    write(3, "\236\244\355{\356$\237\377\223\237\34\366\275\244\350\t\253\206<\266\3634\371\376\214\377\f\212\211|\310\263"..., 164033) = 164033
    close(3)                                = 0
    lstat64("home/kip/.VirtualBox/Machines/Woe64 8.1/.Woe64 8.1.vdi.1J1zb5", {st_mode=S_IFREG|0600, st_size=32949305537, ...}) = 0
    stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2875, ...}) = 0
    utimensat(AT_FDCWD, "home/kip/.VirtualBox/Machines/Woe64 8.1/.Woe64 8.1.vdi.1J1zb5", {UTIME_NOW, {0, 854158541}}, AT_SYMLINK_NOFOLLOW) = 0
    rename("home/kip/.VirtualBox/Machines/Woe64 8.1/.Woe64 8.1.vdi.1J1zb5", "home/kip/.VirtualBox/Machines/Woe64 8.1/Woe64 8.1.vdi") = 0
    gettimeofday({1476587334, 151625}, NULL) = 0
    select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
    gettimeofday({1476587334, 255616}, NULL) = 0
    exit_group(19)                          = ?
    Process 30619 detached

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Henri Shustak
In reply to this post by Kip Warner
>> Have you tried performing a copy to a known good local device?  If a
>> local copy fails, then I would start checking the file system of the
>> source and also the hardware of that system.
>
> That's a good idea. I just tried that and it copied no problem.

Do you have another system on the local network that you could try this transfer with via SSH?

Given that the local transfer works fine, I would suggest checking the hardware and file system integrity on the remote machine. In terms of hardware, checking the memory and disks would be a top priority.

Also, you could try moving the partial file out of the way, dropping the --partial option, and transferring again.

Hope that helps
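
A minimal sketch of the "move it out of the way" step, meant to be run on
the receiving machine before retrying. The helper name move_aside and the
".aside" suffix are made up for illustration. With the old copy renamed,
rsync has nothing to match against and should send the whole file again,
which helps rule out a bad existing copy as the trigger.

```shell
# Hypothetical helper: rename a file so the next run cannot use it as a
# starting point. The ".aside.<epoch seconds>" suffix is an arbitrary choice.
move_aside() {
    [ -e "$1" ] || return 0          # nothing to move
    mv -- "$1" "$1.aside.$(date +%s)"
}

# Example, using the path as it appears in the strace output earlier in
# the thread (relative to the backup destination directory):
#   move_aside "home/kip/.VirtualBox/Machines/Woe64 8.1/Woe64 8.1.vdi"
```

On the same file system this is only a rename, so it is quick even for a
30+ GB image, but the renamed copy still occupies disk space until it is
deleted.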

--------------------------------------------------------------------
HTRAX 2013 Revitalised : EGYPTIAN HUMP HTRAX : Direct URL download :
http://henri.shustak.org/download/htrax/egyptian-hump-htrax.mp3
"Dr Who Meets B52's" - More reviews : http://www.jessetaylor.com.au





Re: rsync: connection unexpectedly closed

Paul Slootman-5
Try the transfer without -z.

Paul
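
A sketch of what that could look like, based on the OPTS string from the
first message with -z and --compress-level removed. Since -a already
implies -r, --owner, --perms and --times, those are dropped too. The
--files-from, include/exclude and ssh options are left out for brevity,
RSYNC_BIN is a made-up override for previewing the command, and the
destination shown is a placeholder.

```shell
# Sketch only: build the argument list as positional parameters so each
# option stays a separate, properly quoted word.
backup_without_compression() {
    dest=$1
    set -- -a -vvv --itemize-changes --delete --delete-excluded \
           --human-readable --partial --progress --timeout=0 --stats
    # RSYNC_BIN is a hypothetical override; it defaults to the real rsync.
    "${RSYNC_BIN:-rsync}" "$@" / "$dest"
}

# Preview the command line with echo instead of contacting any host.
RSYNC_BIN=echo backup_without_compression "user@host:/backup/path"
```

If the transfer then gets past the large file, compression on the small
server becomes the main suspect.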


Re: rsync: connection unexpectedly closed

Bernd Hohmann
In reply to this post by Henri Shustak
On 18.10.2016 07:03, Kip Warner wrote:

> From what I can tell, there are no hardware problems. I also ran fsck
> on the drive. The machine seems to be fine.

I can confirm the problem.

Situation here: 2 identical HP Microservers (Debian 7, on site compiled
rsync 3.1.2, connected via OpenVPN).

SSH is used for transport.

Both machines have the correct date/time set via ntpd.

All files on Client/Server are rw and have the right owner and are
copyable on both sides.

The "directory to backup" is a Samba-share (I stopped nmbd and smbd, no
change). Client: 200GB, 42000 files total. Enough disk-space and memory
on both sides.

All rsync instances were killed (Client/Server) before starting rsync.

tcpdump shows me a NOP packet every 2 min.


I can provoke the error doing this:

1) Start the transfer (rsync scans *all* client files and starts sending
a file)

2) ^C rsync on client

3) "pkill rsync" on server until all rsync-processes are killed. Same on
client (just to be sure)

4) Start the transfer again, now rsync scans the top directories only
and hangs (see straces below).


Commandline:

    ./rsync-debug -v --archive --progress \
          --human-readable --delete-during \
          --rsync-path=/home/backup-hugo/bin/rsync-debug \
          /srv/backup-bernd [hidden email]:/srv/


Client says (PID 5909 = rsync, 5910 = ssh)
------------------------------------------------
[...]
5910  10:13:50 select(7, [3 4], [3], NULL, {240, 0}) = 1 (in [4], left {239, 999990})
5910  10:13:50 read(4, "2010_20120119093643.pdf\0\3740O
[...]
\242}\30:V0124160__Nr.036_vom_10.09.2010_2012011"..., 16384) = 3072

loop:
5910  10:13:50 select(7, [3 4], [3], NULL, {240, 0} <unfinished ...>
5909  10:14:51 <... select resumed> )   = 0 (Timeout)
5909  10:14:51 select(6, [5], [], [5], {60, 0}) = 0 (Timeout)
5909  10:15:51 select(6, [5], [], [5], {60, 0}) = 0 (Timeout)
5909  10:16:51 select(6, [5], [], [5], {60, 0} <unfinished ...>
5910  10:17:51 <... select resumed> )   = 0 (Timeout)
goto loop
------------------------------------------------

Server says (PID 10331 = rsync --server, 10332 = ssh)
------------------------------------------------
[...]
10331 10:13:50 lstat("backup-bernd/Schreibtisch", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/VirtualBox", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/projekte", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 lstat("backup-bernd/transfer", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10331 10:13:50 select(4, [3], [1], [3], {60, 0}) = 1 (out [1], left {59, 999991})
10331 10:13:50 write(1, "\4\0\0\7\3\20\0\0", 8) = 8
10331 10:13:50 select(4, [3], [], [3], {60, 0} <unfinished ...>
10332 10:14:50 <... select resumed> )   = 0 (Timeout)

loop:
10332 10:14:50 select(1, [0], [], [0], {60, 0} <unfinished ...>
10331 10:14:50 <... select resumed> )   = 0 (Timeout)
10331 10:14:50 select(4, [3], [], [3], {60, 0} <unfinished ...>
10332 10:15:50 <... select resumed> )   = 0 (Timeout)
goto loop
------------------------------------------------

--
Bernd Hohmann
Organisationsprogrammierer
Höhenstrasse 2 * 61130 Nidderau
Telefon: 06187/900495 * Telefax: 06187/900493
Blog: http://blog.harddiskcafe.de




Re: rsync: connection unexpectedly closed

devzero
What does lsof say? Does rsync hang on a specific file?

I would be surprised if this is an rsync problem, since, as you said, you killed all processes.

So on the second run rsync knows nothing from before...

roland


Re: rsync: connection unexpectedly closed

Bernd Hohmann
On 19.10.2016 21:46, [hidden email] wrote:

> what does lsof tell? does rsync hang on a specific file?

Nothing. I ruled this out already.

> i would wonder if this is a rsync problem. as you told you killed all
> processes.
> so, on the second run rsync knows nothing from before...

Maybe something stupid in the SSH(D) stack. I checked all running ssh
instances for orphans - nothing.

Current status quo of testing:

1) After restarting the client, rsync starts working synchronizing
changed/missing files on the server via vpn and runs into a timeout
after a couple of hours.

2) Restarting the backup (after pkill'ing rsync on both sides, just to
be sure) runs into a timeout and no attempt was made to rsync files.

3) Hint: If I remove a directory (13 Files, 50GB total), everything is fine.

Are there some magic Unix file attributes on ext4?

Bernd

--
Bernd Hohmann
Organisationsprogrammierer
Höhenstrasse 2 * 61130 Nidderau
Telefon: 06187/900495 * Telefax: 06187/900493
Blog: http://blog.harddiskcafe.de



Re: rsync: connection unexpectedly closed

Kip Warner
In reply to this post by Paul Slootman-5
On Tue, 2016-10-18 at 08:36 +0200, Paul Slootman wrote:
> Try the transfer without -z.
>
> Paul

I ended up giving up. What I did was I just removed the 30GB file
(which I really didn't need anyways) and the transfer carried on
without a hitch.

--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

Re: rsync: connection unexpectedly closed

Bernd Hohmann
On 20.10.2016 03:24, Kip Warner wrote:

> I ended up giving up.

Me too.

I'm copying all files via 'scp' now - takes 3 days but no aborts or errors.

So I am very sure the problem is somewhere in rsync.

Bernd

--
Bernd Hohmann
Organisationsprogrammierer
Höhenstrasse 2 * 61130 Nidderau
Telefon: 06187/900495 * Telefax: 06187/900493
Blog: http://blog.harddiskcafe.de

