Rsync hung when applying batch file

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Rsync hung when applying batch file

Kolya
Hello.

On rsync, version 3.0.9 got this trouble, rsync hung when applying batch file:
Sync all files from batch to asr_billing
install:   - skelet/cfg/     
install:   - skelet/etc/     
install: deleting usr/lib/python2.6/site-packages/IPython/external/Itpl.pyc - deleting     
install: 100% 9.83MB/s - 0     
install: 100% 51.39kB/s - usr/lib/python2.6/site-packages/dbfpy/utils.pyc     
install: 100% 86.10kB/s - usr/lib/python2.6/site-packages/django/contrib/admin/widgets     
install: 100% 25.41kB/s - usr/lib/python2.6/site-packages/django/contrib/auth/views.py     
install: 100% 5.99kB/s - usr/lib/python2.6/site-packages/django/contrib/contenttypes/     
install: 100% 2.66kB/s - usr/lib/python2.6/site-packages/django/contrib/formtools/wiz     
install: 100% 4.77kB/s - usr/lib/python2.6/site-packages/django/contrib/sessions/back     
install: 100% 16.92kB/s - usr/lib/python2.6/site-packages/django/core/management/valid     
install: 100% 3.54kB/s - usr/lib/python2.6/site-packages/johnny/backends/memcached.py     
install: 100% 45.51kB/s - usr/lib/python2.6/site-packages/openpyxl/comments/comments.p     
install:   - usr/lib/python2.6/site-packages/phonenumbers/data/     
install: 100% 73.82kB/s - usr/lib/python2.6/site-packages/pytz/tzinfo.pyc     
install: 100% 4.21kB/s - usr/lib/python2.6/site-packages/requests/packages/__init__.p     
install: 100% 9.61kB/s - usr/lib/python2.6/site-packages/suds/umx/typed.pyc     
install: 100% 4.13kB/s - usr/lib/python2.6/site-packages/unittest2/util.pyc     
WARNING: usr/lib64/python2.6/locale.pyc failed verification -- update discarded (may try again).
WARNING: usr/lib64/python2.6/optparse.pyc failed verification -- update discarded (may try again).
WARNING: usr/lib64/python2.6/urllib2.pyc failed verification -- update discarded (may try again).
install: 100% 23.09kB/s - usr/lib64/python2.6/curses/wrapper.pyc     
install: 100% 16.61kB/s - usr/lib64/python2.6/json/scanner.pyc     
install: 100% 42.08kB/s - usr/lib64/python2.6/multiprocessing/util.pyc   

and rsync freeze here, forewer (wait 2-3 hours). 

This two rsync process (from pstree) with arguments:
  │   │               └─rsync,2142 --filter=._- -z --compress-level=9 --block-size=40507 -c --port=873 --progress --delete -a --read-batch=/mnt/var/cache/carbon_update//asr_billing/devel/update.batch /mnt/apptmp/asr_billing/
  │   │                   └─rsync,2145 --filter=._- -z --compress-level=9 --block-size=40507 -c --port=873 --progress --delete -a --read-batch=/mnt/var/cache/carbon_update//asr_billing/devel/update.batch /mnt/apptmp/asr_billing/


File descriptions from proc:
# ll /proc/2145/fd
итого 0
lr-x------ 1 root root 64 Июн 29 03:00 0 -> /mnt/var/cache/carbon_update/asr_billing/devel/update.batch
l-wx------ 1 root root 64 Июн 29 03:00 1 -> /tmp/31178_asr_billing.tmp
lrwx------ 1 root root 64 Июн 29 03:00 2 -> /dev/pts/0
lrwx------ 1 root root 64 Июн 29 03:00 3 -> socket:[612268397]
lrwx------ 1 root root 64 Июн 29 03:00 6 -> socket:[612268400]
# ll /proc/2142/fd
итого 0
l-wx------ 1 root root 64 Июн 29 03:00 1 -> /tmp/31178_asr_billing.tmp
lrwx------ 1 root root 64 Июн 29 03:00 2 -> /dev/pts/0
lrwx------ 1 root root 64 Июн 29 03:00 3 -> socket:[612268397]
lrwx------ 1 root root 64 Июн 29 03:00 4 -> socket:[612268398]
lrwx------ 1 root root 64 Июн 29 03:00 5 -> socket:[612268399]



Strace: 
# strace -Ff -p 2142 -p 2145
Process 2142 attached - interrupt to quit
Process 2145 attached - interrupt to quit
[pid  2145] select(4, [3], [], NULL, {14, 509436} <unfinished ...>
[pid  2142] select(6, [5], [], NULL, {45, 724560}


 <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>
[pid  2142] <... select resumed> )      = 0 (Timeout)
[pid  2142] select(6, [5], [], NULL, {60, 0} <unfinished ...>
[pid  2145] <... select resumed> )      = 0 (Timeout)
[pid  2145] select(4, [3], [], NULL, {60, 0} <unfinished ...>



Stack from gdb:
(gdb) backtrace 
#0  0x00007f14cf42c243 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x00000000004277a3 in read_timeout (fd=5, buf=0xcbbbd0 "\004", len=8184) at io.c:662
#2  0x000000000042822e in readfd_unbuffered (fd=5, buf=0x7fffac015840 "\201i", len=4) at io.c:1020
#3  0x00000000004290c3 in readfd (fd=5, buffer=0x7fffac015840 "\201i", N=4) at io.c:1163
#4  0x00000000004294b7 in read_msg_fd () at io.c:346
#5  0x000000000041250f in generate_files (f_out=4, local_name=<value optimized out>) at generator.c:2366
#6  0x000000000041d24c in do_recv (f_in=<value optimized out>, f_out=4, local_name=0x0) at main.c:843
#7  0x000000000041d4a2 in client_run (f_in=0, f_out=4, pid=-1, argc=<value optimized out>, argv=0xcba240) at main.c:1080
#8  0x000000000041e2fd in start_client (argc=1, argv=0xcba240) at main.c:1287
#9  main (argc=1, argv=0xcba240) at main.c:1514

(gdb) backtrace 
#0  0x00007f14cf42c243 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x00000000004277a3 in read_timeout (fd=3, buf=0x7fffac012f00 "\300O\001\254\377\177", len=4) at io.c:662
#2  0x0000000000427cd4 in readfd_unbuffered (fd=3, buf=0x7fffac012f00 "\300O\001\254\377\177", len=4) at io.c:1017
#3  0x00000000004290c3 in readfd (fd=3, buffer=0x7fffac012f00 "\300O\001\254\377\177", N=4) at io.c:1163
#4  0x0000000000429a71 in read_int (f=<value optimized out>) at io.c:1191
#5  0x0000000000412d96 in gen_wants_ndx (desired_ndx=27012) at receiver.c:422
#6  0x00000000004145fd in recv_files (f_in=0, local_name=0x0) at receiver.c:481
#7  0x000000000041d150 in do_recv (f_in=0, f_out=4, local_name=0x0) at main.c:787
#8  0x000000000041d4a2 in client_run (f_in=0, f_out=4, pid=-1, argc=<value optimized out>, argv=0xcba240) at main.c:1080
#9  0x000000000041e2fd in start_client (argc=1, argv=0xcba240) at main.c:1287
#10 main (argc=1, argv=0xcba240) at main.c:1514


So, two process try get messages from each other, forewer.
I try trace process 2145 (other process in generate_files can set ignore_timeout = 1; inside generate_files(), i thoght, but i faul deattach from 2145 and process closed)

(gdb) 
check_timeout () at io.c:124
124	in io.c
(gdb) print io_timeout
$11 = 0
(gdb) print ignore_timeout
$2 = 0


So, timeout newer happen.

My solution is use --timeout parameter (now, always) in rsync. But i think, this is bug, when rsync hung on applying batch file.

I try, but unable to find other solution to fix this, so i will be glad if you pay attention to it.

Thanks.
-- 
Спасибо, что используете продукты компании Carbon Soft.

Успехов и процветания!
Крать Николай.
Отдел разработки компании «Carbon Soft».
--------------------------------------------
Тел.: +7 (495) 668-12-00, +7 (495) 668-07-88
E-mail: [hidden email]
ICQ: 672359629
http://www.carbonsoft.ru

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html