[Bug 11656] New: Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

            Bug ID: 11656
           Summary: Escaping broken with --files-from
           Product: rsync
           Version: 3.1.1
          Hardware: x64
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: core
          Assignee: [hidden email]
          Reporter: [hidden email]
        QA Contact: [hidden email]

The escaping mechanism in --files-from is broken when a file name contains a
newline. The problem is that a filename 'foo\nbar' gets written as
foo\#012bar by --out-format="%n" but gets transformed into foo\#134#012bar when
passing through the --files-from directive.

On my system
LC_CTYPE=en_US.UTF-8
LANG=en_US.UTF-8

More generally, it would be great to have consistent escaping in the output of
--out-format, for example through an option like

ls --quoting-style=<style>

which would also deal with names that contain spaces. Lots of people have
problems with that (will post links to serverfault questions when I find them
again...)

EXAMPLE:

Make directories 'src' and 'dst' and in 'src' create the file 'foo\nbar'
$ mkdir src; mkdir dst; touch src/"$(echo -e 'foo\nbar')"

Suppose that I want to create a list of files that would copy the file
'foo\nbar' from src to dst via the command
$ rsync -v --files-from=filelist src/ dst

There seems to be no way of doing so. In particular, one would expect the
file list generated by the option --out-format="%n" to be usable as that list:
$ rsync -n --out-format='%n' src/* dst
foo\#012bar

But the following happens

$ rsync -n --out-format='%n' src/* dst | rsync -v --files-from=- src/ dst
building file list ...
rsync: link_stat "/home/guraltsev/test/src/foo\#134#012bar" failed: No such
file or directory (2)
done

sent 16 bytes  received 12 bytes  56.00 bytes/sec
total size is 0  speedup is 0.00
rsync error: some files/attrs were not transferred (see previous errors) (code
23) at main.c(1165) [sender=3.1.1]

--
You are receiving this mail because:
You are the QA Contact for the bug.

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #1 from Kevin Korb <[hidden email]> ---
This is what --from0 is for.

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #2 from Gennady Uraltsev <[hidden email]> ---
Actually this doesn't help.

$ mkdir src; mkdir dst; touch src/"$(echo -e 'foo\nbar')"

$ rsync -n --out-format='%n' src/* dst/ | tr '\n' '\0' |
rsync -v --from0 --files-from=- src/ dst

still fails completely. The problem is that the escaped string
foo\#012bar gets all mangled up. So I think that --from0 doesn't exactly solve
the problem.

The only way to solve the problem is

$ rsync -n --out-format='%n' src/* dst/| tr '\n' '\0'| sed 's/\\#012/\n/' |
rsync -v --from0 --files-from=- src/ dst

but maybe we could agree that this is needlessly complicated. A more consistent
way of dealing with this would be great especially bearing in mind that this
failure is not documented...
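
A more general form of the sed step can be sketched in Python. This is only a sketch under assumptions: the name listing_to_from0 is made up for illustration, the input is taken to be the output of --out-format='%n' with one escaped name per line, and every backslash-hash-three-octal-digit sequence is taken to stand for exactly one byte.

```python
import re

# Backslash, hash, then three octal digits that encode one byte (000-377).
_ESCAPE = re.compile(rb"\\#([0-3][0-7]{2})")


def listing_to_from0(listing: bytes) -> bytes:
    """Convert newline-separated escaped names into NUL-terminated raw names."""
    out = []
    for line in listing.split(b"\n"):
        if not line:
            continue  # skip empty pieces, including the one after the final newline
        # Replace each escape sequence with the single byte it encodes.
        raw = _ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), line)
        out.append(raw + b"\0")
    return b"".join(out)
```

The returned bytes have the shape that --from0 --files-from=- reads, so a filter built on this function could stand in for the tr and sed stages of the pipeline above.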

Finally it is still completely unexpected that
foo\#012bar
gets changed into
foo\#134#012bar

Anyway the problem with space delimited strings is more as follows. Imagine I
want to parse the log file generated with --out-format='%n %h %M %C' in a
reliable way by some external program. Without consistent and documented
escaping there seems to be no way to do this.
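
One way an external program might approach such a line is to split it from the right, so that spaces inside the file name do not matter. The sketch below assumes that the %h, %M and %C fields never contain a space. That is an assumption, not something established in this thread. LogEntry and parse_log_line are hypothetical names.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    name: bytes      # %n, still in rsync's escaped form
    host: bytes      # %h
    mtime: bytes     # %M
    checksum: bytes  # %C


def parse_log_line(line: bytes) -> LogEntry:
    """Split one '%n %h %M %C' line, peeling the last three fields off the right."""
    # Assumption: only the name may contain spaces, so rsplit with a limit of 3
    # leaves everything before the last three separators in the name field.
    name, host, mtime, checksum = line.rstrip(b"\n").rsplit(b" ", 3)
    return LogEntry(name, host, mtime, checksum)
```

The name field is returned exactly as logged, so any escape sequences in it still have to be decoded separately.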

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #3 from Kevin Korb <[hidden email]> ---
I am not sure what exactly the point of using an rsync -n to feed an rsync
--files-from would be.  The --files-from option is really designed to be fed
from find which has a -print0 option which will format things correctly for
--from0.
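
For readers who build the list from a script rather than from find, a rough Python counterpart of find with -print0 might look like this. The function name nul_file_list is invented for the example. It emits paths relative to the source directory, since that is how --files-from interprets them.

```python
import os


def nul_file_list(src: bytes) -> bytes:
    """Return every non-directory entry under src, relative to src, NUL-terminated."""
    out = []
    # Passing bytes to os.walk makes it yield bytes, so names survive unchanged
    # even when they contain newlines or are not valid UTF-8.
    for dirpath, _dirnames, filenames in os.walk(src):
        for fname in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, fname), src)
            out.append(rel + b"\0")
    return b"".join(out)
```

The result could be written to the standard input of rsync -v --from0 --files-from=- src/ dst. No escaping is involved at any point because the names are never printed in human-readable form.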

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #4 from Gennady Uraltsev <[hidden email]> ---
Well, imagine a poor man's replacement for batch files. We want to generate a
list of operations, maybe edit it by hand (a batch file is binary...) and then
feed it back to rsync. Or maybe do a dry run, look at it, and then just
selectively remove some files. There are many use cases.

Apart from that I argue that the inconsistencies in interpreting escape
sequences merit fixing. Ok, maybe it is not top priority but there still is no
reason why foo\#012bar becomes foo\#134#012bar.

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #5 from Kevin Korb <[hidden email]> ---
I would say that if your goal is to make an editable list to be run through
rsync later you would be a lot better off with an --itemize-changes list and a
script to reformat it after editing.  I don't know about you but I would hate
to have to edit a null-terminated text file and I would hate to have to go
look up why a file is in the list without the --itemize-changes output.

Anyway, I think I am done commenting here and will leave this for Wayne to
decide if this is really a bug or a use case problem.

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #6 from Gennady Uraltsev <[hidden email]> ---
I hope I am not upsetting anyone. Maybe I wasn't clear:
--itemize-changes is half the problem. Maybe I should post another bug.
In the situation I described

$ rsync -n --itemize-changes -a src/* dst/ gives:
>f..t...... foo\#012bar

with the new line being escaped in a weird way that cannot be effectively fed
back into any kind of other program, not even rsync itself!

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #7 from Gennady Uraltsev <[hidden email]> ---
Furthermore consider this test case:
in addition to what we did before create the file with the actual name
aaa\#012bbb by doing

touch 'src/aaa\#012bbb'

then

$ rsync -n --itemize-changes -a src/* dst/
>f+++++++++ aaa\#134#012bbb
>f..t...... foo\#012bar

where the escaping of aaa\#012bbb in the --itemize-changes output is completely
absurd!

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #8 from Kevin Korb <[hidden email]> ---
I was not offended.  I was just trying to establish your use case and offer
possible alternative methods of accomplishing it while not actually being an
rsync dev.

Wayne is really the only person who can say "Yep, that's a bug" or "Nope, that
is how I want it to work."

[Bug 11656] Escaping broken with --files-from

samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=11656

--- Comment #9 from Gennady Uraltsev <[hidden email]> ---
I looked through the source code and it seems that whatever is going wrong
happens in the function

static void filtered_fwrite in log.c

in particular the line
#134
fprintf(f, "\\#%03o", *(uchar*)s);

is suspicious. The problem is that #134 is the octal code for the backslash
character.
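
The outputs seen in this thread are consistent with a simple rule, modelled below in Python. This is a hypothetical model inferred from those outputs, not a transcription of log.c. It assumes that control bytes below 0x20 other than tab are escaped, that a backslash is escaped only when a hash and three digits follow it, and that high-bit bytes are left alone.

```python
import re

# Hypothetical model of the output escaping, inferred from this thread.
# Escaped: control bytes below 0x20 except tab (assumption), and a backslash
# that is immediately followed by '#' and three decimal digits.
_NEEDS_ESCAPE = re.compile(rb"[\x00-\x08\x0a-\x1f]|\\(?=#[0-9]{3})")


def escape_name(name: bytes) -> bytes:
    """Return name as the outputs in this thread suggest rsync would print it."""
    # Each matched byte becomes backslash, hash, and its value as 3 octal digits.
    return _NEEDS_ESCAPE.sub(lambda m: b"\\#%03o" % m.group(0)[0], name)


print(escape_name(b"foo\nbar").decode())      # foo\#012bar
print(escape_name(b"aaa\\#012bbb").decode())  # aaa\#134#012bbb
```

Under this model the \#134 appears only when the name already holds a literal backslash followed by #012. That is the situation when the escaped text foo\#012bar is itself read back as a file name.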

[Bug 11656] Escaping broken with --files-from

Samba - rsync mailing list
https://bugzilla.samba.org/show_bug.cgi?id=11656

Wayne Davison <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #10 from Wayne Davison <[hidden email]> ---
The consistent and documented escaping is that some characters get output with a
backslash+hash+3-digit octal number. This includes control chars, some
backslashes, and (without -8) high-bit chars.  From the man page:

"The escape idiom that started in 2.6.7 is to output a literal backslash (\)
and a hash (#), followed by exactly 3 octal digits.  For example, a newline
would output as "\#012".  A literal backslash that is in a filename is not
escaped unless it is followed by a hash and 3 digits (0-9)."

One easy way to unescape is thus to filter names through something like this:

    perl -pe 's/\\#(\d\d\d)/chr(oct($1))/eg'

(...after any necessary parsing of the output to find the names or twiddle
newlines into nulls).  You'll note that this only matches exactly 3 digits, as
rsync will leave something like "\#5" alone in the output, since it cannot be
confused with an actual escaped char.
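
The same unescaping can be sketched in Python for scripts that handle names as raw bytes. The function name unescape_name is made up for the example. Unlike the Perl pattern, this version only accepts octal values from 000 to 377, on the assumption that each sequence encodes a single byte.

```python
import re

# Backslash, hash, then three octal digits that encode one byte (000-377).
_ESCAPE = re.compile(rb"\\#([0-3][0-7]{2})")


def unescape_name(escaped: bytes) -> bytes:
    """Turn each backslash-hash-octal sequence back into the byte it stands for."""
    # re.sub scans left to right and never rescans its own output, so the
    # backslash produced from \#134 cannot start a new escape sequence.
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), escaped)
```

With this function, foo\#012bar decodes to the name containing a real newline, aaa\#134#012bbb decodes to the literal name aaa\#012bbb, and a name containing \#5 passes through unchanged.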
