"Total File Size" Statistic counts each instance of hard linked files

Chris Deigan
Hi,

This is a question seeking clarification of intended behaviour. Right
now, rsync reports a statistic of "Total file size". This represents "the
total sum of all file sizes in the transfer" (as described in the man page).

A case I've hit in using this statistic is that it counts each instance of a
file even when it has multiple hard links. We are using --hard-links to
preserve hard links on the destination.

As a result we get a statistic of, for instance, 2TB when the actual sum on
disk (counted with du, using the default behaviour of counting hard linked
files only once) is only around 80GB.
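
The gap between the two numbers can be reproduced with a short Python sketch. This is not rsync's or du's code, only an illustration: the hypothetical helper tree_sizes() walks a tree and returns one sum that counts every file name (the behaviour described above for "Total file size") and one that counts each inode once, keyed on (st_dev, st_ino). It assumes apparent sizes (st_size) are good enough; du by default reports allocated disk blocks, so its figures will differ somewhat.

```python
import os
import stat


def tree_sizes(root):
    """Return (per_name_bytes, per_inode_bytes) for regular files under root.

    per_name_bytes adds the size of every directory entry, so a file with
    N hard links inside the tree is counted N times.
    per_inode_bytes counts each (device, inode) pair only once.
    """
    per_name = 0
    per_inode = 0
    seen = set()  # (st_dev, st_ino) pairs already counted

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, devices, etc.
            per_name += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                per_inode += st.st_size

    return per_name, per_inode
```

Calling tree_sizes() on a backup destination made with --hard-links should show the first value well above the second whenever many names share the same inode.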

I'm using the statistic for generating backup disk usage numbers that
eventually become billing data, so this has generated a few surprise cases.

There are a few alternatives for my use-case, but I was wondering if
counting hard links multiple times is actually correct behaviour?

My feeling is no, but this consideration isn't apparent in the source or
docs that I've read. Appreciate any comments, particularly from the project
maintainers.

Thanks,
Chris


--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html