Options for a "I'm done" flag file

Options for a "I'm done" flag file

Simon Hobson-2
As part of my backup system, I use Rsync to keep a copy of each server on one central backup server. This backup server then uses StoreBackup to keep multiple iterations of each clone directory.
So that the StoreBackup archives don't keep adding "redundant" and misleading backups, I update a flag file with the current date/time before doing the Rsync update, and test to see if this file is newer than the one in the latest StoreBackup backup. If it isn't, then I skip the StoreBackup for that server.

The end result is that if a system is down or out of communication (one or two are at sites that can be offline for days), the list of backups in StoreBackup reflects that. E.g., if the system did a sync on the 1st but not on the 2nd-5th, there will be no backups for the 2nd-5th, and when looking later I won't be "fooled" into thinking that I have a backup from (say) the 4th.
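
For illustration, the server-side check described above might look roughly like the following. This is only a sketch under assumptions: the function name needs_backup and both flag paths are hypothetical, and the -nt operator is a common extension of test (bash, dash, ksh) rather than strict POSIX.

```shell
#!/bin/sh
# Sketch only: needs_backup and its arguments are illustrative names,
# not part of rsync or StoreBackup.
#
# needs_backup CLONE_FLAG LAST_BACKUP_FLAG
#   CLONE_FLAG        flag file inside the rsync'd clone directory
#   LAST_BACKUP_FLAG  the same flag file inside the latest backup
# Returns 0 when a new backup run is warranted, 1 otherwise.
needs_backup() {
    clone_flag=$1
    last_flag=$2

    # No flag in the clone: the client has never synced, so skip.
    [ -f "$clone_flag" ] || return 1

    # No flag in the latest backup (or no backup yet): back up.
    [ -f "$last_flag" ] || return 0

    # Back up only if the clone's flag has a newer modification time.
    [ "$clone_flag" -nt "$last_flag" ]
}

# Possible use (paths are placeholders):
# if needs_backup /clones/host1/flagfile /backups/host1/latest/flagfile; then
#     : run the backup for host1 here
# fi
```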

Where this breaks down is if the sync fails part way through. The flag file has already been synced, so I end up with multiple backups which aren't actually complete.
I actually have this problem at the moment. I have just put a small system on a customer site; it has a database that creates 1GB journal files (not that it handles anything like that volume of data), and at the moment their connectivity is a bit flaky.

My first thought was "do the flag file last", but a quick search confirms what I thought - that there isn't an option for this.

So, does anyone have any suggestions for how I might reasonably easily let my script see whether the previous sync completed?


--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: Options for a "I'm done" flag file

Michael Johnson - MJ

rsync -av /src/ /dst/ && touch /dst/done

That should do it, as the touch only happens if rsync exits with a code of 0. If you need to treat other non-zero exit codes as acceptable, it is still doable, just with a bit more shell code.

There are surely other options as well, but this is probably the simplest.
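
As a rough sketch of the "bit more shell code" variant: the function below touches the flag for exit code 0 and, as an illustrative choice only, also for 24, which the rsync man page lists as "Partial transfer due to vanished source files". The function name sync_and_flag and the RSYNC_CMD variable are hypothetical; which codes count as "good enough" is a policy decision for each setup.

```shell
#!/bin/sh
# Sketch only. RSYNC_CMD is an assumption that allows the command to be
# replaced (for example by a stub when testing); it defaults to rsync.
RSYNC_CMD=${RSYNC_CMD:-rsync}

# sync_and_flag SRC DST FLAG
# Runs rsync and touches FLAG only for exit codes treated as success.
sync_and_flag() {
    src=$1
    dst=$2
    flag=$3

    "$RSYNC_CMD" -av "$src" "$dst"
    rc=$?

    case $rc in
        0|24)
            # 0  = success
            # 24 = some source files vanished during the transfer
            touch "$flag"
            return 0
            ;;
        *)
            echo "rsync exited with $rc; leaving $flag untouched" >&2
            return "$rc"
            ;;
    esac
}

# Possible use:
# sync_and_flag /src/ /dst/ /dst/done
```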


Re: Options for a "I'm done" flag file

Simon Hobson-2
Michael Johnson - MJ <[hidden email]> wrote:

> rsync -av /src/ /dst/ && touch /dst/done

Aaaaahhhh, knew I'd miss some detail.
All the syncs are pushed to the backup server.

But that does give me an idea. I guess I could do that on the source, then sync the flag file over.
rsync -avH ${other_gubbins} / [hidden email]:/dest/ &&
  touch /etc/donefile &&
  rsync -av ${some_other_gubbins} /etc/donefile [hidden email]:/dest/

That (or some variation of it) could work.

Re: Options for a "I'm done" flag file

Lenz Weber
Or

rsync -avH ${all_gubbins} / [hidden email]:/dest/ && ssh [hidden email] touch /etc/donefile

so your client touches a file on your server (that sounds so wrong...)


Re: Options for a "I'm done" flag file

Simon Hobson-2
Lorenz Weber <[hidden email]> wrote:

> rsync -avH ${all_gubbins} / [hidden email]:/dest/ && ssh [hidden email] touch /etc/donefile

No SSH access between them, only rsync. Besides, it would add the overhead of managing ssh access (users and keys) as well as Rsync.

Re: Options for a "I'm done" flag file

Simon Hobson-2
As an aside to this, part of the problem I've been having is the transfer timing out/getting interrupted during a particular large file (1G, new file, 2-3 hours if it works).

So I've been experimenting with --partial and --partial-dir=.rsync-partial which weren't working. It appears to work at first - if the transfer is interrupted, the partial file is correctly saved in the named directory.
Then if I run the script again, it deletes the partial file before starting again.

I found that I needed to also specify --delete-delay to avoid deleting the partial file before it's used.

Is this "known"? It isn't implied (as I read it) by the --partial-dir section in the man page.
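
A minimal sketch of the option combination reported above, for anyone wanting to try it. It only reflects the behaviour described in this post and has not been checked against other rsync versions. The function name resumable_push and the RSYNC_CMD variable are hypothetical. Per the rsync man page, --delete-delay implies --delete.

```shell
#!/bin/sh
# Sketch only. RSYNC_CMD is an assumption that allows the command to be
# replaced (for example by a stub when testing); it defaults to rsync.
RSYNC_CMD=${RSYNC_CMD:-rsync}

# resumable_push SRC DEST
# --partial-dir=.rsync-partial keeps an interrupted file in that
#   directory on the receiving side so a later run can reuse it.
# --delete-delay works out deletions during the transfer but applies
#   them only after it completes; in the report above this stopped the
#   partial file from being deleted before it could be used.
resumable_push() {
    "$RSYNC_CMD" -avH \
        --partial-dir=.rsync-partial \
        --delete-delay \
        "$1" "$2"
}

# Possible use (destination is a placeholder):
# resumable_push / backuphost::module/
```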

Using external filter on the server

Mark
In reply to this post by Simon Hobson-2
Searching the mail list archives, I found this old link
[https://bugzilla.samba.org/show_bug.cgi?id=2423] with an example of how
to use an external script with filter to select files by date.

It looks like it could do exactly what I need, which is only match files
less than 3 days old, but it doesn't seem to work.

I have added the filter to the rsyncd.conf on the source server, which
receives the native client connection over a network.

        filter='-!| newer-filter mm/dd/yy'

This is the script as suggested.
From the command line it prints a string of 0s and 1s when I pipe an 'ls'
of the folder to it, so it seems OK.

--------------------------------------------------------------
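#!/bin/bash
# Reads file names on stdin, one per line, and prints 1 for each file
# newer than the cutoff date given as $1, otherwise 0.
# We convert times to seconds since the Epoch so
# we can compare them with [ -gt ].
# Requires GNU date (-d) and GNU stat (--format).
cutoff_time=$(date +%s -d "$1")
while IFS='' read -r fname; do
     # Compare file's mtime to cutoff time; quote the name so that
     # spaces in file names do not break the test
     if [ "$(stat --format=%Y "$fname")" -gt "$cutoff_time" ]; then
         echo -n 1
     else
         echo -n 0
     fi
done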
--------------------------------------------------------------

However, the rsync log output shows that the desired files are parsed, but
none are ever selected for sending.


Does anyone have a working example of using an external script as a filter
source?


Thanks in advance,
Mark.

Re: Options for a "I'm done" flag file

Mark
In reply to this post by Simon Hobson-2
You could try increasing the timeout rather than resuming.
rsync will tolerate quite long network dropouts and still carry on.
I have managed to keep an internet transfer of up to 100Gb alive for two
weeks.

I didn't find --partial to be much use for very large scale transfers
due to the very CPU-intensive checksum process.

By large scale, I mean that I have rsync'd several petabytes of backup
files of up to 500Gb each over the last five years with good success.



Re: Options for a "I'm done" flag file

Mark
In reply to this post by Simon Hobson-2
For a push job:

Run the rsync for the files.
If the exit code is 0, create the flag file and then rsync just that
file on its own.
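
The steps above can be sketched as a small function. This is an illustration under assumptions, not a tested backup script: the name push_then_flag, the RSYNC_CMD variable and the example paths are hypothetical, and any extra rsync options would need to be added to suit the setup.

```shell
#!/bin/sh
# Sketch only. RSYNC_CMD is an assumption that allows the command to be
# replaced (for example by a stub when testing); it defaults to rsync.
RSYNC_CMD=${RSYNC_CMD:-rsync}

# push_then_flag SRC DEST FLAG
#   SRC   source directory to push
#   DEST  rsync destination
#   FLAG  local path of the flag file
push_then_flag() {
    src=$1
    dest=$2
    flag=$3

    # Step 1: push the real data; give up on any non-zero exit code.
    "$RSYNC_CMD" -avH "$src" "$dest" || return 1

    # Step 2: write the completion time into the flag file, which also
    # updates its modification time.
    date > "$flag" || return 1

    # Step 3: send only the flag file, so it arrives after the data.
    "$RSYNC_CMD" -av "$flag" "$dest"
}

# Possible use (destination is a placeholder):
# push_then_flag / backuphost::module/ /etc/donefile
```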





Re: Using external filter on the server

Wayne Davison-2
In reply to this post by Mark
On Wed, Apr 29, 2015 at 2:52 AM, Mark <[hidden email]> wrote:
> I have added the filter to the rsyncd.conf on the source server, which receives the native client connection over a network.
>
>         filter='-!| newer-filter mm/dd/yy'

That was a suggested syntax for a feature that nobody implemented. 

..wayne..

Re: Using external filter on the server

Mark

Thanks, I thought that might be the case.

Instead, I'm working on a patch to "passthru" parameters to the server
side to use in a pre-xfer exec script so I can build a configurable
filter list to achieve something similar.

Much more realistic for my limited programming skills.

After updating some servers to 3.1.1 I'm now busy fixing issues caused
by the log output format changes from the 2.5x version breaking all my
report scripts.
