rsync script for snapshot backups


Dennis Steinkamp
Hey guys,

I tried to create a simple rsync script that creates daily backups from a ZFS storage box and puts them into a timestamped folder.
After creating the initial full backup, the following backups should only contain "new data"; the rest will be referenced via hardlinks (--link-dest).

This seemed like a simple enough scenario to achieve even with my pathetic scripting skills. This is what I came up with:

#!/bin/sh

# rsync copy script for rsync pull from FreeNAS to BackupNAS for Buero dataset

# Set variables
EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`

# Copy previous timefile to timeold.txt if it exists
if [ -f "/volume1/rsync/Buero/timenow.txt" ]
then
    yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt
fi
# Create current timefile
echo `date +"%d-%m-%Y-%H%M"` > /volume1/rsync/Buero/timenow.txt
# rsync command
if [ -f "/volume1/rsync/Buero/timeold.txt" ]
then
    rsync -aqzh \
    --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
    --log-file=/volume1/Backup_Test/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
    --link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Buero/timeold.txt` \
    Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
else
    rsync -aqzh \
    --delete --stats --exclude-from=/volume1/rsync/Buero/exclude.txt \
    --log-file=/volume1/Backup_Buero/logs/rsync-`date +"%d-%m-%Y-%H%M"`.log \
    Test@192.168.2.2::Test /volume1/Backup_Test/`date +"%d-%m-%Y-%H%M"`
fi

# Delete expired snapshots (2 weeks old)
if [ -d /volume1/Backup_Buero/$EXPIRED-* ]
then
rm -Rf /volume1/Backup_Buero/$EXPIRED-*
fi

Well, it works, but there is a huge flaw with this approach, and I am unfortunately not able to solve it on my own.
As long as the backups finish properly, everything is fine, but as soon as one backup job can't be finished for some reason (say it is aborted accidentally, or a power cut occurs),
the whole backup chain is messed up and the script usually creates a new full backup, which fills up my backup storage.

What I would like to achieve is to improve the script so that a backup run that wasn't finished properly is resumed the next time the script triggers.
Only if that was successful should the next incremental backup be created, so that the files that didn't change from the previous backup can be hardlinked properly.

I did a little bit of research, and I am not sure whether I am on the right track here, but apparently this can be done with return codes; I just don't know how to do this.
Thank you in advance for your help, and sorry if this question seems foolish to most of you.

Regards

Dennis

Re: rsync script for snapshot backups

Simon Hobson
Dennis Steinkamp <[hidden email]> wrote:

> [...] as soon as one backup job can't be finished for some reason (say it is
> aborted accidentally, or a power cut occurs), the whole backup chain is messed
> up and the script usually creates a new full backup, which fills up my backup storage.

Yes indeed, this is a typical flaw with many systems - you often need to throw away the partial backup.
One option that comes to mind is this:
Create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use it, and it will be freshened to the current source state.
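
In shell terms, that might look something like the sketch below, reusing the paths and rsync options from Dennis's script (the last_good.txt marker file is a hypothetical name for "timestamp of the last completed snapshot", and error handling is omitted):

#!/bin/sh
DEST=/volume1/Backup_Test

# use the previous completed snapshot for hardlinks, if there is one
if [ -f $DEST/last_good.txt ]; then
    LINK="--link-dest=$DEST/`cat $DEST/last_good.txt`"
else
    LINK=""
fi

# always rsync into the same in-progress directory; an interrupted run
# simply leaves it behind, and the next run freshens it
rsync -aqzh --delete $LINK Test@192.168.2.2::Test $DEST/in-progress
RET=$?

# promote to a timestamped snapshot only on success (0) or "files vanished" (24)
if [ $RET -eq 0 ] || [ $RET -eq 24 ]; then
    STAMP=`date +"%d-%m-%Y-%H%M"`
    mv $DEST/in-progress $DEST/$STAMP
    echo $STAMP > $DEST/last_good.txt
fi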

Also, have you looked at StoreBackup? http://storebackup.org
It does most of this automagically, keeps a definable history (e.g. one/day for 14 days, one/week for x weeks, one/30 days for y years), plus it keeps file hashes so it can detect bit-rot in your backups.



Re: rsync script for snapshot backups

Dennis Steinkamp
On 19.06.2016 at 19:27, Simon Hobson wrote:

> One option that comes to mind is this:
> Create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. [...]
>
> Also, have you looked at StoreBackup? http://storebackup.org

Thank you for taking the time to answer me.
Your suggestion is what I also had in mind, but I wasn't sure whether it would be "best practice".
To build this idea into my script, I probably need to hardcode the target directory rsync writes to (e.g. "new" or "in_progress") and rename the directory to a timestamp only after rsync returns exit code 0 - am I correct? (Or exit codes 0 and 24?)

As for StoreBackup, it does sound nice, but I have to do all of this from the console of a 2-bay Synology NAS, so it's not that easy to use third-party software that may have dependencies the Synology system doesn't meet.


Re: rsync script for snapshot backups

Joe
Rely on the other answers here as to how to do it right.

I just want to mention a few things in your script.

> yes | cp /volume1/rsync/Buero/timenow.txt /volume1/rsync/Buero/timeold.txt

yes is a program which puts out "y" (or whatever you tell it to) forever
- not what you want - and cp does not accept input from a pipe unless
the first argument is "-" or some similar fancier construction. You can
probably just leave off the "yes | " and have the statement work
exactly as it does now.

It looks like your EXPIRED logic will only find directories from
*exactly* that date, 14 days ago - anything older is never matched.

You might look at using something like a find command to find
directories older than 14 days.

Some find options which might help:

-ctime +14 specifies finding things whose status changed more than 14
days ago (without the "+", it matches only things exactly 14 days old)
-type d specifies finding only directories
-maxdepth 1 specifies finding things only one level below the path find
starts at
-exec ls -l {} \; specifies running a command on every result which is
returned - in this case, an ls which can't hurt anything. You can
replace ls with something like rm -rf {} when you're *very* sure the
command is finding *exactly* what you want it to.

I didn't put the whole command together because until you understand how
it works, you don't want to try something that might delete a bunch of
things beyond what you actually want deleted.
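
For reference, an assembled ls-only version might look like the sketch below (assuming the /volume1/Backup_Buero path from the original script); only swap in rm -rf once the output lists exactly what you expect:

# list (do NOT yet delete) top-level directories changed more than 14 days ago;
# -mindepth 1 keeps the starting directory itself out of the results
find /volume1/Backup_Buero -mindepth 1 -maxdepth 1 -type d -ctime +14 -exec ls -ld {} \;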

Joe


Re: rsync script for snapshot backups

Larry Irwin
The scripts I use analyze the rsync log after it completes and then
sftp a summary to the root of the just-completed rsync.
If no summary is found, or the summary says the run failed, the folder
rotation for that set is skipped and that folder is re-used on the
subsequent rsync.
The key here is that the folder rotation script runs separately from the
rsync script(s).
For each entity I want to rsync, I create a named folder to identify it,
and the rsync'd data is held in sub-folders:
daily.[1-7] and monthly.[1-3]
When I rsync, I rsync into daily.0 using daily.1 as the link-dest.
Then the rotation script checks daily.0/rsync.summary - and if it
worked, it removes daily.7 and renames the daily folders.
On the first of the month, the rotation script removes monthly.3,
renames the other two, and makes a complete hard-link copy of daily.1 to
monthly.1.
It's been running now for about 4 years and, in my environment, the 10
copies take about 4 times the space of a single copy.
(We do complete copies of Linux servers - starting from /.)
If there's a good spot to post the scripts, I'd be glad to put them up.
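
A rough sketch of that rotation logic, assuming a hypothetical per-entity directory $SET and an rsync.summary file containing the word OK on success (Larry's real scripts will of course differ):

#!/bin/sh
SET=/backups/myserver    # hypothetical per-entity folder

# rotate only if the previous rsync left a success summary behind
if grep -q "OK" $SET/daily.0/rsync.summary 2>/dev/null; then
    rm -rf $SET/daily.7
    for i in 6 5 4 3 2 1 0; do
        [ -d $SET/daily.$i ] && mv $SET/daily.$i $SET/daily.`expr $i + 1`
    done
    # on the first of the month, age the monthlies and make a
    # hard-link copy (cp -al) of the newest daily
    if [ "`date +%d`" = "01" ]; then
        rm -rf $SET/monthly.3
        [ -d $SET/monthly.2 ] && mv $SET/monthly.2 $SET/monthly.3
        [ -d $SET/monthly.1 ] && mv $SET/monthly.1 $SET/monthly.2
        cp -al $SET/daily.1 $SET/monthly.1
    fi
fi
# otherwise daily.0 stays in place and is re-used by the next rsync run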

--
Larry Irwin
Cell: 864-525-1322
Email: [hidden email]
Skype: larry_irwin
About: http://about.me/larry_irwin


Re: rsync script for snapshot backups

Dennis Steinkamp

Hi Larry,

that is something I couldn't do with my current scripting skills, but it
sounds very interesting, and I would really like to know how you did it
- if you don't mind showing me your script, of course.
As for my script, this is what I came up with.

#!/bin/sh

# rsync copy scriptv2 for rsync pull from FreeNAS to BackupNAS

# Set Date
B_DATE=$(date +"%d-%m-%Y-%H%M")
EXPIRED=`date +"%d-%m-%Y" -d "14 days ago"`

# Create directory if it doesn't exist already
if ! [ -d /volume1/Backup_Test/in_progress ] ; then
         mkdir -p /volume1/Backup_Test/in_progress
fi

# rsync command
if   [ -f /volume1/rsync/Test/linkdest.txt ] ; then
         rsync -aqzh \
         --delete --stats --exclude-from=/volume1/rsync/Test/exclude.txt \
         --log-file=/volume1/Backup_Test/logs/rsync-$B_DATE.log \
         --link-dest=/volume1/Backup_Test/`cat /volume1/rsync/Test/linkdest.txt` \
         Test@192.168.2.2::Test /volume1/Backup_Test/in_progress
else
         rsync -aqzh \
         --delete --stats --exclude-from=/volume1/rsync/Test/exclude.txt \
         --log-file=/volume1/Backup_Test/logs/rsync-$B_DATE.log \
         Test@192.168.2.2::Test /volume1/Backup_Test/in_progress
fi

# Check return value
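# ($? below is still rsync's exit status: an if/else construct passes
#  through the status of the last command it ran. 0 means success; 24
#  means "some source files vanished during the transfer", usually harmless.)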
if [ $? = 24 -o $? = 0 ] ; then
         mv /volume1/Backup_Test/in_progress /volume1/Backup_Test/$B_DATE
         echo $B_DATE > /volume1/rsync/Test/linkdest.txt
fi

# Delete expired snapshots (2 weeks old)
if [ -d /volume1/Backup_Test/$EXPIRED-* ]
then
rm -Rf /volume1/Backup_Test/$EXPIRED-*
fi

Keep in mind I am not very good at this, so if something can be improved
or you see a big flaw in it, I would be grateful if you let me know.
So far it seems to do the trick. I would like to improve it so that the
logfile is mailed to a specific e-mail address after rsync completes
successfully.
Unfortunately the logfiles grow very big when I have lots of data to
back up, and I couldn't figure out how to send only a specific part of
the logfile, or how to customize the logfile somehow.
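
One possible approach, sketched on the assumption that the NAS has a working mail (mailx) command: with --stats, rsync appends a short summary block to the log, which can be extracted and mailed instead of the whole file. The address and grep pattern here are illustrative only:

# pull just the --stats summary lines out of the (possibly huge) logfile ...
grep -E "Number of|Total|sent .* bytes|total size" \
    /volume1/Backup_Test/logs/rsync-$B_DATE.log > /tmp/rsync-summary.txt

# ... and mail only that summary
mail -s "Backup $B_DATE finished" admin@example.com < /tmp/rsync-summary.txt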




Re: rsync script for snapshot backups

Petros Aggelatos
On 19 June 2016 at 10:27, Simon Hobson <[hidden email]> wrote:

> One option that comes to mind is this:
> Create the new backup in a directory called (for example) "new" or "in-progress". If, and only if, the backup completes, rename it to a timestamp. When you start a new backup and the in-progress folder already exists, use it, and it will be freshened to the current source state.

I have an extremely similar script for my backups, and that's exactly
what I do to deal with backups that are stopped mid-way, either by
power failures or by me. I rsync to a .tmp-$target directory, where
$target is what I'm backing up. I have separate backups for my rootfs
and /home. I also start the whole thing under ionice so that my
computer doesn't get slow from all this I/O. Lastly, before renaming
the .tmp-$target to the final directory, I do a `sync -f`, because rsync
doesn't seem to call fsync() when copying files, and you can end up with
a failed backup if a power failure happens right after the rename().

Here is my script:

#!/bin/bash

set -o errexit
set -o pipefail

target=$1

case "$target" in
    home)
        source=/home
        ;;
    root)
        source=/
        ;;
    *)
        # guard against an unknown/missing target instead of backing up nothing
        echo "usage: $0 home|root" >&2
        exit 1
        ;;
esac

PATHTOBACKUP=/root/backup

date=$(date --utc "+%Y-%m-%dT%H:%M:%S")

ionice --class 3 rsync \
    --archive \
    --verbose \
    --one-file-system \
    --sparse \
    --delete \
    --compress \
    --log-file=$PATHTOBACKUP/.tmp-$target.log \
    --link-dest=$PATHTOBACKUP/$target-current \
    $source $PATHTOBACKUP/.tmp-$target

sync -f $PATHTOBACKUP/.tmp-$target

mv $PATHTOBACKUP/.tmp-$target.log $PATHTOBACKUP/$target-$date.log
mv $PATHTOBACKUP/.tmp-$target $PATHTOBACKUP/$target-$date

ln --symbolic --force --no-dereference $target-$date $PATHTOBACKUP/$target-current


Re: rsync script for snapshot backups

Henri Shustak

> As for StoreBackup, it does sound nice, but I have to do all of this from the console of a 2-bay Synology NAS

I am not familiar with the specific requirements of StoreBackup. However, you may be interested in taking a look at LBackup: http://www.lbackup.org

The reasons for this are threefold:

 (1) LBackup is primarily written in BASH, and this will allow you to take a look at
     the approaches used and possibly reuse those approaches within your system.

 (2) LBackup is fairly portable on *NIX systems, despite being primarily developed
     for Mac OS and Debian-based distributions. I know someone made an Arch
     package at some point, and it would be trivial to install or even package it
     for most GNU/Linux systems. I believe that Synology boxes are running some
     sort of GNU/Linux under the hood.

 (3) LBackup can be driven and configured from the command line. Even the restore
     process is geared this way.

Hope this information is helpful.

DISCLAIMER : I am involved with the development of the LBackup project.

NOTE : If you do get LBackup running on the Synology box, then please let me know (off list) so that I know it works on a Synology Box. The kind of box and version would also be helpful if you get it working. If you need any help feel free to email the LBackup mailing list or me directly : http://www.lbackup.org/mailing_lists





--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html