MS SQL Server Backups to S3 – On Linux!

Today I’m going to go over what is necessary in order to do full and transaction log backups for SQL Server Express on Linux. One of the big limitations of SQL Express is that it doesn’t include the SQL Agent, so most of the maintenance tasks that can normally be designed and implemented within SSMS need to be rethought. Thankfully Microsoft released sqlcmd for Linux, which makes it pretty easy to go ahead and do the backups as simple bash scripts scheduled through cron.

Prerequisites

This post isn’t going to go through all of the steps to install SQL Server and the associated tools, but Microsoft has done a great job of documenting that on their docs site. In order to push the backups to S3 we will need the s3cmd tool:

apt install s3cmd
s3cmd --configure

You’ll need to have an IAM identity with at least enough permissions to write to the S3 bucket you designate in the script. In the configure prompts include the keys and specify what region you want to default to.

The Scripts

In order to do the backups, two scripts are necessary: one for the full backups and one for the transaction log backups. I’ve opted for a very simple structure since I only care about one database, it shouldn’t be very hard to modify the script to generate backups for each database, but I’ll leave that as an exercise for the reader :).

Full Database Backups (fullBackup.sh)

TIMESTAMP=$(date +"%F")
BACKUP_DIR="/var/opt/mssql/backup/$TIMESTAMP"
SA_USER="SA"
SA_PASS="<Your_SA_User_Password>"

mkdir -p "$BACKUP_DIR"

chown -R mssql:mssql $BACKUP_DIR

sqlcmd -S localhost -Q "BACKUP DATABASE [<DBNAME>] TO DISK = N'$BACKUP_DIR/<DBNAME>.bak' WITH NOFORMAT, NOINIT, SKIP, NOREWIND, STATS=10" -U $SA_USER -P $SA_PASS

s3cmd put "$BACKUP_DIR/<DBNAME>.bak" "s3://<BUCKET_NAME>/$TIMESTAMP/<DBNAME>.bak"
rm -f "$BACKUP_DIR/<DBNAME>.bak"

Transaction Log Backups (logBackup.sh)

DATESTAMP=$(date +"%F")
TIMESTAMP=$(date +"%H%M%S")
BACKUP_DIR="/var/opt/mssql/backup/$DATESTAMP/logs/$TIMESTAMP"
SA_USER="SA"
SA_PASS="<Your_SA_User_Password>"

mkdir -p "$BACKUP_DIR"

chown -R mssql:mssql $BACKUP_DIR

sqlcmd -S localhost -Q "BACKUP LOG [<DBNAME>] TO DISK = N'$BACKUP_DIR/<DBNAME>_log.bak' WITH NOFORMAT, NOINIT, SKIP, NOREWIND, NOUNLOAD, STATS=5" -U SA -P $SA_PASS

s3cmd put "$BACKUP_DIR/<DBNAME>_log.bak" "s3://<BUCKET_NAME>/$DATESTAMP/logs/$TIMESTAMP/<DBNAME>_log.bak"

rm -f "$BACKUP_DIR/<DBNAME>_log.bak"

Then schedule them in cron:

0 0 * * * /root/bin/fullBackup.sh
*/15 * * * * /root/bin/logBackup.sh

With the default schedule I have, full backups are taken at midnight and transaction log backups are taken every 15 minutes.

S3 Lifecycles

While the scripts do a good job of cleaning up after themselves, S3 will (by design) never delete your data unless you specifically tell it to. S3 has a nifty feature called “Lifecycles” which allows us to specify rules for object retention (it is a powerful feature that can be used for a number of other things as well). To access it go to the AWS Console and enter into your S3 bucket. Follow these steps to setup object retention:
1. Select the Management Tab
2. Select Lifecycle
3. Click + Add lifecycle rule
4. Name the rule something descriptive (“Expire all files”). Leave the prefix blank
5. Leave Configure transition blank
6. In Expiration set the following options: S3 Lifecycle Creation
7. Click Save

That’s All

At this point we have full and transaction log backups configured, being pushed off site to Amazon S3. These backups are soft deleted after 7 days and fully deleted after 14 days.

The Bus Factor

I’ve been meaning to write this for quite a while, as the bus factor is something I’ve (literally) run into in my career. For those of you not familiar with it, the “Bus Factor” is basically an informal measure of resiliency of a project to the loss of one or more key members. It’s basically the programming version of the old adage “Don’t put all your eggs in one basket”.

Story Time

Some years ago I was a software development intern at a large company in Milwaukee, Wisconsin. The team I was on was broken into a U.S. development team, an offshore dev team in India and an offshore QA team in China. We had daily scrum meetings at 8 AM every morning so that the US and Indian teams could participate all in one. One day we got word that one of the senior-most developers had literally been hit by a bus while crossing the street (thankfully he made a full recovery, but it certainly slowed down that part of the team as he was out for 8 weeks or so).

How to Reduce the Bus Factor

I’m sure people can (and probably have) written entire books on the subject of reducing the bus factor, and spreading knowledge around through the entire team. Spreading knowledge is really the key element in bus factor reduction.

How many people currently work on a team where one or two people are basically the wizards who secret spells make critical things happen (like deployments or provisioning infrastructure assets, or SSL certificates, or any of the other million things that need to be done in order to make software work)? I know I’ve worked on several teams where that happened. I’ve also worked with people who wanted to increase the bus factor as they thought it gave them better job security (a notion I strongly disagree with).

In my experience one of the best ways to reduce the bus factor is to maintain an internal wiki where developers and administrators can document processes for anything which they are going to do more than once (and sometimes it’s good to document things that are being done once as well). Another great idea is to regularly schedule cross training (n+1 isn’t only a good idea for infrastructure, developers and admins should have a bit of redundancy as well).

Ethics

I personally feel that there is an ethical responsibility for all engineers to be transparent in what they do. I never want to be the only person capable of doing something, instead I do my best to make sure that anything I do, which may ever need to be done again, is documented at least well enough that someone can probably piece it together. Doing this ensures that if I am ever hit by a bus the rest of my team won’t have to try to figure out the magical incantations I have developed in order to do a number of things.

Conclusion

At the end of the day reducing the bus factor is good for your team. You never know when you or one of your colleagues are going to end up no longer being available to work (they might be hit by a bus, or it might be something more mundane like taking a new job, or leaving for a few months for a sabbatical or maternity/paternity leave). As an engineer and a member of a team you have an ethical obligation to ensure that you are both sharing processes and techniques you’ve developed with your colleagues, and also trying to learn those processes and techniques from your colleagues.

YASC: Yet Another SSL Checker

I wanted to do a little side project in less than 24 hours, which is fairly simple, but enough to really get my feet wet with ASP.NET Core MVC. I decided to build a little tool which can examine current details of an SSL certificate, and also which you can use to get proactive emails when you are 30 days out from a certificate expiring. This was also a great opportunity to play with bootstrap 4 and C# 7.

Technologies Used

  1. Bootstrap 4
  2. ASP.NET Core MVC
  3. Entity Framework Core
  4. SendGrid + SendGrid Transactional Templates
  5. Hangfire.IO

Before I start getting long-winded, if you’d like to just see the code, head over to my github.

High Level Approach

This application was kind of fun in that the core logic which actually “does” the important part of checking the SSL certificates is only a couple-dozen lines of code. The way it works is to open up a connection to a server and then take a look at the SSL certificate that was associated with the response. There are a few “limitations” around what can be inspected currently, as anything other than a non-expired, trusted certificate will throw an exception. I figured it would be fun to play around with Hangfire and Sendgrid as well in order to do a nice background batch email process.

Hosting On Azure

If you want to play around with it, the application is running on a free Azure App Service here. Note that these free services will turn themselves off with inactivity, so almost certainly (unless this service becomes incredibly popular), the cron tasks that Hangfire is executing will not happen (as that would require the server to be running).

Thoughts

This was a really fun 1 day project which let me play around with a few technologies I have played with in a very limited fashion. There are a number of bugs I noticed, so this could really use a bit of polish and TLC, but that probably won’t happen anytime soon. Please poke around at the code, let me know if you see anything really “surprising” or anything which would be an easy improvement.

Thanks for reading!