May 2017 – Luke Bearl

I’ve been meaning to write this for quite a while, as the bus factor is something I’ve (literally) run into in my career. For those of you not familiar with it, the “Bus Factor” is basically an informal measure of resiliency of a project to the loss of one or more key members. It’s basically the programming version of the old adage “Don’t put all your eggs in one basket”.

Story Time

Some years ago I was a software development intern at a large company in Milwaukee, Wisconsin. The team I was on was broken into a U.S. development team, an offshore dev team in India and an offshore QA team in China. We had daily scrum meetings at 8 AM every morning so that the US and Indian teams could participate all in one. One day we got word that one of the senior-most developers had literally been hit by a bus while crossing the street (thankfully he made a full recovery, but it certainly slowed down that part of the team as he was out for 8 weeks or so).

How to Reduce the Bus Factor

I’m sure people can (and probably have) written entire books on the subject of reducing the bus factor, and spreading knowledge around through the entire team. Spreading knowledge is really the key element in bus factor reduction.

How many people currently work on a team where one or two people are basically the wizards who secret spells make critical things happen (like deployments or provisioning infrastructure assets, or SSL certificates, or any of the other million things that need to be done in order to make software work)? I know I’ve worked on several teams where that happened. I’ve also worked with people who wanted to increase the bus factor as they thought it gave them better job security (a notion I strongly disagree with).

In my experience one of the best ways to reduce the bus factor is to maintain an internal wiki where developers and administrators can document processes for anything which they are going to do more than once (and sometimes it’s good to document things that are being done once as well). Another great idea is to regularly schedule cross training (n+1 isn’t only a good idea for infrastructure, developers and admins should have a bit of redundancy as well).

Ethics

I personally feel that there is an ethical responsibility for all engineers to be transparent in what they do. I never want to be the only person capable of doing something, instead I do my best to make sure that anything I do, which may ever need to be done again, is documented at least well enough that someone can probably piece it together. Doing this ensures that if I am ever hit by a bus the rest of my team won’t have to try to figure out the magical incantations I have developed in order to do a number of things.

Conclusion

At the end of the day reducing the bus factor is good for your team. You never know when you or one of your colleagues are going to end up no longer being available to work (they might be hit by a bus, or it might be something more mundane like taking a new job, or leaving for a few months for a sabbatical or maternity/paternity leave). As an engineer and a member of a team you have an ethical obligation to ensure that you are both sharing processes and techniques you’ve developed with your colleagues, and also trying to learn those processes and techniques from your colleagues.

I wanted to do a little side project in less than 24 hours, which is fairly simple, but enough to really get my feet wet with ASP.NET Core MVC. I decided to build a little tool which can examine current details of an SSL certificate, and also which you can use to get proactive emails when you are 30 days out from a certificate expiring. This was also a great opportunity to play with bootstrap 4 and C# 7.

Technologies Used

Bootstrap 4
ASP.NET Core MVC
Entity Framework Core
SendGrid + SendGrid Transactional Templates
Hangfire.IO

Before I start getting long-winded, if you’d like to just see the code, head over to my github.

High Level Approach

This application was kind of fun in that the core logic which actually “does” the important part of checking the SSL certificates is only a couple-dozen lines of code. The way it works is to open up a connection to a server and then take a look at the SSL certificate that was associated with the response. There are a few “limitations” around what can be inspected currently, as anything other than a non-expired, trusted certificate will throw an exception. I figured it would be fun to play around with Hangfire and Sendgrid as well in order to do a nice background batch email process.

Hosting On Azure

If you want to play around with it, the application is running on a free Azure App Service here. Note that these free services will turn themselves off with inactivity, so almost certainly (unless this service becomes incredibly popular), the cron tasks that Hangfire is executing will not happen (as that would require the server to be running).

Thoughts

This was a really fun 1 day project which let me play around with a few technologies I have played with in a very limited fashion. There are a number of bugs I noticed, so this could really use a bit of polish and TLC, but that probably won’t happen anytime soon. Please poke around at the code, let me know if you see anything really “surprising” or anything which would be an easy improvement.

Thanks for reading!

Luke Bearl

Software, Operations, and Retrospectives

Month: May 2017

The Bus Factor