Dec 26
Starting a new job
Darrell Mozingo | Uncategorized | December 26th, 2016 | No Comments

I started a new position with Skyscanner during the summer. It made me realise that over the years of starting new jobs, and of course being on teams when others joined, I’ve made and seen plenty of annoying mistakes that hurt relationships, trust, and respect with colleagues before they even had a chance to form. I think this is especially true for senior engineers, where you come in with more experience and opinions, and perhaps more of a desire to “prove” your new role/salary. Here are some tips that might help both you and your new team with on-boarding to the new job:

  1. Humility – You were hired in part because you’re good at your job, having the right set of experience and technical skills that the organisation needs. Don’t let it go to your head though. Remember that you’re going into this organisation in part to learn from some other smart people, and there’s a lot you’re not going to know. Perhaps certain ways of doing things from your last role are done differently & better here. This is particularly important if you’re coming from a “big fish small pond” situation, as I have a few times. That skill reset towards the bottom is the best way to learn, and I feel staying humble keeps you the most open to it.
  2. Don’t try changing everything day 1 – Sure, fix problems you see and make things better – it’s part of the reason you were hired. Just take a breather and don’t try to fix all the things on day 1! It’s part taking time to understand why things are in their current state or are done a certain way, and part building relationships/trust/respect with your new colleagues before going forward. Coming in on day 1 and trying to change a good chunk of things will put your colleagues on the defensive, making the possibility of introducing change that much harder.
  3. “Lightly” question everything – Similar to above, do question things. Don’t just assume things are already done the best way possible, but ask why they are that way. If you think something can be improved, don’t push back on the decision (or push your own opinion) too much right off the bat. Take note and come back to it after a bit, when you’re more familiar with the team and system.
  4. Prime Directive – Taking a nod from the retrospective prime directive, I’ve found it the most conducive way to approach new systems/code/processes. It’s best to assume things were done for a valid reason in their given context, not maliciously. Our profession, the new business you’re at, personal skill levels, and most importantly technology itself, are constantly changing. What’s “correct” today in any of those areas is bound to be “incorrect” next week. Even the decisions you’ve made in the past for these areas probably seem like crap to you now. ORMs, WebForms, heavy handed message broker SOA… these were all valid & common decisions at different points in time. Maybe the system you’re coming onto didn’t implement these patterns perfectly, but did we always get it right ourselves? Rather than slag off the code or the anonymous author, try to realise that they were like you right now, trying to make the best decision in their context. Perhaps try to understand what those contexts were, then take it for what it is, and move on with improving things.
Jul 14
Retrospective tips
Darrell Mozingo | Uncategorized | July 14th, 2015 | 2 Comments

My friend Jeremy wrote an excellent post about spicing up retrospectives. I started writing this up as a comment to post there but it got a little long, so thought I’d break it out as a blog post.

Jeremy’s experiences mirror mine exactly from running and participating in many retros over the years. Actively making sure they’re not getting routine and becoming an afterthought is an absolute must. Here are a few additional tips we use to run, spice up, and manage retros:

  • Retro bag: We keep a small bag in the office filled with post-its, sharpies, markers, Blu Tack, etc, to make retro facilitators’ lives easier – they can just grab and go. We also have a printed copy of Jeremy’s linked retr-o-mat in it.
  • Facilitator picker: A small internal app which lets teams enter their retro info and randomly select someone to facilitate. It favours those who haven’t done one recently and are available for the needed time span. Sure saves on walking around and asking for a facilitator!
  • Cross-company retros: We’ve gotten great value out of doing larger cross-company retros after big projects. These are bigger groups (upwards of 20 people) representing as many of the teams involved as possible (developers, systems, product owners, management, sales, client ops, etc). We used the mail box technique Jeremy mentioned and had attendees generate ideas beforehand to get everything in, limiting the retro to 1.5 hours. Making sure everyone knew the prime directive was also a must, as many hadn’t been involved in retros before. Actions that came out ended up being for future similar projects, and were assigned to a team to champion. Sure enough they came in very handy a few months later as we embarked on a similarly large project.
  • Retro ideas: (don’t remember where I got these, but they’re not original!)
    1. Only listing 3 of the good things that happened in a given period. At first I didn’t think focusing purely on the good would result in any actionable outcomes, but the perspective brought about some interesting ideas
    2. Making a “treasure map” of the retro time period, with some members adding a “mountain of tech debt”, “bog of infrastructure”, and “sunny beach of automation”. Fun take on the situation to get at new insights
    3. Amazon reviews of the period with a star rating and “customer feedback”
    4. I’m excited to try out story cubes at the next retro I run – sounds good!
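
As a rough illustration of the facilitator picker idea above, here is a minimal sketch. It is not the internal app itself: the `Person` fields, the `pick_facilitator` name, and the 365-day weight cap are all assumptions made up for the example. It filters to people who are available, then does a weighted random pick where the weight grows with the days since that person last facilitated.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    # Hypothetical record; the real app's data model is not described in the post.
    name: str
    last_facilitated: Optional[date]  # None means they have never facilitated
    available: bool                   # free for the requested time span


def pick_facilitator(people, today, rng=random):
    """Pick an available person, weighted by days since they last facilitated."""
    candidates = [p for p in people if p.available]
    if not candidates:
        raise ValueError("nobody is available for this retro")

    def weight(person):
        if person.last_facilitated is None:
            return 365  # assumed cap: treat "never" like "a year ago"
        days = (today - person.last_facilitated).days
        return max(1, min(365, days))  # keep every candidate possible

    weights = [weight(p) for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Passing in `rng` (for example a seeded `random.Random`) keeps the pick reproducible in tests, while the default uses the module-level generator.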
Sep 4
Managing the Unexpected
Darrell Mozingo | Books | September 4th, 2013 | No Comments

I recently read Managing the Unexpected. It’s a brilliant book about running highly resilient organisations. While it’s mostly based on high-risk organisations like nuclear power plants and wildfire firefighting units, it’s still highly applicable to any company just trying to increase its resilience to failures and outages.

A lot of the points in the book fall into that “that sounds so obvious” category after you read it, but I think those are the best kind, as they help clarify ideas you couldn’t quite articulate yourself and give you a good way to communicate them to your colleagues. There’s still plenty in there to give you something new to think about too. The first half of the book discusses five principles they feel all highly resilient organisations need to follow, while the second half goes over ways to introduce them to your organisation, complete with rating systems for how you function now.

The five main principles the book harps on are (the first three are for avoiding incidents, while the last two are for dealing with them when they occur):

  • Tracking small failures – don’t let errors slip through the cracks and go unnoticed.
  • Resisting oversimplification – don’t simply write off errors as “looking like the same one we see all the time”, but investigate them.
  • Remaining sensitive to operations – employees working on the front line are more likely to notice something out of the ordinary, which could indicate an impending failure. Listen to them.
  • Maintaining capabilities for resilience – shy away from removing things that’ll keep resilience in your system when there’s an outage.
  • Taking advantage of shifting locations of expertise – don’t leave all decision making power in the hands of managers that may be separated from the incident. Let front line members call the shots.

Here’s some of my favourite bits of wisdom from the book:

  • “… try to hold on to those feelings and resist the temptation to gloss over what has just happened and treat it as normal. In that brief interval between surprise and successful normalizing lies one of your few opportunities to discover what you don’t know. This is one of those rare moments when you can significantly improve your understanding. If you wait too long, normalizing will take over, and you’ll be convinced that there is nothing to learn.” (pg 31) There have been too many times in the past where I’ve been involved in system outages where everyone goes into panic mode, gets the problem solved, but then sits around afterwards going “yeah, it was just because of that usual x or y issue that we know about”. It’s about digging in and never assuming a failure was because of a known situation (lying to yourself). Dig in and find out what happened with a blank slate after each failure. Keep asking why.
  • “Before an event occurs, write down what you think will happen. Be specific. Seal the list in an envelope, and set it aside. After the event is over, reread your list and assess where you were right and wrong.” (pg 49) Basically following the scientific method. Set up a null hypothesis with expectations that you can check after an event (software upgrade, new feature, added capacity, etc). It’s definitely not something I’m used to, but I’m trying to build it into my workflow. I love the idea of Etsy’s Catapult tool where they set up expectations for error rates, client retention, etc before releasing a feature, then do A/B testing to show whether it met or failed each criterion.
  • “Resilience is a form of control. ‘A system is in control if it is able to minimize or eliminate unwanted variability, either in its own performance, in the environment, or in both… The fundamental characteristic of a resilient organization is that it does not lose control of what it does but is able to continue and rebound.'” (pg 70) – Don’t build highly resilient applications assuming they’ll never break, but instead assume that each and every piece will break or slow down at some point (even multiple together) and design your app to deal with it. We’ve built our streaming platform to assume everything will break, even our dependencies on other internal teams, and we’ll just keep going as best we can when they’re down and bounce back after.
  • “Every unexpected event has some resemblance to previous events and some novelty relative to previous events. […] The resilient system bears the marks of its dealings with the unexpected not in the form of more elaborate defences but in the form of more elaborate response capabilities.” (pg 72) – When you have an outage and determine the root cause, don’t focus on preventing that one specific error from ever happening again. Instead, try to build resilience into the system to stop that class of problem from having effects in the future. If your cache throwing a specific error was the root cause, for instance, build the system to handle any error from the cache rather than that specific one, and increase metrics around these to respond faster in the future.
  • “Clarify what constitutes good news. Is no news good news, or is no news bad news? Don’t let this remain a question. Remember, no news can mean either that things are going well or that someone is […] unable to give news, which is bad news. Don’t fiddle with this one. No news is bad news.” (pg 152) – If your alerting system hasn’t made a peep for a few days, it’s probably a bad thing. Some nominal level of errors will always be common, and if you’re hearing nothing it’s an error. Never assume your monitoring and alerting systems are working smoothly!
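
The cache point above (handle the whole class of failure, and count it) might look something like the following sketch. Everything here is hypothetical: `FlakyCache` is a stand-in for a real cache client, and `get_with_fallback`, `load_from_source`, and the `error_counts` dictionary are illustrative names rather than anything from the book or a specific library.

```python
import logging

logger = logging.getLogger("cache")


class FlakyCache:
    """Hypothetical stand-in for a real cache client; every read fails."""

    def get(self, key):
        raise ConnectionError("cache node unreachable")


def get_with_fallback(cache, key, load_from_source, error_counts):
    """Return the cached value, falling back to the source on ANY cache error."""
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except Exception as exc:
        # Deliberately broad: the goal is to survive the whole class of
        # cache failures, not only the one error seen in the last outage.
        name = type(exc).__name__
        error_counts[name] = error_counts.get(name, 0) + 1  # metric to alert on
        logger.warning("cache read failed for %r: %s", key, exc)
    # Cache miss or cache failure: go to the slower source of truth.
    return load_from_source(key)
```

The counter is what makes the broad `except` safe to live with: the system keeps serving, but the failures still show up somewhere you can alert on.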

Overall the book is an excellent read. A bit dense in writing style at times, but I’d recommend it if you’re working on a complex system that demands uptime in the face of shifting requirements and operating conditions.

Mar 18
DevOps Days London 2013
Darrell Mozingo | Events | March 18th, 2013 | No Comments

I spent this past Friday & Saturday at DevOpsDays London. There have been a few reviews written already about various bits (and a nice collection of resources by my co-worker Anna), and I wanted to throw my thoughts out there too. The talks each morning were all very good and well presented, but for me the real meat of the event was the 3 tracks of Open Spaces each afternoon, along with various break time and hallway discussions. I didn’t take as detailed notes as others did, but here are the bits I took away from each Open Space:

  • Monitoring: Discussed using Zabbix, continuous monitoring, and some companies trying out self-healing techniques with limited success (be careful with services flapping off and on)
  • Logstash: Windows client support (not as good as it sounds), architecture (ZeroMQ everything to one or two servers, then to Elasticsearch), what to log (everything!)
  • Configuration Management 101 (w/Puppet & Chef): It was great having the guys from PuppetLabs and Opscode here to give views on both products (and trade some friendly jabs!). Good discussion about Windows support, including a daily growing community with package support and the real possibility of actually doing config management on Windows. We’re using CFEngine, and while I got crickets after bringing it up, a few people were able to offer some good advice and compare with Puppet & Chef (stops on error like Chef, good for legacy support, promise support is nice, etc).
  • Op to dev feedback cycle: Besides the usual “put devs on call” idea (which I still feel is a bad idea), there was discussion about getting bugs like memory leaks prioritised above features. One of the better suggestions to me was simply going and talking to the devs, putting faces to names and getting to know one another. Suggestions were also made for ops to just patch the code themselves, which throws up a lot of alarms to me (going through back channels, perhaps not properly tested, etc). I say make a pull request.
  • Deployment orchestration: BitTorrent for massive deploys (Twitter’s Murder), Jenkins/TeamCity/et al are still best for kicking off deploys, and MCollective for orchestration.
  • Ops user stories: Creating user stories for op project prioritisation is hard, as is fitting the work in for sprints. Ended up coming down to standard estimation difficulties – more work popping up, unknown unknowns, etc. Left a bit before the end to pop into a Biz & DevOps Open Space, but didn’t get much from it before it ended.

Overall it was a great conference. Well planned, good food, and great discussions. Nothing completely ground breaking, but a lot of really good tips & recommendations to dig into.

Jun 24
Software Craftsmanship 2012
Darrell Mozingo | Events | June 24th, 2012 | No Comments

I attended the Software Craftsmanship 2012 conference last Thursday up at Bletchley Park. It was an awesome event run mostly by Jason Gorman and the staff at the park. The company I work for, 7digital, sponsored the event so all ticket proceeds went directly to help the park, which is very cool. They’re in desperate need of funding and this event has brought in a hefty amount the past few years.

I did the Pathfinding Peril track in the morning. They went over basic pathfinding algorithms, including brute force and A*, and their applicability outside the gaming world. The rest of the session was spent pairing on bots that compete against other bots trying to automatically navigate a maze the fastest (using this open source tournament server). Unfortunately they didn’t have mono installed, so my pair and I wasted some time getting NetBeans installed and a basic Java app up and running. Very interesting, and it spurred a co-worker to set up a tournament server at work too. Looking forward to submitting a bot there to try out some pathfinding algorithms.
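
For anyone who hasn’t met A* before, here is a minimal sketch of it on a small grid maze. It is not the code from the session or the tournament server; the grid format (`#` for walls), the `a_star` name, and the choice to return only the step count are assumptions made to keep the example short. It uses Manhattan distance as the heuristic, which never overestimates when moves are limited to up/down/left/right.

```python
import heapq


def a_star(grid, start, goal):
    """Return the number of steps on a shortest path, or None if unreachable.

    grid is a list of strings where '#' is a wall and anything else is open.
    start and goal are (row, col) tuples; moves are 4-directional, cost 1 each.
    """

    def heuristic(cell):
        # Manhattan distance to the goal: admissible on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        row, col = cell
        for nr, nc in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
            if inside and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    estimate = new_cost + heuristic((nr, nc))
                    heapq.heappush(open_heap, (estimate, new_cost, (nr, nc)))
    return None
```

A brute-force search would explore cells in every direction equally; the heuristic is what lets A* spend most of its effort on cells that are actually heading towards the goal.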

During our lunch break they gave a nice, albeit quick, tour of the park. We got to see the main sites, including Colossus. Very interesting stuff, and amazing to hear how they pulled off all those decoding and computational feats during the war.

For the afternoon I went to the Team Dojo session. We were told to write our strongest languages on name badges, then break off into teams of 4-6 based on that. I got together with a group of 6 devs, some co-workers. After a brief overview of the Google PageRank algorithm and a generic nearest neighbor one, we were set loose to create a developer-centric LinkedIn clone from a complete standing start. We had to figure out where to host our code, how to integrate, code the algorithms, parse in XML data, and throw it all up on the screen somehow in around 2 hours. Unfortunately we spent way too much time shaving yaks, as it were, with testing and our CI environment, and didn’t get to the algorithms until the end (although we were close to finishing it!). Learned a bit about trying to jump start a project like that with different personalities and making it all mesh together. It’d be interesting to see how we’d all do it again, especially since katas are meant to be repeated.

Between the talks, lunch, hog roast dinner, tour, and the great little side discussions had between it all, it was an excellent event (although they could try doing something about those beer prices!). Everyone did a great job putting it on. Here’s a video of the day Jason put together (I’m one of the last pair of interviews during our afternoon session). I’m quite looking forward to attending it again in the future.

Dec 30
Continuous Delivery
Darrell Mozingo | Build Management | December 30th, 2011 | No Comments

I recently finished reading Continuous Delivery. It’s an excellent book that manages to straddle that “keep it broad to help lots of people yet specific enough to actually give value” line pretty well. It covers testing strategies, process management, deployment strategies, and more.

At my former job we had a PowerShell script that would handle our deployment and related tasks. Each type of build – commit, nightly, push, etc. – all worked off its own artifacts that it created right then, duplicating any compilation, testing, or pre-compiling tasks. That eats up a lot of time. Here’s a list of posts where I covered how that script generally works:

The book talks about creating a single set of artifacts from the first commit build, and passing those same artifacts through the pipeline of UI tests, acceptance tests, manual testing, and finally deployment. I really like that idea, as it cuts down on unnecessary rework, and gives you more confidence that this one set of artifacts is truly ready to go live. Sure, our tasks could call the same function to compile the source or run unit tests, so it was effectively the same, but there could have been slight differences between the assemblies produced by the commit build and those produced by the push build.
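
The “build once, promote the same artifacts” idea can be sketched in a few lines. This is only an illustration under assumptions, not the book’s tooling or my old PowerShell script: the `fingerprint`, `run_stage`, and `run_pipeline` names are made up, the artifact is just a blob of bytes, and each stage is a plain function returning pass/fail. The point is that every stage checks it was handed exactly what the commit build produced.

```python
import hashlib


def fingerprint(artifact):
    """Content hash recorded once, when the commit build produces the artifact."""
    return hashlib.sha256(artifact).hexdigest()


def run_stage(name, check, artifact, expected_hash):
    """Run one stage, refusing to test anything other than the original build."""
    if fingerprint(artifact) != expected_hash:
        raise ValueError(f"{name}: artifact differs from the commit build")
    return check(artifact)


def run_pipeline(artifact, expected_hash, stages):
    """Return the names of the stages that passed, stopping at the first failure."""
    passed = []
    for name, check in stages:
        if not run_stage(name, check, artifact, expected_hash):
            break
        passed.append(name)
    return passed
```

If a later stage is ever handed a rebuilt artifact, the hash check fails loudly instead of letting slightly different assemblies slip through to deployment.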

I also like how they mention getting automation in your project from day one if you’re lucky enough to work on a green-field app. I’ve worked on production deployment scripts for legacy apps and for ones that weren’t production yet, but still a year or so old. The newer an app is and the less baggage it has, the easier it is to get started, and getting started is the hardest part. Once you have a script that just compiles and copies files, you’re 90% of the way there. You can tweak things and add rollback functionality later, but the meat of what’s needed is there.

However you slice it, you have to automate your deployments. If you’re still copying files out by hand, you’re flat out doing it wrong. In the age of PowerShell, there’s really no excuse to not automate your line of business app deployment. The faster deliveries, more transparency, and increased confidence that automation gives you can only lead to one place: the pit of success, and that’s a good place to be.

Nov 14
Moving on
Darrell Mozingo | Misc. | November 14th, 2011 | No Comments

I’ve been at Synergy Data Systems for over 7 years now (I know, the site is horrible). I’ve worked with a lot of great people on some very interesting projects, and learned a boat load during that time. Unfortunately, they can’t offer the one thing my wife and I wanted: living abroad.

To that end, we’re moving to London and I’ll be starting at 7digital in early January. I’m super excited about both moves. 7digital seems like a great company working with a lot of principles and practices that are near and dear to me, and c’mon, it’s London. For two people that grew up in small town Ohio, this’ll be quite the adventure!

I’m looking forward to getting involved in the huge developer community over there, playing with new technologies, and working with fellow craftsmen!

Sep 29

UPDATE: See Paul’s comment below – sounds like the latest cygwin upgrade process isn’t as easy as it used to be.

If you install GitExtensions, up through the current 2.24 version (which comes bundled with the latest msysgit version 1.7.6-preview20110708), and use OpenSSH for your authentication (as opposed to Plink), you’ll likely notice some painfully slow cloning speeds. Like 1MB/sec on a 100Mb network kinda slow.

Thankfully, it’s a pretty easy fix. Apparently msysgit still comes bundled with an ancient version of OpenSSH:

$ ssh -V
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007

Until they get it updated, it’s easy to do yourself. Simply install the latest version of Cygwin, and make sure to search for and install OpenSSH on the package screen. Then go into the /bin directory of where you installed Cygwin, and copy the following files into C:\Program Files\Git\bin (or Program Files (x86) if you’re on 64-bit):

  • cygcrypto-0.9.8.dll
  • cyggcc_s-1.dll
  • cygssp-0.dll
  • cygwin1.dll
  • cygz.dll
  • ssh.exe
  • ssh-add.exe
  • ssh-agent.exe
  • ssh-keygen.exe
  • ssh-keyscan.exe

Checking the OpenSSH version should yield something a bit higher now:

$ ssh -V
OpenSSH_5.8p1, OpenSSL 0.9.8r 8 Feb 2011

Your clone speeds should be faster too. This upgrade bumped ours from literally around 1MB/sec to a bit over 10MB/sec. Nice.
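
If you want to check for the old bundled OpenSSH across a few machines, a small helper along these lines might do. It is a sketch with assumed names (`parse_openssh_version` is made up for this example) that only parses the banner text shown above; note that `ssh -V` writes its banner to stderr rather than stdout, so capture that stream if you feed it real output.

```python
import re


def parse_openssh_version(banner):
    """Extract (major, minor) from `ssh -V` output such as 'OpenSSH_5.8p1, ...'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"not an OpenSSH version banner: {banner!r}")
    return int(match.group(1)), int(match.group(2))


# The two banners quoted in this post, before and after the upgrade.
old = parse_openssh_version("OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007")
new = parse_openssh_version("OpenSSH_5.8p1, OpenSSL 0.9.8r 8 Feb 2011")
upgraded = new > old  # tuples compare element by element
```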

Sep 15
Getting started with TDD
Darrell Mozingo | Musings, Testing | September 15th, 2011 | No Comments

When I first read about TDD and saw all the super simple examples that litter the inter-tubes, like the calculator that does nothing but add and subtract, I thought the whole thing was pretty stupid and its approach to development was too naive. Thankfully I didn’t write the practice off – I started trying it, plugging away here and there. One thing I eventually figured out was that TDD is a lot like math. You start out easy (addition/subtraction), and continue building on those fundamentals as you get used to it.

So my suggestion to those starting down the TDD path is: don’t brush it off. Start simple. Do the simple calculator, the stack, or the bowling game. Don’t start thinking about how to mix in databases, UIs, web servers, and all that other crud with the tests. Yes, these examples are easy, and yes they ignore a lot of stuff you need to use in your daily job, but that’s sort of the point. They’ll seem weird and contrived at first, but that’s OK. They serve a very real purpose. TDD has been around for a good while now; it’s not some fad that’s going away. People use it and get real value out of it.

The basic practice examples get you used to the TDD flow – red, green, refactor. That’s the whole point of things like katas. Convert that flow into muscle memory. Get it ingrained in your brain, so when you start learning the more advanced practices (DIP, IoC containers, mocking, etc), you’ll just be building on that same basic flow. Write a failing test, make it pass, clean up. You don’t want to abandon that once you start learning more and going faster.
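
To make the flow concrete, here is what the end state of a tiny calculator kata might look like, using Python’s built-in `unittest`. The class and test names are just illustrative. Picture each test being written first and failing (red), then just enough of `Calculator` being added to make it pass (green), then a quick tidy-up before the next test (refactor).

```python
import unittest


class Calculator:
    """Only as much code as the tests below have demanded so far."""

    def add(self, a, b):
        return a + b

    def subtract(self, a, b):
        return a - b


class CalculatorTests(unittest.TestCase):
    # Red: this test existed (and failed) before Calculator.add did.
    def test_adds_two_numbers(self):
        self.assertEqual(Calculator().add(2, 3), 5)

    # Red again: written before subtract was implemented.
    def test_subtracts_second_number_from_first(self):
        self.assertEqual(Calculator().subtract(5, 3), 2)
```

Saved as a file, this runs with `python -m unittest`. The example is contrived on purpose: the value is in repeating the write-test, pass-test, clean-up loop, not in the calculator.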

It seems everyone gets the red-green-refactor part down when they’re doing the simple examples, but forget it once they start working on production code. Sure, you don’t always know what your code is going to do or look like, but that’s why we have the tests. If you can’t even begin to imagine how your tests will work, write some throw away spike code. Get it working functionally, then delete it all and start again using TDD. You’ll be surprised how it changes.

Good luck with your journey. If you’re in the Canton area, don’t forget to check out the monthly Canton Software Craftsmanship meetup. There are experienced people there that are eager to help you out.

Jul 28
Commenting out old code kills puppies
Darrell Mozingo | Musings | July 28th, 2011 | 1 Comment

There, I said it. Actually, I’m kind of worried that title won’t adequately state the intensity of this situation.

This is one of the fundamental reasons we have source control, people: so we can go back through a file’s history and see the different revisions. Please, for the love of all that is holy, don’t comment out old code. Just delete it! Feel free to slap your own knuckles with a ruler if you start to think about commenting it. Don’t try to recreate a source control system through commented out code. Everyone knows exactly what I’m talking about:

// John Doe - 7/5/2011 - Changed to allow a higher limit.
// dozens of lines of old code....
// John Doe - 7/18/2011 - Changed algorithm slightly.
// dozens of lines of old code....
// random dozen lines of old code with no comment at all
public void ActualCode() { }

Those extra comment chunks are just crap to sift through to get to the real code. They’re extra stuff you’ll have to parse to see if it’s relevant to the current situation, and they create more false positives for ReSharper (and I’m guessing other refactoring tools) to pick up when you rename a variable/method that’s used inside those commented chunks. That chunk of old code at the bottom without even a hint as to why it’s commented out? That’s the worst of the worst – someone’s going to sit there and stare at it for a good while before they figure out why it was commented out, and we all know that when the author actually committed the file, the commit comment was blank too. Awesome.

So anyway, just remember what actually happens the next time you’re about to comment out old code, and don’t do it. You’ll be doing future programmers (and more than likely yourself) a huge service…

