Ramping Up – Week 1 of learning about business

I’ve always wanted to run my own business. Now that I’m 33, I’ve realized that I need to start actually trying if I want to achieve this goal.

What if this were a TV show? I would quit my job to create a startup. I would also be 22 and a lot hotter. Maybe I dropped out of college. The details aren’t important. “TV Jake” would drive himself to the edge of ruin. At the last second, everything would turn around and my startup would be the next big thing. But “Actual Jake” has a mortgage. At this point, “Go big or go home” isn’t as fun since it means “Go big or lose your home.”

The good news is that it doesn’t have to be that way. I’ve been introduced to online communities that view business differently. Podcasts like “Startups for the Rest of Us” and “Under the Radar” are run by independent developers who operate small companies. They build smaller products at a more sustainable pace. The term “lifestyle business” also gets kicked around for these, since you’re exchanging some of the salary and comfort of a big company for the lifestyle you want.

I like the idea of starting out building small projects. This is similar to how I learned to program. I started tons of projects. I made mistakes. I failed repeatedly. I tried anything that sounded interesting. I wrote command line games, wrote programs that solved my math homework, and modified WinAmp plugins to see what would happen. Eventually it started to stick.

I like the idea of failing on small ideas and building up. It maximizes what I can learn with a limited time budget. This approach has been informed by a lot of third parties. For instance, Rob Walling calls this the “stairstep approach.” David Smith of “Under the Radar” often talks about how he has a portfolio of products rather than going big on one.

So I’d like to learn business, and I feel like my first goal is very achievable:

Make $100 of profit, not counting the value of my time, on a business idea by the end of March, 2019

My first business goal

I’d like to do a few things to try to achieve this goal.

  • Incentivize myself. Ultimately I’d like to pay off my mortgage. BAM! Incentivized.
  • Hold myself accountable. I’m going to write a blog post once a week about what I have been doing in order to achieve my goal. I’ve heard this go both ways: “telling people about your goals feels like an accomplishment, which makes you less likely to actually accomplish them” versus “telling people about your goals adds a social pressure to actually complete them.” I’m choosing the method that involves filling out this domain with more content.
  • Work at least 5 hours a week on it. I believe that I will be working more than this on average. But setting a floor will mean that I will continue to make forward progress while giving myself the option to take some time off if I start to burn out.

So, let’s get started!

What did I do this week?

This week was all about ramping up! I split my time between introductory reading from people who run small businesses and gathering data to look for the first opportunity I want to pursue.

Side note: my first goal, “$100 of profit without factoring in the value of my time,” is low enough that it enables a lot of options. If the weather warms up, “selling umbrellas in Manhattan when it rains” could even be a way to do that. But I’d like to practice working on businesses with scalable economics. I’m not going to look for these kinds of opportunities unless I start to run up against my March 31 deadline.

I partitioned my research so that I wasn’t reading too much up-front. There’s no reason for me to read an article about improving my conversion rate if I don’t have conversions. So I divided Todoist into a few really coarse categories, “Research”, “Setup”, “Validate”, and “Build”, and sorted interesting articles into those buckets. I didn’t look at anything that ended up in a bucket past “Research.” Then I skimmed the articles in the Research bucket for ones that seemed particularly good. I took notes as I went, organized by category. This makes it easy to find the relevant article when I start something new like designing an onboarding flow.

The best article I read this week was a set of notes on the talk “Blind spots of the Developer/Entrepreneur” by Ben Orenstein. I thought it had a lot of really pragmatic advice for trying to make money on info products. This inspired a few of the ideas that I had this week.

I also started brainstorming and investigating niches that I could start using to make small products. I had the following three ideas:

  • Trivia questions. There are tons of existing companies that do things like run trivia nights or sell packs for you to run your own. I could pick a really narrow niche of trivia and sell questions for it, and slowly expand into being a trivia generalist. I’ve been going to a trivia night every week for the past 5 years, so I feel like this can inform the decisions I make. Plus, it means I know a few people who I can talk to about it – the trivia jockey, the bartender, and my friends.
  • Info product for Google Docs. I was a Googler who worked on Google Docs for over 4 years, and I also answered our internal feedback list. I’m one of the best positioned people in the world to write an info product on how to get the most out of Google Docs. This would be done as a Squarespace site designed to sell the info product. This would also give me an avenue to expand into other products and services, and would provide passive income.
  • Info product for how to get ramped up on PhpStorm, which is an editor that requires a license to run past the trial period. Since it has professional users, the audience is pre-qualified as people who are willing to spend money.

I vetted each of the ideas with the AdWords keyword tool. It may be a mistake, but I’m basically starting with a channel that I’d like to succeed with and comparing ideas based on which has the highest demand.

The results were surprising to me. An unbelievable number of people look for trivia questions, and the search results for a lot of popular queries don’t really seem to serve the domain that well. In comparison, not many people were searching for Google Docs at all except for very high-level questions like “what is Google Docs?” Any approach here would have to be built around long-tail queries, which I think would be difficult to validate without any experience. And almost nobody was searching for anything PhpStorm-related. It was hard to justify doing either of the info product businesses even with some generous assumptions about conversion rates.

What am I doing this week?

This week I’m going to look closer at the trivia idea. I want to identify a segment within the search data that I can target. My current thinking is that I can try to validate whether holiday-based trivia is a good idea by targeting some upcoming holidays. MLK Day is too close; I’d prefer to run a test for the 2 weeks before a holiday. But next month has both Valentine’s Day and President’s Day, so I could try to prepare trivia questions for each as a validation step – will people buy it at all? These are a month in the future though, so I’d also like to try to identify a second segment that I can start to target either this week or next week.

Interestingly, the government shutdown has also been on my mind. I wanted to start an LLC with Stripe Atlas since they also automatically set up a bank account. However, I know that there’s a chance I won’t be able to get an EIN from the IRS while the government is shut down. So I’ve been holding off on actually forming the LLC as long as possible.

That’s everything. See you next week!

Next week on “Learning about business” – I still haven’t learned about business.

A short guide to structuring code to write better tests

Why write this?

Well-written tests often have a positive return on investment. This makes sense; bugs become more expensive to fix the later in the development process they are discovered. This is backed by research. It also matches my experience at Etsy, my current employer. Detecting a bug in our development environment is cheaper than detecting it in staging, which is cheaper than detecting it in production, which is cheaper than trying to divine what a forum post means when it says “THEY BROKE SEARCH AGAIN WHY CAN’T THEY JUST FIX SEARCH??,” which is cheaper than debugging a vague alert about async jobs failing.

Over my career I’ve rediscovered what many know: there are good tests and bad tests. Good tests are mostly invisible except when they catch regressions. Bad tests fail frequently, and their failures usually aren’t real regressions; more often, the test logic makes assumptions about the implementation logic and the two have drifted. These tests need endless tweaking to keep the implementation and test logic in sync.

So here’s a guide to help you write better tests by improving how your code is structured. It’s presented as a set of guidelines. They were developed over a few years when I was at Google. My team noticed that we had good tests and bad tests, and we invested time in digging up characteristics of each. I feel like they are applicable outside the original domain, since I have successfully used these techniques since then.

Some may point out that this post isn’t a “short guide” by many definitions. But I think it’s better than saying “Read this 350 page book on testing. Now that I have pointed you to a resource I will not comment further on the issue.”

Please ask me questions!

Get HYPE for a testing discussion!

“Testing” is a broad topic, so I want to explain the domain I have in mind. I’m targeting a database-driven website or API. I’m not thinking about countless other environments like microcontrollers or hard realtime robotics or batch data processing pipelines or anything else. The techniques in this post can be applied broadly, and can be applicable outside of the web domain. But not all of them work for all situations. You’re in the best position to decide what works for you.

For discussion, I will introduce an imaginary PHP testing framework for evil scientists looking to make city-wide assertions: “Citizens of New York”, or cony[0]. It will be invoked as follows:

$x = 3;
cony\BEHOLD::that($x)->equals(3);
cony\BEHOLD::that($x)->isNotNull();

Terminology

Everyone has their own testing terminology. That means this blog post is hopeless. People are going to skip this section and disagree with something that I didn’t say. This happened with my test readers even though the terminology section was already in place. But here goes!

Here are some definitions from Martin Fowler – Mocks Aren’t Stubs:

Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).

Mocks are […] objects pre-programmed with expectations which form a specification of the calls they are expected to receive.

Martin Fowler’s test object definitions

Here are a few more definitions that I will use:

Unit test: A test that verifies the return values, state transitions, and side effects of a single function or class. Assumed to be deterministic.

Integration test: A test that verifies the interaction between multiple components. May be fully deterministic or include non-deterministic elements. For instance, a test that executes a controller’s handler backed by a real database instance.

System test: A test that verifies a full system end-to-end without any knowledge of the code. Often contains nondeterministic elements like database connections and API requests. For instance, a Selenium test.

Real object: A function or class that you’d actually use in production.

Fragile test: A test whose assertion logic easily diverges from the implementation logic. Failures in fragile tests are often not due to regressions, but due to a logic divergence between the test and implementation.

A few more definitions I needed

This post mostly discusses using “real” vs “fake” vs “mock” objects. When I say “fake” I will be interchanging a bunch of things that you can find defined in Martin Fowler’s article, like a dummy, fake, stub, or spy. This is because their implementations are often similar or identical despite being conceptually different. The differences matter in some contexts, but they don’t contribute much to this discussion.

Dependency injection is your best friend

Injecting a dependency means passing it in where it is needed rather than statically accessing or constructing it in place.

For instance:

// No dependency injection.
public static function isMobileRequest(): bool {
   $request = HttpRequest::getInstance();
   // OMITTED: calculate $is_mobile from $request's user agent
   return $is_mobile;
}

// With dependency injection.
public static function isMobileRequest(HttpRequest $request): bool {
   // OMITTED: calculate $is_mobile from $request's user agent
   return $is_mobile;
}

Dependency injection makes this easier to test for three reasons.

First, examine the static accessor for the HTTP request. Imagine testing it. You’d need to create machinery in the singleton to set an instance for testing. Alternatively, you’d need to mock out that call. But the following test is much simpler:

public static function testIsMobileRequest(): void {
    $mobile_request = Testing_HttpRequest::newMobileRequest();
    $desktop_request = Testing_HttpRequest::newDesktopRequest();
    cony\BEHOLD::that(MyClass::isMobileRequest($mobile_request))->isTrue();
    cony\BEHOLD::that(MyClass::isMobileRequest($desktop_request))->isFalse();
}

Second, passing dependencies allows common utils to be written. There will be a one-time cost to implement newMobileRequest() and newDesktopRequest() if they don’t exist when you start writing your test. But other tests can use them once they exist. Writing utils pays off very quickly, sometimes after only one or two usages.
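As a sketch, such helpers might look like the following. (Testing_HttpRequest, the HttpRequest constructor, and the user agent strings are my assumptions here, not a real API; adapt them to your own request class.)

```php
<?php

// Hypothetical request object with an injected user agent.
class HttpRequest {
    private $user_agent;

    public function __construct(string $user_agent) {
        $this->user_agent = $user_agent;
    }

    public function getUserAgent(): string {
        return $this->user_agent;
    }
}

// Shared factories: written once, reused by every test that needs a request.
class Testing_HttpRequest {
    public static function newMobileRequest(): HttpRequest {
        return new HttpRequest('Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X)');
    }

    public static function newDesktopRequest(): HttpRequest {
        return new HttpRequest('Mozilla/5.0 (Windows NT 10.0; Win64; x64)');
    }
}
```

Centralizing the user agent strings in one place also means that a future change to HttpRequest only touches these factories, not every test that uses a request.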

Third, dependency injection will pay off for isMobileRequest() as the program grows. Imagine that it’s nested a few levels deep: used by a configuration object that’s used by a model util that’s called by a view. Now you’re calling your view renderer and you see that it takes an HTTP request. This has two benefits. It exposes that the behavior of the view is parameterized by the HTTP request. It also lets you say, “that’s insane! I need to restructure this” and figure out a cleaner structure. This is a tradeoff; you need to manage some parameter cruft to get these benefits. But in my long experience with this approach, managing these parameters isn’t a problem even when the list grows really long. And the benefits are worth it.

Inject the smallest thing needed by your code

We can make isMobileRequest even more maintainable. Look at testIsMobileRequest again. To write a proper test function, an entire HttpRequest needs to be created twice. Imagine that it gains extra dependencies over time. A MobileDetector and a DesktopDetector and a VirtualHeadsetDetector and a StreamProcessor. And because other tests inject their own, the constructors use dependency injection.

public static function testIsMobileRequest(): void {
    $mobile_detector = new MobileDetector();
    $desktop_detector = new DesktopDetector();
    $vh_detector = new VirtualHeadsetDetector();
    $stream_processor = new StreamProcessor();
    $mobile_request = Testing_HttpRequest::newMobileRequest(
        $mobile_detector, $desktop_detector, $vh_detector, $stream_processor
    );
    $desktop_request = Testing_HttpRequest::newDesktopRequest(
        $mobile_detector, $desktop_detector, $vh_detector, $stream_processor
    );
    cony\BEHOLD::that(MyClass::isMobileRequest($mobile_request))->isTrue();
    cony\BEHOLD::that(MyClass::isMobileRequest($desktop_request))->isFalse();
}

It’s more code than before. That’s fine. This is what tests tend to look like when you have lots of dependency injection. But this test can be simpler. The implementation only needs the user agent in order to properly classify a request.

public static function isMobileRequest(string $user_agent): bool {
    // OMITTED: calculate $is_mobile from $user_agent
    return $is_mobile;
}

public static function testIsMobileRequest(): void {
    $mobile_ua = Testing_HttpRequest::$mobile_useragent;
    $desktop_ua = Testing_HttpRequest::$desktop_useragent;
    cony\BEHOLD::that(MyClass::isMobileRequest($mobile_ua))->isTrue();
    cony\BEHOLD::that(MyClass::isMobileRequest($desktop_ua))->isFalse();
}

We’ve made the code simpler by only passing in the limited dependency. The test is also more maintainable. Now isMobileRequest and testIsMobileRequest won’t need to be changed whenever changes are made to HttpRequest.

You should be aggressive about this. To test an object, you need to instantiate the transitive closure of its dependencies. Keeping dependencies narrow makes objects easier to instantiate for tests, which makes testing easier overall.

Write tests for failure cases

In my experience, failure cases are often neglected in tests. There’s a major temptation to check in a test as soon as it first succeeds. But there are often more ways for code to fail than to succeed. Failures can be nearly impossible to replicate manually, so it’s important to automatically verify failure cases in tests.

Understanding the failure cases for your systems is a major step towards resilience. Failure tests execute logic that could be the difference between partial degradation and a full outage: what happens when things go wrong? What happens when the connection to the database is down? What happens when you can’t read a file from disk? The tests will verify that your system behaves as expected when there is a partial outage, or that your users get the proper error messages, or whatever behaviors you need to ensure that the single failure doesn’t turn into a full-scale outage.

This isn’t a magic wand. There will always be failures that you don’t think to test, and they will inevitably bring down your site. But you can minimize this risk by adding failure tests as you code.
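For instance, here is a minimal sketch of testing a failure path. (loadGreeting() and its fallback behavior are hypothetical, and I’m using plain assert() so the snippet stands alone.)

```php
<?php

// Hypothetical function: reads a greeting from disk, but degrades to a safe
// default instead of fatally erroring when the read fails.
function loadGreeting(string $path): string {
    $contents = @file_get_contents($path);
    if ($contents === false) {
        // Failure path: this is the branch worth pinning down in a test,
        // because "file missing in production" is hard to replicate by hand.
        return 'Hello!';
    }
    return trim($contents);
}

// The failure test: point the function at a path that cannot exist.
function testLoadGreetingWhenFileIsMissing(): void {
    assert(loadGreeting('/nonexistent/greeting.txt') === 'Hello!');
}

testLoadGreetingWhenFileIsMissing();
```

The success path is easy to exercise by hand during development; the failure branch is the one most likely to rot unnoticed without a test.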

Use real objects whenever possible

You often have several options for injecting dependencies into the implementation being tested. You could construct a real instance of the dependency. You could create an interface for the dependency and write a fake implementation. Or you could mock out the dependency.

When possible, prefer to use a real instance of the object rather than fakes or mocks. This should be done when the following circumstances are true:

  • Constructing the real object is not a burden. This becomes more likely when dependency injecting the smallest thing needed by the code
  • The resulting test is still deterministic
  • State transitions in the real object can be detected completely via the object’s API or the return value of the function

The real object is preferable to the fake because the test verifies the real interaction your code will have with the dependency in production. You can verify the correct thing happened in a few different ways. Maybe you’re testing whether the return values change in response to the injected object. Or you can check that the function actually modifies the state of the dependency, like seeing that an in-memory key value store has been modified.

The real object is preferable to the mock because it doesn’t make assumptions about how the two objects interact. The exact API details of the interaction are not important compared to what it actually does to the dependency. Mocks often create fragile tests since they record everything that should be happening: what methods should be invoked, any parameters that are being passed, and so on.

Even worse, the test author dictates the return value from the mocked object. It may not be a sane return value for the parameters when the test is written. It may not remain true over time. It bakes extra assumptions into the test file that don’t need to be there. Imagine that you go through the trouble of mocking a single method 85 times, and then you make a major change to the real method’s behavior that may invalidate the mocked returns. Now you need to examine each of the 85 cases and decide how each will change, and additionally how each test case will need to adapt. Or you fix the two that fail and hope that the other 83 are still accurate just because they’re still passing. For my money, I’d rather just use the real object.

The key observation is that “how did something get changed?” matters way less than “what changed?” Your users don’t care which API puts a word into spellcheck. They just care that it persists between page reloads. A corollary is that if “how” matters quite a lot, then you should be using a mock or a spy or something similar.
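When the “how” does matter, a hand-rolled spy is often enough. As a sketch (Counter, SpyCounter, and recordPageView() are all hypothetical names):

```php
<?php

interface Counter {
    public function increment(string $name): void;
}

// A hand-rolled spy: records the calls it receives so the test can assert
// on the interaction itself.
class SpyCounter implements Counter {
    public $calls = [];

    public function increment(string $name): void {
        $this->calls[] = $name;
    }
}

// Hypothetical function under test. The side effect *is* the behavior here,
// so asserting on "how" is the right call.
function recordPageView(Counter $counter, bool $is_mobile): void {
    $counter->increment($is_mobile ? 'pageviews.mobile' : 'pageviews.desktop');
}
```

A test would call recordPageView() against a SpyCounter and assert on $calls, verifying exactly which counter names were incremented and in what order.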

Combining this with the structuring rules above creates a relatively simple rule: Reduce necessary dependencies whenever possible, and prefer the real objects to mocks when you need complex dependencies.

A careful reader will note that using real objects turns unit tests into deterministic integration tests. That’s fine. Improving the maintenance burden is more desirable than maintaining ideological purity. Plus you will be testing how your code actually runs in production. Note that this isn’t an argument against unit tests – all of the structuring techniques in this doc are designed to make it easier to write unit tests. This is just a tactical case where the best unit test turns out to be a deterministic integration test.

Another complaint I’ve heard about this approach is “but a single error in a common dependency could cause dozens of errors across all tests.” That’s actually good! You made dozens of integration errors and the test suite caught all of them. What a time to be alive. These are also easy to debug: you can choose from dozens of stack traces to help investigate what went wrong. In my experience, the fix is usually in the dependency’s file rather than spread across tons of files.

Prefer fakes to mocks

A real object should not be used if you can’t verify what you need from its interface, or it’s frustrating to construct, or it is nondeterministic. At that point the techniques at your disposal are fake implementations and mock implementations. Prefer fake implementations over mock implementations when all else is equal. This reuses much of the same reasoning as the previous section.

Fake viking ship implementation

Despite the name, a fake implementation is a trivial but real implementation of an interface. When your code interacts with the fake object, side effects and return values should follow the same contract as the real implementation. This is good: you are verifying that your code behaves correctly against a correct implementation of the interface. You can also add convenience setters or getters to your fake implementation that you might not ordinarily put on the interface.

Fakes also minimize the number of assumptions that a test makes about the implementation. You’re not specifying the exact calls that are going to be made, or the order that the same function returns different values, or the exact values of parameters. Instead you will be either checking that the return value of your function changes based on data in the fake, or you will be verifying that the state of the fake matches your expectations after test function execution.

Here’s an example implementation:

interface KeyValueStore {
    public function has(string $key): bool;
    public function get(string $key): string;
    public function set(string $key, string $value);
}

// Only used in production. Connects to a real Redis implementation.
// Includes error logging, StatsD, everything!
class RedisKeyValueStore implements KeyValueStore {}

class Testing_FakeKeyValueStore implements KeyValueStore {
    private $data;

    public function __construct() {
        $this->data = [];
    }

    public function has(string $key): bool {
        return array_key_exists($key, $this->data);
    }

    public function get(string $key): string {
        if (!$this->has($key)) {
            throw new Exception("No key $key");
        }
        return $this->data[$key];
    }

    public function set(string $key, string $value) {
        $this->data[$key] = $value;
    }
}

Another benefit is that you now have a reusable test implementation of KeyValueStore that you can easily use anywhere. Suppose you have a function that consults the store, say a hypothetical needsToBeCached(). As you tweak its implementation over time, you will only need to change the tests when the side effects or return value change. You will not need to update tests to keep mocks in sync with the exact logic used in the implementation.
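As a sketch of what such a test might look like (needsToBeCached() and its length threshold are hypothetical; the fake is a condensed stand-in for the Testing_FakeKeyValueStore above, repeated so the snippet is self-contained):

```php
<?php

interface KeyValueStore {
    public function has(string $key): bool;
    public function set(string $key, string $value);
}

// Condensed stand-in for the Testing_FakeKeyValueStore defined earlier.
class Testing_FakeKeyValueStore implements KeyValueStore {
    private $data = [];

    public function has(string $key): bool {
        return array_key_exists($key, $this->data);
    }

    public function set(string $key, string $value) {
        $this->data[$key] = $value;
    }
}

// Hypothetical function under test: a value needs caching when it is
// non-trivial and not already in the store.
function needsToBeCached(KeyValueStore $store, string $key, string $value): bool {
    return strlen($value) > 8 && !$store->has($key);
}

function testNeedsToBeCached(): void {
    $store = new Testing_FakeKeyValueStore();
    assert(needsToBeCached($store, 'greeting', 'a long greeting'));

    // After the value is stored, the fake's state drives the return value.
    $store->set('greeting', 'a long greeting');
    assert(!needsToBeCached($store, 'greeting', 'a long greeting'));

    // Short values never need caching.
    assert(!needsToBeCached($store, 'id', 'abc'));
}

testNeedsToBeCached();
```

Note that the test asserts only on return values and the fake’s state, never on which methods were called; the implementation of needsToBeCached() is free to change as long as the contract holds.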

There are many cases where this is a bad fit, and anything that sounds like a bad idea is probably a bad idea. Don’t fake a SQL database. If your code has an I/O boundary like network requests, you will basically have no choice but to mock that. You can always abstract it behind other layers, but at some point you will need to write a test for that final layer.

Prefer writing a simple test with mocks to faking a ton of things or writing a massive integration test

I spend lots of time encouraging test authors to avoid mocks as a default testing strategy. Still, I acknowledge that mocks exist for a reason. To borrow the XML adage, an automatic mocking framework is like violence: if it doesn’t solve your problem you’re not using enough of it. A determined tester can mock as many things as needed to isolate an effect in any code. My ideal testing strategy is more tactical and requires discipline. Imagine that you’re adding the first test for an ancient monolithic controller. You have roughly three options: prepping a database to run against a fake request you construct, spending a ton of time refactoring dependencies, or mocking a couple of methods. You should probably do the last one out of pragmatism. Just writing a test at all will make the file more testable, since now the infrastructure exists.

You can slowly improve the code’s organization as you continue to make edits. This will start to enable the techniques that lead to less fragile tests.

Always weigh the cost and benefit of the approaches you take. I’ve outlined several techniques above that I think lead to better tests. Unfortunately, they may not be immediately usable on your project. It takes time to reshape a codebase. As you use these techniques you will discover what works best for your own projects, and you can refine them as you go.

System tests pay for themselves, but it’s hard to predict which ones are worth writing

At Google, my team had a long stretch where we wrote a system test for every regression. We were optimistic that they would become easier to write over time. Eventually the burden could not be ignored: they were flaky and we never ended up in our dream tooling state. So we phased out this strategy. But one day I was discussing an “incremental find” system test with a few teammates. We figured out that this single test saved us from regressing production an average of 4 times per person. Our bugs surfaced on our dev machines instead of later in our deployment process. This saved each of us lots of expensive debugging from user reports or monitoring graphs.

We couldn’t think of another system test that was nearly that valuable. It followed a Pareto distribution: most bugs were caught by a few tests. Many tests caught only a bug or two. Many other tests had similar characteristics (user-visible, simple functionality backed by lots of complex code, easy to make false assumptions about the spec), but only this one saved full eng-months.

So system tests aren’t magic, and all of my experience suggests that we should only use them tactically. The critical paths of the customer flow are a good first-order guide to which system tests to write. Consider adding new system tests as the definition of your critical path changes.

What’s next?

Write tests for your code! Tests are the best forcing function for properly structuring your code, and properly structured implementation code makes testing easier for everyone. As you come up with good generic techniques, share them with people on your team. When you write utilities that others will find useful, make them easy to discover and reuse.

Even though this guide is well north of 3000 words, it still only scratches the surface of the subject of structuring code and tests. Check out “Refactoring” by Martin Fowler if you’d like to read more on the subject of how to write code to be more testable.

I don’t recommend following me on Twitter unless you want to read a software engineer complain about how cold it is outside.

Thanks to everyone at Etsy who provided feedback on drafts of this, whether you agreed with everything or not!

Footnotes

[0] I’ve seen this joke before but I can’t figure out where. Please send me pointers to the source material!

What does a tech lead do?

This was written internally at Etsy. I was encouraged to post on my own personal blog so people could share. These are my opinions, and not “Etsy official” in any way.

Motivation for writing this

For the past 5 months, I have been the tech lead on the Search Experience team at Etsy. Our engineering manager had a good philosophy for splitting work between managers and tech leads. The engineering manager is responsible for getting a project to the point where an engineer starts working on it. The tech lead makes sure everything happens after that. Accordingly, this is intended to document the mindset that helps drive “everything after that.”

Having a tech lead has helped our team work smoothly. We’ve generated sizable Gross Merchandise Sales (GMS) wins. We release our projects on a predictable schedule, with little drama. I’ve seen this structure succeed in the past, both at Etsy and at previous companies.

You can learn how to be a tech lead. You can be good at it. Somebody should do it. It might as well be you. This advice sounds a little strange since:

  • It’s a role at many companies, but not always an official title
  • Not every team has them
  • The work is hard, and can be unrecognized
  • You don’t need to be considered a tech lead to do anything this document recommends

But teams run more efficiently and spread knowledge more quickly when there is a single person setting the technical direction of a team.

Who is this meant for?

An engineer who is leading a project of 2-7 people, either officially or unofficially. This isn’t meant for larger teams, or leading a team of teams. In my experience, 8-10 people is an inflection point where communication overhead explodes. At this point, more time needs to be spent on process and organization.

What’s the mindset of a tech lead?

This is a series of principles that led to good results for Search Experience, or are necessary to do the job. I’m documenting what works well in my experience.

More responsibility → Less time writing code

When I was fresh out of college, I worked at a computer vision research lab. I thought the most important thing was to write lots of code. This worked well. My boss was happy, and I was slowly given more responsibility. But then the recession hit military subcontractors, and the company went under. Life comes at you fast!

So I joined BigCo, and started at the bottom of the totem pole again. I focused on writing a lot of code, and learned to do it on large teams. This worked well. I slowly gained responsibility, and was finally given the task of running a small project. Until this point, I had been successful by focusing on writing lots of code. So I was going to write lots of code, right?

Wrong. After 2 weeks, my manager pulled me aside, and said, “Nobody on your team has anything to do because you haven’t organized the backlog of tasks in three days. Why were you coding all morning? You need to make sure your team is running smoothly before you do anything else.”

Okay, point taken.

So I made daily calendar reminders to focus on doing this extra prep work for the team. When I did this work, we moved faster as a three-person unit. But I could see in my code stats exactly when I started focusing more on the team. There was a noticeable dip. And I felt guilty, even though I expected this! Commits and lines of code are very easy ways to measure productivity, but when you’re a tech lead, your first priority is the team’s holistic productivity. You just have to fight the guilt. You’ll still experience it; recognize the feeling and work through it.

Help others first

It sounds nice to say that you should unblock your team before moving yourself forward, but what does this mean in practice?

First, if you have work, but someone needs your help, then you should help them first. As a senior engineer, your time is leveraged: spending 30 minutes of your time may save days of someone else’s. Those numbers sound skewed, but this is the same principle behind the idea that bugs get dramatically more expensive to fix the later they are discovered. It’s cheaper to do things than redo things. You get a chance to save your teammates from having to rediscover things that are already known, or spare them from writing something that’s already written. Some exploration is good. But there’s always a threshold, and you should encourage teammates to set a deadline based on the task; once they pass it, asking for help is the best move. Asking early also catches bugs before they’re even written, long before they become push or production problems.

Same for code reviews. If you have technical work to do, but you have a code review waiting, you should do the code review first. Waiting on someone to review your code is brutal, especially if the reviewing round-trip is really long. If you sit on it, the engineer will context switch to a new task. It’s best to do reviews when their memory of the code is fresh. They’re going to have faster and better answers to your questions, and will be able to quickly tweak their pull request to be submission-ready.

It’s also important to encourage large changes to be split into multiple pull requests. When discussing projects up-front, make sure to recommend how to split it up. For instance, “The first one will add the API, the second one will send the data to the client, and the third one will use the data to render the new component.” This allows you to examine each change in detail, without needing to spend hours reviewing and re-reviewing an enormous pull request. If you believe a change is too risky to submit all at once because it’s so large that you can’t understand all of its consequences, it’s OK to request that it be split up. You should be confident that changes won’t take down the site.
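To make that concrete, here’s a minimal sketch of the API → client → component split as stacked git branches, where each branch builds on the previous one and goes out as its own small pull request. The branch names and commit messages are hypothetical, and the commits are empty placeholders standing in for real changes.

```shell
# Hypothetical example of splitting one large feature into three stacked
# branches, so each piece can be reviewed as its own small pull request.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
# Helper so the example doesn't depend on local git identity config.
commit() { git -c user.name=dev -c user.email=dev@example.com commit -q --allow-empty -m "$1"; }
commit "base"

git checkout -q -b add-api              # PR 1: add the API
commit "Add search API endpoint"

git checkout -q -b send-to-client       # PR 2: send the data to the client
commit "Send search data to the client"

git checkout -q -b render-component     # PR 3: render the new component
commit "Use the data to render the new component"

git log --oneline main..render-component   # the three reviewable pieces
```

Once the first branch merges, the later ones can be rebased onto the new main, and each review stays small enough to understand in full.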

Even with this attitude, you won’t review all pull requests quickly. It’s impossible. For instance, most of my team isn’t in my timezone. I get reviews outside of work hours, and I don’t hop on my computer to review them until I get into work at the crack of 10.

I personally view code reviews and questions as interruptible. If I have a code review from our team, I will stop what I am doing and review it. This is not for everybody, because it’s yet another interruption type, and honestly, it’s exhausting to be interrupted all day. Dealing with interruptions has gotten easier for me over time, but I’ve gotten feedback from several people that it hasn’t for them. You never really get good at being interrupted; you just get better at managing your time out of pure necessity.

Much of your time will be spent helping junior engineers

A prototypical senior engineer is self-directed. You can throw them an unbounded problem, and they will organize the work. They have an instinct for when they need to build consensus. They break down technical work into chunks, and figure out what questions need to be answered. They will rarely surprise you in a negative way.

However, not everybody is a senior engineer. Your team will have a mix of junior and senior engineers. That’s good! Junior engineers are an investment, and every senior engineer is in that position because people invested in them. There’s no magical algorithm that dictates how to split time between engineers on your team. But I’ve noticed that the more junior a person is, the more time I spend with them.

There’s a corollary here. Make sure that new engineers know they can ask for your time. Make it clear that it is normal to contact you, and that there is no penalty for doing so. I remember being scared to ask senior engineers questions when I was a junior engineer, so I always try hard to be friendly when they ask their first few questions. Go over in person if they are at the office, and make sure that their question has been fully answered. Check in on them if they disappear for a day or two. Draw a picture of what you’re talking about, and offer them the paper after you’re done talking.

The buck stops here

My manager once told me that leaders take responsibility for problems that don’t have a clear owner. In my experience, this means that you become responsible for lots of unsexy, and often thankless, work to move the team forward.

The question, “What are things that should be easy, but are hard?”, is a good heuristic for where to spend time. For instance, when Search Experience was a new team, rolling out new features was painful. We never tire-kicked features the same way, we didn’t know what groups we should test with, we’d (unpleasantly) surprise our data scientist, and sometimes we’d forget to enable stuff for employees when testing. So I wrote a document that explained, step-by-step, how we should guide features from conception to A/B testing to the decision to launch them or disable them. Then our data scientist added tons of information about when to involve her during this process. And now rolling out features is much easier, because we have a playbook for what to do.

This can be confusing with an engineering manager and/or product manager in the picture, since they should also be default-responsible for making sure things get done. But this isn’t as much of a problem as it sounds. Imagine a pop fly in baseball, where a ball falls between three people. It’s bad if everyone stands still and watches it hit the ground. It’s better if all of you run into each other trying to catch it (since the odds of catching it are better than nobody trying). It’s best if the three of you have a system for dealing with unexpected issues. Regular 1:1s and status updates are a great way to address this, especially in the beginning.

Being an ally

Read Toria Gibbs’ and Ian Malpass’ great post, “Being an Effective Ally to Women and Non-Binary People”, and take it to heart. You’re default-responsible for engineering on your team. And that means it’s up to you to make sure that all of your team members, including those from underrepresented groups, have an ally in you.

“What does being a tech lead have to do with being an ally?” is a fair question.

First, you are the point person within your team. You will be involved in most or all technical discussions, and you will be driving many of them. Make sure members of underrepresented groups have an opportunity to speak. If they haven’t gotten the chance yet, ask them questions like, “Are we missing any options?” or “You’ve done a lot of work on X, how do you think we should approach this?”. If you are reiterating someone’s point, always credit them: “I agree with Alice that X is the right way to go.”

You will also be the point person for external teams. Use that opportunity to amplify underrepresented groups by highlighting their work. If your time is taken up by tech leading, then other people are doing most of the coding on the team. When you give code pointers, mention who wrote it. If someone else has a stronger understanding of a part of the code, defer technical discussions to them, or include them in the conversation. Make sure the right names end up in visible places! For instance, Etsy’s A/B testing framework shows the name of the person who created the experiment. So I always encourage our engineers to make their own experiments, allowing the names to be visible to all of our resident A/B test snoopers (there are dozens of us). If someone contributes to a design, list them as co-authors. You never know how long a document will live.

Take advantage of the role for spreading knowledge

When a team has a tech lead, that person ends up acting as a central hub of activity. They’ll talk about designs and review code for each of the projects on the team.

If you read all the code your team sends out in pull requests, you will learn at an accelerated rate. You will quickly develop a deep understanding of your team’s codebase. You will see techniques that work. You can ask questions about things that are unclear. If you are also doing code reviews outside of your team, you will learn about new technologies, libraries, and techniques from other developers. This enables you to more effectively support your team with what you have learned from across the company.

Picture a small team where Alice is the tech lead, and Bob is working directly with Carol. All other projects are one-person efforts. Alice is in a position where she can learn quickly from all engineers, and spread information through the team.

Since you are in this position, you can quickly define and spread best practices through the team. A good resource with suggestions for code reviews is this presentation by former Etsy employee Amy Ciavolino. It describes a friendly, team-oriented style. Feel free to adapt parts of it to your own style. If you’ve worked with me, you’ll notice this sometimes differs from what I do. For instance, if I have “What do you think?” feedback, I prefer to have in-person/Slack/Vidyo conversations. This often ends in brainstorming, and in creating a third approach that’s better than what either of us envisioned. But this presentation is a great start, and a strong guideline.

Day-to-day work

As I mentioned above, much of the work of a tech lead is interrupt-driven. This is good for the team, but it adds challenges to scheduling your own time. On a light day, I’ll spend maybe an hour doing tech lead work. But on a heavy day, I’ll get about an hour of time that’s not eaten up by interruptions.

Accordingly, it’s difficult to estimate what day you will finish something. I worked out a system with our engineering manager that worked well: I only took on projects that were small, non-blocking, or had no deadline. This works well on teams that keep process minimal; it will be a major adjustment on teams that are hyper-organized around estimation.

You need to fight the guilt that comes with this. Your job isn’t to crank out the most code. Your job is to make the people on your team look good. If something important needs to be done, and you don’t have time to do it, you should delegate it. This will help the whole team move forward.

When I’m deciding what to do, I do things in roughly this priority:

Inner loop:

  1. Answer any Slack pings
  2. Help anybody who needs it in my team’s channel
  3. Do any pending code reviews
  4. Make sure everybody on the team has enough work for the rest of the week
  5. Do any process / organizational work
  6. Project work

Once a day:

  1. Check performance graphs. Investigate (or delegate) any major regressions in areas it looks like we might have affected.
  2. Check all A/B experiments. For new experiments, look for bucketing errors, performance problems (or unexpected gains, which are more likely to be bugs), etc.

Once a week:

  1. Look through the bug backlog, make sure a major bug isn’t slipping through the cracks.
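As a rough illustration (not a real tool), the inner loop above can be expressed as a tiny prioritizer: given the set of pending task types, always pick the highest-priority one first. The task names are just the list above, abbreviated; nothing here is an actual API.

```shell
# Illustrative sketch: the inner loop as an ordered checklist.
# next_task prints the highest-priority pending task, in inner-loop order.
inner_loop="slack-pings team-channel code-reviews week-planning process-work project-work"

next_task() {
    pending="$*"
    for task in $inner_loop; do
        case " $pending " in
            *" $task "*) echo "$task"; return 0 ;;
        esac
    done
    echo "none"
}

# Even with project work pending, a waiting code review comes first.
next_task project-work code-reviews
```

The point isn’t the script; it’s that the ordering is fixed in advance, so on a heavily interrupted day you don’t have to re-decide what matters most every time you’re pinged.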

What this means for engineering managers

Many teams don’t have tech leads, but every team needs tech leadership in order to effectively function. This is a call-to-action for engineering managers to examine the dynamics of their teams. Who on your team is performing this work? Are they being rewarded for it? In particular, look for members of underrepresented groups, who may be penalized for writing less code due to unconscious bias.

Imagine a team of engineers. The duties listed above probably fall into one of these categories:

  1. A designated tech lead handles the work. If your team falls into this category, then great! Make sure that the engineer or engineers performing these duties are recognized.

  2. Someone’s taking responsibility for it, on top of their existing work. This can be a blessing or a curse for engineers, based on how the engineering manager perceives leadership work. It’s possible that their work is appreciated. But it’s also possible that people are only witnessing their coding output drop, without recognizing the work to move the team forward. If you’re on a team where #2 is mostly true (tech lead is not formalized, and some engineer is taking responsibility for moving the team forward, at the expense of their own IC work), ask yourself this: are they being judged just for the work they do themselves? Or are they being rewarded for all the transitive work that they enable?

  3. A few people do them, but they often get neglected. Work still gets done in this category, but there are systematic blockers. If nobody owns code reviews, it will take a long time for code to be reviewed. If nobody owns code quality, your codebase will become a swiss cheese of undeleted, broken flags.

  4. Nobody is taking responsibility for them. In this category, some things just won’t get done at all. For instance, if nobody is default-responsible for being an ally for underrepresented groups, then it’s likely that this will just be dropped on the floor. This kind of thing is fractal: if we drop the ball at the group level, we’ve dropped the ball at both the individual and company-wide levels.

In conclusion

There is value in having a designated tech lead for your team. They will create and promote best practices, be a point person within your team, and remove engineering roadblocks. Also, this work is likely already being done by somebody, so it’s important to officially recognize the people who are taking on this responsibility.

There is also lots of value in officially taking on this role. It allows you to leverage your time to move the organization forward, and enables you to influence engineering throughout the entire team.

If you’re taking on this work, and you’re not officially a tech lead, you should talk with your manager about it. If you’d like to move towards becoming a tech lead, talk to your manager (or tech lead, if you have one!) about any responsibilities you can take on.

Thanks to Katie Sylor-Miller, Rachana Kumar, and Toria Gibbs for providing great feedback on drafts of this, and to everyone who proofread my writing.