About a week ago, I watched a video that Jeremy Howard released. It explained the need for wearing masks during the coronavirus outbreak, and demonstrated how to make one with just a T-shirt and scissors.
Jump to 7:00 to see the cuts he makes in the shirt.
Luckily, I had surplus cotton undershirts to experiment on before I could get masks made from more comfortable material, which gave me room for trial and error. I made two masks that night, and my girlfriend and I started wearing them.
My girlfriend and I have both made a few masks by this point. Most of our masks have been made out of T-shirts, but I also sewed a mask yesterday using 600 thread count sheets. You can make a few choices that will improve your experience. Don’t get me wrong: these masks are forgiving. Anything you cut will probably work. But some decisions can save you some frustration.
Cut T-shirt masks low enough below the sleeves
Make sure that you cut low enough. Otherwise you may not have enough material to cover your nose and mouth. Or even worse: you’ll have exactly the right amount and need to continually adjust the mask. This is a nonstarter, since you shouldn’t touch your face when you have the mask on.
You don’t need to go crazy when considering the cut. An inch below the sleeves is sufficient. Cut straight across the shirt. Avoid veering the cut towards the neck. You’ll do great.
Cut the front and back of the shirt as evenly as you can
One of my first two masks was “the bad mask,” which we threw away. When cutting the mask, I allowed the top and bottom layers to slide against each other. I cut the cloth at different heights on each layer. After the cuts, the cloth from the back didn’t fully cover our faces.
This is easy to prevent. Spreading the shirt out on a table helps, and so do sharp scissors. When the fabric starts to slide, stop cutting, adjust it, and keep going. The cuts may still end up a little uneven, and that’s fine.
Err on the side of thicker straps
I broke my favorite mask while tying it: the straps were too thin. Err on the side of cutting your straps a little thicker.
Cut the “U” in the sleeves deep enough
On men’s medium undershirts, I needed to cut the U deeper into the sleeves than I expected. I needed to cut even further on a real T-shirt. In our first cuts, the U stopped before the stitching that connects the sleeve to the torso. This made the masks difficult to tie. Extending the cut at least a half inch past this stitching made the masks much easier to tie.
Wearing the mask
Wear T-shirt masks with a hat
A hat helps hold the top knot in place. I went on a two-hour grocery run a few days ago, and the hat pinned the mask to my head. I didn’t need to readjust it once it was comfortable.
The hat also helps the aesthetic a little. At least, as measured by a handful of comments. One delivery driver offered that it looked like I had a hurt tooth. My girlfriend suggested that it looked like our heads were gift-wrapped presents. The hat fixes this somehow. Now, people joke that I’m going to rob them, which I guess is an improvement? My friend Ryan wore a hoodie with a homemade T-shirt mask. He looked like a cartoon assassin. So there’s a lot of variability in the look.
Plan for the time
I take walks during my working day to stay sane. The walk is often sandwiched between meetings, and I need to return before the next one starts. It takes a few minutes to tie the masks. It also takes extra time to wash them after the walk. I pad my walks with 5 more minutes, and that is enough.
Don’t mess this up. Clean your mask after you go outside.
You’ll get over the embarrassment
My girlfriend and I felt odd the first time that we stepped onto the street wearing our masks. It felt like wearing a Halloween costume in March. But we got over it. And we felt safer when we walked past people.
It became irrelevant a few days later. Most of my neighbors started wearing masks in response to what they were reading in the news. Now I’d feel silly if I didn’t wear a mask.
Wearing a T-shirt mask with glasses
My glasses fogged up on my first walk. Completely. I had to remove them to see anything. I was just walking around my neighborhood, so it wasn’t a problem. But I needed to fix this before I could go grocery shopping.
Glasses fogging has been a continual problem. But with some experimentation, I’ve started to make it manageable.
Breathe out through the cloth
My glasses fog when I exhale normally. To combat this, I press my lips against the cloth when I exhale. I also try to exhale through just my mouth. In practice, I get into a rhythm of inhaling through my nose and exhaling through the cloth. You don’t need to get this rigorous. It’s just a habit I’ve picked up.
This doesn’t work as well with the higher-thread-count mask that I sewed; air still escapes around my nose. But it’s sufficient for the T-shirt masks.
Tie it lower and tighter than you think you need
When tying the knot behind your neck, tie it lower and much tighter than you think you need.
The air that fogs your glasses is escaping around your nose. Making the bottom knot tight addresses this: it creates a tighter seal around your nose.
I also noticed that the knot slipped down the first few times. This caused the pressure on my nose to relax. I fixed this by tying the knot lower than I naturally would.
Put the mask higher on the bridge of your nose than you think you need
Putting the mask higher on the bridge of your nose works well with the low knot. It changes the angle of attack so that the T-shirt hugs your face a little closer.
Here’s a picture of me with one of the masks. You can see that I’m wearing it really high on the bridge of my nose – it’s roughly level with my top eyelid. You don’t need to be this extreme if you don’t have glasses, and you don’t need to worry about this if you wear glasses and they’re not fogging up.
Wearing a T-shirt mask with long hair
I asked my girlfriend about wearing the mask with long hair (her hair extends past her shoulders).
Prefer a ponytail
She said that the mask is easier to wear with a ponytail, because her hair stays clear of the knots.
A bun is still manageable
She said that it’s harder when her hair is in a bun, because it’s easier to accidentally tie the mask into her hair. She called the bun a double-edged sword: resting the top knot against it helps keep the knot in place, but too much pressure risks undoing the bun.
Wearing a T-shirt mask with headphones
It took some adjustment, but I got to the point where I could wear the mask comfortably with headphones.
Pass the top knot behind your ears, not over them
The most comfortable way to wear the top knot is over your ears. But this conflicts with headphones. Instead, pass it behind your ears. This puts some pressure on my ears. They stick out a tiny bit. But it’s still comfortable to wear the headphones with the masks.
Do your best
This is a process of trial-and-error, and we’re all doing our best. Share your tips with your friends.
Google recently launched a desktop redesign. The favicon and URL breadcrumbs were turned into a header for organic search results. Ads had the same design, but were identified with the string “Ad” instead of the favicon. The design wasn’t new: Google’s mobile web search had served it since May 2019. But users and regulators complained that the desktop version blurred the distinction between ads and organic results. Google reverted the change a few weeks later, citing the backlash.
Change aversion is a simple idea: users react negatively to new experiences, but they stop caring as those experiences become normal. Looking at the Google redesign gave me change aversion, and I knew that I wouldn’t care about it within a few days. But I decided to put it to good use: I would try DuckDuckGo. If it was time for Google to experiment, then it was time for me to experiment. I had wanted to try DuckDuckGo for a while. This finally gave me the activation energy to switch.
DuckDuckGo’s premise is simple. They do not collect or share personal information. They log searches, but they promise that these logs are not linked to personally identifiable information. Their search engine results seemingly come from Bing, but they claim to have their own crawler and hundreds of other sources on top of that. They do customize the results a little: geo-searches like bars near my location give me results from my home city of New York. But search results aren’t personalized. I’ve always wondered how good the results would be.
Anyways, here are the guidelines that I set for my experiment:
I would use DuckDuckGo for at least a month. This would give me enough time to learn some of its strengths and weaknesses.
I would not use any DuckDuckGo poweruser features unless I could guess that they existed. I wanted to understand the out-of-the-box experience on the site.
I could use the !g operator to search Google if DuckDuckGo failed. Some will point out that this violates the previous rule. But as soon as a discussion turns to DuckDuckGo usage, people can’t WAIT to talk about how often they use !g or g!. Do you need an example? I discussed it in this paragraph and tried to blame it on other people. I’m serious: people can’t talk about DuckDuckGo without talking about !g. It’s the law. So I know about it and I will use it.
I haven’t tried a new search engine since I tried Bing in 2009. It was time to find out how good DuckDuckGo is in 2020. What was the biggest difference that I found?
Google is the king of low-intent searches
Google has a structured understanding of many domains. This is a difficult moat for other search engines to cross, and it’s most evident when comparing low-intent searches. These are searches with an ambiguous purpose: the subject is broad, and it’s not clear what the user wants. The user might not even “want” anything except to kill five minutes before a meeting.
Let’s try a low-intent search. Type harry potter into Google. In response, Google throws everything at the wall to see what sticks. In addition to the organic links, Google serves me:
A panel on the right with a ton of metadata. This includes oddly-specific structured data like “Sport: Quidditch”.
A list of five of the seven books in the series.
Fantasy books from five related searches.
A news panel containing three articles about Harry Potter actors.
The harry potter Google Maps search, centered on the New York area.
A “People also ask” panel with four questions.
A link to three Harry Potter-related YouTube videos.
Three recent tweets from @HarryPotterFilm.
A panel with seven “Fantasy book series” results.
A panel with seven “Kids book series” results.
Eight other search strings related to harry potter.
This makes sense: what did I want when I searched for harry potter? Google can’t know. So Google returns information from many domains to attempt to satisfy the query. Google returns so much information that something will be close enough. This is a huge competitive advantage. They can serve good results for bad searches by covering as many domains as possible.
This is a departure from how search used to work. When I was in grade school, I was taught how to craft search queries. Someone herded us into a library and explained how to pick effective keywords, quote text, use operators like AND or OR, etc. These days are dead. None of this matters on Google. If you want to know showtimes for “Harry Potter and the Cursed Child,” a search for harry potter will get you close enough.
In comparison, DuckDuckGo’s results for harry potter are relaxing. It serves a small knowledge panel on the right, three recent news articles at the top, some organic links, and nothing else. It’s much easier to scan this page. It’s a more relaxed vibe. But if I actually wanted something, it likely wouldn’t be on this page. You can make the argument that I got what I deserved: I didn’t clearly communicate what I wanted, and therefore I didn’t get it. But Google has trained everyone that broad queries are effective. It feels like magic. It’s not. It’s the result of years of developing a structured understanding of the world and crafting ways to surface that structure. And it’s something that potential competitors will need to come to terms with.
I don’t personally miss most of Google’s result panels, especially the panels that highlight information snippets. It’s easy to find these: searching microsoft word justify text serves a snippet from Microsoft Office’s support page explaining what to click or type to justify text. I’ve learned not to trust information in these panels without reading the source it came from. Google seems to cite this information uncritically, and I’ve found enough oversimplified knowledge panel answers that I’ve stopped reading most of them. Recently, I was chatting with a Googler who works on these. I asked them if I was wrong to feel this way. And they replied, “I trust them, but I’ve read enough bug reports and user feedback that I don’t blame you.” So my position is wrong, but not very wrong. I’ll take that.
Some of Google’s panels are great. I miss them. I haven’t found anything better than Google’s stock panel for quickly looking at after-hours stock movements. Searching Google for goog stock will show you this panel. I miss you buddy. I hope you’re doing well.
Ultimately, it stresses me out when Google returns many panels in a search. I’m sure that each is a marginal gain for Google. But I don’t like how Google feels as a result. I’m continually glad to see just 10 links on DuckDuckGo, even if this means that I’m not getting what I wanted. This has been training me to craft more specific searches.
DuckDuckGo is good enough
Let’s move away from Google’s competitive advantages. How does DuckDuckGo perform for most of my search traffic? DuckDuckGo does a good job. I haven’t found a reason to switch back to Google.
I combed through my browser’s history of DuckDuckGo searches. I compared it to my Google search history. When I fell back to Google, I often didn’t find what I wanted on Google either.
Most of my searches relate to my job, which means that most of my searches are technical queries. DuckDuckGo serves good results for my searches. I’ll admit that I’m a paranoid searcher: I reformat error strings, remove identifiers that are unique to my code, and remove quotes before searching. I’m not sure how well DuckDuckGo would handle copy/pasted error strings with lots of quotes and unique identifiers. This means that I don’t know if DuckDuckGo handles all technical searches well. But it does a good job for me.
There are many domains where Google outperforms DuckDuckGo. Product search and local search are two examples. I recently made a window plug, and it was much easier to find which big-box hardware stores had the materials I needed with Google. I also recently bought a pair of ANC headphones, and I got much better comparison information starting at Google. Google also shines with sparse results like rare programming error messages. If you’re a programmer, you know what I’m talking about: imagine a Google search page with three results. One is a page in Chinese that contains the English error string, one is a forum post that gives you the first hint that you need to solve the problem, and one is the error string in the original source code on GitHub. DuckDuckGo often returns nothing for these kinds of searches.
Even though Google is better for some specific domains, I am confident that DuckDuckGo can find what I need. When it doesn’t, Google often doesn’t help either.
Sample of times when both Google and DuckDuckGo failed me
I tried to write a protobuf compiler plugin using the official PHP protocol buffer bindings. I now believe that writing a protobuf compiler plugin in PHP is impossible due to several arbitrary facts, but I needed to piece this information together myself. My searches sprawled over Google and DuckDuckGo across several days before I concluded that it could not be done and that I could not find a workaround. This isn’t DuckDuckGo or Google’s fault. Some things just don’t have answers online.
I often fell back to Google for gif searches. It turns out that I’m bad at finding gifs. Sometimes I get exactly what I want, like searching for gritty turning around. But I had a lot of trouble arriving at that string; I eventually found the gif by remembering a Twitter user who had posted it and scanning their “Media” posts.
Trying to find a very specific CS:GO clip that I had seen on Reddit years ago. I found it via a combination of Reddit search and skimming the bottom of Reddit threads for video links.
What is Australian licorice? Is it a marketing gimmick? Stores sell it. It’s tasty. But I can’t find an explanation anywhere.
If you’re thinking of switching to DuckDuckGo because of the Google redesign, I’ll save you the trouble: DuckDuckGo’s inline ads are formatted similarly to the Google redesign that got reverted. If anything, DuckDuckGo’s ads are harder to spot because DuckDuckGo’s (Ad) icon is on the right, while Google’s was on the left where my eyes naturally skim.
It turns out that I care about privacy, but I still use Google Analytics on my blog. I haven’t been thinking about digital privacy for long enough to have a consistent and principled opinion. Sorry about that.
Let’s go back to the original selling point of DuckDuckGo: they don’t track you.
I have been reading my DuckDuckGo searches in my browser history for this post. It’s wonderful that all of these searches remained private. Some of them should remain private for stupid reasons. I don’t want anyone to know that I searched for what is the value of a human life because it makes me sound like a killer robot. Other searches are much more sensitive. One is the name of a medication I’m on. Others are searches about pains and fears that I have. DuckDuckGo allows me to perform these searches without building a profile of me. I’m sure that advertisers pick up the scent as soon as I click a link. But I appreciate the delay. I didn’t think about the traces I left online when I searched on Google. But now that I know I have the choice, I’m actively comforted by reviewing my DuckDuckGo search history and reading everything that they didn’t track.
I also noticed that many searches show trends. I knew that this was true in theory, but it’s different when you see it in your own search results. A month ago, many of my searches related to vacation planning. Now they don’t: the coronavirus scrapped my plans. But there are many life events that could have caused the same shift: health reasons, family problems, etc. These are things that ad networks could piece together as I visit sites. It’s possible to imagine even darker versions of this – imagine the months of searches that relate to a pregnancy that ends in a miscarriage. Many companies could profit from a couple going through that process, if they showed the right ads in the right places at the right time. There is a lot of trend information that you just want to keep to yourself.
What happens moving forward?
I will continue using DuckDuckGo. I don’t see a reason to switch back to Google. I’m going to continue falling back to !g when I need to. And I’m going to try to avoid talking about the fallback (but let’s be honest, I just did it again).
I still use lots of Google products. I’m not in the process of porting away from any of them. I still use Chrome in addition to Firefox and mobile Safari. Google Docs still holds a place in my heart. Etsy is hosted on GCP and uses Google Apps. Google Photos is still the best place for me to store and share my photos.
I liked the exercise of reading a month of my search history. You should do it, too. It became clear that I broadcast lots of information by having these very personal conversations with search engines. I’d like to understand more about the digital traces I leave online.
I don’t want to turn into a digital hermit. But I would like to become more deliberate about the traces that I leave around the internet. Even as a developer, I’m not sure what will happen if I disable third-party cookies across the internet. But I’d like to start reading more about digital privacy to understand what tradeoffs I am making.
Disclaimer: I worked at Google from 2010-2015, but did not work on search.
An app contributed to chaos at last week’s 2020 Democratic Iowa Caucus. Hours after the caucus opened, it became obvious that something had gone wrong. No results had been reported yet. Reports surfaced that described technical problems and inconsistencies. The Iowa Democratic Party released a statement declaring that they didn’t suffer a cyberattack, but instead had technical difficulties with an app.
A week later, we have a better understanding of what happened. A mobile app was written specifically for the caucus. The app was distributed through beta testing programs instead of the major app stores, and users struggled to install it via this process. Once installed, it had a high risk of becoming unresponsive. Some caucus locations had no internet connectivity, rendering an internet-connected app useless. There was a backup plan: use the same phone lines that the caucus had always used. But online trolls jammed those phone lines “for the lulz.”
As tweets containing the words “app” and “problems” made their rounds, software engineers started spreading the above XKCD comic. I did too. One line summarizes the comic (and the sentiment that I saw on Twitter): “I don’t quite know how to put this, but our entire field is bad at what we do, and if you rely on us, everyone will die.” Software engineers don’t literally believe this. But it also rings true. What do we mean?
Here’s what we mean: We’re decent at building software when the consequences of failure are unimportant. The average piece of software is good enough that it’s expected to work. Yet most software is bad enough that bugs don’t surprise us. This is no accident. Many common practices in software engineering come from environments where failures can be retried and new features are lucrative. And failure truly is cheap. If any online service provided by the top 10 public companies by market capitalization were completely offline for two hours, it would be forgotten within a week. This premise is driven home in mantras like “Move fast and break things” and “launch and iterate.”
And the rewards are tremendous. A small per-user gain is multiplied by millions (or billions!) of users at many web companies. This is lucrative for companies with consumer-facing apps or websites. Implementation is expensive but finite, and distribution is nearly free. The consumer software engineering industry reaches a tradeoff: we reduce our implementation velocity just enough to keep our defect rate low, but not any lower than it has to be.
I’ll call this the “website economic model” of software development: When the rewards of implementation are high and the cost of retries is low, management sets incentives to optimize for a high short-term feature velocity. This is reflected in modern project management practices and their implementation, which I will discuss below.
But as I said earlier, “We’re decent at building software when the consequences of failure are unimportant.” That approach fails horribly when failure isn’t cheap, like in Iowa. Common software engineering practices grew out of the website economic model, and when the assumptions of that model are violated, software engineers become bad at what we do.
How does software engineering work in web companies?
Let’s imagine our hypothetical company: QwertyCo. It’s a consumer-facing software company that earns $100 million in revenue per year. We can estimate the size of QwertyCo by comparing it to other companies. WP Engine, a WordPress hosting site, hit $100 million ARR in 2018. Blue Apron earned $667 million of revenue in 2018. So QwertyCo is a medium-size company. It has between a few dozen and a few hundred engineers and is not public.
First, let’s look at the economics of project management at QwertyCo. Executives have learned that you can’t decree a feature into existence immediately. There are tradeoffs between software quality, time given, and implementation speed.
How much does software quality matter to them? Not much. If QwertyCo’s website is down for 24 hours in a year, they’d expect to lose $273,972 (assuming that revenue scales linearly with uptime). And anecdotally, the site is often down for 15 minutes and nobody seems to care. If a feature takes the site down, they roll the feature back and try again later. Retries are cheap.
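Here’s that back-of-envelope math as a sketch (QwertyCo and its $100M revenue are hypothetical, as is the linear uptime-to-revenue assumption):

```python
# Back-of-envelope downtime cost, assuming revenue scales linearly
# with uptime. QwertyCo and the $100M figure are hypothetical.
annual_revenue = 100_000_000  # dollars per year

# 24 hours of downtime is one day out of the year:
lost_revenue = annual_revenue / 365
print(f"${lost_revenue:,.0f} lost per 24 hours of downtime")  # ~$273,973
```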
How valuable is a new feature to QwertyCo? Based on my own personal observation, one engineer-month can change an optimized site’s revenue in the ballpark of -2% to 1%. That’s a monthly chance at $1 million of incremental QwertyCo revenue per engineer. Techniques like A/B testing even mitigate the mistakes: within a few weeks, you can detect negative or neutral changes and delete those features. The bad features don’t cost much: they last a finite amount of time, while the wins are forever. Small win rates are still lucrative for QwertyCo.
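To see why small win rates are still lucrative, here’s a toy expected-value sketch. The 25%/50%/25% win/neutral/loss split and the two-week rollback window are my invented assumptions; only the -2% to 1% range comes from the paragraph above:

```python
# Toy expected value of one engineer-month of feature work.
# The probability split and the two-week rollback are invented
# assumptions; only the -2%..1% range comes from the text above.
annual_revenue = 100_000_000

win = 0.25 * (0.01 * annual_revenue)               # +1% win, kept forever
loss = 0.25 * (-0.02 * annual_revenue) * (2 / 52)  # -2% loss, rolled back in ~2 weeks
# The remaining 50% are neutral changes with no revenue effect.

print(f"expected annual value: ${win + loss:,.0f}")  # ~$230,769
```

Under these made-up numbers, the occasional rolled-back loser barely dents the expected value, which is why the incentive is to keep launching.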
Considering the downside and upside, when should QwertyCo launch a feature? The economics suggest that features should launch even if they’re high risk, as long as they occasionally produce revenue wins. Accordingly, every project turns into an optimization game: “How much can be implemented by $date?”, “How long does it take to implement $big_project? What if we took out X? What if we took out X and Y? Is there any way that we can make $this_part take less time?”
Now let’s examine a software project from the software engineer’s perspective.
The software engineer’s primary commodity is time. Safe software engineering takes a lot of time. Once a project crosses a small complexity threshold, it will have many stages (even if they don’t happen as part of an explicit process). It needs to be scoped with the help of a designer or product manager, converted into a technical design or plan if necessary, and divided into subtasks if necessary. Then the code is written with tests, the code is reviewed, stats are logged and integrated with dashboards and alerting if necessary, and manual testing is performed if necessary. Additionally, coding often has an up-front cost known as refactoring: modifying the existing system to make it easier to implement the new feature. Coding could take as little as 10-30% of the time required to implement a “small” feature.
How do engineers lose time? System-wide failures are the most obvious. Site downtime is an all-hands-on-deck situation. The most knowledgeable engineers stop what they are doing to make the site operational again. But time spent firefighting is time not spent adding value. Their projects are now behind schedule, which reflects poorly on them. How can downtime be mitigated? Written tests, monitoring, alerting, and manual testing all reduce the risk of these catastrophic events.
How else do engineers lose time? Through subtle bugs. Some bugs are serious but uncommon. Maybe users lose data if they perform a rare set of actions. When an engineer receives this bug report, they must stop everything and fix the bug. This detracts from their current project, and can be a significant penalty over time.
Accordingly, experienced software engineers become bullish on code quality. They want to validate that code is correct. This is why engineering organizations adopt practices that, on their face, slow down development speed: code review, continuous integration, observability and monitoring, etc. Errors are more expensive the later they are caught, so engineers invest heavily in catching errors early. They also focus on refactorings that make implementation simpler. Simpler implementations are less likely to have bugs.
Thus, management and engineering have opposing perspectives on quality. Management wants the error rate to be high (but low enough), and engineers want the error rate to be low.
How does this feed into project management? Product and engineering split a project into small tasks that encompass the whole project. The project length is a function of the number of tasks and the number of engineers. Most commonly, the projected timeline is too long, so it’s adjusted by removing features. Then the engineers implement the tasks. Task implementation is often done inside of a “sprint.” If the sprint time is two weeks, then every task has an implicit two-week timer. Yet tasks often take longer than you think, so engineers make tough prioritization decisions to finish on time: “I can get this done by the end of the sprint if I write basic tests, and if I skip this refactoring I was planning.” The sprint process applies a constant downward pressure on time spent, which means that the engineer can either compromise on quality or admit failure in the sprint planning meeting.
Some will say that I’m being too hard on the sprint process, and they’re right. The real culprit is time-boxed incentives. The sprint process is just a convenient way to apply time pressure repeatedly: once when scoping the entire project, and once for each task. If the product organization is judged by how much value it adds to the company, then it will naturally negotiate implementation time with engineers without any extra prodding from management. Engineers are also incentivized to implement quickly, but they might try optimizing for the long term instead of the short term. This is why multiple organizations are often given incentives to increase short-term velocity.
So by setting the proper incentive structure, executives get what they wanted at the beginning: they can name a feature and a future date, and product and engineering will naturally negotiate what is necessary to make it happen. “I want you to implement account-free checkouts within 2 months.” And product and engineering will write out all of the 2 week tasks, and pare down the list until they can launch something called “account-free checkouts.” It will have a moderate risk of breaking, and will likely undergo a few iterations before it’s mature. But the breakage is temporary, and the feature is forever.
What happens if the assumptions of the website economic model are violated?
As I said before, “We’re decent at building software when the consequences of failure are unimportant.” The “launch and iterate” and “move fast and break things” slogans point to this assumption. But we can all imagine situations where a do-over is expensive or impossible. At the extreme end, a building collapse could kill thousands of people and cause billions of dollars in damage. The 2020 Iowa Democratic Caucus is a milder example. If the caucus fails, everyone will go home at the end of the day. But a party can’t run a caucus a second time… not without burning lots of time, money, and goodwill.
Quick note: In this section, I’m going to use “high-risk” as a shorthand for “situations without do-overs” and “situations with expensive do-overs.”
What happens when the website economic model is applied to a high-risk situation? Let’s pick an example completely at random: you are writing an app for reporting Iowa Caucus results. What steps will you take to write, test, and validate the app?
First, the engineering logistics: you must write both an Android app and an iPhone app. Reporting is a central requirement, so a server is necessary. The convoluted caucus rules must be coded into both the client and the server. The system must report results to an end-user; this is yet another interface that you must code. The Democratic Party probably has validation and reporting requirements that you must write into the app. Also, it’d be really bad if the server went down during the caucus, so you need to write some kind of observability into the system.
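To make the “rules coded into both the client and the server” point concrete, here’s a hypothetical sketch of a precinct report and one server-side validation check. The field names and rules are invented for illustration; the real app’s schema was never published:

```python
# Hypothetical precinct report with a server-side sanity check.
# All field names and rules here are invented for illustration.
from dataclasses import dataclass

@dataclass
class PrecinctReport:
    precinct_id: str
    eligible_attendees: int
    first_alignment: dict[str, int]  # candidate name -> supporter count
    final_alignment: dict[str, int]

def validate(report: PrecinctReport) -> list[str]:
    """Return a list of rule violations; an empty list means the report passes."""
    errors = []
    for name, alignment in [("first", report.first_alignment),
                            ("final", report.final_alignment)]:
        if sum(alignment.values()) > report.eligible_attendees:
            errors.append(f"{name} alignment exceeds attendee count")
    return errors
```

Every check like this has to mirror the caucus rules exactly, and it has to exist on both the client and the server. The impossible and double-reported results described below are exactly the kind of thing these checks are meant to catch.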
Next, how would you validate the app? One option is user testing. You would show hypothetical images of the app to potential users and ask them questions like, “What do you think this screen allows you to do?” and “If you wanted to accomplish $a_thing, what would you tap?”. Design always requires iteration, so you can expect several rounds of user testing before your mockups reflect a high-quality app. Big companies often perform several rounds of testing before implementing large features. Sometimes they cancel features based on this feedback, before they ever write a line of code. User testing is cheap. How hard is it to find 5 people to answer questions for 15 minutes for a $5 gift card? The only trick is finding users that are representative of Iowa Caucus volunteers.
Next, you need to verify the end-to-end experience: The app must be installed and set up. The Democratic Party must understand how to retrieve the results. A backup plan will be required in case the app fails. A good test might involve holding a “practice caucus” where a few Iowa Democratic Party operatives download the app and report results on a given date. This can uncover systemic problems or help set expectations. This could also be done in stages as parts of the product are implemented.
Next, the Internet is filled with bad actors. For instance, Russian groups famously ran a disinformation campaign across social media sites like Facebook, Reddit, and Twitter. You will need to ensure that they cannot interfere with the caucus. Can you verify that the results you receive are from Iowa caucusgoers? Also, the Internet is filled with people who will lie and cause damage just for the lulz. Can the system withstand denial-of-service attacks? If it can’t, do you have a fallback plan? Who is responsible for declaring that the fallback plan is in effect, and for communicating that to the caucus sites? What happens if individuals hack into the accounts of caucusgoers? If there are no security experts within the company, it’s plausible that an app that runs a caucus or election should undergo a full third-party security review.
Next, how do you ensure that there isn’t a bug in the software that misreports or misaggregates the results? Relatedly, the Democratic Party should also be suspicious of you: can the Democratic Party be confident of the results even if your company has a bad actor? The results should be auditable with paper backups.
Ok, let’s stop enumerating issues. You will need a lot of time and resources to validate that this all works.
The maker of the Iowa Caucus app was given $60,000 and 2 months. They had four engineers. $60k doesn’t cover salary and benefits for four engineers for two months, especially on top of any business expenses. Money could not be traded for time, and there was little or no outside help.
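A rough staffing check makes the shortfall obvious. The $150k fully-loaded annual cost per engineer is my assumption; the team’s actual costs weren’t disclosed:

```python
# Rough check that $60k can't cover the team. The fully-loaded
# cost per engineer is an assumed figure, not a disclosed one.
engineers = 4
months = 2
loaded_annual_cost = 150_000  # assumption: salary + benefits + overhead

staffing_cost = engineers * months * loaded_annual_cost / 12
print(f"staffing alone: ${staffing_cost:,.0f} vs. a $60,000 budget")
# -> staffing alone: $100,000 vs. a $60,000 budget
```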
Let’s imagine that you apply the common practice of removing and scoping-down tasks until your timeline makes sense. You will do everything possible to save time. App review frequently takes less than a day, but worst-case it can take a week or be rejected. So let’s skip that: the caucus staff will need to download the app through beta testing links. Even if the security review was free, it would take too long to implement all of their recommendations. You’re not doing a security review. Maybe you pay a designer $1000 to make app mockups and a logo while you build the server. You will plan to do one round of user testing (and later skip it once engineering timelines slip). Launch and iterate! You can always fix it before the next caucus.
And coding always takes longer than you expect! You will run into roadblocks. First, the caucus’ rules will have ambiguities. This always happens when applying a digital solution to an analog world: the real world can handle ambiguity and inconsistency and the digital world cannot. The caucus may issue rule clarifications in response to your questions. This will delay you. The caucus might also change their rules at the last second. This will cause you to change your app very close to the deadline. Next, there are multiple developers, so there will be coordination overhead. Is every coder 100% comfortable with both mobile and server development? Is everyone fully fluent in React Native? JS? Typescript? Client-server communication? The exact frameworks and libraries that you picked? Every “no” will add development time to account for coordination and learning. Is everyone comfortable with the test frameworks that you are using? Just kidding. A few tests were written in the beginning, but the app changed so quickly that the tests were deleted.
Time waits for no one. Two months are up, and you crash across the finish line in flames.
In the website economic model, crashing across the finish line in flames is good. After all, the flames don’t matter, and you crossed the finish line! You can fix your problems within a few weeks and then move to the next project.
But flames matter in the Iowa caucus. As the evening wears on, the Iowa Democratic Party is fielding calls from people complaining about the app. You get results that are impossible or double-reported. Soon, software engineers are gleefully sharing comics and declaring that the Iowa Caucus never should have paid for an app, and that paper is the only technology that can be trusted for voting.
What did we learn?
This essay helped me develop a personal takeaway: I need to formalize the cost of a redo when planning a project. I’ve handled this intuitively in the past, but it should be explicit. This formalization makes it easier to determine which tasks cannot be compromised on. It matches my past behavior: I used to work in mobile robotics, which had long implementation cycles and where the damage from failure could be high. We spent a lot of time adding observability and making foolproof ways to throttle and terminate out-of-control systems. I’ve also worked on consumer websites for a decade, where the consequences of failure are lower. There, I’ve been more willing to take on short-term debt and push forward in the face of temporary failure, especially when rollback is cheap and data loss isn’t likely. After all, I’m incentivized to do this. Our industry also has techniques for teasing out these questions. “Premortems” are one example. I should do more of those.
On the positive side, some people outside of the software engineering profession will learn that sometimes software projects go really badly. People funding political process app development will be asking, “How do we know this won’t turn into an Iowa Caucus situation?” for a few years. They might stumble upon some of the literature that attempts to teach non-engineers how to hire engineers. For example, the Department of Defense has a guide called “Detecting Agile BS” (PDF warning) that gives non-engineers tools for detecting red flags when negotiating a contract. Startup forums are filled with non-technical founders who ask for (and receive) advice on hiring engineers.
The software engineering industry learned nothing. The Iowa Caucus gives our industry an opportunity: we could be examining how the assumption of “expensive failure” should change our underlying processes. We will not take this opportunity, and we will not grow from it. The consumer-facing software engineering industry doesn’t respond to the risk of failure. In fact, we celebrate our plan to fail. If the outside world is interested in increasing our code quality in specific domains, they should regulate those domains. It wouldn’t be the first such regulation: HIPAA and Sarbanes-Oxley are examples of rules that already affect engineering at website-economic-model companies. Regulation is insufficient, but it may be necessary.
But, yeah. That’s what we mean when we say, “I don’t quite know how to put this, but our entire field is bad at what we do, and if you rely on us, everyone will die.” Our industry’s mindset grew in an environment where failure is cheap and we are incentivized to move quickly. Our processes are poorly applied when the cost of a redo is high or a redo is impossible.