Navigating ambiguity.

Published on January 19, 2024. staff-plus (37), management (218)

Perceiving the layers of context in problems will unlock another stage of career progression as a Staff-plus engineer, but there’s at least one essential skill to develop afterwards: navigating ambiguity. In my experience, navigating deeply ambiguous problems is the rarest skill in engineers, and doing it well is a rarity. It’s sufficiently rare that many executives can’t do it well either, although I do believe that all long-term successful executives find at least one toolkit for these kinds of problems.

Before going further, let’s get a bit less abstract by identifying a few examples of the kinds of problems I’m referring to with the label deeply ambiguous:

At Stripe, we knew that data locality laws were almost certainly coming, but we didn’t know when or what shape that regulation would come in. One scenario was that many countries would require domestic transactions (e.g. transactions where the buyer and seller reside in the same jurisdiction) to be stored in a domestic datacenter, which India did indeed require. Another scenario was that all transactions would have to store a replica in jurisdictions that had a buyer or seller present. There were many such scenarios, and which seemed most likely changed as various political parties won and lost elections in various regions When explaining this problem to new colleagues, my starting explanation became, “The only thing we can know is that the requirements will change every six months for the lifespan of the internet.”
If the requirements were ambiguous, so were our tools for solving it. Many solutions involved degrading the reliability or usability of our user-facing functionality for impacted zones. Evaluating those solutions required Product, Engineering, and Sales alignment. Other solutions reduced the user-impact but reduced our operating margin, which required alignment from Product, Engineering and Finance. Even implementing this work had a significant opportunity cost relative to other work, which was also difficult to get agreement on.
This was a deeply ambiguous problem.
At Calm, we would eventually acquire Ripple Health Group to enter the healthcare space, but beforehand we made a series of efforts to enter the space ourselves. None of our Product, Engineering or Legal teams were from a healthcare background, and we ran into an unexpected source of ambiguity: HIPAA.
It quickly became clear that we couldn’t make forward progress on a number of product and engineering decisions without agreeing on our interpretation of HIPAA. Some interpretations implied significant engineering costs, and others implied almost no engineering costs at all, but some potential legal risk. Teams were glad to highlight concerns, but no one had conviction on how to solve the whole set of concerns.
This, too, was a deeply ambiguous problem.

These examples highlight why perceiving layers of context are a prerequisite to effectively navigating deeply ambiguous problems: they almost always happen across cross-functional boundaries, and never happen in the seams where your organization has built experience solving the particular problem. They are atypical exceptions, that involve a lot of folks, and where the decision has meaningful consequences.

It’s also true that what’s a deeply ambiguous problem for one company isn’t necessarily that ambiguous for another company. For example, Amazon has solved data locality in a comprehensive way, so it’s certainly not a deeply ambiguous problem for them at this point, but I suspect it was until they figured it out. What falls into this category at any given company changes over time too: data locality isn’t deeply ambiguous for Stripe anymore, either.

It would be disingenuous to claim that there’s a universal process to navigating ambiguous problems–they’re ambiguous for a reason!–but there is a general approach that I’ve found effective for most of these problems that I’ve encountered: map out the state of play, develop options for discussion, and drive a decision.

First, map out the state of play:

Talk to involved teams to understand what the problems at hand are, and rough sketches of what the solution might look like. Particularly focus on the points of confusion or disagreement whose existence are at the root of this problem being deeply ambiguous, e.g. data locality is important, for sure, but wouldn’t it be better to delay solving it until we have clearer requirements?
Debug the gaps in cross-functional awareness and partnership that are making this difficult to resolve. You’re not looking to assign blame, simply to diagnose the areas where you’ll need to dig in to resolve perceived tradeoffs, e.g. as Product, we can’t be responsible for interpreting HIPAA, we need Legal to tell us exactly how to implement this
Identify who the key stakeholders are, and also the potential executive sponsors for this work, e.g. the General Counsel, Chief Technology Officer, and Chief Product Officer

Next, develop potential options to solve the state of play:

Cluster the potential approaches and develop a clear vocabulary for the clusters and general properties of each approach. For the data locality situation, the clusters might be something like: (1) highly-available and eventually consistent, (2) strongly consistent and single-region, and (3) strongly consistent with backup region
Develop the core tradeoffs to be made in the various approaches. It helps to be very specific, because getting agreement across stakeholders who don’t understand the implications will usually backfire on you.
For example, if we allow transactions to be processed in any region, and then forwarded them for storage in their home region, then it means that balances in the home region will sometimes be stale. This is because a transaction may be completed in another region and not yet forwarded to the home, causing the sum of all calculations to be stale. Are you willing to temporarily expose stale balances? If you’re not comfortable with that, you have to be comfortable failing to complete transactions if the home region is unavailable. What’s the right tradeoff for our users?
Talk to folks solving similar problems at other companies. It’s one thing for you to say that you want to run a wholly isolated instance of your product per region, but it’s something else entirely to say that Amazon used to do so. As you gather more of this data, you can benchmark against how similar companies approached this issue. (It’s true that companies rely too heavily on social proof to make decisions, but it’s also true that there are few things more comforting for leadership making a decision they don’t understand particularly well than knowing what another successful company did to solve it.)

Finally, drive a decision:

Determine who has the authority to enforce the decision. The right answer is almost always one or more executives. The wrong answer is the person who cares about solving the problem the most (which might well be you, at this point)
Document a decision making process and ensure stakeholders are aware of that process. No matter how reasonable the process is, some stakeholders may push back on the process, and you should spend time working to build buy-in on the process with those skeptics. Eventually, you will lean on the authorities to hold folks to the process, but they’ll only do that if you’ve already mostly gotten folks aligned
Follow that process to its end. Slow down as necessary to bring people along, but do promptly escalate on anyone who withholds their consent from the process
Align on the criteria to reopen this decision. One way that solutions to ambiguous problems die is that the debates are immediately reopened for litigation after the decision is made, and you should strive to prevent that. Generally a reasonable proposal is “material, new information or six months from now”

This formula often works, but sometimes you’ll follow it diligently and still find yourself unable to make forward progress. Let’s dig into the two most common challenges: trying to solve the problem too early, and not having an executive sponsor.

Is the time ripe?

Something that I’ve learned the hard way is that there’s a time and place for solving any given ambiguous problem. The strongest indicator that the time isn’t now is if you drive a decision to conclusion with the relevant stakeholders, but find the decision simply won’t stay made. It’s easy to feel like you’re failing at that moment, but usually it’s not personal. Instead, it probably means that the company values the optionality of various potential solutions more than it values the specific consequences those solutions imply.

Going back to the data locality challenge for Stripe, I found it quite difficult to make forward progress–even after getting stakeholders bought in–and it was only when the risk of legal penalties for non-compliance became clear that the organization was able to truly accept the necessary consequences to solve the problem at hand. Until the legal consequences became clear, the very clear opportunity cost, product tradeoffs, and margin impact weren’t worth tolerating. Once the legal consequences were better understood, it became obvious which trade offs were tolerable.

Here are some questions to ask yourself if you’re debugging whether your approach is flawed or it’s simply the wrong time to solve a given problem:

Does new information keep getting introduced throughout the process, despite your attempts to uncover all the relevant information? If so, you’re probably trying to solve the problem too early
Are there clear camps advocating for various approaches? If so, it’s probably not a timing issue. Rather it sounds like there’s more stakeholder management to do, starting with aligning with your executive sponsor
Do your meetings attempting to agree on a decision keep ending with requests for additional information? This may or may not indicate being early. Many leaders hide from difficult decisions by requesting more information, so this might be either a timing issue or a leadership issue. Don’t assume requests for additional information mean it’s too early

If you’re still unclear, then escalate to your leadership chain for direction. That will make it clear whether they value you solving this problem at this moment. If they don’t, then slow down and wait for the circumstances to unfold before returning to push on the problem again.

Maybe you’re reading my advice to escalate up the leadership chain and thinking to yourself, “Yeah, I wish I had someone to escalate this to!” To be explicit: that’s a bad sign! If you can’t find an executive committed to solving an ambiguous problem, then you’re unlikely to solve it. Yes, there are some exceptions when it comes to very small companies or when you yourself are a quasi-executive with significant organizational resources, but those are exceptions.

My general advice is that you should only take on deeply ambiguous problems when an executive tells you to, and ensure that the executive is committed to being your sponsor. If not, the timing might be good, but you’re still extremely likely to fail. Flag the issue to your management chain, and then let them decide when they’re willing to provide the necessary support to make progress.

Don’t get stuck

My biggest advice on solving deeply ambiguous problems is pretty simple: don’t overreact when you get stuck. If you can’t make progress, then escalate. If escalating doesn’t clarify the path forward, then slow down until circumstances evolve. Failing to solve an ambiguous problem is often the only reasonable outcome. The only true failure is if feeling stuck leads you to push so hard that you alienate those working on the problem with you.

Navigation process

Is the time ripe?

Do you have an executive sponsor?

Don’t get stuck