Policy for strategy
This is a work-in-progress draft!
Explain the general role of policy for writing strategy.
This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts.
Notes
With your diagnosis in hand, the next step is determining guiding policies. There are many, many ways to articulate your guiding policies, but I would recommend starting by answering three key questions, which I believe get at the heart of effective engineering strategy:
What is the organization’s resource allocation against its priorities? (And why these ones?)
Competition can be healthy, but competing internally on budget and headcount tends to reward empire building rather than effectiveness. It also means you’ll often underinvest in critical priorities like compliance or security. Avoid this sort of internal competition by ensuring your engineering strategy clearly articulates resourcing and priorities.
As an engineering executive, it’s particularly important to think about the priorities that no one else is asking for (especially security, reliability, compliance, and developer productivity), and ensure your investment thesis addresses those.
Just as important is connecting the resource allocation back to your diagnosis. This grounds your allocation in the specific constraints you’re solving for, and makes it clear what problems any counter-proposal must address.
Example: We aim to maintain a ratio of 4 product engineers for every 1 platform engineer (security, reliability, infrastructure, developer productivity, etc.). In addition to that standard ratio, this year we are running two major projects outside of that ratio, prioritizing a total of 10 engineers on security (all production access requires MFA and is connected to an uneditable audit trail) and developer productivity (progressive migration of all JavaScript codebases to TypeScript).
What are the fundamental rules that all teams must abide by? (And why do they matter?)
Many of the most impactful guiding policies are predicated on broad, consistent adoption. For example, requiring all backend projects to be implemented in Golang would greatly narrow your security, compliance, and tooling needs. Similarly, requiring all new projects to use a specific database would narrow those needs.
These sorts of rules must be specified at the engineering organizational level because that’s the only place where you can make the appropriate, organization-level tradeoffs.
Folks are much more open to following rules if you explain why the rules are valuable, so I strongly recommend explicitly explaining why each rule is important. Things that are obvious to you may not be obvious to others.
Example: All development must use our standard development stack (background services use Golang, frontend services use TypeScript, storage is in a service-isolated instance of Aurora PostgreSQL) and development lifecycle (standard code review, linting, and deployment processes documented in Development Lifecycle wiki). Exceptions to these rules must be approved by both Tech Spec Review and CTO.
How are decisions made within engineering? (And why do we work this way?)
Even the most comprehensive strategy will omit many important details, but it should explain how those decisions are generally made. This is a nuanced navigation of positive and negative freedoms between teams and others impacted by their decisions.
You want teams to know what they can decide themselves, what they should optimize for when making those decisions, and how to move forward with decisions that they can’t make independently. You also want individuals to know why you work this way: there are many implicit tradeoffs in each way of working, and these tradeoffs are often invisible to folks who are frustrated with a current process.
Example: Technical decisions that deviate from the standard development stack or standard development lifecycle should be approved by Tech Spec Review and CTO. Changes to those two standards should similarly be approved by Tech Spec Review and CTO. Changes to organizational structure, hiring prioritization, and general people process should be approved by CTO. All other decisions should be made by the teams and leaders closest to the decision. If anyone believes we are making a meaningfully suboptimal decision, please escalate that decision using our Escalation Process.
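One way to check that a decision-making policy like the example above is unambiguous is to write it down as a lookup table. The following is only a rough sketch under my own assumptions: the category names (DecisionKind and its members) and the helper functions are hypothetical, invented for illustration, and the approvers are simply transcribed from the example policy. It leaves out the escalation path entirely.

```python
from enum import Enum


class DecisionKind(Enum):
    # Hypothetical categories, named after the cases in the example policy.
    STACK_OR_LIFECYCLE_DEVIATION = "deviation from the standard stack or lifecycle"
    STANDARD_CHANGE = "change to the standard stack or lifecycle"
    ORG_OR_PEOPLE_PROCESS = "org structure, hiring prioritization, or people process"
    OTHER = "anything else"


# Approvers per decision kind, transcribed from the example policy.
# An empty list means the teams and leaders closest to the decision decide.
APPROVERS = {
    DecisionKind.STACK_OR_LIFECYCLE_DEVIATION: ["Tech Spec Review", "CTO"],
    DecisionKind.STANDARD_CHANGE: ["Tech Spec Review", "CTO"],
    DecisionKind.ORG_OR_PEOPLE_PROCESS: ["CTO"],
    DecisionKind.OTHER: [],
}


def required_approvers(kind: DecisionKind) -> list[str]:
    """Return who must approve a decision of the given kind."""
    return list(APPROVERS[kind])


def can_team_decide(kind: DecisionKind) -> bool:
    """True when the policy leaves the decision with the owning team."""
    return not APPROVERS[kind]
```

If you can't fill in a table like this from your written policy, that usually means the policy hasn't yet told teams what they can decide themselves.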
If you answer those three questions clearly, you will have an uncommonly valuable engineering strategy. At least as importantly, the strategy will be explicit about how it ties into the surrounding company strategies, and the degree of freedom it cedes to the teams and leadership within engineering.
Conversely, if your diagnosis doesn’t support answering these questions, then I’d push you to think more deeply about your diagnosis. It’s likely accurate, but missing an altitude that an executive is uniquely suited to bring.
It can be difficult to write guiding policies without unnecessarily constraining the teams within your organization. You can argue that each team’s roadmap and technology choices fall within the scope of engineering strategy. I recommend using engineering strategy sparingly, while ensuring you make full use of its unique advantages.
To ensure your strategy is operating at the right altitude, ask if each of your guiding policies is applicable, enforced, and creates leverage:
Applicable: it can be used to navigate complex, real scenarios, particularly when making tradeoffs.
Applicability is as essential for guiding policies as it is for useful values. Guiding policies should be living, useful tools. If you can’t apply them, then scrap them!
Example: we generally prioritize stability of the existing product over new product work. If stability work takes less than a week, teams should self-approve the work. If it takes longer, they should review sequencing one step up their management chain.
Example: we prefer SaaS vendors over building our own commodity solutions, but we only consider SaaS vendors with current SOC2 Type 2 compliance. Build versus buy decisions should be reviewed by Tech Spec Review. Exceptions to our SOC2 Type 2 policy should be approved by CTO (but won’t be granted).
Enforced: teams will be held accountable for following the guiding policy.
Guiding policies will only actually guide an organization if they’re enforced. Every experienced engineer has their own stories of working somewhere with a standardized technology stack, hiring a new engineer who doesn’t want to use it, and the ensuing conflict. A policy is only effective to the extent that you are willing to enforce it, even if the person violating it is your friend, or previously worked at a cool company.
It’s hard to talk about universal examples. Instead, this is more of a cultural question for you to ask yourself: are you willing to enforce this policy? If not, look for something else that you’re willing to enforce. Often the gap between unenforceable and enforceable can be bridged by a simple nuance (e.g. “unless approved by CTO”).
Create leverage: create compounding or multiplicative impact.
Leverage is making the organization more efficient, either directly (e.g. using a data interface that abstracts data privacy issues from product engineers), or indirectly (e.g. creating a new machine learning powered content selection tool, which means folks don’t need to argue about what content is shown where).
Many forms of leverage are accessible to the teams within engineering, and it’s often not necessary to directly address those opportunities in your engineering strategy. However, some approaches must be deployed at the engineering strategy layer to be impactful, particularly standardization strategies that require org-wide commitment (e.g. everyone uses TypeScript for frontend development).
Engineering strategy also needs to solve for scenarios where no team is capable of prioritizing a given effort, despite the effort being very valuable, such as a compliance or privacy initiative that doesn’t fall cleanly into any given team’s scope but is necessary for continued business operation.
Example: Google historically constrained development to four languages: Python, C++, Go, and Java. They enforced this fairly rigorously, and it created leverage in their development tooling. Each new project happening within that ecosystem increases a centralized tooling team’s impact on the company.
A closely related example is Dan McKinley’s Choose Boring Technology, which advocates building leverage by constraining technology choice. That approach was heavily enforced during Kellan Elliott-McCrea’s era of Etsy engineering leadership.
Example: Uber (in 2014) had an implicit technology strategy, related to its Let Builders Build value, of letting teams select their own tools. This aimed to create leverage by allowing teams to select the best tool for the job at hand, and was enforced through both engineering leadership’s absence of an enforced counter-policy, and a permissive service tool that merrily ran any Docker container.
This approach is implicitly grounded in the theory that teams’ individual gains would outweigh the inability to operate a high-leverage developer productivity team. More importantly, it highlights the value of explicit engineering strategy: otherwise you get implicit engineering strategy, which is often ineffective.
While this guiding policy was ineffective in this case, it’s possible it would be very effective in a consulting company that builds bespoke tools for other companies.
If one of your guiding policies doesn’t meet these criteria, but is necessary to address your diagnosis, then I wouldn’t worry about it. However, if you find that many of your guiding policies don’t meet these criteria, then it’s worth spending time reflecting on what’s creating that gap until you’re confident that they do meet these criteria, or that these criteria don’t apply to your diagnosis.