Who runs Engineering processes?

Published on April 3, 2023.

Uber ran a tech spec review process called the DUCK Review. “DUCK” didn’t stand for anything (it was created as a deliberate non-acronym), but the review was otherwise a fairly typical process. When I first joined, we’d review one or two specs each week. The volume of requested reviews kept growing, and six months later there was a one-to-two-week delay between requesting a review and receiving feedback. A year after that, the process was disbanded due to lack of bandwidth to process all the incoming specifications. This was an instructive experience for me, because the DUCK Review produced very valuable feedback, but ultimately still failed because its operational costs were higher than we were willing to pay for that feedback.

Early in my management career, I believed most processes failed because they insufficiently addressed the problem at hand. I thought that if a process’s results were poor, it was because the process needed more nuance and sophistication. What I’ve learned since, mostly through designing thoughtful processes that nonetheless failed, is that good process is a deliberate tradeoff between quality and overhead. It’s common to see a code review process that runs quickly but provides low-quality feedback, for example one requiring code reviews be completed within two working hours of a pull request being made. It’s equally common to see a very involved, high-quality process that folks eventually ignore because it’s too slow, including the earlier story of the DUCK Review.

To figure out the appropriate degree of process sophistication for your team, you have to start by figuring out who’s going to run the processes, and reason backwards from that answer to the degree of process sophistication your organization is capable of sustaining. To help you think through that question, I’ll work through:

What is the typical progression for companies across patterns?
What are the pros and cons of these patterns?
How do you operate the baseline pattern successfully?
How do you deal with the budgeting realities?
How should you navigate the trend cycle?

When you’ve finished, you’ll have a clear perspective on who should be running process within your organization as it exists today, when you should consider switching patterns, and some tips on avoiding the most frequent mistakes made in staffing process.

This is an unedited chapter from O’Reilly’s The Engineering Executive’s Primer.

Typical pattern progression

There are five frequent patterns I see organizations use to run their Engineering-scoped and company-scoped processes. Whatever pattern you’re using today, it’s likely you used a different one three years ago, and if you’re growing quickly then it’s likely you will use a third pattern three years from now. There’s no right pattern, just the most appropriate pattern for your current constraints. (There is one wrong pattern: the Business Unit Local pattern rarely goes particularly well.)

Early Startup

Brand new companies all use what I call the Early Startup pattern: either a founder or the functional executive does something for their function, or it doesn’t happen. My first year as Calm’s CTO, there were no shared People processes, and I ran Engineering’s performance review process out of a spreadsheet. It needed to get done, and I was in the position to do it. This is how most very small companies work, typically through their first 30-50 hires, at which point they move on to something slightly more structured.

Baseline

The next phase is the Baseline pattern, which is first adopted at around 50 hires, and is the final pattern for most companies. Company-scoped processes (e.g. company planning, performance management, and budgeting) are run by a central People (aka Human Resources) function, and Engineering-scoped processes (e.g. Tech Spec Review, Incident Review, and developer productivity surveys) are run by engineers and engineering managers. There are no specialized roles to support engineering processes, and limited specialization of the company’s processes to support engineering.

Specialized Engineering Roles

As companies grow beyond 200 engineers, Engineering-scoped processes often start to take enough time that they benefit from full-time support. This often happens first for either an Incident Review or Tech Spec Review, but depending on your company’s particulars, any process might be the first one to become sufficiently expensive to run. I call this the Specialized Engineering Roles pattern, because most companies address this problem by spinning up a new function, usually either Technical Program Management or Engineering Operations.

Company Embedded Roles

At many technology companies, Engineering is the largest function, and consequently will reach certain points of scale earlier than other parts of the business. Ensuring that processes work properly despite all these extended requirements often leads to Engineering splitting away from the company’s default mechanisms for hiring, performance management, and so on, such that there are dedicated recruiters and members of the People team working with engineering. I call this the Company Embedded Roles pattern.

In addition to pure size, Engineering has several other factors that create a particularly complex set of requirements, and support the switch to this pattern:

Dual-sided career ladder across engineers and engineering managers, whereas other functions typically have only one
Often includes one or more specialized roles along the lines of Developer Relations, Technical Program Management, Quality Engineering, or Security Engineering
Engineers and engineering managers tend to be slower and more expensive to hire and retain than most other functions
Long ramp time from starting to becoming highly productive, which places a higher value on retention

Business Unit Local

Finally, there is the Business Unit Local pattern, where engineering reports into each business unit’s leadership, and there is no longer any one leader who is responsible for the entirety of engineering. This typically results in engineering moving back to a standardized process run by centralized functions or embracing inconsistency as each business unit’s engineering organization finds its own path forward.

Patterns’ pros and cons

Now that we’ve introduced the typical sequence in which companies move through these process management patterns, I want to go a bit deeper into the pros and cons of the patterns themselves:

Early Startup: when you start a new company, it’s just one or more founders. Early hires will almost always focus on building a small product to validate the market or run the business day to day, and generally won’t include anyone focused purely on process. This makes sense, because processes are about coordinating large groups, and the company is only a handful of people at this point. Any process that you want to run needs to be run by the founders, or managers hired by those founders.
Companies usually exit this pattern because there is an impactful process they want to run (often, initially, performance management), but they feel they’re too busy to run it with the current team.
Pros: low cost, low overhead, few downsides at small scale because most process is focused on supporting large teams
Cons: often valuable stuff doesn’t happen because everyone is too busy, sometimes quality of process is very low due to limited time or experience running that process
Baseline: Engineering-scoped processes are run by engineers or engineering managers, and company-scoped processes are run by a centralized function, usually in People or Human Resources. Most companies stay in the baseline pattern for an extended period of time, with perhaps one small exception for a dedicated recruiter for engineering hires. It typically works well through 100+ engineers.
One of the reasons this works significantly better than the Early Startup pattern is that it identifies a clear team responsible for selecting vendors (e.g. Greenhouse for hiring needs, Lattice for performance management). These tools are not perfect, but they’re significantly better than a spreadsheet, which is the default tool when seven managers are independently trying to run a decentralized process.
Pros: some modest specialization to allow Engineering to focus on engineering rather than foundational company process, complexity of HR tasks tends to outstrip engineering manager depth by this phase (e.g. managing visas), unified systems allow the executive team to inspect across functions rather than just in their own function
Cons: outcomes are highly dependent on quality of these centralized functions, changing company-scoped processes becomes challenging
Specialized Engineering Roles: increase efficiency in Engineering-scoped processes by hiring dedicated, specialized roles to fulfill that work. Typically this is hiring either an Engineering Operations or Technical Program Management team. The first areas of investment are usually organizational planning, incident management, and engineering onboarding.
Pros: specialized roles generally introduce significant efficiency, offering a better skill set for the task to be done at a lower salary point. Introducing these roles is usually energizing for the individuals no longer doing the work. Specialists who are able to focus on a given process often make significant improvements in that process
Cons: While it is more efficient, it’s usually more expensive as these roles introduce additional headcount. It’s a lot of work to effectively support specialized roles like engineering operations or technical program management. Specialized roles often freeze a company in a given way of working: whereas engineers are incentivized to eliminate a process, the specialist is incentivized to improve it
Company Embedded Roles: Engineering’s needs for company-scoped processes reach a sufficient scale or sufficiently diverge from standard operating procedure such that support functions like People and Recruiting hire individuals specifically to support engineering. This usually happens one function at a time, often starting with Recruiting, followed by People and Finance.
While it will often happen first for Engineering, this pattern will repeat for other sizable organizations in the company. For example, you’ll frequently see a setup where Engineering and Sales have dedicated embeds, and the rest of the organization works with a centralized team. Within a sufficiently large company, every function will have dedicated support.
Pros: empowers Engineering to customize process and approach. Builds stronger relationships across functions
Cons: expensive to operate. Quality is heavily dependent on the embedded individuals
Business Unit Local: each business unit in a company has its own, separate engineering function. This split typically occurs when an organization becomes large enough to have multiple meaningful business units, and there’s a perception that standardization across engineering in those business units is unnecessary or unnecessarily expensive. For example, one business unit might require HITRUST compliance, whereas the others do not, which might lead to significantly different engineering tradeoffs.
Sometimes this pattern is adopted but one business unit is so much larger than the others that it becomes the de facto engineering hub for the wider company, including leading cross-organizational developer tooling and infrastructure investments.
Pros: aligns engineering with business priorities within each business unit
Cons: engineering process and strategy often get stuck wherever they were when the split first happened. Future changes require consensus across many engineering leaders. Investments across engineering become difficult to prioritize

All of these patterns have flaws, but are worth considering. That said, if you asked me to pick one without any additional context, I would always recommend the baseline pattern. It’s generally good enough, and the easiest to manage.

Operating the Baseline pattern

What I’ve called the Baseline is the most common approach to running processes, with a centralized People organization running company-scoped processes, and a mix of engineers and engineering managers running the Engineering-scoped processes.

It’s common for a handful of reasons:

It’s the default scenario: simply don’t hire any specialized roles or embed support directly with engineering and you’re at the baseline
It’s relatively simple: you don’t have to learn to hire or manage anything new, and it doesn’t require treating engineering differently from other functions
It appears cheaper: when you propose moving to another pattern, it’s usually framed as a way to free up existing capacity to focus on higher value work, but that requires hiring additional roles to support the pattern, such as Technical Program Managers. That makes it more expensive from a budgeting perspective, even if it’s meaningfully more effective, and most budgeting discussions are cost-driven rather than effectiveness-driven

There’s a fourth reason it’s common: it works pretty well! Every company I’ve worked at, including Uber and Stripe, got very far before leaving this pattern, well past a hundred engineers. I personally believe that many companies switch to other patterns too early, often because they’re copying the playbook from their previous, much larger, employer.

If you’re tempted to switch away from Baseline and are below a hundred engineers, there are a few things I’d recommend focusing on first to try to make the current pattern work for you:

Rotate the work. Calm has had great success with six-month rotations, where individuals serve in one capacity (e.g. running Incident Review) for six months before swapping off. We increase continuity by having two people serve together in each capacity with overlapping but offset periods of service
Make the process interesting for the people running it. If someone is leading your Incident Review, also give them exposure to the company’s wider planning process as you factor in reliability projects. If someone is running your Tech Spec Review, pull them into your annual revision of your engineering strategy. Go beyond service: folks should feel that these are unique learning opportunities to work with senior leadership
Ensure your promotion rubrics, or your promotions if you don’t have a rubric yet, value this kind of work. You can even experiment with the rubric strongly preferring that engineers serve a term in one of your processes before moving into Staff-plus or managerial roles. People are smart and drift towards the work that you value
Make sure to call out the individuals doing the work behind your processes, particularly in All-Hands and Q&A meetings
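
The overlapping-but-offset rotation described in the first item can be sketched in a few lines. This is only an illustration, not Calm's actual schedule: the three-month offset (half of a six-month term), the `Term` type, the helper functions, and the names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    person: str       # hypothetical name, for illustration only
    start_month: int  # months since the rotation began (0-based, inclusive)
    end_month: int    # exclusive


def staggered_terms(people, term_months=6, seats=2):
    """Give each person a term, starting a new term every term_months // seats months.

    Assumption: offsets are evenly spaced. Once the first offset has passed,
    and until the list of people runs out, `seats` people serve at once.
    """
    offset = term_months // seats
    return [
        Term(person, i * offset, i * offset + term_months)
        for i, person in enumerate(people)
    ]


def on_duty(terms, month):
    """Return the people serving during the given month."""
    return [t.person for t in terms if t.start_month <= month < t.end_month]


terms = staggered_terms(["Ana", "Bo", "Cy", "Di"])

for month in (0, 4, 6):
    print(month, on_duty(terms, month))
```

With these assumptions, every handoff after the first leaves one person in the seat who already has three months of experience with the process, which is where the continuity comes from.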

There are no guarantees, but my lived experience is that making these relatively minor tweaks will allow the Baseline pattern to function well, and keep working much longer than you might expect.

Dealing with budgeting realities

While there are operational challenges to moving across patterns, most of those challenges are only obvious in hindsight. Instead, the biggest impediment to moving from Early Startup to Baseline or from Baseline to Specialized Engineering Roles will be the budget implications. Even if you can show that three Technical Program Managers would significantly improve Engineering impact–which is likely true, but which you likely can’t prove with a spreadsheet–most budgeting processes are anchored on relative dollars rather than absolute impact.

Digging in, the problems at hand are:

Most companies that successfully shift up the patterns’ cost curve do so when they are rapidly growing. Such companies are comfortable with rapidly growing headcount costs, and are generally tolerant of adding roles to support that growth. In all other circumstances, companies generally aren’t tolerant of adding new roles and structures, even if they would likely increase efficiency. In the latter scenarios, you’re much more likely to hear folks talk about “doing more with less” than “designing our organization to operate efficiently,” and it’s very hard to make efficiency-based arguments with an executive team in a cost reduction mindset
Many companies operate to a portfolio allocation target across departments, e.g. 20% to General & Administrative (G&A), 30% to Sales & Marketing (S&M), and 50% to Research & Development (R&D). Most pattern shifts will eat into the G&A budget, increasing G&A costs and preventing other planned G&A hiring. This generally means that many of your peer executives will be resistant to the proposal
A surprising number of budgeting processes operate in headcount rather than budget, which makes it hard to intuitively express ideas like “replacing two high-cost Staff engineers with three mid-level Technical Program Managers.” Headcount-anchored budgeting processes portray the latter as more expensive than the former, even though it’s not. It’s possible to push through this confusion, but it imposes an ongoing communication cost
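
The headcount-versus-dollars mismatch in the last item is easy to see with a small calculation. The sketch below is illustrative only: the `Role` type, the `plan_totals` helper, and especially the annual cost figures are hypothetical placeholders, so substitute your own fully loaded costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    title: str
    annual_cost: int  # fully loaded annual cost in dollars (hypothetical)


def plan_totals(roles):
    """Return (headcount, total annual cost) for a staffing plan."""
    return len(roles), sum(role.annual_cost for role in roles)


# Hypothetical figures, chosen only to illustrate the comparison.
staff_engineer = Role("Staff Engineer", 400_000)
tpm = Role("Technical Program Manager", 220_000)

current_plan = [staff_engineer] * 2
proposed_plan = [tpm] * 3

current_headcount, current_cost = plan_totals(current_plan)
proposed_headcount, proposed_cost = plan_totals(proposed_plan)

headcount_change = proposed_headcount - current_headcount
cost_change = proposed_cost - current_cost

print(f"Headcount change: {headcount_change:+d}")
print(f"Annual cost change: {cost_change:+,d} dollars")
```

Under these assumed costs, a headcount-anchored process sees one additional role, while a dollar-anchored process sees a lower total. Whether the dollar total actually falls depends entirely on your own compensation numbers.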

Each of these concerns can be overcome with planning, patience, and conviction, but you have to decide yourself whether it makes sense to do so. My general advice is to stick with Baseline, and avoid moving up the pattern cost curve unless revenue or headcount is growing significantly.

Navigating the trend cycle

There are numerous process trends, for example the industry’s era of Agile obsession: I once worked with someone whose job title was Agiletect. Similarly, there are also trends in who works on processes, and whether evolving processes is highly impactful, essential work or boring, dull, glue work that won’t get you promoted. In my own career, I’ve seen a major shift from “technical management” to “people-centric management,” and an equally abrupt counter-reaction from “people-centric management” to “management is overhead.”

If I bought into each of these trends, then I’d inflict a constant churn on the teams I lead, and we’d never get particularly good at any given way of working. That’s not to say that trends are meaningless; there’s always something real behind a given trend. In industry downturns, there will be a trend that reduces cash flow. In industry upturns, there will be a trend that takes advantage of copious investor cash. The shift towards people-centric management was in part driven by a generational shift in United States employees, with Millennials becoming a larger part of the workforce. There’s something to learn from these trends, but it’s a mistake to take them at face value: what has been working thus far hasn’t become irrelevant overnight.

My advice is to design your organization around one specific set of these foundational beliefs, and stick solid to those beliefs for at least three to four years before changing them again. Change can be restorative, but if you change too quickly then you’ll never be able to learn what is or isn’t working well enough to inform future changes.