FinOps & Beyond is what engineering, finance, and IT leaders read to understand FinOps, and what it means for operating models, accountability, and spend decisions.

At the same time, there is a widely accepted narrative that AI is improving engineering productivity by increasing output. More code is being written, and more features are moving through the development pipeline. On the surface, that should translate into better efficiency and stronger margins, as we discussed last week.

In practice, the system has not fully adapted to absorb that output. Data from Faros, spanning a large population of developers, points to a shift rather than an elimination of work. Lead times are increasing, time spent in pull request review is rising, incidents per change are climbing, and code churn is expanding.

The work has not gone away. It has moved downstream into review cycles, incident response, and the attention of more senior engineers. This matters because it changes not just the volume of work, but the cost profile of that work. More of the effort now sits with higher-cost labor, and more time is required to validate and stabilize what is being produced. All of this is operationally focused work that is traditionally included in margin calculations.

Where Margin Gets Squeezed

At the same time, a new set of cost categories has emerged, most of which did not exist in a meaningful way as recently as two years ago. Inference compute, token consumption, reasoning model overhead, and evaluation tooling are now tied directly to product functionality. These costs are not fixed. They scale with usage and grow alongside customer adoption.

When combined with the shift in labor dynamics, this creates two simultaneous sources of pressure on gross margin. The first is an increase in infrastructure-related cost per unit driven by AI. The second is an increase in labor cost per unit driven by the need for more senior oversight and operationally focused activities.

Neither of these is currently captured cleanly in a traditional cloud cost report. Both are clearly visible in financial outcomes. This is why organizations can report improving cloud efficiency while still seeing margin degradation.
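A minimal worked sketch of that effect, using purely hypothetical per-unit figures (none of these numbers come from the data cited in this issue, and treating all three cost lines as part of cost of goods sold is an assumption): cloud cost per unit falls, yet gross margin still declines because AI and labor cost per unit rise.

```python
# Hypothetical per-unit figures for one product, before and after an AI rollout.
# All numbers are illustrative assumptions, not measured data.

def gross_margin(revenue, cloud, ai, labor):
    """Gross margin as a fraction of revenue, with all three cost lines counted."""
    return (revenue - (cloud + ai + labor)) / revenue

before = {"revenue": 100.0, "cloud": 12.0, "ai": 0.0, "labor": 10.0}
after = {"revenue": 100.0, "cloud": 10.0, "ai": 9.0, "labor": 13.0}

margin_before = gross_margin(**before)
margin_after = gross_margin(**after)

print(f"cloud cost per unit: {before['cloud']:.2f} -> {after['cloud']:.2f}")
print(f"gross margin: {margin_before:.0%} -> {margin_after:.0%}")
```

A cloud cost report would show the first printed line as a win. Only the second line reflects what happened to margin.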

The 50% Reality

For a long time, software economics were built on the assumption that the marginal cost of delivery was close to zero. Code was expensive to build, but inexpensive to run at scale. That dynamic supported gross margins in the seventy to eighty percent range and shaped how companies were valued and operated.

AI introduces a different model. Each interaction carries a cost, whether through inference, validation, or additional processing layers. At the same time, the human effort required to manage that system does not disappear. It shifts into more specialized and often more expensive forms of operationally focused work.

The result is a change in marginal economics that many organizations are still adjusting to. While it is too early to define a universal new baseline, it is increasingly clear that the historical margin assumptions are under strain. The gap between expected and actual performance is what many teams are now working through.

The Honest Reframe

The response to this pressure is often framed as efficiency. Headcount reductions are explained as a result of AI-driven productivity gains, with the implication that less labor is required to produce the same output.

A more accurate framing, in many cases, is margin defense. Organizations are reallocating cost from labor to AI infrastructure and services in order to maintain financial targets that were built on a different cost structure. This is a rational adjustment, but it is not the same as work disappearing.

The underlying effort still exists, and in some areas, it has increased. When decisions are made based on the assumption that productivity gains will fully offset these changes, the system can correct in unintended ways. Reports indicating that a significant percentage of companies regret AI-driven layoffs, or that rehiring costs exceed initial savings, reflect this mismatch between expectation and operational reality.

This is why the Forrester boomerang stat exists. Fifty-five percent of employers "regret laying off workers because of AI," according to Forrester's Predictions 2026: The Future of Work. Nearly a third say rehiring cost more than the layoffs saved. The decision got made on the productivity story instead of the margin story, and the productivity story did not survive contact with the actual workflow.

What This Means in Practice

This shift is already changing the types of questions being asked across organizations. Engineering and IT leaders are being asked to explain margin performance at the product level, not just infrastructure efficiency. FinOps teams are being drawn into discussions that extend beyond cloud billing data. Engineering managers are increasingly expected to quantify work that is much more operationally focused, including time spent reviewing and validating AI-generated output and responding to incidents.

The organizations that navigate this effectively will be the ones that treat gross margin as a shared responsibility across engineering, IT, finance, and FinOps. They will invest in better instrumentation of both labor and AI-related costs, even when that data is less structured than a cloud bill. And they will align their operating models to reflect the reality that cost now scales with both usage and complexity.

The Shift for the Discipline

FinOps is not becoming less relevant in this environment. It is becoming more central, but with a broader mandate. Measuring cloud cost remains necessary, but it is no longer sufficient to explain financial performance.

The discipline is evolving toward full unit economic stewardship, where infrastructure, AI, and labor are considered together. That transition is still in progress, and there is no established playbook yet.

What is clear is that the conversation has moved. Gross margin is now a major focal point, and the ability to connect cost drivers across the system will define how effectively organizations respond.

That is the gap many teams are now encountering, and it is where the next phase of FinOps will need to operate.

FinOps Signal

Structural Trend Quick Takeaway

Engineering Work Tracking Becomes a Margin Discipline

The earlier argument established that AI has shifted a meaningful portion of engineering and/or IT cost into areas that FinOps does not consistently measure. The implication is more practical than theoretical. Tracking where work actually happens, across the stages of the SDLC, is no longer a nice-to-have optimization signal. It is becoming a requirement for explaining margin impact and any downward trend in it.

Without that visibility, teams are left with an incomplete story. Cloud costs may trend down, AI-related spend may trend up, and gross margin may move in a way that does not reconcile cleanly with either. The data exists, but it does not connect. That is the gap most Directors, VPs, and CTOs/CIOs are now encountering when they are asked to explain product-level efficiency.

What has changed is not just the introduction of new costs, but the redistribution of effort. AI has altered where time is spent in the engineering workflow. If that shift is not measured, it cannot be tied to financial outcomes. The result is a growing disconnect between what teams optimize and what the business ultimately cares about.

What This Actually Looks Like

In practice, this requires a shift away from thinking about cost purely in terms of headcount and toward thinking about cost in terms of work.

Instead of allocating engineering and/or IT cost based on org structure, the focus moves to where time is actually spent across the lifecycle of delivery. That means understanding how effort is distributed across stages such as authoring, code review, testing, integration, deployment, incident response, and rework. Once that distribution is visible, it can be mapped to the products or features that the work supports, and then translated into cost using fully loaded rates that reflect differences in seniority.
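As a rough sketch of that translation, assuming hours have already been attributed to a product, a stage, and a seniority band: the rates, product names, and record layout below are hypothetical placeholders, not a recommended schema.

```python
from collections import defaultdict

# Hypothetical fully loaded hourly rates by seniority (assumed values).
RATES = {"junior": 70.0, "senior": 120.0, "staff": 160.0}

# Hypothetical effort records: (product, stage, seniority, hours).
EFFORT = [
    ("checkout", "authoring", "junior", 120.0),
    ("checkout", "code_review", "senior", 60.0),
    ("checkout", "incident_response", "staff", 20.0),
    ("search", "authoring", "senior", 80.0),
    ("search", "rework", "junior", 30.0),
]

def allocate_labor_cost(effort, rates):
    """Return {product: {stage: cost}} from hours and seniority-based rates."""
    cost = defaultdict(lambda: defaultdict(float))
    for product, stage, seniority, hours in effort:
        cost[product][stage] += hours * rates[seniority]
    return {product: dict(stages) for product, stages in cost.items()}

labor_cost = allocate_labor_cost(EFFORT, RATES)
for product, stages in labor_cost.items():
    print(product, stages, "total:", sum(stages.values()))
```

Pricing hours by seniority rather than by a blended rate is what lets a shift of review work toward senior engineers show up as a cost change even when total hours stay flat.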

Most organizations already have the raw signals needed to approximate this. Pull request metadata captures review cycles. Ticketing systems reflect transitions between stages. CI/CD pipelines log deployments. Incident management systems record response and resolution time. The data is already being generated as part of normal operations.

The shift is in aggregation and alignment. Those signals need to be brought together into a stage-based view of work, and then aligned with cloud and AI-related cost data. This is not a clean or perfectly defined process, and it is not owned by a single function. Engineering/IT, FinOps, and FP&A all have a role. But FinOps is increasingly the function that can connect these datasets into a single allocation model.

Why This Changes the Margin Conversation

The reason this level of tracking matters is that AI does not simply add cost. It changes how cost and effort are distributed across the system.

New infrastructure costs tend to attach to specific stages of the lifecycle, such as inference during runtime or evaluation during testing. At the same time, engineering effort shifts as workflows adapt. More time moves into review and validation. Less time may be spent on initial authoring, but the total effort does not necessarily decline, and the work increasingly concentrates with more senior engineers.

If cost is only measured at the infrastructure layer, these movements remain invisible. The gross margin line reflects the outcome, but not the cause. This is why organizations can report improving cloud efficiency while still experiencing margin compression at the product level.

Once work is tracked at the stage level, the connection becomes visible. The cost of delivering and operating a feature can be understood as a combination of infrastructure, AI services, and labor distributed across the lifecycle. Changes in margin can then be traced back to specific shifts in that distribution, whether driven by tooling, process, or product behavior.

This is the difference between observing margin movement and being able to explain it.

Where to Start

The starting point is not new tooling. It is using existing signals with a different lens.

Most organizations already capture the data needed to approximate how work flows through the SDLC. Ticketing systems, source control, CI/CD pipelines, and incident management platforms all contain fragments of that picture. The first step is to bring those signals together into a simple, stage-based view of engineering work. It does not need to be precise. It needs to be directionally consistent.

From there, that stage-based view should be mapped to product lines or major workflows. This is where the model begins to align with how the business measures margin. Even a rough allocation of engineering effort to products is enough to surface differences in cost structure that are otherwise hidden.

Once that baseline exists, the value comes from repetition. Running the same allocation on a regular cadence (monthly is typically sufficient) allows teams to observe how the distribution of work changes over time. When AI tooling shifts effort toward review, validation, or incident response, it will appear in the stage mix. When that shift carries cost implications, it will eventually surface in gross margin.

The objective is not a perfect model. It is a stable one. Consistency over time is what turns this from an exercise into a signal.

The Shift for FinOps

This is where the scope of FinOps is beginning to extend.

Cloud cost remains a necessary foundation, but it is no longer sufficient to explain financial performance on its own. Engineering/IT labor and AI-related costs are now part of the same unit economic equation, whether they are formally included in the model or not.

The teams that start connecting these inputs now will be able to answer a different class of question. Not just what infrastructure costs, but what it actually takes to build, ship, and operate a product.

That is the shift underway. And it is what turns engineering work tracking from an operational detail into a margin discipline.

FinOps Industry

News or Market Updates for Engineering, IT, or FinOps leaders.

Summary: Hyperscalers shipped visibility primitives through April, but the FinOps news that matters landed in the past week. AWS opened a new cost domain. The MCP Server went GA. Coinbase joined the AI layoff list. The regret consensus solidified. Practitioners are now writing the same gross-margin compression argument the lead essay makes.

Practitioner Read

FinOps Foundation and Open Source

  • FinOps X 2026 (June 8-11, San Diego) is nearly here. AI-related sessions dominate. Day 1 keynote: "The Wild West of AI, Token Economics and the New Job of FinOps." Other sessions include "Implementing Full Stack AI Cost Attribution," "The Cost of Cognition," and "Onboarding Your AI Hire."

  • FOCUS v1.4 is in final review with public announcement scheduled for FinOps X (release planning). Adds FOCUS Invoice Dataset and contract commitment dataset support.

  • OpenCost v1.120.0 shipped MCP support built into the Helm chart.

Hyperscaler Updates

AWS

  • AWS Weekly Roundup, May 11, flagged two FinOps-adjacent items.

    • Amazon Bedrock AgentCore payments (Coinbase partnership) lets AI agents pay for APIs autonomously. New cost domain, no current FinOps allocation.

    • MCP Server reached GA, giving AI agents authenticated access to AWS services through a fixed tool set.

    Both signal where the next wave of blind spots lands.

April highlights, in brief:

Google
  • Google shipped Prepay Billing for Gemini API plus mandatory tier caps ($250 to $20,000+ monthly ceilings, 10-minute enforcement delay).

AWS
  • AWS launched Bedrock Projects and IAM principal cost allocation, giving two complementary attribution paths for inference spend.

Microsoft
  • Azure SRE Agent published a dual-rate billing model with Azure Agent Units as the unit cost.

Macro Trends

  • Coinbase laid off 14% of staff, roughly 700 roles, on May 5. CEO Brian Armstrong cited AI productivity gains. (Business Insider.) Year-to-date tech layoffs cross 93,000 across 106 companies, with 2026 already at 75% of 2025's full-year total by month five. (Economic Times, May 7.)

  • Diginomica, May 11, on the regret wave: Forrester's J.P. Gownder and Orgvue research describe companies that eliminate expertise before building proven AI capability and end up with "skills gaps, institutional knowledge loss, and failed automation initiatives that cost more than the headcount savings." The lead essay's "margin defense dressed up as productivity" framing is now analyst consensus.

FinOps Company Spotlight

If you would like your company included in the Spotlight, contact the CloudXray AI Team

Company: CloudXray AI

Category: Managed Services & Consulting

What They Do: FinOps Consulting & Advisory Services; Owners & maintainers of the single largest FinOps company directory (finops.cloudxray.ai)

Why It Matters: Companies still need guidance on implementing FinOps and understanding the landscape of companies that exist

Operator Playbook

Practical guide for leaders and practitioners

Standing Up Engineering / IT Work Allocation

Issue 09's playbook covered how to build a margin defense once you can see where AI moved the cost. This playbook covers the work that has to happen first. Stage-level engineering allocation does not exist in most FinOps stacks today. Building it is new territory for these teams as it has mostly stayed within Engineering and/or IT. I have done this before, so here is how to start.

Step 1. Pick one product line, not the whole org

Stage-level allocation across an entire engineering organization is easily a six-month project. For one product line it is a six-week project or less. Pick the product line your CFO is asking about, one where margin pressure is already visible, or one with a recent AI investment that lacks an outcome story. That is your pilot.

If you do not know which product line, ask FP&A. They have the list of business units where gross margin is moving the wrong way. Start there.

Oh, and make this team your internal champion. It will come in handy later.

Step 2. Get FP&A into the working group from day one

This is not engineering / IT work that gets handed to finance later. The allocation methodology, the cost rates, and the materiality thresholds are FP&A's domain. Bringing them in at the start makes the eventual reporting credible. Bringing them in at the end means you produce numbers nobody will defend in a board meeting.

The working group is small: an Engineering / IT leader for the product line, a FinOps practitioner, and an FP&A partner. One owner from each, and that is the room.

Step 3. Map SDLC stages to existing data sources

You are not building new instrumentation. You are aggregating data that already exists.

  • Authoring: commit logs from GitHub or GitLab.

  • Code review: PR open and merge timestamps, comment counts.

  • QA: test run logs from your CI system.

  • Integration and deployment: deploy timestamps and durations.

  • Incident response: PagerDuty or your incident tool.

  • Rework: ticket transitions in Jira flagged as defects or post-deployment fixes.

You will not get a clean answer on the first try. That is fine. The goal is a first-cut baseline, not perfection.
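To make the code review signal concrete, here is a small sketch that turns PR open and merge timestamps into review cycle times. The record layout and field names are assumptions for illustration and do not mirror any specific GitHub or GitLab API response. Note that this measures elapsed wall-clock time, which is only a proxy for hours actually spent reviewing.

```python
from datetime import datetime

# Hypothetical pull request records; field names are assumed, not a real API shape.
PULL_REQUESTS = [
    {"id": 101, "opened_at": "2026-04-01T09:00:00", "merged_at": "2026-04-02T15:00:00", "comments": 14},
    {"id": 102, "opened_at": "2026-04-03T10:00:00", "merged_at": "2026-04-03T12:30:00", "comments": 2},
    {"id": 103, "opened_at": "2026-04-04T08:00:00", "merged_at": None, "comments": 5},  # still open
]

def review_cycle_hours(pr):
    """Elapsed hours from open to merge, or None if the PR is not merged yet."""
    if pr["merged_at"] is None:
        return None
    opened = datetime.fromisoformat(pr["opened_at"])
    merged = datetime.fromisoformat(pr["merged_at"])
    return (merged - opened).total_seconds() / 3600

cycles = {pr["id"]: review_cycle_hours(pr) for pr in PULL_REQUESTS}
merged_cycles = [hours for hours in cycles.values() if hours is not None]
average_cycle = sum(merged_cycles) / len(merged_cycles)

print(cycles)
print(f"average review cycle: {average_cycle:.2f} hours")
```

A first-cut baseline can start from cycle time alone. Comment counts or reviewer identity can be layered in later if the working group wants a closer estimate of senior-review effort.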

(Note: tools exist on the market and, if needed, CloudXray AI can provide advice on ones that can be helpful, based on your requirements.)

Step 4. Build the first allocation and compare it to AI cost on the same product line

Engineering / IT hours by stage, multiplied by cost rates, mapped to the product line. That is one number.

AI cost on the same product line, pulled from Bedrock Projects, Azure Cost Management, or your Gemini billing breakdown. That is the second number.

Both numbers go into the same view. For the first time, you can see the all-in cost of shipping that product line's features. Cloud, AI infrastructure, and engineering labor in a single allocation. This is where FinOps is going and AI helped make it happen.
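A minimal sketch of that single view, assuming labor cost by stage, AI cost, and cloud cost have each already been attributed to the pilot product line. Every figure and name here is a hypothetical placeholder. In practice the AI and cloud numbers would come from your billing exports.

```python
# Hypothetical monthly costs for one pilot product line (assumed values).
labor_by_stage = {"authoring": 8400.0, "code_review": 7200.0, "incident_response": 3200.0}
ai_cost = 6500.0     # e.g. inference spend attributed to the product line
cloud_cost = 9000.0  # non-AI infrastructure for the same product line

def all_in_view(labor_by_stage, ai_cost, cloud_cost):
    """Combine the three cost sources and report each one's share of the total."""
    components = {
        "labor": sum(labor_by_stage.values()),
        "ai": ai_cost,
        "cloud": cloud_cost,
    }
    total = sum(components.values())
    shares = {name: value / total for name, value in components.items()}
    return total, components, shares

total, components, shares = all_in_view(labor_by_stage, ai_cost, cloud_cost)
for name, value in components.items():
    print(f"{name:<6} {value:>10,.2f}  {shares[name]:.1%}")
print(f"{'total':<6} {total:>10,.2f}")
```

Keeping the labor input broken out by stage, rather than passing in one total, preserves the detail needed later to explain why the labor share moved.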

Step 5. Run it monthly and track the drift

The static number is interesting once. The drift over time is even more important.

Compare each month against the prior. Watch how the stage mix moves. When AI tooling changes the shape of the workflow, you will see it. When senior-review hours rise, you will see it. When AI infrastructure cost starts to dominate, you will see it. The drift is the discipline.
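One way to sketch the month-over-month comparison is to express each stage as a share of total hours and report the change in percentage points. The stage names and hours below are hypothetical.

```python
# Hypothetical hours by stage for two consecutive months (assumed values).
previous = {"authoring": 400.0, "code_review": 200.0, "incident_response": 100.0, "rework": 100.0}
current = {"authoring": 300.0, "code_review": 300.0, "incident_response": 120.0, "rework": 80.0}

def stage_mix(hours_by_stage):
    """Each stage's share of total hours, as a fraction between 0 and 1."""
    total = sum(hours_by_stage.values())
    return {stage: hours / total for stage, hours in hours_by_stage.items()}

def mix_drift(previous, current):
    """Change in each stage's share between two months, in percentage points."""
    prev_mix, curr_mix = stage_mix(previous), stage_mix(current)
    stages = sorted(set(prev_mix) | set(curr_mix))
    return {
        stage: round((curr_mix.get(stage, 0.0) - prev_mix.get(stage, 0.0)) * 100, 2)
        for stage in stages
    }

drift = mix_drift(previous, current)
for stage, delta in drift.items():
    print(f"{stage:<18} {delta:+.2f} pp")
```

In this example total hours are identical in both months, so a headcount-based view would show no change, while the mix shows effort moving from authoring into code review.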

This is the report the CFO has been asking for. Run it for the pilot product line for a quarter. Use that quarter's data to make the case for extending the allocation across the rest of the engineering organization.

And remember: even though you are reporting, use this data to take action. Don’t become Visibility Theater!
