The Forward Deployed Engineer

Forward Deployed Engineer (FDE) is the most in-demand technical role in AI right now. Not because of the title, because of the problem it solves.

AI research labs are shipping capabilities faster than enterprises can absorb them. The gap between what is technically possible and what is actually running in production is enormous, and closing that gap is the entire job.

Where the term comes from
What the job actually is
Audit
Evals
Deployment
Who does well in this role
Red flags
How to break in: a 30-day plan
The hiring bar
Why this role matters

Where the term comes from

The term originated at Palantir. Their version was literal: in 2010, they embedded engineers with Special Forces units in Afghanistan. Soldiers would run missions during the day, collect feedback on what the software wasn’t handling, and the FDEs would ship code overnight. The loop ran in hours, not quarters.

That model, an engineer embedded in the customer’s environment building against real constraints in real time, is what defines the role today. Palantir built their entire go-to-market around it. Now OpenAI, Anthropic, Google, and a growing number of Applied AI companies are hiring for the same thing.

The reason is structural. Foundation models are becoming commoditized. Every major enterprise will eventually have access to roughly the same capabilities. The competitive edge isn’t the model. It’s where and how you deploy it.

What the job actually is

An FDE is a technical person who works onsite with a customer’s team to integrate AI into their systems. Not a sales engineer closing demos. Not a PM gathering requirements. They write production code into codebases they’ve never seen, on timelines measured in weeks, while simultaneously explaining the business case to a VP who doesn’t have a technical background.

That combination, technical depth and commercial fluency, is what makes this a difficult hire and a high-leverage one.

The job has three phases: audit, evals, and deployment.

Audit

The audit is investigative. You’re onsite, embedded with different teams across the organization. Two weeks with revenue ops, one week with procurement, a full month with finance. The goal is to map every workflow and figure out where agents can actually deliver value.

From each team, you’re learning what their job looks like day to day, where the bottlenecks are, and where an agent might make a material difference.

The hardest part isn’t finding automation opportunities. It’s deciding what not to automate. Agents can create more problems than they solve in the wrong places. Three rough rules help:

Use an agent when the workflow has consistent rules but variable inputs (emails one day, PDFs the next, scanned images after that) and involves calling external tools. Variable inputs with consistent rules are exactly where agents outperform rigid code.

Use code when both the rules and inputs are predictable. Code is faster, cheaper, and easier to debug than a model for deterministic tasks.

Leave it manual when the decision requires pattern recognition and domain expertise that can’t be reduced to rules. Some judgment calls aren’t worth automating.

Beyond those principles: look for volume. An agent that runs five times a month isn’t going to move the needle. You want lengthy, repetitive processes where even a modest efficiency gain compounds.

The audit ends with a prototype, not a production system. A proof of concept to validate your hypothesis before committing real engineering time.

Evals

When a company is spending serious money on an AI deployment, they need to know it’s working. Evals are how you prove that.

A weak eval checks whether the final answer is correct. A stronger one checks whether the agent is reasoning correctly, whether it’s hitting the same checkpoints a skilled human would hit along the way.

Trace the human’s steps. A human doesn’t solve complex problems in one move. They break it down, check intermediate results, revise. Map those steps explicitly and grade the agent on each one, not just the output.

Build a golden dataset. Start with 20 real queries. For each one, manually write out what the ideal response looks like. Do this before you train or tune anything. Now you have a ground truth. Everything gets measured against it.

This does two things at once: it catches regressions as you iterate on the agent, and it builds trust with the customer. There are plenty of executives who are skeptical of whether AI actually works in their environment. A rigorous eval framework showing the agent performing correctly across real business scenarios is what converts a skeptic into a sponsor.

Deployment

The most expensive mistake in enterprise AI deployment is migrating data. Avoid it.

Instead of replacing existing systems, build APIs over the current data layer (SharePoint, legacy databases, ERPs) and put a model on top as an orchestration layer. The model queries through the API. The data stays where it is. This saves time, money, and the organizational pain of ripping out systems companies have spent years building.

Once the architecture is settled, create a sandbox within the customer’s own infrastructure. Run the agent there first. Test it. Break it on purpose. Only when it’s working reliably in the sandbox do you move toward production.

When you go to production, start narrow. Pick one workflow, get it working, then layer on capabilities. A practical starting point: an agent that catches bugs, investigates the root cause, and writes a ticket summarizing what it found. If that works consistently, then give it the ability to write code and open a pull request. Each new capability is earned by demonstrated reliability at the previous level.

Who does well in this role

Across multiple technology waves, the pattern for who succeeds in customer-facing technical roles is consistent. It comes down to traits that most engineers aren’t hired or evaluated on.

The first is communication. If someone can’t explain their most complex project to a non-technical person, they can’t be an FDE. Most of the job is bridging the gap between what AI can do and what a business decision-maker can evaluate. When that communication breaks down, the deployment doesn’t happen.

Related to that: ego. FDEs work inside the customer’s organization, alongside their engineering teams. Engineers who produce great code when left alone but create friction the moment they’re in a room with people who think differently are a liability here. The customer’s engineers have domain knowledge the FDE needs, and getting access to it requires genuinely respecting them.

Comfort with ambiguity matters more than most job descriptions admit. No two customer environments are alike. Some run on a chaotic mix of legacy systems and undocumented processes. Some have informal power structures that determine what actually gets approved. An FDE who gets frustrated by this won’t navigate it. The role suits someone who finds messy environments interesting rather than intolerable.

The best FDEs also tend to be tinkerers. They enjoy trying new things and don’t get attached to one stack. Every deployment involves adapting to constraints you didn’t anticipate. You have to be comfortable improvising.

Underlying all of it is systems thinking. The FDE knows their product well. The customer’s engineers know their domain well. There’s usually no natural overlap between those two worlds. Systems thinking is what builds the bridge, the ability to see how a solution fits into a business domain you’re encountering for the first time.

Red flags

Poor communication is the obvious one, but worth stating clearly. An FDE is the face of your organization at the customer’s site. If they can’t communicate confidently, it erodes trust in everything the product does.

Cultural inflexibility is less obvious but just as damaging. The customer’s culture may be very different from your own. FDEs who can’t adapt to different working norms will create friction. The job is to deploy solutions, not to fix how the customer runs their company.

Over-engineering with AI is the third. Most automation tasks can be handled with a series of tool calls and a single LLM call as the orchestration layer. Reaching for a model at every step increases token costs (which compound at scale), adds latency, and often produces worse outputs. Knowing when not to use AI is as important as knowing how.

How to break in: a 30-day plan

Three backgrounds tend to do well in FDE roles: consultants, product managers, and software engineers. Each has a different gap to close.

If you come from consulting or product

You can already translate data into ROI. That’s half the job. Your gap is engineering depth. Close it with a portfolio, not a credential. Pick two of these and build them end to end:

A production-ready agent that automates a whole workflow you used to do manually. It should call external APIs, log its reasoning, and handle failures without crashing.
A RAG pipeline built on a custom dataset for an industry you want to break into: legal docs, medical records, financial filings.
An eval framework that scores agent outputs across correctness, format, cost, and latency, applied to real business processes.
An MCP integration that connects an LLM to legacy software with no native AI support.

Don’t outsource your understanding to AI while building these. The point is to actually learn how these systems work, not produce artifacts.

If you come from software engineering

Your gap is communication. Build similar projects, but document every decision: the tech stack, the results, the iterations, and most importantly the business problem you were solving. Have a reason for building each thing. What was the pain point? How would this conversation go with a real client? What does a non-technical executive need to understand to approve the budget?

If you can’t explain what you built in business terms, you’re not ready.

The four checkpoints

Day 7: Implement an agent loop from scratch (prompt to model to response to next step). Add two tool calls using the Anthropic or OpenAI documentation. Build basic guardrails: input validation, a max-step limit, output filtering. Add an audit trail that logs every prompt, tool call, and response with timestamps.

Day 14: Enforce structured outputs (JSON by default). Understand what breaks going from demo to production and how to handle it. Add checkpointing: save agent state every N steps so it can restart from the last good state if something fails.

Day 21: Add retry logic with exponential backoff: 1s, 2s, 4s, 8s, cap at 16s for every external call. Optimize cost by using cheaper models for cheap subtasks, caching common prompts, and tracking cost per query. Build your golden dataset for evals. Learn multi-agent architectures where one agent plans, others execute, and one synthesizes.

Final week: Review everything. Then explain all of it out loud, to someone who doesn’t know what a model is. Tie every project to a business metric. If you can’t, figure out why.

The hiring bar

OpenAI’s FDE job description says it plainly: engineers who can operate independently in a customer environment, build trust with technical and non-technical stakeholders, and ship production code against real constraints. Not one or two of those things. All three.

The role is hard to staff because it requires fluency in both directions: deep enough technically to understand what’s possible, articulate enough to explain why it matters to someone who doesn’t write code. Most engineers have the first. Most non-technical roles have the second.

One wrong hire here doesn’t just create an underperformer. It damages the customer relationship. That’s why the bar is high and the compensation reflects it.

Why this role matters

Enterprise AI adoption isn’t moving slowly because the technology isn’t ready. It’s moving slowly because there aren’t enough people who can take it from demo to production inside a real organization, with real politics, real legacy infrastructure, and real skepticism from the people whose jobs it will change.

That’s the actual job. It’s harder than it looks from the outside. It’s also more interesting than most engineering roles, and the demand for people who can do it well isn’t going away.

“In a crisis, be aware of the danger—but recognize the opportunity.”-John F. Kennedy

Rushi's

Ctrl+AI+Ship