The most common failure mode in enterprise agent deployments is not a model problem. It is a context problem. Agents that can read files, run terminal commands, and call APIs still fail consistently when they encounter questions that require understanding the living system around them: what changed last week, which dependency introduced a vulnerability, why a service degraded after a deployment. GitLab Orbit and SigNoz's Noz assistant both illustrate this pattern clearly, and together they point toward an architectural shift that engineering leaders need to understand before they evaluate their next agentic tool.
Why File-Level Awareness Is Not Enough
Most agent frameworks give models access to a working directory and a set of tools. That is sufficient for isolated tasks. It is insufficient for the kinds of questions that matter in production engineering workflows.
When a developer asks an agent why a pipeline is failing, the answer rarely lives in a single file. It lives in the intersection of a recent commit, a dependency version bump, a changed environment variable, and a test that was passing three days ago. An agent without access to that change history cannot reason about the failure. It can only guess.
The commercial consequence is that agents built on strong models still produce low-confidence, low-utility answers for system-level questions. Engineering teams lose trust in the tooling quickly, and adoption stalls before the productivity gains materialise.
What GitLab Orbit Introduces: Lifecycle Context Graphs
GitLab Orbit is designed around the idea that code does not exist in isolation. It exists within a lifecycle that includes merge requests, CI/CD pipelines, security scans, dependency graphs, and review history. Orbit's architecture makes that lifecycle queryable by agents.
Dependency-Aware Reasoning
Rather than treating a repository as a flat collection of files, Orbit constructs a graph of relationships between components. An agent querying this graph can answer questions like "which services depend on this library" or "what changed in this module since the last stable release" without requiring a human to assemble that context manually.
This matters because dependency reasoning is where most file-level agents break down. A vulnerability in a transitive dependency is invisible to an agent that can only read the top-level manifest. Orbit surfaces that relationship explicitly, which means the agent can reason about impact rather than just presence.
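To make the transitive-dependency point concrete, here is a minimal sketch of the kind of traversal a dependency graph enables. It is not Orbit's API: the `DEPENDS_ON` data, the package names, and the `affected_by` helper are all hypothetical, chosen only to show why reading a top-level manifest is not enough.

```python
from collections import deque

# Hypothetical data: each key depends directly on the packages in its list.
DEPENDS_ON = {
    "checkout-service": ["payments-sdk", "http-client"],
    "search-service": ["http-client"],
    "payments-sdk": ["crypto-lib"],
    "http-client": ["tls-lib"],
    "crypto-lib": [],
    "tls-lib": [],
}


def reverse_edges(depends_on):
    """Invert the graph so each package maps to its direct dependents."""
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)
    return dependents


def affected_by(package, depends_on):
    """Return every component that depends on `package`, directly or transitively."""
    dependents = reverse_edges(depends_on)
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# crypto-lib never appears in checkout-service's own manifest,
# yet a vulnerability in it still reaches that service.
impacted = affected_by("crypto-lib", DEPENDS_ON)
print(sorted(impacted))  # ['checkout-service', 'payments-sdk']
```

In this toy graph, `checkout-service` lists only `payments-sdk` and `http-client`, so a manifest-only check would miss its exposure to `crypto-lib`. The graph traversal finds it.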
Change History as a First-Class Input
Orbit also treats commit and review history as structured data rather than narrative logs. An agent can query which engineer reviewed a change, what automated checks passed or failed, and whether a similar change was reverted previously. That historical signal is what separates a useful suggestion from a plausible-sounding hallucination.
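As a rough illustration of history as structured data, the sketch below models change records as typed objects and asks whether earlier changes to the same files were reverted. The `ChangeRecord` fields, the sample history, and the `prior_reverts` helper are assumptions made for this example and do not reflect Orbit's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    change_id: str
    paths: frozenset      # files touched by the change
    reviewer: str
    checks_passed: bool   # did automated checks pass?
    reverted: bool        # was the change later reverted?


# Hypothetical history, oldest first.
HISTORY = [
    ChangeRecord("c1", frozenset({"billing/retry.py"}), "reviewer-a", True, True),
    ChangeRecord("c2", frozenset({"docs/readme.md"}), "reviewer-b", True, False),
    ChangeRecord("c3", frozenset({"billing/retry.py", "billing/api.py"}), "reviewer-a", False, False),
]


def prior_reverts(paths, history):
    """Return earlier changes that touched any of `paths` and were later reverted."""
    touched = frozenset(paths)
    return [record for record in history if record.reverted and record.paths & touched]


risky = prior_reverts({"billing/retry.py"}, HISTORY)
print([record.change_id for record in risky])  # ['c1']
```

Because each record is typed, the question "was a similar change reverted before" becomes a filter rather than a search through commit messages.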
What Noz Introduces: MCP-Native Observability Reasoning
SigNoz's Noz assistant approaches the same problem from the observability side. It is built on the Model Context Protocol, which means it does not just surface metrics and traces as text. It makes them queryable by agents as structured, typed context.
Structured Telemetry as Agent Input
The practical difference between a chatbot that can read a dashboard screenshot and an MCP-native assistant is significant. Noz can receive a trace, identify the span where latency spiked, correlate it with a deployment event, and return a structured answer. A screenshot-reading agent can describe what it sees. Noz can reason about what it means.
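The sequence described above (find the slow span, correlate it with a deployment, return a structured answer) can be sketched in a few lines. This is an illustration under stated assumptions, not Noz's implementation: the `Span` and `Deployment` types, the one-hour correlation window, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Span:
    name: str
    service: str
    start: datetime
    duration_ms: float


@dataclass
class Deployment:
    service: str
    version: str
    deployed_at: datetime


def slowest_span(spans):
    """Pick the span with the largest duration in a trace."""
    return max(spans, key=lambda s: s.duration_ms)


def recent_deployment(span, deployments, window=timedelta(hours=1)):
    """Latest deployment of the span's service within `window` before the span started, or None."""
    candidates = [
        d for d in deployments
        if d.service == span.service
        and timedelta(0) <= span.start - d.deployed_at <= window
    ]
    return max(candidates, key=lambda d: d.deployed_at, default=None)


def explain_latency(spans, deployments):
    """Return a structured finding instead of a prose description."""
    span = slowest_span(spans)
    deploy = recent_deployment(span, deployments)
    return {
        "slow_span": span.name,
        "service": span.service,
        "duration_ms": span.duration_ms,
        "suspect_deployment": deploy.version if deploy else None,
    }


# Hypothetical trace and deployment events.
t0 = datetime(2024, 1, 1, 12, 0, 0)
TRACE = [
    Span("GET /cart", "gateway", t0, 40.0),
    Span("SELECT orders", "orders-db", t0 + timedelta(milliseconds=5), 1850.0),
    Span("render", "gateway", t0 + timedelta(seconds=2), 12.0),
]
DEPLOYMENTS = [
    Deployment("orders-db", "v2", t0 - timedelta(minutes=20)),
    Deployment("gateway", "v7", t0 - timedelta(hours=5)),
]

finding = explain_latency(TRACE, DEPLOYMENTS)
print(finding)
```

The correlation here is only temporal, so the result is a suspect rather than a proven cause. The point is that typed spans and deployment events make the join possible at all, which a screenshot does not.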
This distinction matters for engineering leaders because it determines whether an agent can close a loop autonomously or whether it always requires a human to interpret the output. Noz is designed for the former, which is the precondition for genuine workflow automation rather than assisted search.
In-Product Agent Design Patterns
Noz also illustrates a broader pattern: agents embedded directly in the tools engineers already use, with access to the full context those tools hold. This is architecturally different from a general-purpose coding assistant that has been given a few tool calls. The agent is a native participant in the system, not a visitor with limited credentials.
The Architectural Shift: From Capability to Context Infrastructure
The pattern across both Orbit and Noz is not that they use better models. It is that they invest in richer context layers. The model is a reasoning engine. The context layer is what determines whether that engine has enough signal to produce a useful answer.
Engineering leaders evaluating agentic tooling should ask a specific question: what can this agent query, and how is that query answered? An agent that can call a search API is different from an agent that can traverse a dependency graph or correlate a trace with a deployment event. The former answers questions about text. The latter answers questions about systems.
This shift also has implications for how organisations build their own agent infrastructure. Context layers need to be designed, maintained, and governed. They are not a byproduct of connecting a model to a codebase. They require intentional investment in data structures, access controls, and freshness guarantees.
Evaluating Your Agent Infrastructure Against This Standard
The practical question for a CTO is not which model vendor to use. It is whether the context infrastructure around your agents is rich enough to support the questions your engineers actually need to answer.
A useful evaluation framework has three dimensions. First, coverage: can agents query the systems that hold relevant context, including version control history, dependency graphs, CI/CD state, and observability data? Second, structure: is that context returned as typed, queryable data or as unstructured text that the model must parse? Third, freshness: is the context layer updated in near-real-time or does it reflect a stale snapshot?
Agents that score poorly on these dimensions will underperform regardless of model capability. The failure will look like hallucination or low confidence, but the root cause is missing context. Addressing it requires infrastructure investment, not model upgrades.
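One way to make the three dimensions operational is a simple checklist over the context sources an agent can reach. The sketch below is illustrative only: the `ContextSource` type, the source names, and the five-minute staleness threshold are assumptions, and a real evaluation would set its own thresholds per system.

```python
from dataclasses import dataclass
from datetime import timedelta

# The four systems named in the coverage dimension.
REQUIRED_SOURCES = {"version_control", "dependency_graph", "ci_cd", "observability"}


@dataclass
class ContextSource:
    name: str
    typed: bool            # returns typed, queryable data rather than free text
    staleness: timedelta   # typical lag behind the underlying system


def evaluate(sources, max_staleness=timedelta(minutes=5)):
    """Report gaps along the coverage, structure, and freshness dimensions."""
    names = {s.name for s in sources}
    return {
        "coverage_gaps": sorted(REQUIRED_SOURCES - names),
        "unstructured": sorted(s.name for s in sources if not s.typed),
        "stale": sorted(s.name for s in sources if s.staleness > max_staleness),
    }


# Hypothetical inventory of what one agent can currently query.
inventory = [
    ContextSource("version_control", typed=True, staleness=timedelta(seconds=30)),
    ContextSource("ci_cd", typed=False, staleness=timedelta(minutes=1)),
    ContextSource("observability", typed=True, staleness=timedelta(hours=6)),
]

report = evaluate(inventory)
print(report)
```

For this inventory the report flags a missing dependency graph, an unstructured CI/CD feed, and stale observability data, which are three different infrastructure fixes and none of them a model upgrade.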
This article is a companion piece to our broader work on production agent deployments. See Why Most of Enterprise AI Agent Projects Never Leave the Pilot Stage for a practical guide to the organisational and architectural decisions that separate production deployments from perpetual pilots.
FAQs
A lifecycle context graph is a structured representation of the relationships between code components, change history, CI/CD state, and dependency trees within a software development system. It matters for agent deployments because it gives agents a queryable model of how the system has evolved over time, not just what it looks like at a single point. Without this graph, agents cannot reason about causality, which is the core requirement for answering system-level engineering questions reliably.
The Model Context Protocol (MCP) is an open protocol that standardises how applications expose tools and data sources to language models. Ad hoc tool-calling integrations each define their own request and response shapes, and often return free text that the model must interpret. An MCP server instead describes its tools with declared schemas, so a client can discover what is queryable and receive results in a predictable, structured form with explicit fields and types. For observability use cases, this means an agent can correlate a trace with a deployment event rather than simply describing what a dashboard shows.
Evaluate across three dimensions: coverage (which systems can the agent query), structure (is context returned as typed queryable data or unstructured text), and freshness (how quickly does the context layer reflect changes in the underlying systems). Ask vendors to demonstrate a query that crosses system boundaries, for example correlating a code change with a downstream service degradation. If the agent cannot answer that kind of question without human assembly of context, the context layer is insufficient for production engineering workflows.
Model capability determines how well an agent reasons over the context it receives. It does not determine how much context the agent has access to. When an agent lacks access to dependency graphs, change history, or observability data, it cannot answer system-level questions accurately regardless of model quality. The output looks like hallucination or low confidence, but the root cause is missing signal. Upgrading the model does not resolve a context infrastructure deficit.
Context layers introduce new access control requirements because they aggregate sensitive information across systems that previously had separate permission models. An agent querying a dependency graph, a CI/CD history, and an observability platform simultaneously may have broader effective access than any human operator. Engineering teams need to apply least-privilege principles to context layer access, maintain audit trails of what context was retrieved for each agent action, and establish freshness policies to prevent agents from reasoning over stale data. These are infrastructure concerns, not model concerns, and they need to be designed before the context layer is built.
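A minimal sketch of the least-privilege and audit-trail ideas above might look like the following. The `ContextGateway` class, the grant table, and the agent and source names are hypothetical. A production design would also need authentication, durable log storage, and per-query scoping, which are omitted here.

```python
from datetime import datetime, timezone


class ContextGateway:
    """Single entry point that checks grants and records every context request."""

    def __init__(self, grants):
        self._grants = grants   # agent id -> set of source names it may query
        self.audit_log = []     # in-memory for the sketch only

    def query(self, agent_id, source, query, fetch):
        allowed = source in self._grants.get(agent_id, set())
        # Log the attempt whether or not it is allowed.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "source": source,
            "query": query,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{agent_id} may not query {source}")
        return fetch(query)


# Hypothetical grant: the triage agent may read CI/CD state and nothing else.
gateway = ContextGateway({"triage-agent": {"ci_cd"}})

result = gateway.query(
    "triage-agent", "ci_cd", "pipeline:latest",
    fetch=lambda q: {"query": q, "status": "failed"},
)

try:
    gateway.query("triage-agent", "observability", "trace:latest", fetch=lambda q: {})
    denied = False
except PermissionError:
    denied = True
```

Routing every request through one gateway keeps the agent's effective access equal to its explicit grants, and the log records what context was requested for each action, including denied attempts.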
For organisations that already run mature observability, version control, and CI/CD infrastructure, the data required for a rich context layer largely exists. The investment is in structuring that data for agent consumption, building the query interfaces, and maintaining freshness. That is a meaningful engineering effort, but it is not a greenfield build. Organisations that lack mature underlying systems will find context layer investment harder, because the quality of the context layer is bounded by the quality of the systems it draws from. Addressing tooling maturity is a prerequisite, not a parallel workstream.

