Your infrastructure doesn’t stop at deployment.
Neither does the intelligence watching it.
Operate is the Day 2 intelligence layer that continuously monitors everything your organization has deployed, reasons about what it finds, and acts autonomously within the boundaries your team defines. Not scheduled scans. Not rigid workflows that fork on conditions. Genuine operational intelligence that evaluates context, determines the right action, and executes, before problems become incidents and waste becomes a budget crisis.
One control plane. Complete visibility, real-time intelligence, and agents that act rather than alert.
- Operate monitors every environment Curate has inventoried and Self-Service has deployed. It detects drift the moment it occurs, attributes cost to the team responsible, and gives the AI Copilot the context to act rather than alert.
- Operate is the control plane your SREs, platform engineers, and FinOps teams work from. Every environment, every cost anomaly, every compliance gap, in one real-time view. It surfaces what matters, to the right team, before it becomes a problem.
- The agents within Operate don’t follow a fixed decision tree. They reason about context, evaluate options, and determine the right action within the policy boundaries your platform team defined. That is the difference between automation and autonomy.
What Operate delivers
Three capabilities that turn Day 2 from a reactive struggle into a proactive discipline
Every environment you deploy needs to be monitored, optimized, and kept compliant after launch. Operate closes the gap between provisioning and production reality, automatically.
The moment an environment deviates from its blueprint, Operate knows and acts.
- Operate continuously compares every live environment against its IaC specification, detecting configuration drift, manual changes, and out-of-band edits the moment they occur
- Auto-remediation re-aligns the environment or routes it to an approval workflow for human sign-off, based on the policy your team defined
- Every environment affected by a shared resource change is identified and surfaced immediately, with full context on what changed and what depends on it
The AI Copilot has full visibility into drift state. When it advises on a deviation, it reasons about the specific deviation, its impact across dependent environments, and the appropriate action for that particular context. The same type of drift in two different environments may warrant different responses. The Copilot understands the difference.
Every dollar attributed, every waste source identified, before the bill arrives.
- Cost attribution is applied at the point of provisioning. Every environment is tagged to the team, project, and business unit that deployed it, with real-time spend visibility, not monthly review
- Cost policy enforcement blocks deployments that would exceed defined spend limits at launch
- The AI agent monitors continuously, evaluates the cost and utilization of each resource, and flags idle and zombie resources before they generate unnecessary spend
The AI Copilot does not execute a cost optimization script on a schedule. It monitors continuously, evaluates the cost and utilization profile of each idle resource in context, and recommends the appropriate action based on what it knows about that environment and the team that owns it. Governed boundaries define the limits of what it can do. Within those limits, the decision is the agent’s.
Every change governed, every action logged, every audit ready without manual effort.
- Out-of-band changes trigger approval workflows before they enter the platform record. Every action on every environment is logged automatically, including who launched it, modified it, extended its TTL, or destroyed it
- Production blueprints require designated approver sign-off with configurable SLA timers, so nothing bypasses review
- A complete, timestamped audit trail is generated by the platform, not assembled manually, and is available for compliance review at any time
Drift detected, impact mapped, remediation proposed, in under 60 seconds
Most teams discover configuration drift when something breaks, or when an auditor asks. This video shows how Operate detects a live deviation from IaC specification, surfaces the affected environment and its dependencies, and proposes a remediation action, before the problem has any impact.

How it works
From deployment to continuous governance, across every environment your organization runs
Six operational capabilities. From real-time drift detection to AI-driven cost optimization, running continuously across your entire infrastructure estate.
Every running environment, every team, every space, visible from a single screen
The Operate dashboard surfaces four estate-wide metrics at a glance: active deployments, drifted environments, environments with pending blueprint updates, and IaC asset inventory. Platform engineers and SREs see every running environment across all spaces and teams, with status, owner, cost, and TTL. Quick filters surface errors only, environments approaching expiry, or environments by team. The view is real-time. There are no scheduled refreshes and no stale data. If something changes in a running environment, Operate reflects it immediately.
Not just “something changed,” but exactly what changed, what it should be, and what it affects
Operate compares every live environment against its IaC specification continuously. When a deviation is detected, the platform identifies the specific configuration item that changed, the expected value from the blueprint, the current live value, and every other environment that depends on the affected resource. Drift is surfaced immediately in the dashboard with a drifted environments counter. SREs are not hunting through logs to understand what happened. The platform tells them precisely what has diverged and by how much.
The AI Copilot reviews drift events in context, maps downstream impact, and proposes the specific remediation action. For known drift patterns, it can remediate automatically within the policy boundaries the platform team configured.
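The comparison described above can be pictured as a field-by-field diff plus a dependency lookup. The following is a minimal sketch in Python, not Torque's implementation: `DriftItem`, `detect_drift`, `affected_environments`, and all sample names and values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DriftItem:
    key: str        # configuration item that diverged
    expected: Any   # value declared in the blueprint / IaC specification
    actual: Any     # value observed on the live environment


_MISSING = "<absent>"  # placeholder for a key present on only one side


def detect_drift(spec: dict[str, Any], live: dict[str, Any]) -> list[DriftItem]:
    """Return one DriftItem per key whose live value differs from the spec."""
    items = []
    for key in sorted(spec.keys() | live.keys()):
        expected = spec.get(key, _MISSING)
        actual = live.get(key, _MISSING)
        if expected != actual:
            items.append(DriftItem(key, expected, actual))
    return items


def affected_environments(resource: str, depends_on: dict[str, set[str]]) -> list[str]:
    """List environments that declare a dependency on the drifted resource."""
    return sorted(env for env, deps in depends_on.items() if resource in deps)


# Hypothetical sample data: one resized instance and one setting added by hand.
spec = {"instance_type": "m5.large", "port": 443, "encrypted": True}
live = {"instance_type": "m5.xlarge", "port": 443, "encrypted": True, "debug": True}
drift = detect_drift(spec, live)

deps = {
    "checkout-staging": {"shared-vpc"},
    "search-dev": {"shared-db"},
    "billing-prod": {"shared-vpc"},
}
impacted = affected_environments("shared-vpc", deps)

for item in drift:
    print(f"{item.key}: expected {item.expected!r}, found {item.actual!r}")
print("environments depending on shared-vpc:", impacted)
```

Each drift record carries the changed item, the expected value, and the live value, which mirrors the three details the dashboard is described as surfacing.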
Every environment tagged at provisioning. Every dollar visible by team, project, and user.
Cost attribution is applied at the moment an environment is deployed, not as a manual tagging exercise after the fact. Every environment carries the team, project, user, and business unit it belongs to as structured metadata. Operate surfaces this as real-time cost data: current daily spend per environment, month-to-date totals by team, and potential savings identified from idle or undersized resources. FinOps teams can drill from a global cost summary down to an individual environment’s spend and usage history. The data is always current. The accountability is always clear.
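Because ownership metadata travels with each environment, roll-ups by team, project, or user reduce to grouping on a field. Here is a small Python sketch under that assumption. The `EnvironmentCost` record, the `spend_by` helper, and the dollar figures are illustrative and do not reflect Torque's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentCost:
    name: str
    team: str
    project: str
    owner: str
    daily_spend: float  # current daily spend in USD (made-up figures)


def spend_by(records, attribute):
    """Sum daily spend grouped by one ownership attribute (team, project, owner)."""
    totals = defaultdict(float)
    for record in records:
        totals[getattr(record, attribute)] += record.daily_spend
    return dict(totals)


records = [
    EnvironmentCost("checkout-dev", "payments", "checkout", "ana", 40.0),
    EnvironmentCost("checkout-qa", "payments", "checkout", "raj", 12.5),
    EnvironmentCost("search-dev", "search", "indexer", "lee", 7.5),
]

by_team = spend_by(records, "team")
by_owner = spend_by(records, "owner")
print(by_team)
print(by_owner)
```

The same records answer both the global question (spend per team) and the drill-down question (spend per user) without any after-the-fact tagging.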
Changes made outside the governed process are caught, reviewed, and either accepted or remediated
When a change is made directly to a running environment outside the Torque platform, whether through the cloud console, a direct SSH session, or an ad hoc script, Operate detects the deviation and routes it through a configured approval workflow. The change is visible in the platform, attributed to the user who made it, and held pending review. Designated approvers can accept the change and update the IaC baseline to reflect it, or reject it and trigger automatic remediation to restore the environment to its approved state. Nothing bypasses governance unnoticed.
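The two outcomes of a review can be sketched as a pair of state transitions: accepting promotes the live configuration to the baseline, and rejecting restores the live configuration from the baseline. This Python example is a simplified illustration. `Environment`, `Decision`, and `review_out_of_band_change` are hypothetical names, not Torque APIs.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"


@dataclass
class Environment:
    baseline: dict  # approved configuration (the IaC baseline)
    live: dict      # configuration currently observed on the environment


def review_out_of_band_change(env: Environment, decision: Decision) -> str:
    """Apply an approver's decision to an environment that has diverged."""
    if decision is Decision.ACCEPT:
        # Accept: the baseline is updated to reflect the change.
        env.baseline = dict(env.live)
        return "baseline-updated"
    # Reject: the environment is restored to its approved state.
    env.live = dict(env.baseline)
    return "remediated"


accepted = Environment(baseline={"port": 443}, live={"port": 8443})
accepted_outcome = review_out_of_band_change(accepted, Decision.ACCEPT)

rejected = Environment(baseline={"port": 443}, live={"port": 8443})
rejected_outcome = review_out_of_band_change(rejected, Decision.REJECT)

print(accepted_outcome, accepted.baseline)
print(rejected_outcome, rejected.live)
```

Either way, baseline and live state agree again once the review completes, which is the property the approval workflow is meant to guarantee.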
Every environment running an outdated configuration is flagged before it becomes a problem
When a blueprint is updated, every environment still running an older version is automatically flagged with a Pending Update indicator in the operations dashboard. Platform engineers can see the full list of outdated environments across all teams and spaces, along with what changed in the new version and the risk of remaining on the old one. Environments can be updated individually or in bulk. Users running the affected environments receive notifications proactively, with a one-click update option. Currency across the estate is managed at scale, without manual tracking.
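Flagging outdated environments amounts to comparing the blueprint version each environment was launched from against the latest published version. The Python sketch below assumes integer version numbers and a hypothetical `RunningEnvironment` record. Torque's actual versioning scheme is not described in this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningEnvironment:
    name: str
    team: str
    blueprint: str
    blueprint_version: int  # version the environment was launched from


def pending_updates(environments, latest_versions):
    """Return environments whose blueprint has a newer published version."""
    return [
        env
        for env in environments
        # Unknown blueprints fall back to their own version, so they are never flagged.
        if env.blueprint_version < latest_versions.get(env.blueprint, env.blueprint_version)
    ]


latest = {"web-app": 4, "data-pipeline": 2}
estate = [
    RunningEnvironment("checkout-dev", "payments", "web-app", 4),
    RunningEnvironment("checkout-qa", "payments", "web-app", 3),
    RunningEnvironment("etl-nightly", "data", "data-pipeline", 1),
]

outdated = pending_updates(estate, latest)
for env in outdated:
    print(f"{env.name} ({env.team}): v{env.blueprint_version} -> v{latest[env.blueprint]}")
```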
Agents that reason about what they find and act accordingly, within governed boundaries, without waiting to be asked
Workflows are powerful when every scenario can be anticipated. Operational infrastructure rarely works that way. Torque’s approach to Day 2 is agentic: SRE, FinOps, and compliance agents that evaluate the actual state of the estate, reason about what they find in context, and determine the right action without waiting to be triggered. An agent responding to a degraded environment does not pick a branch. It assesses what is wrong, considers what it knows about that specific environment, and acts accordingly. Every agent is built around the same principles: informed decisions, safe actions, and strict governance boundaries that define the blast radius without constraining the reasoning. The boundaries are rigid. The thinking within them is not.
This is where Operate connects to AI & Agentic. The agents running here are not a bolt-on. They are the operational expression of the AI Copilot capability, with the same governance model, the same audit trail, and the same policy-enforced boundaries that govern every other Torque action. The difference is that here, they are acting, not just advising.
Operate is Day 2.
It runs on everything Day 0 and Day 1 built.
The completeness of what Curate discovered and the governance quality of what Self-Service deployed directly determine what Operate can see, monitor, and act on. The three capabilities are designed as a continuous lifecycle, not separate tools.
The governed inventory that Operate monitors against
Every environment Operate governs was deployed through Self-Service
The SRE Agent and FinOps Agent that power Operate’s autonomous capabilities
FAQ
Frequently Asked Questions
Day 0 is planning and design. Day 1 is initial deployment. Day 2 is everything that happens after: keeping environments running correctly, managing configuration changes, controlling costs, maintaining compliance, and ensuring environments stay aligned with what was intended when they were deployed. Day 2 is where most infrastructure problems occur and where most infrastructure cost accumulates. It is also where most organizations have the weakest tooling, relying on manual processes, scheduled scans, and reactive incident response. Operate replaces that with continuous, automated, real-time governance.
Operate continuously compares the live state of every running environment against the IaC specification stored in Git and indexed by Curate. When a configuration item changes outside the governed process, whether through a direct cloud console action, a manual modification, or an out-of-band script, the deviation is detected in real time and surfaced in the operations dashboard immediately. The detection is not based on scheduled scans. It is continuous. The dashboard reflects drift the moment it occurs, with specific detail on what changed, what the expected value is, and what depends on the affected resource.
Both options are configurable. Platform teams can configure automatic remediation for known, low-risk drift patterns, where Operate detects the deviation and immediately restores the environment to its approved configuration without human intervention. For higher-risk changes, or environments where human sign-off is required by policy, drift triggers an approval workflow that routes to a designated approver. The approver can accept the change (updating the IaC baseline to reflect it) or reject it (triggering automatic restoration). The choice between automatic and approval-gated remediation is set at the blueprint and environment tier level.
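One way to picture a policy set at the blueprint and environment tier level is a lookup table keyed on both, with approval as the safe default. The Python sketch below is illustrative only. The `POLICY` table, the `Mode` values, and the `known_low_risk` flag are assumptions made for the example, not Torque configuration syntax.

```python
from enum import Enum


class Mode(Enum):
    AUTO = "auto-remediate"
    APPROVAL = "approval-required"


# Hypothetical policy table keyed by (blueprint, environment tier).
POLICY = {
    ("web-app", "dev"): Mode.AUTO,
    ("web-app", "staging"): Mode.AUTO,
    ("web-app", "production"): Mode.APPROVAL,
}


def remediation_mode(blueprint: str, tier: str, known_low_risk: bool) -> Mode:
    """Pick automatic or approval-gated remediation for a drift event."""
    # Anything without an explicit policy entry falls back to human approval.
    configured = POLICY.get((blueprint, tier), Mode.APPROVAL)
    # Automatic remediation applies only to known, low-risk drift patterns.
    if configured is Mode.AUTO and known_low_risk:
        return Mode.AUTO
    return Mode.APPROVAL


print(remediation_mode("web-app", "dev", known_low_risk=True))
print(remediation_mode("web-app", "dev", known_low_risk=False))
print(remediation_mode("web-app", "production", known_low_risk=True))
```

Defaulting to approval means a missing or incomplete policy can only make remediation more conservative, never less.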
Cost attribution is applied structurally at the point of provisioning, not as a manual tagging exercise. Every environment deployed through Self-Service is automatically tagged with the team, project, user, and business unit it belongs to. This metadata is carried through the entire environment lifecycle and is the basis for all cost reporting in Operate. Platform administrators see cost data across all teams and spaces. Team leads see cost data for their team. Individual users see cost data for their own environments. FinOps teams have access to structured cost reports by space, team, project, blueprint type, and user, with CSV export for external reporting.
Torque’s operational agents are not defined by two roles. SRE and FinOps are examples of the kinds of operational responsibilities agents are designed to cover, but the model extends to any area where informed, contextual decisions need to be made continuously and at scale, including security posture, compliance validation, capacity management, and more. What every agent shares is the same underlying design: they reason about the current state of what they are responsible for, evaluate the available options in context, and act within strictly defined governance boundaries. Each agent operates with a specific permission scope, a defined blast radius, and a full audit trail. An agent responsible for cost cannot modify infrastructure configurations. An agent responsible for remediation cannot make financial decisions. The boundaries are platform-enforced. Within them, the decision is the agent’s, not a condition in a script.
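The permission-scope idea can be sketched as an allow-list checked before every action, with each attempt recorded whether or not it succeeds. This Python example is a conceptual illustration. The `Agent` class and the action names are hypothetical and do not describe how Torque enforces boundaries internally.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    allowed_actions: frozenset
    audit_log: list = field(default_factory=list)

    def act(self, action: str, target: str) -> str:
        """Perform an action only if it falls inside this agent's scope."""
        allowed = action in self.allowed_actions
        # Every attempt is logged, including the ones that are refused.
        self.audit_log.append((self.name, action, target, "allowed" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{self.name} may not perform {action!r}")
        return f"{action}:{target}"


# A cost-focused agent that can stop idle resources but not edit configuration.
finops = Agent("finops-agent", frozenset({"stop_idle_resource", "recommend_resize"}))

result = finops.act("stop_idle_resource", "env-42")

try:
    finops.act("modify_configuration", "env-42")
    denied = False
except PermissionError:
    denied = True

print(result, denied)
print(finops.audit_log)
```

The check sits outside the agent's decision logic, so whatever reasoning produces the action, an out-of-scope request is refused and still leaves a record.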
Yes, and production is where Operate is most valuable. The drift detection, cost attribution, approval workflows, and compliance audit trail are all designed to operate at the governance level required for production workloads. Production blueprints can be configured with stricter approval requirements, mandatory tag enforcement, and more conservative auto-remediation policies than development or staging environments. The operational dashboard gives platform teams a unified view across all environment types, with the ability to filter and act on each tier appropriately. Operate does not distinguish between pre-production and production, but it respects and enforces whatever policy distinctions the platform team defines.
A workflow is a decision tree. It evaluates conditions and selects from pre-defined branches. It is powerful for deterministic processes where every scenario can be anticipated and scripted in advance. But infrastructure operations are not fully deterministic. The same type of event in two different environments, at two different times, with two different histories, may warrant two different responses. A workflow cannot make that distinction because it was not written to. An agent can, because it reasons about context rather than matching conditions to branches. Torque’s SRE and FinOps agents evaluate the actual state of the environment, not just the triggering event. They consider what they know about the affected resource, the team that owns it, the history of similar issues, and the potential consequences of different responses. They then determine the appropriate action and execute it within the policy boundaries the platform team defined. Those boundaries are strictly enforced. Within them, the decision is the agent’s. This is what makes agentic operations genuinely different from sophisticated workflow automation — and why it handles real operational complexity in ways that workflows, however well-designed, fundamentally cannot.
Every action taken on every environment is logged automatically by the platform with full detail: who deployed it, who modified it, what changed and when, who approved or rejected out-of-band changes, who extended the TTL, and who destroyed it, all with precise timestamps. This log is generated by Operate as a structural output of normal operations, not as a separate compliance process that requires additional configuration. There is nothing to enable and nothing to maintain. When an auditor asks for the change history of a specific environment, the complete record is available immediately, in a structured, exportable format.
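An audit trail of this kind behaves like an append-only list of timestamped events that can be filtered by environment and exported. The following Python sketch shows the shape of such a record. `AuditEvent`, `AuditTrail`, and the JSON export format are assumptions for illustration, not the format Operate produces.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    timestamp: str    # ISO 8601, UTC
    environment: str
    actor: str
    action: str


class AuditTrail:
    """Append-only event log: events are recorded, never edited or removed."""

    def __init__(self):
        self._events = []

    def record(self, environment: str, actor: str, action: str) -> AuditEvent:
        event = AuditEvent(datetime.now(timezone.utc).isoformat(), environment, actor, action)
        self._events.append(event)
        return event

    def history(self, environment: str) -> list:
        """Return the change history of one environment, oldest first."""
        return [e for e in self._events if e.environment == environment]

    def export(self, environment: str) -> str:
        """Serialize one environment's history as JSON for external review."""
        return json.dumps([asdict(e) for e in self.history(environment)], indent=2)


trail = AuditTrail()
trail.record("checkout-qa", "ana", "launched")
trail.record("search-dev", "lee", "launched")
trail.record("checkout-qa", "raj", "extended TTL")
trail.record("checkout-qa", "ana", "destroyed")

exported = trail.export("checkout-qa")
print(exported)
```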
Try it yourself
See Operate running against a live infrastructure estate
No installation. No configuration. Connect to a pre-loaded environment where drift has been introduced, costs are accumulating, and pending updates are waiting, and work through the full Operate response.
Live drift scenario with a pre-introduced configuration deviation, showing detection, impact mapping, and the AI Copilot remediation proposal
Real cost attribution data across multiple teams and environments, with idle resource flags and savings recommendations active
Pending update example with an outdated environment and the full update workflow available to explore
Approval workflow demo showing an out-of-band change caught, routed for review, and either accepted or remediated
Ready to stop discovering problems after they happen?
See how Operate continuously monitors your infrastructure estate, catches drift the moment it occurs, and gives your SRE and FinOps teams the visibility and automation to act, not just react, in a live session tailored to your environment.