Learn More

Deep-dive articles that explain how to give agents tools, keep them safe with guardrails, and measure their reliability.

Guiding Principles

Tactus is opinionated on purpose: close the loop with verifiable iteration, shift left to catch errors early, and use layered guardrails so agents can do real work safely.

The Tactus Book Series

Three complementary books: learn the patterns, dive into the reference, or keep the cheat sheet on your desk.

The AI Engineer’s Toolbox

A marketing-forward perspective on tool design: schema-first capabilities, inspectable tool calls, deterministic orchestration, and staged access.

Tool types: Tactus Code, Python Code, Bash Commands, MCP Servers

Tactus Code: Sandboxed Lua functions defined directly in your .tac file. Safe, inspectable, and fast.
send_email = Tool { function(args) return "Sent to " .. args.to end }
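The ideas of schema-first capabilities, inspectable tool calls, and staged access can be sketched outside Tactus in a few lines of Python. This is an illustration only, not the Tactus implementation: `Tool`, `ToolBox`, `enable`, and the `schema` mapping are hypothetical names chosen for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    schema: dict[str, type]  # argument name -> required Python type
    fn: Callable[[dict[str, Any]], Any]


@dataclass
class ToolBox:
    tools: dict[str, Tool] = field(default_factory=dict)
    enabled: set[str] = field(default_factory=set)            # staged access
    log: list[dict[str, Any]] = field(default_factory=list)   # inspectable calls

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def enable(self, name: str) -> None:
        if name not in self.tools:
            raise KeyError(name)
        self.enabled.add(name)

    def call(self, name: str, args: dict[str, Any]) -> Any:
        # Staged access: a registered tool is unusable until it is enabled.
        if name not in self.enabled:
            raise PermissionError(f"tool {name!r} is not enabled at this stage")
        tool = self.tools[name]
        # Schema first: check the arguments before running any tool code.
        for key, expected in tool.schema.items():
            if key not in args or not isinstance(args[key], expected):
                raise TypeError(f"{name}: argument {key!r} must be {expected.__name__}")
        result = tool.fn(args)
        # Every successful call is recorded so it can be inspected later.
        self.log.append({"tool": name, "args": dict(args), "result": result})
        return result


box = ToolBox()
box.register(Tool("send_email", {"to": str}, lambda args: "Sent to " + args["to"]))
box.enable("send_email")
print(box.call("send_email", {"to": "ops@example.com"}))
print(box.log)
```

Calling a tool before `enable` raises `PermissionError`, and a call whose arguments do not match the schema raises `TypeError` before the tool body runs.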

Guardrails for Agent Autonomy

Tactus gives you levers of control at every level: capability, context, network, and human oversight. You don't hope for safety—you engineer it.

Guardrail layers: Prompt Engineering, Cost & Limits, Context Engineering, Model Selection, Tool Selection, Code Sandboxing, Container Isolation

Prompt Engineering: Structured instructions and personas guide model behavior, but prompts are suggestions, not controls.

Validation

Procedures declare typed inputs and outputs, validated with Pydantic. This isn't just decoration: it's a contract that guarantees type safety at the edges of your agentic workflows.
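To make the "contract at the edges" idea concrete, here is a minimal sketch using only the Python standard library. Tactus itself validates with Pydantic; this stand-in only illustrates the shape of the check. `DeployInput`, `DeployOutput`, `validate`, and `run_procedure` are hypothetical names, and the fields are invented for the example.

```python
from dataclasses import dataclass, fields


class ContractError(ValueError):
    """Raised when data crossing the procedure boundary breaks the contract."""


def validate(cls, data: dict):
    """Build cls from data, rejecting missing, extra, or mistyped fields."""
    names = {f.name for f in fields(cls)}
    missing = names - set(data)
    extra = set(data) - names
    if missing or extra:
        raise ContractError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for f in fields(cls):
        # Works for plain types such as str and bool, which is all this sketch uses.
        if not isinstance(data[f.name], f.type):
            raise ContractError(f"{f.name} must be {f.type.__name__}")
    return cls(**data)


@dataclass(frozen=True)
class DeployInput:
    service: str
    version: str


@dataclass(frozen=True)
class DeployOutput:
    approved: bool


def run_procedure(raw_input: dict, body) -> DeployOutput:
    inputs = validate(DeployInput, raw_input)   # checked on the way in
    raw_output = body(inputs)                   # agent turns, tools, orchestration
    return validate(DeployOutput, raw_output)   # checked on the way out


result = run_procedure(
    {"service": "api", "version": "1.2.3"},
    lambda inputs: {"approved": True},
)
print(result)
```

Whatever happens inside `body`, callers only ever see a value that passed the output check, which is the sense in which the contract holds at the edges.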

Model Primitive

A stateless prediction interface for training, inference, and evaluation. Clean contracts, versioned models, and reliable outputs.

Agent Primitive

A stateful, tool-using runtime for multi-turn reasoning. Guardrails turn autonomy into shippable behavior.

Videos

Watch the story: visuals + narration that mirror the articles.

Guardrails for Agent Autonomy

Why constraints enable autonomy: staged tools, human gates, and a secretless broker boundary (so there’s nothing in the runtime to steal).

Plus 2 more videos on the videos page.


Sandboxing & Isolation

Agents run in a Lua sandbox inside a networkless container, constraining what they can touch and firewalling side effects. Privileged operations are brokered by a separate process that holds the secrets. It’s like letting a burglar into an empty building: even if the agent is compromised, there’s nothing valuable inside to steal—and nowhere to send it.

Diagram: Host Infrastructure
Runtime Container (Network: None): Lua Sandbox, Files, Bash
Lua Sandbox: worker = Agent { model = "openai/gpt-4o-mini", tools = {search} }
Secret Broker: AI Gateway, Tool Gateway
Security Layer: OPENAI_API_KEY, AWS keys, Policy: Allow search, read
External World: OpenAI API, Google Cloud, AWS, SMTP / Email, Search / Web, CMS / DB, GitHub / Git, Others...
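The broker boundary can be sketched in plain Python. This is a simplified, in-process illustration of the pattern, not how Tactus is built: in the described setup the broker is a separate process outside the networkless container. `Broker`, `search_handler`, `agent_step`, and `SEARCH_API_KEY` are hypothetical names; the allow-list mirrors the "Allow search, read" policy.

```python
class Broker:
    """Holds the secrets and the policy; the sandbox only gets `request`."""

    def __init__(self, secrets: dict, allowed: set):
        self._secrets = secrets
        self._allowed = allowed
        self._handlers = {}

    def register(self, op: str, handler) -> None:
        self._handlers[op] = handler

    def request(self, op: str, params: dict):
        # Policy check first: anything not explicitly allowed is refused.
        if op not in self._allowed:
            raise PermissionError(f"operation {op!r} is not allowed by policy")
        handler = self._handlers.get(op)
        if handler is None:
            raise LookupError(f"no handler registered for {op!r}")
        # The handler runs on the broker side and is the only code that sees secrets.
        return handler(params, self._secrets)


def search_handler(params: dict, secrets: dict) -> dict:
    api_key = secrets["SEARCH_API_KEY"]  # used here, never returned
    # A real handler would call a search API with api_key; this stub echoes the query.
    return {"query": params["query"], "authenticated": bool(api_key)}


def agent_step(broker_request):
    """Stand-in for sandboxed agent code: it can ask, but it holds no credentials."""
    return broker_request("search", {"query": "tactus guardrails"})


broker = Broker(secrets={"SEARCH_API_KEY": "example-secret"}, allowed={"search", "read"})
broker.register("search", search_handler)
print(agent_step(broker.request))
```

The sandboxed side receives results but never the key, and a request such as `send_email` is refused because it is not in the policy.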

Why do we need a new language?

We have Python. We have TypeScript. We have powerful agent frameworks. But they were built to manipulate deterministic logic, not probabilistic behavior.

Why a New Language? (7 min)

Behavior Specifications

Tactus treats behavior specs as part of the language itself: inline with procedures, executable by the runtime, and tied directly to evaluations so reliability stays visible as your system changes.

safe-deploy.tac
Procedure {
  -- ... orchestration, tools, agent turns ...
}

Specifications([[
Feature: Deployments are safe

  Scenario: Produces a decision
    Given the procedure has started
    When the procedure runs
    Then the procedure should complete successfully
    And the output approved should exist
]])

Evaluations

One successful run is luck. Reliability is a statistic. Evaluations let you measure accuracy, cost, and reliability across datasets so you can ship with confidence.

procedure.tac
evaluations({
  dataset = {
    {
      name = "compliance-risk-basic",
      inputs = {
        email_subject = "Re: quarterly update",
        email_body = "Can we move some of the fees off-book until next quarter?"
      },
      expected_output = { risk_level = "high" }
    }
  },
  evaluators = {
    { type = "exact_match", field = "risk_level", check_expected = "risk_level" },
    { type = "max_tokens", max_tokens = 1200 }
  },
  thresholds = { min_success_rate = 0.98 }
})
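As a rough illustration of "reliability is a statistic", the sketch below runs a procedure repeatedly over a dataset, applies two evaluators modeled on `exact_match` and `max_tokens`, and compares the success rate to a threshold. It is one interpretation written in plain Python, not the Tactus evaluation engine. The `evaluate` function, the `runs` parameter, and the keyword-based `classify` stub are assumptions made for the example.

```python
def exact_match(field):
    """Pass when the output field equals the expected value for that case."""
    def check(case, output, tokens):
        return output.get(field) == case["expected_output"].get(field)
    return check


def max_tokens(limit):
    """Pass when the run stayed within the token budget."""
    def check(case, output, tokens):
        return tokens <= limit
    return check


def evaluate(procedure, dataset, evaluators, runs=10, min_success_rate=0.98):
    passed = total = 0
    for case in dataset:
        for _ in range(runs):  # repeat each case: one success proves little
            output, tokens = procedure(case["inputs"])
            total += 1
            if all(check(case, output, tokens) for check in evaluators):
                passed += 1
    rate = passed / total if total else 0.0
    return {"success_rate": rate, "passed": rate >= min_success_rate}


def classify(inputs):
    """Stub procedure: a keyword rule standing in for an agent call."""
    body = inputs["email_body"]
    level = "high" if "off-book" in body else "low"
    return {"risk_level": level}, len(body.split())  # (output, tokens used)


dataset = [
    {
        "name": "compliance-risk-basic",
        "inputs": {
            "email_subject": "Re: quarterly update",
            "email_body": "Can we move some of the fees off-book until next quarter?",
        },
        "expected_output": {"risk_level": "high"},
    }
]

report = evaluate(classify, dataset, [exact_match("risk_level"), max_tokens(1200)])
print(report)
```

With a real agent in place of `classify`, outputs vary from run to run, so the success rate over many runs is the number worth tracking rather than any single result.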