There is an emerging class of software that is designed not for humans to click through, but for AI to operate. This is different from adding an AI assistant to existing software. It requires rethinking foundational engineering decisions — how data is modeled, how tenants are isolated, how shared data is managed across entities, how workflows handle autonomous execution, and how every action is traced.
This article is about those engineering decisions. Not the product marketing. Not the user experience. The actual architectural patterns that make it possible for an AI to reliably operate a complex system like financial infrastructure.
We built these patterns at Artifi. Some of them are novel. Most of them are hard-won solutions to problems that only become visible when you try to make AI the primary operator of a multi-tenant, multi-entity financial system. We are sharing them because we believe these patterns will become standard across the industry — and the sooner engineers start thinking about them, the better the systems we will all build.
The protocol layer: MCP as the universal interface
The first architectural decision is the most consequential: how does AI communicate with your system?
The traditional answer is APIs. REST endpoints, GraphQL schemas, RPC calls. These work, but they were designed for application-to-application communication. They assume the caller knows exactly which endpoint to hit, what parameters to pass, and how to interpret the response. This is fine when the caller is a frontend application written by the same team that built the API. It breaks down when the caller is a large language model that needs to discover, understand, and compose operations dynamically.
The Model Context Protocol (MCP) solves this by providing a standard way for AI tools to discover and invoke operations on external systems. When you connect an MCP Server to Claude.ai, Cursor, VS Code, or any MCP-compatible tool, the AI automatically understands what operations are available, what parameters they accept, and what they return. There is no manual integration, no custom code, no SDK to install.
This is not a minor convenience. It is the difference between a system that one specific AI tool can use (through a custom integration) and a system that any AI tool can operate (through a standard protocol). Build for MCP, and your system is accessible from Claude, Cursor, Windsurf, VS Code, OpenAI, and every future tool that implements the protocol.
For financial infrastructure specifically, this means 150+ operations — posting invoices, running reports, managing vendors, reconciling bank accounts — are all accessible through a single protocol. Skills and Plugins built for one AI tool work across all of them.
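To make the discovery point concrete, here is a minimal sketch of the metadata an MCP server advertises for a single operation. The tool name and fields are invented for illustration, not Artifi's actual schema; the overall shape (name, description, JSON Schema input) follows MCP's tool-listing format.

```python
# Illustrative MCP tool descriptor. The tool name and its fields are invented;
# the shape (name / description / inputSchema) follows MCP's tool listing.
post_invoice_tool = {
    "name": "post_invoice",
    "description": "Post a vendor invoice to the ledger for a given entity.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "entity_id": {"type": "string", "description": "Target legal entity"},
            "vendor_id": {"type": "string"},
            "amount": {"type": "number", "minimum": 0},
            "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},
        },
        "required": ["entity_id", "vendor_id", "amount", "currency"],
    },
}

def describe(tool: dict) -> str:
    """Render the discovery summary an AI client sees before calling the tool."""
    required = ", ".join(tool["inputSchema"]["required"])
    return f"{tool['name']}: {tool['description']} (required: {required})"
```

Because every operation carries its own machine-readable contract, an AI client can enumerate all of the available tools at connect time and compose them without a hand-written SDK.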
Data isolation: why schema-per-tenant is the only option
When AI operates your financial system, data isolation becomes existential. A traditional multi-tenant SaaS application might use row-level security — all tenants share the same tables, and a WHERE clause filters by tenant_id. This works when humans are clicking through a UI that enforces the filter. It becomes terrifying when an AI agent is constructing queries dynamically.
A missing WHERE clause. A join that crosses tenant boundaries. A cached query result served to the wrong context. Any of these bugs in a row-level security model means one client sees another client's financial data. In accounting software, this is not just a privacy violation — it is a regulatory catastrophe.
The solution is schema-per-tenant isolation. Every client gets their own dedicated PostgreSQL schema. Not filtered rows in shared tables — a completely independent set of tables, views, indexes, and constraints.
This provides guarantees that row-level security cannot:
- No data leakage risk. It is physically impossible for a query in one schema to return data from another. The AI cannot accidentally cross tenant boundaries because the boundaries are enforced by the database itself, not by application code.
- Independent operations. Each schema can be backed up, restored, migrated, or audited independently. If one client needs a data correction, it does not affect any other client.
- Consistent structure. Every schema is cloned from a template, so the structure is identical and predictable. The AI knows exactly what tables, columns, and views exist in any client's schema.
- Performance isolation. A heavy query from one client cannot degrade performance for another, because they are operating on separate data structures.
The trade-off is operational complexity. Managing hundreds of schemas is harder than managing one shared database. Migrations must be applied to every schema. Monitoring must track each schema independently. We solved this with a template-based approach: all schema changes are developed in a reference schema, validated, and then applied across all client schemas through numbered migrations.
For AI-operated systems, the trade-off is worth it. The alternative — hoping that your application code never makes a mistake in tenant filtering when AI is constructing dynamic queries — is not a bet you want to take with financial data.
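A minimal sketch of what schema routing might look like at the application edge, assuming a hypothetical naming convention of one PostgreSQL schema per client called `tenant_<id>`: the tenant identifier is validated before it is ever interpolated into `SET search_path`, so the database itself, not a WHERE clause, scopes every subsequent query.

```python
import re

def tenant_search_path(tenant_id: str) -> str:
    """Build the SET search_path statement for one tenant's dedicated schema.

    Assumes an illustrative convention of one schema per client, "tenant_<id>".
    """
    # Reject anything that is not a plain lowercase identifier before it
    # touches SQL; injection is impossible past this point.
    if not re.fullmatch(r"[a-z0-9_]{1,48}", tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    # With the search path pinned, unqualified table names resolve only
    # inside this tenant's schema.
    return f'SET search_path TO "tenant_{tenant_id}"'
```

Even a dynamically constructed query that forgets every filter can only see the schema it was routed to.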
Multi-entity architecture: one organization, many books
Financial software faces a challenge that most SaaS products do not: a single client often needs multiple sets of books. A parent company with subsidiaries in three countries needs three separate ledgers with three different currencies, three different tax regimes, and three different regulatory requirements — but shared master data like the chart of accounts, vendor list, and item catalog.
The naive approach is to treat each legal entity as a separate tenant. This makes isolation easy but sharing impossible. The chart of accounts is duplicated across entities. Updating an account name means updating it in every entity. Vendor records drift out of sync. Cross-entity reporting requires manual consolidation.
The other naive approach is to put everything in one schema and filter by entity_id. This makes sharing easy but isolation fragile. It is also confusing for AI: when you ask "show me accounts," do you mean the master list, entity A's view, or entity B's view?
We solved this with a pattern we call MSEO — Master + Scopes + Entity Overrides + Effective Views.
Here is how it works:
Master records are defined once at the organization level. Your chart of accounts, vendor list, customer list, item catalog, tax codes — all stored as master records. One hundred accounts stored once, not duplicated across entities.
Every legal entity automatically sees all master records. There is no manual assignment. When you create a master account, every entity in the organization can use it immediately.
Entity overrides are sparse. If one entity needs an account to have a different name, a different active status, or any other entity-specific configuration, it creates an override for just that field. Only the differences are stored. If your US entity calls account 4000 "Product Revenue" while every other entity calls it "Revenue," you store one override — not five copies of the entire account record.
Effective views merge everything automatically. When you query the chart of accounts for a specific entity, you get a view that resolves masters and overrides into the final result. The AI does not need to understand the inheritance model. It queries the effective view and gets the right answer.
The data efficiency is significant. In a traditional system, 100 accounts across 6 entities means 600 rows. With MSEO, it is 100 master rows plus however many overrides are needed — typically single digits. We currently run with 122 master records and 8 overrides across 6 entities: 130 rows stored, 732 visible through effective views.
But the real value is not storage efficiency. It is consistency. Update an account name in the master, and every entity sees the change immediately — unless that entity has an override. There is no synchronization problem, no drift, no reconciliation. The data model guarantees consistency by construction.
This pattern applies everywhere shared data exists: accounts, vendors, customers, items, dimensions, tax codes. It is the foundation that makes multi-entity operations manageable for AI.
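The override-resolution step can be sketched in a few lines. This is a toy in-memory version, assuming masters and overrides are plain dicts with illustrative field names; in PostgreSQL the same merge would typically be a view doing COALESCE over a LEFT JOIN.

```python
def effective_accounts(masters: dict, overrides: dict, entity: str) -> dict:
    """Resolve master records plus one entity's sparse overrides into an
    effective view. Field names and the dict shapes are illustrative."""
    result = {}
    for acct_id, master in masters.items():
        record = dict(master)                                # start from the master
        record.update(overrides.get((entity, acct_id), {}))  # apply only the diffs
        result[acct_id] = record
    return result

# One master row plus one override row yields different effective views
# per entity, with no duplication of the unchanged fields.
masters = {"4000": {"name": "Revenue", "active": True}}
overrides = {("us", "4000"): {"name": "Product Revenue"}}
```

Querying the effective view for the US entity returns "Product Revenue"; every other entity sees "Revenue" straight from the master, with no copies to keep in sync.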
The workflow engine: autonomous execution with governance
Here is the engineering challenge that most teams underestimate: when AI can execute operations autonomously, how do you maintain control?
In a traditional system, control is implicit. A human reviews the data on the screen before clicking submit. The act of clicking is both execution and approval. But when an AI agent processes 500 invoices at 3am, there is no human looking at each one. The control must be explicit — encoded in the system, not assumed by the presence of a human.
We built a workflow engine that sits between every write operation and the database. Nothing modifies financial data without going through this engine — whether the action was initiated by a human through Claude, by an autonomous agent, or by an API call.
The workflow engine has three core responsibilities:
Risk-based routing
Every operation is assessed for risk and routed to the appropriate lane:
| Lane | When | What Happens |
|---|---|---|
| Green | Low risk — small amounts, standard operations, trusted users | Auto-approved, executes immediately |
| Yellow | Medium risk — updates to critical records, moderate amounts | Requires one approval before execution |
| Red | High risk — large transactions, deletions, sensitive changes | Requires multi-level approval |
Lane assignment is fully configurable. You define the rules: which operations need approval, at what thresholds, and who can approve. The same operation, posting a bill, might land in the green lane for a $200 office supply purchase and in the red lane for a $50,000 equipment order.
This is not a feature. It is the mechanism that makes autonomous AI operation safe. Without risk-based routing, you either let the AI do everything (dangerous) or approve everything manually (defeats the purpose). Risk-based lanes let you define the boundary: "Trust the AI for routine operations. Flag anything unusual. Require explicit approval for anything significant."
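Lane assignment can be sketched as an ordered rule list where the first matching rule wins. The thresholds and field names below are invented for illustration; the article's point is that these rules are configuration the organization defines, not code the AI controls.

```python
# Illustrative rule table: (predicate, lane), evaluated top-down.
# Thresholds and the operation dict's fields are assumptions, not
# Artifi's actual configuration schema.
RULES = [
    (lambda op: op["action"] == "delete", "red"),   # deletions always escalate
    (lambda op: op["amount"] >= 10_000, "red"),     # large transactions
    (lambda op: op["amount"] >= 1_000, "yellow"),   # moderate amounts
]

def assign_lane(op: dict) -> str:
    """Return the workflow lane for an operation; first matching rule wins."""
    for predicate, lane in RULES:
        if predicate(op):
            return lane
    return "green"  # default: low-risk operations auto-approve
```

The agent never calls `assign_lane` itself; the workflow engine applies it to every submission, so the boundary between "trusted" and "flagged" lives in reviewable configuration.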
Complete audit trail
Every action through the workflow engine records six dimensions:
- Who initiated the action (user, agent, or API key)
- What was requested (the full operation details)
- When each step occurred (timestamps for submission, validation, approval, execution)
- What lane was assigned (and why)
- Who approved (if approval was required)
- What changed (the resulting data modifications)
This is not a separate audit log bolted on after the fact. The workflow engine is the audit trail. There is no way to modify data without generating an audit record, because the audit record is a byproduct of the execution mechanism itself.
For AI-operated systems, this solves a problem that legacy architectures cannot: distinguishing between human-initiated and AI-initiated actions. When an auditor asks "who created this journal entry," the answer is not just "the bill processor agent" — it is the complete chain: which event triggered the agent, what data the agent processed, what decision it made, what validation was applied, and what the resulting entries look like.
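The six dimensions above map naturally onto a single immutable record emitted by the execution path itself. The field names here are illustrative, not Artifi's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # immutable: the record cannot be edited after emission
class AuditRecord:
    initiator: str           # who: user, agent, or API key
    request: dict            # what was requested (full operation details)
    timestamps: dict         # when each lifecycle step occurred
    lane: str                # what lane was assigned
    lane_reason: str         # and why
    approver: Optional[str]  # who approved, if approval was required
    changes: dict            # what changed (resulting data modifications)
```

Because the record is constructed by the same code path that executes the write, "audit coverage" is not a property someone has to remember to maintain.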
Lifecycle management
Every operation follows a defined lifecycle:
submitted → validating → validated → approved → executing → completed
                                   ↘ pending_approval → approved → completed
                                   ↘ rejected → closed
This lifecycle is not just for tracking. It is the mechanism that enables safe autonomous operation. An agent can submit an operation, and the workflow engine determines whether it should be auto-approved (green lane), held for approval (yellow/red lane), or rejected outright (validation failure). The agent does not make that decision — the governance rules do.
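One way to read the diagram above is as an explicit transition table, where advancing an operation is only legal along a listed edge. The table below is an interpretation, not documented behavior: it normalizes the approval branch so approved operations pass through executing before completing, and it assumes a pending approval can also be rejected.

```python
# Lifecycle as an explicit transition table (one reading of the diagram;
# edge set is an assumption where the diagram is ambiguous).
TRANSITIONS = {
    "submitted":        {"validating"},
    "validating":       {"validated", "rejected"},
    "validated":        {"approved", "pending_approval"},  # green vs yellow/red lane
    "pending_approval": {"approved", "rejected"},
    "approved":         {"executing"},
    "executing":        {"completed"},
    "rejected":         {"closed"},
}

def advance(state: str, next_state: str) -> str:
    """Move an operation along the lifecycle, refusing undeclared edges."""
    if next_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {next_state}")
    return next_state
```

Encoding the lifecycle as data rather than scattered if-statements means the governance rules, not the caller, decide which paths exist.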
Read vs. write: the separation that makes AI safe
There is a subtle but important architectural decision in how operations are structured: the strict separation between read and write operations.
Read operations execute immediately. Queries, reports, lookups — anything that does not modify data runs directly against the database with no workflow overhead. When you ask "show me the trial balance," you get an answer in milliseconds. There is no approval step, no risk assessment, no audit record for reading data.
Write operations go through the workflow engine. Every modification — creating a vendor, posting an invoice, adjusting a balance — goes through a single gateway called submit. This gateway handles validation, risk assessment, approval routing, execution, and audit logging.
This separation seems obvious, but it has a profound effect on how AI interacts with the system. Reading is always fast and frictionless, which means the AI can explore and analyze data freely. Writing is always validated and auditable, which means the AI cannot make changes without governance. The system encourages exploration and constrains modification — exactly the behavior you want from an autonomous operator.
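The split can be sketched as a single dispatch point. `run_query` and `submit` below are illustrative stubs standing in for the real read path and workflow gateway; the operation names are assumptions.

```python
# Illustrative read/write dispatch. Operation names and return shapes
# are invented; only the structure (direct reads, gated writes) matters.
READ_OPS = frozenset({"query", "report", "lookup"})

def run_query(payload: dict) -> dict:
    """Stub for the direct read path: no approval, no workflow overhead."""
    return {"rows": [], "source": "direct"}

def submit(operation: str, payload: dict) -> dict:
    """Stub for the single write gateway: validation, lane, approval, audit."""
    return {"status": "submitted", "operation": operation}

def handle(operation: str, payload: dict) -> dict:
    if operation in READ_OPS:
        return run_query(payload)      # immediate
    return submit(operation, payload)  # governed
```

Every write funnels through one function, so there is exactly one place where governance can be enforced and exactly zero ways around it.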
Connectors and encryption: the trust boundary
When AI operates your financial system, it inevitably needs to interact with external services — banks, payment processors, accounting systems, tax authorities. This creates a trust boundary problem: how do you give an AI agent access to bank APIs without exposing credentials in chat logs or agent memory?
The answer is a connector framework with credential encryption. Connectors abstract external integrations into a standard interface (health check, fetch transactions, submit payments). Credentials are encrypted with AES-256-GCM using organization-specific keys and stored in the database — never passed through the AI layer.
When an AI agent needs to sync bank statements, it calls a connector operation. The connector decrypts the credentials internally, makes the API call, and returns the results. The AI never sees the raw credentials. They are entered once through the admin dashboard and encrypted before storage.
This is not optional security theater. It is a hard requirement for AI-operated systems. If credentials flow through conversational interfaces, they end up in chat logs, context windows, and potentially in training data. The connector framework ensures that sensitive credentials never leave the encrypted storage layer.
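The boundary itself, independent of the cipher, can be sketched as follows. The XOR step below is a toy stand-in for AES-256-GCM, which a real implementation would get from a proper cryptography library; class and method names are illustrative. The structural point is that decryption happens only inside the connector call, and nothing returned to the AI layer contains the credential.

```python
def xor_seal(plaintext: str, key: bytes) -> bytes:
    """Toy stand-in for AES-256-GCM encryption (XOR; NOT real security)."""
    return bytes(b ^ k for b, k in zip(plaintext.encode(), key * len(plaintext)))

class BankConnector:
    """Credentials enter encrypted and are decrypted only inside connector calls."""

    def __init__(self, encrypted_credentials: bytes, org_key: bytes):
        self._blob = encrypted_credentials
        self._key = org_key

    def _decrypt(self) -> str:
        # Stand-in for AES-256-GCM decryption with the org-specific key.
        return bytes(
            b ^ k for b, k in zip(self._blob, self._key * len(self._blob))
        ).decode()

    def fetch_transactions(self) -> dict:
        token = self._decrypt()  # credential exists only in this local scope
        # ... a real implementation would call the bank API with `token` here ...
        return {"transactions": [], "status": "ok"}  # nothing sensitive returned
```

The AI invokes `fetch_transactions` and receives only results; the token never appears in a prompt, a chat log, or the agent's memory.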
The extensibility contract: plugins and skills
The final architectural principle is extensibility. Financial systems need specialized logic — payroll calculations vary by country, tax rules vary by jurisdiction, industry-specific reporting varies by sector. In traditional systems, this logic is hardcoded by the vendor. In an AI-operated system, it can be defined by anyone.
Plugins extend the system with custom server-side logic. A plugin might implement Estonian payroll calculations, UK VAT returns, or SaaS revenue recognition schedules. Plugins are activated per-organization and execute within the system's security and audit framework.
Skills are portable instruction sets that enhance how AI tools interact with the system. A skill might guide Claude through a month-end close procedure specific to your industry, or teach it how to handle a particular country's tax compliance requirements. Skills work across Claude, Cursor, VS Code, and OpenAI — they are not locked to any single platform.
The architectural contract is simple: the system provides the financial infrastructure (ledger, transactions, entities, workflows, audit trail), and plugins and skills provide the specialized logic. This separation means the infrastructure can evolve independently from the specializations, and specializations can be built, shared, and replaced without modifying the core system.
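The contract can be sketched as an abstract interface that the core defines and specializations implement. The names and the flat-rate payroll example are invented for illustration, not the actual Artifi plugin API:

```python
from abc import ABC, abstractmethod

class Plugin(ABC):
    """Illustrative core-side contract: infrastructure in, specialized logic out."""
    name: str

    @abstractmethod
    def run(self, context: dict) -> dict:
        """Compute specialized results; any writes still go through the gateway."""

class FlatRatePayroll(Plugin):
    """Toy stand-in for country-specific payroll logic (flat 20% rate)."""
    name = "flat_rate_payroll"

    def run(self, context: dict) -> dict:
        gross = context["gross_salary"]
        tax = round(gross * context.get("rate", 0.2), 2)
        return {"gross": gross, "tax": tax, "net": round(gross - tax, 2)}
```

Because the plugin only computes and the core only executes, an Estonian payroll plugin and a UK VAT plugin can ship, evolve, and be replaced without either one touching the ledger code.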
Why these patterns matter now
The patterns described in this article are not theoretical. They are the result of building a production financial system designed for AI operation. Each pattern addresses a specific failure mode that appears when AI moves from assistant to operator:
- MCP as protocol prevents vendor lock-in and enables any AI tool to operate the system
- Schema-per-tenant isolation prevents data leakage when AI constructs dynamic queries
- MSEO makes multi-entity data management consistent and predictable for AI comprehension
- The workflow engine provides governance for autonomous execution
- Read/write separation enables free exploration while constraining modification
- Credential encryption keeps sensitive data out of AI context
- Plugins and skills enable extensibility without hardcoding specialized logic
These are not optional optimizations. They are necessary conditions for building software that AI can safely and reliably operate at scale.
The industry is going to learn these lessons one way or another. Some teams will learn them by building. Others will learn them from incidents — a tenant data leak, an unaudited AI modification, a credential exposure. We would rather the industry learn them from sharing what works.
The era of software designed for human operators is not ending tomorrow. But the era of software designed for AI operators has already begun. The engineering decisions you make today will determine which era your system belongs to.
These patterns power Artifi, an AI-operated ERP where Claude.ai, Cursor, or any MCP-compatible tool becomes your full financial interface. Explore the technical concepts or see how the architecture works.