
Bulk Import

Universal bulk data import system for master data, transactions, and migrations with automatic sync/async routing


Overview

Artifi's universal import system provides a single, consistent way to import data in bulk across all object types, from master data like customers and vendors to financial transactions. The system automatically handles small and large imports differently, with real-time progress tracking for larger jobs.

Key Features

  • Single consistent interface for all object types
  • Automatic sync/async routing based on record count
  • Background processing with real-time progress tracking
  • Configurable duplicate handling (skip, update, or error)
  • Validate-only mode for dry runs before committing
  • Preflight checks to discover required fields before importing
  • External ID resolution for seamless data migration from other systems
  • Detailed error reporting per record with recovery guidance

Sync vs. Async Processing

The system automatically routes imports based on the number of records:

| Records | Mode | Behavior |
| --- | --- | --- |
| 1-50 | Synchronous | Full result returned immediately with counts, errors, and created IDs |
| 51+ | Asynchronous | Returns an import ID immediately; processes in the background |
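The routing rule above can be sketched as follows. The threshold constant, function names, and return shapes are illustrative, not the actual API:

```python
SYNC_LIMIT = 50  # records at or below this count are processed synchronously


def choose_mode(record_count: int) -> str:
    """Return the processing mode the importer would select."""
    return "sync" if record_count <= SYNC_LIMIT else "async"


def handle_submission(record_count: int) -> dict:
    """Branch on the mode: a sync import returns the full result,
    an async import returns an import ID to poll."""
    if choose_mode(record_count) == "sync":
        return {"mode": "sync", "result": "counts, errors, created IDs"}
    return {"mode": "async", "import_id": "imp_123"}  # illustrative ID
```

A caller never chooses the mode explicitly; it only needs to handle both response shapes.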

Why Async?

Each record can involve dozens of database operations (validation, entity resolution, GL posting, tax calculation). For large batches, synchronous processing would be too slow. The async mode processes records in batches and provides real-time progress updates.

Progress Tracking

For async imports, you can poll the import status at any time to see:

  • Current progress percentage
  • Records processed so far
  • Import, skip, and error counts
  • Detailed error information on completion

Only one large import per organization runs at a time. Small imports (50 records or fewer) are unaffected by this limit.
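A polling loop for async imports might look like the sketch below. The status field names (`status`, `progress_pct`, `processed`) are assumptions for illustration; `fetch_status` stands in for whatever client call retrieves the import status:

```python
import time


def poll_import(fetch_status, import_id, interval=2.0, timeout=3600):
    """Poll an async import until it reaches a terminal status.

    `fetch_status` is any callable taking an import ID and returning a
    status dict. Raises TimeoutError if the import does not finish.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(import_id)
        print(f"{status['progress_pct']}% ({status['processed']} records)")
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"import {import_id} did not finish in {timeout}s")
```

Because progress is updated after each batch, a modest polling interval (a few seconds) is enough for near-real-time monitoring.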


Supported Object Types

Master Data

| Object Type | Duplicate Detection | Notes |
| --- | --- | --- |
| Customer | Tax ID, then name | Supports billing/shipping addresses |
| Vendor | Tax ID, then name | Supports billing/remit-to addresses |
| Employee | Work email | Auto-generates global employee ID |
| Account | Account number | Supports parent account hierarchy |
| Item | Item number | Revenue/expense/asset/COGS account linking |
| Dimension Type | Name | Hierarchy support |
| Dimension Value | Type + code | Parent code for hierarchies |
| Fixed Asset | Asset number + entity | Category resolution, depreciation config |

Transactions

| Object Type | Duplicate Detection | Notes |
| --- | --- | --- |
| Transaction (all types) | Type + reference + party | Handles 30+ transaction types through a single importer |

Supported transaction types include AP invoices, AR invoices, journal entries, payments, expense reports, bank transfers, and more.

Banking & Opening Balances

| Object Type | Notes |
| --- | --- |
| Card Transaction | From payment processors or CSV |
| Opening Balance | For migration cutover |

Import Options

| Option | Default | Description |
| --- | --- | --- |
| Duplicate handling | Skip | Skip ignores duplicates, Update overwrites existing records, Error treats duplicates as failures |
| Batch size | 100 | Records per batch; each batch is a transaction boundary |
| Stop on error | No | When enabled, the entire import stops on the first error |
| Validate only | No | Validates all records without persisting (dry run) |
| Defaults | None | Default values merged into every record (e.g., currency, location) |

Notes:

  • Duplicate update is not supported for financial transactions (immutable records)
  • If a batch fails, only that batch rolls back; other batches are unaffected
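The options above might be assembled like this. The key names are illustrative, not the actual request schema:

```python
def build_import_options(**overrides):
    """Assemble an options payload using the documented defaults.

    Any keyword argument overrides the corresponding default.
    """
    options = {
        "duplicate_handling": "skip",   # skip | update | error
        "batch_size": 100,              # each batch is a transaction boundary
        "stop_on_error": False,
        "validate_only": False,
        "defaults": {},                 # values merged into every record
    }
    options.update(overrides)
    return options


# Dry run with a default currency applied to every record:
opts = build_import_options(validate_only=True, defaults={"currency": "USD"})
```

Pairing `validate_only` with `defaults` is a convenient way to check that defaulted fields satisfy validation before committing anything.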

Preflight Checks

Before importing, you can run a preflight check to discover the required and optional fields for a specific object type or transaction type. Required fields are configured per organization, so they may vary.

A preflight check returns:

  • Required header fields with types
  • Required line fields (for transactions)
  • Optional header and line fields
  • Optionally, validation results for a sample record

This helps ensure your import file is properly structured before processing.
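One way to use a preflight response is to check each record locally before submitting. The response shape below (required header and line fields mapped to types) is an assumption based on the description above, not the exact API schema:

```python
def check_against_preflight(preflight: dict, record: dict) -> list[str]:
    """Return human-readable problems found in `record`.

    `preflight` maps field names to types for required header fields
    and, for transactions, required line fields.
    """
    problems = []
    for field, ftype in preflight.get("required_header_fields", {}).items():
        if field not in record:
            problems.append(f"missing required field: {field} ({ftype})")
    for i, line in enumerate(record.get("lines", []), start=1):
        for field, ftype in preflight.get("required_line_fields", {}).items():
            if field not in line:
                problems.append(f"line {i}: missing {field} ({ftype})")
    return problems
```

Because required fields are configured per organization, this check should be rerun whenever that configuration changes.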


Entity Resolution

The import system flexibly resolves references to entities using multiple identifier types. This is particularly valuable during data migration when the source system uses different ID schemes.

Resolution priority (checked in order):

| Entity Type | Resolution Order |
| --- | --- |
| Vendor | Internal ID, global ID, external ID, tax ID, name |
| Customer | Internal ID, global ID, external ID, name |
| Employee | Internal ID, external ID, name |
| Account | Account number, external ID |
| Item | Internal ID, item number, external ID |
| Project | Internal ID, external ID, name |

Name-based lookups are case-insensitive. All reference data is pre-cached at import start for fast lookups during batch processing.
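The priority lookup can be sketched as below. The identifier keys and the pre-built index structure are illustrative; the real implementation builds its cache at import start as described above:

```python
RESOLUTION_ORDER = {
    # Subset of the table above, for illustration
    "vendor": ["internal_id", "global_id", "external_id", "tax_id", "name"],
    "customer": ["internal_id", "global_id", "external_id", "name"],
    "account": ["account_number", "external_id"],
}


def resolve(entity_type: str, reference: str, index: dict):
    """Try each identifier type in priority order against a pre-cached index.

    `index` maps identifier type -> {value: entity}. Name keys are stored
    lowercased so name lookups are case-insensitive.
    """
    for id_type in RESOLUTION_ORDER[entity_type]:
        key = reference.lower() if id_type == "name" else reference
        entity = index.get(id_type, {}).get(key)
        if entity is not None:
            return entity
    return None
```

The first identifier type that matches wins, so a value that happens to collide with a lower-priority identifier is never reached.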


Data Migration

When migrating from external systems (QuickBooks, Xero, or legacy software), the import system supports a phased approach:

Phase 1: Reference Data (Small Volume)

Import foundational data first:

  • Chart of accounts
  • Dimension types and values
  • Payment terms, tax codes, number series

Phase 2: Master Data (Medium Volume)

Import entities with external IDs for cross-referencing:

  • Customers, vendors, employees
  • Items and products

When importing master data, include the external system's ID in the metadata. This external ID is then used to resolve references when importing transactions.

Phase 3: Transactions (High Volume, Async)

Import financial transactions that reference the master data imported in earlier phases:

  • AP and AR invoices
  • Payments
  • Journal entries
  • Bank statements

Transactions can reference entities by external ID, so you do not need to know the new internal IDs.
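The external-ID handoff between phases 2 and 3 might look like this. The record field names (`external_id`, `vendor`, `lines`) are illustrative, not the exact import schema:

```python
# Phase 2: import a vendor carrying the source system's ID in metadata.
vendor_record = {
    "name": "Acme Supplies",
    "external_id": "QB-VEND-0042",  # ID from the source system
}

# Phase 3: a transaction references the vendor by that external ID,
# without knowing the internal ID assigned at import time.
invoice_record = {
    "type": "ap_invoice",
    "vendor": "QB-VEND-0042",       # resolved via external ID lookup
    "reference": "INV-1001",
    "lines": [{"account": "6000", "amount": 250.00}],
}
```

Because external IDs rank above names in the resolution order, this keeps transaction imports stable even if entity names were cleaned up during migration.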


Error Handling

Per-Record Errors

Each record is validated individually, and errors include:

  • Row number for easy cross-reference
  • Field name causing the error
  • Descriptive error message
  • Record identifier (name, number, etc.)

Recovery from Partial Failures

  1. Review the error details in the import results
  2. Fix the problematic records in your source data
  3. Re-submit with duplicate handling set to "skip" to avoid re-importing successful records
  4. Successfully imported records are committed independently and are not rolled back
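Because each error carries a row number, selecting only the failed rows for resubmission is straightforward. The error dict shape here (`row`, `field`, `message`) mirrors the per-record error description above but is an assumption for illustration:

```python
def rows_to_retry(source_rows: list, errors: list) -> list:
    """Select only the rows that failed, using 1-based row numbers
    from the per-record error details."""
    failed = {e["row"] for e in errors}
    return [row for i, row in enumerate(source_rows, start=1) if i in failed]


# After fixing the failed rows, re-submit with duplicate handling "skip"
# so already-imported records are ignored rather than duplicated.
retry_options = {"duplicate_handling": "skip"}
```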

Performance Expectations

| Volume | Mode | Expected Time |
| --- | --- | --- |
| 10 records | Sync | ~5 seconds |
| 50 records | Sync | ~25 seconds |
| 200 records | Async | ~90 seconds |
| 500 records | Async | ~4 minutes |
| 5,000 records | Async | ~50 minutes |

Dimension caching reduces repeated lookups, improving performance by approximately 30% for large imports.


Import Tracking

Every import is tracked with a full audit trail:

  • Import ID, object type, and source
  • Status progression: pending, importing, completed, or failed
  • Record counts: total, imported, updated, skipped, errors
  • Error and warning details (first 100 of each)
  • IDs of created and updated records
  • Duration and timestamps
  • User who initiated the import

For async imports, progress is updated after each batch, enabling real-time monitoring.
