When building enterprise applications, few features seem as simple yet prove as treacherous as "Bulk Import via CSV."
It starts innocently:
- Read a file
- Loop through rows
- Insert into database
But reality quickly intrudes:
- What happens when row 4,999 of 5,000 fails?
- Do you rollback everything? Continue processing?
- If you retry the file, do you end up with 4,999 duplicates?
This post explores industry-standard strategies for streaming file data into a database, focusing on the transition from a naive implementation to a robust, idempotent system.
The Problem: "Silent Failures" and Partial State
Imagine a system processing event occurrence data. The initial implementation might look like this:
```javascript
async function processFile(rows) {
  for (const row of rows) {
    try {
      await createOccurrence(row);
    } catch (error) {
      // Swallows the error and keeps going
      console.error("Row failed", error);
    }
  }
  // Runs even if some (or all) rows failed
  await markFileStatus('COMPLETED');
}
```
What's wrong here?
- ❌ Continues processing after failures
- ❌ Marks file as "Completed" even if 50% of rows failed
- ❌ No way to identify which rows succeeded or failed
- ❌ Re-uploading creates duplicates of successful rows
This approach leads to data drift: The system reports "Success," but the database is missing records. The operations team has no visibility into failures, and retrying the file risks creating duplicates.
Strategy 1: The "All-or-Nothing" Transaction (ACID)
For critical financial data or strictly relational data, partial success is unacceptable.
How It Works
Wrap the entire batch in a database transaction. If one row fails, the transaction aborts and rolls back to the state before the file was touched.
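In a MongoDB/Mongoose setup like the one used later in this post, that might look roughly like the sketch below. It is a minimal illustration, not a drop-in implementation: it assumes a replica-set deployment (required for MongoDB transactions), and `processFileAtomically` plus the `OccurrenceModel` and `markFileStatus` helpers are hypothetical names.

```javascript
const mongoose = require('mongoose');

// Minimal sketch: run the whole batch inside one transaction so either
// every row is persisted or none are.
async function processFileAtomically(rows) {
  const session = await mongoose.startSession();
  try {
    await session.withTransaction(async () => {
      for (const row of rows) {
        // Passing the session makes each insert part of the transaction
        await OccurrenceModel.create([row], { session });
      }
    });
    await markFileStatus('COMPLETED'); // only reached if everything committed
  } catch (error) {
    await markFileStatus('FAILED');    // the rollback already discarded all rows
    throw error;
  } finally {
    session.endSession();
  }
}
```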
Pros
- Zero Data Corruption: Database is always in a valid state
- Simple Error Handling: File status is binary (Success or Fail)
Cons
- Performance: Long-running transactions lock resources
- Scale Limits: MongoDB transactions have practical size limits (historically capped at 16MB of oplog per transaction)
- Unfeasible for Large Files: Processing 50,000 rows in one transaction often fails
Strategy 2: "Fail-Fast" with Idempotency (The Balanced Approach)
This is the preferred approach for operational data like occurrences, logs, or inventory updates.
The Concept
- Process rows sequentially
- If any row fails, stop immediately and mark the file as FAILED
- Crucially: ensure the processing logic is idempotent
What is Idempotency?
Idempotency means performing an operation multiple times has the same result as performing it once.
Non-Idempotent:

```sql
INSERT INTO occurrences (id) VALUES (auto_increment())
```

→ Running twice creates two records ❌

Idempotent:

```sql
INSERT INTO occurrences (event_id) VALUES ('EVT-123') ON CONFLICT DO NOTHING
```

→ Running twice creates one record ✅
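Since the examples in this post use MongoDB, here is a rough equivalent of the idempotent form, expressed as an upsert keyed on a stable identifier. It assumes a unique index on `eventId`; the `importedAt` field is purely illustrative.

```javascript
// Sketch: an upsert keyed on the business identifier. Running it twice
// still leaves exactly one document for EVT-123.
await OccurrenceModel.updateOne(
  { eventId: 'EVT-123' },                       // stable business key
  { $setOnInsert: { importedAt: new Date() } }, // written only when the doc is created
  { upsert: true }
);
```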
Implementing the Solution
Here's how to refactor your file consumer to handle this robustly.
1. Use Stable Identifiers
Move away from generating new IDs on every run. Instead, use stable data from the CSV (like eventId or externalRef) to identify records.
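One way to back this up at the database level is a unique index on the stable key, so a duplicate insert fails loudly instead of silently creating a second record. A sketch, assuming Mongoose; fields beyond `eventId` and `externalRef` are hypothetical:

```javascript
const mongoose = require('mongoose');

// Sketch: enforce the stable identifier with a unique index so duplicates
// become hard errors instead of silent data drift.
const occurrenceSchema = new mongoose.Schema({
  eventId:     { type: String, required: true, unique: true },
  externalRef: { type: String, index: true, sparse: true },
  payload:     { type: mongoose.Schema.Types.Mixed }, // illustrative catch-all
});

const OccurrenceModel = mongoose.model('Occurrence', occurrenceSchema);
```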
2. Implement "Skip-Existing" Logic
Instead of complex "upsert" logic, simply check if the work is already done.
```javascript
for (const row of allRecords) {
  // Look up the row by its stable business keys
  const existing = await OccurrenceModel.findOne({
    $or: [
      { eventId: row.eventId },
      { externalRef: row.externalRef }
    ]
  });

  if (existing) {
    logger.info(`Skipping existing occurrence: ${row.eventId}`);
    continue;
  }

  try {
    await createNewOccurrence(row);
  } catch (err) {
    logger.error(`Row failed: ${err.message}`);
    throw err; // fail fast: abort the whole file on the first error
  }
}
```
Key Points:
- ✅ Check for duplicates using stable identifiers
- ✅ Skip existing records (idempotency)
- ✅ Fail fast on errors
- ✅ Safe to retry the entire file
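Wrapped in the surrounding consumer, the loop above can then drive the file status explicitly. A sketch, assuming `processRows` wraps the skip-existing loop shown above and `markFileStatus` is the same hypothetical helper from the first example:

```javascript
// Sketch: fail fast, record the outcome, and rely on the skip-existing
// loop to make a full re-upload safe.
async function importFile(rows) {
  try {
    await processRows(rows);           // the skip-existing loop shown above
    await markFileStatus('COMPLETED');
  } catch (err) {
    await markFileStatus('FAILED');    // stop immediately and surface the failure
    logger.error(`Import aborted: ${err.message}`);
    throw err;
  }
}
```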
The Operational Workflow
With this architecture, the workflow becomes clear and safe:
Step-by-Step Example
- Upload: User uploads a file with 5,000 rows
- Failure: Row 101 has invalid data. The system stops immediately
  - Rows 1-100 are saved
  - File status: FAILED
- Fix: User corrects the data in row 101
- Retry: User uploads the corrected file
  - Rows 1-100: System detects existing records (by eventId) and skips them
  - Row 101: System processes it successfully
  - Rows 102-5,000: System processes them
- Success: File status: COMPLETED
Why This Works
- No duplicates: Idempotency ensures safe retries
- Clear failures: Operations team knows exactly what failed
- Partial progress preserved: No need to reprocess successful rows
- Simple recovery: Fix the issue and re-upload
Summary
When designing data ingestion systems, avoid the temptation to just "catch and log" errors to keep the loop moving.
Key Principles
- Fail loudly: Don't hide data issues
- Make it repeatable: Ensure your code handles re-running the same data without side effects
- Use stable keys: Rely on business keys (Event IDs, External Refs, SKUs) rather than auto-generated IDs for duplicate detection
By adopting these patterns, you transform a fragile import feature into a resilient system that your operations team can trust.
Quick Reference
| Approach | Use Case | Pros | Cons |
|---|---|---|---|
| ACID Transactions | Financial data, small batches | Zero corruption, simple | Performance issues, scale limits |
| Fail-Fast + Idempotency | Operational data, large files | Scalable, safe retries | Requires stable identifiers |