Handling CSV Imports at Scale: Fail-Fast vs. Fault-Tolerant Processing in Node.js

Learn industry-standard strategies for robust CSV processing in Node.js with MongoDB, from naive implementations to idempotent, production-ready systems.

When building enterprise applications, few features seem as simple yet prove as treacherous as "Bulk Import via CSV."

It starts innocently:

  1. Read a file
  2. Loop through rows
  3. Insert into database

But reality quickly intrudes:

  • What happens when row 4,999 of 5,000 fails?
  • Do you rollback everything? Continue processing?
  • If you retry the file, do you end up with 4,999 duplicates?

This post explores industry-standard strategies for streaming CSV data into a database, focusing on the transition from a naive implementation to a robust, idempotent system.

The Problem: "Silent Failures" and Partial State

Imagine a system processing event occurrence data. The initial implementation might look like this:

async function processFile(rows) {
  for (const row of rows) {
    try {
      await createOccurrence(row);
    } catch (error) {
      // Error is swallowed; the loop keeps going and nothing is recorded
      console.error("Row failed", error);
    }
  }
  // Marked as completed regardless of how many rows actually failed
  await markFileStatus('COMPLETED');
}

What's wrong here?

  • ❌ Continues processing after failures
  • ❌ Marks file as "Completed" even if 50% of rows failed
  • ❌ No way to identify which rows succeeded or failed
  • ❌ Re-uploading creates duplicates of successful rows

This approach leads to data drift: The system reports "Success," but the database is missing records. The operations team has no visibility into failures, and retrying the file risks creating duplicates.

Strategy 1: The "All-or-Nothing" Transaction (ACID)

For critical financial data or strictly relational data, partial success is unacceptable.

How It Works

Wrap the entire batch in a database transaction. If one row fails, the transaction aborts and rolls back to the state before the file was touched.
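
As a rough sketch, the pattern looks like this with Mongoose sessions (OccurrenceModel, mapRowToOccurrence, and markFileStatus are illustrative placeholders, not a prescribed API):

const mongoose = require('mongoose');

async function importWithTransaction(rows) {
  const session = await mongoose.startSession();
  try {
    await session.withTransaction(async () => {
      for (const row of rows) {
        // Any throw inside this callback aborts and rolls back every insert
        await OccurrenceModel.create([mapRowToOccurrence(row)], { session });
      }
    });
    await markFileStatus('COMPLETED');
  } catch (error) {
    await markFileStatus('FAILED');
    throw error;
  } finally {
    await session.endSession();
  }
}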

Pros

  • Zero Data Corruption: Database is always in a valid state
  • Simple Error Handling: File status is binary (Success or Fail)

Cons

  • Performance: Long-running transactions lock resources
  • Scale Limits: MongoDB transactions have practical size and time limits (older versions capped a transaction at a single 16 MB oplog entry)
  • Infeasible for Large Files: Processing 50,000 rows in one transaction often fails

Strategy 2: "Fail-Fast" with Idempotency (The Balanced Approach)

This is the preferred approach for operational data like occurrences, logs, or inventory updates.

The Concept

  1. Process rows sequentially
  2. If any row fails, stop immediately and mark the file as FAILED (see the sketch after this list)
  3. Crucially: Ensure the processing logic is idempotent
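
A minimal sketch of that flow, reusing the shape of the naive example (processRow and markFileStatus are placeholders, and processRow is assumed to be idempotent, which is covered next):

async function processFile(rows) {
  try {
    for (const row of rows) {
      await processRow(row); // throws on the first bad row
    }
    await markFileStatus('COMPLETED');
  } catch (error) {
    // Surface the failure instead of hiding it
    await markFileStatus('FAILED');
    throw error;
  }
}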

What is Idempotency?

Idempotency means performing an operation multiple times has the same result as performing it once.

Non-Idempotent:

INSERT INTO occurrences (id) VALUES (auto_increment())

→ Running twice creates two records ❌

Idempotent:

INSERT INTO occurrences (event_id) VALUES ('EVT-123') ON CONFLICT DO NOTHING

→ Running twice creates one record ✅
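
Since this post is MongoDB-centric, a roughly equivalent write is an upsert keyed on the stable identifier (assuming a unique index on eventId; the extra field is purely illustrative):

await OccurrenceModel.updateOne(
  { eventId: 'EVT-123' },
  { $setOnInsert: { source: 'csv-import' } },
  { upsert: true }
);

→ Running twice still leaves exactly one document ✅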

Implementing the Solution

Here's how to refactor your file consumer to handle this robustly.

1. Use Stable Identifiers

Move away from generating new IDs on every run. Instead, use stable data from the CSV (like eventId or externalRef) to identify records.
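
As a sketch, the database can enforce this too, so duplicates are rejected even if an application-level check is skipped (the schema and field names here are illustrative):

const mongoose = require('mongoose');

const occurrenceSchema = new mongoose.Schema({
  // Stable business keys taken directly from the CSV
  eventId: { type: String, required: true, unique: true },
  externalRef: { type: String, index: true },
  payload: mongoose.Schema.Types.Mixed,
});

const OccurrenceModel = mongoose.model('Occurrence', occurrenceSchema);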

2. Implement "Skip-Existing" Logic

Instead of complex "upsert" logic, simply check if the work is already done.

for (const row of allRecords) {
  // Look up by stable business keys from the CSV, not generated IDs
  const existing = await OccurrenceModel.findOne({
    $or: [
      { eventId: row.eventId },
      { externalRef: row.externalRef }
    ]
  });

  if (existing) {
    // Already imported on a previous run, so skip it (this is the idempotency)
    logger.info(`Skipping existing occurrence: ${row.eventId}`);
    continue;
  }

  try {
    await createNewOccurrence(row);
  } catch (err) {
    // Fail fast: log, then abort the whole file on the first bad row
    logger.error(`Row failed: ${err.message}`);
    throw err;
  }
}

Key Points:

  • ✅ Check for duplicates using stable identifiers
  • ✅ Skip existing records (idempotency)
  • ✅ Fail fast on errors
  • ✅ Safe to retry the entire file

The Operational Workflow

With this architecture, the workflow becomes clear and safe:

Step-by-Step Example

  1. Upload: User uploads a file with 5,000 rows
  2. Failure: Row 101 has invalid data. System stops immediately
    • Rows 1-100 are saved
    • File status: FAILED
  3. Fix: User corrects the data in row 101
  4. Retry: User uploads the corrected file (see the sketch after this list)
    • Rows 1-100: System detects existing records (by eventId) and skips them
    • Row 101: System processes successfully
    • Rows 102-5,000: System processes them normally
  5. Success: File status: COMPLETED
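
In code terms, a retry is nothing more than running the same consumer again on the corrected file; the skip-existing check absorbs rows 1-100. A hypothetical driver, reusing processFile from above and the csv-parse package:

const fs = require('fs');
const { parse } = require('csv-parse/sync');

async function importCsv(filePath) {
  const rows = parse(fs.readFileSync(filePath), { columns: true });
  await processFile(rows); // same code path for first runs and retries
}

// After fixing row 101, simply re-run:
// await importCsv('./occurrences-fixed.csv');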

Why This Works

  • No duplicates: Idempotency ensures safe retries
  • Clear failures: Operations team knows exactly what failed
  • Partial progress preserved: No need to reprocess successful rows
  • Simple recovery: Fix the issue and re-upload

Summary

When designing data ingestion systems, avoid the temptation to just "catch and log" errors to keep the loop moving.

Key Principles

  • Fail loudly: Don't hide data issues
  • Make it repeatable: Ensure your code handles re-running the same data without side effects
  • Use stable keys: Rely on business keys (Event IDs, External Refs, SKUs) rather than auto-generated IDs for duplicate detection

By adopting these patterns, you transform a fragile import feature into a resilient system that your operations team can trust.

Quick Reference

Approach                | Use Case                      | Pros                     | Cons
ACID Transactions       | Financial data, small batches | Zero corruption, simple  | Performance issues, scale limits
Fail-Fast + Idempotency | Operational data, large files | Scalable, safe retries   | Requires stable identifiers

Published on 12/10/2025 by Claudio Teixeira