
Idempotence of AI integrations: avoiding duplicates and data loss



Introduction

AI-assisted integration development almost always runs in a context of repeated attempts: the client re-sent the request, a worker restarted, the queue delivered a message a second time, the gateway retried the call after a timeout. As long as everything is green on staging, this is invisible. In real operation, it is exactly these repetitions that create expensive errors: double charges, duplicate tasks, conflicting statuses, broken reports.

This article is for developers, tech leads, and platform engineers who build AI pipelines with external APIs, queues, workers, and tool calls. After reading, you'll have a working scheme: how to introduce idempotence at the entry point, where to store the keys, how to design retries, and how to verify that the system really doesn't execute side effects twice.

In systems with retries, the question is not whether a repeat will happen, but whether it will break your business process. Idempotence makes repetition safe.

Why AI integrations are particularly vulnerable to duplicates

Classic integrations also suffer from repeated calls, but in AI systems, the risk is higher due to a combination of factors:

  • long chains of calls: API -> orchestrator -> tools -> storage;
  • background tasks with automatic retries;
  • queues with at-least-once delivery;
  • external providers that sometimes respond slowly or unreliably;
  • manual reruns: "I ran it again just in case."

If an operation has a side effect, repeating it without protection almost always creates duplicates. Examples of side effects:

  • creating a new order;
  • charging a payment;
  • issuing an access token;
  • writing a final status to the database;
  • sending a notification to the customer.

It is important to distinguish two scenarios:

  1. A repeated GET with no side effects. Usually safe.
  2. A repeated POST or PATCH that changes state. Dangerous without idempotence.

The short conclusion: the more asynchronous steps and external dependencies, the higher the cost of a non-idempotent operation.

Practical part by steps

Step 1. Find the critical operations where a duplicate hurts the business

Don’t start by trying to cover everything. First, select the 5-10 operations with the highest cost of error:

  • payments and billing;
  • finalization of orders;
  • creation of tickets and tasks in external systems;
  • a status change that triggers downstream processes;
  • granting privileges and access.

For each operation, record:

  • what side effect it has;
  • what happens if it executes twice;
  • whether a cached result can be returned safely;
  • how long deduplication should last (TTL).

Stabilize operations with financial, legal, and user-facing risk first.

Step 2. Make the idempotency key a binding contract

Idempotence should be part of the API contract, not an option. For critical POST/PATCH requests, require an Idempotency-Key header or field.

Key rules:

  • the key is unique to a single business operation;
  • the same key must not be used with different payloads;
  • the server checks the key before executing any side effect;
  • a repeat with the same payload returns the same result;
  • a repeat with a different payload returns a conflict (e.g., HTTP 409).

Working format of the key:

  • tenant_id + operation_type + client_generated_uuid;
  • or hash of the business id of the transaction.

The main mistake at this step is generating the key on the server after execution has started: by then a repeat can already reach business logic. The key must arrive with the request and be validated first.
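The two key formats above can be sketched in Python; the function names and separator format are illustrative, not a fixed standard:

```python
import hashlib
import uuid

def make_idempotency_key(tenant_id: str, operation_type: str) -> str:
    """Client-generated key: created once, before the request is sent,
    and reused verbatim on every retry of the same business operation."""
    return f"{tenant_id}:{operation_type}:{uuid.uuid4()}"

def key_from_business_id(tenant_id: str, operation_type: str, business_id: str) -> str:
    """Alternative: derive the key from a stable business identifier, so the
    same business operation always maps to the same key, even across clients."""
    raw = f"{tenant_id}:{operation_type}:{business_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

The derived variant is useful when several independent callers may trigger the same business operation and cannot share a generated UUID.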

Short conclusion: if the key is not required for a critical operation, the system effectively has no idempotence.

Step 3. Design the dedup store and key states

You need a table or key-value store that records the fate of each key. Minimum model:

  • idempotency_key;
  • request_fingerprint (a canonical payload hash);
  • status (in_progress, succeeded, failed);
  • response_snapshot;
  • created_at, expires_at.

Processing procedure:

  1. A request arrives with a key.
  2. Look up the key in the dedup store.
  3. If the key is missing, atomically create an in_progress record.
  4. Perform the operation.
  5. Save the result and set the status to succeeded.
  6. On a repeat, return the saved response.

Critical: creating the in_progress record must be atomic, otherwise two parallel requests can both pass the check and both execute the side effect.
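A minimal sketch of the atomic claim, using a primary-key constraint so that only one concurrent request can create the in_progress record. SQLite and the schema here are illustrative; any store with a unique-constraint or compare-and-set primitive works the same way:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE idempotency_keys (
        idempotency_key TEXT PRIMARY KEY,
        status TEXT NOT NULL,
        response_snapshot TEXT
    )
""")

def try_claim(key: str) -> bool:
    """Atomically create an in_progress record.
    The PRIMARY KEY constraint guarantees exactly one caller wins."""
    try:
        conn.execute(
            "INSERT INTO idempotency_keys (idempotency_key, status) "
            "VALUES (?, 'in_progress')",
            (key,),
        )
        conn.commit()
        return True   # we own the key: safe to perform the side effect
    except sqlite3.IntegrityError:
        return False  # already claimed: return the saved result or a 409

def finish(key: str, response: str) -> None:
    """Record the result so repeats can be answered from the snapshot."""
    conn.execute(
        "UPDATE idempotency_keys "
        "SET status = 'succeeded', response_snapshot = ? "
        "WHERE idempotency_key = ?",
        (response, key),
    )
    conn.commit()
```

The check-then-insert is collapsed into a single INSERT precisely so there is no window between "look up" and "claim" for a racing request to slip through.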

Short conclusion: without atomic key claiming, duplicate protection falls apart under peak traffic.

Step 4. Canonicalize the payload and validate conflicts

A key alone is not enough. The client may accidentally reuse a key while changing the request body. That is why you need a request_fingerprint:

  • remove non-essential fields (trace ids, timestamps);
  • sort the JSON keys;
  • normalize the format of numbers and dates;
  • compute a hash of the canonical form.

If the key matches but the fingerprint differs, return a conflict and do not perform the operation. This protects against the "same key, different intent" class of errors.
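A minimal canonicalization sketch. The NON_ESSENTIAL field names are assumptions for illustration; the set depends on your actual payload schema:

```python
import hashlib
import json

# Illustrative: volatile fields that must not affect the fingerprint.
NON_ESSENTIAL = {"trace_id", "timestamp"}

def canonicalize(payload: dict) -> str:
    """Drop volatile fields and serialize with sorted keys and fixed
    separators, so logically identical payloads serialize identically."""
    cleaned = {k: v for k, v in payload.items() if k not in NON_ESSENTIAL}
    return json.dumps(cleaned, sort_keys=True, separators=(",", ":"))

def fingerprint(payload: dict) -> str:
    """Hash of the canonical form, stored next to the idempotency key."""
    return hashlib.sha256(canonicalize(payload).encode("utf-8")).hexdigest()
```

On a repeat, compare the stored fingerprint with the incoming one: equal means "same intent, return the snapshot"; different means "same key, different intent, return a conflict."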

Short conclusion: idempotence without payload matching gives a false sense of security.

Step 5. Set up a retry policy coherently

Retries are needed, but they must be managed:

  • exponential backoff with jitter;
  • a limit on attempts;
  • an overall timeout for the business operation;
  • a clear list of retryable errors;
  • a unified policy for API clients and workers.

If retries are not coordinated, one layer repeats 3 times, the next 5 more, the next 4 more. As a result, one network error turns into dozens of repetitions of the same operation.

The rule of thumb: repetition should stay safe thanks to idempotence, but that is not a license for endless retries.
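The policy above can be sketched as a small helper. The parameter defaults and the retryable exception list are illustrative; in a real system they come from your shared retry policy:

```python
import random
import time

def retry_with_backoff(operation, max_attempts=4, base_delay=0.5,
                       max_delay=10.0,
                       retryable=(TimeoutError, ConnectionError)):
    """Exponential backoff with full jitter.
    Non-retryable errors propagate immediately; retryable ones are
    re-raised only after the attempt limit is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(random.uniform(0, delay))  # full jitter
```

Because each repeated attempt carries the same idempotency key, the server answers repeats from the snapshot instead of executing the side effect again.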

Short conclusion: idempotence and retry-policy work only in pairs.

Step 6. Add outbox/inbox for asynchronous contours

In queues and event-driven systems, exactly-once delivery is usually unattainable at the transport level. The realistic goal is at-least-once delivery plus an idempotent consumer.

What to implement:

  • an outbox table for guaranteed publication of events;
  • an inbox/processed-events table on the consumer side, deduplicating by event_id;
  • a business key for the operation, in case the same event can arrive from different sources.

This is especially important in AI loops, where a single task can fan out to multiple downstream services, each of which restarts independently.
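A minimal idempotent-consumer sketch. An in-memory set stands in for the persistent inbox table, which in production would be written in the same transaction as the side effect:

```python
# In production: a persistent inbox table, updated transactionally
# together with the side effect. A set is enough to show the idea.
processed_events: set[str] = set()
side_effects: list[str] = []  # stands in for the real downstream action

def handle_event(event: dict) -> bool:
    """Idempotent consumer: an event_id seen before is acknowledged
    to the broker but its side effect is skipped."""
    event_id = event["event_id"]
    if event_id in processed_events:
        return False  # duplicate delivery from at-least-once transport
    side_effects.append(event_id)   # the real side effect goes here
    processed_events.add(event_id)  # mark processed (atomically, in production)
    return True
```

The mark-processed step and the side effect must commit together; if they are separate, a crash between them re-creates the exact duplication problem this pattern exists to solve.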

Short conclusion: idempotent API without idempotent consumer does not cover the risk of duplicates in queues.

Step 7. Build observability: you should see duplicates before the incident

Minimum set of metrics and logs:

  • number of requests reusing the same Idempotency-Key;
  • share of key/fingerprint conflicts;
  • number of blocked duplicates;
  • average time a record spends in in_progress;
  • dedup store errors;
  • percentage of operations arriving without a key (should be zero on critical endpoints).

Add a correlation id and link it across:

  • the incoming API request;
  • the dedup store record;
  • calls to external systems;
  • the final response to the client.

Without this link, you will only learn about the problem from user complaints.

Idempotence without metrics is difficult to maintain and almost impossible to improve.

Real use cases

Scenario 1. Payment endpoint with customer-side timeout

The client sent POST /payments, the network stalled, the client got no response and repeated the request. Without idempotence, the system creates a second charge. With an idempotency key, the second call receives the saved result of the first operation.

Practical benefit: double charges and manual refunds after network hiccups disappear.

Scenario 2. Worker reprocesses the task after restart

A report-generation task completed, but the confirmation was not recorded before the process crashed. After the restart, the queue delivers the message again. Deduplicating by job_id, the handler determines that the result is already finalized and skips the side effect.

Practical benefit: no duplicate reports or status conflicts in analytics.

Scenario 3. An agent re-invokes a tool after a 502

The tool returned a 502, and the agent decided to repeat the call with the same business intent. The integration service checks the key and returns the previous result if the operation already completed, or safely resumes if the previous run did not finish.

Practical benefit: stable workflow even with an unstable external API.

Tools and technologies

In practice, what matters is not specific products but architectural roles:

  • an API gateway or middleware for mandatory key validation;
  • a dedup store in a relational database or key-value storage;
  • a transactional model for claiming in_progress;
  • a payload canonicalizer and fingerprint hashing;
  • outbox/inbox for event-driven flows;
  • centralized logging and metrics.

If the stack already spans several clients, the main thing is a single idempotence contract. Otherwise one part of the system works safely while another keeps creating duplicates.

Short conclusion: the choice of technology is secondary; the contract, atomicity, and observability are critical.

Comparative table of approaches

Approach                                          Duplicate protection   Implementation cost   Operational risk
Retry only, without keys                          Low                    Low                   High
Idempotency key without fingerprint               Medium                 Medium                Medium
Key + fingerprint + dedup store + outbox/inbox    High                   Higher upfront        Low in operation

The bottom line: the minimal "fast" option looks cheap only until the first incident. For critical operations, the full scheme pays off.

Implementation checklist

  • Critical operations with side effects are identified.
  • Idempotency-Key is mandatory for them in the API contract.
  • Atomic creation of the in_progress record before business logic is implemented.
  • request_fingerprint is stored to detect payload conflicts.
  • TTL and cleanup of old keys are configured.
  • Retries are limited and coordinated between layers.
  • Outbox/inbox patterns are in place for queues.
  • Duplicate and conflict metrics are monitored.
  • Logs are linked via correlation id.
  • There are tests for concurrent repeats and network failures.

Typical errors and how to fix them

Mistake 1. The key is optional or not checked everywhere

Problem: some clients send the key, some don't, and duplicates keep getting through.

Fix: Make the key mandatory for critical endpoints and reject requests without it.

Mistake 2. The key is recorded only after the operation

Problem: with parallel requests, both threads execute the side effect.

Fix: first the atomic claim of in_progress, then the operation.

Mistake 3. There is no payload comparison when repeated

Problem: One key is accidentally used for different operations.

Fix: store the fingerprint and return a conflict when it mismatches.

Mistake 4. Idempotence is implemented only in the HTTP layer

Problem: duplicates still get through in queues and workers.

Fix: implement an idempotent consumer with deduplication by event_id/job_id.

Mistake 5. No double metrics

Problem: The degradation accumulates imperceptibly and manifests itself only in an incident.

Fix: surface key reuse, fingerprint conflicts, and blocked duplicates on a mandatory dashboard.

FAQ

Is idempotence only needed for payments?

No. Any operation with a side effect needs it: creating an entity, changing a status, granting access, sending a notification.

Can idempotence be implemented only on the client?

No. Client-side safeguards are useful, but the final protection must live on the server where the side effect executes.

Which TTL to choose for idempotency key?

TTL depends on the business context: retry windows, queue delays, and SLAs. The main rule: the TTL must cover a realistic re-delivery period.

What to do with existing endpoints without keys?

Introduce it gradually: first new endpoint versions and the riskiest operations, then migrate old clients and close the legacy path.

Does idempotence solve all reliability problems?

No. It closes the class of duplicates and repeats. In parallel, you still need timeouts, circuit breakers, rate limiting, and a clear fallback strategy.

Outcome and next practical step

Idempotence is a foundational reliability mechanism of AI integrations, not an "extra optimization." If an operation can be repeated, it must be designed so that repetition does not break the system's state.

Next step: pick one critical endpoint, make Idempotency-Key mandatory, implement the atomic in_progress record, and verify concurrent repeats in a load test. This gives a quick, measurable reliability gain.