# Stateless Services
A stateless service doesn't remember anything between requests. Kill it, restart it, run 10 of them in parallel: the result is identical. This is factor VI of our architecture.
In NestJS, services are singletons: request context always travels as method arguments and is never stored on the class. `CompanyAwareService` enforces this: every method takes the user explicitly, so nothing leaks between requests.
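The pattern can be sketched with plain classes (the NestJS `@Injectable()` decorator is omitted so the sketch stays dependency-free, and this `CompanyAwareService` body is illustrative, not the project's actual implementation):

```typescript
// Illustrative sketch: a singleton service where request context (the user)
// is always a method argument, never a class field.

interface User {
  id: string;
  companyId: string;
}

interface Report {
  companyId: string;
  rows: string[];
}

class CompanyAwareService {
  // No request-scoped fields: nothing here survives between calls,
  // so two concurrent requests can never see each other's data.
  buildReport(user: User, rows: string[]): Report {
    // All context needed for this call arrives as arguments.
    return { companyId: user.companyId, rows };
  }
}

// The anti-pattern, for contrast: storing the user on a singleton means
// the second request silently overwrites the first one's context.
class LeakyService {
  private currentUser?: User; // request state on a singleton: a bug waiting

  setUser(user: User) {
    this.currentUser = user;
  }

  buildReport(rows: string[]): Report {
    return { companyId: this.currentUser!.companyId, rows };
  }
}
```

With `LeakyService`, two interleaved requests end up with the second user's `companyId` in both reports; `CompanyAwareService` cannot exhibit that failure by construction.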
State moves out of the process:
| What | Where |
|---|---|
| Persistent data | MongoDB |
| Job state | BullMQ (backed by Redis) |
| Auth context | Firebase session cookie, verified per request |
| Files | GCS |
| Config | Environment variables |
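Keeping config in environment variables is the simplest of these externalizations; a minimal sketch (variable names and defaults are illustrative, not the project's actual config):

```typescript
// Illustrative sketch: read config from the environment at startup and
// fail fast on anything required. Variable names are examples only.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Optional values can fall back to sane local defaults.
const config = {
  mongoUri: process.env.MONGO_URI ?? "mongodb://localhost:27017/app",
  redisUrl: process.env.REDIS_URL ?? "redis://localhost:6379",
};
```

Because the process reads everything from the environment at boot, any instance started with the same environment behaves identically.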
## Externalize jobs
A common class of bugs comes from triggering work inside the process lifecycle:
| Pattern | Problem |
|---|---|
| `EventEmitter` to mutate data | Event lives in-process: service crashes, job fails, data is never mutated. No retry, no trace. |
| MongoDB change stream listeners | Listener is in-process: service restarts, you miss every event that fired while it was down. No replay. |
| Mutations in constructor / `onModuleInit` | Works with one instance. Two instances = runs twice. Bootstrap jobs, assistant deployment, scheduled job registration all double-execute. |
| `@Cron` / NestJS scheduler | Every running instance executes the job. Two instances = duplicate emails, duplicate reports, duplicate API calls. Use BullMQ repeatable jobs instead: a Redis lock ensures only one instance wins. |
The rule: if the outcome matters, it can't live inside the process.
Use `EventEmitter` only for in-process coordination where losing the event is acceptable. For everything else, use BullMQ. Jobs survive restarts, are retried on failure, and with proper locking run exactly once regardless of how many instances are running.
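The "only one instance wins" guarantee boils down to an atomic lock around the job. A minimal in-memory sketch of that idea (in production the lock is a Redis key managed by BullMQ; a `Set` stands in for Redis here so the sketch is self-contained, and all names are illustrative):

```typescript
// Illustrative sketch of lock-based exactly-once execution.
// Redis SET NX gives these same acquire-or-lose semantics across processes;
// an in-process Set stands in for it here.
const acquiredLocks = new Set<string>();

// Atomically acquire a named lock; returns false if another "instance"
// already holds it.
function tryAcquire(lockKey: string): boolean {
  if (acquiredLocks.has(lockKey)) return false;
  acquiredLocks.add(lockKey);
  return true;
}

let executions = 0;

// Every instance calls this on schedule; only the lock winner does the work.
function runScheduledJob(jobId: string): boolean {
  if (!tryAcquire(`lock:${jobId}`)) return false;
  executions += 1; // the actual job body runs exactly once
  return true;
}
```

Two instances racing for the same `jobId` both call `runScheduledJob`, but only one acquires the lock and executes the body; the other returns immediately.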
- **Queues & Async**: durable work that survives process restarts
- **Bootstrap jobs**: one-time mutations that are safe to run with multiple instances
- **Scheduling**: how to schedule jobs safely across multiple instances
## Why it matters
A stateful service can only run as one instance: the second instance doesn't share its memory. Stateless services scale horizontally for free: put a load balancer in front and run as many instances as needed.
This is where Convention over Configuration pays off at the infrastructure level.
By following these patterns:
- services without in-process state
- work externalized to BullMQ
- config in env vars
Scaling comes by design: it's a consequence of how the code is written. You don't design for scale; you get it by default.
> **TIP**
> If you can restart the process without losing anything, it's stateless. That's the test.
> **CAUTION**
> Local in-memory caches, class-level request state, and sticky sessions are the usual culprits. Move them to Redis.
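The fix for an in-memory cache is to put the store behind an interface and back it with Redis. A sketch with a `Map` standing in for the Redis client so it stays runnable (interface and class names are illustrative, not the project's actual code):

```typescript
// Illustrative sketch: the cache lives behind an interface, outside the
// service. In production the same interface would wrap a Redis client;
// a Map-backed stub stands in here.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// Stand-in implementation so the sketch runs without Redis.
class InMemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  async get(key: string) {
    return this.data.get(key);
  }
  async set(key: string, value: string) {
    this.data.set(key, value);
  }
}

class ProfileCache {
  // The service holds only the store handle, never the cached data itself,
  // so every instance sees the same cache and any instance can be killed.
  constructor(private readonly store: KeyValueStore) {}

  async getProfile(userId: string): Promise<string | undefined> {
    return this.store.get(`profile:${userId}`);
  }

  async setProfile(userId: string, json: string): Promise<void> {
    await this.store.set(`profile:${userId}`, json);
  }
}
```

Swapping `InMemoryStore` for a Redis-backed implementation changes nothing in the service: that substitutability is exactly what makes the process disposable.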