🗄️ MongoDB: Records vs. References

When designing Mongoose schemas in apps/agri-backend, data relationships follow two patterns: embedded documents (denormalized) and ObjectId references (normalized).

Note: The Workspace model has an isResources: boolean field, and it is the key distinction. Workspaces with isResources: true are "references" (resource/knowledge bases), while workspaces with isResources: false are "records" (parse results from conversations).
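
As a minimal sketch of that distinction, the flag can be mapped to a label in one place. Only isResources comes from the Workspace model; the WorkspaceLike shape, the name field, and the workspaceKind helper are hypothetical names used for illustration.

```typescript
// Hypothetical minimal shape: only isResources is taken from the Workspace model.
interface WorkspaceLike {
  name: string;
  isResources: boolean;
}

type WorkspaceKind = 'reference' | 'record';

// isResources: true  -> "reference" (resource/knowledge base)
// isResources: false -> "record" (parse results from conversations)
function workspaceKind(workspace: WorkspaceLike): WorkspaceKind {
  return workspace.isResources ? 'reference' : 'record';
}

// Sample data, invented for the example.
const workspaces: WorkspaceLike[] = [
  { name: 'Agronomy handbook', isResources: true },
  { name: 'Parsed call results', isResources: false },
];

const kinds: WorkspaceKind[] = workspaces.map(workspaceKind);
```

Keeping the mapping in a single helper avoids scattering bare boolean checks whose meaning is easy to invert.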

Embedded Documents

Used when data is hierarchical, immutable, or always read together with the parent:

```typescript
// Field embeds GeoJSON geometry: always read together, never shared
@Prop({ type: Geometry, required: true })
geometry: Polygon | MultiPolygon;

// Draft embeds full message objects: snapshot, no join needed
@Prop({ type: [DraftMessageSchema], default: [] })
messages: DraftMessage[];

// CallLog embeds asset metadata: tightly coupled to the call record
@Prop({ type: Map, of: Object })
assets: Map<string, { gcs: GcsAsset; transcriptions: Record<string, Transcription> }>;
```
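
The "snapshot, no join needed" property of embedding can be illustrated without a database. The sketch below uses plain in-memory objects, not Mongoose; the messageStore map, the helper names, and the sample text are all assumptions made for the example.

```typescript
interface StoredMessage { id: string; text: string; }
interface EmbeddedDraft { messages: { text: string }[]; } // embedded copies
interface ReferencingDraft { messageIds: string[]; }      // ids only

// Stand-in for a separate messages collection (hypothetical).
const messageStore = new Map<string, StoredMessage>([
  ['m1', { id: 'm1', text: 'Spray field 7' }],
]);

// Embedding copies the data into the parent at write time.
function snapshotDraft(ids: string[]): EmbeddedDraft {
  const messages: { text: string }[] = [];
  for (const id of ids) {
    const found = messageStore.get(id);
    if (found) messages.push({ text: found.text });
  }
  return { messages };
}

// Referencing resolves the ids at read time (the "join").
function resolveTexts(draft: ReferencingDraft): string[] {
  return draft.messageIds.map((id) => messageStore.get(id)?.text ?? '(missing)');
}

const embedded = snapshotDraft(['m1']);
const referencing: ReferencingDraft = { messageIds: ['m1'] };

// The source message changes after both drafts were created.
messageStore.set('m1', { id: 'm1', text: 'Spray field 9' });

const embeddedTexts = embedded.messages.map((m) => m.text); // keeps the old text
const referencedTexts = resolveTexts(referencing);          // sees the new text
```

The embedded draft keeps the text it was created with, while the referencing draft reflects the later edit. That is the trade-off behind choosing one pattern or the other.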

ObjectId References

Used for cross-domain relationships where each entity has its own lifecycle:

```typescript
// Draft references User: user has independent lifecycle
@Prop({ type: Types.ObjectId, ref: 'User' })
userId: Types.ObjectId;

// Draft references multiple CallLogs: they grow independently
@Prop({ type: [{ type: Types.ObjectId, ref: 'CallLog' }] })
callLogIds: Types.ObjectId[];

// Virtual reverse lookup: Transcription owns the foreign key
CallLogSchema.virtual('transcriptions', {
  ref: 'Transcription',
  localField: '_id',
  foreignField: 'callLogId',
});
```
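
To make the virtual reverse lookup concrete, here is an in-memory sketch of the matching rule it describes: each CallLog collects the Transcription documents whose callLogId equals its own _id. This is not Mongoose's implementation; the document shapes, string ids, and the populateTranscriptions helper are assumptions for illustration.

```typescript
interface CallLogDoc { _id: string; }
interface TranscriptionDoc { _id: string; callLogId: string; text: string; }
type PopulatedCallLog = CallLogDoc & { transcriptions: TranscriptionDoc[] };

// localField: '_id' on CallLog is matched against foreignField: 'callLogId' on Transcription.
function populateTranscriptions(
  callLogs: CallLogDoc[],
  transcriptions: TranscriptionDoc[],
): PopulatedCallLog[] {
  // Group transcriptions by the foreign key they own.
  const byCallLog = new Map<string, TranscriptionDoc[]>();
  for (const t of transcriptions) {
    const bucket = byCallLog.get(t.callLogId) ?? [];
    bucket.push(t);
    byCallLog.set(t.callLogId, bucket);
  }
  // Attach the matching group, or an empty array, to each call log.
  return callLogs.map((c) => ({ ...c, transcriptions: byCallLog.get(c._id) ?? [] }));
}

// Sample data, invented for the example.
const populated = populateTranscriptions(
  [{ _id: 'c1' }, { _id: 'c2' }],
  [
    { _id: 't1', callLogId: 'c1', text: 'hello' },
    { _id: 't2', callLogId: 'c1', text: 'world' },
  ],
);
```

In Mongoose itself the virtual is only filled in when a query asks for it with populate('transcriptions'). Nothing is stored on the CallLog document, which is why the list can grow without touching the parent.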

Decision Guide

Use embedding when...	Use references when...
Data is a value object (GeoJSON, metadata bundle)	The sub-entity has its own lifecycle
Always read together with the parent	The sub-collection can grow very large
Never shared across documents	Cross-domain queries are needed
Immutable or append-only	Reverse lookups are required
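
The guide can also be read as a small checklist. The sketch below encodes it as a function; the RelationshipTraits names and the rule that any single reference signal outweighs the embedding signals are assumptions, not project policy.

```typescript
// Hypothetical trait names derived from the "Use references when..." column.
interface RelationshipTraits {
  hasOwnLifecycle: boolean;
  canGrowVeryLarge: boolean;
  sharedAcrossDocuments: boolean;
  needsCrossDomainQueries: boolean;
  needsReverseLookups: boolean;
}

type Relationship = 'embed' | 'reference';

// Assumption: one reference signal is enough to prefer a reference.
function chooseRelationship(traits: RelationshipTraits): Relationship {
  const referenceSignal =
    traits.hasOwnLifecycle ||
    traits.canGrowVeryLarge ||
    traits.sharedAcrossDocuments ||
    traits.needsCrossDomainQueries ||
    traits.needsReverseLookups;
  return referenceSignal ? 'reference' : 'embed';
}

// Field.geometry: a value object, always read with its parent, never shared.
const geometryChoice = chooseRelationship({
  hasOwnLifecycle: false,
  canGrowVeryLarge: false,
  sharedAcrossDocuments: false,
  needsCrossDomainQueries: false,
  needsReverseLookups: false,
});

// Draft -> User: the user has an independent lifecycle.
const userChoice = chooseRelationship({
  hasOwnLifecycle: true,
  canGrowVeryLarge: false,
  sharedAcrossDocuments: true,
  needsCrossDomainQueries: true,
  needsReverseLookups: false,
});
```

Real decisions usually weigh these traits rather than treating them as hard switches, so the function is a starting point for discussion rather than a rule.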

Domain Relationship Map

User
 ├─ userId ──────────────────────────────────────────────┐
 │                                                       │
 ▼                                                       ▼
Draft                                              CallLog
 ├─ messages[]  (embedded DraftMessage[])           ├─ assets  (embedded GCS + transcription data)
 ├─ messageIds[] ──────────► Message                ├─ parseId ──────────► ParseResult
 └─ callLogIds[] ──────────► CallLog                └─ virtual: transcriptions ← Transcription

Field (standalone — no cross-domain refs)
 └─ geometry (embedded GeoJSON Polygon)

Note: Field and Draft have no direct schema link. They connect through the BullMQ job queue at processing time — a deliberate decoupling so geospatial and conversational data evolve independently.
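
One way to picture this decoupling is a job payload that carries both ids, so the two documents only meet inside the job handler. The sketch below is a guess at the pattern, not the project's actual job definition: the ProcessingJobData shape, the in-memory maps standing in for collections, and handleJob are all hypothetical, and BullMQ itself is not used.

```typescript
// Hypothetical payload: neither schema stores the other's id.
interface ProcessingJobData {
  draftId: string;
  fieldId: string;
}

interface FieldDoc { _id: string; geometryType: 'Polygon' | 'MultiPolygon'; }
interface DraftDoc { _id: string; messageCount: number; }

// In-memory stand-ins for the two collections (invented sample data).
const fields = new Map<string, FieldDoc>([['f1', { _id: 'f1', geometryType: 'Polygon' }]]);
const drafts = new Map<string, DraftDoc>([['d1', { _id: 'd1', messageCount: 3 }]]);

// The handler is the only place where a Field and a Draft are combined.
function handleJob(data: ProcessingJobData): string {
  const field = fields.get(data.fieldId);
  const draft = drafts.get(data.draftId);
  if (!field || !draft) {
    throw new Error('Job references a missing document');
  }
  return `draft ${draft._id} processed against field ${field._id}`;
}

const jobResult = handleJob({ draftId: 'd1', fieldId: 'f1' });
```

Because the link exists only in the payload, either schema can change without a migration on the other, at the cost of validating both ids when the job runs.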
