Vox Pet Digital is a vertical SaaS for pet shops and veterinary clinics. This is not a small project: 95 Prisma models, 91 frontend pages, 22 feature modules, integrations with OpenAI, WhatsApp, Stripe, Mercado Pago, Asaas, NF-e and an active migration from Express to NestJS running in the same process. In this case study, I'll detail the 4 biggest technical challenges I faced.
The system
Vox Pet covers the complete cycle of a veterinary clinic: appointments, medical records, vaccines, prescriptions, hospitalizations, sales, cash register, inventory, commissions, invoices (NF-e) and automated customer service via WhatsApp. It's multi-tenant (each clinic is an isolated tenant) and multi-branch (a clinic chain shares data across locations with granular control).
- Backend: Node.js — Express (v1) + NestJS (v2) coexisting in the same process
- Frontend: Next.js 16 + React 19 with App Router, MUI v7 + shadcn/ui + Tailwind v4
- Database: PostgreSQL 16 via Prisma (95 models, 93 with tenant_id, 76 with branch_id)
- AI: OpenAI (GPT-4o-mini + embeddings + Whisper) + Baileys (self-hosted WhatsApp)
- Payments: Stripe (SaaS subscriptions) + Mercado Pago + Asaas (BR payments)
- Fiscal: Focus NFe + NFe.io (two providers with fallback)
- Storage: Firebase Admin
- Leads: Meta/Facebook integration
Challenge #1: gradual Express → NestJS migration in the same process
When I joined the project, the backend was an Express monolith with 41 controllers — no strong typing, no DTOs, no consistent validation. Rewriting everything at once was unfeasible: the system was in production with clinics depending on it daily.
The solution was the strangler fig pattern: Express and NestJS running in the same Node.js process. V2 routes live at /api/v2, while v1 continues working normally. Both share the same Prisma instance, auth system and tracer.
Rules for any v2 code are strict:
- Strict TypeScript: no any, no // @ts-ignore
- Thin controllers: business logic in services, controllers only route
- DTOs with validation: every input goes through class-validator
- Mandatory tenant_id: no query exists without tenant filter
- Zero console.log: everything via structured logger with context
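The logging rule is the easiest one to picture in code. Below is a minimal sketch of a context-carrying logger, assuming one JSON line per entry. `createLogger`, `LogContext`, the field names and the injected `sink` are illustrative assumptions, not the project's actual logger.

```typescript
// Hypothetical structured logger: every entry is one JSON line that carries its context.
type LogContext = { module: string; tenantId?: string; requestId?: string }
type LogLevel = 'info' | 'warn' | 'error'
type Extra = Record<string, unknown>
type Sink = (line: string) => void

interface Logger {
  info(message: string, extra?: Extra): void
  warn(message: string, extra?: Extra): void
  error(message: string, extra?: Extra): void
  child(more: Partial<LogContext>): Logger
}

function createLogger(context: LogContext, sink: Sink): Logger {
  const write = (level: LogLevel, message: string, extra: Extra = {}): void => {
    sink(JSON.stringify({ level, message, ...context, ...extra, timestamp: new Date().toISOString() }))
  }
  return {
    info: (message, extra) => write('info', message, extra),
    warn: (message, extra) => write('warn', message, extra),
    error: (message, extra) => write('error', message, extra),
    // Derive a logger with more context, for example once the tenant is known.
    child: (more) => createLogger({ ...context, ...more }, sink),
  }
}
```

A service would receive a `child` logger bound to the current tenant, so every line it writes can be filtered by `tenantId` without the caller repeating it.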
So far, 12 modules have been migrated to v2, including pets (17 endpoints), hospitalizations (8), procedures (6), branches, stock-transfers (5, with an approve/reject/complete workflow), reminders, fiscal/NF-e (11), imports (NF-e XML), public booking, and analytics. The rest (clients, sales, appointments, medical records, vaccines, cash register, WhatsApp, admin) still runs on v1 and is being migrated incrementally.
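// bootstrap.ts: Express and NestJS coexisting
// (AppModule, authMiddleware, v1Router and PORT come from the project)
import express from 'express'
import { NestFactory } from '@nestjs/core'

async function bootstrap() {
  const expressApp = express()
  // v1 routes (legacy)
  expressApp.use('/api/v1', authMiddleware, v1Router)
  // v2 routes (NestJS)
  const nestApp = await NestFactory.create(AppModule)
  nestApp.setGlobalPrefix('api/v2')
  // init() registers the Nest routes without opening a second listener
  await nestApp.init()
  const nestAdapter = nestApp.getHttpAdapter().getInstance()
  expressApp.use(nestAdapter)
  // Shared: Prisma, auth, tracer
  expressApp.listen(PORT)
}

bootstrap()
Challenge #2: 24/7 self-hosted WhatsApp + AI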
WhatsApp customer service is one of Vox Pet's biggest differentiators. It's not a simple chatbot — it's an AI agent with 10 tools, RAG (business knowledge base), conversation memory, media processing (audio via Whisper) and automated follow-up.
The architecture:
- Connection: Baileys (self-hosted WhatsApp Web) running on Railway with persistent disk to maintain the session
- Orchestrator: receives the message, identifies the tenant, loads context (history + knowledge base) and decides which tool to use
- 10 available tools: schedule appointment, check availability, check price, recommend product, look up medical records, send reminder, among others
- RAG: per-tenant knowledge base with OpenAI embeddings, semantic search to contextualize responses
- Whisper: when the customer sends audio, automatically transcribes and processes as text
- Memory: conversation history per customer, maintains context across messages
- Follow-up: minute-by-minute cron checks conversations without response in the 15-240min window and sends automated follow-up
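The follow-up rule in the last bullet boils down to a time-window filter. Here is a minimal sketch, assuming each conversation records who spoke last, when, and whether a follow-up has already gone out. The `Conversation` shape, `needsFollowUp` and `selectForFollowUp` are hypothetical names, and "without response" is read here as "the customer has not replied to the bot".

```typescript
// Hypothetical conversation shape; field names are assumptions for illustration.
interface Conversation {
  id: string
  lastMessageAt: Date
  lastMessageFrom: 'customer' | 'bot'
  followUpSentAt: Date | null
}

const MIN_SILENCE_MS = 15 * 60 * 1000   // 15 minutes
const MAX_SILENCE_MS = 240 * 60 * 1000  // 240 minutes

// True when the bot spoke last, the customer has been silent for 15 to 240 minutes,
// and no follow-up has been sent since that last message.
function needsFollowUp(c: Conversation, now: Date): boolean {
  if (c.lastMessageFrom !== 'bot') return false
  if (c.followUpSentAt !== null && c.followUpSentAt.getTime() >= c.lastMessageAt.getTime()) return false
  const silenceMs = now.getTime() - c.lastMessageAt.getTime()
  return silenceMs >= MIN_SILENCE_MS && silenceMs <= MAX_SILENCE_MS
}

// What a per-minute cron tick could call to pick the conversations to nudge.
function selectForFollowUp(conversations: Conversation[], now: Date): Conversation[] {
  return conversations.filter((c) => needsFollowUp(c, now))
}
```

Keeping the check pure (the clock is passed in as `now`) makes the window easy to test without waiting for a real cron tick.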
// Simplified orchestrator
async function handleMessage(tenantId: string, message: WAMessage) {
  const tenant = await loadTenantConfig(tenantId)
  const history = await getConversationHistory(message.from, tenantId)
  // If audio, transcribe with Whisper first so RAG searches the actual text
  const text = message.type === 'audio'
    ? await whisperTranscribe(message.media)
    : message.text
  const knowledge = await ragSearch(text, tenantId)
  const messages = [
    { role: 'system', content: buildSystemPrompt(tenant, knowledge) },
    ...history,
    { role: 'user', content: text },
  ]
  const tools = getAvailableTools(tenant)
  let completion = await openai.chat.completions.create({ model: 'gpt-4o-mini', messages, tools })
  let reply = completion.choices[0].message
  // Execute tool calls if any, then let the model write the final answer
  // (assumes executeToolCall returns a JSON-serializable result)
  while (reply.tool_calls?.length) {
    messages.push(reply)
    for (const call of reply.tool_calls) {
      const result = await executeToolCall(call, tenantId)
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) })
    }
    completion = await openai.chat.completions.create({ model: 'gpt-4o-mini', messages, tools })
    reply = completion.choices[0].message
  }
  const answer = reply.content ?? ''
  await sendWhatsAppMessage(message.from, answer)
  await saveToHistory(message.from, tenantId, text, answer)
}
The biggest challenge here wasn't the AI; it was reliability. Baileys reconnects on its own, but when Railway restarts the container, the session can be lost. We added a persistent disk, health checks and alerting so the bot never goes offline without notice.
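The "never offline without notice" goal can be sketched as a small watchdog. This is an illustration under assumptions, not the production monitor: `createSessionWatchdog`, the grace period and the injected `alert` callback are hypothetical, and the wiring to the Baileys connection events is left out.

```typescript
// Connection states as reported by the WhatsApp socket (assumed: open, connecting, close).
type ConnectionState = 'open' | 'connecting' | 'close'

// Hypothetical watchdog: tracks how long the session has been down and alerts once per outage.
function createSessionWatchdog(graceMs: number, alert: (message: string) => void) {
  let downSince: number | null = null
  let alerted = false

  return {
    // Call whenever the socket reports a state change.
    onStateChange(state: ConnectionState, now: number): void {
      if (state === 'open') {
        if (alerted) alert('WhatsApp session recovered')
        downSince = null
        alerted = false
      } else if (downSince === null) {
        downSince = now
      }
    },
    // Call from a periodic health check; returns false once the outage exceeds the grace period.
    check(now: number): boolean {
      if (downSince === null) return true
      const downForMs = now - downSince
      if (downForMs < graceMs) return true
      if (!alerted) {
        alerted = true
        alert(`WhatsApp session down for ${Math.round(downForMs / 1000)}s`)
      }
      return false
    },
  }
}
```

The grace period is there because short reconnects are normal. Only an outage that outlives it should page someone, and only once.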
Challenge #3: consistent multi-tenant + multi-branch
Multi-tenant in SaaS is common. Multi-branch within a tenant is another level of complexity. In Vox Pet, a clinic chain can have 3 branches sharing client and pet records, but each branch has its own inventory, cash register, schedule and commissions.
Of the 95 Prisma models:
- 93 models have tenant_id: total isolation between clinics
- 76 models have branch_id: per-branch isolation within the tenant
- 15 migrations to reach this structure without breaking existing data
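One way to keep the two isolation levels consistent is to declare each model's scope once and derive the query filter from it. The sketch below assumes a hand-maintained scope map. `MODEL_SCOPE`, `scopedWhere` and the model names are illustrative. The tenant/branch split follows the examples given in this article: clients and pets per tenant, and inventory, cash register and commissions per branch.

```typescript
type Scope = 'tenant' | 'branch'
type ModelName = 'client' | 'pet' | 'stock' | 'cashRegister' | 'commission'

// Hypothetical scope map: which models are shared across a tenant and which are per branch.
const MODEL_SCOPE: Record<ModelName, Scope> = {
  client: 'tenant',
  pet: 'tenant',
  stock: 'branch',
  cashRegister: 'branch',
  commission: 'branch',
}

interface RequestContext {
  tenantId: string
  branchId?: string
}

// Builds a where clause that always carries tenantId, plus branchId for branch-scoped models.
// Scope keys are spread last so a caller-supplied filter cannot override them.
function scopedWhere(
  model: ModelName,
  ctx: RequestContext,
  where: Record<string, unknown> = {},
): Record<string, unknown> {
  if (!ctx.tenantId) throw new Error('tenantId is required for every query')
  if (MODEL_SCOPE[model] === 'tenant') return { ...where, tenantId: ctx.tenantId }
  if (!ctx.branchId) throw new Error(`branchId is required to query ${model}`)
  return { ...where, tenantId: ctx.tenantId, branchId: ctx.branchId }
}
```

With this shape, a pet lookup from any branch sees the whole tenant's pets, while a stock query without a branch fails loudly instead of silently mixing inventories.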
The most complex case is inter-branch stock transfers. The flow has four statuses (pending → approved or rejected → completed), 5 dedicated endpoints and an approval workflow. The source branch requests, the destination branch approves or rejects, and only an approved transfer can be completed, at which point inventory is moved atomically.
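The status flow above can be written down as a transition table so that invalid jumps, such as pending straight to completed, are rejected in one place. This is a minimal sketch: the status names follow the article, while `TRANSITIONS`, `canTransition` and `assertTransition` are hypothetical helpers.

```typescript
type TransferStatus = 'PENDING' | 'APPROVED' | 'REJECTED' | 'COMPLETED'

// Allowed next statuses for each current status; REJECTED and COMPLETED are terminal.
const TRANSITIONS: Record<TransferStatus, TransferStatus[]> = {
  PENDING: ['APPROVED', 'REJECTED'],
  APPROVED: ['COMPLETED'],
  REJECTED: [],
  COMPLETED: [],
}

function canTransition(from: TransferStatus, to: TransferStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1
}

// Throws when a workflow endpoint tries an invalid status change.
function assertTransition(from: TransferStatus, to: TransferStatus): void {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transfer transition: ${from} -> ${to}`)
  }
}
```

Each workflow endpoint (approve, reject, complete) would call `assertTransition` before touching the database, so the rule lives in one table instead of being repeated per handler.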
// Inter-branch stock transfer: atomic transaction
import { BadRequestException, NotFoundException } from '@nestjs/common'

async function completeTransfer(transferId: string, tenantId: string) {
  return prisma.$transaction(async (tx) => {
    const transfer = await tx.stockTransfer.findUnique({
      where: { id: transferId, tenantId },
      include: { items: true },
    })
    if (!transfer) {
      throw new NotFoundException('Transfer not found')
    }
    if (transfer.status !== 'APPROVED') {
      throw new BadRequestException('Transfer must be approved first')
    }
    for (const item of transfer.items) {
      // Deduct from source branch only if enough stock remains, so it never goes negative
      const deducted = await tx.stock.updateMany({
        where: { productId: item.productId, branchId: transfer.fromBranchId, tenantId, quantity: { gte: item.quantity } },
        data: { quantity: { decrement: item.quantity } },
      })
      if (deducted.count !== 1) {
        throw new BadRequestException('Insufficient stock at source branch')
      }
      // Add to destination branch
      await tx.stock.upsert({
        where: { productId_branchId: { productId: item.productId, branchId: transfer.toBranchId } },
        create: { productId: item.productId, branchId: transfer.toBranchId, tenantId, quantity: item.quantity },
        update: { quantity: { increment: item.quantity } },
      })
    }
    // Tenant filter on the write as well, per the v2 rules
    await tx.stockTransfer.update({
      where: { id: transferId, tenantId },
      data: { status: 'COMPLETED', completedAt: new Date() },
    })
  })
}
Challenge #4: fiscal invoice emission with two providers and fallback
NF-e (fiscal invoice) emission in Brazil is critical — if the invoice provider goes down, the clinic can't sell. We integrated two providers (Focus NFe and NFe.io) with automatic fallback. A v2 cron job processes the queue every 30 seconds.
- 11 endpoints dedicated to NF-e in the v2 module
- XML import — clinics can import invoices received from suppliers
- Resilient queue — if the primary provider fails, automatically tries the secondary
- Processing every 30s — avoids burst calls and respects provider rate limits
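The fallback in the list above can be sketched as trying providers in priority order and recording which one succeeded. This is a minimal illustration under assumptions: the `InvoiceProvider` interface, `emitWithFallback` and the result shape are hypothetical and do not reflect the Focus NFe or NFe.io APIs.

```typescript
// Hypothetical provider abstraction; real provider clients would be adapted to this shape.
interface InvoiceProvider {
  name: string
  emit(invoiceId: string): Promise<{ protocol: string }>
}

interface EmissionResult {
  provider: string
  protocol: string
  failures: string[] // providers that failed before one succeeded
}

// Tries each provider in order and returns on the first success.
// Throws only when every provider has failed.
async function emitWithFallback(invoiceId: string, providers: InvoiceProvider[]): Promise<EmissionResult> {
  const failures: string[] = []
  for (const provider of providers) {
    try {
      const { protocol } = await provider.emit(invoiceId)
      return { provider: provider.name, protocol, failures }
    } catch (err) {
      failures.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`)
    }
  }
  throw new Error(`All providers failed for invoice ${invoiceId}: ${failures.join('; ')}`)
}
```

In a queue-driven setup, a thrown error would leave the invoice queued for the next 30-second tick, and the `failures` list gives the alerting system something concrete to report even when the fallback succeeds.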
System scale
Some numbers showing the real complexity of the system:
- 95 models in Prisma (complex relational database)
- 91 pages in the frontend (86 functional)
- 22 feature modules in the frontend
- 41 Express controllers (v1) + 12 NestJS modules (v2)
- 14 external integration services
- 4 cron jobs in operation (WhatsApp follow-up, NF-e, reminders, analytics)
- 10 tools in the WhatsApp AI agent
Lessons learned
- Strangler fig works. Gradually migrating within the same process is safer than big bang. The key is having strict rules for new code and never relaxing them
- Self-hosted WhatsApp is fragile. Baileys solves it, but requires dedicated infra with persistent disk and monitoring. Starting over, I'd evaluate the official WhatsApp Business API for larger tenants
- Multi-branch is 3x more complex than multi-tenant. It's not just adding branch_id — there are different business rules per model. Inventory is per-branch, client is per-tenant, commission is per-branch, pet is per-tenant
- Fallback on critical integrations is not optional. The day Focus NFe went down for 2 hours and NFe.io took over without anyone noticing was the day the investment paid for itself
