Case Study: Vox Pet Digital - migrating a veterinary SaaS from Express to NestJS with zero downtime

Vox Pet Digital is a vertical SaaS for pet shops and veterinary clinics. This is not a small project: 95 Prisma models, 91 frontend pages, 22 feature modules, integrations with OpenAI, WhatsApp, Stripe, Mercado Pago, Asaas and NF-e, plus an active migration from Express to NestJS running in the same process. In this case study, I detail the four biggest technical challenges I faced.

The system

Vox Pet covers the complete cycle of a veterinary clinic: appointments, medical records, vaccines, prescriptions, hospitalizations, sales, cash register, inventory, commissions, invoices (NF-e) and automated customer service via WhatsApp. It's multi-tenant (each clinic is an isolated tenant) and multi-branch (a clinic chain shares data across locations with granular control).

Backend: Node.js — Express (v1) + NestJS (v2) coexisting in the same process
Frontend: Next.js 16 + React 19 with App Router, MUI v7 + shadcn/ui + Tailwind v4
Database: PostgreSQL 16 via Prisma (95 models, 93 with tenant_id, 76 with branch_id)
AI: OpenAI (GPT-4o-mini + embeddings + Whisper) + Baileys (self-hosted WhatsApp)
Payments: Stripe (SaaS subscriptions) + Mercado Pago + Asaas (BR payments)
Fiscal: Focus NFe + NFe.io (two providers with fallback)
Storage: Firebase Admin
Leads: Meta/Facebook integration

Challenge #1: gradual Express → NestJS migration in the same process

When I joined the project, the backend was an Express monolith with 41 controllers — no strong typing, no DTOs, no consistent validation. Rewriting everything at once was unfeasible: the system was in production with clinics depending on it daily.

The solution was the strangler fig pattern: Express and NestJS running in the same Node.js process. V2 routes live at /api/v2, while v1 continues working normally. Both share the same Prisma instance, auth system and tracer.

Rules for any v2 code are strict:

Strict TypeScript — no any, no // @ts-ignore
Thin controllers — business logic in services, controllers only route
DTOs with validation — every input goes through class-validator
Mandatory tenant_id — no query exists without tenant filter
Zero console.log — everything via structured logger with context
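
To make the "zero console.log" rule concrete, here is a minimal sketch of a logger that always carries tenant context. The names `createLogger`, `LogContext` and the JSON line format are illustrative assumptions, not the project's actual logger; the sink is injected so the sketch stays independent of any logging library.

```typescript
// Hypothetical structured logger: one JSON line per entry, always tagged with tenant context.
type LogLevel = 'info' | 'warn' | 'error'
type LogData = Record<string, unknown>

interface LogContext {
  tenantId: string
  branchId?: string
  requestId?: string
}

interface Logger {
  info(message: string, data?: LogData): void
  warn(message: string, data?: LogData): void
  error(message: string, data?: LogData): void
  // Derive a logger with extra context; the tenant itself cannot be changed.
  child(extra: Partial<Omit<LogContext, 'tenantId'>>): Logger
}

function createLogger(context: LogContext, sink: (line: string) => void): Logger {
  const write = (level: LogLevel, message: string, data: LogData = {}): void => {
    // Context is spread last so ad-hoc data can never overwrite tenantId or branchId.
    sink(JSON.stringify({ timestamp: new Date().toISOString(), level, message, ...data, ...context }))
  }
  return {
    info: (message, data) => write('info', message, data),
    warn: (message, data) => write('warn', message, data),
    error: (message, data) => write('error', message, data),
    child: (extra) => createLogger({ ...context, ...extra }, sink),
  }
}
```

A request handler would create one logger per request and pass it down to services, so every line can be filtered by tenant without anyone remembering to add the field by hand.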

So far, 12 modules have been migrated to v2, including pets (17 endpoints), hospitalizations (8), procedures (6), branches, stock-transfers (5 + approve/reject/complete workflow), reminders, fiscal/NF-e (11), imports (NF-e XML), public booking, and analytics. The rest (clients, sales, appointments, medical records, vaccines, cash register, WhatsApp, admin) still runs on v1 and is being migrated incrementally.

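// bootstrap.ts: Express and NestJS coexisting
import express from 'express'
import { NestFactory } from '@nestjs/core'
import { AppModule } from './app.module' // local import paths are illustrative
import { authMiddleware, v1Router } from './v1'

const PORT = process.env.PORT ?? 3000

async function bootstrap() {
  const expressApp = express()

  // v1 routes (legacy)
  expressApp.use('/api/v1', authMiddleware, v1Router)

  // v2 routes (NestJS)
  const nestApp = await NestFactory.create(AppModule)
  nestApp.setGlobalPrefix('api/v2')
  await nestApp.init() // registers the v2 routes without binding a second port
  expressApp.use(nestApp.getHttpAdapter().getInstance())

  // Shared: Prisma, auth, tracer
  expressApp.listen(PORT)
}

bootstrap()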

Challenge #2: 24/7 self-hosted WhatsApp + AI

WhatsApp customer service is one of Vox Pet's biggest differentiators. It's not a simple chatbot — it's an AI agent with 10 tools, RAG (business knowledge base), conversation memory, media processing (audio via Whisper) and automated follow-up.

The architecture:

Connection: Baileys (self-hosted WhatsApp Web) running on Railway with persistent disk to maintain the session
Orchestrator: receives the message, identifies the tenant, loads context (history + knowledge base) and decides which tool to use
10 available tools: schedule appointment, check availability, check price, recommend product, look up medical records, send reminder, among others
RAG: per-tenant knowledge base with OpenAI embeddings, semantic search to contextualize responses
Whisper: when the customer sends audio, automatically transcribes and processes as text
Memory: conversation history per customer, maintains context across messages
Follow-up: minute-by-minute cron checks conversations without response in the 15-240min window and sends automated follow-up
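
The follow-up rule above can be sketched as a pure function that the cron would call every minute. The `Conversation` shape and its field names are assumptions made for illustration, and "without response" is read here as "the customer has not replied to the bot's last message"; only the 15-240 minute window comes from the actual system.

```typescript
// Hypothetical conversation row; field names are assumptions, not the real schema.
interface Conversation {
  id: string
  tenantId: string
  lastBotMessageAt: Date
  lastCustomerMessageAt: Date | null
  followUpSent: boolean
}

const MIN_SILENCE_MS = 15 * 60_000 // follow up no earlier than 15 minutes of silence
const MAX_SILENCE_MS = 240 * 60_000 // and no later than 240 minutes

function needsFollowUp(conversation: Conversation, now: Date): boolean {
  if (conversation.followUpSent) return false // never nag twice
  const botAt = conversation.lastBotMessageAt.getTime()
  const customerAt = conversation.lastCustomerMessageAt?.getTime() ?? 0
  if (customerAt >= botAt) return false // the customer already answered
  const silence = now.getTime() - botAt
  return silence >= MIN_SILENCE_MS && silence <= MAX_SILENCE_MS
}

function selectFollowUps(conversations: Conversation[], now: Date): Conversation[] {
  return conversations.filter((conversation) => needsFollowUp(conversation, now))
}
```

Keeping the decision free of I/O means the window logic can be unit-tested with fixed timestamps, while the cron job only loads candidates and sends messages.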

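// Simplified orchestrator
// Assumes `openai` is a client from the official `openai` package (v4+) and that the
// helper functions used below are defined elsewhere in the codebase.
import OpenAI from 'openai'

const MAX_TOOL_ROUNDS = 3 // illustrative cap so a misbehaving model cannot loop forever

async function handleMessage(tenantId: string, message: WAMessage) {
  const tenant = await loadTenantConfig(tenantId)
  const history = await getConversationHistory(message.from, tenantId)

  // If audio, transcribe with Whisper first so the RAG search sees the spoken text
  const text = message.type === 'audio'
    ? await whisperTranscribe(message.media)
    : message.text
  const knowledge = await ragSearch(text, tenantId)

  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    { role: 'system', content: buildSystemPrompt(tenant, knowledge) },
    ...history,
    { role: 'user', content: text },
  ]
  const tools = getAvailableTools(tenant)
  const ask = async () =>
    (await openai.chat.completions.create({ model: 'gpt-4o-mini', messages, tools })).choices[0].message

  let reply = await ask()

  // Execute tool calls, if any, and hand the results back so the model can write the final answer
  for (let round = 0; reply.tool_calls?.length && round < MAX_TOOL_ROUNDS; round++) {
    messages.push(reply)
    for (const call of reply.tool_calls) {
      const result = await executeToolCall(call, tenantId)
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) })
    }
    reply = await ask()
  }

  const answer = reply.content
  if (!answer) return // nothing to send, e.g. the model stopped on a tool call

  await sendWhatsAppMessage(message.from, answer)
  await saveToHistory(message.from, tenantId, text, answer)
}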
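
The `ragSearch` step in the orchestrator boils down to ranking a tenant's knowledge chunks by similarity to the query embedding. Below is a minimal in-memory sketch of that ranking, assuming embeddings are already computed. `KnowledgeChunk` and `rankChunks` are hypothetical names, and the source does not say where the vectors are actually stored or searched.

```typescript
// Hypothetical knowledge-base entry: text plus its precomputed embedding.
interface KnowledgeChunk {
  tenantId: string
  text: string
  embedding: number[]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB)
  return denominator === 0 ? 0 : dot / denominator
}

function rankChunks(
  queryEmbedding: number[],
  chunks: KnowledgeChunk[],
  tenantId: string,
  topK = 3,
): KnowledgeChunk[] {
  return chunks
    .filter((chunk) => chunk.tenantId === tenantId) // tenant isolation comes before relevance
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.chunk)
}
```

Filtering by tenant before scoring is the important design choice: a chunk from another clinic must never reach the prompt, no matter how similar it is.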

The biggest challenge here wasn't the AI — it was reliability. Baileys reconnects on its own, but when Railway restarts the container, the session can be lost. We implemented persistent disk + health checks + alerting to ensure the bot never goes offline without notice.
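
The "never offline without notice" goal can be sketched as a small watchdog that tolerates brief reconnects but fails the health check and alerts once after a sustained outage. This is an illustration under assumptions: `createSessionWatchdog` is a hypothetical helper, the three connection states mirror the values Baileys reports in its connection updates, and timestamps are passed in explicitly to keep the logic testable.

```typescript
type ConnectionState = 'connecting' | 'open' | 'close'

// Hypothetical watchdog: feed it connection updates, poll it from the health endpoint.
function createSessionWatchdog(
  maxDowntimeMs: number,
  alert: (message: string) => void,
  startedAt: number,
) {
  let state: ConnectionState = 'connecting'
  let downSince: number | null = startedAt // not connected yet at boot
  let alerted = false

  return {
    onConnectionUpdate(next: ConnectionState, now: number): void {
      if (next === 'open') {
        downSince = null
        alerted = false
      } else if (downSince === null) {
        downSince = now // the outage starts at the first non-open state
      }
      state = next
    },

    // Returns false once the session has been down longer than the tolerance.
    isHealthy(now: number): boolean {
      if (downSince === null) return true
      const downFor = now - downSince
      if (downFor < maxDowntimeMs) return true // brief reconnects are tolerated
      if (!alerted) {
        alerted = true // alert once per outage, not once per poll
        alert(`WhatsApp session down for ${Math.round(downFor / 1000)}s (state: ${state})`)
      }
      return false
    },
  }
}
```

Wired to the platform's health check, a `false` result lets the host restart the container while the alert tells a human that the session may need to be paired again.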

Challenge #3: consistent multi-tenant + multi-branch

Multi-tenant in SaaS is common. Multi-branch within a tenant is another level of complexity. In Vox Pet, a clinic chain can have 3 branches sharing client and pet records, but each branch has its own inventory, cash register, schedule and commissions.

Of the 95 Prisma models:

93 models have tenant_id — total isolation between clinics
76 models have branch_id — per-branch isolation within the tenant
15 migrations to reach this structure without breaking existing data
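
One way to keep these two isolation levels consistent is a single helper that builds the `where` filter for every query. The sketch below is an assumption about how that could look, not the project's code: `scopedWhere`, `RequestContext` and the four-model scope map are illustrative, with the tenant/branch split taken from the rules described in this article.

```typescript
type Scope = 'tenant' | 'branch'
type ModelName = 'client' | 'pet' | 'stock' | 'commission'

// Illustrative subset: clients and pets are shared across the chain,
// stock and commissions belong to one branch.
const MODEL_SCOPE: Record<ModelName, Scope> = {
  client: 'tenant',
  pet: 'tenant',
  stock: 'branch',
  commission: 'branch',
}

interface RequestContext {
  tenantId: string
  branchId?: string
}

function scopedWhere<T extends object>(model: ModelName, ctx: RequestContext, where: T) {
  if (!ctx.tenantId) throw new Error('tenantId is required for every query')
  if (MODEL_SCOPE[model] === 'tenant') {
    // Scope fields are spread last so a caller-supplied filter cannot override them.
    return { ...where, tenantId: ctx.tenantId }
  }
  if (!ctx.branchId) throw new Error(`${model} is branch-scoped: branchId is required`)
  return { ...where, tenantId: ctx.tenantId, branchId: ctx.branchId }
}
```

For example, `scopedWhere('stock', { tenantId, branchId }, { productId })` yields a filter restricted to one branch, while the same call for `pet` ignores the branch and stays tenant-wide.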

The most complex case is inter-branch stock transfers. The flow has three stages (pending → approved/rejected → completed), backed by 5 dedicated endpoints and an approval workflow. The source branch requests, the destination branch approves or rejects, and only an approved transfer moves inventory, atomically.

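// Inter-branch stock transfer: atomic transaction
// Assumes `prisma` is the shared PrismaClient instance.
import { BadRequestException, NotFoundException } from '@nestjs/common'

async function completeTransfer(transferId: string, tenantId: string) {
  return prisma.$transaction(async (tx) => {
    const transfer = await tx.stockTransfer.findFirst({
      where: { id: transferId, tenantId },
      include: { items: true },
    })

    if (!transfer) {
      throw new NotFoundException('Transfer not found')
    }
    if (transfer.status !== 'APPROVED') {
      throw new BadRequestException('Transfer must be approved first')
    }

    for (const item of transfer.items) {
      // Deduct from source branch, but only if it still holds enough units
      const deducted = await tx.stock.updateMany({
        where: {
          productId: item.productId,
          branchId: transfer.fromBranchId,
          tenantId,
          quantity: { gte: item.quantity },
        },
        data: { quantity: { decrement: item.quantity } },
      })
      if (deducted.count !== 1) {
        throw new BadRequestException(`Insufficient stock for product ${item.productId}`)
      }
      // Add to destination branch
      await tx.stock.upsert({
        where: { productId_branchId: { productId: item.productId, branchId: transfer.toBranchId } },
        create: { productId: item.productId, branchId: transfer.toBranchId, tenantId, quantity: item.quantity },
        update: { quantity: { increment: item.quantity } },
      })
    }

    // Flip the status only if it is still APPROVED; a concurrent call then fails and rolls back
    const completed = await tx.stockTransfer.updateMany({
      where: { id: transferId, tenantId, status: 'APPROVED' },
      data: { status: 'COMPLETED', completedAt: new Date() },
    })
    if (completed.count !== 1) {
      throw new BadRequestException('Transfer was already processed')
    }
  })
}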
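
The approve/reject/complete workflow around that transaction can be modeled as an explicit transition table, so every endpoint rejects illegal moves the same way. This is a minimal sketch: the status names follow the ones used in the transfer code, while `TRANSITIONS` and `nextStatus` are hypothetical helpers.

```typescript
type TransferStatus = 'PENDING' | 'APPROVED' | 'REJECTED' | 'COMPLETED'
type TransferAction = 'approve' | 'reject' | 'complete'

// Each status lists the only actions it accepts; REJECTED and COMPLETED are terminal.
const TRANSITIONS: Record<TransferStatus, Partial<Record<TransferAction, TransferStatus>>> = {
  PENDING: { approve: 'APPROVED', reject: 'REJECTED' },
  APPROVED: { complete: 'COMPLETED' },
  REJECTED: {},
  COMPLETED: {},
}

function nextStatus(current: TransferStatus, action: TransferAction): TransferStatus {
  const next = TRANSITIONS[current][action]
  if (!next) {
    throw new Error(`Cannot ${action} a transfer in status ${current}`)
  }
  return next
}
```

With the table in one place, adding a state later (for example a cancellation) means editing data instead of hunting for `if` checks across five endpoints.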

Challenge #4: fiscal invoice emission with two providers and fallback

NF-e (fiscal invoice) emission in Brazil is critical — if the invoice provider goes down, the clinic can't sell. We integrated two providers (Focus NFe and NFe.io) with automatic fallback. A v2 cron job processes the queue every 30 seconds.

11 endpoints dedicated to NF-e in the v2 module
XML import — clinics can import invoices received from suppliers
Resilient queue — if the primary provider fails, automatically tries the secondary
Processing every 30s — avoids burst calls and respects provider rate limits
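
The fallback behavior can be sketched as a loop over providers in priority order. Everything named here (`NfeProvider`, `Invoice`, `EmissionResult`, `emitWithFallback`) is a hypothetical interface for illustration; the real Focus NFe and NFe.io clients are not shown in this article and their APIs are not assumed.

```typescript
interface Invoice {
  id: string
  tenantId: string
}

interface EmissionResult {
  provider: string
  protocol: string
}

// Hypothetical adapter that each real provider client would implement.
interface NfeProvider {
  name: string
  emit(invoice: Invoice): Promise<EmissionResult>
}

async function emitWithFallback(invoice: Invoice, providers: NfeProvider[]): Promise<EmissionResult> {
  const failures: string[] = []
  for (const provider of providers) {
    try {
      return await provider.emit(invoice) // first success wins
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error)
      failures.push(`${provider.name}: ${reason}`) // remember why, then try the next one
    }
  }
  throw new Error(`All NF-e providers failed for invoice ${invoice.id}: ${failures.join('; ')}`)
}
```

One caveat worth designing for: if the primary times out after it has already accepted the invoice, a blind retry on the secondary could emit it twice, so the queue should confirm the invoice's status with the first provider before falling back.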

System scale

Some numbers showing the real complexity of the system:

95 models in Prisma (complex relational database)
91 pages in the frontend (86 functional)
22 feature modules in the frontend
41 Express controllers (v1) + 12 NestJS modules (v2)
14 external integration services
4 cron jobs in operation (WhatsApp follow-up, NF-e, reminders, analytics)
10 tools in the WhatsApp AI agent

Lessons learned

Strangler fig works. Gradually migrating within the same process is safer than a big-bang rewrite. The key is having strict rules for new code and never relaxing them.
Self-hosted WhatsApp is fragile. Baileys solves it, but requires dedicated infra with persistent disk and monitoring. If I were starting over, I'd evaluate the official WhatsApp Business API for larger tenants.
Multi-branch is 3x more complex than multi-tenant. It's not just adding branch_id: business rules differ per model. Inventory is per-branch, client is per-tenant, commission is per-branch, pet is per-tenant.
Fallback on critical integrations is not optional. The day Focus NFe went down for 2 hours and NFe.io took over without anyone noticing was the day the investment paid for itself.

--- Want to learn more about the project? Visit the Vox Pet Digital dedicated page.

Vinicius Aguiar
