Vinicius Aguiar

Case Study: Vox Pet Digital — migrating a veterinary SaaS from Express to NestJS with zero downtime

Apr 16, 2026 · 14 min read

Vox Pet Digital is a vertical SaaS for pet shops and veterinary clinics. This is not a small project: 95 Prisma models, 91 frontend pages, 22 feature modules, integrations with OpenAI, WhatsApp, Stripe, Mercado Pago, Asaas and NF-e, plus an active migration from Express to NestJS running in the same process. In this case study, I'll detail the 4 biggest technical challenges I faced.

The system

Vox Pet covers the complete cycle of a veterinary clinic: appointments, medical records, vaccines, prescriptions, hospitalizations, sales, cash register, inventory, commissions, invoices (NF-e) and automated customer service via WhatsApp. It's multi-tenant (each clinic is an isolated tenant) and multi-branch (a clinic chain shares data across locations with granular control).

  • Backend: Node.js — Express (v1) + NestJS (v2) coexisting in the same process
  • Frontend: Next.js 16 + React 19 with App Router, MUI v7 + shadcn/ui + Tailwind v4
  • Database: PostgreSQL 16 via Prisma (95 models, 93 with tenant_id, 76 with branch_id)
  • AI: OpenAI (GPT-4o-mini + embeddings + Whisper) + Baileys (self-hosted WhatsApp)
  • Payments: Stripe (SaaS subscriptions) + Mercado Pago + Asaas (BR payments)
  • Fiscal: Focus NFe + NFe.io (two providers with fallback)
  • Storage: Firebase Admin
  • Leads: Meta/Facebook integration

Challenge #1: gradual Express → NestJS migration in the same process

When I joined the project, the backend was an Express monolith with 41 controllers — no strong typing, no DTOs, no consistent validation. Rewriting everything at once was unfeasible: the system was in production with clinics depending on it daily.

The solution was the strangler fig pattern: Express and NestJS running in the same Node.js process. V2 routes live at /api/v2, while v1 continues working normally. Both share the same Prisma instance, auth system and tracer.

Rules for any v2 code are strict:

  • Strict TypeScript — no any, no // @ts-ignore
  • Thin controllers — business logic in services, controllers only route
  • DTOs with validation — every input goes through class-validator
  • Mandatory tenant_id — no query exists without tenant filter
  • Zero console.log — everything via structured logger with context
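
The mandatory tenant filter is easier to keep when it is enforced by code rather than by convention. A minimal sketch, assuming a hypothetical `tenantScoped` helper (the name and shape are illustrative, not taken from the Vox Pet codebase) that a v2 service could use to build every `where` clause:

```typescript
// Hypothetical helper: builds a `where` object that always carries tenantId.
// Names are illustrative and not taken from the Vox Pet codebase.
type Where = Record<string, unknown>

function tenantScoped<T extends Where>(
  tenantId: string,
  where: T = {} as T,
): T & { tenantId: string } {
  if (tenantId.trim() === '') {
    throw new Error('tenantId is required for every query')
  }
  // Caller filters are spread first, so they can never override tenantId
  return { ...where, tenantId }
}

// Usage: the result would be passed as the `where` of a Prisma query
const petFilter = tenantScoped('clinic-123', { species: 'dog' })
```

Because the helper throws on an empty tenant, a query without a tenant filter fails loudly in development instead of silently leaking data across clinics.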

So far, 12 modules have been migrated to v2, including pets (17 endpoints), hospitalizations (8), procedures (6), branches, stock-transfers (5 + approve/reject/complete workflow), reminders, fiscal/NF-e (11), imports (NF-e XML), public booking, and analytics. The rest (clients, sales, appointments, medical records, vaccines, cash register, WhatsApp, admin) still runs on v1 and is being migrated incrementally.

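// bootstrap.ts: Express and NestJS coexisting in one process
import express from 'express'
import { NestFactory } from '@nestjs/core'
// AppModule, authMiddleware, v1Router and PORT come from the project's own modules

async function bootstrap() {
  const expressApp = express()

  // v1 routes (legacy)
  expressApp.use('/api/v1', authMiddleware, v1Router)

  // v2 routes (NestJS)
  const nestApp = await NestFactory.create(AppModule)
  nestApp.setGlobalPrefix('api/v2')
  // init() registers the Nest routes without opening a second listener
  await nestApp.init()
  const nestAdapter = nestApp.getHttpAdapter().getInstance()
  expressApp.use(nestAdapter)

  // Shared: Prisma, auth, tracer
  expressApp.listen(PORT)
}

void bootstrap()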

Challenge #2: 24/7 self-hosted WhatsApp + AI

WhatsApp customer service is one of Vox Pet's biggest differentiators. It's not a simple chatbot — it's an AI agent with 10 tools, RAG (business knowledge base), conversation memory, media processing (audio via Whisper) and automated follow-up.

The architecture:

  1. Connection: Baileys (self-hosted WhatsApp Web) running on Railway with persistent disk to maintain the session
  2. Orchestrator: receives the message, identifies the tenant, loads context (history + knowledge base) and decides which tool to use
  3. 10 available tools: schedule appointment, check availability, check price, recommend product, look up medical records, send reminder, among others
  4. RAG: per-tenant knowledge base with OpenAI embeddings, semantic search to contextualize responses
  5. Whisper: when the customer sends audio, automatically transcribes and processes as text
  6. Memory: conversation history per customer, maintains context across messages
  7. Follow-up: a cron job runs every minute, finds conversations that have had no response for 15 to 240 minutes, and sends an automated follow-up

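The follow-up step is the easiest one to get subtly wrong, because a job that runs every minute will message the same customer repeatedly unless each conversation is marked once handled. A minimal sketch of the selection logic, assuming "without response" means the customer has not replied since the bot's last message; the `Conversation` shape and its field names are hypothetical:

```typescript
// Hypothetical conversation shape; field names are assumptions for illustration.
interface Conversation {
  id: string
  lastBotMessageAt: Date
  customerReplied: boolean
  followUpSent: boolean
}

const MIN_WAIT_MS = 15 * 60 * 1000  // 15 minutes
const MAX_WAIT_MS = 240 * 60 * 1000 // 240 minutes

// True when the conversation has been silent for 15 to 240 minutes
// and has not already received a follow-up.
function needsFollowUp(c: Conversation, now: Date): boolean {
  if (c.customerReplied || c.followUpSent) return false
  const silentMs = now.getTime() - c.lastBotMessageAt.getTime()
  return silentMs >= MIN_WAIT_MS && silentMs <= MAX_WAIT_MS
}

// What each cron tick would run over the open conversations
function selectForFollowUp(conversations: Conversation[], now: Date): Conversation[] {
  return conversations.filter((c) => needsFollowUp(c, now))
}
```

The `followUpSent` flag is what keeps the per-minute job idempotent: once a follow-up goes out, the conversation drops out of the selection on the next tick.
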
// Simplified orchestrator
async function handleMessage(tenantId: string, message: WAMessage) {
  const tenant = await loadTenantConfig(tenantId)
  const history = await getConversationHistory(message.from, tenantId)

  // If audio, transcribe with Whisper first so RAG searches the spoken content
  const text = message.type === 'audio'
    ? await whisperTranscribe(message.media)
    : message.text
  const knowledge = await ragSearch(text, tenantId)

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: buildSystemPrompt(tenant, knowledge) },
      ...history,
      { role: 'user', content: text },
    ],
    tools: getAvailableTools(tenant),
  })
  const reply = completion.choices[0].message

  // Execute tool calls if any
  for (const call of reply.tool_calls ?? []) {
    await executeToolCall(call, tenantId)
  }

  // content is null when the model only requested tools; the full version
  // sends the tool results back to the model to get the final answer
  if (reply.content) {
    await sendWhatsAppMessage(message.from, reply.content)
    await saveToHistory(message.from, tenantId, text, reply.content)
  }
}
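
The `ragSearch` call above hides the semantic search step. As a rough illustration of what a per-tenant search involves, here is a sketch that ranks knowledge chunks by cosine similarity against a query embedding. It assumes the embeddings are already computed and held in memory; the source does not say where Vox Pet stores them, and a production system would more likely run this ranking in the database. `KnowledgeChunk` and `topKForTenant` are hypothetical names.

```typescript
// Hypothetical in-memory representation of one knowledge base entry
interface KnowledgeChunk {
  tenantId: string
  text: string
  embedding: number[]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Returns the k most similar chunks, restricted to a single tenant
function topKForTenant(
  chunks: KnowledgeChunk[],
  queryEmbedding: number[],
  tenantId: string,
  k = 3,
): KnowledgeChunk[] {
  return chunks
    .filter((c) => c.tenantId === tenantId) // tenant isolation comes first
    .map((c) => ({ chunk: c, score: cosineSimilarity(c.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk)
}
```

Filtering by tenant before scoring keeps one clinic's knowledge base from ever surfacing in another clinic's answers, which matters more here than the ranking itself.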

The biggest challenge here wasn't the AI — it was reliability. Baileys reconnects on its own, but when Railway restarts the container, the session can be lost. We implemented persistent disk + health checks + alerting to ensure the bot never goes offline without notice.
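
The alerting part of that setup comes down to one decision: how long a session may stay disconnected before someone is notified. A minimal sketch of that decision, assuming one WhatsApp session per clinic and a two-minute grace period; the `SessionHealth` shape, the state names and the threshold are all assumptions, not details from the project:

```typescript
// Illustrative state names; not the values emitted by Baileys
type ConnectionState = 'connected' | 'reconnecting' | 'disconnected'

interface SessionHealth {
  sessionId: string
  state: ConnectionState
  lastConnectedAt: Date
}

// Assumed grace period so that brief automatic reconnects do not page anyone
const ALERT_AFTER_MS = 2 * 60 * 1000

function shouldAlert(health: SessionHealth, now: Date): boolean {
  if (health.state === 'connected') return false
  return now.getTime() - health.lastConnectedAt.getTime() > ALERT_AFTER_MS
}

// What a periodic health check would run over all sessions
function sessionsToAlert(sessions: SessionHealth[], now: Date): string[] {
  return sessions.filter((s) => shouldAlert(s, now)).map((s) => s.sessionId)
}
```

The grace period is the design choice worth tuning: too short and every routine reconnect raises an alert, too long and a lost session goes unnoticed while customers wait.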

Challenge #3: consistent multi-tenant + multi-branch

Multi-tenant in SaaS is common. Multi-branch within a tenant is another level of complexity. In Vox Pet, a clinic chain can have 3 branches sharing client and pet records, but each branch has its own inventory, cash register, schedule and commissions.

Of the 95 Prisma models:

  • 93 models have tenant_id — total isolation between clinics
  • 76 models have branch_id — per-branch isolation within the tenant
  • 15 migrations to reach this structure without breaking existing data
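
The split between tenant-level and branch-level data can be made explicit in one place instead of being remembered per query. A minimal sketch, assuming a hypothetical scope map: the scopes follow the description above (clients and pets shared across branches; inventory, cash register, schedule and commissions per branch), while the model names and the `scopeFilter` helper are illustrative.

```typescript
type Scope = 'tenant' | 'branch'

// Hypothetical model names; scopes follow the rules described in the text
const MODEL_SCOPE = {
  client: 'tenant',
  pet: 'tenant',
  stock: 'branch',
  cashRegister: 'branch',
  appointment: 'branch',
  commission: 'branch',
} as const

type ModelName = keyof typeof MODEL_SCOPE

interface RequestContext {
  tenantId: string
  branchId: string
}

// Builds the isolation filter for a model: always tenantId, plus branchId
// only for models that are isolated per branch.
function scopeFilter(
  model: ModelName,
  ctx: RequestContext,
): { tenantId: string; branchId?: string } {
  const scope: Scope = MODEL_SCOPE[model]
  return scope === 'branch'
    ? { tenantId: ctx.tenantId, branchId: ctx.branchId }
    : { tenantId: ctx.tenantId }
}
```

With a table like this, adding a model forces an explicit decision about its scope, and the compiler rejects a call for a model that has no entry.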

The most complex case is inter-branch stock transfers. The flow has three stages (pending → approved/rejected → completed) covering four statuses, with 5 dedicated endpoints and an approval workflow. The source branch requests, the destination branch approves or rejects, and only after approval is inventory moved, atomically.
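
The approval workflow can be written down as a small transition table, which makes invalid moves (such as completing a rejected transfer) impossible to express by accident. A sketch under assumptions: `APPROVED` and `COMPLETED` appear in the project's code, while `PENDING`, `REJECTED` and the helper names are inferred from the description and may not match the real schema.

```typescript
type TransferStatus = 'PENDING' | 'APPROVED' | 'REJECTED' | 'COMPLETED'

// Allowed next statuses for each current status
const ALLOWED_TRANSITIONS: Record<TransferStatus, TransferStatus[]> = {
  PENDING: ['APPROVED', 'REJECTED'],
  APPROVED: ['COMPLETED'],
  REJECTED: [],  // terminal
  COMPLETED: [], // terminal
}

function canTransition(from: TransferStatus, to: TransferStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1
}

// Returns the new status or throws when the move is not allowed
function transition(from: TransferStatus, to: TransferStatus): TransferStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transfer transition: ${from} -> ${to}`)
  }
  return to
}
```

Each of the approve, reject and complete endpoints would call `transition` before touching the database, so the status check lives in one place.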

// Inter-branch stock transfer: atomic transaction
import { BadRequestException, NotFoundException } from '@nestjs/common'

async function completeTransfer(transferId: string, tenantId: string) {
  return prisma.$transaction(async (tx) => {
    const transfer = await tx.stockTransfer.findUnique({
      where: { id: transferId, tenantId },
      include: { items: true },
    })

    // findUnique returns null when the id does not exist for this tenant
    if (!transfer) {
      throw new NotFoundException('Transfer not found')
    }
    if (transfer.status !== 'APPROVED') {
      throw new BadRequestException('Transfer must be approved first')
    }

    for (const item of transfer.items) {
      // Deduct from source branch, but only if it holds enough stock
      const deducted = await tx.stock.updateMany({
        where: {
          productId: item.productId,
          branchId: transfer.fromBranchId,
          tenantId,
          quantity: { gte: item.quantity },
        },
        data: { quantity: { decrement: item.quantity } },
      })
      // Throwing here rolls back every change made in this transaction
      if (deducted.count === 0) {
        throw new BadRequestException(`Insufficient stock for product ${item.productId}`)
      }
      // Add to destination branch
      await tx.stock.upsert({
        where: { productId_branchId: { productId: item.productId, branchId: transfer.toBranchId } },
        create: { productId: item.productId, branchId: transfer.toBranchId, tenantId, quantity: item.quantity },
        update: { quantity: { increment: item.quantity } },
      })
    }

    // Keep the tenant filter on the final update as well
    await tx.stockTransfer.update({
      where: { id: transferId, tenantId },
      data: { status: 'COMPLETED', completedAt: new Date() },
    })
  })
}

Challenge #4: fiscal invoice issuance with two providers and fallback

NF-e (fiscal invoice) issuance in Brazil is critical: if the invoice provider goes down, the clinic can't sell. We integrated two providers (Focus NFe and NFe.io) with automatic fallback. A v2 cron job processes the queue every 30 seconds.

  • 11 endpoints dedicated to NF-e in the v2 module
  • XML import — clinics can import invoices received from suppliers
  • Resilient queue — if the primary provider fails, automatically tries the secondary
  • Processing every 30s — avoids burst calls and respects provider rate limits
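
The fallback behaviour described in the list above can be sketched as a loop over providers in priority order. This is an illustration under assumptions, not the project's implementation: the `FiscalProvider` interface, the `Invoice` and `IssueResult` shapes and the `issueWithFallback` name are hypothetical, and the real Focus NFe and NFe.io clients are not shown.

```typescript
// Hypothetical shapes for illustration only
interface Invoice {
  id: string
  amountCents: number
}

interface IssueResult {
  provider: string
  protocol: string
}

interface FiscalProvider {
  name: string
  issue(invoice: Invoice): Promise<IssueResult>
}

// Tries each provider in order and returns the first success.
// Throws only when every provider has failed.
async function issueWithFallback(
  providers: FiscalProvider[],
  invoice: Invoice,
): Promise<IssueResult> {
  const errors: string[] = []
  for (const provider of providers) {
    try {
      return await provider.issue(invoice)
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err)
      errors.push(`${provider.name}: ${reason}`)
    }
  }
  throw new Error(`All fiscal providers failed for invoice ${invoice.id}: ${errors.join('; ')}`)
}
```

One caveat any real version has to handle: a timeout does not prove the primary provider did nothing, so the queue should check the invoice's status with the primary before handing it to the secondary, otherwise the same sale could be invoiced twice.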

System scale

Some numbers showing the real complexity of the system:

  • 95 models in Prisma (complex relational database)
  • 91 pages in the frontend (86 functional)
  • 22 feature modules in the frontend
  • 41 Express controllers (v1) + 12 NestJS modules (v2)
  • 14 external integration services
  • 4 cron jobs in operation (WhatsApp follow-up, NF-e, reminders, analytics)
  • 10 tools in the WhatsApp AI agent

Lessons learned

  1. Strangler fig works. Gradually migrating within the same process is safer than big bang. The key is having strict rules for new code and never relaxing them
  2. Self-hosted WhatsApp is fragile. Baileys solves it, but requires dedicated infra with persistent disk and monitoring. Starting over, I'd evaluate the official WhatsApp Business API for larger tenants
  3. Multi-branch is 3x more complex than multi-tenant. It's not just adding branch_id — there are different business rules per model. Inventory is per-branch, client is per-tenant, commission is per-branch, pet is per-tenant
  4. Fallback on critical integrations is not optional. The day Focus NFe went down for 2 hours and NFe.io took over without anyone noticing was the day the investment paid for itself
