gRPC · Go · Microservices · Docker
gRPC Microservices with Go and Next.js: A Practical Guide
A deep dive into building production-ready microservices using gRPC, RabbitMQ, and Docker — based on real architecture from OmniAI.
February 28, 2026 · 12 min read
The Problem with REST at Scale
When I started building OmniAI, I used REST APIs between services. It worked fine for two services. By the time I had five — document ingestion, embedding, retrieval, chat, and auth — the inter-service communication was a mess of HTTP clients, inconsistent error handling, and no shared type contracts.
gRPC solves all three problems.
Why gRPC
- Protobuf contracts — both sides agree on the schema at compile time. No more "what does this field actually return?"
- Streaming — perfect for AI responses that stream token by token
- Performance — binary protocol, ~7x smaller payloads than JSON
Project Structure
omniai/
├── proto/ # .proto definitions shared across services
├── services/
│ ├── gateway/ # Next.js API routes → gRPC calls
│ ├── ingest/ # Go: document parsing + chunking
│ ├── embed/ # Go: vector embedding via OpenAI
│ └── chat/ # Go: RAG pipeline + streaming
└── docker-compose.yml
Defining the Contract
syntax = "proto3";

package chat;

service ChatService {
  rpc StreamChat(ChatRequest) returns (stream ChatChunk);
}

message ChatRequest {
  string session_id = 1;
  string message = 2;
  repeated string doc_ids = 3;
}

message ChatChunk {
  string content = 1;
  bool done = 2;
}
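On the Go side, the generated code hands the handler a stream to push chunks into. Here's a minimal sketch of the chat service's handler, assuming the proto above compiles to a package imported as chatpb; the import path and the hard-coded tokens are placeholders for the real RAG pipeline.

// services/chat/server.go — handler sketch; chatpb and the token source are stand-ins
package main

import (
	chatpb "omniai/proto/chat" // hypothetical path to the generated code
)

type chatServer struct {
	chatpb.UnimplementedChatServiceServer
}

func (s *chatServer) StreamChat(req *chatpb.ChatRequest, stream chatpb.ChatService_StreamChatServer) error {
	// In the real service these tokens come from the RAG pipeline for req.DocIds.
	tokens := []string{"Hello", ", ", "world"}
	for _, tok := range tokens {
		if err := stream.Send(&chatpb.ChatChunk{Content: tok}); err != nil {
			return err // the client disconnected or the stream broke
		}
	}
	// Final chunk flags completion so the gateway knows when to close its response.
	return stream.Send(&chatpb.ChatChunk{Done: true})
}

Each Send writes one chunk onto the HTTP/2 stream as soon as it's ready, which is exactly the shape you want for token-by-token AI output.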
Calling gRPC from Next.js
Browsers can't speak gRPC directly (the fetch API doesn't expose the HTTP/2 trailers gRPC relies on), so the Next.js gateway makes the gRPC call server-side and translates the stream into a plain streaming HTTP response:
// app/api/chat/route.ts — `client` is the gRPC client created once at module scope
export async function POST(req: Request) {
  const { message, sessionId } = await req.json();
  const stream = client.streamChat({ sessionId, message });
  const encoder = new TextEncoder();

  return new Response(
    new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) {
            controller.enqueue(encoder.encode(chunk.content));
            if (chunk.done) break; // server signals the end of the answer
          }
          controller.close();
        } catch (err) {
          controller.error(err);
        }
      },
    }),
    { headers: { "Content-Type": "text/event-stream" } }
  );
}
RabbitMQ for Async Work
Document ingestion is slow (parsing, chunking, embedding). We push it to a RabbitMQ queue so the user gets an immediate response and the work happens in the background.
// Publish an ingestion job to the default exchange, routed to ingest_queue
if err := ch.Publish("", "ingest_queue", false, false, amqp.Publishing{
	ContentType: "application/json",
	Body:        jobJSON,
}); err != nil {
	log.Printf("publish ingestion job: %v", err)
}
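On the other side of the queue, the ingest worker consumes jobs and acks them only after the work succeeds, so a crash mid-job just puts the message back. A sketch assuming the rabbitmq/amqp091-go client; IngestJob and processDocument are stand-ins for the real parsing, chunking, and embedding pipeline.

// services/ingest/worker.go — consumer sketch; job shape and handler are illustrative
package main

import (
	"encoding/json"
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

type IngestJob struct {
	DocID string `json:"doc_id"` // hypothetical shape of jobJSON
	Path  string `json:"path"`
}

func processDocument(job IngestJob) error {
	// parse, chunk, call the embed service — elided here
	return nil
}

func consume(ch *amqp.Channel) error {
	msgs, err := ch.Consume(
		"ingest_queue", // queue
		"",             // consumer tag (auto-generated)
		false,          // autoAck off: ack only after the work succeeds
		false, false, false, nil,
	)
	if err != nil {
		return err
	}

	for msg := range msgs {
		var job IngestJob
		if err := json.Unmarshal(msg.Body, &job); err != nil {
			msg.Nack(false, false) // malformed payload: drop, don't requeue
			continue
		}
		if err := processDocument(job); err != nil {
			log.Printf("ingest %s: %v", job.DocID, err)
			msg.Nack(false, true) // transient failure: requeue for a retry
			continue
		}
		msg.Ack(false)
	}
	return nil
}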
Lessons Learned
- Run a local proto registry — don't copy .proto files between services manually
- Add deadlines to every gRPC call — ctx, cancel := context.WithTimeout(ctx, 5*time.Second) (see the sketch below)
- Use gRPC health checks — Docker and Kubernetes can probe them natively
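To make the last two lessons concrete, here's a sketch of both: a deadline wrapped around the streaming call, and the standard gRPC health service registered on the server. It assumes the chatpb package generated from the proto above; the import path and health-check service name are illustrative.

// Deadline on an outbound call + standard health service registration (sketch)
package main

import (
	"context"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"

	chatpb "omniai/proto/chat" // hypothetical generated package
)

// Every outbound call gets a deadline so a stuck downstream service fails fast.
func streamWithDeadline(ctx context.Context, client chatpb.ChatServiceClient, req *chatpb.ChatRequest) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	stream, err := client.StreamChat(ctx, req)
	if err != nil {
		return err
	}
	for {
		chunk, err := stream.Recv()
		if err != nil {
			return err // io.EOF when the server is done
		}
		if chunk.Done {
			return nil
		}
		_ = chunk.Content // forward to the caller in real code
	}
}

// Register the standard health service so Docker/Kubernetes can probe the container.
func newServer() *grpc.Server {
	srv := grpc.NewServer()
	h := health.NewServer()
	healthpb.RegisterHealthServer(srv, h)
	h.SetServingStatus("chat.ChatService", healthpb.HealthCheckResponse_SERVING)
	return srv
}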