gRPC · Go · Microservices · Docker
gRPC Microservices with Go and Next.js: A Practical Guide
A deep dive into building production-ready microservices using gRPC, RabbitMQ, and Docker — based on real architecture from OmniAI.
February 28, 2026 · 12 min read
The Problem with REST at Scale
When I started building OmniAI, I used REST APIs between services. It worked fine for two services. By the time I had five — document ingestion, embedding, retrieval, chat, and auth — the inter-service communication was a mess of HTTP clients, inconsistent error handling, and no shared type contracts.
gRPC solves all three problems.
Why gRPC
- Protobuf contracts — both sides agree on the schema at compile time. No more "what does this field actually return?"
- Streaming — perfect for AI responses that stream token by token
- Performance — binary protocol, ~7x smaller payloads than JSON
Project Structure
omniai/
├── proto/ # .proto definitions shared across services
├── services/
│ ├── gateway/ # Next.js API routes → gRPC calls
│ ├── ingest/ # Go: document parsing + chunking
│ ├── embed/ # Go: vector embedding via OpenAI
│ └── chat/ # Go: RAG pipeline + streaming
└── docker-compose.yml
Defining the Contract
syntax = "proto3";

package chat;

service ChatService {
  rpc StreamChat(ChatRequest) returns (stream ChatChunk);
}

message ChatRequest {
  string session_id = 1;
  string message = 2;
  repeated string doc_ids = 3;
}

message ChatChunk {
  string content = 1;
  bool done = 2;
}
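On the Go side, the generated code hands the handler a stream to push chunks into. Here's a minimal sketch of the chat service's handler, assuming the proto above compiles to a package imported as chatpb; the import path and the hard-coded tokens are placeholders for the real RAG pipeline.

// services/chat/server.go — handler sketch; chatpb and the token source are stand-ins
package main

import (
	chatpb "omniai/proto/chat" // hypothetical path to the generated code
)

type chatServer struct {
	chatpb.UnimplementedChatServiceServer
}

func (s *chatServer) StreamChat(req *chatpb.ChatRequest, stream chatpb.ChatService_StreamChatServer) error {
	// In the real service these tokens come from the RAG pipeline for req.DocIds.
	tokens := []string{"Hello", ", ", "world"}
	for _, tok := range tokens {
		if err := stream.Send(&chatpb.ChatChunk{Content: tok}); err != nil {
			return err // the client disconnected or the stream broke
		}
	}
	// Final chunk flags completion so the gateway knows when to close its response.
	return stream.Send(&chatpb.ChatChunk{Done: true})
}

Each Send writes one chunk onto the HTTP/2 stream as soon as it's ready, which is exactly the shape you want for token-by-token AI output.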
Calling gRPC from Next.js
Browsers can't speak gRPC directly (the fetch API doesn't expose the HTTP/2 trailers gRPC relies on), so the Next.js gateway makes the gRPC call server-side and translates the stream into a plain streaming HTTP response:
// app/api/chat/route.ts — `client` is the gRPC client created once at module scope
export async function POST(req: Request) {
  const { message, sessionId } = await req.json();
  const stream = client.streamChat({ sessionId, message });
  const encoder = new TextEncoder();

  return new Response(
    new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) {
            controller.enqueue(encoder.encode(chunk.content));
            if (chunk.done) break; // server signals the end of the answer
          }
          controller.close();
        } catch (err) {
          controller.error(err);
        }
      },
    }),
    { headers: { "Content-Type": "text/event-stream" } }
  );
}
RabbitMQ for Async Work
Document ingestion is slow (parsing, chunking, embedding). We push it to a RabbitMQ queue so the user gets an immediate response and the work happens in the background.
// Publish an ingestion job to the default exchange, routed to ingest_queue
if err := ch.Publish("", "ingest_queue", false, false, amqp.Publishing{
	ContentType: "application/json",
	Body:        jobJSON,
}); err != nil {
	log.Printf("publish ingestion job: %v", err)
}
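On the other side of the queue, the ingest worker consumes jobs and acks them only after the work succeeds, so a crash mid-job just puts the message back. A sketch assuming the rabbitmq/amqp091-go client; IngestJob and processDocument are stand-ins for the real parsing, chunking, and embedding pipeline.

// services/ingest/worker.go — consumer sketch; job shape and handler are illustrative
package main

import (
	"encoding/json"
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

type IngestJob struct {
	DocID string `json:"doc_id"` // hypothetical shape of jobJSON
	Path  string `json:"path"`
}

func processDocument(job IngestJob) error {
	// parse, chunk, call the embed service — elided here
	return nil
}

func consume(ch *amqp.Channel) error {
	msgs, err := ch.Consume(
		"ingest_queue", // queue
		"",             // consumer tag (auto-generated)
		false,          // autoAck off: ack only after the work succeeds
		false, false, false, nil,
	)
	if err != nil {
		return err
	}

	for msg := range msgs {
		var job IngestJob
		if err := json.Unmarshal(msg.Body, &job); err != nil {
			msg.Nack(false, false) // malformed payload: drop, don't requeue
			continue
		}
		if err := processDocument(job); err != nil {
			log.Printf("ingest %s: %v", job.DocID, err)
			msg.Nack(false, true) // transient failure: requeue for a retry
			continue
		}
		msg.Ack(false)
	}
	return nil
}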
Lessons Learned
- Run a local proto registry — don't copy .proto files between services manually
- Add deadlines to every gRPC call — ctx, cancel := context.WithTimeout(ctx, 5*time.Second) (see the sketch below)
- Use gRPC health checks — Docker and Kubernetes can probe them natively
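To make the last two lessons concrete, here's a sketch of both: a deadline wrapped around the streaming call, and the standard gRPC health service registered on the server. It assumes the chatpb package generated from the proto above; the import path and health-check service name are illustrative.

// Deadline on an outbound call + standard health service registration (sketch)
package main

import (
	"context"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"

	chatpb "omniai/proto/chat" // hypothetical generated package
)

// Every outbound call gets a deadline so a stuck downstream service fails fast.
func streamWithDeadline(ctx context.Context, client chatpb.ChatServiceClient, req *chatpb.ChatRequest) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	stream, err := client.StreamChat(ctx, req)
	if err != nil {
		return err
	}
	for {
		chunk, err := stream.Recv()
		if err != nil {
			return err // io.EOF when the server is done
		}
		if chunk.Done {
			return nil
		}
		_ = chunk.Content // forward to the caller in real code
	}
}

// Register the standard health service so Docker/Kubernetes can probe the container.
func newServer() *grpc.Server {
	srv := grpc.NewServer()
	h := health.NewServer()
	healthpb.RegisterHealthServer(srv, h)
	h.SetServingStatus("chat.ChatService", healthpb.HealthCheckResponse_SERVING)
	return srv
}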