Connection Was Closed - When One Line of Code Breaks Everything
Six in the evening on December 24th. Instead of preparing for Christmas Eve celebrations, our team was staring at monitoring dashboards. Thousands of transactions were failing with the same message: "Connection was closed".
The system had been running stably through four months of integration. Occasionally similar errors appeared in UAT, but everyone assumed it was an infrastructure issue - restart and it's fixed. That night, we realized we were wrong.
The First Clue
Around midnight, after ruling out everything from Kubernetes to sidecars, one team member noticed something strange: "The first few requests don't fail. After some time, it starts failing."
This made me pause. If it were an infrastructure issue, errors would occur randomly or immediately. But the pattern "first request OK, subsequent requests fail" suggested something was accumulating.
Then another colleague, Hieu, nailed it at 12:55 AM: "If the server closes the connection, the client throws an exception immediately when submitting. Not after waiting."
Immediately. Not a timeout. This was the most important clue.
A Tale of Handshakes
To understand the problem, I need to tell you a story about how computers "talk" to each other.
Imagine calling someone on the phone. Before you can talk, you have to wait for them to pick up, say "Hello", then you respond. That's a handshake - a necessary ritual before exchanging information.
TCP connections work the same way. Every time a client wants to send a request to a server, they must "shake hands" first:
Client: "I want to connect" (SYN)
Server: "OK, I'm ready" (SYN-ACK)
Client: "Great, let's begin" (ACK)Sounds simple, but each handshake costs about 1.5 round trips across the network. With a server in Singapore and client in Vietnam, each round trip takes ~50ms. That means each handshake costs ~75ms.
Now imagine you have 100 requests. If each request needs its own handshake, you lose 7.5 seconds just... saying hello, before any actual work gets done.
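If you want to feel that cost yourself, here is a minimal sketch (plain java.net.Socket, with a placeholder host) that times nothing but the TCP connect step - the handshake:

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class HandshakeCost {
    public static void main(String[] args) throws Exception {
        String host = "example.com"; // placeholder: pick a host far away to see the latency
        int port = 80;
        int attempts = 10;
        long totalMs = 0;

        for (int i = 0; i < attempts; i++) {
            long start = System.nanoTime();
            try (Socket socket = new Socket()) {
                // connect() returns only after the TCP handshake has completed
                socket.connect(new InetSocketAddress(host, port), 5_000);
            }
            totalMs += (System.nanoTime() - start) / 1_000_000;
        }
        System.out.println("Average handshake: " + (totalMs / attempts) + " ms");
    }
}
```

Run it against a nearby server and a distant one, and the difference is exactly the round-trip tax described above.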
The Solution: Keep the Line Open
In 1997, the designers of HTTP/1.1 had a simple idea: Why hang up after each question?
Instead of:
- Call → Ask → Hang up
- Call → Ask → Hang up
- Call → Ask → Hang up
They proposed:
- Call → Ask → Ask again → Ask again → ... → Done, then hang up
This is HTTP Keep-Alive. One connection stays open for multiple requests. 100 requests now need only one handshake.
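In Vert.x - the framework in this story - keep-alive is already the default on the HTTP client; a minimal sketch (values illustrative, not from our actual config) just makes it explicit:

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpClient;
import io.vertx.core.http.HttpClientOptions;

public class KeepAliveClient {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        // Keep-alive is on by default; setting it explicitly documents the intent:
        // one TCP connection carries many HTTP/1.1 requests, one handshake total.
        HttpClientOptions options = new HttpClientOptions().setKeepAlive(true);

        HttpClient client = vertx.createHttpClient(options);
        // ... issue requests through `client`; they reuse the open connection.
    }
}
```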
Connection Pool - When Efficiency Meets Complexity
Engineers took it further. They thought: "If Keep-Alive is good, why not prepare multiple connections in advance?"
Like a restaurant that doesn't wait for customers to arrive before washing dishes. They prepare a cabinet of clean dishes. Customer arrives, grab a dish. Customer finishes, wash it, put it back. Next customer grabs it again.
Connection Pool works exactly the same:
- Startup: Pre-create 10-20 connections to server
- Need to send request: Grab 1 connection from "cabinet"
- Got response: Return connection to "cabinet"
- Next request: Use existing connection, no need to create new
Extremely efficient. But there's an implicit assumption everyone forgets: Dishes in the cabinet must be clean and intact.
What happens if someone secretly breaks a dish and puts it back? Next person grabs it... boom.
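Most clients let you size the "cabinet" and decide how long an idle "dish" may sit before it's thrown away. A hedged sketch with Vert.x client options (the numbers are illustrative, not what we ran in production):

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpClient;
import io.vertx.core.http.HttpClientOptions;

public class PooledClient {
    public static void main(String[] args) {
        HttpClientOptions options = new HttpClientOptions()
                .setKeepAlive(true)
                .setMaxPoolSize(20)        // at most 20 pooled connections per server
                .setKeepAliveTimeout(30);  // drop a connection after 30 seconds idle

        HttpClient client = Vertx.vertx().createHttpClient(options);
        // Each request borrows a connection from the pool and returns it when done.
    }
}
```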
The Silent Saboteur
Back to the debugging story. At 12:55 AM, when Hieu said "exception immediately", I started visualizing the problem.
If the client had sent a request and the server were merely slow or unresponsive, the client would wait until it timed out. But the client didn't wait - it threw the error immediately. That only happens when the connection was already dead.
I thought: our server is doing something that kills the connection. The client doesn't know, and puts the "dead" connection back in the pool. The next request grabs it - boom.
Hieu articulated it precisely: "Request A finishes processing then closes. Client doesn't know it's closed. Uses that connection for request B. Request B fails."
Right. Exactly. Now we just needed to find the code that was doing the closing.
The Culprit Revealed
1 AM. I grepped the codebase for close(). And I found it:
```java
HttpServerResponse response = request.response();
response.end(result);
response.close(); // ← There it is!
```
Delete response.close(), run the test. Fire 100 requests. 1,000 requests. 10,000 requests.
"Tested for 2 years, no errors yet" - I joked in the group chat.
But wait, why is `close()` harmful?
This is something many developers don't understand well. In Vert.x (and most HTTP frameworks), there are two easily confused methods:
response.end() - "I've finished sending the response"
- Server sends data to client
- But keeps connection open
- Ready for next request on the same connection
response.close() - "I want to close the line"
- Closes TCP connection immediately
- Client doesn't receive notification
- Connection in pool becomes a "corpse"
The developer who wrote that code probably thought: "Done with work, clean up". Logic seems right. But with HTTP Keep-Alive, end() is already "cleanup". Calling close() on top is sabotage.
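For reference, a minimal Vert.x server handler that does it right - end() is the only cleanup the happy path needs (port and payload are placeholders):

```java
import io.vertx.core.Vertx;

public class HelloServer {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        vertx.createHttpServer()
            .requestHandler(request -> {
                // end() flushes the response and, with keep-alive, leaves the
                // TCP connection open for the next request. No close() needed.
                request.response()
                       .putHeader("content-type", "text/plain")
                       .end("ok");
            })
            .listen(8080);
    }
}
```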
Sequence of Disaster
To help you visualize, here's what happens:
Request 1 - Everything goes smoothly:
- Client grabs a connection from the pool
- Sends request to server
- Server processes, returns response
- Server calls response.close() → silently kills the TCP connection
- Client doesn't notice, happily puts the "dead" connection back in the pool
Request 2 - Disaster strikes:
- Client grabs that same connection (thinking it's alive)
- Tries to send data through a dead line
- Boom - "Connection was closed"
It's like putting your phone in your pocket without realizing the other person already hung up. Next time you pick it up to continue talking... only silence answers back.
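To reproduce that sequence against such a buggy endpoint, a sketch like this (Vert.x 4's future-based WebClient, which needs the vertx-web-client module; host, port, and path are placeholders) forces both requests through a single-connection pool:

```java
import io.vertx.core.Vertx;
import io.vertx.ext.web.client.WebClient;
import io.vertx.ext.web.client.WebClientOptions;

public class ReuseRepro {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        WebClientOptions options = new WebClientOptions();
        options.setKeepAlive(true);
        options.setMaxPoolSize(1); // one connection, so request 2 must reuse it
        WebClient client = WebClient.create(vertx, options);

        // Request 1 succeeds - but the buggy server calls response.close() afterwards.
        client.get(8080, "localhost", "/api").send()
              .onSuccess(r1 -> {
                  // Request 2 tries to reuse the same pooled connection.
                  // Depending on timing, the pool may not have noticed the close yet,
                  // and the request fails with "Connection was closed".
                  client.get(8080, "localhost", "/api").send()
                        .onSuccess(r2 -> System.out.println("second request ok"))
                        .onFailure(err -> System.out.println("boom: " + err.getMessage()));
              });
    }
}
```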
Why Did This Bug Hide for 4 Months?
This question haunted me. Why didn't UAT catch it?
First, the first request always succeeds. A newly created connection hasn't been "killed" yet. If testers only test one request then stop, they'll never see the error.
Second, low traffic helps hide the bug. In UAT, there might be seconds or minutes between requests - enough time for idle connections to time out and be removed from the pool. In production, requests pour in and "dead" connections get reused immediately.
Third, restart "fixes" the bug. Each restart resets the pool, new connections are created. Bug disappears... temporarily. This made the team think it was "infrastructure issue, restart fixes it".
This is the most expensive lesson: Never treat "restart fixes it" as a solution. Every error has a cause. Rare errors can become disasters when traffic increases.
The Fix and Aftermath
The fix is simple - delete one line of code:
```java
// Before
response.end(result);
response.close(); // Delete this line

// After
response.end(result);
// Vert.x manages the connection lifecycle automatically
```
But the story doesn't end there. After finding the root cause, we grepped the entire codebase - and found the same pattern in many other services.
Those "Connection lost" errors that partners complained about? Probably same cause. Those times backend said "didn't receive the log"? Connection was closed before the request arrived.
One line of code. Many victims. A lesson for life.
What If You Really Need to Close the Connection?
There are rare cases where you genuinely want to close the connection - for example, a server preparing to shut down. The right way is to tell the client instead of closing abruptly:
```java
response.putHeader("Connection", "close"); // Announce first
response.end(data);
// DON'T call close() - the client will close when it sees the header
```
The Connection: close header tells the client: "This is the last response on this connection. Please close it." The client won't put the connection back in the pool, and will open a new one for the next request.
Conclusion
3 AM on Christmas Day, after 9 hours of debugging, we found the culprit: one line of code.
```java
response.close();
```
That night there was no party, no gifts. But the team had a lesson that money can't buy about HTTP Keep-Alive and Connection Pooling.
Hieu was right: "+1 interview experience."
And I learned: Never ignore "small" errors. Never think "restart fixes it". And don't "help" frameworks do things they already do better than you.
Merry Christmas. 🎄