When the client or server closes the communication it sends TCP FIN type packet, this is a normal socket closure. However, it is not uncommon that the client server communication abruptly ending without closing the end points properly by sending FIN packet, for example, when the client or server crashed, power down or a network error which prevents sending the closure packet to the other end. In that cases, the firewall will not know that the end-points will no longer use the opened channel. As a passive intermediary, it had no way to determine if the endpoints are still active. As is it not possible to maintain resources forever, and also, it is a security threat keeping a port open for undefined time. So, firewall imposes a BLACKOUT on those connections that stay idle for a predefined amount of time.
Initially FW were designed to protect the application servers, network and then to protect client/server connection. With these in mind, a time-out in terms of hours (1 hour is the default for most FW) is reasonable. With the advent of more complex security schemes, FW are not only between client and server, but also between different application servers ( intranet, demilitarized zone (DMZ) , and such) and database servers. So, the horizon of 1 hour idle time for communication between servers maybe not be appropriate.
Idle connections can be expected from an application server. There is the case of J2EE using pooled JDBC connections. The pool usually returns the first available connection to the requester, so the first connections of the pool list are the most likely to be active. The last one, which are at the end of the list, are only used at peek loads, and most of the time it will be inactive.
Other cases are the connections established from a HTTP Server, either SQL connections from mod_plsql, or AJP connections from mod_oc4j.
##### Resolving problems with connection idle time-out
### TCP KeepAlive
You can enable TCP KeepAlive option at the Operating System(OS) level. Once TCP keepalive option is enabled and configured, a small probe packet will be sent to the other end at every predefined in-activity interval and it expects an ACK from the other end. And ACK will be returned only when the other end is alive and is reachable. If ACK is not returned, then after some retry, the OS will close the end points and will release the resources allocated for that. The application which is listening on that particular socket will recieve the error, so that application can take necessary action upon receiving the error signal from the OS.
When a communication is blacked out by the firewall, the probe will not reach its other end, and then the OS will close the socket end points and the application will be notified of the exception.
Steps to configure TCP KeepAlive depends on a specific Operating Systems. You will have to refer the appropriate OS documentation for it.
It is common to enable TCP KeepAlive option at the server end. Because server is the one which holds many resources for a communication, it any communication is broken, then those resources at the server will be released than holding it for indefinite time. By default TCP KeepAlive is not enabled at the OS.
TCP KeepAlive is applicable for all network applications running on that particular Operating System.
### DCD for DataBase Servers
For database connections, one of the endpoints is a passive listener, either is a dedicated process or a dispatcher process. If the connection becomes blacked out , this backend will never know that client cannot send any more requests, and then will lock important resources as database sessions, locks , and at least , a file descriptor used for maintaining the socket.
A solution is to make this backend “not so” passive, using the DCD (dead connection detection) to figure out if the communication is still possible.
Simply, set in the $ORACLE_HOME/network/admin/sqlnet.ora, in the server side SQLNET.EXPIRE_TIME=10 (10 minutes, for example). With this parameter in place, after 10 minutes of inactivity, the server send a small 10 bytes probe packet to the client. If this packet is not acknowledge, the connection will be closed and the associated resources will be released.
There are two benefits with this DCD
1. If the SQLNET.EXPIRE_TIME is less than the FW connection idle time-out, then the firewall will consider this packet as activity, and the idle time-out (firewall blackout) will never happen until both the client and the server processes are alive.
2. If the SQLNET.EXPIRE_TIME (let’s say a little bit higher) than the FW idle limit, then , as soon as the blackout happens , the RDBMS will know and will close the connection.
The first case is recommended when the connection comes from another application server , and the second makes sense for client applications.
DCD works at the application level and also works on top of TCP/IP protocol. If you have set the SQLNET.EXPIRE_TIME=10 then do not expect that the connections will be closed exactly after 10 minutes of the blackout or network outage. The TCP timeout and TCP retransmission values also adds to this time.
Please note that some latest firewalls may not see DCD packets as a valid traffic, and thus the DCD may not be useful. In this case, firewall timeout should be increased or users should not leave the application idle for longer than the idle time out configured on the firewall.