Author Topic: Troubleshooting the "Out of socket memory" error  (Read 3536 times)

Aby

  • Guest
Troubleshooting the "Out of socket memory" error
« on: April 14, 2014, 06:54:54 am »

If the following error message occasionally gets written to the /var/log/messages file:


Code: [Select]
[root@host1 ~]# tail -f /var/log/messages
APR 14 15:05:39 ztm-n08 kernel: [12624150.315458] Out of socket memory

It usually means one of two things:

The server is running out of TCP memory
There are too many orphaned sockets on the system

To see how much memory the kernel is configured to dedicate to TCP, run:


Code: [Select]
[root@host1 ~]# cat /proc/sys/net/ipv4/tcp_mem
3480768 4641024 6961536

tcp_mem is a vector of three integers, all measured in memory pages: min, pressure and max.

min: below this number of pages, TCP does not regulate its memory consumption.
pressure: when the memory allocated to TCP exceeds this number of pages, the kernel starts to moderate memory consumption. It leaves this mode once consumption falls back below min.
max: the maximum number of pages that all TCP sockets together may use for queuing. When usage goes above this threshold, the kernel starts logging the "Out of socket memory" error.
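
Because the thresholds are expressed in pages, it can help to convert them to a more familiar unit. The following is a minimal sketch, not part of the original procedure; the function name pages_to_mib is made up for illustration, and the page size is taken from getconf unless passed explicitly.

```shell
# Convert a page count (as used by tcp_mem) to MiB.
# $1 = number of pages, $2 = page size in bytes (defaults to this machine's page size).
# pages_to_mib is a hypothetical helper name, not a standard tool.
pages_to_mib() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    # Divide the page size first to keep intermediate values small.
    echo $(( pages * (page_size / 1024) / 1024 ))
}

# Print the three tcp_mem thresholds in MiB, if the file exists on this host.
if [ -r /proc/sys/net/ipv4/tcp_mem ]; then
    read -r min pressure max < /proc/sys/net/ipv4/tcp_mem
    echo "min=$(pages_to_mib "$min") MiB pressure=$(pages_to_mib "$pressure") MiB max=$(pages_to_mib "$max") MiB"
fi
```

With 4096-byte pages, the 'max' value of 6961536 pages from the example works out to 27193 MiB, or roughly 26.5 GiB.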

Now let's compare the 'max' number with how much of that memory TCP actually uses:

Code: [Select]
[root@host1 ~]# cat /proc/net/sockstat
sockets: used 48476
TCP: inuse 174950 orphan 126800 tw 153787 alloc 174954 mem 102910
UDP: inuse 34 mem 3
UDPLITE: inuse 0
RAW: inuse 1
FRAG: inuse 3 memory 4968

The last value on the TCP line (mem 102910) is the number of pages currently allocated to TCP. In this example it is far below the maximum number of pages the kernel is willing to give to TCP (the 'max' value described above, 6961536), so we can rule out TCP memory as the cause of the error.
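
The same comparison can be scripted. This is only a sketch under the assumption that the two /proc files have the layout shown above; tcp_mem_usage is a hypothetical helper name, and both file paths can be overridden for testing.

```shell
# Print TCP memory usage as a percentage of the tcp_mem 'max' threshold.
# $1 = sockstat file, $2 = tcp_mem file (defaults are the real /proc paths).
# tcp_mem_usage is a hypothetical helper name.
tcp_mem_usage() {
    sockstat=${1:-/proc/net/sockstat}
    tcp_mem=${2:-/proc/sys/net/ipv4/tcp_mem}
    # Pull the value that follows the "mem" keyword on the TCP: line.
    used=$(awk '/^TCP:/ { for (i = 2; i < NF; i++) if ($i == "mem") print $(i + 1) }' "$sockstat")
    # The third field of tcp_mem is the 'max' threshold.
    max=$(awk '{ print $3 }' "$tcp_mem")
    awk -v u="$used" -v m="$max" 'BEGIN { printf "%d of %d pages (%.2f%%)\n", u, m, u * 100 / m }'
}

# Only query the live system when the files are present.
if [ -r /proc/net/sockstat ] && [ -r /proc/sys/net/ipv4/tcp_mem ]; then
    tcp_mem_usage
fi
```

For the sample values in this post, the function reports 102910 of 6961536 pages, which is about 1.48% of the limit.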
To check whether the server has too many orphan sockets, first look at the configured limit:

Code: [Select]
[root@host1 ~]# cat /proc/sys/net/ipv4/tcp_max_orphans
524288

An orphan socket is a socket that is no longer associated with a file descriptor. This usually happens after the application calls close(): no file descriptor references the socket any more, but it still exists in memory until TCP is done with it.

The tcp_max_orphans file holds the maximum number of TCP sockets not attached to any user file handle that the kernel will keep around. If this number is exceeded, orphaned connections are reset immediately and a warning is printed. This limit exists only to prevent simple DoS attacks. Each orphan socket eats up to 64K of unswappable memory.
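
To get a feel for what the 64K figure means in practice, here is a small worked calculation. It is a rough upper bound only, assuming every orphan really holds the full 64 KiB; orphan_mem_mib is a hypothetical helper name.

```shell
# Upper bound on memory pinned by orphaned sockets, in MiB.
# $1 = number of orphans; 64 KiB per orphan is the worst-case figure quoted above.
# orphan_mem_mib is a hypothetical helper name.
orphan_mem_mib() {
    echo $(( $1 * 64 / 1024 ))
}

orphan_mem_mib 126800   # orphan count from the sockstat sample: prints 7925
orphan_mem_mib 524288   # the tcp_max_orphans limit shown above: prints 32768
```

So the 126800 orphans in the sample could pin up to about 7.7 GiB, and a full 524288 orphans could pin up to 32 GiB.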
Now that we know what the limit of orphaned sockets on a system can be, let's see the current number of orphaned sockets:

Code: [Select]
[root@host1 ~]# cat /proc/net/sockstat
sockets: used 48476
TCP: inuse 174950 orphan 126800 tw 153787 alloc 174954 mem 102910
UDP: inuse 34 mem 3
UDPLITE: inuse 0
RAW: inuse 1
FRAG: inuse 3 memory 4968

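In this case the 'orphan 126800' field on the TCP line is the one we are interested in. If this number approaches or exceeds the value in tcp_max_orphans, this can be the reason for the "Out of socket memory" error. Here 126800 is well below the limit of 524288, but on a system where the limit is being hit, the fix is to raise tcp_max_orphans (the value below is only an illustration; pick one higher than your current limit):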

Code: [Select]
[root@host1 ~]# echo 400000 > /proc/sys/net/ipv4/tcp_max_orphans

One thing worth mentioning is that in certain cases the kernel penalizes some sockets by multiplying the number of orphans by 2x or 4x, artificially increasing the "score" of the "bad socket". This means the error can appear well before the raw orphan count reaches tcp_max_orphans.

To account for that, take the number of orphaned sockets during peak server utilization and multiply it by 4 to be safe. That should be the value you set in tcp_max_orphans.
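
The sizing rule above can be sketched as follows. This is an illustration only: suggest_max_orphans is a hypothetical helper name, the peak value is simply the orphan count from the sample output, and the script prints the sysctl command rather than running it.

```shell
# Suggest a tcp_max_orphans value: peak orphan count times a safety factor.
# $1 = peak orphan count, $2 = safety factor (defaults to 4, per the text above).
# suggest_max_orphans is a hypothetical helper name.
suggest_max_orphans() {
    echo $(( $1 * ${2:-4} ))
}

peak=126800                      # orphan count from the sockstat sample
limit=$(suggest_max_orphans "$peak")
# Print the command instead of running it, so it can be reviewed first.
echo "sysctl -w net.ipv4.tcp_max_orphans=$limit"
# prints: sysctl -w net.ipv4.tcp_max_orphans=507200
```

For the sample host this gives 507200, which is still below the configured 524288, so by this rule the limit would not need raising there. Both the echo into /proc and sysctl -w are lost on reboot; to keep the setting, add a net.ipv4.tcp_max_orphans line to /etc/sysctl.conf.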

In some cases, if the system handles many short-lived TCP connections, the number of orphaned sockets and of sockets in TIME_WAIT will be pretty big. To fix this situation you might need to decrease the TIME_WAIT timeout (MSL) and experiment with the tcp_tw_reuse / tcp_tw_recycle kernel tunables, as described in my other article on TCP Tuning. Note that tcp_tw_recycle is known to break connections from clients behind NAT and was removed in Linux 4.12.
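
Before changing any of these tunables, it may be worth checking how the sockets are actually distributed across TCP states. The sketch below assumes the ss utility from iproute2 is installed and that its first output column is the state; count_tcp_states is a hypothetical helper name.

```shell
# Count TCP sockets per state from `ss -tan` style input on stdin.
# count_tcp_states is a hypothetical helper name.
count_tcp_states() {
    # Skip the header row, tally the first column (State), print sorted by name.
    awk 'NR > 1 { c[$1]++ } END { for (s in c) print s, c[s] }' | sort
}

# Only query the live system when ss is available.
if command -v ss >/dev/null 2>&1; then
    ss -tan | count_tcp_states
fi
```

A large TIME-WAIT count here corresponds to the 'tw' field in /proc/net/sockstat, which the kernel tracks separately from the 'orphan' field.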