Archive for the ‘networks’ Category

I was just trying to find the answer for that question. Alrite..will update the post when I get the answer..


Netfilter is a subsystem in the Linux 2.4 kernel. Netfilter provides an generic and abstract interface to the standard routing code. This is currently used in Linux kernel for packet filtering, mangling, NAT(network address translation) and queuing packets to the userspace.I have been using netfilters in my final year project(in VoIP security) as a packet capture mechanism.Netfilter makes connection tracking possible through the use of various hooks in the kernel’s network code.

These hooks are places that kernel code, either statically built or in the form of a loadable module, can register functions to be called for specific network events. An example of such an event is the reception of a packet.

Linux , since 2.4,  supports hooks for IPv4 and IPv6. Netfilter defines five hooks for IPv4. The declaration of the symbols for these can be found in “linux/netfilter_ipv4.h”. These hooks are displayed in the table below:


Called at

NF_IP_PRE_ROUTING After sanity checks, before routing decisions.
NF_IP_LOCAL_IN After routing decisions if packet is for this host.
NF_IP_FORWARD If the packet is destined for another interface.
NF_IP_LOCAL_OUT Packets coming from local processes on their way out.
NF_IP_POST_ROUTING Just before outbound packets “hit the wire”.

The NF_IP_PRE_ROUTING hook, called as the first hook after a packet has been received, has been used for packet capture in this project. The following diagrams diagrammatically represents where exactly does each of these hooks operate at.

At any of these hooks we can define Callback functions which act as handler routines. These functions are invoked when the corresponding network event occurs. Now the fate of the packet is determined by the return code of the hook function. Various return codes available are:

Return Code


NF_DROP Discard the packet.
NF_ACCEPT Keep the packet.
NF_STOLEN Forget about the packet.
NF_QUEUE Queue packet for userspace.
NF_REPEAT Call this hook function again.

The NF_DROP return code means that this packet should be dropped completely and any resources allocated for it should be released. NF_ACCEPT tells Netfilter that so far the packet is still acceptable and that it should move to the next stage of the network stack. NF_STOLEN is an interesting one because it tells Netfilter to “forget” about the packet. What this tells Netfilter is that the hook function will take processing of this packet from here and that Netfilter should drop all processing of it. This does not mean, however, that resources for the packet are released. The packet and it’s respective sk_buff structure are still valid, it’s just that the hook function has taken ownership of the packet away from Netfilter. Unfortunately I’m not exactly clear on what NF_QUEUE really does so for now I won’t discuss it. The last return value, NF_REPEAT requests that Netfilter calls the hook function again. Obviously one must be careful using NF_REPEAT so as to avoid an endless loop.

I will try to add more details on the same with sample codes in my future posts.

PS: The post contains adopted contents. Content was actually prepared for the project documentation purpose.

ENOMEM error

Posted: April 9, 2009 in networks, OS
Tags: ,

ENOMEM is an OS error code , as defined in kern/include/kern/errno.h ,which is returned due to insufficient

memory. The name ENOMEM stands for Error NO MEMory. Its one of the error codes returned by the fork() call

which means no more storage space available.In connection with sockets they are raised when there isn’t enough

resources available to create a socket. The value of the error code is 12.

And also I have read that from the point of view of an application, ENOMEM is pretty diffiult to be handled constructively. One way to deal with it is to fail instantly and to release all the allocated resources as soon as possible,
avoiding operations requiring allocating new resources [Refer.].

Right now I am having this error when I try to send a large number of UDP packets to a receiver. It happens at the receiver end at a point and then there is a steep increase in the packet loss. Now I am trying to fix it and once its done I will update this post. And if someone can help me with the same, please do it by leaving a comment. And thanx in advance..

Linux network buffers

Posted: April 7, 2009 in networks, OS
Tags: ,


I borrowed this document from Mr.Harald Welte ( and since I found it very useful I
just thought of sharing this with you all..and more importantly i can always keep it as a note for myself.
So please dont sue me for copyright issues..ok.. 🙂


skbuffs are the buffers in which the linux kernel handles network packets. The packet is received by the network card, put into a skbuff and then passed to the network stack, which uses the skbuff all the time.

2.1 struct sk_buff

The struct sk_buff is defined in <linux/skbuff.h> as follows:


next buffer in list

previous buffer in list

list we are on

socket we belong to

timeval we arrived at

device we are leaving by

device we arrived at

transport layer header (tcp,udp,icmp,igmp,spx,raw)

network layer header (ip,ipv6,arp,ipx,raw)

link layer header


control buffer, used internally

length of actual data


FIXME: data moved to user and not MSG_PEEK

we are a clone

head may be cloned

packet class

driver fed us ip checksum

packet queuing priority

user count

packet protocol from driver

security level of packet

real size of the buffer

pointer to head of buffer

data head pointer

tail pointer

end pointer

destructor function


netfilter mark

netfilter internal caching info

associated connection, if any


traffic control index

2.2 skb support functions

There are a bunch of skb support functions provided by the sk_buff layer. I briefly describe the most important ones in this section.

allocation / free / copy / clone and expansion functions

struct sk_buff *alloc_skb(unsigned int size, int gfp_mask)

This function allocates a new skb. This is provided by the skb layer to initialize some privat data and do memory statistics. The returned buffer has no headroom and a tailroom of /size/ bytes.

void kfree_skb(struct sk_buff *skb)

Decrement the skb’s usage count by one and free the skb if no references left.

struct sk_buff *skb_get(struct sk_buff *skb)

Increments the skb’s usage count by one and returns a pointer to it.

struct sk_buff *skb_clone(struct sk_buff *skb, int gfp_mask)

This function clones a skb. Both copies share the packet data but have their own struct sk_buff. The new copy is not owned by any socket, reference count is 1.

struct sk_buff *skb_copy(const struct sk_buff *skb, int gfp_mask)

Makes a real copy of the skb, including packet data. This is needed, if You wish to modify the packet data. Reference count of the new skb is 1.

struct skb_copy_expand(const struct sk_buff *skb, int new_headroom, int new_tailroom, int gfp_mask)

Make a copy of the skb, including packet data. Additionally the new skb has a haedroom of /new_headroom/ bytes size and a tailroom of /new_tailroom/ bytes.

anciliary functions

int skb_cloned(struct sk_buff *skb)

Is the skb a clone?

int skb_shared(struct sk_Buff *skb)

Is this skb shared? (is the reference count > 1)?

operations on lists of skb’s

struct sk_buff *skb_peek(struct sk_buff_head *list_)

peek a skb from front of the list; does not remove skb from the list

struct sk_buff *skb_peek_tail(struct sk_buff_head *list_)

peek a skb from tail of the list; does not remove sk from the list

__u32 skb_queue_len(sk_buff_head *list_)

return the length of the given skb list

void skb_queue_head(struct sk_buff_head *list_, struct sk_buff *newsk)

enqueue a skb at the head of a given list

void skb_queue_tail(struct sk_buff_head *list_, struct sk_buff *newsk)

enqueue a skb at the end of a given list.

struct sk_buff *skb_dequeue(struct sk_buff_head *list_)

dequeue a skb from the head of the given list.

struct sk_buff *sbk_dequeue_tail(struct sk_buff_head *list_)

dequeue a skb from the tail of the given list

operations on skb data

unsigned char *skb_put(struct sk_buff *sbk, int len)

extends the data area of the skb. if the total size exceeds the size of the skb, the kernel will panic. A pointer to the first byte of new data is returned.

unsigned char *skb_push(struct sk_buff *skb, int len)

extends the data area of the skb. if the total size exceeds the size of the skb, the kernel will panic. A pointer to the first byte of new data is returned.

unsigned char *skb_pull(struct sk_buff *skb, int len)

remove data from the start of a buffer, returning the bytes to headroom. A pointr to the next data in the buffer is returned.

int skb_headroom(struct sk_buff *skb)

return the amount of bytes of free space at the head of skb

int skb_tailroom(struct sk_buff *skb)

return the amount of bytes of free space at the end of skb

struct sk_buff *skb_cow(struct sk_buff *skb, int headroom)

if the buffer passed lacks sufficient headroom or is a clone it is copied and additional headroom made available.

These are some usefull tweaks on the network performance and
works on kernel versions since 2.4.XX ..

1. Make sure that you have root privleges.

2. Type: sysctl -p | grep mem
This will display your current buffer settings. Save These! You may want to roll-back these changes

3. Type: sysctl -w net.core.rmem_max=8388608
This sets the max OS receive buffer size for all types of connections.

4. Type: sysctl -w net.core.wmem_max=8388608
This sets the max OS send buffer size for all types of connections.

5. Type: sysctl -w net.core.rmem_default=65536
This sets the default OS receive buffer size for all types of connections.

6. Type: sysctl -w net.core.wmem_default=65536
This sets the default OS send buffer size for all types of connections.

7. Type: sysctl -w net.ipv4.tcp_mem=’8388608 8388608 8388608′
TCP Autotuning setting. “The tcp_mem variable defines how the TCP stack should behave when it comes to memory usage. … The first value specified in the tcp_mem variable tells the kernel the low threshold. Below this point, the TCP stack do not bother at all about putting any pressure on the memory usage by different TCP sockets. … The second value tells the kernel at which point to start pressuring memory usage down. … The final value tells the kernel how many memory pages it may use maximally. If this value is reached, TCP streams and packets start getting dropped until we reach a lower memory usage again. This value includes all TCP sockets currently in use.”

8. Type: sysctl -w net.ipv4.tcp_rmem=’4096 87380 8388608′
TCP Autotuning setting. “The first value tells the kernel the minimum receive buffer for each TCP connection, and this buffer is always allocated to a TCP socket, even under high pressure on the system. … The second value specified tells the kernel the default receive buffer allocated for each TCP socket. This value overrides the /proc/sys/net/core/rmem_default value used by other protocols. … The third and last value specified in this variable specifies the maximum receive buffer that can be allocated for a TCP socket.”

9. Type: sysctl -w net.ipv4.tcp_wmem=’4096 65536 8388608′
TCP Autotuning setting. “This variable takes 3 different values which holds information on how much TCP sendbuffer memory space each TCP socket has to use. Every TCP socket has this much buffer space to use before the buffer is filled up. Each of the three values are used under different conditions. … The first value in this variable tells the minimum TCP send buffer space available for a single TCP socket. … The second value in the variable tells us the default buffer space allowed for a single TCP socket to use. … The third value tells the kernel the maximum TCP send buffer space.”

10. Type:sysctl -w net.ipv4.route.flush=1
This will enusre that immediatly subsequent connections use these values.

Quick Step
Cut and paste the following into a linux shell with root privleges:

sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
sysctl -w net.core.rmem_default=65536
sysctl -w net.core.wmem_default=65536
sysctl -w net.ipv4.tcp_rmem=’4096 87380 8388608′
sysctl -w net.ipv4.tcp_wmem=’4096 65536 8388608′
sysctl -w net.ipv4.tcp_mem=’8388608 8388608 8388608′
sysctl -w net.ipv4.route.flush=1