From: 520042475122-0001@t-online.de (Xuan Baldauf)
To: netfilter-devel@us5.samba.org, linux-kernel@vger.kernel.org
Subject: [patch] conntrack and skb
Date: Fri, 08 Dec 2000 23:05:42 +0100
Message-ID: <3A315B36.208ACE91@baldauf.org>

[-- Attachment #1: Type: text/plain, Size: 74 bytes --]

Resent patch, hope that it will be acknowledged or discussed.

Xuân. :o)


[-- Attachment #2: Type: message/rfc822, Size: 5082 bytes --]

From: Xuan Baldauf <xuan--netfilter-devel@baldauf.org>
To: netfilter@lists.samba.org
Subject: [patch] conntrack and skb
Date: Sun, 03 Dec 2000 01:56:19 +0100
Message-ID: <3A299A32.46B8FAC2@baldauf.org>

Hello,

I discovered a bug in netfilter and spent the last 4 days tracking it
down (I'm not a kernel hacker...):

Symptoms:
"Sometimes" the ip_conntrack module won't unload. rmmod or modprobe -r
hangs, cannot be killed, and eats all your CPU time (as top shows).

Analysis:
rmmod sometimes loops in ip_conntrack_cleanup() with

 i_see_dead_people:
  ip_ct_selective_cleanup(kill_all, NULL);
  if (atomic_read(&ip_conntrack_count) != 0) {
    schedule();
    goto i_see_dead_people;
  }

So why is ip_conntrack_count != 0? Discovering this is a very long story,
but in short: Every socket has a queue of skbs which are waiting to be
read. Every skb (call it network packet metadata if you want) can hold a
reference to a conntrack structure. Every live conntrack structure is
counted in ip_conntrack_count. When skbs are destroyed, their
potential references to conntrack structures are released. So what happens
if skbs are not destroyed? Gee, they do not release their conntrack
references, so the conntrack data is not freed, so the module cannot
clean up after itself. And when does this happen? It's rare, but it
happens exactly when a process has input data on its socket and does
not read it. This happened a couple of times for me, sometimes with
BIND, sometimes with smbfs.
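
To make the reference chain concrete, here is a small userspace model
of the situation described above. It is only a sketch, not kernel code:
the struct and function names (model_conntrack, model_skb, leak_demo,
and so on) are made up for illustration, and the endless cleanup loop
is replaced by a bounded one so that the example terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the post. */
struct model_conntrack { int use; };               /* per-entry reference count */
struct model_skb { struct model_conntrack *nfct; };/* packet metadata, optional ref */

int model_conntrack_count;                         /* models ip_conntrack_count */

void model_init(struct model_conntrack *ct)
{
    ct->use = 1;                 /* the reference held by the conntrack table */
    model_conntrack_count++;
}

void model_get(struct model_conntrack *ct)
{
    ct->use++;
}

void model_put(struct model_conntrack *ct)
{
    if (ct == NULL)
        return;
    assert(ct->use > 0);
    if (--ct->use == 0)          /* last reference gone: entry is destroyed */
        model_conntrack_count--;
}

/* Models the cleanup loop, bounded: returns 1 if the count reaches zero. */
int model_cleanup_finishes(struct model_conntrack *table[], size_t n, int max_passes)
{
    for (int pass = 0; pass < max_passes; pass++) {
        for (size_t i = 0; i < n; i++) {   /* stands in for kill_all */
            model_put(table[i]);
            table[i] = NULL;
        }
        if (model_conntrack_count == 0)
            return 1;
        /* the kernel loop would call schedule() here and try again */
    }
    return 0;
}

/* One conntrack, one skb parked in a receive queue. */
int leak_demo(int reader_drains_queue)
{
    struct model_conntrack ct;
    struct model_conntrack *table[1] = { &ct };
    struct model_skb queued = { NULL };

    model_conntrack_count = 0;
    model_init(&ct);
    model_get(&ct);
    queued.nfct = &ct;           /* skb holds its reference while queued */

    if (reader_drains_queue) {   /* reading the data destroys the skb */
        model_put(queued.nfct);
        queued.nfct = NULL;
    }
    return model_cleanup_finishes(table, 1, 1000);
}
```

In this model leak_demo(0) never sees the count reach zero, which
corresponds to rmmod spinning, while leak_demo(1) finishes on the first
pass because the reader released the skb's reference.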

Fix:
I looked for a global skb list to walk on module cleanup, but
found none. So I looked for a global "struct sock" list to walk
for its skb queues, but found none. So I looked for a global "struct
socket" list to walk for its "struct sock" members, but found none.
So I looked for a global file descriptor list and gave up, because,
according to the comments, shared file descriptors are not considered
fully race-safe. So what did I do? If I assume that the only place an skb
stays in the system for long is a socket receive queue, then the only way
it can get there is through "local delivery". So I clear the conntrack
reference early after local delivery (and assume that the packet has already
passed iptables, so conntrack references are not needed anymore):

(patch against linux-2.4.0-test10 with all iptables 1.1.2 patches
applied:)

--- linux/net/ipv4/ip_input.c.orig      Sun Dec  3 00:37:42 2000
+++ linux/net/ipv4/ip_input.c   Sun Dec  3 00:49:12 2000
@@ -225,6 +225,26 @@
        nf_debug_ip_local_deliver(skb);
 #endif /*CONFIG_NETFILTER_DEBUG*/

+#ifdef CONFIG_NETFILTER
+       /*
+               Free the reference from the skb to a possible conntrack early,
+               else an skb could stay for an arbitrary amount of time in the
+               socket-receive-queue (for hours!) and hence delay unloading of
+               the ip_conntrack module, which would lead to useless CPU time
+               consumption and a module state where the module is neither
+               unloadable nor loadable (since ip_conntrack_cleanup would still
+               be looping...).
+
+               Connection tracking is done in more early stages, so we can
+               free the conntrack:
+       */
+       nf_conntrack_put(skb->nfct);
+       skb->nfct = NULL;
+#ifdef CONFIG_NETFILTER_DEBUG
+       skb->nf_debug = 0;
+#endif /*CONFIG_NETFILTER_DEBUG*/
+#endif /*CONFIG_NETFILTER*/
+
        /* Free rx_dev before enqueueing to sockets */
        if (skb->rx_dev) {
                dev_put(skb->rx_dev);
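
The effect of the hunk above can also be checked in a tiny userspace
model. Again this is only a sketch under assumptions: the m_* names are
hypothetical, the "patched" flag stands in for applying or not applying
the patch, and the receive queue is a single pointer slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
struct m_conntrack { int use; };
struct m_skb { struct m_conntrack *nfct; };

int m_live;                      /* models ip_conntrack_count */

void m_put(struct m_conntrack *ct)
{
    if (ct == NULL)
        return;
    assert(ct->use > 0);
    if (--ct->use == 0)
        m_live--;
}

/* Models local delivery: with the patch, drop the ref before enqueueing. */
void m_local_deliver(struct m_skb *skb, struct m_skb **rx_slot, int patched)
{
    if (patched) {
        m_put(skb->nfct);
        skb->nfct = NULL;
    }
    *rx_slot = skb;              /* skb now waits for a reader, maybe for hours */
}

/* Returns the conntrack count left after module cleanup drops the table ref. */
int m_count_after_unload(int patched)
{
    struct m_conntrack ct = { 2 };   /* one ref from the table, one from the skb */
    struct m_skb skb = { &ct };
    struct m_skb *rxq = NULL;

    m_live = 1;
    m_local_deliver(&skb, &rxq, patched);
    m_put(&ct);                      /* cleanup releases the table's reference */
    return m_live;
}
```

Without the early release the model is left with one live conntrack
while the skb is still queued; with it the count drops to zero even
though nobody has read the socket.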


If you want to reproduce that case (before applying this patch):

Type the following: (I assume that you have killall, which kills all
processes of a given name)

modprobe ip_conntrack
( /usr/sbin/nc -l -p 1234|sleep 10000 ; echo output done ) &
cat /var/log/messages | ( /usr/sbin/nc localhost 1234; echo input done ) &
( rmmod ip_conntrack ; echo rmmod done ) &
# wait here
ps aux|grep rmmod
# wait here
killall nc

It might show as follows:

router|01:30:51|~> modprobe ip_conntrack
router|01:30:59|~> ( /usr/sbin/nc -l -p 1234|sleep 10000 ; echo output done ) &
[1] 14866
router|01:31:05|~> cat /var/log/messages | ( /usr/sbin/nc localhost 1234; echo input done ) &
[2] 14870
router|01:31:08|~> ( rmmod ip_conntrack ; echo rmmod done ) &
[3] 14872
router|01:31:10|~> ps aux|grep rmmod
root     14873 85.5  0.4  1144  348 pty/s5   R    01:31   0:01 rmmod ip_conntrack
root     14875  0.0  0.6  1288  492 pty/s5   S    01:31   0:00 grep rmmod
# rmmod will still run until nc is killed, which is definitely a bug
router|01:31:13|~> killall nc
 punt!
input done
[2]-  Done                    cat /var/log/messages | ( /usr/sbin/nc localhost 1234; echo input done )
router|01:31:21|~> rmmod done

[3]+  Done                    ( rmmod ip_conntrack; echo rmmod done )
router|01:31:22|~>



I hope I could help :-)

Xuân Baldauf.



