linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking)
       [not found] ` <200012181811.KAA05490@pizda.ninka.net>
@ 2000-12-18 19:14   ` Harald Welte
  2000-12-19 13:55     ` Tom Leete
  2000-12-19 14:12     ` David S. Miller
  0 siblings, 2 replies; 8+ messages in thread
From: Harald Welte @ 2000-12-18 19:14 UTC (permalink / raw)
  To: David S. Miller; +Cc: rusty, kuznet, netfilter-devel, linux-kernel

On Mon, Dec 18, 2000 at 10:11:14AM -0800, David S. Miller wrote:
>    From: Rusty Russell <rusty@linuxcare.com.au>
>    Date: Mon, 18 Dec 2000 14:15:52 +1100
> 
>    Alexey is right, locking is screwed (explains some reports of
>    occasional failure during rmmod).
> 
> Patch applied, thank you.
> 
>    I have a patch which compiles for non-linear skb mods to netfilter
>    (NAT uses linear packets still, but connection tracking and packet
>    filtering only linearize minimal requirements).  Waiting for
>    DaveM's solution to ip_ct_gather_frags()...
> 
> I feel it's best to just skb_clone() the skb arg to ip_defrag
> and this will close the whole thing, I think.

no. The clone()d skb will still have a skb->dev pointing to NULL.

The problem occurs only for locally-generated outgoing packets, which 
need to be fragmented:

- ip_build_xmit() discoveres it has to fragment
- ip_build_xmit_slow() generates fragments and calls
- NF_HOOK() for NF_IP_LOCAL_OUT
- ip_conntrack_local() is called, which in turn calls
- ip_conntrack_in(), which calls
- ip_ct_gather_frags(), which calls
- ip_defrag(), which calls 

[now we have two possible oops - caues]

a ip_find(), which calls
a ip_frag_create(), which initialises a timer with the function
a ip_expire(), which dereferences qp->iif

b ip_frag_queue(), which dereferences it in qp->iif= skb->dev->ifindex


as andi kleen pointed out:

> > Also is it sure that the backtrace involves ip_rcv ? A more likely
> > guess is that it happens during the IP_LOCAL_OUT hook, when skb->dev
> > isn't set yet, but conntrack already has to already reassemble fragments.

 
> Actually, I do not understand how current code could even have worked
> in the past.  Once the SKB is passed to ip_defrag, it is nobody's
> buisness to reference that SKB anymore.  This ip_defrag call is (from

mmh... we really don't do this. We use the return value of ip_defrag(),
which is what ip_frag_reasm() returns (== the new datagram consisting
out of all its fragments).

> Alexey, what have I missed?  I don't like the ip_fragment.c proposed
> fix for this reason, what netfilter is doing with ip_defrag here looks
> just wrong.

Well, my conclusion is:

- the defragmentation code in the ipv4 stack assumes that skb->dev points
  to a valid device. It does this primaryly to send icmp reassembly errors
  if a fragment reassembly timeout occurs

- netfilter wants to use the ip_defrag for defragmentation, not only for 
  incoming packets from the network at NF_IP_PRE_ROUTING, but also for
  locally-generated fragmented packets (at NF_IP_LOCAL_OUT). Unfortunately
  we don't have a valid skb->dev at this point. 

I don't think that there is any skb_clone()ing needed, nor this would solve
the problem. The solution is

a) netfilter conntrack (more exactly: ip_ct_gather_frags) sets skb->dev to
   something valid (skb->dst->dev). Dirty hack.

b) make ip_defrag(), ... aware of the case where skb->dev == NULL. Sounds
   like a good idea, since it is only one if(skb->dev) clause.

c) netfilter stops using ip_defrag() for this case. Bad idea, it had to
   reinvent the wheel :(

> David S. Miller
> davem@redhat.com

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org                http://www.gnumonks.org
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter  locking)
  2000-12-18 19:14   ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Harald Welte
@ 2000-12-19 13:55     ` Tom Leete
  2000-12-19 14:12     ` David S. Miller
  1 sibling, 0 replies; 8+ messages in thread
From: Tom Leete @ 2000-12-19 13:55 UTC (permalink / raw)
  To: Harald Welte
  Cc: David S. Miller, rusty, kuznet, netfilter-devel, linux-kernel

Harald Welte wrote:
> 
> On Mon, Dec 18, 2000 at 10:11:14AM -0800, David S. Miller wrote:
> >    From: Rusty Russell <rusty@linuxcare.com.au>
> >    Date: Mon, 18 Dec 2000 14:15:52 +1100
> >
> >    Alexey is right, locking is screwed (explains some reports of
> >    occasional failure during rmmod).
> >
> > Patch applied, thank you.
> >
> >    I have a patch which compiles for non-linear skb mods to netfilter
> >    (NAT uses linear packets still, but connection tracking and packet
> >    filtering only linearize minimal requirements).  Waiting for
> >    DaveM's solution to ip_ct_gather_frags()...
> >
> > I feel it's best to just skb_clone() the skb arg to ip_defrag
> > and this will close the whole thing, I think.
> 
> no. The clone()d skb will still have a skb->dev pointing to NULL.
> 
> The problem occurs only for locally-generated outgoing packets, which
> need to be fragmented:
> 
> - ip_build_xmit() discoveres it has to fragment
> - ip_build_xmit_slow() generates fragments and calls
> - NF_HOOK() for NF_IP_LOCAL_OUT
> - ip_conntrack_local() is called, which in turn calls
> - ip_conntrack_in(), which calls
> - ip_ct_gather_frags(), which calls
> - ip_defrag(), which calls
> 
> [now we have two possible oops - caues]
> 
> a ip_find(), which calls
> a ip_frag_create(), which initialises a timer with the function
> a ip_expire(), which dereferences qp->iif
> 
> b ip_frag_queue(), which dereferences it in qp->iif= skb->dev->ifindex
> 
> as andi kleen pointed out:
> 
> > > Also is it sure that the backtrace involves ip_rcv ? A more likely
> > > guess is that it happens during the IP_LOCAL_OUT hook, when skb->dev
> > > isn't set yet, but conntrack already has to already reassemble fragments.
> 
> 
> > Actually, I do not understand how current code could even have worked
> > in the past.  Once the SKB is passed to ip_defrag, it is nobody's
> > buisness to reference that SKB anymore.  This ip_defrag call is (from
> 
> mmh... we really don't do this. We use the return value of ip_defrag(),
> which is what ip_frag_reasm() returns (== the new datagram consisting
> out of all its fragments).
> 
> > Alexey, what have I missed?  I don't like the ip_fragment.c proposed
> > fix for this reason, what netfilter is doing with ip_defrag here looks
> > just wrong.
> 
> Well, my conclusion is:
> 
> - the defragmentation code in the ipv4 stack assumes that skb->dev points
>   to a valid device. It does this primaryly to send icmp reassembly errors
>   if a fragment reassembly timeout occurs
> 
> - netfilter wants to use the ip_defrag for defragmentation, not only for
>   incoming packets from the network at NF_IP_PRE_ROUTING, but also for
>   locally-generated fragmented packets (at NF_IP_LOCAL_OUT). Unfortunately
>   we don't have a valid skb->dev at this point.
> 
> I don't think that there is any skb_clone()ing needed, nor this would solve
> the problem. The solution is
> 
> a) netfilter conntrack (more exactly: ip_ct_gather_frags) sets skb->dev to
>    something valid (skb->dst->dev). Dirty hack.
> 
> b) make ip_defrag(), ... aware of the case where skb->dev == NULL. Sounds
>    like a good idea, since it is only one if(skb->dev) clause.
> 
> c) netfilter stops using ip_defrag() for this case. Bad idea, it had to
>    reinvent the wheel :(
> 
> - Harald Welte / laforge@gnumonks.org                http://www.gnumonks.org

I'm now testing this small patch based on your suggestion b). I have
netfilter running, nfs mounts are behaving well, and fragmented pings dont
kill the box. I made no attempt to resolve anything but the derefernce of a
netdevice *dev.= NULL

The patch only deals with the ip_defrag_queue path. I have not seen the
alternate one happen. It's only been up an hour, I'll put some more time on
it.

Cheers,
Tom

--- linux/net/ipv4/ip_fragment.c~	Tue Dec 12 06:56:29 2000
+++ linux/net/ipv4/ip_fragment.c	Tue Dec 19 07:29:53 2000
@@ -485,7 +485,8 @@
 	else
 		qp->fragments = skb;
 
-	qp->iif = skb->dev->ifindex;
+	if (skb->dev)
+	        qp->iif = skb->dev->ifindex;
 	skb->dev = NULL;
 	qp->meat += skb->len;
 	atomic_add(skb->truesize, &ip_frag_mem);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter  locking)
  2000-12-18 19:14   ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Harald Welte
  2000-12-19 13:55     ` Tom Leete
@ 2000-12-19 14:12     ` David S. Miller
  2000-12-19 14:54       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
  2000-12-20  5:59       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete
  1 sibling, 2 replies; 8+ messages in thread
From: David S. Miller @ 2000-12-19 14:12 UTC (permalink / raw)
  To: tleete; +Cc: laforge, rusty, kuznet, netfilter-devel, linux-kernel

   Date: Tue, 19 Dec 2000 08:55:35 -0500
   From: Tom Leete <tleete@mountain.net>

   The patch only deals with the ip_defrag_queue path. I have not seen the
   alternate one happen. It's only been up an hour, I'll put some more time on
   it.

   --- linux/net/ipv4/ip_fragment.c~	Tue Dec 12 06:56:29 2000
   +++ linux/net/ipv4/ip_fragment.c	Tue Dec 19 07:29:53 2000
   @@ -485,7 +485,8 @@

This is basically what is in my tree right now.  However, there was
one reporter who claimed that after this kind of change he still was
able to lockup/OOPS his machine by logging into X as a user who had
his home directory over NFS.  This was with netfilter enabled as well
so it has to be the same bug.

Later,
David S. Miller
davem@redhat.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter
  2000-12-19 14:12     ` David S. Miller
@ 2000-12-19 14:54       ` kuznet
  2000-12-20  1:35         ` David S. Miller
  2000-12-20  1:45         ` Mohammad A. Haque
  2000-12-20  5:59       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete
  1 sibling, 2 replies; 8+ messages in thread
From: kuznet @ 2000-12-19 14:54 UTC (permalink / raw)
  To: David S. Miller; +Cc: tleete, laforge, rusty, netfilter-devel, linux-kernel

Hello!

> able to lockup/OOPS his machine by logging into X as a user who had
> his home directory over NFS. 

I believe this report is to be ignored. It is fully meaningless.
X has nothing to do with NFS, NFS is with X, and defragmenter is
at least with one of them.

Alexey
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter
  2000-12-19 14:54       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
@ 2000-12-20  1:35         ` David S. Miller
  2000-12-20  2:47           ` Mohammad A. Haque
  2000-12-20  1:45         ` Mohammad A. Haque
  1 sibling, 1 reply; 8+ messages in thread
From: David S. Miller @ 2000-12-20  1:35 UTC (permalink / raw)
  To: mhaque; +Cc: kuznet, tleete, laforge, rusty, netfilter-devel, linux-kernel

   Date: Tue, 19 Dec 2000 20:45:14 -0500
   From: "Mohammad A. Haque" <mhaque@haque.net>

   how is this meaningless?

   This just confirms what I and others have found in test12 wrt to the
   netfilter issue.

Alexey is talking about test12/netfilter + our ip_fragment.c fix,
not vanilla test12.

Later,
David S. Miller
davem@redhat.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter
  2000-12-19 14:54       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
  2000-12-20  1:35         ` David S. Miller
@ 2000-12-20  1:45         ` Mohammad A. Haque
  1 sibling, 0 replies; 8+ messages in thread
From: Mohammad A. Haque @ 2000-12-20  1:45 UTC (permalink / raw)
  To: kuznet
  Cc: David S. Miller, tleete, laforge, rusty, netfilter-devel, linux-kernel

how is this meaningless?

This just confirms what I and others have found in test12 wrt to the
netfilter issue.

kuznet@ms2.inr.ac.ru wrote:
> 
> Hello!
> 
> > able to lockup/OOPS his machine by logging into X as a user who had
> > his home directory over NFS.
> 
> I believe this report is to be ignored. It is fully meaningless.
> X has nothing to do with NFS, NFS is with X, and defragmenter is
> at least with one of them.

-- 

=====================================================================
Mohammad A. Haque                              http://www.haque.net/ 
                                               mhaque@haque.net

  "Alcohol and calculus don't mix.             Project Lead
   Don't drink and derive." --Unknown          http://wm.themes.org/
                                               batmanppc@themes.org
=====================================================================
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter
  2000-12-20  1:35         ` David S. Miller
@ 2000-12-20  2:47           ` Mohammad A. Haque
  0 siblings, 0 replies; 8+ messages in thread
From: Mohammad A. Haque @ 2000-12-20  2:47 UTC (permalink / raw)
  To: David S. Miller
  Cc: kuznet, tleete, laforge, rusty, netfilter-devel, linux-kernel

Woops, my bad. These finally exams are getting to me.

Sorry.

"David S. Miller" wrote:
> Alexey is talking about test12/netfilter + our ip_fragment.c fix,
> not vanilla test12.

-- 

=====================================================================
Mohammad A. Haque                              http://www.haque.net/ 
                                               mhaque@haque.net

  "Alcohol and calculus don't mix.             Project Lead
   Don't drink and derive." --Unknown          http://wm.themes.org/
                                               batmanppc@themes.org
=====================================================================
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter  locking)
  2000-12-19 14:12     ` David S. Miller
  2000-12-19 14:54       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
@ 2000-12-20  5:59       ` Tom Leete
  1 sibling, 0 replies; 8+ messages in thread
From: Tom Leete @ 2000-12-20  5:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: laforge, rusty, kuznet, netfilter-devel, linux-kernel

"David S. Miller" wrote:
> This is basically what is in my tree right now.  However, there was
> one reporter who claimed that after this kind of change he still was
> able to lockup/OOPS his machine by logging into X as a user who had
> his home directory over NFS.  This was with netfilter enabled as well
> so it has to be the same bug.
> 
> Later,
> David S. Miller
> davem@redhat.com

2.4.0-test12+ip_fragment.c.patch is still up. I hadn't previously tried
client-side mounts on that box, just exports.

I tried to reproduce the problem with nfs home directory mount + X. I am
unable to generate the error -- it works for me. I think Alexey is right.
There may be either coincidence or confusion in the report.

Btw, I am able to compile a kernel over an nfs mount with this. Very handy
since the remote is much faster at it.

Cheers,
Tom
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2000-12-20  6:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <E147qmG-0001yB-00@halfway>
     [not found] ` <200012181811.KAA05490@pizda.ninka.net>
2000-12-18 19:14   ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Harald Welte
2000-12-19 13:55     ` Tom Leete
2000-12-19 14:12     ` David S. Miller
2000-12-19 14:54       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
2000-12-20  1:35         ` David S. Miller
2000-12-20  2:47           ` Mohammad A. Haque
2000-12-20  1:45         ` Mohammad A. Haque
2000-12-20  5:59       ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).