All of lore.kernel.org
 help / color / mirror / Atom feed
* crash with bridge and inconsistent handling of NETDEV_TX_OK
@ 2010-04-20 23:40 Mikulas Patocka
  2010-04-20 23:45 ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-20 23:40 UTC (permalink / raw)
  To: lists.linux-foundation.org, netdev; +Cc: kaber, davem

Hi

I got this crash on 2.6.34-rc4 when using it as a bridge. The crash is not 
reproducible. There are two interfaces, eth0 (sun-hme) and eth1 (tg3) and 
there is a bridge between them.

The crash was in inlined function skb_drop_list, inlined from 
skb_drop_fraglist from skb_release_data. The "list" pointer was 0x18.

This is backtrace:
skb_release_data+7c/e0
__kfree_skb+c/c0
dev_hard_start_xmit+288/3c0
sch_direct_xmit+12c/1c0
dev_queue_xmit+3fc/520
br_dev_queue_push_xmit+60/80
br_handle_frame_finish+100/1e00
br_handle_frame+168/240
netif_receive_skb+274/480
process_backlog+64/c0
net_rx_action+cc/180
__do_softirq+94/120

Settings for for eth0 and eth1 are:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

For br0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

(there is no netfilter compiled on that machine)

I'm curious --- this code path in dev_hard_start_xmit is taken only if GSO 
is used. From the trace, it can be seen that a packet was received by one 
nic and forwarded by the bridge to the other nic. How could GSO be used in 
this scenario?

---

Reviewing the code further, I found one very weird commit 
572a9d7b6fc7f20f573664063324c086be310c42 committed to 2.6.33. What it 
does, it changes the semantics of ndo_hard_start_xmit(). Prior to the 
patch, the meaning was --- return zero (NETDEV_TX_OK) --- the skb is 
consumed by the driver. Returns non-zero --- the skb is left owned by the 
caller. The patch changes it to return other flags in bits 4-7 and changes 
the consumed/returned logic.

The problem is that there is still plenty of code that compares it against 
NETDEV_TX_OK to find out if the skb was consumed.

I don't know if this caused my crash (it is not reproducible), but the 
patch is buggy and dangerous.

Handling of the return codes is inconsistent:
* most callers use dev_xmit_complete(int rc) { return rc < 
NET_XMIT_MASK(0x0f); } to test if the skb was consumed. This seems to be 
the method to be used intended by the developers. (why not <= 0x0f ?)

* dev_hard_start_xmit uses rc == NETDEV_TX_OK || rc & ~NETDEV_TX_MASK 
(~0xf0) to test if the skb is consumed. This is differs from 
dev_xmit_complete for some values.

net/core/netpoll.c : at two places, it compares the return value of 
ndo_start_xmit with NETDEV_TX_OK
net/sched/sch_teql.c : compares with NETDEV_TX_OK
drivers/net/wan/dlci.c : ignores the return value of ndo_start_xmit

--- the above inconsistencies lead to situations where both the caller and 
the callee think that they own the skb, and the result is memory 
corruption.


If we grep for places that assume that the packet was consumed if the 
return code is NETDEV_TX_OK, there are even more:
drivers/net/qla3xxx.c
drivers/net/chelsio/sge.c
drivers/net/wireless/hostap/hostap_80211_tx.c
drivers/net/wireless/ipw2x00/libipw_tx.c
drivers/net/sfc/selftest.c
net/mac80211/tx.c
--- Not all these cases are bugs, because sometimes the return value is 
self-generated by the driver. But it is just confusing and it may trigger 
bugs in the future.

---

I'd recommend to revert 572a9d7b6fc7f20f573664063324c086be310c42 because 
it made several bugs (the code that compared the return value against 
NETDEV_TX_OK was correct before and is buggy after the patch). 
Furthermore, there may be hidden bugs, if someone compares the return 
value with 0 and frees skb based on this comparison, it is impossible to 
find it with grep, yet changing the meaning of return values would make a 
bug here.

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-20 23:40 crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
@ 2010-04-20 23:45 ` David Miller
  2010-04-20 23:48   ` David Miller
  2010-04-20 23:57   ` Mikulas Patocka
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2010-04-20 23:45 UTC (permalink / raw)
  To: mpatocka; +Cc: lists.linux-foundation.org, netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 19:40:56 -0400 (EDT)

> Reviewing the code further, I found one very weird commit
> 572a9d7b6fc7f20f573664063324c086be310c42 committed to 2.6.33. What
> it does, it changes the semantics of ndo_hard_start_xmit(). Prior to
> the patch, the meaning was --- return zero (NETDEV_TX_OK) --- the
> skb is consumed by the driver. Returns non-zero --- the skb is left
> owned by the caller.  The patch changes it to return other flags in
> bits 4-7 and changes the consumed/returned logic.
> 
> The problem is that there is still plenty of code that compares it
> against NETDEV_TX_OK to find out if the skb was consumed.

Drivers are not supposed to return those new flag bits, the new flag
bits as return values exist only in the packet scheduler path.

We're not reverting the commit you mention, we're going to fix
whatever bug exists instead.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-20 23:45 ` David Miller
@ 2010-04-20 23:48   ` David Miller
  2010-04-20 23:57   ` Mikulas Patocka
  1 sibling, 0 replies; 21+ messages in thread
From: David Miller @ 2010-04-20 23:48 UTC (permalink / raw)
  To: mpatocka; +Cc: lists.linux-foundation.org, netdev, kaber

From: David Miller <davem@davemloft.net>
Date: Tue, 20 Apr 2010 16:45:05 -0700 (PDT)

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 19:40:56 -0400 (EDT)
> 
>> Reviewing the code further, I found one very weird commit
>> 572a9d7b6fc7f20f573664063324c086be310c42 committed to 2.6.33. What
>> it does, it changes the semantics of ndo_hard_start_xmit(). Prior to
>> the patch, the meaning was --- return zero (NETDEV_TX_OK) --- the
>> skb is consumed by the driver. Returns non-zero --- the skb is left
>> owned by the caller.  The patch changes it to return other flags in
>> bits 4-7 and changes the consumed/returned logic.
>> 
>> The problem is that there is still plenty of code that compares it
>> against NETDEV_TX_OK to find out if the skb was consumed.
> 
> Drivers are not supposed to return those new flag bits, the new flag
> bits as return values exist only in the packet scheduler path.

And BTW, NETDEV_TX_OK is only ever returned by itself, the
flag bits only get set when a non-NETDEV_TX_OK value is returned.

So we really haven't changed semantics at all, NETDEV_TX_OK (which is
zero) and non-zero are the two valid return value cases.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-20 23:45 ` David Miller
  2010-04-20 23:48   ` David Miller
@ 2010-04-20 23:57   ` Mikulas Patocka
  2010-04-21  0:00     ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-20 23:57 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber

On Tue, 20 Apr 2010, David Miller wrote:

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 19:40:56 -0400 (EDT)
> 
> > Reviewing the code further, I found one very weird commit
> > 572a9d7b6fc7f20f573664063324c086be310c42 committed to 2.6.33. What
> > it does, it changes the semantics of ndo_hard_start_xmit(). Prior to
> > the patch, the meaning was --- return zero (NETDEV_TX_OK) --- the
> > skb is consumed by the driver. Returns non-zero --- the skb is left
> > owned by the caller.  The patch changes it to return other flags in
> > bits 4-7 and changes the consumed/returned logic.
> > 
> > The problem is that there is still plenty of code that compares it
> > against NETDEV_TX_OK to find out if the skb was consumed.
> 
> Drivers are not supposed to return those new flag bits, the new flag
> bits as return values exist only in the packet scheduler path.

If they are not supposed to return it, why do you test it in 
dev_hard_start_xmit?

Define clearly what the return status is allowed (and maybe enforce it 
with BUG()s) and follow that definition.

Currently, there is no specification and the code is unintelligible crap: 
it first masks the value with ~0xf0 to test if the skb is consumed 
(dev_hard_start_xmit) and then compares it with 0x0f for the same test 
(dev_queue_xmit).

> We're not reverting the commit you mention, we're going to fix
> whatever bug exists instead.

So have a fun searching for them :-/

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-20 23:57   ` Mikulas Patocka
@ 2010-04-21  0:00     ` David Miller
  2010-04-21  0:12       ` Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  0:00 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 19:57:45 -0400 (EDT)

>> We're not reverting the commit you mention, we're going to fix
>> whatever bug exists instead.
> 
> So have a fun searching for them :-/

When you report a problem and your knee-jerk reaction is "revert this
commit", expect some push back.

We'll instead analyze this bug and take the time to figure out the
most appropriate course of action.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  0:00     ` David Miller
@ 2010-04-21  0:12       ` Mikulas Patocka
  2010-04-21  0:15         ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  0:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber



On Tue, 20 Apr 2010, David Miller wrote:

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 19:57:45 -0400 (EDT)
> 
> >> We're not reverting the commit you mention, we're going to fix
> >> whatever bug exists instead.
> > 
> > So have a fun searching for them :-/
> 
> When you report a problem and your knee-jerk reaction is "revert this
> commit", expect some push back.

I tried to analyze skb handling and couldn't check correctness of this 
commit. I really think it should be reverted and applied later, when the 
author does it correctly (for example, one bit in the return value to 
indicate consumed/returned skb, other unimportant informational bits).

> We'll instead analyze this bug and take the time to figure out the
> most appropriate course of action.

And do you have any idea what caused it? Why is it using GSO on bridging?

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  0:12       ` Mikulas Patocka
@ 2010-04-21  0:15         ` David Miller
  2010-04-21  0:23           ` Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  0:15 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 20:12:45 -0400 (EDT)

> Why is it using GSO on bridging?

Unlike LRO, GRO and GSO are completely valid in bridging and
routing situations.

In fact, in virtualization environments it is essential for
good performance.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  0:15         ` David Miller
@ 2010-04-21  0:23           ` Mikulas Patocka
  2010-04-21  1:02             ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  0:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber



On Tue, 20 Apr 2010, David Miller wrote:

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 20:12:45 -0400 (EDT)
> 
> > Why is it using GSO on bridging?
> 
> Unlike LRO, GRO and GSO are completely valid in bridging and
> routing situations.
> 
> In fact, in virtualization environments it is essential for
> good performance.

I know it may be used for bridging. But it doesn't explain how it happened 
in my case.

I have two NICs, each with 1500 MTU. The stack trace indicates that a 
packet was received by one NIC and bridged to the other. The stack trace 
also indicates that it went through GSO code path. The question is why? 
How could a forwarded packet be so large to use GSO?

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  0:23           ` Mikulas Patocka
@ 2010-04-21  1:02             ` David Miller
  2010-04-21  1:10               ` Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  1:02 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 20:23:57 -0400 (EDT)

> I have two NICs, each with 1500 MTU. The stack trace indicates that a 
> packet was received by one NIC and bridged to the other. The stack trace 
> also indicates that it went through GSO code path. The question is why? 
> How could a forwarded packet be so large to use GSO?

GRO automatically accumulates packets together, accumulating them into
larger super-packets.  This is done regardless of input device feeding
it.

Can you understand this now?  In software, we accumulate all incoming
packets for a TCP stream into larger super-packets.  Just because it's
a bridging scenerio doesn't mean we disable GRO.

These super-packets are being bridged, then forwarded out your
destination device and GSO has to de-segment these GRO frames.

GRO is done unconditionally, all the time, for all packets.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:02             ` David Miller
@ 2010-04-21  1:10               ` Mikulas Patocka
  2010-04-21  1:14                 ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  1:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber

On Tue, 20 Apr 2010, David Miller wrote:

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 20:23:57 -0400 (EDT)
> 
> > I have two NICs, each with 1500 MTU. The stack trace indicates that a 
> > packet was received by one NIC and bridged to the other. The stack trace 
> > also indicates that it went through GSO code path. The question is why? 
> > How could a forwarded packet be so large to use GSO?
> 
> GRO automatically accumulates packets together, accumulating them into
> larger super-packets.  This is done regardless of input device feeding
> it.
> 
> Can you understand this now?  In software, we accumulate all incoming
> packets for a TCP stream into larger super-packets.  Just because it's
> a bridging scenerio doesn't mean we disable GRO.
> 
> These super-packets are being bridged, then forwarded out your
> destination device and GSO has to de-segment these GRO frames.
> 
> GRO is done unconditionally, all the time, for all packets.

I see, but GRO is turned off on my interfaces, according to ethtool.

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:10               ` Mikulas Patocka
@ 2010-04-21  1:14                 ` David Miller
  2010-04-21  1:16                   ` David Miller
  2010-04-21  1:20                   ` crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2010-04-21  1:14 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)

> I see, but GRO is turned off on my interfaces, according to ethtool.

GRO is just a flag bit, so it's possible that if your kernel is too
old ethtool will always show that it's off.

If you haven't turned off GRO explicitly, then it's a good bet that
this is why it looks like it's off.  And GRO is on by default.

I still contend, therefore, that it's completely normal to see GSO
packet processing in the TX code path, even with bridging, and
therefore seeing the GSO code path get taken is not indicative of some
bug wrt. ->ndo_start_xmit() return value flag bit handling.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:14                 ` David Miller
@ 2010-04-21  1:16                   ` David Miller
  2010-04-21  1:21                     ` Mikulas Patocka
  2010-04-21  1:20                   ` crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
  1 sibling, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  1:16 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: David Miller <davem@davemloft.net>
Date: Tue, 20 Apr 2010 18:14:34 -0700 (PDT)

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)
> 
>> I see, but GRO is turned off on my interfaces, according to ethtool.
> 
> GRO is just a flag bit, so it's possible that if your kernel is too
> old ethtool will always show that it's off.

Actually, looking back at your original report, are you confusing
"large-receive-offload" as reported by ethtool with GRO?

They are completely seperate things.

"large-receive-offload" is LRO, whereas GRO is something done
in software and something entirely different.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:14                 ` David Miller
  2010-04-21  1:16                   ` David Miller
@ 2010-04-21  1:20                   ` Mikulas Patocka
  2010-04-21  1:22                     ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  1:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber



On Tue, 20 Apr 2010, David Miller wrote:

> From: Mikulas Patocka <mpatocka@redhat.com>
> Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)
> 
> > I see, but GRO is turned off on my interfaces, according to ethtool.
> 
> GRO is just a flag bit, so it's possible that if your kernel is too
> old ethtool will always show that it's off.
> 
> If you haven't turned off GRO explicitly, then it's a good bet that
> this is why it looks like it's off.  And GRO is on by default.

I have kernel 2.6.34-rc4, ethtool 2.6.33 and GRO is off. I haven't turned 
it off, I left it on default.

> I still contend, therefore, that it's completely normal to see GSO
> packet processing in the TX code path, even with bridging, and
> therefore seeing the GSO code path get taken is not indicative of some
> bug wrt. ->ndo_start_xmit() return value flag bit handling.

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:16                   ` David Miller
@ 2010-04-21  1:21                     ` Mikulas Patocka
  2010-04-21  1:24                       ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  1:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber



On Tue, 20 Apr 2010, David Miller wrote:

> From: David Miller <davem@davemloft.net>
> Date: Tue, 20 Apr 2010 18:14:34 -0700 (PDT)
> 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)
> > 
> >> I see, but GRO is turned off on my interfaces, according to ethtool.
> > 
> > GRO is just a flag bit, so it's possible that if your kernel is too
> > old ethtool will always show that it's off.
> 
> Actually, looking back at your original report, are you confusing
> "large-receive-offload" as reported by ethtool with GRO?
> 
> They are completely seperate things.
> 
> "large-receive-offload" is LRO, whereas GRO is something done
> in software and something entirely different.

Posting once more. Both LRO and GRO are off. And it is default.

Mikulas

Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:20                   ` crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
@ 2010-04-21  1:22                     ` David Miller
  2010-04-21  1:24                       ` Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  1:22 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 21:20:09 -0400 (EDT)

> 
> 
> On Tue, 20 Apr 2010, David Miller wrote:
> 
>> From: Mikulas Patocka <mpatocka@redhat.com>
>> Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)
>> 
>> > I see, but GRO is turned off on my interfaces, according to ethtool.
>> 
>> GRO is just a flag bit, so it's possible that if your kernel is too
>> old ethtool will always show that it's off.
>> 
>> If you haven't turned off GRO explicitly, then it's a good bet that
>> this is why it looks like it's off.  And GRO is on by default.
> 
> I have kernel 2.6.34-rc4, ethtool 2.6.33 and GRO is off. I haven't turned 
> it off, I left it on default.

See my follow-up, what ethtool output makes you think GRO is
off?  "large-receive-offload" is not GRO

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:21                     ` Mikulas Patocka
@ 2010-04-21  1:24                       ` David Miller
  2010-04-21  2:01                         ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  1:24 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 20 Apr 2010 21:21:36 -0400 (EDT)

> Posting once more. Both LRO and GRO are off. And it is default.

Weird, ok thanks for posting this information.

Maybe whatever sets up your bridge turns off GRO, but that's extremely
unfortunate since the whole point of GRO was so that we could keep it
enabled even with bridging or forwarding enabled.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:22                     ` David Miller
@ 2010-04-21  1:24                       ` Mikulas Patocka
  0 siblings, 0 replies; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21  1:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber

> > On Tue, 20 Apr 2010, David Miller wrote:
> > 
> >> From: Mikulas Patocka <mpatocka@redhat.com>
> >> Date: Tue, 20 Apr 2010 21:10:04 -0400 (EDT)
> >> 
> >> > I see, but GRO is turned off on my interfaces, according to ethtool.
> >> 
> >> GRO is just a flag bit, so it's possible that if your kernel is too
> >> old ethtool will always show that it's off.
> >> 
> >> If you haven't turned off GRO explicitly, then it's a good bet that
> >> this is why it looks like it's off.  And GRO is on by default.
> > 
> > I have kernel 2.6.34-rc4, ethtool 2.6.33 and GRO is off. I haven't turned 
> > it off, I left it on default.
> 
> See my follow-up, what ethtool output makes you think GRO is
> off?  "large-receive-offload" is not GRO

generic-receive-offload: off

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  1:24                       ` David Miller
@ 2010-04-21  2:01                         ` David Miller
  2010-04-21 13:21                           ` Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2010-04-21  2:01 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev, kaber


I looked more at your crash report.

You shouldn't even be in this code path for other reasons, namely
skb->next should be NULL.  But it's not in your case.  skb->next would
only be non-NULL for GSO frames, which we've established we should not
be seeing here.

Given that skb->next is non-NULL and the fraglists of this SKB are
corrupted (next pointer is 0x18), I think we're getting memory
corruption from somewhere else.  This also jives with the fact that
this is not readily reproducable.

The whole ->ndo_start_xmit() return value stuff is unrelated to this
issue, we shouldn't even be in this code path.  In fact, if reverting
that TX flags handling commit makes your crashes go away it would be a
huge surprise.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: crash with bridge and inconsistent handling of NETDEV_TX_OK
  2010-04-21  2:01                         ` David Miller
@ 2010-04-21 13:21                           ` Mikulas Patocka
  2010-04-29 22:14                             ` RCU error in networking (was: crash with bridge and inconsistent handling of NETDEV_TX_OK) Mikulas Patocka
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-21 13:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber



On Tue, 20 Apr 2010, David Miller wrote:

> 
> I looked more at your crash report.
> 
> You shouldn't even be in this code path for other reasons, namely
> skb->next should be NULL.  But it's not in your case.  skb->next would
> only be non-NULL for GSO frames, which we've established we should not
> be seeing here.
> 
> Given that skb->next is non-NULL and the fraglists of this SKB are
> corrupted (next pointer is 0x18), I think we're getting memory
> corruption from somewhere else.  This also jives with the fact that
> this is not readily reproducable.

The crash happened just a few days after I started to use the machine for 
bridging. There were no unexplained crashes before. So I suspect that the 
cause is bridging or tg3.

> The whole ->ndo_start_xmit() return value stuff is unrelated to this
> issue, we shouldn't even be in this code path.  In fact, if reverting
> that TX flags handling commit makes your crashes go away it would be a
> huge surprise.

I thought that if some weird ->ndo_start_xmit() return values appeared, 
this would lead to misunderstanding who owns the skb, using of already 
deallocated skb and mentioned memory corruption. But I can't prove it.

Mikulas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RCU error in networking (was: crash with bridge and inconsistent handling of NETDEV_TX_OK)
  2010-04-21 13:21                           ` Mikulas Patocka
@ 2010-04-29 22:14                             ` Mikulas Patocka
  2010-04-29 23:04                               ` RCU error in networking David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Mikulas Patocka @ 2010-04-29 22:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev



On Wed, 21 Apr 2010, Mikulas Patocka wrote:

> 
> 
> On Tue, 20 Apr 2010, David Miller wrote:
> 
> > 
> > I looked more at your crash report.
> > 
> > You shouldn't even be in this code path for other reasons, namely
> > skb->next should be NULL.  But it's not in your case.  skb->next would
> > only be non-NULL for GSO frames, which we've established we should not
> > be seeing here.
> > 
> > Given that skb->next is non-NULL and the fraglists of this SKB are
> > corrupted (next pointer is 0x18), I think we're getting memory
> > corruption from somewhere else.  This also jives with the fact that
> > this is not readily reproducable.
> 
> The crash happened just a few days after I started to use the machine for 
> bridging. There were no unexplained crashes before. So I suspect that the 
> cause is bridging or tg3.
> 
> > The whole ->ndo_start_xmit() return value stuff is unrelated to this
> > issue, we shouldn't even be in this code path.  In fact, if reverting
> > that TX flags handling commit makes your crashes go away it would be a
> > huge surprise.
> 
> I thought that if some weird ->ndo_start_xmit() return values appeared, 
> this would lead to misunderstanding who owns the skb, using of already 
> deallocated skb and mentioned memory corruption. But I can't prove it.
> 
> Mikulas


BTW. when I enabled lockdep, I got this (2.6.34-rc5; the machine is no 
longer bridging, it has just a single interface):

Mikulas

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
net/core/dev.c:1993 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
2 locks held by ntpd/1462:
 #0:  (sk_lock-AF_INET){+.+.+.}, at: [<0000000000678d88>] 
udp_sendmsg+0x208/0x620
 #1:  (rcu_read_lock_bh){.+....}, at: [<00000000006305e0>] 
dev_queue_xmit+0x40/0x660

stack backtrace:
Call Trace:
 [000000000047bb88] lockdep_rcu_dereference+0x88/0xa0
 [0000000000630adc] dev_queue_xmit+0x53c/0x660
 [0000000000654e10] ip_finish_output+0x190/0x340
 [000000000065501c] ip_output+0x5c/0x80
 [0000000000655200] ip_local_out+0x20/0x40
 [0000000000655540] ip_push_pending_frames+0x320/0x3e0
 [0000000000676ec4] udp_push_pending_frames+0x164/0x440
 [0000000000678e60] udp_sendmsg+0x2e0/0x620
 [000000000068044c] inet_sendmsg+0x2c/0x60
 [000000000061cf24] sock_sendmsg+0x64/0xa0
 [000000000061d738] SyS_sendto+0x98/0xe0
 [000000000061b2b8] SyS_send+0x18/0x40
 [0000000000406054] linux_sparc_syscall32+0x34/0x40


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RCU error in networking
  2010-04-29 22:14                             ` RCU error in networking (was: crash with bridge and inconsistent handling of NETDEV_TX_OK) Mikulas Patocka
@ 2010-04-29 23:04                               ` David Miller
  0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2010-04-29 23:04 UTC (permalink / raw)
  To: mpatocka; +Cc: netdev

From: Mikulas Patocka <mpatocka@redhat.com>
Date: Thu, 29 Apr 2010 18:14:11 -0400 (EDT)

> BTW. when I enabled lockdep, I got this (2.6.34-rc5; the machine is no 
> longer bridging, it has just a single interface):
> 

Fixed in Linus's tree for more than a week:

commit 05d17608a69b3ae653ea5c9857283bef3439c733
Author: David Howells <dhowells@redhat.com>
Date:   Tue Apr 20 00:25:58 2010 +0000

    net: Fix an RCU warning in dev_pick_tx()
    
    Fix the following RCU warning in dev_pick_tx():
    
    ===================================================
    [ INFO: suspicious rcu_dereference_check() usage. ]
    ---------------------------------------------------
    net/core/dev.c:1993 invoked rcu_dereference_check() without protection!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 1, debug_locks = 0
    2 locks held by swapper/0:
     #0:  (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81039e65>] run_timer_softirq+0x17b/0x278
     #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc
    
    stack backtrace:
    Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4
    Call Trace:
     <IRQ>  [<ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2
     [<ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc
     [<ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc
     [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1
     [<ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c
     [<ffffffff8134f673>] ip6_output_finish+0x7c/0xb4
     [<ffffffff81350c34>] ip6_output2+0x256/0x261
     [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff813517fb>] ip6_output+0xbbc/0xbcb
     [<ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d
     [<ffffffff81368acb>] mld_sendpack+0x273/0x39d
     [<ffffffff81368858>] ? mld_sendpack+0x0/0x39d
     [<ffffffff81052099>] ? mark_held_locks+0x52/0x70
     [<ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288
     [<ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278
     [<ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278
     [<ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288
     [<ffffffff81035531>] ? __do_softirq+0x69/0x140
     [<ffffffff8103556a>] __do_softirq+0xa2/0x140
     [<ffffffff81002e0c>] call_softirq+0x1c/0x28
     [<ffffffff81004b54>] do_softirq+0x38/0x80
     [<ffffffff81034f06>] irq_exit+0x45/0x47
     [<ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96
     [<ffffffff810028d3>] apic_timer_interrupt+0x13/0x20
     <EOI>  [<ffffffff810488dd>] ? __atomic_notifier_call_chain+0x0/0x86
     [<ffffffff810096bf>] ? mwait_idle+0x6e/0x78
     [<ffffffff810096b6>] ? mwait_idle+0x65/0x78
     [<ffffffff810011cb>] cpu_idle+0x4d/0x83
     [<ffffffff81380b05>] rest_init+0xb9/0xc0
     [<ffffffff81380a4c>] ? rest_init+0x0/0xc0
     [<ffffffff8168dcf0>] start_kernel+0x392/0x39d
     [<ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7
     [<ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb
    
    An rcu_dereference() should be an rcu_dereference_bh().
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/core/dev.c b/net/core/dev.c
index 92584bf..f769098 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1990,7 +1990,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 				queue_index = skb_tx_hash(dev, skb);
 
 			if (sk) {
-				struct dst_entry *dst = rcu_dereference(sk->sk_dst_cache);
+				struct dst_entry *dst = rcu_dereference_bh(sk->sk_dst_cache);
 
 				if (dst && skb_dst(skb) == dst)
 					sk_tx_queue_set(sk, queue_index);

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-04-30 17:21 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-20 23:40 crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
2010-04-20 23:45 ` David Miller
2010-04-20 23:48   ` David Miller
2010-04-20 23:57   ` Mikulas Patocka
2010-04-21  0:00     ` David Miller
2010-04-21  0:12       ` Mikulas Patocka
2010-04-21  0:15         ` David Miller
2010-04-21  0:23           ` Mikulas Patocka
2010-04-21  1:02             ` David Miller
2010-04-21  1:10               ` Mikulas Patocka
2010-04-21  1:14                 ` David Miller
2010-04-21  1:16                   ` David Miller
2010-04-21  1:21                     ` Mikulas Patocka
2010-04-21  1:24                       ` David Miller
2010-04-21  2:01                         ` David Miller
2010-04-21 13:21                           ` Mikulas Patocka
2010-04-29 22:14                             ` RCU error in networking (was: crash with bridge and inconsistent handling of NETDEV_TX_OK) Mikulas Patocka
2010-04-29 23:04                               ` RCU error in networking David Miller
2010-04-21  1:20                   ` crash with bridge and inconsistent handling of NETDEV_TX_OK Mikulas Patocka
2010-04-21  1:22                     ` David Miller
2010-04-21  1:24                       ` Mikulas Patocka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.