All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Eric Dumazet <edumazet@google.com>,
	Jiri Pirko <jpirko@redhat.com>,
	"Paul E. McKenney" <paulmck@us.ibm.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [ 53/56] net: add a synchronize_net() in netdev_rx_handler_unregister()
Date: Tue,  2 Apr 2013 15:50:20 -0700	[thread overview]
Message-ID: <20130402224717.948465292@linuxfoundation.org> (raw)
In-Reply-To: <20130402224711.840825715@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------


From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 00cfec37484761a44a3b6f4675a54caa618210ae ]

commit 35d48903e97819 (bonding: fix rx_handler locking) added a race
in bonding driver, reported by Steven Rostedt who did a very good
diagnosis :

<quoting Steven>

I'm currently debugging a crash in an old 3.0-rt kernel that one of our
customers is seeing. The bug happens with a stress test that loads and
unloads the bonding module in a loop (I don't know all the details as
I'm not the one that is directly interacting with the customer). But the
bug looks to be something that may still be present and possibly present
in mainline too. It will just be much harder to trigger it in mainline.

In -rt, interrupts are threads, and can schedule in and out just like
any other thread. Note, mainline now supports interrupt threads so this
may be easily reproducible in mainline as well. I don't have the ability
to tell the customer to try mainline or other kernels, so my hands are
somewhat tied to what I can do.

But according to a core dump, I tracked down that the eth irq thread
crashed in bond_handle_frame() here:

        slave = bond_slave_get_rcu(skb->dev);
        bond = slave->bond; <--- BUG

the slave returned was NULL and accessing slave->bond caused a NULL
pointer dereference.

Looking at the code that unregisters the handler:

void netdev_rx_handler_unregister(struct net_device *dev)
{

        ASSERT_RTNL();
        RCU_INIT_POINTER(dev->rx_handler, NULL);
        RCU_INIT_POINTER(dev->rx_handler_data, NULL);
}

Which is basically:
        dev->rx_handler = NULL;
        dev->rx_handler_data = NULL;

And looking at __netif_receive_skb() we have:

        rx_handler = rcu_dereference(skb->dev->rx_handler);
        if (rx_handler) {
                if (pt_prev) {
                        ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = NULL;
                }
                switch (rx_handler(&skb)) {

My question to all of you is, what stops this interrupt from happening
while the bonding module is unloading?  What happens if the interrupt
triggers and we have this:

        CPU0                    CPU1
        ----                    ----
  rx_handler = skb->dev->rx_handler

                        netdev_rx_handler_unregister() {
                           dev->rx_handler = NULL;
                           dev->rx_handler_data = NULL;

  rx_handler()
   bond_handle_frame() {
    slave = skb->dev->rx_handler;
    bond = slave->bond; <-- NULL pointer dereference!!!

What protection am I missing in the bond release handler that would
prevent the above from happening?

</quoting Steven>

We can fix bug this in two ways. First is adding a test in
bond_handle_frame() and others to check if rx_handler_data is NULL.

A second way is adding a synchronize_net() in
netdev_rx_handler_unregister() to make sure that a rcu protected reader
has the guarantee to see a non NULL rx_handler_data.

The second way is better as it avoids an extra test in fast path.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/dev.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3070,6 +3070,7 @@ int netdev_rx_handler_register(struct ne
 	if (dev->rx_handler)
 		return -EBUSY;
 
+	/* Note: rx_handler_data must be set before rx_handler */
 	rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
 	rcu_assign_pointer(dev->rx_handler, rx_handler);
 
@@ -3090,6 +3091,11 @@ void netdev_rx_handler_unregister(struct
 
 	ASSERT_RTNL();
 	rcu_assign_pointer(dev->rx_handler, NULL);
+	/* a reader seeing a non NULL rx_handler in a rcu_read_lock()
+	 * section has a guarantee to see a non NULL rx_handler_data
+	 * as well.
+	 */
+	synchronize_net();
 	rcu_assign_pointer(dev->rx_handler_data, NULL);
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);



  parent reply	other threads:[~2013-04-02 22:51 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-02 22:49 [ 00/56] 3.0.72-stable review Greg Kroah-Hartman
2013-04-02 22:49 ` [ 01/56] signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer Greg Kroah-Hartman
2013-04-02 22:49 ` [ 02/56] kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER Greg Kroah-Hartman
2013-04-02 22:49 ` [ 03/56] SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked Greg Kroah-Hartman
2013-04-02 22:49 ` [ 04/56] Bluetooth: Fix not closing SCO sockets in the BT_CONNECT2 state Greg Kroah-Hartman
2013-04-02 22:49 ` [ 05/56] Bluetooth: Add support for Dell[QCA 0cf3:0036] Greg Kroah-Hartman
2013-04-02 22:49 ` [ 06/56] Bluetooth: Add support for Dell[QCA 0cf3:817a] Greg Kroah-Hartman
2013-04-02 22:49 ` [ 07/56] staging: comedi: s626: fix continuous acquisition Greg Kroah-Hartman
2013-04-02 22:49 ` [ 08/56] sysfs: fix race between readdir and lseek Greg Kroah-Hartman
2013-04-02 22:49 ` [ 09/56] sysfs: handle failure path correctly for readdir() Greg Kroah-Hartman
2013-04-02 22:49 ` [ 10/56] b43: A fix for DMA transmission sequence errors Greg Kroah-Hartman
2013-04-02 22:49 ` [ 11/56] xen-blkback: fix dispatch_rw_block_io() error path Greg Kroah-Hartman
2013-04-02 22:49 ` [ 12/56] usb: ftdi_sio: Add support for Mitsubishi FX-USB-AW/-BD Greg Kroah-Hartman
2013-04-02 22:49 ` [ 13/56] vt: synchronize_rcu() under spinlock is not nice Greg Kroah-Hartman
2013-04-02 22:49 ` [ 14/56] mwifiex: cancel cmd timer and free curr_cmd in shutdown process Greg Kroah-Hartman
2013-04-06 19:55   ` Ben Hutchings
2013-04-08 17:58     ` Bing Zhao
2013-04-08 17:58       ` Bing Zhao
2013-04-02 22:49 ` [ 15/56] net/irda: add missing error path release_sock call Greg Kroah-Hartman
2013-04-02 22:49 ` [ 16/56] usb: xhci: Fix TRB transfer length macro used for Event TRB Greg Kroah-Hartman
2013-04-02 22:49 ` [ 17/56] Btrfs: limit the global reserve to 512mb Greg Kroah-Hartman
2013-04-02 22:49 ` [ 18/56] KVM: Clean up error handling during VCPU creation Greg Kroah-Hartman
2013-04-02 22:49 ` [ 19/56] x25: Validate incoming call user data lengths Greg Kroah-Hartman
2013-04-02 22:49 ` [ 20/56] x25: Handle undersized/fragmented skbs Greg Kroah-Hartman
2013-04-02 22:49 ` [ 21/56] batman-adv: bat_socket_read missing checks Greg Kroah-Hartman
2013-04-02 22:49 ` [ 22/56] batman-adv: Only write requested number of byte to user buffer Greg Kroah-Hartman
2013-04-02 22:49 ` [ 23/56] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Greg Kroah-Hartman
2013-04-02 22:49 ` [ 24/56] NFSv4: include bitmap in nfsv4 get acl data Greg Kroah-Hartman
2013-04-02 22:49 ` [ 25/56] NFSv4: Fix an Oops in the NFSv4 getacl code Greg Kroah-Hartman
2013-04-02 22:49 ` [ 26/56] NFS: nfs_getaclargs.acl_len is a size_t Greg Kroah-Hartman
2013-04-02 22:49 ` [ 27/56] KVM: Ensure all vcpus are consistent with in-kernel irqchip settings Greg Kroah-Hartman
2013-04-02 22:49 ` [ 28/56] macvtap: zerocopy: validate vectors before building skb Greg Kroah-Hartman
2013-04-02 22:49 ` [ 29/56] KVM: Fix buffer overflow in kvm_set_irq() Greg Kroah-Hartman
2013-04-02 22:49 ` [ 30/56] mm/hotplug: correctly add new zone to all other nodes zone lists Greg Kroah-Hartman
2013-04-02 22:49 ` [ 31/56] KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461) Greg Kroah-Hartman
2013-04-02 22:49 ` [ 32/56] loop: prevent bdev freeing while device in use Greg Kroah-Hartman
2013-04-02 22:50 ` [ 33/56] nfsd4: reject "negative" acl lengths Greg Kroah-Hartman
2013-04-02 22:50 ` [ 34/56] drm/i915: dont set unpin_work if vblank_get fails Greg Kroah-Hartman
2013-04-02 22:50 ` [ 35/56] drm/i915: Dont clobber crtc->fb when queue_flip fails Greg Kroah-Hartman
2013-04-02 22:50 ` [ 36/56] efivars: explicitly calculate length of VariableName Greg Kroah-Hartman
2013-04-02 22:50 ` [ 37/56] efivars: Handle duplicate names from get_next_variable() Greg Kroah-Hartman
2013-04-02 22:50 ` [ 38/56] ext4: use atomic64_t for the per-flexbg free_clusters count Greg Kroah-Hartman
2013-04-02 22:50 ` [ 39/56] tracing: Protect tracer flags with trace_types_lock Greg Kroah-Hartman
2013-04-02 22:50 ` [ 40/56] tracing: Prevent buffer overwrite disabled for latency tracers Greg Kroah-Hartman
2013-04-02 22:50 ` [ 41/56] sky2: Receive Overflows not counted Greg Kroah-Hartman
2013-04-02 22:50 ` [ 42/56] sky2: Threshold for Pause Packet is set wrong Greg Kroah-Hartman
2013-04-02 22:50 ` [ 43/56] tcp: preserve ACK clocking in TSO Greg Kroah-Hartman
2013-04-02 22:50 ` [ 44/56] tcp: undo spurious timeout after SACK reneging Greg Kroah-Hartman
2013-04-02 22:50 ` [ 45/56] 8021q: fix a potential use-after-free Greg Kroah-Hartman
2013-04-02 22:50 ` [ 46/56] thermal: shorten too long mcast group name Greg Kroah-Hartman
2013-04-02 22:50 ` [ 47/56] unix: fix a race condition in unix_release() Greg Kroah-Hartman
2013-04-02 22:50 ` [ 48/56] aoe: reserve enough headroom on skbs Greg Kroah-Hartman
2013-04-02 22:50 ` [ 49/56] drivers: net: ethernet: davinci_emac: use netif_wake_queue() while restarting tx queue Greg Kroah-Hartman
2013-04-02 22:50 ` [ 50/56] atl1e: drop pci-msi support because of packet corruption Greg Kroah-Hartman
2013-04-02 22:50   ` Greg Kroah-Hartman
2013-04-02 22:50 ` [ 51/56] ipv6: fix bad free of addrconf_init_net Greg Kroah-Hartman
2013-04-02 22:50 ` [ 52/56] ks8851: Fix interpretation of rxlen field Greg Kroah-Hartman
2013-04-02 22:50 ` Greg Kroah-Hartman [this message]
2013-04-02 22:50 ` [ 54/56] pch_gbe: fix ip_summed checksum reporting on rx Greg Kroah-Hartman
2013-04-02 22:50 ` [ 55/56] smsc75xx: fix jumbo frame support Greg Kroah-Hartman
2013-04-02 22:50 ` [ 56/56] iommu/amd: Make sure dma_ops are set for hotplug devices Greg Kroah-Hartman
2013-04-03 15:19 ` [ 00/56] 3.0.72-stable review Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130402224717.948465292@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jpirko@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulmck@us.ibm.com \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.