All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Joe Korty <joe.korty@concurrent-rt.com>,
	Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 4.19 15/43] NFSD: Repair misuse of sv_lock in 5.10.16-rt30.
Date: Mon, 22 Mar 2021 13:28:29 +0100	[thread overview]
Message-ID: <20210322121920.428427760@linuxfoundation.org> (raw)
In-Reply-To: <20210322121919.936671417@linuxfoundation.org>

From: Joe Korty <joe.korty@concurrent-rt.com>

commit c7de87ff9dac5f396f62d584f3908f80ddc0e07b upstream.

[ This problem is in mainline, but only rt has the chops to be
able to detect it. ]

Lockdep reports a circular lock dependency between serv->sv_lock and
softirq_ctl.lock on system shutdown, when using a kernel built with
CONFIG_PREEMPT_RT=y, and a nfs mount exists.

This is due to the definition of spin_lock_bh on rt:

	local_bh_disable();
	rt_spin_lock(lock);

which forces a softirq_ctl.lock -> serv->sv_lock dependency.  This is
not a problem as long as _every_ lock of serv->sv_lock is a:

	spin_lock_bh(&serv->sv_lock);

but there is one of the form:

	spin_lock(&serv->sv_lock);

This is what is causing the circular dependency splat.  The spin_lock()
grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable.
If later on in the critical region,  someone does a local_bh_disable, we
get a serv->sv_lock -> softirq_ctrl.lock dependency established.  Deadlock.

Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no
exceptions.

[  OK  ] Stopped target NFS client services.
         Stopping Logout off all iSCSI sessions on shutdown...
         Stopping NFS server and services...
[  109.442380]
[  109.442385] ======================================================
[  109.442386] WARNING: possible circular locking dependency detected
[  109.442387] 5.10.16-rt30 #1 Not tainted
[  109.442389] ------------------------------------------------------
[  109.442390] nfsd/1032 is trying to acquire lock:
[  109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270
[  109.442405]
[  109.442405] but task is already holding lock:
[  109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442415]
[  109.442415] which lock already depends on the new lock.
[  109.442415]
[  109.442416]
[  109.442416] the existing dependency chain (in reverse order) is:
[  109.442417]
[  109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}:
[  109.442421]        rt_spin_lock+0x2b/0xc0
[  109.442428]        svc_add_new_perm_xprt+0x42/0xa0
[  109.442430]        svc_addsock+0x135/0x220
[  109.442434]        write_ports+0x4b3/0x620
[  109.442438]        nfsctl_transaction_write+0x45/0x80
[  109.442440]        vfs_write+0xff/0x420
[  109.442444]        ksys_write+0x4f/0xc0
[  109.442446]        do_syscall_64+0x33/0x40
[  109.442450]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  109.442454]
[  109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}:
[  109.442457]        __lock_acquire+0x1264/0x20b0
[  109.442463]        lock_acquire+0xc2/0x400
[  109.442466]        rt_spin_lock+0x2b/0xc0
[  109.442469]        __local_bh_disable_ip+0xd9/0x270
[  109.442471]        svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442474]        svc_close_list+0x60/0x90
[  109.442476]        svc_close_net+0x49/0x1a0
[  109.442478]        svc_shutdown_net+0x12/0x40
[  109.442480]        nfsd_destroy+0xc5/0x180
[  109.442482]        nfsd+0x1bc/0x270
[  109.442483]        kthread+0x194/0x1b0
[  109.442487]        ret_from_fork+0x22/0x30
[  109.442492]
[  109.442492] other info that might help us debug this:
[  109.442492]
[  109.442493]  Possible unsafe locking scenario:
[  109.442493]
[  109.442493]        CPU0                    CPU1
[  109.442494]        ----                    ----
[  109.442495]   lock(&serv->sv_lock);
[  109.442496]                                lock((softirq_ctrl.lock).lock);
[  109.442498]                                lock(&serv->sv_lock);
[  109.442499]   lock((softirq_ctrl.lock).lock);
[  109.442501]
[  109.442501]  *** DEADLOCK ***
[  109.442501]
[  109.442501] 3 locks held by nfsd/1032:
[  109.442503]  #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270
[  109.442508]  #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442512]  #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[  109.442518]
[  109.442518] stack backtrace:
[  109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1
[  109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
[  109.442524] Call Trace:
[  109.442527]  dump_stack+0x77/0x97
[  109.442533]  check_noncircular+0xdc/0xf0
[  109.442546]  __lock_acquire+0x1264/0x20b0
[  109.442553]  lock_acquire+0xc2/0x400
[  109.442564]  rt_spin_lock+0x2b/0xc0
[  109.442570]  __local_bh_disable_ip+0xd9/0x270
[  109.442573]  svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442577]  svc_close_list+0x60/0x90
[  109.442581]  svc_close_net+0x49/0x1a0
[  109.442585]  svc_shutdown_net+0x12/0x40
[  109.442588]  nfsd_destroy+0xc5/0x180
[  109.442590]  nfsd+0x1bc/0x270
[  109.442595]  kthread+0x194/0x1b0
[  109.442600]  ret_from_fork+0x22/0x30
[  109.518225] nfsd: last server has exited, flushing export cache
[  OK  ] Stopped NFSv4 ID-name mapping service.
[  OK  ] Stopped GSSAPI Proxy Daemon.
[  OK  ] Stopped NFS Mount Daemon.
[  OK  ] Stopped NFS status monitor for NFSv2/3 locking..

Fixes: 719f8bcc883e ("svcrpc: fix xpt_list traversal locking on shutdown")
Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sunrpc/svc_xprt.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -1052,7 +1052,7 @@ static int svc_close_list(struct svc_ser
 	struct svc_xprt *xprt;
 	int ret = 0;
 
-	spin_lock(&serv->sv_lock);
+	spin_lock_bh(&serv->sv_lock);
 	list_for_each_entry(xprt, xprt_list, xpt_list) {
 		if (xprt->xpt_net != net)
 			continue;
@@ -1060,7 +1060,7 @@ static int svc_close_list(struct svc_ser
 		set_bit(XPT_CLOSE, &xprt->xpt_flags);
 		svc_xprt_enqueue(xprt);
 	}
-	spin_unlock(&serv->sv_lock);
+	spin_unlock_bh(&serv->sv_lock);
 	return ret;
 }
 



  parent reply	other threads:[~2021-03-22 13:04 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-22 12:28 [PATCH 4.19 00/43] 4.19.183-rc1 review Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 01/43] ASoC: ak4458: Add MODULE_DEVICE_TABLE Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 02/43] ASoC: ak5558: " Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 03/43] ALSA: hda: generic: Fix the micmute led init state Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 04/43] Revert "PM: runtime: Update device status before letting suppliers suspend" Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 05/43] vmlinux.lds.h: Create section for protection against instrumentation Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 06/43] lkdtm: dont move ctors to .rodata Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 07/43] perf tools: Use %define api.pure full instead of %pure-parser Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 08/43] tools build feature: Check if get_current_dir_name() is available Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 09/43] tools build feature: Check if eventfd() " Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 10/43] tools build: Check if gettid() is available before providing helper Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 11/43] btrfs: fix race when cloning extent buffer during rewind of an old root Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 12/43] btrfs: fix slab cache flags for free space tree bitmap Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 13/43] ASoC: fsl_ssi: Fix TDM slot setup for I2S mode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 14/43] nvmet: dont check iosqes,iocqes for discovery controllers Greg Kroah-Hartman
2021-03-22 12:28 ` Greg Kroah-Hartman [this message]
2021-03-22 12:28 ` [PATCH 4.19 16/43] svcrdma: disable timeouts on rdma backchannel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 17/43] sunrpc: fix refcount leak for rpc auth modules Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 18/43] net/qrtr: fix __netdev_alloc_skb call Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 19/43] scsi: lpfc: Fix some error codes in debugfs Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 20/43] nvme-rdma: fix possible hang when failing to set io queues Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 21/43] powerpc: Force inlining of cpu_has_feature() to avoid build failure Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 22/43] usb-storage: Add quirk to defeat Kindles automatic unload Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 23/43] usbip: Fix incorrect double assignment to udc->ud.tcp_rx Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 24/43] USB: replace hardcode maximum usb string length by definition Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 25/43] usb: gadget: configfs: Fix KASAN use-after-free Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 26/43] iio:adc:stm32-adc: Add HAS_IOMEM dependency Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 27/43] iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 28/43] iio: adis16400: Fix an error code in adis16400_initial_setup() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 29/43] iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 30/43] iio: hid-sensor-humidity: Fix alignment issue of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 31/43] iio: hid-sensor-prox: Fix scale not correct issue Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 32/43] iio: hid-sensor-temperature: Fix issues of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 33/43] PCI: rpadlpar: Fix potential drc_name corruption in store functions Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 34/43] perf/x86/intel: Fix a crash caused by zero PEBS status Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 35/43] x86/ioapic: Ignore IRQ2 again Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 36/43] kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 37/43] x86: Move TS_COMPAT back to asm/thread_info.h Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 38/43] x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 39/43] ext4: find old entry again if failed to rename whiteout Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 40/43] ext4: do not try to set xattr into ea_inode if value is empty Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 41/43] ext4: fix potential error in ext4_do_update_inode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 42/43] genirq: Disable interrupts for force threaded handlers Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 43/43] x86/apic/of: Fix CPU devicetree-node lookups Greg Kroah-Hartman
2021-03-22 14:35 ` [PATCH 4.19 00/43] 4.19.183-rc1 review Jon Hunter
2021-03-22 20:14 ` Pavel Machek
2021-03-22 21:53 ` Guenter Roeck
2021-03-23  7:17 ` Samuel Zou
2021-03-23 10:31 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210322121920.428427760@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=chuck.lever@oracle.com \
    --cc=joe.korty@concurrent-rt.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.