All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gioh Kim <gi-oh.kim@ionos.com>
To: linux-block@vger.kernel.org
Cc: axboe@kernel.dk, hch@infradead.org, sagi@grimberg.me,
	bvanassche@acm.org, haris.iqbal@ionos.com, jinpu.wang@ionos.com,
	Gioh Kim <gi-oh.kim@cloud.ionos.com>,
	Gioh Kim <gi-oh.kim@ionos.com>
Subject: [PATCHv3 for-next 09/19] block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel
Date: Tue,  6 Apr 2021 09:07:06 +0200	[thread overview]
Message-ID: <20210406070716.168541-10-gi-oh.kim@ionos.com> (raw)
In-Reply-To: <20210406070716.168541-1-gi-oh.kim@ionos.com>

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

We got a warning message below.
When server tries to close one session by force, it locks the sysfs
interface and locks the srv_sess lock.
The problem is that client can send a request to close at the same time.
By close request, server locks the srv_sess lock and locks the sysfs
to remove the sysfs interfaces.

The simplest way to prevent that situation could be just use
mutex_trylock.

[  234.153965] ======================================================
[  234.154093] WARNING: possible circular locking dependency detected
[  234.154219] 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Tainted: G           O
[  234.154381] ------------------------------------------------------
[  234.154531] kworker/1:1H/618 is trying to acquire lock:
[  234.154651] ffff8887a09db0a8 (kn->count#132){++++}, at: kernfs_remove_by_name_ns+0x40/0x80
[  234.154819]
               but task is already holding lock:
[  234.154965] ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.155132]
               which lock already depends on the new lock.

[  234.155311]
               the existing dependency chain (in reverse order) is:
[  234.155462]
               -> #1 (&srv_sess->lock){+.+.}:
[  234.155614]        __mutex_lock+0x134/0xcb0
[  234.155761]        rnbd_srv_sess_dev_force_close+0x36/0x50 [rnbd_server]
[  234.155889]        rnbd_srv_dev_session_force_close_store+0x69/0xc0 [rnbd_server]
[  234.156042]        kernfs_fop_write+0x13f/0x240
[  234.156162]        vfs_write+0xf3/0x280
[  234.156278]        ksys_write+0xba/0x150
[  234.156395]        do_syscall_64+0x62/0x270
[  234.156513]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  234.156632]
               -> #0 (kn->count#132){++++}:
[  234.156782]        __lock_acquire+0x129e/0x23a0
[  234.156900]        lock_acquire+0xf3/0x210
[  234.157043]        __kernfs_remove+0x42b/0x4c0
[  234.157161]        kernfs_remove_by_name_ns+0x40/0x80
[  234.157282]        remove_files+0x3f/0xa0
[  234.157399]        sysfs_remove_group+0x4a/0xb0
[  234.157519]        rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.157648]        rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.157775]        process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.157924]        __ib_process_cq+0x8c/0x100 [ib_core]
[  234.158709]        ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.158834]        process_one_work+0x4e5/0xaa0
[  234.158958]        worker_thread+0x65/0x5c0
[  234.159078]        kthread+0x1e0/0x200
[  234.159194]        ret_from_fork+0x24/0x30
[  234.159309]
               other info that might help us debug this:

[  234.159513]  Possible unsafe locking scenario:

[  234.159658]        CPU0                    CPU1
[  234.159775]        ----                    ----
[  234.159891]   lock(&srv_sess->lock);
[  234.160005]                                lock(kn->count#132);
[  234.160128]                                lock(&srv_sess->lock);
[  234.160250]   lock(kn->count#132);
[  234.160364]
                *** DEADLOCK ***

[  234.160536] 3 locks held by kworker/1:1H/618:
[  234.160677]  #0: ffff8883ca1ed528 ((wq_completion)ib-comp-wq){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.160840]  #1: ffff8883d2d5fe10 ((work_completion)(&cq->work)){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.161003]  #2: ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.161168]
               stack backtrace:
[  234.161312] CPU: 1 PID: 618 Comm: kworker/1:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
[  234.161490] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
[  234.161643] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[  234.161765] Call Trace:
[  234.161910]  dump_stack+0x96/0xe0
[  234.162028]  check_noncircular+0x29e/0x2e0
[  234.162148]  ? print_circular_bug+0x100/0x100
[  234.162267]  ? register_lock_class+0x1ad/0x8a0
[  234.162385]  ? __lock_acquire+0x68e/0x23a0
[  234.162505]  ? trace_event_raw_event_lock+0x190/0x190
[  234.162626]  __lock_acquire+0x129e/0x23a0
[  234.162746]  ? register_lock_class+0x8a0/0x8a0
[  234.162866]  lock_acquire+0xf3/0x210
[  234.162982]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163127]  __kernfs_remove+0x42b/0x4c0
[  234.163243]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163363]  ? kernfs_fop_readdir+0x3b0/0x3b0
[  234.163482]  ? strlen+0x1f/0x40
[  234.163596]  ? strcmp+0x30/0x50
[  234.163712]  kernfs_remove_by_name_ns+0x40/0x80
[  234.163832]  remove_files+0x3f/0xa0
[  234.163948]  sysfs_remove_group+0x4a/0xb0
[  234.164068]  rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.164196]  rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.164345]  ? _raw_spin_unlock_irqrestore+0x43/0x50
[  234.164466]  ? lockdep_hardirqs_on+0x1a8/0x290
[  234.164597]  ? mlx4_ib_poll_cq+0x927/0x1280 [mlx4_ib]
[  234.164732]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.164859]  process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.164982]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.165130]  __ib_process_cq+0x8c/0x100 [ib_core]
[  234.165279]  ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.165404]  process_one_work+0x4e5/0xaa0
[  234.165550]  ? pwq_dec_nr_in_flight+0x160/0x160
[  234.165675]  ? do_raw_spin_lock+0x119/0x1d0
[  234.165796]  worker_thread+0x65/0x5c0
[  234.165914]  ? process_one_work+0xaa0/0xaa0
[  234.166031]  kthread+0x1e0/0x200
[  234.166147]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  234.166268]  ret_from_fork+0x24/0x30
[  234.251591] rnbd_server L243: </dev/loop1@close_device_session>: Device closed
[  234.604221] rnbd_server L264: RTRS Session close_device_session disconnected

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
---
 drivers/block/rnbd/rnbd-srv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index a4fd9f167c18..1549a6361630 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -334,7 +334,9 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev)
 	struct rnbd_srv_session	*sess = sess_dev->sess;
 
 	sess_dev->keep_id = true;
-	mutex_lock(&sess->lock);
+	/* It is already started to close by client's close message. */
+	if (!mutex_trylock(&sess->lock))
+		return;
 	rnbd_srv_destroy_dev_session_sysfs(sess_dev);
 	mutex_unlock(&sess->lock);
 }
-- 
2.25.1


  parent reply	other threads:[~2021-04-06  7:07 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-06  7:06 [PATCHv3 for-next 00/19] Misc update for rnbd Gioh Kim
2021-04-06  7:06 ` [PATCHv3 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module Gioh Kim
2021-04-06  7:06 ` [PATCHv3 for-next 02/19] Documentation/sysfs-block-rnbd: Add descriptions for remap_device and resize Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 03/19] block/rnbd-clt: Remove some arguments from insert_dev_if_not_exists_devpath Gioh Kim
2021-04-06  8:19   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 04/19] block/rnbd-clt: Remove some arguments from rnbd_client_setup_device Gioh Kim
2021-04-06  8:20   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 05/19] block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk Gioh Kim
2021-04-06  8:18   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 06/19] block/rnbd: Kill rnbd_clt_destroy_default_group Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 07/19] block/rnbd: Kill destroy_device_cb Gioh Kim
2021-04-06  8:21   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 08/19] block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT} Gioh Kim
2021-04-06  8:23   ` Gioh Kim
2021-04-06  7:07 ` Gioh Kim [this message]
2021-04-06  7:07 ` [PATCHv3 for-next 10/19] block/rnbd-srv: Remove force_close file after holding a lock Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 11/19] block/rnbd-clt: Improve find_or_create_sess() return check Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 12/19] block/rnbd-clt: Fix missing a memory free when unloading the module Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 14/19] Documentation/ABI/rnbd-clt: Add description for nr_poll_queues Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 15/19] block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Gioh Kim
2021-04-06  8:25   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 16/19] block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 17/19] block/rnbd-clt: Remove max_segment_size Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 18/19] block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name Gioh Kim
2021-04-06  8:25   ` Gioh Kim
2021-04-06  7:07 ` [PATCHv3 for-next 19/19] block/rnbd: Use strscpy instead of strlcpy Gioh Kim
2021-04-06  8:26   ` Gioh Kim
2021-04-06  8:16 ` [PATCHv3 for-next 00/19] Misc update for rnbd Gioh Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210406070716.168541-10-gi-oh.kim@ionos.com \
    --to=gi-oh.kim@ionos.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=gi-oh.kim@cloud.ionos.com \
    --cc=haris.iqbal@ionos.com \
    --cc=hch@infradead.org \
    --cc=jinpu.wang@ionos.com \
    --cc=linux-block@vger.kernel.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.