From: Bart Van Assche <bvanassche@acm.org>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Sagi Grimberg <sagi@grimberg.me>, Yi Zhang <yi.zhang@redhat.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	"open list:NVM EXPRESS DRIVER" <linux-nvme@lists.infradead.org>
Subject: Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
Date: Sat, 28 May 2022 21:00:16 +0200
Message-ID: <4d65a168-c701-6ffa-45b9-858ddcabbbda@acm.org>
In-Reply-To: <20220527125229.GC2960187@ziepe.ca>

On 5/27/22 14:52, Jason Gunthorpe wrote:
> On Wed, May 25, 2022 at 08:50:52PM +0200, Bart Van Assche wrote:
>> On 5/25/22 13:01, Sagi Grimberg wrote:
>>> iirc this was reported before, based on my analysis lockdep is giving
>>> a false alarm here. The reason is that the id_priv->handler_mutex cannot
>>> be the same for both cm_id that is handling the connect and the cm_id
>>> that is handling the rdma_destroy_id because rdma_destroy_id
>>> is always called on an already disconnected cm_id, so this deadlock
>>> lockdep is complaining about cannot happen.
>>>
>>> I'm not sure how to settle this.
>>
>> If the above is correct, using lockdep_register_key() for
>> id_priv->handler_mutex instead of a static key should make the lockdep false
>> positive disappear.
> 
> That only works if you can detect actual different lock classes during
> lock creation. It doesn't seem applicable in this case.

Why doesn't it seem applicable in this case? The default behavior of 
mutex_init() and related initialization functions is to create one lock 
class per synchronization object initialization caller. 
lockdep_register_key() can be used to create one lock class per 
synchronization object instance. I introduced lockdep_register_key() 
myself a few years ago.

After taking a closer look at the RDMA/CM code, I decided not to implement
what I proposed above for now. I noticed that handler_mutex is held
around callback invocations. An example:

static int cma_cm_event_handler(struct rdma_id_private *id_priv,
				struct rdma_cm_event *event)
{
	int ret;

	lockdep_assert_held(&id_priv->handler_mutex);

	trace_cm_event_handler(id_priv, event);
	ret = id_priv->id.event_handler(&id_priv->id, event);
	trace_cm_event_done(id_priv, event, ret);
	return ret;
}

My opinion is that holding *any* lock around the invocation of a
callback function is an antipattern, in other words, something that
should never be done. John Ousterhout already described this in 1996 in
his presentation [1]. Patches like 071ba4cc559d ("RDMA: Add
rdma_connect_locked()") work around this problem but do not solve it.

Has it been considered to rework the RDMA/CM such that no locks are held 
around the invocation of callback functions like the event_handler 
callback? There are other mechanisms to report events from one software 
layer (RDMA/CM) to a higher software layer (ULP), e.g. a linked list 
with event information. The RDMA/CM could queue events onto that list 
and the ULP can dequeue events from that list.
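
To make the event-list idea concrete, here is a minimal userspace sketch
using pthreads. It is not RDMA/CM code and not a proposed patch: struct
cm_event, struct event_queue, the EV_* values and all function names are
hypothetical, chosen only for illustration. The point it shows is that the
queue lock protects list manipulation only, so the handler runs with no
queue lock held and may itself queue further events.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical event types; the real RDMA/CM event set is larger. */
enum { EV_CONNECT = 1, EV_ESTABLISHED = 2 };

/* Hypothetical event record, linked into a singly linked FIFO. */
struct cm_event {
	int type;
	struct cm_event *next;
};

struct event_queue {
	pthread_mutex_t lock;	/* protects head and tail only */
	struct cm_event *head;
	struct cm_event *tail;
};

static void event_queue_init(struct event_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = q->tail = NULL;
}

/* Producer side (the lower layer): append an event to the list. */
static int event_queue_push(struct event_queue *q, int type)
{
	struct cm_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;
	ev->type = type;
	ev->next = NULL;

	pthread_mutex_lock(&q->lock);
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/* Consumer side: detach the oldest event, or return NULL if empty. */
static struct cm_event *event_queue_pop(struct event_queue *q)
{
	struct cm_event *ev;

	pthread_mutex_lock(&q->lock);
	ev = q->head;
	if (ev) {
		q->head = ev->next;
		if (!q->head)
			q->tail = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return ev;
}

/*
 * Consumer loop (the upper layer): each event is detached under the
 * lock, but the handler is invoked after the lock has been dropped.
 * Returns the number of events handled.
 */
static int event_queue_drain(struct event_queue *q,
			     void (*handler)(struct cm_event *, void *),
			     void *data)
{
	struct cm_event *ev;
	int handled = 0;

	assert(handler != NULL);
	while ((ev = event_queue_pop(q)) != NULL) {
		handler(ev, data);
		free(ev);
		handled++;
	}
	return handled;
}

/* Demo handler: reacts to EV_CONNECT by queueing a follow-up event. */
struct demo_ctx {
	struct event_queue *q;
	int seen;
};

static void demo_handler(struct cm_event *ev, void *data)
{
	struct demo_ctx *ctx = data;

	ctx->seen++;
	if (ev->type == EV_CONNECT)
		event_queue_push(ctx->q, EV_ESTABLISHED);
}
```

In this sketch demo_handler() calls event_queue_push() from inside the
handler. That would self-deadlock on a default (non-recursive) mutex if
event_queue_drain() held the lock across the handler call; here it does
not, because the lock is released before the handler runs.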

Thanks,

Bart.

[1] Ousterhout, John. "Why threads are a bad idea (for most purposes)." 
In Presentation given at the 1996 Usenix Annual Technical Conference, 
vol. 5. 1996.
