From: NeilBrown <neilb@suse.com>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH v2] lustre: mdc: fix possible deadlock in chlg_open()
Date: Wed, 07 Nov 2018 11:33:25 +1100	[thread overview]
Message-ID: <87d0rhybe2.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <a513884a-046f-a28c-3fbb-dd112abd4b26@cea.fr>

On Tue, Nov 06 2018, quentin.bouget at cea.fr wrote:

> On 06/11/2018 at 04:11, NeilBrown wrote:
>> On Mon, Nov 05 2018, quentin.bouget at cea.fr wrote:
>>
>>> On 04/11/2018 at 22:34, James Simmons wrote:
>>>>> Lockdep reports a possible deadlock between chlg_open() and
>>>>> mdc_changelog_cdev_init()
>>>>>
>>>>> mdc_changelog_cdev_init() takes chlg_registered_dev_lock and then
>>>>> calls misc_register() which takes misc_mtx.
>>>>> chlg_open() is called while misc_mtx is held, and tries to take
>>>>> chlg_registered_dev_lock.
>>>>> If these two functions race, a deadlock can occur as each thread will
>>>>> hold one of the locks while trying to take the other.
>>>>>
>>>>> chlg_open() does not need to take a lock.  It only uses the
>>>>> lock to stabilize a list while looking for the matching
>>>>> chlg_registered_dev, and this can be found directly by examining
>>>>> file->private_data.
>>>>>
>>>>> So remove chlg_obd_get(), and use file->private_data to find the
>>>>> obd_device.
>>>>> Also ensure the device is fully initialized before calling
>>>>> misc_register().  This means setting up some list linkage before the
>>>>> call, and tearing it down if there is an error.
>>>> I have been testing this but I'm no HSM expert. I pushed this patch
>>>> to OpenSFS branch as well.
>>>>
>>>> https://jira.whamcloud.com/browse/LU-11617
>>>> https://review.whamcloud.com/#/c/33572/
>>>>
>>>> Reviewed-by: James Simmons <jsimmons@infradead.org>
>>> Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
>>>
>> Thanks to you both for the review.
>>
>> NeilBrown
>>
> Wait! I just realised there might be another issue!
> I think there is now a race between chlg_open() and 
> mdc_changelog_cdev_finish().

Hmmm.. yes.  Also another deadlock possibility as the locks can be taken
in the wrong order here too.
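
To make the ordering problem concrete, here is a minimal userspace sketch
using pthreads. It is not the Lustre code: misc_mtx_demo and chlg_lock_demo
are hypothetical stand-ins for misc_mtx and chlg_registered_dev_lock, and
both helper functions are invented for illustration. The sketch assumes the
teardown path takes the locks in the same order as the registration path
described in the patch. trylock is used so that the sketch reports
contention instead of actually deadlocking.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in this thread. */
static pthread_mutex_t misc_mtx_demo = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t chlg_lock_demo = PTHREAD_MUTEX_INITIALIZER;

/*
 * Open path: the caller already holds misc_mtx, and the open handler
 * then wants the changelog lock (order: misc_mtx -> chlg lock).
 * Returns 0 on success, EBUSY if the changelog lock is already held.
 */
static int open_path_demo(void)
{
	int rc;

	pthread_mutex_lock(&misc_mtx_demo);
	rc = pthread_mutex_trylock(&chlg_lock_demo);
	if (rc == 0)
		pthread_mutex_unlock(&chlg_lock_demo);
	pthread_mutex_unlock(&misc_mtx_demo);
	return rc;
}

/*
 * Register/unregister path: the changelog lock is taken first, then
 * misc_mtx (order: chlg lock -> misc_mtx), i.e. the opposite order.
 * Returns 0 on success, EBUSY if misc_mtx is already held.
 */
static int register_path_demo(void)
{
	int rc;

	pthread_mutex_lock(&chlg_lock_demo);
	rc = pthread_mutex_trylock(&misc_mtx_demo);
	if (rc == 0)
		pthread_mutex_unlock(&misc_mtx_demo);
	pthread_mutex_unlock(&chlg_lock_demo);
	return rc;
}
```

With blocking locks in place of trylock, one thread in each path could end
up holding its first lock while waiting forever for the second one.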

>
> Wait! I just realised there might be another bigger issue!
> The whole "take the first obd you can find" is broken! I opened a ticket 
> <https://jira.whamcloud.com/browse/LU-11626> on whamcloud's JIRA about it.

Yep...
My guess is that chlg_load(), chlg_clear() and (possibly)
chlg_read_cat_process_cb() should take the mutex, choose an obd, use
it, and release the mutex.
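
A rough userspace sketch of that "take the mutex, choose an obd, use it,
release the mutex" pattern, written with pthreads rather than kernel
primitives. Everything here is an assumption made for illustration:
struct fake_obd, struct fake_chlg_dev, chlg_with_obd() and read_id() are
hypothetical names, and "choose" simply means "take the first entry on the
list", which may not be the right policy for the real code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the Lustre structures. */
struct fake_obd {
	int id;
	struct fake_obd *next;
};

struct fake_chlg_dev {
	pthread_mutex_t lock;	/* plays the role of the mutex mentioned above */
	struct fake_obd *obds;	/* currently attached obds, NULL if none */
};

typedef int (*obd_op_t)(struct fake_obd *obd, void *arg);

/*
 * Take the mutex, choose an obd, use it, and release the mutex.
 * The obd pointer never escapes the locked region, so it cannot be
 * detached underneath the operation.
 * Returns the result of op(), or -1 if no obd is attached.
 */
static int chlg_with_obd(struct fake_chlg_dev *dev, obd_op_t op, void *arg)
{
	int rc = -1;

	pthread_mutex_lock(&dev->lock);
	if (dev->obds != NULL)
		rc = op(dev->obds, arg);
	pthread_mutex_unlock(&dev->lock);
	return rc;
}

/* Example operation: copy the chosen obd's id into *arg. */
static int read_id(struct fake_obd *obd, void *arg)
{
	*(int *)arg = obd->id;
	return 0;
}
```

A caller would do something like chlg_with_obd(dev, read_id, &id) on each
access instead of caching an obd pointer at open time.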

I might post an RFC patch.

NeilBrown


>
> Quentin Bouget
