From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
To: NeilBrown <neilb@suse.com>, Shaohua Li <shli@kernel.org>
Cc: Linux Raid <linux-raid@vger.kernel.org>
Subject: Re: [PATCH] md: create new workqueue for object destruction
Date: Wed, 18 Oct 2017 13:21:59 +0200
Message-ID: <6454f28e-4728-a10d-f3c3-b68afedec8d9@intel.com>
In-Reply-To: <87h8uwj4mz.fsf@notabene.neil.brown.name>

On 10/18/2017 09:29 AM, NeilBrown wrote:
> On Tue, Oct 17 2017, Shaohua Li wrote:
> 
>> On Tue, Oct 17, 2017 at 04:04:52PM +1100, Neil Brown wrote:
>>>
>>> lockdep currently complains about a potential deadlock:
>>> sysfs access takes reconfig_mutex and then waits for a
>>> work queue to complete.
>>>
>>> The cause is inappropriate overloading of work-items
>>> on work-queues.
>>>
>>> We currently have two work-queues: md_wq and md_misc_wq.
>>> They service 5 different tasks:
>>>
>>>   mddev->flush_work                       md_wq
>>>   mddev->event_work (for dm-raid)         md_misc_wq
>>>   mddev->del_work (mddev_delayed_delete)  md_misc_wq
>>>   mddev->del_work (md_start_sync)         md_misc_wq
>>>   rdev->del_work                          md_misc_wq
>>>
>>> We need to call flush_workqueue() for md_start_sync and ->event_work
>>> while holding reconfig_mutex, but mustn't hold it when
>>> flushing mddev_delayed_delete or rdev->del_work.
>>>
>>> md_wq is a bit special as it has WQ_MEM_RECLAIM so it is
>>> best to leave that alone.
>>>
>>> So create a new workqueue, md_del_wq, and a new work_struct,
>>> mddev->sync_work, so we can keep two classes of work separate.
>>>
>>> md_del_wq and ->del_work are used only for destroying rdev
>>> and mddev.
>>> md_misc_wq is used for event_work and sync_work.
>>>
>>> Also document the purpose of each flush_workqueue() call.
>>>
>>> This removes the lockdep warning.
>>
>> I had exactly the same patch queued internally,
> 
> Cool :-)
> 
>>                                                   but the mdadm test suite still
>> shows a lockdep warning. I haven't had time to check further.
>>
> 
> The only other lockdep warning I've seen lately was some ext4 thing, though I
> haven't tried the full test suite.  I might have a look tomorrow.
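
If I'm reading the description correctly, the split amounts to something
like the sketch below. This is only my paraphrase of the changelog, not
the actual patch, and any detail beyond the names mentioned above
(md_del_wq, ->sync_work, ->del_work, md_misc_wq) is guesswork on my part:

/* new queue, used only for tearing down rdev/mddev objects */
static struct workqueue_struct *md_del_wq;

/* in md_init(), alongside the existing queues */
md_del_wq = alloc_workqueue("md_del", 0, 0);

/* object destruction is queued on md_del_wq, which is never flushed
 * while reconfig_mutex is held (e.g. from mddev_put()) */
INIT_WORK(&mddev->del_work, mddev_delayed_delete);
queue_work(md_del_wq, &mddev->del_work);

/* deferred md_start_sync and ->event_work stay on md_misc_wq, the queue
 * that is flushed under reconfig_mutex (e.g. from md_check_recovery()) */
INIT_WORK(&mddev->sync_work, md_start_sync);
queue_work(md_misc_wq, &mddev->sync_work);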

I'm also seeing a lockdep warning with or without this patch,
reproducible with:

export IMSM_NO_PLATFORM=1 # for platforms without IMSM
mdadm -C /dev/md/imsm0 -eimsm -n4 /dev/sd[a-d] -R
mdadm -C /dev/md/vol0 -l5 -n4 /dev/sd[a-d] -R --assume-clean
mdadm -Ss

[   42.489474] md/raid:md126: device sdd operational as raid disk 3
[   42.490731] md/raid:md126: device sdc operational as raid disk 2
[   42.492363] md/raid:md126: device sdb operational as raid disk 1
[   42.494177] md/raid:md126: device sda operational as raid disk 0
[   42.500864] md/raid:md126: raid level 5 active with 4 out of 4 devices, algorithm 0
[   42.503740] md126: detected capacity change from 0 to 3208642560
[   45.177525] md126: detected capacity change from 3208642560 to 0
[   45.179136] md: md126 still in use.
[   45.185846] md: md126 stopped.
[   45.211960] 
[   45.212081] ======================================================
[   45.212081] WARNING: possible circular locking dependency detected
[   45.212081] 4.14.0-rc3+ #391 Not tainted
[   45.212081] ------------------------------------------------------
[   45.212081] kworker/1:0/17 is trying to acquire lock:
[   45.212081]  (kn->count#89){++++}, at: [<ffffffff812514f3>] kernfs_remove+0x23/0x40
[   45.212081] 
[   45.212081] but task is already holding lock:
[   45.212081]  ((&mddev->del_work)){+.+.}, at: [<ffffffff810783d0>] process_one_work+0x1c0/0x630
[   45.212081] 
[   45.212081] which lock already depends on the new lock.
[   45.212081] 
[   45.212081] 
[   45.212081] the existing dependency chain (in reverse order) is:
[   45.212081] 
[   45.212081] -> #2 ((&mddev->del_work)){+.+.}:
[   45.212081]        __lock_acquire+0xc48/0x1140
[   45.212081]        lock_acquire+0x19d/0x1d0
[   45.212081]        process_one_work+0x212/0x630
[   45.212081]        worker_thread+0x211/0x400
[   45.212081]        kthread+0x172/0x180
[   45.212081]        ret_from_fork+0x27/0x40
[   45.212081] 
[   45.212081] -> #1 ("md_del"){+.+.}:
[   45.212081]        __lock_acquire+0xc48/0x1140
[   45.212081]        lock_acquire+0x19d/0x1d0
[   45.212081]        flush_workqueue+0xbb/0x460
[   45.212081]        new_dev_store+0xb2/0x1e0 [md_mod]
[   45.212081]        md_attr_store+0x90/0xc0 [md_mod]
[   45.212081]        sysfs_kf_write+0x42/0x50
[   45.212081]        kernfs_fop_write+0x119/0x180
[   45.212081]        __vfs_write+0x28/0x110
[   45.212081]        vfs_write+0xb4/0x1a0
[   45.212081]        SyS_write+0x49/0xa0
[   45.212081]        entry_SYSCALL_64_fastpath+0x18/0xad
[   45.212081] 
[   45.212081] -> #0 (kn->count#89){++++}:
[   45.212081]        check_prev_add+0x125/0x690
[   45.212081]        __lock_acquire+0xc48/0x1140
[   45.212081]        lock_acquire+0x19d/0x1d0
[   45.212081]        __kernfs_remove+0x15e/0x270
[   45.212081]        kernfs_remove+0x23/0x40
[   45.212081]        sysfs_remove_dir+0x53/0x60
[   45.212081]        kobject_del+0x18/0x50
[   45.212081]        mddev_delayed_delete+0x28/0x40 [md_mod]
[   45.212081]        process_one_work+0x330/0x630
[   45.212081]        worker_thread+0x211/0x400
[   45.212081]        kthread+0x172/0x180
[   45.212081]        ret_from_fork+0x27/0x40
[   45.212081] 
[   45.212081] other info that might help us debug this:
[   45.212081] 
[   45.212081] Chain exists of:
[   45.212081]   kn->count#89 --> "md_del" --> (&mddev->del_work)
[   45.212081] 
[   45.212081]  Possible unsafe locking scenario:
[   45.212081] 
[   45.212081]        CPU0                    CPU1
[   45.212081]        ----                    ----
[   45.212081]   lock((&mddev->del_work));
[   45.212081]                                lock("md_del");
[   45.212081]                                lock((&mddev->del_work));
[   45.212081]   lock(kn->count#89);
[   45.212081] 
[   45.212081]  *** DEADLOCK ***
[   45.212081] 
[   45.212081] 2 locks held by kworker/1:0/17:
[   45.212081]  #0:  ("md_del"){+.+.}, at: [<ffffffff810783d0>] process_one_work+0x1c0/0x630
[   45.212081]  #1:  ((&mddev->del_work)){+.+.}, at: [<ffffffff810783d0>] process_one_work+0x1c0/0x630
[   45.212081] 
[   45.212081] stack backtrace:
[   45.212081] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 4.14.0-rc3+ #391
[   45.212081] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[   45.212081] Workqueue: md_del mddev_delayed_delete [md_mod]
[   45.212081] Call Trace:
[   45.212081]  dump_stack+0x70/0x9a
[   45.212081]  print_circular_bug+0x2d3/0x2f0
[   45.212081]  ? __print_lock_name+0x80/0x80
[   45.212081]  check_prev_add+0x125/0x690
[   45.212081]  ? ret_from_fork+0x27/0x40
[   45.212081]  __lock_acquire+0xc48/0x1140
[   45.212081]  ? __lock_acquire+0xc48/0x1140
[   45.212081]  ? __print_lock_name+0x80/0x80
[   45.212081]  lock_acquire+0x19d/0x1d0
[   45.212081]  ? kernfs_remove+0x23/0x40
[   45.212081]  __kernfs_remove+0x15e/0x270
[   45.212081]  ? kernfs_remove+0x23/0x40
[   45.212081]  kernfs_remove+0x23/0x40
[   45.212081]  sysfs_remove_dir+0x53/0x60
[   45.212081]  kobject_del+0x18/0x50
[   45.212081]  mddev_delayed_delete+0x28/0x40 [md_mod]
[   45.212081]  process_one_work+0x330/0x630
[   45.212081]  worker_thread+0x211/0x400
[   45.212081]  kthread+0x172/0x180
[   45.212081]  ? process_one_work+0x630/0x630
[   45.212081]  ? kthread_stop+0x250/0x250
[   45.212081]  ret_from_fork+0x27/0x40
[   45.295721] md: md127 stopped.
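
In case it helps, my reading of the chain in the trace is (take it with a
grain of salt):

  sysfs write (new_dev_store)          md_del worker (mddev_delayed_delete)
  ---------------------------          -------------------------------------
  holds the kernfs active ref          runs mddev->del_work
  on the attribute (kn->count)
  flush_workqueue("md_del"),           kernfs_remove()/kobject_del(), which
  waits for the worker to finish       waits for kn->count to be released

i.e. the destruction work itself removes sysfs entries, so flushing the
destruction workqueue from a sysfs store handler can still deadlock.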

Thanks,
Artur
