From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: paulmck@linux.vnet.ibm.com, tj@kernel.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org, der.herr@hofr.at,
	dave@stgolabs.net, riel@redhat.com, viro@ZenIV.linux.org.uk,
	torvalds@linux-foundation.org
Subject: Re: [RFC][PATCH 09/13] hotplug: Replace hotplug lock with percpu-rwsem
Date: Wed, 24 Jun 2015 18:15:25 +0200
Message-ID: <20150624161524.GO3644@twins.programming.kicks-ass.net>
In-Reply-To: <20150624151212.GA3766@redhat.com>

On Wed, Jun 24, 2015 at 05:12:12PM +0200, Oleg Nesterov wrote:
> On 06/24, Peter Zijlstra wrote:

> > I'm confused.. why isn't the read-in-read recursion good enough?
> 
> Because the code above can actually deadlock if 2 CPU's do this at
> the same time?

Hmm, yes... this makes the hotplug locking worse than I feared, but alas.
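
To make the hazard concrete, here is a minimal sketch of the generic way a
nested read can deadlock against a concurrent writer (hypothetical callers
and lock name, not the code you quoted), assuming the percpu-rwsem is
writer-fair, i.e. a pending writer blocks new readers; it uses the
DEFINE_STATIC_PERCPU_RWSEM helper from 08/13:

#include <linux/percpu-rwsem.h>

static DEFINE_STATIC_PERCPU_RWSEM(hotplug_rwsem);

/* CPU0 */
static void cpu0_reader(void)
{
	percpu_down_read(&hotplug_rwsem);	/* outer read section */

	/* ... CPU1 calls percpu_down_write() about here ... */

	percpu_down_read(&hotplug_rwsem);	/* nested read: queued behind
						 * CPU1's pending writer */
	percpu_up_read(&hotplug_rwsem);
	percpu_up_read(&hotplug_rwsem);
}

/* CPU1, concurrently */
static void cpu1_writer(void)
{
	percpu_down_write(&hotplug_rwsem);	/* waits for CPU0's outer read,
						 * while CPU0's nested read
						 * waits for this writer */
	percpu_up_write(&hotplug_rwsem);
}

CPU0 waits on CPU1 and CPU1 waits on CPU0, so neither makes progress unless
the read side is made recursion-aware.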

FYI, the actual splat.

---

[    7.399737] ======================================================
[    7.406640] [ INFO: possible circular locking dependency detected ]
[    7.413643] 4.1.0-02756-ge3d06bd-dirty #185 Not tainted
[    7.419481] -------------------------------------------------------
[    7.426483] kworker/0:1/215 is trying to acquire lock:
[    7.432221]  (&cpu_hotplug.rwsem){++++++}, at: [<ffffffff810ebd63>] apply_workqueue_attrs+0x183/0x4b0
[    7.442564] 
[    7.442564] but task is already holding lock:
[    7.449079]  (&item->mutex){+.+.+.}, at: [<ffffffff815c4dc3>] drm_global_item_ref+0x33/0xe0
[    7.458455] 
[    7.458455] which lock already depends on the new lock.
[    7.458455] 
[    7.467591] 
[    7.467591] the existing dependency chain (in reverse order) is:
[    7.475949] 
-> #3 (&item->mutex){+.+.+.}:
[    7.480662]        [<ffffffff811232b1>] lock_acquire+0xd1/0x290
[    7.487280]        [<ffffffff818ea777>] mutex_lock_nested+0x47/0x3c0
[    7.494390]        [<ffffffff815c4dc3>] drm_global_item_ref+0x33/0xe0
[    7.501596]        [<ffffffff815dcd90>] mgag200_mm_init+0x50/0x1c0
[    7.508514]        [<ffffffff815d757f>] mgag200_driver_load+0x30f/0x500
[    7.515916]        [<ffffffff815b1491>] drm_dev_register+0xb1/0x100
[    7.522922]        [<ffffffff815b428d>] drm_get_pci_dev+0x8d/0x1e0
[    7.529840]        [<ffffffff815dbd3f>] mga_pci_probe+0x9f/0xc0
[    7.536463]        [<ffffffff814bde92>] local_pci_probe+0x42/0xa0
[    7.543283]        [<ffffffff810e54e8>] work_for_cpu_fn+0x18/0x30
[    7.550106]        [<ffffffff810e9d57>] process_one_work+0x1e7/0x7e0
[    7.557214]        [<ffffffff810ea518>] worker_thread+0x1c8/0x460
[    7.564029]        [<ffffffff810f05b6>] kthread+0xf6/0x110
[    7.570166]        [<ffffffff818eefdf>] ret_from_fork+0x3f/0x70
[    7.576792] 
-> #2 (drm_global_mutex){+.+.+.}:
[    7.581891]        [<ffffffff811232b1>] lock_acquire+0xd1/0x290
[    7.588514]        [<ffffffff818ea777>] mutex_lock_nested+0x47/0x3c0
[    7.595622]        [<ffffffff815b1406>] drm_dev_register+0x26/0x100
[    7.602632]        [<ffffffff815b428d>] drm_get_pci_dev+0x8d/0x1e0
[    7.609547]        [<ffffffff815dbd3f>] mga_pci_probe+0x9f/0xc0
[    7.616170]        [<ffffffff814bde92>] local_pci_probe+0x42/0xa0
[    7.622987]        [<ffffffff810e54e8>] work_for_cpu_fn+0x18/0x30
[    7.629806]        [<ffffffff810e9d57>] process_one_work+0x1e7/0x7e0
[    7.636913]        [<ffffffff810ea518>] worker_thread+0x1c8/0x460
[    7.643727]        [<ffffffff810f05b6>] kthread+0xf6/0x110
[    7.649866]        [<ffffffff818eefdf>] ret_from_fork+0x3f/0x70
[    7.656490] 
-> #1 ((&wfc.work)){+.+.+.}:
[    7.661104]        [<ffffffff811232b1>] lock_acquire+0xd1/0x290
[    7.667727]        [<ffffffff810e737d>] flush_work+0x3d/0x260
[    7.674155]        [<ffffffff810e9822>] work_on_cpu+0x82/0x90
[    7.680584]        [<ffffffff814bf2a2>] pci_device_probe+0x112/0x120
[    7.687692]        [<ffffffff815e685f>] driver_probe_device+0x17f/0x2e0
[    7.695094]        [<ffffffff815e6a94>] __driver_attach+0x94/0xa0
[    7.701910]        [<ffffffff815e4786>] bus_for_each_dev+0x66/0xa0
[    7.708824]        [<ffffffff815e626e>] driver_attach+0x1e/0x20
[    7.715447]        [<ffffffff815e5ed8>] bus_add_driver+0x168/0x210
[    7.722361]        [<ffffffff815e7880>] driver_register+0x60/0xe0
[    7.729180]        [<ffffffff814bd754>] __pci_register_driver+0x64/0x70
[    7.736580]        [<ffffffff81f9a10d>] pcie_portdrv_init+0x66/0x79
[    7.743593]        [<ffffffff810002c8>] do_one_initcall+0x88/0x1c0
[    7.750508]        [<ffffffff81f5f169>] kernel_init_freeable+0x1f5/0x282
[    7.758005]        [<ffffffff818da36e>] kernel_init+0xe/0xe0
[    7.764338]        [<ffffffff818eefdf>] ret_from_fork+0x3f/0x70
[    7.770961] 
-> #0 (&cpu_hotplug.rwsem){++++++}:
[    7.776255]        [<ffffffff81122817>] __lock_acquire+0x2207/0x2240
[    7.783363]        [<ffffffff811232b1>] lock_acquire+0xd1/0x290
[    7.789986]        [<ffffffff810cb6e2>] get_online_cpus+0x62/0xb0
[    7.796805]        [<ffffffff810ebd63>] apply_workqueue_attrs+0x183/0x4b0
[    7.804398]        [<ffffffff810ed7bc>] __alloc_workqueue_key+0x2ec/0x560
[    7.811992]        [<ffffffff815cbefa>] ttm_mem_global_init+0x5a/0x310
[    7.819295]        [<ffffffff815dcbb2>] mgag200_ttm_mem_global_init+0x12/0x20
[    7.827277]        [<ffffffff815c4df5>] drm_global_item_ref+0x65/0xe0
[    7.834481]        [<ffffffff815dcd90>] mgag200_mm_init+0x50/0x1c0
[    7.841395]        [<ffffffff815d757f>] mgag200_driver_load+0x30f/0x500
[    7.848793]        [<ffffffff815b1491>] drm_dev_register+0xb1/0x100
[    7.855804]        [<ffffffff815b428d>] drm_get_pci_dev+0x8d/0x1e0
[    7.862715]        [<ffffffff815dbd3f>] mga_pci_probe+0x9f/0xc0
[    7.869338]        [<ffffffff814bde92>] local_pci_probe+0x42/0xa0
[    7.876159]        [<ffffffff810e54e8>] work_for_cpu_fn+0x18/0x30
[    7.882979]        [<ffffffff810e9d57>] process_one_work+0x1e7/0x7e0
[    7.890087]        [<ffffffff810ea518>] worker_thread+0x1c8/0x460
[    7.896907]        [<ffffffff810f05b6>] kthread+0xf6/0x110
[    7.903043]        [<ffffffff818eefdf>] ret_from_fork+0x3f/0x70
[    7.909673] 
[    7.909673] other info that might help us debug this:
[    7.909673] 
[    7.918616] Chain exists of:
  &cpu_hotplug.rwsem --> drm_global_mutex --> &item->mutex

[    7.927907]  Possible unsafe locking scenario:
[    7.927907] 
[    7.934521]        CPU0                    CPU1
[    7.939580]        ----                    ----
[    7.944639]   lock(&item->mutex);
[    7.948359]                                lock(drm_global_mutex);
[    7.955292]                                lock(&item->mutex);
[    7.961855]   lock(&cpu_hotplug.rwsem);
[    7.966158] 
[    7.966158]  *** DEADLOCK ***
[    7.966158] 
[    7.972771] 4 locks held by kworker/0:1/215:
[    7.977539]  #0:  ("events"){.+.+.+}, at: [<ffffffff810e9cc6>] process_one_work+0x156/0x7e0
[    7.986929]  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff810e9cc6>] process_one_work+0x156/0x7e0
[    7.996600]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffff815b1406>] drm_dev_register+0x26/0x100
[    8.006690]  #3:  (&item->mutex){+.+.+.}, at: [<ffffffff815c4dc3>] drm_global_item_ref+0x33/0xe0
[    8.016559] 
[    8.016559] stack backtrace:
[    8.021427] CPU: 0 PID: 215 Comm: kworker/0:1 Not tainted 4.1.0-02756-ge3d06bd-dirty #185
[    8.030565] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[    8.042034] Workqueue: events work_for_cpu_fn
[    8.046909]  ffffffff82857e30 ffff88042b3437c8 ffffffff818e5189 0000000000000011
[    8.055216]  ffffffff8282aa40 ffff88042b343818 ffffffff8111ee76 0000000000000004
[    8.063522]  ffff88042b343888 ffff88042b33f040 0000000000000004 ffff88042b33f040
[    8.071827] Call Trace:
[    8.074559]  [<ffffffff818e5189>] dump_stack+0x4c/0x6e
[    8.080300]  [<ffffffff8111ee76>] print_circular_bug+0x1c6/0x220
[    8.087011]  [<ffffffff81122817>] __lock_acquire+0x2207/0x2240
[    8.093528]  [<ffffffff811232b1>] lock_acquire+0xd1/0x290
[    8.099559]  [<ffffffff810ebd63>] ? apply_workqueue_attrs+0x183/0x4b0
[    8.106755]  [<ffffffff810cb6e2>] get_online_cpus+0x62/0xb0
[    8.112981]  [<ffffffff810ebd63>] ? apply_workqueue_attrs+0x183/0x4b0
[    8.120176]  [<ffffffff810ead27>] ? alloc_workqueue_attrs+0x27/0x80
[    8.127178]  [<ffffffff810ebd63>] apply_workqueue_attrs+0x183/0x4b0
[    8.134182]  [<ffffffff8111cc21>] ? debug_mutex_init+0x31/0x40
[    8.140690]  [<ffffffff810ed7bc>] __alloc_workqueue_key+0x2ec/0x560
[    8.147691]  [<ffffffff815cbefa>] ttm_mem_global_init+0x5a/0x310
[    8.154405]  [<ffffffff8122b050>] ? __kmalloc+0x5e0/0x630
[    8.160435]  [<ffffffff815c4de2>] ? drm_global_item_ref+0x52/0xe0
[    8.167243]  [<ffffffff815dcbb2>] mgag200_ttm_mem_global_init+0x12/0x20
[    8.174631]  [<ffffffff815c4df5>] drm_global_item_ref+0x65/0xe0
[    8.181245]  [<ffffffff815dcd90>] mgag200_mm_init+0x50/0x1c0
[    8.187570]  [<ffffffff815d757f>] mgag200_driver_load+0x30f/0x500
[    8.194383]  [<ffffffff815b1491>] drm_dev_register+0xb1/0x100
[    8.200802]  [<ffffffff815b428d>] drm_get_pci_dev+0x8d/0x1e0
[    8.207125]  [<ffffffff818ebf9e>] ? mutex_unlock+0xe/0x10
[    8.213156]  [<ffffffff815dbd3f>] mga_pci_probe+0x9f/0xc0
[    8.219187]  [<ffffffff814bde92>] local_pci_probe+0x42/0xa0
[    8.225412]  [<ffffffff8111db81>] ? __lock_is_held+0x51/0x80
[    8.231736]  [<ffffffff810e54e8>] work_for_cpu_fn+0x18/0x30
[    8.237962]  [<ffffffff810e9d57>] process_one_work+0x1e7/0x7e0
[    8.244477]  [<ffffffff810e9cc6>] ? process_one_work+0x156/0x7e0
[    8.251187]  [<ffffffff810ea518>] worker_thread+0x1c8/0x460
[    8.257410]  [<ffffffff810ea350>] ? process_one_work+0x7e0/0x7e0
[    8.264120]  [<ffffffff810ea350>] ? process_one_work+0x7e0/0x7e0
[    8.270829]  [<ffffffff810f05b6>] kthread+0xf6/0x110
[    8.276375]  [<ffffffff818ee230>] ? _raw_spin_unlock_irq+0x30/0x60
[    8.283282]  [<ffffffff810f04c0>] ? kthread_create_on_node+0x220/0x220
[    8.290566]  [<ffffffff818eefdf>] ret_from_fork+0x3f/0x70
[    8.296597]  [<ffffffff810f04c0>] ? kthread_create_on_node+0x220/0x220

Thread overview: 106+ messages
2015-06-22 12:16 [RFC][PATCH 00/13] percpu rwsem -v2 Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 01/13] rcu: Create rcu_sync infrastructure Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 02/13] rcusync: Introduce struct rcu_sync_ops Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 03/13] rcusync: Add the CONFIG_PROVE_RCU checks Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 04/13] rcusync: Introduce rcu_sync_dtor() Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 05/13] percpu-rwsem: Optimize readers and reduce global impact Peter Zijlstra
2015-06-22 23:02   ` Oleg Nesterov
2015-06-23  7:28   ` Nicholas Mc Guire
2015-06-25 19:08     ` Peter Zijlstra
2015-06-25 19:17       ` Tejun Heo
2015-06-29  9:32         ` Peter Zijlstra
2015-06-29 15:12           ` Tejun Heo
2015-06-29 15:14             ` Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 06/13] percpu-rwsem: Provide percpu_down_read_trylock() Peter Zijlstra
2015-06-22 23:08   ` Oleg Nesterov
2015-06-22 12:16 ` [RFC][PATCH 07/13] sched: Reorder task_struct Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 08/13] percpu-rwsem: DEFINE_STATIC_PERCPU_RWSEM Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 09/13] hotplug: Replace hotplug lock with percpu-rwsem Peter Zijlstra
2015-06-22 22:57   ` Oleg Nesterov
2015-06-23  7:16     ` Peter Zijlstra
2015-06-23 17:01       ` Oleg Nesterov
2015-06-23 17:53         ` Peter Zijlstra
2015-06-24 13:50           ` Oleg Nesterov
2015-06-24 14:13             ` Peter Zijlstra
2015-06-24 15:12               ` Oleg Nesterov
2015-06-24 16:15                 ` Peter Zijlstra [this message]
2015-06-28 23:56             ` [PATCH 0/3] percpu-rwsem: introduce percpu_rw_semaphore->recursive mode Oleg Nesterov
2015-06-28 23:56               ` [PATCH 1/3] rcusync: introduce rcu_sync_struct->exclusive mode Oleg Nesterov
2015-06-28 23:56               ` [PATCH 2/3] percpu-rwsem: don't use percpu_rw_semaphore->rw_sem to exclude writers Oleg Nesterov
2015-06-28 23:56               ` [PATCH 3/3] percpu-rwsem: introduce percpu_rw_semaphore->recursive mode Oleg Nesterov
2015-06-22 12:16 ` [RFC][PATCH 10/13] fs/locks: Replace lg_global with a percpu-rwsem Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 11/13] fs/locks: Replace lg_local with a per-cpu spinlock Peter Zijlstra
2015-06-23  0:19   ` Oleg Nesterov
2015-06-22 12:16 ` [RFC][PATCH 12/13] stop_machine: Remove lglock Peter Zijlstra
2015-06-22 22:21   ` Oleg Nesterov
2015-06-23 10:09     ` Peter Zijlstra
2015-06-23 10:55       ` Peter Zijlstra
2015-06-23 11:20         ` Peter Zijlstra
2015-06-23 13:08           ` Peter Zijlstra
2015-06-23 16:36             ` Oleg Nesterov
2015-06-23 17:30             ` Paul E. McKenney
2015-06-23 18:04               ` Peter Zijlstra
2015-06-23 18:26                 ` Paul E. McKenney
2015-06-23 19:05                   ` Paul E. McKenney
2015-06-24  2:23                     ` Paul E. McKenney
2015-06-24  8:32                       ` Peter Zijlstra
2015-06-24  9:31                         ` Peter Zijlstra
2015-06-24 13:48                           ` Paul E. McKenney
2015-06-24 15:01                         ` Paul E. McKenney
2015-06-24 15:34                           ` Peter Zijlstra
2015-06-24  7:35                   ` Peter Zijlstra
2015-06-24  8:42                     ` Ingo Molnar
2015-06-24 13:39                       ` Paul E. McKenney
2015-06-24 13:43                         ` Ingo Molnar
2015-06-24 14:03                           ` Paul E. McKenney
2015-06-24 14:50                     ` Paul E. McKenney
2015-06-24 15:01                       ` Peter Zijlstra
2015-06-24 15:27                         ` Paul E. McKenney
2015-06-24 15:40                           ` Peter Zijlstra
2015-06-24 16:09                             ` Paul E. McKenney
2015-06-24 16:42                               ` Peter Zijlstra
2015-06-24 17:10                                 ` Paul E. McKenney
2015-06-24 17:20                                   ` Paul E. McKenney
2015-06-24 17:29                                     ` Peter Zijlstra
2015-06-24 17:28                                   ` Peter Zijlstra
2015-06-24 17:32                                     ` Peter Zijlstra
2015-06-24 18:14                                     ` Peter Zijlstra
2015-06-24 17:58                                   ` Peter Zijlstra
2015-06-25  3:23                                     ` Paul E. McKenney
2015-06-25 11:07                                       ` Peter Zijlstra
2015-06-25 13:47                                         ` Paul E. McKenney
2015-06-25 14:20                                           ` Peter Zijlstra
2015-06-25 14:51                                             ` Paul E. McKenney
2015-06-26 12:32                                               ` Peter Zijlstra
2015-06-26 16:14                                                 ` Paul E. McKenney
2015-06-29  7:56                                                   ` Peter Zijlstra
2015-06-30 21:32                                                     ` Paul E. McKenney
2015-07-01 11:56                                                       ` Peter Zijlstra
2015-07-01 15:56                                                         ` Paul E. McKenney
2015-07-01 16:16                                                           ` Peter Zijlstra
2015-07-01 18:45                                                             ` Paul E. McKenney
2015-06-23 14:39         ` Paul E. McKenney
2015-06-23 16:20       ` Oleg Nesterov
2015-06-23 17:24         ` Oleg Nesterov
2015-06-25 19:18           ` Peter Zijlstra
2015-06-22 12:16 ` [RFC][PATCH 13/13] locking: " Peter Zijlstra
2015-06-22 12:36 ` [RFC][PATCH 00/13] percpu rwsem -v2 Peter Zijlstra
2015-06-22 18:11 ` Daniel Wagner
2015-06-22 19:05   ` Peter Zijlstra
2015-06-23  9:35     ` Daniel Wagner
2015-06-23 10:00       ` Ingo Molnar
2015-06-23 14:34       ` Peter Zijlstra
2015-06-23 14:56         ` Daniel Wagner
2015-06-23 17:50           ` Peter Zijlstra
2015-06-23 19:36             ` Peter Zijlstra
2015-06-24  8:46               ` Ingo Molnar
2015-06-24  9:01                 ` Peter Zijlstra
2015-06-24  9:18                 ` Daniel Wagner
2015-07-01  5:57                   ` Daniel Wagner
2015-07-01 21:54                     ` Linus Torvalds
2015-07-02  9:41                       ` Peter Zijlstra
2015-07-20  5:53                         ` Daniel Wagner
2015-07-20 18:44                           ` Linus Torvalds
2015-06-22 20:06 ` Linus Torvalds
2015-06-23 16:10 ` Davidlohr Bueso
2015-06-23 16:21   ` Peter Zijlstra
