From: Oleg Nesterov <oleg@redhat.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Alex Chiang <achiang@hp.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Berg <johannes@sipsolutions.net>,
	jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, kaneshige.kenji@jp.fujitsu.com
Subject: Re: [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove
Date: Tue, 24 Mar 2009 17:12:59 +0100
Message-ID: <20090324161259.GA5964@redhat.com>
In-Reply-To: <20090324092525.GE6605@elte.hu>

On 03/24, Ingo Molnar wrote:
> >
> > * Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>:
> >
> > Kenji Kaneshige reported the below lockdep problem when testing
> > my patch on one of his machines.
> >
> > > I still have the following kernel error messages in testing with your
> > > latest set of patches (Jesse's linux-next). The test case is removing
> > > e1000e device or its parent bridge by "echo 1 > /sys/bus/pci/devices/
> > > .../remove".
> > >
> > > [  537.379995] =============================================
> > > [  537.380124] [ INFO: possible recursive locking detected ]
> > > [  537.380128] 2.6.29-rc8-kk #1
> > > [  537.380128] ---------------------------------------------
> > > [  537.380128] events/4/56 is trying to acquire lock:
> > > [  537.380128]  (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> > > [  537.380128]
> > > [  537.380128] but task is already holding lock:
> > > [  537.380128]  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> > > [  537.380128]
> > > [  537.380128] other info that might help us debug this:
> > > [  537.380128] 3 locks held by events/4/56:
> > > [  537.380128]  #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> > > [  537.380128]  #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> > > [  537.380128]  #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
> > > [  537.380128]
> > > [  537.380128] stack backtrace:
> > > [  537.380128] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
> > > [  537.380128] Call Trace:
> > > [  537.380128]  [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
> > > [  537.380128]  [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
> > > [  537.380128]  [<ffffffff8026f148>] lock_acquire+0x58/0x80
> > > [  537.380128]  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> > > [  537.380128]  [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
> > > [  537.380128]  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> > > [  537.383380]  [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
> > > [  537.383380]  [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
> > > [  537.383380]  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
> > > [  537.383380]  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
> > > [  537.383380]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
> > > [  537.383380]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
> > > [  537.383380]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
> > > [  537.384382]  [<ffffffff8043e46b>] device_del+0x12b/0x190
> > > [  537.384382]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
> > > [  537.384382]  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
> > > [  537.384382]  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
> > > [  537.384382]  [<ffffffff803c10d9>] remove_callback+0x29/0x40
> > > [  537.384382]  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
> > > [  537.384382]  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> > > [  537.384382]  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> > > [  537.384382]  [<ffffffff8025846f>] worker_thread+0x9f/0x100
> > > [  537.384382]  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> > > [  537.384382]  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> > > [  537.384382]  [<ffffffff8025b89d>] kthread+0x4d/0x80
> > > [  537.384382]  [<ffffffff8020d4ba>] child_rip+0xa/0x20
> > > [  537.386380]  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> > > [  537.386380]  [<ffffffff8025b850>] ? kthread+0x0/0x80
> > > [  537.386380]  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> > >
> > > I think the cause of this error message is flush_workqueue()
> > > from the work of keventd. When removing device using
> > > "/sys/bus/pci/devices/.../remove", pci_remove_bus_device() is
> > > executed by the keventd's work through
> > > device_schedule_callback(), and it invokes e1000e's remove
> > > callback. And then, e1000e's remove callback invokes
> > > flush_workqueue().  Actually, the kernel error messages are not
> > > displayed when I changed e1000e driver to not call
> > > flush_workqueue(). In my understanding, flush_workqueue() from
> > > the work must be avoided because it can cause a deadlock.
> > > Please note that this is not a problem of e1000e driver.
> > > Drivers can use flush_workqueue(), of course.
> >
> > I agree with this analysis; the reason we're seeing this lockdep
> > warning is because the sysfs attribute scheduled a removal for
> > itself using device_schedule_callback(). This is necessary
> > because sysfs attributes can't remove themselves due to other
> > locking issues.
> >
> > My question is -- is it a bug to call flush_workqueue during
> > run_workqueue?
>
> Yes, it generally is.
>
> > Conceptually, I don't think it should be a bug; it should be a
> > nop, since run_workqueue _is_ flushing the work queue.

As was already said, we can deadlock.

Can't e1000_remove() avoid flush_scheduled_work()? (It should always
be avoided when possible.)

Of course, I don't understand this code. But afaics e1000_remove()
can just cancel its own works (in struct e1000_adapter), no?

cancel_work_sync(work) from run_workqueue() should be OK even if
this work is queued on the same wq. If it is queued on the same CPU,
cancel_work_sync() won't block because we are ->current_work.
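
To make the difference between the two calls concrete, here is a rough
userspace model of a single-threaded workqueue. It is not kernel code:
every model_* name is hypothetical, and the model assumes only what is
said above, namely that a flush waits for the running work while a
cancel of a different, still-queued work does not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one worker thread; not kernel code. */
enum model_result { MODEL_OK, MODEL_WOULD_DEADLOCK };

struct model_work {
	bool pending;	/* queued but not yet started */
};

struct model_wq {
	/* Work the worker thread is executing right now, or NULL. */
	const struct model_work *current_work;
};

/*
 * A flush waits for everything on the queue, including the work that
 * is running. "caller" is the work issuing the flush, or NULL when the
 * flush comes from outside the worker thread. A work that flushes its
 * own queue waits for itself.
 */
enum model_result model_flush(const struct model_wq *wq,
			      const struct model_work *caller)
{
	if (caller != NULL && wq->current_work == caller)
		return MODEL_WOULD_DEADLOCK;
	return MODEL_OK;
}

/*
 * Cancelling one specific work takes it off the queue and only has to
 * wait if that same work is the one running. When the caller is the
 * running work and the target is a different one, nothing is waited for.
 */
enum model_result model_cancel_sync(const struct model_wq *wq,
				    const struct model_work *caller,
				    struct model_work *target)
{
	target->pending = false;
	if (wq->current_work == target && caller == target)
		return MODEL_WOULD_DEADLOCK;
	return MODEL_OK;
}
```

In this model, a removal work that calls model_flush() on its own queue
is reported as a deadlock, while the same work calling
model_cancel_sync() on a queued watchdog work completes and leaves the
watchdog work dequeued.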


Btw, again, I don't understand the code, but this looks suspicious:

	e1000_remove:

		set_bit(__E1000_DOWN, &adapter->state);
		del_timer_sync(&adapter->watchdog_timer);
		flush_scheduled_work();

What if e1000_watchdog_task() is running, has already checked
!test_bit(__E1000_DOWN, &adapter->state), but hasn't called
mod_timer(&adapter->phy_info_timer) yet?
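
The interleaving in question can be replayed step by step in a small
userspace model. This is only a sketch: the model_* names are
hypothetical, one boolean stands in for the __E1000_DOWN bit and
another for a pending timer, and it assumes the task re-arms a timer
based on a check it made earlier.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not the driver's struct. */
struct model_adapter {
	bool down;		/* models the __E1000_DOWN bit */
	bool timer_pending;	/* models a timer that is armed */
};

/* Watchdog task, step 1: sample the flag. */
bool model_task_check(const struct model_adapter *a)
{
	return !a->down;
}

/* Watchdog task, step 2: re-arm the timer based on the earlier sample. */
void model_task_rearm(struct model_adapter *a, bool was_up)
{
	if (was_up)
		a->timer_pending = true;
}

/* Remove path: set the flag, then delete the timer. */
void model_remove_begin(struct model_adapter *a)
{
	a->down = true;
	a->timer_pending = false;
}

/*
 * Replays the suspicious ordering: the task checks the flag, the remove
 * path runs in between, then the task acts on its stale check.
 * Returns true if a timer is still armed afterwards.
 */
bool model_race_leaves_timer(void)
{
	struct model_adapter a = { false, false };
	bool was_up = model_task_check(&a);

	model_remove_begin(&a);
	model_task_rearm(&a, was_up);
	return a.timer_pending;
}

/* Same steps, but the task checks only after the remove path has run. */
bool model_ordered_leaves_timer(void)
{
	struct model_adapter a = { false, false };

	model_remove_begin(&a);
	model_task_rearm(&a, model_task_check(&a));
	return a.timer_pending;
}
```

In the racy ordering the model ends with a timer armed even though the
remove path already deleted it; in the ordered case no timer is left
behind.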

Oleg.

