linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Garrett <matthew.garrett@nebula.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	kexec@lists.infradead.org
Subject: Re: [Q] Why does kexec use device_shutdown rather than ubind them
Date: Thu, 16 Jan 2014 20:52:46 -0800	[thread overview]
Message-ID: <871u07fbvl.fsf@xmission.com> (raw)
In-Reply-To: <1389929988.7406.18.camel@pasglop> (Benjamin Herrenschmidt's message of "Fri, 17 Jan 2014 14:39:48 +1100")

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> Hi Folks !
>
> Sorry for the semi-random CC list, not sure who owns kexec nowadays. So
> we are working on a new crop of power servers for which the bootloader
> is going to be using kexec.
>
> As expected, we've been chasing a number of reliability issues mostly
> due to drivers not behaving properly, such as leaving devices DMA'ing or
> in a state that upsets the new kernel etc...
>
> So far our approach has been to fix the drivers one by one, adding the
> shutdown() method when it's missing, etc...
>
> But that lead me to wonder ... why shutdown() in the first place ? The
> semantic of shutdown() is that we are going to power the machine off. In
> some cases, that method will actively participate in the shutdown,
> powering things off, spinning disk down, etc....
>
> It doesn't have the semantic of "put the device into a clean state for a
> new driver to pick up". It's also rarely implemented.
>
> On the other hand, the remove() routine is almost everywhere, and is
> already well understood as needing to leave the device in a clean state,
> as it's often used for rmmod (often by the driver developer
> him/herself), more likely to be tested in a condition that doesn't
> involve having the machine off immediately afterward but on the contrare
> in a condition where a new driver can come and try to pick the device
> up.
>
> Additionally, remove() is also what KVM does when assigning devices to
> guest, ie, the original driver is unbound from the host, and VFIO is
> bound in its place.
>
> So we have common purpose with kexec (somewhat) and possibly common (and
> better) testing coverage with remove() than with shutdown().
>
> I plan to experiment a bit in our bootloader see if that makes a
> difference, maybe doing a first pass of unbind for anything that can be
> unbound, and shutdown for the rest.
>
> (I'll probably also sneak it a PCIe hot reset at the end but that's more
> platform specific).
>
> Any opinion ?

When I was young and kexec was a new idea and the device model was just
getting methods.  An argument was made that using remove in the reboot
path was a bad idea because it does more than is necessary so we needed
shutdown.

I wanted to use remove at the time.

There is a lot of silliness now, but my inclination is to remove
shutdown entirely and replace it with remove.  I certainly would not
have heart burn with doing so just for kexec but getting the whole
reboot path at this point would make me very happy and then we could
kill a little used rarely implemented method from the kernel.

Given that we have found emperically that flipping the bus master enable
bit to off helps enormously for stability, but doing so generically
occassionally has unfortunate consequences why not.

I think we have largely survied until now because kdump is so popular
and kdump winds up having to reinitialize devices from any random state.

But like I said I am all for reducing the burden on device driver developers.

Eric

  reply	other threads:[~2014-01-17  4:53 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-17  3:39 [Q] Why does kexec use device_shutdown rather than ubind them Benjamin Herrenschmidt
2014-01-17  4:52 ` Eric W. Biederman [this message]
2014-01-17  5:59   ` Benjamin Herrenschmidt
2014-01-17 14:13     ` Vivek Goyal
2014-01-17 15:57       ` Sumner, William
2014-01-17 21:00       ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871u07fbvl.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew.garrett@nebula.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).