All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	David Airlie <airlied@linux.ie>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Daniel Vetter <daniel@ffwll.ch>,
	dri-devel@lists.freedesktop.org,
	Harald Freudenberger <freude@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	intel-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Jason Herne <jjherne@linux.ibm.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	kvm@vger.kernel.org, linux-s390@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhi Wang <zhi.a.wang@intel.com>,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>, Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback
Date: Wed, 20 Jul 2022 08:56:49 -0300	[thread overview]
Message-ID: <20220720115649.GS4609@nvidia.com> (raw)
In-Reply-To: <874jzcp6nz.fsf@redhat.com>

On Wed, Jul 20, 2022 at 09:47:12AM +0200, Cornelia Huck wrote:

> > If the FSM trapped in a bad state here, such as
> > VFIO_CCW_STATE_NOT_OPER, then it means it should have already unpinned
> > the pages and this is considered a success for this purpose
> 
> A rather pathological case would be a subchannel that cannot be
> quiesced and does not end up being non-operational; in theory, the
> hardware could still try to access the buffers we provided for I/O. I'd
> say that is extremely unlikely, we might log it, but really cannot do
> anything else.

I think if the FSM can't reach NOT_OPER then it would be appropriate
to panic the kernel when it realizes it has lost control of the
device.

> > The return code here exists only to return to userspace so it can
> > detect during a VFIO_DEVICE_RESET that the device has crashed
> > irrecoverably.
> 
> Does it imply only that ("it's dead, Jim"), or can it also imply a
> runaway device? Not that userspace can do much in any case.

The kernel cannot permit a runaway device, the driver must panic if it
unable to quiet the device's DMA.

I assume this return from RESET is for cases where quieting was
successful.

Jason

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: kvm@vger.kernel.org, David Airlie <airlied@linux.ie>,
	Kevin Tian <kevin.tian@intel.com>,
	dri-devel@lists.freedesktop.org,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>,
	linux-s390@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	intel-gfx@lists.freedesktop.org, Zhi Wang <zhi.a.wang@intel.com>,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	intel-gvt-dev@lists.freedesktop.org,
	Jason Herne <jjherne@linux.ibm.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>
Subject: Re: [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback
Date: Wed, 20 Jul 2022 08:56:49 -0300	[thread overview]
Message-ID: <20220720115649.GS4609@nvidia.com> (raw)
In-Reply-To: <874jzcp6nz.fsf@redhat.com>

On Wed, Jul 20, 2022 at 09:47:12AM +0200, Cornelia Huck wrote:

> > If the FSM trapped in a bad state here, such as
> > VFIO_CCW_STATE_NOT_OPER, then it means it should have already unpinned
> > the pages and this is considered a success for this purpose
> 
> A rather pathological case would be a subchannel that cannot be
> quiesced and does not end up being non-operational; in theory, the
> hardware could still try to access the buffers we provided for I/O. I'd
> say that is extremely unlikely, we might log it, but really cannot do
> anything else.

I think if the FSM can't reach NOT_OPER then it would be appropriate
to panic the kernel when it realizes it has lost control of the
device.

> > The return code here exists only to return to userspace so it can
> > detect during a VFIO_DEVICE_RESET that the device has crashed
> > irrecoverably.
> 
> Does it imply only that ("it's dead, Jim"), or can it also imply a
> runaway device? Not that userspace can do much in any case.

The kernel cannot permit a runaway device, the driver must panic if it
unable to quiet the device's DMA.

I assume this return from RESET is for cases where quieting was
successful.

Jason

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: kvm@vger.kernel.org, David Airlie <airlied@linux.ie>,
	dri-devel@lists.freedesktop.org,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>,
	linux-s390@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	intel-gfx@lists.freedesktop.org,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	intel-gvt-dev@lists.freedesktop.org,
	Jason Herne <jjherne@linux.ibm.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>
Subject: Re: [Intel-gfx] [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback
Date: Wed, 20 Jul 2022 08:56:49 -0300	[thread overview]
Message-ID: <20220720115649.GS4609@nvidia.com> (raw)
In-Reply-To: <874jzcp6nz.fsf@redhat.com>

On Wed, Jul 20, 2022 at 09:47:12AM +0200, Cornelia Huck wrote:

> > If the FSM trapped in a bad state here, such as
> > VFIO_CCW_STATE_NOT_OPER, then it means it should have already unpinned
> > the pages and this is considered a success for this purpose
> 
> A rather pathological case would be a subchannel that cannot be
> quiesced and does not end up being non-operational; in theory, the
> hardware could still try to access the buffers we provided for I/O. I'd
> say that is extremely unlikely, we might log it, but really cannot do
> anything else.

I think if the FSM can't reach NOT_OPER then it would be appropriate
to panic the kernel when it realizes it has lost control of the
device.

> > The return code here exists only to return to userspace so it can
> > detect during a VFIO_DEVICE_RESET that the device has crashed
> > irrecoverably.
> 
> Does it imply only that ("it's dead, Jim"), or can it also imply a
> runaway device? Not that userspace can do much in any case.

The kernel cannot permit a runaway device, the driver must panic if it
unable to quiet the device's DMA.

I assume this return from RESET is for cases where quieting was
successful.

Jason

  reply	other threads:[~2022-07-20 11:57 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-05  0:59 [PATCH v3 0/2] Remove the VFIO_IOMMU_NOTIFY_DMA_UNMAP notifier Jason Gunthorpe
2022-07-05  0:59 ` Jason Gunthorpe
2022-07-05  0:59 ` [Intel-gfx] " Jason Gunthorpe
2022-07-05  0:59 ` [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback Jason Gunthorpe
2022-07-05  0:59   ` [Intel-gfx] " Jason Gunthorpe
2022-07-05  0:59   ` Jason Gunthorpe
2022-07-05  8:31   ` Zhenyu Wang
2022-07-05  8:31     ` [Intel-gfx] " Zhenyu Wang
2022-07-05  8:31     ` Zhenyu Wang
2022-07-07 21:37   ` Alex Williamson
2022-07-07 21:37     ` [Intel-gfx] " Alex Williamson
2022-07-07 21:37     ` Alex Williamson
2022-07-19 23:44     ` Jason Gunthorpe
2022-07-19 23:44       ` [Intel-gfx] " Jason Gunthorpe
2022-07-19 23:44       ` Jason Gunthorpe
2022-07-20  7:47       ` Cornelia Huck
2022-07-20  7:47         ` [Intel-gfx] " Cornelia Huck
2022-07-20  7:47         ` Cornelia Huck
2022-07-20 11:56         ` Jason Gunthorpe [this message]
2022-07-20 11:56           ` [Intel-gfx] " Jason Gunthorpe
2022-07-20 11:56           ` Jason Gunthorpe
2022-07-05  0:59 ` [PATCH v3 2/2] vfio: Replace the iommu notifier with a device list Jason Gunthorpe
2022-07-05  0:59   ` Jason Gunthorpe
2022-07-05  0:59   ` [Intel-gfx] " Jason Gunthorpe
2022-07-05 13:37 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Remove the VFIO_IOMMU_NOTIFY_DMA_UNMAP notifier (rev3) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220720115649.GS4609@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=agordeev@linux.ibm.com \
    --cc=airlied@linux.ie \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=farman@linux.ibm.com \
    --cc=freude@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=jjherne@linux.ibm.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=oberpar@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=svens@linux.ibm.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    --cc=vneethv@linux.ibm.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhi.a.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.