From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: "iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	David Woodhouse <dwmw2@infradead.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Rafael Wysocki <rafael.j.wysocki@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	Raj Ashok <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>
Subject: Re: [PATCH v4 14/22] iommu: handle page response timeout
Date: Mon, 30 Apr 2018 11:58:10 +0100
Message-ID: <e98a1385-9e55-c021-4e89-7d07701f4b84@arm.com>
In-Reply-To: <20180425083711.222202e7@jacob-builder>

On 25/04/18 16:37, Jacob Pan wrote:
>> In the other cases (unsupported PRI or rogue guest) then disabling PRI
>> using a FAILURE status might be the right thing to do. However,
>> assuming the device follows the PCI spec it will stop sending page
>> requests once there are as many PPRs in flight as the allocated
>> credit.
>>
> Agreed, here I am not taking any actions. There may be need to drain
> in-fly requests.

Right, as long as we first ensure that no new fault is generated (by
using a Response Failure). Though in my opinion not taking action might
be the safest option :)

Another thought: currently the comment in iommu.h says
"@IOMMU_FAULT_STATUS_FAILURE: General error. Drop all subsequent faults
from this device if possible. This is "Response Failure" in PCI PRI."

I wonder if we should simply say "Drop all subsequent faults from the
device". Even if the PCI device doesn't properly implement PRI, the
IOMMU driver should set a "PRI disabled" bit in the device data that
prevents it from reporting new faults and flooding the queue.
Anyway, it's a small detail that could go in a future patch series.
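
To illustrate the "PRI disabled" bit idea, here is a minimal user-space
sketch, not kernel code. Every name in it (the sketch_* types and
functions) is hypothetical and does not exist in iommu.h; the only
behaviour taken from the discussion is that a FAILURE response makes the
driver drop all subsequent faults from that device, whether or not the
device itself stops sending them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical response codes; only the FAILURE semantics come from the thread. */
enum sketch_resp_status {
	SKETCH_RESP_SUCCESS,
	SKETCH_RESP_FAILURE,	/* models "Response Failure" in PCI PRI */
};

/* Hypothetical per-device fault state kept by the IOMMU driver. */
struct sketch_dev_fault_state {
	bool pri_disabled;	/* the "PRI disabled" bit */
	unsigned int reported;	/* faults forwarded to the fault queue */
	unsigned int dropped;	/* faults discarded after a FAILURE response */
};

/* Returns true if the fault was forwarded, false if it was dropped. */
bool sketch_report_fault(struct sketch_dev_fault_state *dev)
{
	assert(dev != NULL);
	if (dev->pri_disabled) {
		/* The device ignored the failure; keep it out of the queue. */
		dev->dropped++;
		return false;
	}
	dev->reported++;
	return true;
}

/* Records a page response; a FAILURE status latches the disabled bit. */
void sketch_page_response(struct sketch_dev_fault_state *dev,
			  enum sketch_resp_status status)
{
	assert(dev != NULL);
	if (status == SKETCH_RESP_FAILURE)
		dev->pri_disabled = true;
}
```

With this shape the bit is never cleared by the response path, so
re-enabling PRI would have to be an explicit, separate operation.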

>> If there isn't any possibility of memory leak or abusing resources, I
>> don't think it's our problem that the guest is excessively slow at
>> handling page requests. Setting an upper bound to page request latency
>> might do more harm than good. Ensuring that devices respect the number
>> of allocated in-flight PPRs is more important in my opinion.
>>
> How about we have a really long timeout, e.g. 1 min similar to device
> invalidate response timeout in ATS spec., just for basic safety and
> diagnosis. Optionally, we could have quota in parallel.

I agree that for development a timeout is useful. It might be worth
adding it as an option to the IOMMU module instead of a define. Perhaps
a number of seconds, 10 being the default and 0 disabling the timeout?
Otherwise we would probably end up with a succession of patches
incrementing the timeout by arbitrary values, if people find it
inconvenient.
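
As a rough sketch of the semantics suggested above (a value in seconds,
10 by default, 0 meaning "no timeout"), the check could look like the
following. This is plain user-space C with hypothetical names
(sketch_prq_timeout_secs, sketch_response_timed_out); in a real driver
the variable would presumably be exposed through module_param() and the
timestamps would come from the kernel's own timekeeping, neither of which
is shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default taken from the suggestion in the thread: 10 seconds. */
#define SKETCH_PRQ_TIMEOUT_DEFAULT_SECS 10u

/* Hypothetical tunable; 0 disables the timeout entirely. */
unsigned int sketch_prq_timeout_secs = SKETCH_PRQ_TIMEOUT_DEFAULT_SECS;

/*
 * Returns true if a page request reported at reported_at_secs has been
 * waiting for a response for at least timeout_secs. A timeout_secs of 0
 * means the request never times out.
 */
bool sketch_response_timed_out(unsigned int timeout_secs,
			       uint64_t reported_at_secs,
			       uint64_t now_secs)
{
	assert(now_secs >= reported_at_secs);
	if (timeout_secs == 0)
		return false;
	return now_secs - reported_at_secs >= timeout_secs;
}
```

Keeping the "0 disables" case inside the helper means callers do not need
a separate enabled/disabled flag.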

Thanks,
Jean
