From: Zhou Wang <>
To: Jean-Philippe Brucker <>
Subject: Re: [PATCH v12 10/10] iommu/arm-smmu-v3: Add stall support for platform devices
Date: Sat, 27 Feb 2021 11:40:01 +0800
Message-ID: <>
In-Reply-To: <YDkh8qR7csPB68sC@myrica>

On 2021/2/27 0:29, Jean-Philippe Brucker wrote:
> Hi Zhou,
> On Fri, Feb 26, 2021 at 05:43:27PM +0800, Zhou Wang wrote:
>> On 2021/2/1 19:14, Jean-Philippe Brucker wrote:
>>> Hi Zhou,
>>> On Mon, Feb 01, 2021 at 09:18:42AM +0800, Zhou Wang wrote:
>>>>> @@ -1033,8 +1076,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
>>>>>  			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
>>>>>  			CTXDESC_CD_0_V;
>>>>> -		/* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */
>>>>> -		if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
>>>>> +		if (smmu_domain->stall_enabled)
>>>> Could we add ssid checking here? like: if (smmu_domain->stall_enabled && ssid).
>>>> The reason is if not CD.S will also be set when ssid is 0, which is not needed.
>>> Some drivers may want to get stall events on SSID 0:
>>> Are you seeing an issue with stall events on ssid 0?  Normally there
>>> shouldn't be any fault on this context, but if they happen and no handler
>>> is registered, the SMMU driver will just abort them and report them like a
>>> non-stall event.
>> Hi Jean,
>> I notice that there is problem. In my case, I expect that CD0 is for kernel
>> and other CDs are for user space. Normally there shouldn't be any fault in
>> kernel, however, we have RAS case which is for some reason there may has
>> invalid address access from hardware device.
>> So at least there are two different address access failures: 1. hardware RAS problem;
>> 2. software fault fail(e.g. kill process when doing DMA). Handlings for these
>> two are different: for 1, we should reset hardware device; for 2, stop related
>> DMA is enough.
> Right, and in case 2 there should be no report printed since it can be
> triggered by user, while you probably want to be loud in case 1.
>> Currently if SMMU returns the same signal(by SMMU resume abort), master device
>> driver can not tell these two kinds of cases.
> This part I don't understand. So the SMMU sends a RESUME(abort) command,
> and then the master reports the DMA error to the device driver, which
> cannot differentiate 1 from 2?  (I guess there is no SSID in this report?)
> But how does disabling stall change this?  The invalid DMA access will
> still be aborted by the SMMU.

This is about the hardware design. On the D06 board, an invalid DMA access from
an accelerator device is aborted, and a hardware error signal is returned to the
device, which reports it as a RAS error irq. In the stall case, the error signal
triggered by SMMU resume abort is reported as the same RAS error irq. This is
the problem on the D06 board.

In the next generation of hardware, a new irq will be added to report SMMU
resume abort information. It works with related registers in the accelerator
device to identify the hardware queue that needs to be stopped.

So if CD0.S is 1, an invalid DMA access in the kernel will be reported through
the new irq described above, which does not carry enough information to tell
RAS errors (there are 10+ hardware RAS errors) from SMMU resume abort.

> Hypothetically, would it work if all stall events that could not be
> handled went to the device driver?  Those reports would contain the SSID
> (or lack thereof), so you could reset the device in case 1 and ignore case
> 2. Though resetting the device in the middle of a stalled transaction

As above, it is currently hard to tell RAS errors from SMMU resume abort :(

> probably comes with its own set of problems.
>> From the basic concept, if a CD is used for kernel, its S bit should not be set.
>> How about we add iommu domain check here too, if DMA domain we do not set S bit for
>> CD0, if unmanaged domain we set S bit for all CDs?
> I think disabling stall for CD0 of a DMA domain makes sense in general,
> even though I don't really understand how that fixes your issue. But

As above, if stall is disabled for CD0, an invalid DMA access will be handled
by the RAS error irq.

> someone might come up with a good use-case for receiving stall events on

If a DMA access in the kernel fails, I think there should be a RAS issue :)
So it is better to disable CD0 stall for DMA domains.
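
To make the proposal concrete, here is a rough standalone sketch of the check
(not a patch against the driver). It only models the decision "set CD.S unless
this is CD0 of a DMA domain". All names below (sketch_domain,
sketch_cd_wants_stall, SKETCH_CD_0_S and so on) are made up for illustration,
and the bit position is a placeholder, not taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types, for illustration only. */
enum sketch_domain_type {
	SKETCH_DOMAIN_DMA,       /* kernel DMA API domain, CD0 used by kernel */
	SKETCH_DOMAIN_UNMANAGED, /* domain managed by an IOMMU API user */
};

struct sketch_domain {
	enum sketch_domain_type type;
	bool stall_enabled;
};

/* Placeholder position for the CD.S bit (assumption, see lead-in). */
#define SKETCH_CD_0_S (UINT64_C(1) << 44)

/*
 * Decide whether the S bit should be set in the context descriptor for
 * a given SSID: never when stall is disabled for the domain, and not
 * for SSID 0 of a DMA domain, so that a faulting kernel DMA access is
 * aborted directly instead of being stalled first.
 */
bool sketch_cd_wants_stall(const struct sketch_domain *d, unsigned int ssid)
{
	if (!d->stall_enabled)
		return false;
	if (d->type == SKETCH_DOMAIN_DMA && ssid == 0)
		return false;
	return true;
}

/* Apply the decision to an already built first CD word. */
uint64_t sketch_cd_word0(const struct sketch_domain *d, unsigned int ssid,
			 uint64_t base)
{
	return sketch_cd_wants_stall(d, ssid) ? (base | SKETCH_CD_0_S) : base;
}
```

With this, an unmanaged domain still gets the S bit on all CDs, including
SSID 0, while a DMA domain only gets it on non-zero SSIDs.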


> DMA mappings, so I'm wondering whether the alternative solution where we
> report unhandled stall events to the device driver would also work for
> you.
> Thanks,
> Jean
> .
