From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: will@kernel.org, joro@8bytes.org
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	iommu@lists.linux-foundation.org, robin.murphy@arm.com,
	linux-arm-kernel@lists.infradead.org,
	Aaro Koskinen <aaro.koskinen@nokia.com>
Subject: [PATCH] iommu/arm-smmu-v3: Ratelimit event dump
Date: Mon, 31 May 2021 11:56:50 +0200
Message-ID: <20210531095648.118282-1-jean-philippe@linaro.org>

When a device or driver misbehaves, it is possible to receive DMA fault
events much faster than we can print them out, which locks up the system
and makes it impossible to stop the source of the problem. Ratelimit the
printing of events to help recovery.

Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---

Aiming for v5.14 rather than v5.13, since it mainly fixes a nuisance
during development/debug. Conflicts with "iommu/arm-smmu-v3: Add stall
support for platform devices" currently on the list [1], because they
both change arm_smmu_evtq_thread(). This patch is based on [1].

I encountered this while developing SVA on hardware, although the
problem is not specific to SVA or stall. The device driver didn't
properly stop DMA, and the SMMU would flood the event queue with
translation faults. Without rate limiting I was unable to even reset the
device. Note that this is not a problem for normal SVA operations, since
userspace cannot cause DMA to print kernel messages.

Aaro Koskinen reported a similar problem [2].

[1] https://lore.kernel.org/linux-iommu/20210526161927.24268-4-jean-philippe@linaro.org/
[2] https://lore.kernel.org/linux-iommu/20210528080958.GA60351@darkstar.musicnaut.iki.fi/
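
For illustration only, and not part of the patch: the interval/burst
behaviour that the rate limiting above relies on can be sketched in
userspace as follows. This is a simplified model, not the kernel's
__ratelimit() implementation. The names ratelimit_model,
ratelimit_model_ok() and flood() are hypothetical, and time is passed in
as an abstract tick count instead of being read from jiffies.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit_model {
	long interval;	/* window length, in arbitrary ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages allowed so far in this window */
	int missed;	/* messages suppressed so far in this window */
};

/* Return true when the caller may print, false when it should stay quiet. */
static bool ratelimit_model_ok(struct ratelimit_model *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* Window expired: open a new one and reset both counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Feed nr_events events at tick 'now'; return how many would be printed. */
static int flood(struct ratelimit_model *rs, long now, int nr_events)
{
	int printed = 0;

	for (int i = 0; i < nr_events; i++)
		if (ratelimit_model_ok(rs, now))
			printed++;

	return printed;
}
```

With a limiter initialised as { .interval = 5, .burst = 10 } (values
assumed here for illustration), flood(&rs, 0, 1000) returns 10: the
remaining 990 events are still consumed from the queue, but cost no
console output. A further flood before tick 5 prints nothing, and the
budget of 10 is available again once the window expires.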
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 39bdb4264248..2792382ad3bd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1518,6 +1518,8 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	struct arm_smmu_device *smmu = dev;
 	struct arm_smmu_queue *q = &smmu->evtq.q;
 	struct arm_smmu_ll_queue *llq = &q->llq;
+	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
 	u64 evt[EVTQ_ENT_DWORDS];
 
 	do {
@@ -1525,7 +1527,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
 
 			ret = arm_smmu_handle_evt(smmu, evt);
-			if (!ret)
+			if (!ret || !__ratelimit(&rs))
 				continue;
 
 			dev_info(smmu->dev, "event 0x%02x received:\n", id);
-- 
2.31.1
