linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region
@ 2021-08-24 17:30 Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation Alyssa Rosenzweig
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-24 17:30 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, linux-kernel

Chris Morgan reported UBSAN errors in panfrost and tracked them down to
the size computation in lock_region. This calculation is overcomplicated
(cargo culted from kbase) and can be simplified with kernel helpers and
some mathematical identities. The first patch in the series rewrites the
calculation in a form avoiding undefined behaviour; Chris confirms it
placates UBSAN.

While researching this function, I noticed a pair of other potential
bugs: Bifrost can lock more than 4GiB at a time, but must lock at least
32KiB at a time. The latter patches in the series handle these cases.

In review of v1 of this series, Steven pointed out a fourth potential
bug: rounding down the iova can truncate the lock region. v2 adds a new
patch for this case.

The size computation was unit-tested in userspace. Relevant code below,
just missing some copypaste definitions for fls64/clamp/etc:

	#define MIN_LOCK (1ULL << 12)
	#define MAX_LOCK (1ULL << 48)

	struct {
		uint64_t size;
		uint8_t encoded;
	} tests[] = {
		/* Clamping */
		{ 0, 11 },
		{ 1, 11 },
		{ 2, 11 },
		{ 4095, 11 },
		/* Power of two */
		{ 4096, 11 },
		/* Round up */
		{ 4097, 12 },
		{ 8192, 12 },
		{ 16384, 13 },
		{ 16385, 14 },
		/* Maximum */
		{ ~0ULL, 47 },
	};

	static uint8_t region_width(uint64_t size)
	{
		size = clamp(size, MIN_LOCK, MAX_LOCK);
		return fls64(size - 1) - 1;
	}

	int main(int argc, char **argv)
	{
		for (unsigned i = 0; i < ARRAY_SIZE(tests); ++i) {
			uint64_t test = tests[i].size;
			uint8_t expected = tests[i].encoded;
			uint8_t actual = region_width(test);

			assert(expected == actual);
		}
	}

Changes in v2:

* New patch for non-aligned lock addresses
* Commit message improvements.
* Add Steven's tags.

Alyssa Rosenzweig (4):
  drm/panfrost: Simplify lock_region calculation
  drm/panfrost: Use u64 for size in lock_region
  drm/panfrost: Clamp lock region to Bifrost minimum
  drm/panfrost: Handle non-aligned lock addresses

 drivers/gpu/drm/panfrost/panfrost_mmu.c  | 32 ++++++++++--------------
 drivers/gpu/drm/panfrost/panfrost_regs.h |  2 ++
 2 files changed, 15 insertions(+), 19 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation
  2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
@ 2021-08-24 17:30 ` Alyssa Rosenzweig
  2021-08-25  9:03   ` Steven Price
  2021-08-24 17:30 ` [PATCH v2 2/4] drm/panfrost: Use u64 for size in lock_region Alyssa Rosenzweig
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-24 17:30 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, linux-kernel, Chris Morgan, stable

In lock_region, simplify the calculation of the region_width parameter.
This field is the size, but encoded as ceil(log2(size)) - 1.
ceil(log2(size)) may be computed directly as fls(size - 1). However, we
want to use the 64-bit versions as the amount to lock can exceed
32-bits.

This avoids undefined (and completely wrong) behaviour when locking all
memory (size ~0). In this case, the old code would "round up" ~0 to the
nearest page, overflowing to 0. Since fls(0) == 0, this would calculate
a region width of 10 + 0 = 10. But then the code would shift by
(region_width - 11) = -1. As shifting by a negative number is undefined,
UBSAN flags the bug. Of course, even if it were defined the behaviour is
wrong, instead of locking all memory almost none would get locked.

The new form of the calculation corrects this special case and avoids
the undefined behaviour.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-and-tested-by: Chris Morgan <macromorgan@hotmail.com>
Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
Cc: <stable@vger.kernel.org>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 0da5b3100ab1..f6e02d0392f4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 {
 	u8 region_width;
 	u64 region = iova & PAGE_MASK;
-	/*
-	 * fls returns:
-	 * 1 .. 32
-	 *
-	 * 10 + fls(num_pages)
-	 * results in the range (11 .. 42)
-	 */
-
-	size = round_up(size, PAGE_SIZE);
 
-	region_width = 10 + fls(size >> PAGE_SHIFT);
-	if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {
-		/* not pow2, so must go up to the next pow2 */
-		region_width += 1;
-	}
+	/* The size is encoded as ceil(log2) minus(1), which may be calculated
+	 * with fls. The size must be clamped to hardware bounds.
+	 */
+	size = max_t(u64, size, PAGE_SIZE);
+	region_width = fls64(size - 1) - 1;
 	region |= region_width;
 
 	/* Lock the region that needs to be updated */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/4] drm/panfrost: Use u64 for size in lock_region
  2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation Alyssa Rosenzweig
@ 2021-08-24 17:30 ` Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 3/4] drm/panfrost: Clamp lock region to Bifrost minimum Alyssa Rosenzweig
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-24 17:30 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, linux-kernel, Chris Morgan, stable

Mali virtual addresses are 48-bit. Use a u64 instead of size_t to ensure
we can express the "lock everything" condition as ~0ULL without
overflow. This code was silently broken on any platform where a size_t
is less than 48-bits; in particular, it was broken on 32-bit armv7
platforms which remain in use with panfrost. (Mainly RK3288)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Rob Herring <robh@kernel.org>
Tested-by: Chris Morgan <macromorgan@hotmail.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
Cc: <stable@vger.kernel.org>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index f6e02d0392f4..3a795273e505 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -58,7 +58,7 @@ static int write_cmd(struct panfrost_device *pfdev, u32 as_nr, u32 cmd)
 }
 
 static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
-			u64 iova, size_t size)
+			u64 iova, u64 size)
 {
 	u8 region_width;
 	u64 region = iova & PAGE_MASK;
@@ -78,7 +78,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 
 
 static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,
-				      u64 iova, size_t size, u32 op)
+				      u64 iova, u64 size, u32 op)
 {
 	if (as_nr < 0)
 		return 0;
@@ -95,7 +95,7 @@ static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,
 
 static int mmu_hw_do_operation(struct panfrost_device *pfdev,
 			       struct panfrost_mmu *mmu,
-			       u64 iova, size_t size, u32 op)
+			       u64 iova, u64 size, u32 op)
 {
 	int ret;
 
@@ -112,7 +112,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
 	u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
 	u64 memattr = cfg->arm_mali_lpae_cfg.memattr;
 
-	mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
+	mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
 
 	mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), transtab & 0xffffffffUL);
 	mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), transtab >> 32);
@@ -128,7 +128,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
 
 static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
 {
-	mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
+	mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
 
 	mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
 	mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
@@ -242,7 +242,7 @@ static size_t get_pgsize(u64 addr, size_t size)
 
 static void panfrost_mmu_flush_range(struct panfrost_device *pfdev,
 				     struct panfrost_mmu *mmu,
-				     u64 iova, size_t size)
+				     u64 iova, u64 size)
 {
 	if (mmu->as < 0)
 		return;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/4] drm/panfrost: Clamp lock region to Bifrost minimum
  2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 2/4] drm/panfrost: Use u64 for size in lock_region Alyssa Rosenzweig
@ 2021-08-24 17:30 ` Alyssa Rosenzweig
  2021-08-24 17:30 ` [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses Alyssa Rosenzweig
  2021-08-24 22:37 ` [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Rob Herring
  4 siblings, 0 replies; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-24 17:30 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, linux-kernel, Chris Morgan, stable

When locking a region, we currently clamp to a PAGE_SIZE as the minimum
lock region. While this is valid for Midgard, it is invalid for Bifrost,
where the minimum locking size is 8x larger than the 4k page size. Add a
hardware definition for the minimum lock region size (corresponding to
KBASE_LOCK_REGION_MIN_SIZE_LOG2 in kbase) and respect it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Chris Morgan <macromorgan@hotmail.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: <stable@vger.kernel.org>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c  | 2 +-
 drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 3a795273e505..dfe5f1d29763 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -66,7 +66,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 	/* The size is encoded as ceil(log2) minus(1), which may be calculated
 	 * with fls. The size must be clamped to hardware bounds.
 	 */
-	size = max_t(u64, size, PAGE_SIZE);
+	size = max_t(u64, size, AS_LOCK_REGION_MIN_SIZE);
 	region_width = fls64(size - 1) - 1;
 	region |= region_width;
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index 1940ff86e49a..6c5a11ef1ee8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -316,6 +316,8 @@
 #define AS_FAULTSTATUS_ACCESS_TYPE_READ		(0x2 << 8)
 #define AS_FAULTSTATUS_ACCESS_TYPE_WRITE	(0x3 << 8)
 
+#define AS_LOCK_REGION_MIN_SIZE                 (1ULL << 15)
+
 #define gpu_write(dev, reg, data) writel(data, dev->iomem + reg)
 #define gpu_read(dev, reg) readl(dev->iomem + reg)
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
  2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
                   ` (2 preceding siblings ...)
  2021-08-24 17:30 ` [PATCH v2 3/4] drm/panfrost: Clamp lock region to Bifrost minimum Alyssa Rosenzweig
@ 2021-08-24 17:30 ` Alyssa Rosenzweig
  2021-08-25  9:04   ` Steven Price
  2021-08-24 22:37 ` [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Rob Herring
  4 siblings, 1 reply; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-24 17:30 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, linux-kernel

When locking memory, the base address is rounded down to the nearest
page. The current code does not adjust the size in this case,
truncating the lock region:

	Input:      [----size----]
	Round: [----size----]

To fix the truncation, extend the lock region by the amount rounded off.

	Input:      [----size----]
	Round: [-------size------]

This bug is difficult to hit under current assumptions: since the size
of the lock region is stored as a ceil(log2), the truncation must cause
us to cross a power-of-two boundary. This is possible, for example if
the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
last 15 bytes.

In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
the bug. Therefore this doesn't need to be backported. Still, that's a
happy accident and not a precondition of lock_region, so we let's do the
right thing to future proof.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index dfe5f1d29763..14be32497ec3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 	u8 region_width;
 	u64 region = iova & PAGE_MASK;
 
+	/* After rounding the address down, extend the size to lock the end. */
+	size += (region - iova);
+
 	/* The size is encoded as ceil(log2) minus(1), which may be calculated
 	 * with fls. The size must be clamped to hardware bounds.
 	 */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region
  2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
                   ` (3 preceding siblings ...)
  2021-08-24 17:30 ` [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses Alyssa Rosenzweig
@ 2021-08-24 22:37 ` Rob Herring
  4 siblings, 0 replies; 11+ messages in thread
From: Rob Herring @ 2021-08-24 22:37 UTC (permalink / raw)
  To: Alyssa Rosenzweig
  Cc: dri-devel, Tomeu Vizoso, Steven Price, David Airlie,
	Daniel Vetter, linux-kernel

On Tue, Aug 24, 2021 at 12:30 PM Alyssa Rosenzweig
<alyssa.rosenzweig@collabora.com> wrote:
>
> Chris Morgan reported UBSAN errors in panfrost and tracked them down to
> the size computation in lock_region. This calculation is overcomplicated
> (cargo culted from kbase) and can be simplified with kernel helpers and
> some mathematical identities. The first patch in the series rewrites the
> calculation in a form avoiding undefined behaviour; Chris confirms it
> placates UBSAN.
>
> While researching this function, I noticed a pair of other potential
> bugs: Bifrost can lock more than 4GiB at a time, but must lock at least
> 32KiB at a time. The latter patches in the series handle these cases.
>
> In review of v1 of this series, Steven pointed out a fourth potential
> bug: rounding down the iova can truncate the lock region. v2 adds a new
> patch for this case.
>
> The size computation was unit-tested in userspace. Relevant code below,
> just missing some copypaste definitions for fls64/clamp/etc:
>
>         #define MIN_LOCK (1ULL << 12)
>         #define MAX_LOCK (1ULL << 48)
>
>         struct {
>                 uint64_t size;
>                 uint8_t encoded;
>         } tests[] = {
>                 /* Clamping */
>                 { 0, 11 },
>                 { 1, 11 },
>                 { 2, 11 },
>                 { 4095, 11 },
>                 /* Power of two */
>                 { 4096, 11 },
>                 /* Round up */
>                 { 4097, 12 },
>                 { 8192, 12 },
>                 { 16384, 13 },
>                 { 16385, 14 },
>                 /* Maximum */
>                 { ~0ULL, 47 },
>         };
>
>         static uint8_t region_width(uint64_t size)
>         {
>                 size = clamp(size, MIN_LOCK, MAX_LOCK);
>                 return fls64(size - 1) - 1;
>         }
>
>         int main(int argc, char **argv)
>         {
>                 for (unsigned i = 0; i < ARRAY_SIZE(tests); ++i) {
>                         uint64_t test = tests[i].size;
>                         uint8_t expected = tests[i].encoded;
>                         uint8_t actual = region_width(test);
>
>                         assert(expected == actual);
>                 }
>         }
>
> Changes in v2:
>
> * New patch for non-aligned lock addresses
> * Commit message improvements.
> * Add Steven's tags.
>
> Alyssa Rosenzweig (4):
>   drm/panfrost: Simplify lock_region calculation
>   drm/panfrost: Use u64 for size in lock_region
>   drm/panfrost: Clamp lock region to Bifrost minimum
>   drm/panfrost: Handle non-aligned lock addresses
>
>  drivers/gpu/drm/panfrost/panfrost_mmu.c  | 32 ++++++++++--------------
>  drivers/gpu/drm/panfrost/panfrost_regs.h |  2 ++
>  2 files changed, 15 insertions(+), 19 deletions(-)

For the series,

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation
  2021-08-24 17:30 ` [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation Alyssa Rosenzweig
@ 2021-08-25  9:03   ` Steven Price
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Price @ 2021-08-25  9:03 UTC (permalink / raw)
  To: Alyssa Rosenzweig, dri-devel
  Cc: Rob Herring, Tomeu Vizoso, David Airlie, Daniel Vetter,
	linux-kernel, Chris Morgan, stable

On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> In lock_region, simplify the calculation of the region_width parameter.
> This field is the size, but encoded as ceil(log2(size)) - 1.
> ceil(log2(size)) may be computed directly as fls(size - 1). However, we
> want to use the 64-bit versions as the amount to lock can exceed
> 32-bits.
> 
> This avoids undefined (and completely wrong) behaviour when locking all
> memory (size ~0). In this case, the old code would "round up" ~0 to the
> nearest page, overflowing to 0. Since fls(0) == 0, this would calculate
> a region width of 10 + 0 = 10. But then the code would shift by
> (region_width - 11) = -1. As shifting by a negative number is undefined,
> UBSAN flags the bug. Of course, even if it were defined the behaviour is
> wrong, instead of locking all memory almost none would get locked.
> 
> The new form of the calculation corrects this special case and avoids
> the undefined behaviour.
> 
> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Reported-and-tested-by: Chris Morgan <macromorgan@hotmail.com>
> Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
> Cc: <stable@vger.kernel.org>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 0da5b3100ab1..f6e02d0392f4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>  {
>  	u8 region_width;
>  	u64 region = iova & PAGE_MASK;
> -	/*
> -	 * fls returns:
> -	 * 1 .. 32
> -	 *
> -	 * 10 + fls(num_pages)
> -	 * results in the range (11 .. 42)
> -	 */
> -
> -	size = round_up(size, PAGE_SIZE);
>  
> -	region_width = 10 + fls(size >> PAGE_SHIFT);
> -	if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {
> -		/* not pow2, so must go up to the next pow2 */
> -		region_width += 1;
> -	}
> +	/* The size is encoded as ceil(log2) minus(1), which may be calculated
> +	 * with fls. The size must be clamped to hardware bounds.
> +	 */
> +	size = max_t(u64, size, PAGE_SIZE);
> +	region_width = fls64(size - 1) - 1;
>  	region |= region_width;
>  
>  	/* Lock the region that needs to be updated */
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
  2021-08-24 17:30 ` [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses Alyssa Rosenzweig
@ 2021-08-25  9:04   ` Steven Price
  2021-08-25 14:07     ` Alyssa Rosenzweig
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Price @ 2021-08-25  9:04 UTC (permalink / raw)
  To: Alyssa Rosenzweig, dri-devel
  Cc: Rob Herring, Tomeu Vizoso, David Airlie, Daniel Vetter, linux-kernel

On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> When locking memory, the base address is rounded down to the nearest
> page. The current code does not adjust the size in this case,
> truncating the lock region:
> 
> 	Input:      [----size----]
> 	Round: [----size----]
> 
> To fix the truncation, extend the lock region by the amount rounded off.
> 
> 	Input:      [----size----]
> 	Round: [-------size------]
> 
> This bug is difficult to hit under current assumptions: since the size
> of the lock region is stored as a ceil(log2), the truncation must cause
> us to cross a power-of-two boundary. This is possible, for example if
> the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
> existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
> region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
> last 15 bytes.
> 
> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> the bug. Therefore this doesn't need to be backported. Still, that's a
> happy accident and not a precondition of lock_region, so we let's do the
> right thing to future proof.

Actually it's worse than that due to the hardware behaviour, the spec
states (for LOCKADDR_BASE):

> Only the upper bits of the address are used. The address is aligned to a
> multiple of the region size, so a variable number of low-order bits are
> ignored, depending on the selected region size. It is recommended that software
> ensures that these low bits in the address are cleared, to avoid confusion.

It appears that indeed this has caused confusion ;)

So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
= 0x30000) the region width gets rounded up (to 0x40000) which causes
the start address to be effectively rounded down (by the hardware) to
0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.

Interestingly (unless my reading of this is wrong) that means to lock
0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
*at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).

This appears to be broken in kbase (which actually does zero out the low
bits of the address) - I've raised a bug internally so hopefully someone
will tell me if I've read the spec completely wrong here.

Steve

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Reported-by: Steven Price <steven.price@arm.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index dfe5f1d29763..14be32497ec3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>  	u8 region_width;
>  	u64 region = iova & PAGE_MASK;
>  
> +	/* After rounding the address down, extend the size to lock the end. */
> +	size += (region - iova);
> +
>  	/* The size is encoded as ceil(log2) minus(1), which may be calculated
>  	 * with fls. The size must be clamped to hardware bounds.
>  	 */
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
  2021-08-25  9:04   ` Steven Price
@ 2021-08-25 14:07     ` Alyssa Rosenzweig
  2021-08-25 14:34       ` Steven Price
  0 siblings, 1 reply; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-25 14:07 UTC (permalink / raw)
  To: Steven Price
  Cc: Alyssa Rosenzweig, dri-devel, Rob Herring, Tomeu Vizoso,
	David Airlie, Daniel Vetter, linux-kernel

> > In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> > the bug. Therefore this doesn't need to be backported. Still, that's a
> > happy accident and not a precondition of lock_region, so we let's do the
> > right thing to future proof.
> 
> Actually it's worse than that due to the hardware behaviour, the spec
> states (for LOCKADDR_BASE):
> 
> > Only the upper bits of the address are used. The address is aligned to a
> > multiple of the region size, so a variable number of low-order bits are
> > ignored, depending on the selected region size. It is recommended that software
> > ensures that these low bits in the address are cleared, to avoid confusion.
> 
> It appears that indeed this has caused confusion ;)
> 
> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
> = 0x30000) the region width gets rounded up (to 0x40000) which causes
> the start address to be effectively rounded down (by the hardware) to
> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
> 
> Interestingly (unless my reading of this is wrong) that means to lock
> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
> 
> This appears to be broken in kbase (which actually does zero out the low
> bits of the address) - I've raised a bug internally so hopefully someone
> will tell me if I've read the spec completely wrong here.

Horrifying, and not what I wanted to read my last day before 2 weeks of
leave. Let's drop this patch, hopefully by the time I'm back, your
friends in GPU can confirm that's a spec bug and not an actual
hardware/driver one...

Can you apply the other 3 patches in the mean time? Thanks :-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
  2021-08-25 14:07     ` Alyssa Rosenzweig
@ 2021-08-25 14:34       ` Steven Price
  2021-08-25 15:05         ` Alyssa Rosenzweig
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Price @ 2021-08-25 14:34 UTC (permalink / raw)
  To: Alyssa Rosenzweig
  Cc: Alyssa Rosenzweig, dri-devel, Rob Herring, Tomeu Vizoso,
	David Airlie, Daniel Vetter, linux-kernel

On 25/08/2021 15:07, Alyssa Rosenzweig wrote:
>>> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
>>> the bug. Therefore this doesn't need to be backported. Still, that's a
>>> happy accident and not a precondition of lock_region, so we let's do the
>>> right thing to future proof.
>>
>> Actually it's worse than that due to the hardware behaviour, the spec
>> states (for LOCKADDR_BASE):
>>
>>> Only the upper bits of the address are used. The address is aligned to a
>>> multiple of the region size, so a variable number of low-order bits are
>>> ignored, depending on the selected region size. It is recommended that software
>>> ensures that these low bits in the address are cleared, to avoid confusion.
>>
>> It appears that indeed this has caused confusion ;)
>>
>> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
>> = 0x30000) the region width gets rounded up (to 0x40000) which causes
>> the start address to be effectively rounded down (by the hardware) to
>> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
>>
>> Interestingly (unless my reading of this is wrong) that means to lock
>> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
>> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
>>
>> This appears to be broken in kbase (which actually does zero out the low
>> bits of the address) - I've raised a bug internally so hopefully someone
>> will tell me if I've read the spec completely wrong here.
> 
> Horrifying, and not what I wanted to read my last day before 2 weeks of
> leave. Let's drop this patch, hopefully by the time I'm back, your
> friends in GPU can confirm that's a spec bug and not an actual
> hardware/driver one...
> 
> Can you apply the other 3 patches in the mean time? Thanks :-)
> 

Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
v5.15).

It's interesting that if my (new) reading of the spec is correct then
kbase has been horribly broken in this respect forever. So clearly it
can't be something that crops up very often. It would have been good if
the spec could have included wording such as "naturally aligned" if
that's what was intended.

Enjoy your holiday!

Thanks,
Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
  2021-08-25 14:34       ` Steven Price
@ 2021-08-25 15:05         ` Alyssa Rosenzweig
  0 siblings, 0 replies; 11+ messages in thread
From: Alyssa Rosenzweig @ 2021-08-25 15:05 UTC (permalink / raw)
  To: Steven Price
  Cc: Alyssa Rosenzweig, dri-devel, Rob Herring, Tomeu Vizoso,
	David Airlie, Daniel Vetter, linux-kernel

> > Horrifying, and not what I wanted to read my last day before 2 weeks of
> > leave. Let's drop this patch, hopefully by the time I'm back, your
> > friends in GPU can confirm that's a spec bug and not an actual
> > hardware/driver one...
> > 
> > Can you apply the other 3 patches in the mean time? Thanks :-)
> > 
> 
> Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
> v5.15).
> 
> It's interesting that if my (new) reading of the spec is correct then
> kbase has been horribly broken in this respect forever. So clearly it
> can't be something that crops up very often. It would have been good if
> the spec could have included wording such as "naturally aligned" if
> that's what was intended.

Indeed. Fingers crossed this is a mix-up. Although the text you quoted
seems pretty clear unfortunately :|

> Enjoy your holiday!

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-08-25 15:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-24 17:30 [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Alyssa Rosenzweig
2021-08-24 17:30 ` [PATCH v2 1/4] drm/panfrost: Simplify lock_region calculation Alyssa Rosenzweig
2021-08-25  9:03   ` Steven Price
2021-08-24 17:30 ` [PATCH v2 2/4] drm/panfrost: Use u64 for size in lock_region Alyssa Rosenzweig
2021-08-24 17:30 ` [PATCH v2 3/4] drm/panfrost: Clamp lock region to Bifrost minimum Alyssa Rosenzweig
2021-08-24 17:30 ` [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses Alyssa Rosenzweig
2021-08-25  9:04   ` Steven Price
2021-08-25 14:07     ` Alyssa Rosenzweig
2021-08-25 14:34       ` Steven Price
2021-08-25 15:05         ` Alyssa Rosenzweig
2021-08-24 22:37 ` [PATCH v2 0/4] drm/panfrost: Bug fixes for lock_region Rob Herring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).