All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] nvme: Always invalidate whole cqes[] array
@ 2021-02-08 13:31 Andre Przywara
  2021-02-09 11:27 ` Bin Meng
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Andre Przywara @ 2021-02-08 13:31 UTC (permalink / raw)
  To: u-boot

At the moment nvme_read_completion_status() tries to invidate a single
member of the cqes[] array, which is shady as just a single entry is
not cache line aligned.
The structure is dictated by hardware, and with 16 bytes is smaller than
any cache line we usually deal with. Also multiple entries need to be
consecutive in memory, so we can't pad them to cover a whole cache line.

As a consequence we can only always invalidate all of them - U-Boot just
uses two of them anyway. This is fine, as they are only ever read by the
CPU (apart from the initial zeroing), so they can't become dirty.

Make this obvious by always invalidating the whole array, regardless of
the entry number we are about to read.
Also blow up the allocation size to cover whole cache lines, to avoid
other heap allocations to sneak in.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Hi,

this is just compile tested, and should fix the only questionable
cache invalidate call in this driver.
Please verify if this fixes any issues!

Cheers,
Andre

 drivers/nvme/nvme.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
index 5d6331ad346..c9efeff4bc9 100644
--- a/drivers/nvme/nvme.c
+++ b/drivers/nvme/nvme.c
@@ -22,6 +22,8 @@
 #define NVME_AQ_DEPTH		2
 #define NVME_SQ_SIZE(depth)	(depth * sizeof(struct nvme_command))
 #define NVME_CQ_SIZE(depth)	(depth * sizeof(struct nvme_completion))
+#define NVME_CQ_ALLOCATION	ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \
+				      ARCH_DMA_MINALIGN)
 #define ADMIN_TIMEOUT		60
 #define IO_TIMEOUT		30
 #define MAX_PRP_POOL		512
@@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void)
 
 static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index)
 {
-	u64 start = (ulong)&nvmeq->cqes[index];
-	u64 stop = start + sizeof(struct nvme_completion);
+	/*
+	 * Single CQ entries are always smaller than a cache line, so we
+	 * can't invalidate them individually. However CQ entries are
+	 * read only by the CPU, so it's safe to always invalidate all of them,
+	 * as the cache line should never become dirty.
+	 */
+	ulong start = (ulong)&nvmeq->cqes[0];
+	ulong stop = start + NVME_CQ_ALLOCATION;
 
 	invalidate_dcache_range(start, stop);
 
@@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev,
 		return NULL;
 	memset(nvmeq, 0, sizeof(*nvmeq));
 
-	nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth));
+	nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION);
 	if (!nvmeq->cqes)
 		goto free_nvmeq;
 	memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth));
@@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 	nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
 	memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth));
 	flush_dcache_range((ulong)nvmeq->cqes,
-			   (ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth));
+			   (ulong)nvmeq->cqes + NVME_CQ_ALLOCATION);
 	dev->online_queues++;
 }
 
-- 
2.17.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-08 13:31 [RFC PATCH] nvme: Always invalidate whole cqes[] array Andre Przywara
@ 2021-02-09 11:27 ` Bin Meng
  2021-02-09 11:32   ` Michael Nazzareno Trimarchi
  2021-02-26 14:41 ` Jagan Teki
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Bin Meng @ 2021-02-09 11:27 UTC (permalink / raw)
  To: u-boot

On Mon, Feb 8, 2021 at 9:32 PM Andre Przywara <andre.przywara@arm.com> wrote:
>
> At the moment nvme_read_completion_status() tries to invidate a single

typo: invalidate

> member of the cqes[] array, which is shady as just a single entry is
> not cache line aligned.
> The structure is dictated by hardware, and with 16 bytes is smaller than
> any cache line we usually deal with. Also multiple entries need to be
> consecutive in memory, so we can't pad them to cover a whole cache line.
>
> As a consequence we can only always invalidate all of them - U-Boot just
> uses two of them anyway. This is fine, as they are only ever read by the
> CPU (apart from the initial zeroing), so they can't become dirty.
>
> Make this obvious by always invalidating the whole array, regardless of
> the entry number we are about to read.
> Also blow up the allocation size to cover whole cache lines, to avoid
> other heap allocations to sneak in.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Hi,
>
> this is just compile tested, and should fix the only questionable
> cache invalidate call in this driver.
> Please verify if this fixes any issues!
>
> Cheers,
> Andre
>
>  drivers/nvme/nvme.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-09 11:27 ` Bin Meng
@ 2021-02-09 11:32   ` Michael Nazzareno Trimarchi
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Nazzareno Trimarchi @ 2021-02-09 11:32 UTC (permalink / raw)
  To: u-boot

Hi Jagan

On Tue, Feb 9, 2021 at 12:28 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Mon, Feb 8, 2021 at 9:32 PM Andre Przywara <andre.przywara@arm.com> wrote:
> >
> > At the moment nvme_read_completion_status() tries to invidate a single
>
> typo: invalidate
>
> > member of the cqes[] array, which is shady as just a single entry is
> > not cache line aligned.
> > The structure is dictated by hardware, and with 16 bytes is smaller than
> > any cache line we usually deal with. Also multiple entries need to be
> > consecutive in memory, so we can't pad them to cover a whole cache line.
> >
> > As a consequence we can only always invalidate all of them - U-Boot just
> > uses two of them anyway. This is fine, as they are only ever read by the
> > CPU (apart from the initial zeroing), so they can't become dirty.
> >
> > Make this obvious by always invalidating the whole array, regardless of
> > the entry number we are about to read.
> > Also blow up the allocation size to cover whole cache lines, to avoid
> > other heap allocations to sneak in.
> >
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> > Hi,
> >
> > this is just compile tested, and should fix the only questionable
> > cache invalidate call in this driver.
> > Please verify if this fixes any issues!
> >
> > Cheers,
> > Andre
> >
> >  drivers/nvme/nvme.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
>
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

Please test again. I don't have the hardware
Reviewed-by: Michael Trimarchi <michael@amarulasolutions.com>




-- 
Michael Nazzareno Trimarchi
Amarula Solutions BV
COO Co-Founder
Cruquiuskade 47 Amsterdam 1018 AM NL
T. +31(0)851119172
M. +39(0)3479132170
[`as] https://www.amarulasolutions.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-08 13:31 [RFC PATCH] nvme: Always invalidate whole cqes[] array Andre Przywara
  2021-02-09 11:27 ` Bin Meng
@ 2021-02-26 14:41 ` Jagan Teki
  2021-02-26 16:26   ` Suniel Mahesh
  2021-02-26 16:12 ` Neil Armstrong
  2021-03-19 20:41 ` Tom Rini
  3 siblings, 1 reply; 7+ messages in thread
From: Jagan Teki @ 2021-02-26 14:41 UTC (permalink / raw)
  To: u-boot

Hi Suniel,

On Mon, Feb 8, 2021 at 7:02 PM Andre Przywara <andre.przywara@arm.com> wrote:
>
> At the moment nvme_read_completion_status() tries to invidate a single
> member of the cqes[] array, which is shady as just a single entry is
> not cache line aligned.
> The structure is dictated by hardware, and with 16 bytes is smaller than
> any cache line we usually deal with. Also multiple entries need to be
> consecutive in memory, so we can't pad them to cover a whole cache line.
>
> As a consequence we can only always invalidate all of them - U-Boot just
> uses two of them anyway. This is fine, as they are only ever read by the
> CPU (apart from the initial zeroing), so they can't become dirty.
>
> Make this obvious by always invalidating the whole array, regardless of
> the entry number we are about to read.
> Also blow up the allocation size to cover whole cache lines, to avoid
> other heap allocations to sneak in.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Hi,
>
> this is just compile tested, and should fix the only questionable
> cache invalidate call in this driver.
> Please verify if this fixes any issues!

Can you test this on RK3399?

Jagan.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-08 13:31 [RFC PATCH] nvme: Always invalidate whole cqes[] array Andre Przywara
  2021-02-09 11:27 ` Bin Meng
  2021-02-26 14:41 ` Jagan Teki
@ 2021-02-26 16:12 ` Neil Armstrong
  2021-03-19 20:41 ` Tom Rini
  3 siblings, 0 replies; 7+ messages in thread
From: Neil Armstrong @ 2021-02-26 16:12 UTC (permalink / raw)
  To: u-boot

On 08/02/2021 14:31, Andre Przywara wrote:
> At the moment nvme_read_completion_status() tries to invidate a single
> member of the cqes[] array, which is shady as just a single entry is
> not cache line aligned.
> The structure is dictated by hardware, and with 16 bytes is smaller than
> any cache line we usually deal with. Also multiple entries need to be
> consecutive in memory, so we can't pad them to cover a whole cache line.
> 
> As a consequence we can only always invalidate all of them - U-Boot just
> uses two of them anyway. This is fine, as they are only ever read by the
> CPU (apart from the initial zeroing), so they can't become dirty.
> 
> Make this obvious by always invalidating the whole array, regardless of
> the entry number we are about to read.
> Also blow up the allocation size to cover whole cache lines, to avoid
> other heap allocations to sneak in.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Hi,
> 
> this is just compile tested, and should fix the only questionable
> cache invalidate call in this driver.
> Please verify if this fixes any issues!
> 
> Cheers,
> Andre
> 
>  drivers/nvme/nvme.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
> index 5d6331ad346..c9efeff4bc9 100644
> --- a/drivers/nvme/nvme.c
> +++ b/drivers/nvme/nvme.c
> @@ -22,6 +22,8 @@
>  #define NVME_AQ_DEPTH		2
>  #define NVME_SQ_SIZE(depth)	(depth * sizeof(struct nvme_command))
>  #define NVME_CQ_SIZE(depth)	(depth * sizeof(struct nvme_completion))
> +#define NVME_CQ_ALLOCATION	ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \
> +				      ARCH_DMA_MINALIGN)
>  #define ADMIN_TIMEOUT		60
>  #define IO_TIMEOUT		30
>  #define MAX_PRP_POOL		512
> @@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void)
>  
>  static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index)
>  {
> -	u64 start = (ulong)&nvmeq->cqes[index];
> -	u64 stop = start + sizeof(struct nvme_completion);
> +	/*
> +	 * Single CQ entries are always smaller than a cache line, so we
> +	 * can't invalidate them individually. However CQ entries are
> +	 * read only by the CPU, so it's safe to always invalidate all of them,
> +	 * as the cache line should never become dirty.
> +	 */
> +	ulong start = (ulong)&nvmeq->cqes[0];
> +	ulong stop = start + NVME_CQ_ALLOCATION;
>  
>  	invalidate_dcache_range(start, stop);
>  
> @@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev,
>  		return NULL;
>  	memset(nvmeq, 0, sizeof(*nvmeq));
>  
> -	nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth));
> +	nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION);
>  	if (!nvmeq->cqes)
>  		goto free_nvmeq;
>  	memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth));
> @@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
>  	nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
>  	memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth));
>  	flush_dcache_range((ulong)nvmeq->cqes,
> -			   (ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth));
> +			   (ulong)nvmeq->cqes + NVME_CQ_ALLOCATION);
>  	dev->online_queues++;
>  }
>  
> 

On Amlogic A311D, fixes timeout on nvme scan:

Tested-by: Neil Armstrong <narmstrong@baylibre.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-26 14:41 ` Jagan Teki
@ 2021-02-26 16:26   ` Suniel Mahesh
  0 siblings, 0 replies; 7+ messages in thread
From: Suniel Mahesh @ 2021-02-26 16:26 UTC (permalink / raw)
  To: u-boot

On Fri, Feb 26, 2021 at 8:12 PM Jagan Teki <jagan@amarulasolutions.com> wrote:
>
> Hi Suniel,
>
> On Mon, Feb 8, 2021 at 7:02 PM Andre Przywara <andre.przywara@arm.com> wrote:
> >
> > At the moment nvme_read_completion_status() tries to invidate a single
> > member of the cqes[] array, which is shady as just a single entry is
> > not cache line aligned.
> > The structure is dictated by hardware, and with 16 bytes is smaller than
> > any cache line we usually deal with. Also multiple entries need to be
> > consecutive in memory, so we can't pad them to cover a whole cache line.
> >
> > As a consequence we can only always invalidate all of them - U-Boot just
> > uses two of them anyway. This is fine, as they are only ever read by the
> > CPU (apart from the initial zeroing), so they can't become dirty.
> >
> > Make this obvious by always invalidating the whole array, regardless of
> > the entry number we are about to read.
> > Also blow up the allocation size to cover whole cache lines, to avoid
> > other heap allocations to sneak in.
> >
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> > Hi,
> >
> > this is just compile tested, and should fix the only questionable
> > cache invalidate call in this driver.
> > Please verify if this fixes any issues!
>
> Can you test this on RK3399?

Hi Jagan/All

Here are the test results after applying this patch on mainline. The
patch has been tested on
roc-rk3399-pc an RK3399 based SBC. It looks fine, below is the log:

U-Boot TPL 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32)
Trying to boot from BOOTROM
Returning to boot ROM...

U-Boot SPL 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32 +0530)
Loading Environment from SPIFlash... *** Warning -
spi_flash_probe_bus_cs() failed, using default environment

Trying to boot from MMC1

U-Boot 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32 +0530)

SoC: Rockchip rk3399
Reset cause: POR
Model: Firefly ROC-RK3399-PC Mezzanine Board
DRAM:  3.9 GiB
PMIC:  RK808
MMC:   mmc at fe310000: 2, mmc at fe320000: 1, sdhci at fe330000: 0
Loading Environment from SPIFlash... jedec_spi_nor flash at 0:
unrecognized JEDEC id bytes: ff, ff, ff
*** Warning - spi_flash_probe_bus_cs() failed, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Firefly ROC-RK3399-PC Mezzanine Board
Net:
Error: ethernet at fe300000 address not set.
No ethernet found.

Hit any key to stop autoboot:  0
=>
=>
=> pci
Scanning PCI devices on bus 0
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
00.00.00   0x1d87     0x0100     Bridge device           0x04

=> nvme scan

=> nvme info
Device 0: Vendor: 0x15b7 Rev: 21110001 Prod: 2031C2440521
            Type: Hard Disk
            Capacity: 244198.3 MB = 238.4 GB (500118192 x 512)
=> nvme device

IDE device 0: Vendor: 0x15b7 Rev: 21110001 Prod: 2031C2440521
            Type: Hard Disk
            Capacity: 244198.3 MB = 238.4 GB (500118192 x 512)


Tested-by: Suniel Mahesh <sunil@amarulasolutions.com>

>
> Jagan.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH] nvme: Always invalidate whole cqes[] array
  2021-02-08 13:31 [RFC PATCH] nvme: Always invalidate whole cqes[] array Andre Przywara
                   ` (2 preceding siblings ...)
  2021-02-26 16:12 ` Neil Armstrong
@ 2021-03-19 20:41 ` Tom Rini
  3 siblings, 0 replies; 7+ messages in thread
From: Tom Rini @ 2021-03-19 20:41 UTC (permalink / raw)
  To: u-boot

On Mon, Feb 08, 2021 at 01:31:54PM +0000, Andre Przywara wrote:

> At the moment nvme_read_completion_status() tries to invidate a single
> member of the cqes[] array, which is shady as just a single entry is
> not cache line aligned.
> The structure is dictated by hardware, and with 16 bytes is smaller than
> any cache line we usually deal with. Also multiple entries need to be
> consecutive in memory, so we can't pad them to cover a whole cache line.
> 
> As a consequence we can only always invalidate all of them - U-Boot just
> uses two of them anyway. This is fine, as they are only ever read by the
> CPU (apart from the initial zeroing), so they can't become dirty.
> 
> Make this obvious by always invalidating the whole array, regardless of
> the entry number we are about to read.
> Also blow up the allocation size to cover whole cache lines, to avoid
> other heap allocations to sneak in.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
> Reviewed-by: Michael Trimarchi <michael@amarulasolutions.com>
> Tested-by: Neil Armstrong <narmstrong@baylibre.com>

Applied to u-boot/master, thanks!

-- 
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20210319/59e77d30/attachment.sig>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-03-19 20:41 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-08 13:31 [RFC PATCH] nvme: Always invalidate whole cqes[] array Andre Przywara
2021-02-09 11:27 ` Bin Meng
2021-02-09 11:32   ` Michael Nazzareno Trimarchi
2021-02-26 14:41 ` Jagan Teki
2021-02-26 16:26   ` Suniel Mahesh
2021-02-26 16:12 ` Neil Armstrong
2021-03-19 20:41 ` Tom Rini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.