linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req()
@ 2022-10-02  9:59 Christophe JAILLET
  2022-10-02 17:27 ` James Smart
  2022-11-01  9:45 ` Christoph Hellwig
  0 siblings, 2 replies; 3+ messages in thread
From: Christophe JAILLET @ 2022-10-02  9:59 UTC (permalink / raw)
  To: James Smart, Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: linux-kernel, kernel-janitors, Christophe JAILLET, linux-nvme

sizeof( struct nvmefc_ls_rcv_op ) = 64
sizeof( union nvmefc_ls_requests ) = 1024
sizeof( union nvmefc_ls_responses ) = 128

So, in nvme_fc_rcv_ls_req(), 1216 bytes of memory are requested when
kzalloc() is called.

Because of the way memory allocations are performed, 2048 bytes are
allocated. So about 800 bytes are wasted for each request.

Switch to 3 distinct memory allocations, in order to:
   - save these 800 bytes
   - avoid zeroing this extra memory
   - make sure that memory is properly aligned in case of DMA access
    ("fc_dma_map_single(lsop->rspbuf)" just a few lines below)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
This patch is only a RFC to see if this kind of approach makes sense or
not.
I've not checked all paths, so it is likely that it is incomplete.

Anyway, it is just a trade-of between memory footprint and CPU usage (3
kzalloc() instead of 1)

I don't know if it is a slow path or not, nor if the "rport->ls_rcv_list"
list can get big (each item overuses these 800 bytes of memory)

3 kzalloc is more than just 1 (sic!), but with this patch, 800 bytes are
not zeroed anymore. Moreover, maybe the zeroing of rqstbuf and/or rspbuf
can be saved as well.
So, it could balance the impact of the 3 kzalloc().

So, if it looks promising, s.o. with the corresponding hardware should
make some measurements on memory and CPU usage.
---
 drivers/nvme/host/fc.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 5d57a042dbca..2d3c54838496 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1475,6 +1475,8 @@ nvme_fc_xmt_ls_rsp_done(struct nvmefc_ls_rsp *lsrsp)
 	fc_dma_unmap_single(lport->dev, lsop->rspdma,
 			sizeof(*lsop->rspbuf), DMA_TO_DEVICE);
 
+	kfree(lsop->rspbuf);
+	kfree(lsop->rqstbuf);
 	kfree(lsop);
 
 	nvme_fc_rport_put(rport);
@@ -1751,20 +1753,17 @@ nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *portptr,
 		goto out_put;
 	}
 
-	lsop = kzalloc(sizeof(*lsop) +
-			sizeof(union nvmefc_ls_requests) +
-			sizeof(union nvmefc_ls_responses),
-			GFP_KERNEL);
-	if (!lsop) {
+	lsop = kzalloc(sizeof(*lsop), GFP_KERNEL);
+	lsop->rqstbuf = kzalloc(sizeof(*lsop->rqstbuf), GFP_KERNEL);
+	lsop->rspbuf = kzalloc(sizeof(*lsop->rspbuf), GFP_KERNEL);
+	if (!lsop || !lsop->rqstbuf || !lsop->rspbuf) {
 		dev_info(lport->dev,
 			"RCV %s LS failed: No memory\n",
 			(w0->ls_cmd <= NVME_FC_LAST_LS_CMD_VALUE) ?
 				nvmefc_ls_names[w0->ls_cmd] : "");
 		ret = -ENOMEM;
-		goto out_put;
+		goto out_free;
 	}
-	lsop->rqstbuf = (union nvmefc_ls_requests *)&lsop[1];
-	lsop->rspbuf = (union nvmefc_ls_responses *)&lsop->rqstbuf[1];
 
 	lsop->rspdma = fc_dma_map_single(lport->dev, lsop->rspbuf,
 					sizeof(*lsop->rspbuf),
@@ -1801,6 +1800,8 @@ nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *portptr,
 	fc_dma_unmap_single(lport->dev, lsop->rspdma,
 			sizeof(*lsop->rspbuf), DMA_TO_DEVICE);
 out_free:
+	kfree(lsop->rspbuf);
+	kfree(lsop->rqstbuf);
 	kfree(lsop);
 out_put:
 	nvme_fc_rport_put(rport);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req()
  2022-10-02  9:59 [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req() Christophe JAILLET
@ 2022-10-02 17:27 ` James Smart
  2022-11-01  9:45 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: James Smart @ 2022-10-02 17:27 UTC (permalink / raw)
  To: linux-nvme

On 10/2/2022 2:59 AM, Christophe JAILLET wrote:
> sizeof( struct nvmefc_ls_rcv_op ) = 64
> sizeof( union nvmefc_ls_requests ) = 1024
> sizeof( union nvmefc_ls_responses ) = 128
> 
> So, in nvme_fc_rcv_ls_req(), 1216 bytes of memory are requested when
> kzalloc() is called.
> 
> Because of the way memory allocations are performed, 2048 bytes are
> allocated. So about 800 bytes are wasted for each request.
> 
> Switch to 3 distinct memory allocations, in order to:
>     - save these 800 bytes
>     - avoid zeroing this extra memory
>     - make sure that memory is properly aligned in case of DMA access
>      ("fc_dma_map_single(lsop->rspbuf)" just a few lines below)
> 
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> This patch is only a RFC to see if this kind of approach makes sense or
> not.
> I've not checked all paths, so it is likely that it is incomplete.

I think it's ok.  some of the other paths that behave similarly may be 
aware of the contiguous allocation, but I don't think this one is.

> 
> Anyway, it is just a trade-of between memory footprint and CPU usage (3
> kzalloc() instead of 1)
> 
> I don't know if it is a slow path or not, nor if the "rport->ls_rcv_list"
> list can get big (each item overuses these 800 bytes of memory)

It is slow path/not often and the ls typically doesn't stay outstanding 
long. thus it hasn't been critical to optimize.

Should be few items on ls_rcv_list, but in rare conditions there could 
be a burst.

> 
> 3 kzalloc is more than just 1 (sic!), but with this patch, 800 bytes are
> not zeroed anymore. Moreover, maybe the zeroing of rqstbuf and/or rspbuf
> can be saved as well.
> So, it could balance the impact of the 3 kzalloc().

zeroing of rqstbuf may be skipped, as the meaningful part of the buffer 
will be memcpy'd a few lines below.

rspbuf should be zero'd.


> 
> So, if it looks promising, s.o. with the corresponding hardware should
> make some measurements on memory and CPU usage.
> ---
>   drivers/nvme/host/fc.c | 17 +++++++++--------
>   1 file changed, 9 insertions(+), 8 deletions(-)
> 

I think the mods are fine

Reviewed-by: James Smart <jsmart2021@gmail.com>

-- james




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req()
  2022-10-02  9:59 [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req() Christophe JAILLET
  2022-10-02 17:27 ` James Smart
@ 2022-11-01  9:45 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2022-11-01  9:45 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: James Smart, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-kernel, kernel-janitors, linux-nvme

Thanks,

applied to nvme-6.2.


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-11-01  9:46 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-02  9:59 [PATCH] nvme-fc: Improve memory usage in nvme_fc_rcv_ls_req() Christophe JAILLET
2022-10-02 17:27 ` James Smart
2022-11-01  9:45 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).