All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bernard Metzler <BMT@zurich.ibm.com>
To: "ira.weiny" <ira.weiny@intel.com>
Cc: "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Mike Marciniszyn" <mike.marciniszyn@cornelisnetworks.com>,
	"Dennis Dalessandro" <dennis.dalessandro@cornelisnetworks.com>,
	"Doug Ledford" <dledford@redhat.com>,
	"Faisal Latif" <faisal.latif@intel.com>,
	"Shiraz Saleem" <shiraz.saleem@intel.com>,
	"Kamal Heib" <kheib@redhat.com>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V3] RDMA/siw: Convert siw_tx_hdt() to kmap_local_page()
Date: Thu, 24 Jun 2021 15:45:55 +0000	[thread overview]
Message-ID: <OF739F2480.B35209F8-ON002586FE.00569A1A-002586FE.00569A24@ch.ibm.com> (raw)
In-Reply-To: <20210623221543.2799198-1-ira.weiny@intel.com>


-----ira.weiny@intel.com wrote: -----

>To: "Jason Gunthorpe" <jgg@ziepe.ca>
>From: ira.weiny@intel.com
>Date: 06/24/2021 12:16AM
>Cc: "Ira Weiny" <ira.weiny@intel.com>, "Mike Marciniszyn"
><mike.marciniszyn@cornelisnetworks.com>, "Dennis Dalessandro"
><dennis.dalessandro@cornelisnetworks.com>, "Doug Ledford"
><dledford@redhat.com>, "Faisal Latif" <faisal.latif@intel.com>,
>"Shiraz Saleem" <shiraz.saleem@intel.com>, "Bernard Metzler"
><bmt@zurich.ibm.com>, "Kamal Heib" <kheib@redhat.com>,
>linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
>Subject: [EXTERNAL] [PATCH V3] RDMA/siw: Convert siw_tx_hdt() to
>kmap_local_page()
>
>From: Ira Weiny <ira.weiny@intel.com>
>
>kmap() is being deprecated and will break uses of device dax after
>PKS
>protection is introduced.[1]
>
>The use of kmap() in siw_tx_hdt() is all thread local therefore
>kmap_local_page() is a sufficient replacement and will work with
>pgmap
>protected pages when those are implemented.
>
>siw_tx_hdt() tracks pages used in a page_array.  It uses that array
>to
>unmap pages which were mapped on function exit.  Not all entries in
>the
>array are mapped and this is tracked in kmap_mask.
>
>kunmap_local() takes a mapped address rather than a page.  Alter
>siw_unmap_pages() to take the iov array to reuse the iov_base address
>of
>each mapping.  Use PAGE_MASK to get the proper address for
>kunmap_local().
>
>kmap_local_page() mappings are tracked in a stack and must be
>unmapped
>in the opposite order they were mapped in.  Because segments are
>mapped
>into the page array in increasing index order, modify
>siw_unmap_pages()
>to unmap pages in decreasing order.
>
>Use kmap_local_page() instead of kmap() to map pages in the
>page_array.
>
>[1]
>INVALID URI REMOVED
>lkml_20201009195033.3208459-2D59-2Dira.weiny-40intel.com_&d=DwIDAg&c=
>jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8ZO1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&
>m=eI4Db7iSlEKRl4l5pYKwY5rL5WXWWxahhxNciwy2lRA&s=vo11VhOvYbAkABdhV6htX
>TmXgFZeWbBZAFnPDvg7Bzs&e= 
>
>Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>
>---
>Jason, I went ahead and left this a separate patch.  Let me know if
>you really
>want this and the other siw squashed.
>
>Changes for V3:
>	From Bernard
>		Use 'p' in kmap_local_page()
>		Use seg as length to siw_unmap_pages()
>
>Changes for V2:
>	From Bernard
>		Reuse iov[].iov_base rather than declaring another array
>		of pointers and preserve the use of kmap_mask to know
>		which iov's were kmapped.
>---
> drivers/infiniband/sw/siw/siw_qp_tx.c | 30
>+++++++++++++++++----------
> 1 file changed, 19 insertions(+), 11 deletions(-)
>
>diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
>b/drivers/infiniband/sw/siw/siw_qp_tx.c
>index db68a10d12cd..89a5b75f7254 100644
>--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
>+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
>@@ -396,13 +396,20 @@ static int siw_0copy_tx(struct socket *s,
>struct page **page,
> 
> #define MAX_TRAILER (MPA_CRC_SIZE + 4)
> 
>-static void siw_unmap_pages(struct page **pp, unsigned long
>kmap_mask)
>+static void siw_unmap_pages(struct kvec *iov, unsigned long
>kmap_mask, int len)
> {
>-	while (kmap_mask) {
>-		if (kmap_mask & BIT(0))
>-			kunmap(*pp);
>-		pp++;
>-		kmap_mask >>= 1;
>+	int i;
>+
>+	/*
>+	 * Work backwards through the array to honor the kmap_local_page()
>+	 * ordering requirements.
>+	 */
>+	for (i = (len-1); i >= 0; i--) {
>+		if (kmap_mask & BIT(i)) {
>+			unsigned long addr = (unsigned long)iov[i].iov_base;
>+
>+			kunmap_local((void *)(addr & PAGE_MASK));
>+		}
> 	}
> }
> 
>@@ -498,7 +505,7 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
>struct socket *s)
> 					p = siw_get_upage(mem->umem,
> 							  sge->laddr + sge_off);
> 				if (unlikely(!p)) {
>-					siw_unmap_pages(page_array, kmap_mask);
>+					siw_unmap_pages(iov, kmap_mask, seg);
> 					wqe->processed -= c_tx->bytes_unsent;
> 					rv = -EFAULT;
> 					goto done_crc;
>@@ -506,11 +513,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx
>*c_tx, struct socket *s)
> 				page_array[seg] = p;
> 
> 				if (!c_tx->use_sendpage) {
>-					iov[seg].iov_base = kmap(p) + fp_off;
>-					iov[seg].iov_len = plen;
>+					void *kaddr = kmap_local_page(p);
> 
> 					/* Remember for later kunmap() */
> 					kmap_mask |= BIT(seg);
>+					iov[seg].iov_base = kaddr + fp_off;
>+					iov[seg].iov_len = plen;
> 
> 					if (do_crc)
> 						crypto_shash_update(
>@@ -542,7 +550,7 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
>struct socket *s)
> 
> 			if (++seg > (int)MAX_ARRAY) {
> 				siw_dbg_qp(tx_qp(c_tx), "to many fragments\n");
>-				siw_unmap_pages(page_array, kmap_mask);
>+				siw_unmap_pages(iov, kmap_mask, seg-1);
> 				wqe->processed -= c_tx->bytes_unsent;
> 				rv = -EMSGSIZE;
> 				goto done_crc;
>@@ -593,7 +601,7 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
>struct socket *s)
> 	} else {
> 		rv = kernel_sendmsg(s, &msg, iov, seg + 1,
> 				    hdr_len + data_len + trl_len);
>-		siw_unmap_pages(page_array, kmap_mask);
>+		siw_unmap_pages(iov, kmap_mask, seg+1);

seg+1 is one to many, since the last segment references the iWarp
trailer (CRC). There are 2 reason for this multi-segment processing
in the transmit path. (1) efficiency and (2) MTU based packet framing.
The iov contains the complete iWarp frame with header, (potentially
multiple) data fragments, and the CRC. It gets pushed to TCP in one
go, praying for iWarp framing stays intact (which most time works).
So the code can collect data form multiple SGE's of a WRITE or
SEND and tries putting those into one frame, if MTU allows, and
adds header and trailer. 
The last segment (seg + 1) references the CRC, which is never kmap'ed.

I'll try the code next days, but it looks good otherwise!

Thanks very much!
> 	}
> 	if (rv < (int)hdr_len) {
> 		/* Not even complete hdr pushed or negative rv */
>-- 
>2.28.0.rc0.12.gb6a658bd00c9
>
>

  parent reply	other threads:[~2021-06-24 15:46 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22  6:14 [PATCH 0/4] Remove use of kmap() ira.weiny
2021-06-22  6:14 ` [PATCH 1/4] RDMA/hfi1: " ira.weiny
2021-06-24 18:13   ` Jason Gunthorpe
2021-06-22  6:14 ` [PATCH 2/4] RDMA/i40iw: " ira.weiny
2021-06-22 12:14   ` Jason Gunthorpe
2021-06-22 16:56   ` [PATCH V2] RDMA/irdma: " ira.weiny
2021-06-24 18:13     ` Jason Gunthorpe
2021-06-22  6:14 ` [PATCH 3/4] RDMA/siw: Remove kmap() ira.weiny
2021-07-15 18:00   ` Jason Gunthorpe
2021-06-22  6:14 ` [PATCH 4/4] RDMA/siw: Convert siw_tx_hdt() to kmap_local_page() ira.weiny
2021-06-22 20:34   ` [PATCH V2] " ira.weiny
2021-06-23 22:15     ` [PATCH V3] " ira.weiny
2021-06-24 17:48       ` [PATCH V4] " ira.weiny
2021-07-15 18:00         ` Jason Gunthorpe
2021-06-29 14:11       ` Bernard Metzler
2021-06-29 22:13         ` Ira Weiny
2021-06-24 15:45     ` Bernard Metzler [this message]
2021-06-24 17:33       ` [PATCH V3] " Ira Weiny
2021-06-23 14:36   ` [PATCH V2] " Bernard Metzler
2021-06-23 15:34     ` Ira Weiny
2021-06-22 16:42 ` [PATCH 4/4] " Bernard Metzler
2021-06-22 20:39   ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OF739F2480.B35209F8-ON002586FE.00569A1A-002586FE.00569A24@ch.ibm.com \
    --to=bmt@zurich.ibm.com \
    --cc=dennis.dalessandro@cornelisnetworks.com \
    --cc=dledford@redhat.com \
    --cc=faisal.latif@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=jgg@ziepe.ca \
    --cc=kheib@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mike.marciniszyn@cornelisnetworks.com \
    --cc=shiraz.saleem@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.