linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Logan Gunthorpe <logang@deltatee.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	Jason Gunthorpe <jgunthorpe@obsidianresearch.com>,
	"James E.J. Bottomley" <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-nvdimm@ml01.01.org, linux-rdma@vger.kernel.org,
	linux-pci@vger.kernel.org,
	Steve Wise <swise@opengridcomputing.com>,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	Keith Busch <keith.busch@intel.com>,
	linux-scsi@vger.kernel.org, Max Gurtovoy <maxg@mellanox.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [RFC 5/8] scatterlist: Modify SG copy functions to support io memory.
Date: Mon, 3 Apr 2017 15:20:57 -0600	[thread overview]
Message-ID: <445bc352-75d7-438f-96ef-c2411215628d@deltatee.com> (raw)
In-Reply-To: <435d4471-436b-87e6-8827-c9fc6cbdde2c@deltatee.com>

Hi Christoph,

What are your thoughts on an approach like the following untested
draft patch.

The patch (if fleshed out) makes it so iomem can be used in an sgl
and WARN_ONs will occur in places where drivers attempt to access
iomem directly through the sgl.

I'd also probably create a p2pmem_alloc_sgl helper function so driver
writers wouldn't have to mess with sg_set_iomem_page.

With all that in place, it should be relatively safe for drivers to
implement p2pmem even though we'd still technically be violating the
__iomem boundary in some places.

Logan


commit b435a154a4ec4f82766f6ab838092c3c5a9388ac
Author: Logan Gunthorpe <logang@deltatee.com>
Date:   Wed Feb 8 12:44:52 2017 -0700

    scatterlist: Add support for iomem pages

    This patch steals another bit from the page_link field to indicate the
    sg points to iomem. In sg_copy_buffer we use this flag to select
    between memcpy and iomemcpy. Other sg_miter users will get an WARN_ON
    unless they indicate they support iomemory by setting the
    SG_MITER_IOMEM flag.

    Also added are sg_kmap functions which would replace a common pattern
    of kmap(sg_page(sg)). These new functions then also warn if the caller
    tries to map io memory. Another option may be to automatically copy
    the iomem to a new page and return that transparently to the driver.

    Another coccinelle patch would then be done to convert kmap(sg_page(sg))
    instances to the appropriate sg_kmap calls.

    Signed-off-by: Logan Gunthorpe <logang@deltatee.com>

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 0007b79..bd690a2c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -37,6 +37,9 @@

 #include <uapi/linux/dma-buf.h>

+/* Avoid the highmem.h macro from aliasing our ops->kunmap_atomic */
+#undef kunmap_atomic
+
 static inline int is_dma_buf_file(struct file *);

 struct dma_buf_list {
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe..7608da0 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <linux/bug.h>
 #include <linux/mm.h>
+#include <linux/highmem.h>
 #include <asm/io.h>

 struct scatterlist {
@@ -53,6 +54,9 @@ struct sg_table {
  *
  * If bit 1 is set, then this sg entry is the last element in a list.
  *
+ * We also use bit 2 to indicate whether the page_link points to an
+ * iomem page or not.
+ *
  * See sg_next().
  *
  */
@@ -64,10 +68,17 @@ struct sg_table {
  * a valid sg entry, or whether it points to the start of a new
scatterlist.
  * Those low bits are there for everyone! (thanks mason :-)
  */
-#define sg_is_chain(sg)		((sg)->page_link & 0x01)
-#define sg_is_last(sg)		((sg)->page_link & 0x02)
+#define PAGE_LINK_MASK	0x7
+#define PAGE_LINK_CHAIN	0x1
+#define PAGE_LINK_LAST	0x2
+#define PAGE_LINK_IOMEM	0x4
+
+#define sg_is_chain(sg)		((sg)->page_link & PAGE_LINK_CHAIN)
+#define sg_is_last(sg)		((sg)->page_link & PAGE_LINK_LAST)
 #define sg_chain_ptr(sg)	\
-	((struct scatterlist *) ((sg)->page_link & ~0x03))
+	((struct scatterlist *) ((sg)->page_link & ~(PAGE_LINK_CHAIN | \
+						     PAGE_LINK_LAST)))
+#define sg_is_iomem(sg)		((sg)->page_link & PAGE_LINK_IOMEM)

 /**
  * sg_assign_page - Assign a given page to an SG entry
@@ -81,13 +92,13 @@ struct sg_table {
  **/
 static inline void sg_assign_page(struct scatterlist *sg, struct page
*page)
 {
-	unsigned long page_link = sg->page_link & 0x3;
+	unsigned long page_link = sg->page_link & PAGE_LINK_MASK;

 	/*
 	 * In order for the low bit stealing approach to work, pages
-	 * must be aligned at a 32-bit boundary as a minimum.
+	 * must be aligned at a 64-bit boundary as a minimum.
 	 */
-	BUG_ON((unsigned long) page & 0x03);
+	BUG_ON((unsigned long) page & PAGE_LINK_MASK);
 #ifdef CONFIG_DEBUG_SG
 	BUG_ON(sg->sg_magic != SG_MAGIC);
 	BUG_ON(sg_is_chain(sg));
@@ -117,13 +128,56 @@ static inline void sg_set_page(struct scatterlist
*sg, struct page *page,
 	sg->length = len;
 }

+/**
+ * sg_set_page - Set sg entry to point at given iomem page
+ * @sg:		 SG entry
+ * @page:	 The page
+ * @len:	 Length of data
+ * @offset:	 Offset into page
+ *
+ * Description:
+ *   Same as sg_set_page but used when the page is a ZONE_DEVICE page that
+ *   points to IO memory.
+ *
+ **/
+static inline void sg_set_iomem_page(struct scatterlist *sg, struct
page *page,
+				     unsigned int len, unsigned int offset)
+{
+	sg_set_page(sg, page, len, offset);
+	sg->page_link |= PAGE_LINK_IOMEM;
+}
+
 static inline struct page *sg_page(struct scatterlist *sg)
 {
 #ifdef CONFIG_DEBUG_SG
 	BUG_ON(sg->sg_magic != SG_MAGIC);
 	BUG_ON(sg_is_chain(sg));
 #endif
-	return (struct page *)((sg)->page_link & ~0x3);
+	return (struct page *)((sg)->page_link & ~PAGE_LINK_MASK);
+}
+
+static inline void *sg_kmap(struct scatterlist *sg)
+{
+	WARN_ON(sg_is_iomem(sg));
+
+	return kmap(sg_page(sg));
+}
+
+static inline void sg_kunmap(struct scatterlist *sg, void *addr)
+{
+	kunmap(addr);
+}
+
+static inline void *sg_kmap_atomic(struct scatterlist *sg)
+{
+	WARN_ON(sg_is_iomem(sg));
+
+	return kmap(sg_page(sg));
+}
+
+static inline void sg_kunmap_atomic(struct scatterlist *sg, void *addr)
+{
+	kunmap_atomic(addr);
 }

 /**
@@ -171,7 +225,8 @@ static inline void sg_chain(struct scatterlist *prv,
unsigned int prv_nents,
 	 * Set lowest bit to indicate a link pointer, and make sure to clear
 	 * the termination bit if it happens to be set.
 	 */
-	prv[prv_nents - 1].page_link = ((unsigned long) sgl | 0x01) & ~0x02;
+	prv[prv_nents - 1].page_link =
+		((unsigned long) sgl & ~PAGE_LINK_MASK) | PAGE_LINK_CHAIN;
 }

 /**
@@ -191,8 +246,8 @@ static inline void sg_mark_end(struct scatterlist *sg)
 	/*
 	 * Set termination bit, clear potential chain bit
 	 */
-	sg->page_link |= 0x02;
-	sg->page_link &= ~0x01;
+	sg->page_link &= ~PAGE_LINK_MASK;
+	sg->page_link |= PAGE_LINK_LAST;
 }

 /**
@@ -208,7 +263,7 @@ static inline void sg_unmark_end(struct scatterlist *sg)
 #ifdef CONFIG_DEBUG_SG
 	BUG_ON(sg->sg_magic != SG_MAGIC);
 #endif
-	sg->page_link &= ~0x02;
+	sg->page_link &= ~PAGE_LINK_LAST;
 }

 /**
@@ -383,6 +438,7 @@ static inline dma_addr_t
sg_page_iter_dma_address(struct sg_page_iter *piter)
 #define SG_MITER_ATOMIC		(1 << 0)	 /* use kmap_atomic */
 #define SG_MITER_TO_SG		(1 << 1)	/* flush back to phys on unmap */
 #define SG_MITER_FROM_SG	(1 << 2)	/* nop */
+#define SG_MITER_IOMEM		(1 << 3)	/* support iomem in miter ops */

 struct sg_mapping_iter {
 	/* the following three fields can be accessed directly */
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c6cf822..6d8f39b 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -580,6 +580,9 @@ bool sg_miter_next(struct sg_mapping_iter *miter)
 	if (!sg_miter_get_next_page(miter))
 		return false;

+	if (!(miter->__flags & SG_MITER_IOMEM))
+		WARN_ON(sg_is_iomem(miter->piter.sg));
+
 	miter->page = sg_page_iter_page(&miter->piter);
 	miter->consumed = miter->length = miter->__remaining;

@@ -651,7 +654,7 @@ size_t sg_copy_buffer(struct scatterlist *sgl,
unsigned int nents, void *buf,
 {
 	unsigned int offset = 0;
 	struct sg_mapping_iter miter;
-	unsigned int sg_flags = SG_MITER_ATOMIC;
+	unsigned int sg_flags = SG_MITER_ATOMIC | SG_MITER_IOMEM;

 	if (to_buffer)
 		sg_flags |= SG_MITER_FROM_SG;
@@ -668,10 +671,17 @@ size_t sg_copy_buffer(struct scatterlist *sgl,
unsigned int nents, void *buf,

 		len = min(miter.length, buflen - offset);

-		if (to_buffer)
-			memcpy(buf + offset, miter.addr, len);
-		else
-			memcpy(miter.addr, buf + offset, len);
+		if (sg_is_iomem(miter.piter.sg)) {
+			if (to_buffer)
+				memcpy_fromio(buf + offset,  miter.addr, len);
+			else
+				memcpy_toio(miter.addr, buf + offset, len);
+		} else {
+			if (to_buffer)
+				memcpy(buf + offset, miter.addr, len);
+			else
+				memcpy(miter.addr, buf + offset, len);
+		}

 		offset += len;
 	}

  reply	other threads:[~2017-04-03 21:21 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-30 22:12 [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory Logan Gunthorpe
2017-03-30 22:12 ` [RFC 1/8] Introduce Peer-to-Peer memory (p2pmem) device Logan Gunthorpe
2017-03-31 18:49   ` Sinan Kaya
2017-03-31 21:23     ` Logan Gunthorpe
2017-03-31 21:38       ` Sinan Kaya
2017-03-31 22:42         ` Logan Gunthorpe
2017-03-31 23:51           ` Sinan Kaya
2017-04-01  1:57             ` Logan Gunthorpe
2017-04-01  2:17               ` okaya
2017-04-01 22:16                 ` Logan Gunthorpe
2017-04-02  2:26                   ` Sinan Kaya
2017-04-02 17:21                     ` Logan Gunthorpe
2017-04-02 21:03                       ` Sinan Kaya
2017-04-03  4:26                         ` Logan Gunthorpe
2017-04-25 11:58                           ` Marta Rybczynska
2017-04-25 16:58                             ` Logan Gunthorpe
2017-03-30 22:12 ` [RFC 2/8] cxgb4: setup pcie memory window 4 and create p2pmem region Logan Gunthorpe
2017-04-04 10:42   ` Sagi Grimberg
2017-04-04 15:56     ` Logan Gunthorpe
2017-04-05 15:41     ` Steve Wise
2017-03-30 22:12 ` [RFC 3/8] nvmet: Use p2pmem in nvme target Logan Gunthorpe
2017-04-04 10:40   ` Sagi Grimberg
2017-04-04 16:16     ` Logan Gunthorpe
2017-04-06  5:47       ` Sagi Grimberg
2017-04-06 15:52         ` Logan Gunthorpe
2017-03-30 22:12 ` [RFC 4/8] p2pmem: Add debugfs "stats" file Logan Gunthorpe
2017-04-04 10:46   ` Sagi Grimberg
2017-04-04 17:25     ` Logan Gunthorpe
2017-04-05 15:43     ` Steve Wise
2017-03-30 22:12 ` [RFC 5/8] scatterlist: Modify SG copy functions to support io memory Logan Gunthorpe
2017-03-31  7:09   ` Christoph Hellwig
2017-03-31 15:41     ` Logan Gunthorpe
2017-04-03 21:20       ` Logan Gunthorpe [this message]
2017-04-03 21:44         ` Dan Williams
2017-04-03 22:10           ` Logan Gunthorpe
2017-04-03 22:47             ` Dan Williams
2017-04-03 23:12               ` Logan Gunthorpe
2017-04-04  0:07                 ` Dan Williams
2017-04-07 17:59                   ` Logan Gunthorpe
2017-03-30 22:12 ` [RFC 6/8] nvmet: Be careful about using iomem accesses when dealing with p2pmem Logan Gunthorpe
2017-04-04 10:59   ` Sagi Grimberg
2017-04-04 15:46     ` Jason Gunthorpe
2017-04-04 17:21       ` Logan Gunthorpe
2017-04-06  5:33       ` Sagi Grimberg
2017-04-06 16:02         ` Logan Gunthorpe
2017-04-06 16:35         ` Jason Gunthorpe
2017-04-07 11:19         ` Stephen  Bates
2017-04-10  8:29           ` Sagi Grimberg
2017-04-10 16:03             ` Logan Gunthorpe
2017-03-30 22:12 ` [RFC 7/8] p2pmem: Support device removal Logan Gunthorpe
2017-03-30 22:12 ` [RFC 8/8] p2pmem: Added char device user interface Logan Gunthorpe
2017-04-12  5:22 ` [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory Benjamin Herrenschmidt
2017-04-12 17:09   ` Logan Gunthorpe
2017-04-12 21:55     ` Benjamin Herrenschmidt
2017-04-13 21:22       ` Logan Gunthorpe
2017-04-13 22:37         ` Benjamin Herrenschmidt
2017-04-13 23:26         ` Bjorn Helgaas
2017-04-14  4:16           ` Jason Gunthorpe
2017-04-14  4:40             ` Logan Gunthorpe
2017-04-14 11:37               ` Benjamin Herrenschmidt
2017-04-14 11:39                 ` Benjamin Herrenschmidt
2017-04-14 11:37             ` Benjamin Herrenschmidt
2017-04-14 17:30               ` Logan Gunthorpe
2017-04-14 19:04                 ` Bjorn Helgaas
2017-04-14 22:07                   ` Benjamin Herrenschmidt
2017-04-15 17:41                     ` Logan Gunthorpe
2017-04-15 22:09                       ` Dan Williams
2017-04-16  3:01                         ` Benjamin Herrenschmidt
2017-04-16  4:46                           ` Logan Gunthorpe
2017-04-16 15:53                           ` Dan Williams
2017-04-16 16:34                             ` Logan Gunthorpe
2017-04-16 22:31                               ` Benjamin Herrenschmidt
2017-04-24  7:36                                 ` Knut Omang
2017-04-24 16:14                                   ` Logan Gunthorpe
2017-04-25  6:30                                     ` Knut Omang
2017-04-25 17:03                                       ` Logan Gunthorpe
2017-04-25 21:23                                         ` Stephen  Bates
2017-04-25 21:23                                   ` Stephen  Bates
2017-04-16 22:26                             ` Benjamin Herrenschmidt
2017-04-15 22:17                       ` Benjamin Herrenschmidt
2017-04-16  5:36                         ` Logan Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=445bc352-75d7-438f-96ef-c2411215628d@deltatee.com \
    --to=logang@deltatee.com \
    --cc=axboe@kernel.dk \
    --cc=hch@infradead.org \
    --cc=hch@lst.de \
    --cc=jejb@linux.vnet.ibm.com \
    --cc=jgunthorpe@obsidianresearch.com \
    --cc=keith.busch@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@ml01.01.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=maxg@mellanox.com \
    --cc=swise@opengridcomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).