nvdimm.lists.linux.dev archive mirror
* [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver
@ 2017-08-03 23:31 Dave Jiang
  2017-08-03 23:31 ` [PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels Dave Jiang
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:31 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

v3:
- Added patch to rename DMA_SG to DMA_SG_SG to make it explicit
- Added DMA_MEMCPY_SG transaction type to dmaengine
- Misc patch to add verification of DMA_MEMSET_SG that was missing
- Addressed all nd_pmem driver comments from Ross.

v2:
- Make dma_prep_memcpy_* into one function per Dan.
- Addressed various comments from Ross about code formatting, etc.
- Replaced open code with offset_in_page() macro per Johannes.

The following series adds blk-mq support to the pmem block driver and also
adds infrastructure code to ioatdma and dmaengine in order to support
copying to and from a scatterlist, as needed to process the block requests
provided by blk-mq. Using the DMA engines available on certain platforms
allows us to drastically reduce CPU utilization while maintaining good
enough performance. Experiments on a DRAM-backed pmem block device showed
that using the DMA engine is beneficial. Users can revert to the original
behavior by passing queue_mode=0 to the nd_pmem kernel module if desired.
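
For example, to force the original bio-based path (usage sketch, assuming
nd_pmem is built as a module):

  modprobe nd_pmem queue_mode=0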

---

Dave Jiang (8):
      dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels
      dmaengine: change transaction type DMA_SG to DMA_SG_SG
      dmaengine: Add DMA_MEMCPY_SG transaction op
      dmaengine: add verification of DMA_MEMSET_SG in dmaengine
      dmaengine: ioatdma: dma_prep_memcpy_sg support
      dmaengine: add SG support to dmaengine_unmap
      libnvdimm: Adding blk-mq support to the pmem driver
      libnvdimm: add DMA support for pmem blk-mq


 Documentation/dmaengine/provider.txt |    5 -
 drivers/crypto/ccp/ccp-dmaengine.c   |    2 
 drivers/dma/at_hdmac.c               |    8 -
 drivers/dma/dmaengine.c              |   51 ++++-
 drivers/dma/dmatest.c                |   12 +
 drivers/dma/fsldma.c                 |    2 
 drivers/dma/ioat/dma.h               |    4 
 drivers/dma/ioat/init.c              |    5 -
 drivers/dma/ioat/prep.c              |   57 ++++++
 drivers/dma/mv_xor.c                 |    6 -
 drivers/dma/nbpfaxi.c                |    2 
 drivers/dma/ste_dma40.c              |    6 -
 drivers/dma/xgene-dma.c              |    4 
 drivers/dma/xilinx/zynqmp_dma.c      |    2 
 drivers/nvdimm/pmem.c                |  333 ++++++++++++++++++++++++++++++++--
 drivers/nvdimm/pmem.h                |    3 
 include/linux/dmaengine.h            |   12 +
 17 files changed, 456 insertions(+), 58 deletions(-)

--
Signature

* [PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
@ 2017-08-03 23:31 ` Dave Jiang
  2017-08-03 23:31 ` [PATCH v3 2/8] dmaengine: change transaction type DMA_SG to DMA_SG_SG Dave Jiang
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:31 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

Commit 7618d0359c16 ("dmaengine: ioatdma: Set non RAID channels to be
private capable") made all non-RAID ioatdma channels private, i.e.
requestable by dma_request_channel(). With PQ CAP support going away for
ioatdma, this would make all channels private. To support the blk-mq
implementation of pmem we need as many shareable channels as possible in
order to perform well. Thus, revert the patch.
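
For reference, a minimal client-side sketch (not part of this patch; the
pmem driver in patch 8 uses the same calls) of how shared channels are
obtained. dma_find_channel() only returns channels that are not marked
DMA_PRIVATE, which is why the ioatdma channels need to stay public:

#include <linux/dmaengine.h>

/* Sketch only: take a reference on the public channel pool and pick a
 * channel with the wanted capability, shared round-robin across clients.
 */
static struct dma_chan *example_get_shared_chan(void)
{
	dmaengine_get();
	return dma_find_channel(DMA_MEMCPY);
}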

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/dma/ioat/init.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma/ioat/init.c b/drivers/dma/ioat/init.c
index ed8ed11..1b881fb 100644
--- a/drivers/dma/ioat/init.c
+++ b/drivers/dma/ioat/init.c
@@ -1153,9 +1153,6 @@ static int ioat3_dma_probe(struct ioatdma_device *ioat_dma, int dca)
 		}
 	}
 
-	if (!(ioat_dma->cap & (IOAT_CAP_XOR | IOAT_CAP_PQ)))
-		dma_cap_set(DMA_PRIVATE, dma->cap_mask);
-
 	err = ioat_probe(ioat_dma);
 	if (err)
 		return err;


* [PATCH v3 2/8] dmaengine: change transaction type DMA_SG to DMA_SG_SG
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
  2017-08-03 23:31 ` [PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels Dave Jiang
@ 2017-08-03 23:31 ` Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 3/8] dmaengine: Add DMA_MEMCPY_SG transaction op Dave Jiang
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:31 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

In preparation for adding a dmaengine API that copies between a scatterlist
and a flat buffer, rename DMA_SG to DMA_SG_SG in order to make explicit
what this op type is for.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 Documentation/dmaengine/provider.txt |    2 +-
 drivers/crypto/ccp/ccp-dmaengine.c   |    2 +-
 drivers/dma/at_hdmac.c               |    8 ++++----
 drivers/dma/dmaengine.c              |    2 +-
 drivers/dma/dmatest.c                |   12 ++++++------
 drivers/dma/fsldma.c                 |    2 +-
 drivers/dma/mv_xor.c                 |    6 +++---
 drivers/dma/nbpfaxi.c                |    2 +-
 drivers/dma/ste_dma40.c              |    6 +++---
 drivers/dma/xgene-dma.c              |    4 ++--
 drivers/dma/xilinx/zynqmp_dma.c      |    2 +-
 include/linux/dmaengine.h            |    2 +-
 12 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/Documentation/dmaengine/provider.txt b/Documentation/dmaengine/provider.txt
index e33bc1c..8f189c9 100644
--- a/Documentation/dmaengine/provider.txt
+++ b/Documentation/dmaengine/provider.txt
@@ -181,7 +181,7 @@ Currently, the types available are:
     - Used by the client drivers to register a callback that will be
       called on a regular basis through the DMA controller interrupt
 
-  * DMA_SG
+  * DMA_SG_SG
     - The device supports memory to memory scatter-gather
       transfers.
     - Even though a plain memcpy can look like a particular case of a
diff --git a/drivers/crypto/ccp/ccp-dmaengine.c b/drivers/crypto/ccp/ccp-dmaengine.c
index e00be01..f477c82 100644
--- a/drivers/crypto/ccp/ccp-dmaengine.c
+++ b/drivers/crypto/ccp/ccp-dmaengine.c
@@ -704,7 +704,7 @@ int ccp_dmaengine_register(struct ccp_device *ccp)
 	dma_dev->directions = DMA_MEM_TO_MEM;
 	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
 	dma_cap_set(DMA_MEMCPY, dma_dev->cap_mask);
-	dma_cap_set(DMA_SG, dma_dev->cap_mask);
+	dma_cap_set(DMA_SG_SG, dma_dev->cap_mask);
 	dma_cap_set(DMA_INTERRUPT, dma_dev->cap_mask);
 
 	/* The DMA channels for this device can be set to public or private,
diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 1baf340..7124074 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -1933,14 +1933,14 @@ static int __init at_dma_probe(struct platform_device *pdev)
 
 	/* setup platform data for each SoC */
 	dma_cap_set(DMA_MEMCPY, at91sam9rl_config.cap_mask);
-	dma_cap_set(DMA_SG, at91sam9rl_config.cap_mask);
+	dma_cap_set(DMA_SG_SG, at91sam9rl_config.cap_mask);
 	dma_cap_set(DMA_INTERLEAVE, at91sam9g45_config.cap_mask);
 	dma_cap_set(DMA_MEMCPY, at91sam9g45_config.cap_mask);
 	dma_cap_set(DMA_MEMSET, at91sam9g45_config.cap_mask);
 	dma_cap_set(DMA_MEMSET_SG, at91sam9g45_config.cap_mask);
 	dma_cap_set(DMA_PRIVATE, at91sam9g45_config.cap_mask);
 	dma_cap_set(DMA_SLAVE, at91sam9g45_config.cap_mask);
-	dma_cap_set(DMA_SG, at91sam9g45_config.cap_mask);
+	dma_cap_set(DMA_SG_SG, at91sam9g45_config.cap_mask);
 
 	/* get DMA parameters from controller type */
 	plat_dat = at_dma_get_driver_data(pdev);
@@ -2078,7 +2078,7 @@ static int __init at_dma_probe(struct platform_device *pdev)
 		atdma->dma_common.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 	}
 
-	if (dma_has_cap(DMA_SG, atdma->dma_common.cap_mask))
+	if (dma_has_cap(DMA_SG_SG, atdma->dma_common.cap_mask))
 		atdma->dma_common.device_prep_dma_sg = atc_prep_dma_sg;
 
 	dma_writel(atdma, EN, AT_DMA_ENABLE);
@@ -2087,7 +2087,7 @@ static int __init at_dma_probe(struct platform_device *pdev)
 	  dma_has_cap(DMA_MEMCPY, atdma->dma_common.cap_mask) ? "cpy " : "",
 	  dma_has_cap(DMA_MEMSET, atdma->dma_common.cap_mask) ? "set " : "",
 	  dma_has_cap(DMA_SLAVE, atdma->dma_common.cap_mask)  ? "slave " : "",
-	  dma_has_cap(DMA_SG, atdma->dma_common.cap_mask)  ? "sg-cpy " : "",
+	  dma_has_cap(DMA_SG_SG, atdma->dma_common.cap_mask)  ? "sg-cpy " : "",
 	  plat_dat->nr_channels);
 
 	dma_async_device_register(&atdma->dma_common);
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index d9118ec..2219b1f 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -937,7 +937,7 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_memset);
 	BUG_ON(dma_has_cap(DMA_INTERRUPT, device->cap_mask) &&
 		!device->device_prep_dma_interrupt);
-	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
+	BUG_ON(dma_has_cap(DMA_SG_SG, device->cap_mask) &&
 		!device->device_prep_dma_sg);
 	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
 		!device->device_prep_dma_cyclic);
diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index a07ef3d..b062113 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -448,7 +448,7 @@ static int dmatest_func(void *data)
 	if (thread->type == DMA_MEMCPY) {
 		align = dev->copy_align;
 		src_cnt = dst_cnt = 1;
-	} else if (thread->type == DMA_SG) {
+	} else if (thread->type == DMA_SG_SG) {
 		align = dev->copy_align;
 		src_cnt = dst_cnt = sg_buffers;
 	} else if (thread->type == DMA_XOR) {
@@ -640,7 +640,7 @@ static int dmatest_func(void *data)
 			tx = dev->device_prep_dma_memcpy(chan,
 							 dsts[0] + dst_off,
 							 srcs[0], len, flags);
-		else if (thread->type == DMA_SG)
+		else if (thread->type == DMA_SG_SG)
 			tx = dev->device_prep_dma_sg(chan, tx_sg, src_cnt,
 						     rx_sg, src_cnt, flags);
 		else if (thread->type == DMA_XOR)
@@ -821,7 +821,7 @@ static int dmatest_add_threads(struct dmatest_info *info,
 
 	if (type == DMA_MEMCPY)
 		op = "copy";
-	else if (type == DMA_SG)
+	else if (type == DMA_SG_SG)
 		op = "sg";
 	else if (type == DMA_XOR)
 		op = "xor";
@@ -883,9 +883,9 @@ static int dmatest_add_channel(struct dmatest_info *info,
 		}
 	}
 
-	if (dma_has_cap(DMA_SG, dma_dev->cap_mask)) {
+	if (dma_has_cap(DMA_SG_SG, dma_dev->cap_mask)) {
 		if (dmatest == 1) {
-			cnt = dmatest_add_threads(info, dtc, DMA_SG);
+			cnt = dmatest_add_threads(info, dtc, DMA_SG_SG);
 			thread_count += cnt > 0 ? cnt : 0;
 		}
 	}
@@ -962,7 +962,7 @@ static void run_threaded_test(struct dmatest_info *info)
 
 	request_channels(info, DMA_MEMCPY);
 	request_channels(info, DMA_XOR);
-	request_channels(info, DMA_SG);
+	request_channels(info, DMA_SG_SG);
 	request_channels(info, DMA_PQ);
 }
 
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 3b8b752..bb63dbe 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -1357,7 +1357,7 @@ static int fsldma_of_probe(struct platform_device *op)
 	fdev->irq = irq_of_parse_and_map(op->dev.of_node, 0);
 
 	dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
-	dma_cap_set(DMA_SG, fdev->common.cap_mask);
+	dma_cap_set(DMA_SG_SG, fdev->common.cap_mask);
 	dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
 	fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
 	fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index 25bc5b1..109501d 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -1254,7 +1254,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
 		dma_dev->device_prep_dma_interrupt = mv_xor_prep_dma_interrupt;
 	if (dma_has_cap(DMA_MEMCPY, dma_dev->cap_mask))
 		dma_dev->device_prep_dma_memcpy = mv_xor_prep_dma_memcpy;
-	if (dma_has_cap(DMA_SG, dma_dev->cap_mask))
+	if (dma_has_cap(DMA_SG_SG, dma_dev->cap_mask))
 		dma_dev->device_prep_dma_sg = mv_xor_prep_dma_sg;
 	if (dma_has_cap(DMA_XOR, dma_dev->cap_mask)) {
 		dma_dev->max_xor = 8;
@@ -1309,7 +1309,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
 		 mv_chan->op_in_desc ? "Descriptor Mode" : "Registers Mode",
 		 dma_has_cap(DMA_XOR, dma_dev->cap_mask) ? "xor " : "",
 		 dma_has_cap(DMA_MEMCPY, dma_dev->cap_mask) ? "cpy " : "",
-		 dma_has_cap(DMA_SG, dma_dev->cap_mask) ? "sg " : "",
+		 dma_has_cap(DMA_SG_SG, dma_dev->cap_mask) ? "sg " : "",
 		 dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask) ? "intr " : "");
 
 	dma_async_device_register(dma_dev);
@@ -1552,7 +1552,7 @@ static int mv_xor_probe(struct platform_device *pdev)
 
 			dma_cap_zero(cap_mask);
 			dma_cap_set(DMA_MEMCPY, cap_mask);
-			dma_cap_set(DMA_SG, cap_mask);
+			dma_cap_set(DMA_SG_SG, cap_mask);
 			dma_cap_set(DMA_XOR, cap_mask);
 			dma_cap_set(DMA_INTERRUPT, cap_mask);
 
diff --git a/drivers/dma/nbpfaxi.c b/drivers/dma/nbpfaxi.c
index 3f45b9b..1127629 100644
--- a/drivers/dma/nbpfaxi.c
+++ b/drivers/dma/nbpfaxi.c
@@ -1417,7 +1417,7 @@ static int nbpf_probe(struct platform_device *pdev)
 	dma_cap_set(DMA_MEMCPY, dma_dev->cap_mask);
 	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
 	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
-	dma_cap_set(DMA_SG, dma_dev->cap_mask);
+	dma_cap_set(DMA_SG_SG, dma_dev->cap_mask);
 
 	/* Common and MEMCPY operations */
 	dma_dev->device_alloc_chan_resources
diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c
index c3052fb..d66ba81 100644
--- a/drivers/dma/ste_dma40.c
+++ b/drivers/dma/ste_dma40.c
@@ -2821,7 +2821,7 @@ static void d40_ops_init(struct d40_base *base, struct dma_device *dev)
 		dev->copy_align = DMAENGINE_ALIGN_4_BYTES;
 	}
 
-	if (dma_has_cap(DMA_SG, dev->cap_mask))
+	if (dma_has_cap(DMA_SG_SG, dev->cap_mask))
 		dev->device_prep_dma_sg = d40_prep_memcpy_sg;
 
 	if (dma_has_cap(DMA_CYCLIC, dev->cap_mask))
@@ -2865,7 +2865,7 @@ static int __init d40_dmaengine_init(struct d40_base *base,
 
 	dma_cap_zero(base->dma_memcpy.cap_mask);
 	dma_cap_set(DMA_MEMCPY, base->dma_memcpy.cap_mask);
-	dma_cap_set(DMA_SG, base->dma_memcpy.cap_mask);
+	dma_cap_set(DMA_SG_SG, base->dma_memcpy.cap_mask);
 
 	d40_ops_init(base, &base->dma_memcpy);
 
@@ -2883,7 +2883,7 @@ static int __init d40_dmaengine_init(struct d40_base *base,
 	dma_cap_zero(base->dma_both.cap_mask);
 	dma_cap_set(DMA_SLAVE, base->dma_both.cap_mask);
 	dma_cap_set(DMA_MEMCPY, base->dma_both.cap_mask);
-	dma_cap_set(DMA_SG, base->dma_both.cap_mask);
+	dma_cap_set(DMA_SG_SG, base->dma_both.cap_mask);
 	dma_cap_set(DMA_CYCLIC, base->dma_slave.cap_mask);
 
 	d40_ops_init(base, &base->dma_both);
diff --git a/drivers/dma/xgene-dma.c b/drivers/dma/xgene-dma.c
index 8b693b7..216dbb7 100644
--- a/drivers/dma/xgene-dma.c
+++ b/drivers/dma/xgene-dma.c
@@ -1653,7 +1653,7 @@ static void xgene_dma_set_caps(struct xgene_dma_chan *chan,
 	dma_cap_zero(dma_dev->cap_mask);
 
 	/* Set DMA device capability */
-	dma_cap_set(DMA_SG, dma_dev->cap_mask);
+	dma_cap_set(DMA_SG_SG, dma_dev->cap_mask);
 
 	/* Basically here, the X-Gene SoC DMA engine channel 0 supports XOR
 	 * and channel 1 supports XOR, PQ both. First thing here is we have
@@ -1732,7 +1732,7 @@ static int xgene_dma_async_register(struct xgene_dma *pdma, int id)
 	/* DMA capability info */
 	dev_info(pdma->dev,
 		 "%s: CAPABILITY ( %s%s%s)\n", dma_chan_name(&chan->dma_chan),
-		 dma_has_cap(DMA_SG, dma_dev->cap_mask) ? "SGCPY " : "",
+		 dma_has_cap(DMA_SG_SG, dma_dev->cap_mask) ? "SGCPY " : "",
 		 dma_has_cap(DMA_XOR, dma_dev->cap_mask) ? "XOR " : "",
 		 dma_has_cap(DMA_PQ, dma_dev->cap_mask) ? "PQ " : "");
 
diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
index 47f6419..0258c2e 100644
--- a/drivers/dma/xilinx/zynqmp_dma.c
+++ b/drivers/dma/xilinx/zynqmp_dma.c
@@ -1064,7 +1064,7 @@ static int zynqmp_dma_probe(struct platform_device *pdev)
 	INIT_LIST_HEAD(&zdev->common.channels);
 
 	dma_set_mask(&pdev->dev, DMA_BIT_MASK(44));
-	dma_cap_set(DMA_SG, zdev->common.cap_mask);
+	dma_cap_set(DMA_SG_SG, zdev->common.cap_mask);
 	dma_cap_set(DMA_MEMCPY, zdev->common.cap_mask);
 
 	p = &zdev->common;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 5336808..b1182c6 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -68,7 +68,7 @@ enum dma_transaction_type {
 	DMA_MEMSET,
 	DMA_MEMSET_SG,
 	DMA_INTERRUPT,
-	DMA_SG,
+	DMA_SG_SG,
 	DMA_PRIVATE,
 	DMA_ASYNC_TX,
 	DMA_SLAVE,


* [PATCH v3 3/8] dmaengine: Add DMA_MEMCPY_SG transaction op
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
  2017-08-03 23:31 ` [PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels Dave Jiang
  2017-08-03 23:31 ` [PATCH v3 2/8] dmaengine: change transaction type DMA_SG to DMA_SG_SG Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 4/8] dmaengine: add verification of DMA_MEMSET_SG in dmaengine Dave Jiang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

Add a dmaengine transaction operation that copies between a scatterlist
and a flat, physically contiguous buffer.
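
A minimal, hypothetical client sketch of the new op (names are
illustrative; the pmem driver in patch 8 drives the op the same way):

#include <linux/errno.h>
#include <linux/dmaengine.h>
#include <linux/scatterlist.h>

/*
 * Sketch: copy between an already DMA-mapped scatterlist and an already
 * DMA-mapped flat buffer. to_sg selects the direction: buffer -> sg when
 * true, sg -> buffer when false.
 */
static int example_memcpy_sg(struct dma_chan *chan, struct scatterlist *sg,
			     unsigned int nents, dma_addr_t buf, bool to_sg)
{
	struct dma_device *dma = chan->device;
	struct dma_async_tx_descriptor *txd;
	dma_cookie_t cookie;

	txd = dma->device_prep_dma_memcpy_sg(chan, sg, nents, buf, to_sg,
					      DMA_PREP_INTERRUPT);
	if (!txd)
		return -ENXIO;

	cookie = dmaengine_submit(txd);
	if (dma_submit_error(cookie))
		return -ENXIO;

	dma_async_issue_pending(chan);
	return 0;
}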

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 Documentation/dmaengine/provider.txt |    3 +++
 drivers/dma/dmaengine.c              |    2 ++
 include/linux/dmaengine.h            |    6 ++++++
 3 files changed, 11 insertions(+)

diff --git a/Documentation/dmaengine/provider.txt b/Documentation/dmaengine/provider.txt
index 8f189c9..03e8650 100644
--- a/Documentation/dmaengine/provider.txt
+++ b/Documentation/dmaengine/provider.txt
@@ -188,6 +188,9 @@ Currently, the types available are:
       scatter-gather transfer, with a single chunk to transfer, it's a
       distinct transaction type in the mem2mem transfers case
 
+  * DMA_MEMCPY_SG
+    - The device supports scatterlist to/from memory.
+
   * DMA_PRIVATE
     - The devices only supports slave transfers, and as such isn't
       available for async transfers.
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 2219b1f..1c424f6 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -939,6 +939,8 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_interrupt);
 	BUG_ON(dma_has_cap(DMA_SG_SG, device->cap_mask) &&
 		!device->device_prep_dma_sg);
+	BUG_ON(dma_has_cap(DMA_MEMCPY_SG, device->cap_mask) &&
+		!device->device_prep_dma_memcpy_sg);
 	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
 		!device->device_prep_dma_cyclic);
 	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index b1182c6..fc25475 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -67,6 +67,7 @@ enum dma_transaction_type {
 	DMA_PQ_VAL,
 	DMA_MEMSET,
 	DMA_MEMSET_SG,
+	DMA_MEMCPY_SG,
 	DMA_INTERRUPT,
 	DMA_SG_SG,
 	DMA_PRIVATE,
@@ -693,6 +694,7 @@ struct dma_filter {
  * @device_prep_dma_pq_val: prepares a pqzero_sum operation
  * @device_prep_dma_memset: prepares a memset operation
  * @device_prep_dma_memset_sg: prepares a memset operation over a scatter list
+ * @device_prep_dma_memcpy_sg: prepares memcpy between scatterlist and buffer
  * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
  * @device_prep_slave_sg: prepares a slave dma operation
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
@@ -769,6 +771,10 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_memset_sg)(
 		struct dma_chan *chan, struct scatterlist *sg,
 		unsigned int nents, int value, unsigned long flags);
+	struct dma_async_tx_descriptor *(*device_prep_dma_memcpy_sg)(
+		struct dma_chan *chan,
+		struct scatterlist *dst_sg, unsigned int dst_nents,
+		dma_addr_t src, bool to_sg, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
 		struct dma_chan *chan, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_sg)(


* [PATCH v3 4/8] dmaengine: add verification of DMA_MEMSET_SG in dmaengine
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
                   ` (2 preceding siblings ...)
  2017-08-03 23:32 ` [PATCH v3 3/8] dmaengine: Add DMA_MEMCPY_SG transaction op Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 5/8] dmaengine: ioatdma: dma_prep_memcpy_sg support Dave Jiang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

DMA_MEMSET_SG is missing the registration-time verification that the
capability is only set when a supporting device_prep_dma_memset_sg
function is provided. Add the check to dma_async_device_register().

Fixes: 50c7cd2bd ("dmaengine: Add scatter-gathered memset")

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/dma/dmaengine.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 1c424f6..d9a71f0 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -935,6 +935,8 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_pq_val);
 	BUG_ON(dma_has_cap(DMA_MEMSET, device->cap_mask) &&
 		!device->device_prep_dma_memset);
+	BUG_ON(dma_has_cap(DMA_MEMSET_SG, device->cap_mask) &&
+		!device->device_prep_dma_memset_sg);
 	BUG_ON(dma_has_cap(DMA_INTERRUPT, device->cap_mask) &&
 		!device->device_prep_dma_interrupt);
 	BUG_ON(dma_has_cap(DMA_SG_SG, device->cap_mask) &&


* [PATCH v3 5/8] dmaengine: ioatdma: dma_prep_memcpy_sg support
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
                   ` (3 preceding siblings ...)
  2017-08-03 23:32 ` [PATCH v3 4/8] dmaengine: add verification of DMA_MEMSET_SG in dmaengine Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap Dave Jiang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

Add ioatdma support for copying between a physically contiguous buffer
and a provided scatterlist, in either direction. This is used to support
reading/writing persistent memory in the pmem driver.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/dma/ioat/dma.h  |    4 +++
 drivers/dma/ioat/init.c |    2 ++
 drivers/dma/ioat/prep.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+)

diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
index 56200ee..6c08b06 100644
--- a/drivers/dma/ioat/dma.h
+++ b/drivers/dma/ioat/dma.h
@@ -370,6 +370,10 @@ struct dma_async_tx_descriptor *
 ioat_dma_prep_memcpy_lock(struct dma_chan *c, dma_addr_t dma_dest,
 			   dma_addr_t dma_src, size_t len, unsigned long flags);
 struct dma_async_tx_descriptor *
+ioat_dma_prep_memcpy_sg_lock(struct dma_chan *c,
+		struct scatterlist *sg, unsigned int sg_nents,
+		dma_addr_t dma_addr, bool to_sg, unsigned long flags);
+struct dma_async_tx_descriptor *
 ioat_prep_interrupt_lock(struct dma_chan *c, unsigned long flags);
 struct dma_async_tx_descriptor *
 ioat_prep_xor(struct dma_chan *chan, dma_addr_t dest, dma_addr_t *src,
diff --git a/drivers/dma/ioat/init.c b/drivers/dma/ioat/init.c
index 1b881fb..5c69ff6 100644
--- a/drivers/dma/ioat/init.c
+++ b/drivers/dma/ioat/init.c
@@ -1081,6 +1081,8 @@ static int ioat3_dma_probe(struct ioatdma_device *ioat_dma, int dca)
 
 	dma = &ioat_dma->dma_dev;
 	dma->device_prep_dma_memcpy = ioat_dma_prep_memcpy_lock;
+	dma_cap_set(DMA_MEMCPY_SG, dma->cap_mask);
+	dma->device_prep_dma_memcpy_sg = ioat_dma_prep_memcpy_sg_lock;
 	dma->device_issue_pending = ioat_issue_pending;
 	dma->device_alloc_chan_resources = ioat_alloc_chan_resources;
 	dma->device_free_chan_resources = ioat_free_chan_resources;
diff --git a/drivers/dma/ioat/prep.c b/drivers/dma/ioat/prep.c
index 243421a..d8219af 100644
--- a/drivers/dma/ioat/prep.c
+++ b/drivers/dma/ioat/prep.c
@@ -159,6 +159,63 @@ ioat_dma_prep_memcpy_lock(struct dma_chan *c, dma_addr_t dma_dest,
 	return &desc->txd;
 }
 
+struct dma_async_tx_descriptor *
+ioat_dma_prep_memcpy_sg_lock(struct dma_chan *c,
+		struct scatterlist *sg, unsigned int sg_nents,
+		dma_addr_t dma_addr, bool to_sg, unsigned long flags)
+{
+	struct ioatdma_chan *ioat_chan = to_ioat_chan(c);
+	struct ioat_dma_descriptor *hw = NULL;
+	struct ioat_ring_ent *desc = NULL;
+	dma_addr_t dma_off = dma_addr;
+	int num_descs, idx, i;
+	struct scatterlist *s;
+	size_t total_len = 0, len;
+
+
+	if (test_bit(IOAT_CHAN_DOWN, &ioat_chan->state))
+		return NULL;
+
+	/*
+	 * The upper layer will guarantee that each entry does not exceed
+	 * xfercap.
+	 */
+	num_descs = sg_nents;
+
+	if (likely(num_descs) &&
+	    ioat_check_space_lock(ioat_chan, num_descs) == 0)
+		idx = ioat_chan->head;
+	else
+		return NULL;
+
+	for_each_sg(sg, s, sg_nents, i) {
+		desc = ioat_get_ring_ent(ioat_chan, idx + i);
+		hw = desc->hw;
+		len = sg_dma_len(s);
+		hw->size = len;
+		hw->ctl = 0;
+		if (to_sg) {
+			hw->src_addr = dma_off;
+			hw->dst_addr = sg_dma_address(s);
+		} else {
+			hw->src_addr = sg_dma_address(s);
+			hw->dst_addr = dma_off;
+		}
+		dma_off += len;
+		total_len += len;
+		dump_desc_dbg(ioat_chan, desc);
+	}
+
+	desc->txd.flags = flags;
+	desc->len = total_len;
+	hw->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT);
+	hw->ctl_f.fence = !!(flags & DMA_PREP_FENCE);
+	hw->ctl_f.compl_write = 1;
+	dump_desc_dbg(ioat_chan, desc);
+	/* we leave the channel locked to ensure in order submission */
+
+	return &desc->txd;
+}
 
 static struct dma_async_tx_descriptor *
 __ioat_prep_xor_lock(struct dma_chan *c, enum sum_check_flags *result,


* [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
                   ` (4 preceding siblings ...)
  2017-08-03 23:32 ` [PATCH v3 5/8] dmaengine: ioatdma: dma_prep_memcpy_sg support Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-04 13:24   ` kbuild test robot
  2017-08-03 23:32 ` [PATCH v3 7/8] libnvdimm: Adding blk-mq support to the pmem driver Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq Dave Jiang
  7 siblings, 1 reply; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

This provides support for unmapping a scatterlist with
dmaengine_unmap_data. Only one scatterlist per direction is supported.
For unmap structures with two or fewer entries, the DMA address array is
overloaded to store the SG pointer(s), and two extra entries carry the
corresponding nents.
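
A hedged usage sketch (the helpers come from this patch, the calling
pattern mirrors patch 8) for a flat-buffer to scatterlist copy, i.e. one
mapped source page plus one destination scatterlist:

#include <linux/gfp.h>
#include <linux/dmaengine.h>
#include <linux/scatterlist.h>

/*
 * Sketch: build unmap data for a buffer -> scatterlist copy. Slot 0 holds
 * the mapped source page, slot 1 is overloaded with the destination
 * scatterlist pointer, and the extra "from nents" entry tells
 * dmaengine_unmap() to use dma_unmap_sg() for that side.
 */
static struct dmaengine_unmap_data *
example_sg_unmap(struct device *dev, dma_addr_t buf_dma, size_t len,
		 struct scatterlist *sg, int sg_nents)
{
	struct dmaengine_unmap_data *unmap;

	unmap = dmaengine_get_unmap_data(dev, 2, GFP_NOWAIT);
	if (!unmap)
		return NULL;

	unmap->len = len;
	unmap->addr[0] = buf_dma;	/* flat source, dma_unmap_page() side */
	unmap->to_cnt = 1;
	unmap->addr[1] = (dma_addr_t)(unsigned long)sg;	/* SG pointer */
	unmap->from_cnt = 0;
	dma_unmap_data_sg_from_nents(unmap, 2) = sg_nents;

	return unmap;
}

The caller then attaches it to the descriptor with dma_set_unmap(txd,
unmap) and drops its own reference with dmaengine_unmap_put() after
submission, as patch 8 does. The intermediate cast through unsigned long
sidesteps the 32-bit pointer/integer size warning the kbuild robot
reports later in the thread.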

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/dma/dmaengine.c   |   45 ++++++++++++++++++++++++++++++++++++---------
 include/linux/dmaengine.h |    4 ++++
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index d9a71f0..230da68 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -1130,16 +1130,35 @@ static void dmaengine_unmap(struct kref *kref)
 {
 	struct dmaengine_unmap_data *unmap = container_of(kref, typeof(*unmap), kref);
 	struct device *dev = unmap->dev;
-	int cnt, i;
+	int cnt, i, sg_nents;
+	struct scatterlist *sg;
+
+	sg_nents = dma_unmap_data_sg_to_nents(unmap, unmap->map_cnt);
+	if (sg_nents) {
+		i = 0;
+		cnt = 1;
+		sg = (struct scatterlist *)unmap->addr[i];
+		dma_unmap_sg(dev, sg, sg_nents, DMA_TO_DEVICE);
+	} else {
+		cnt = unmap->to_cnt;
+		for (i = 0; i < cnt; i++)
+			dma_unmap_page(dev, unmap->addr[i], unmap->len,
+					DMA_TO_DEVICE);
+	}
+
+	sg_nents = dma_unmap_data_sg_from_nents(unmap, unmap->map_cnt);
+	if (sg_nents) {
+		sg = (struct scatterlist *)unmap->addr[i];
+		dma_unmap_sg(dev, sg, sg_nents, DMA_FROM_DEVICE);
+		cnt++;
+		i++;
+	} else {
+		cnt += unmap->from_cnt;
+		for (; i < cnt; i++)
+			dma_unmap_page(dev, unmap->addr[i], unmap->len,
+					DMA_FROM_DEVICE);
+	}
 
-	cnt = unmap->to_cnt;
-	for (i = 0; i < cnt; i++)
-		dma_unmap_page(dev, unmap->addr[i], unmap->len,
-			       DMA_TO_DEVICE);
-	cnt += unmap->from_cnt;
-	for (; i < cnt; i++)
-		dma_unmap_page(dev, unmap->addr[i], unmap->len,
-			       DMA_FROM_DEVICE);
 	cnt += unmap->bidi_cnt;
 	for (; i < cnt; i++) {
 		if (unmap->addr[i] == 0)
@@ -1183,6 +1202,10 @@ static int __init dmaengine_init_unmap_pool(void)
 		size = sizeof(struct dmaengine_unmap_data) +
 		       sizeof(dma_addr_t) * p->size;
 
+		/* add 2 more entries for SG nents overload */
+		if (i == 0)
+			size += sizeof(dma_addr_t) * 2;
+
 		p->cache = kmem_cache_create(p->name, size, 0,
 					     SLAB_HWCACHE_ALIGN, NULL);
 		if (!p->cache)
@@ -1209,6 +1232,10 @@ dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
 		return NULL;
 
 	memset(unmap, 0, sizeof(*unmap));
+	/* clear the overloaded sg nents entries */
+	if (nr < 3)
+		memset(&unmap->addr[nr], 0,
+				DMA_UNMAP_SG_ENTS * sizeof(dma_addr_t));
 	kref_init(&unmap->kref);
 	unmap->dev = dev;
 	unmap->map_cnt = nr;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index fc25475..db2234f 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -476,6 +476,10 @@ struct dmaengine_unmap_data {
 	dma_addr_t addr[0];
 };
 
+#define DMA_UNMAP_SG_ENTS	2
+#define dma_unmap_data_sg_to_nents(x, n) x->addr[n]
+#define dma_unmap_data_sg_from_nents(x, n) x->addr[n+1]
+
 /**
  * struct dma_async_tx_descriptor - async transaction descriptor
  * ---dma generic offload fields---


* [PATCH v3 7/8] libnvdimm: Adding blk-mq support to the pmem driver
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
                   ` (5 preceding siblings ...)
  2017-08-03 23:32 ` [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-03 23:32 ` [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq Dave Jiang
  7 siblings, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

Add blk-mq support to the pmem driver in addition to the direct bio
support. This allows for hardware offloading via DMA engines. By default
the bio method is enabled; the blk-mq support can be turned on via the
module parameter queue_mode=1.
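
For example (usage sketch, assuming nd_pmem is built as a module):

  modprobe nd_pmem queue_mode=1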

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/nvdimm/pmem.c |  139 ++++++++++++++++++++++++++++++++++++++++++-------
 drivers/nvdimm/pmem.h |    3 +
 2 files changed, 121 insertions(+), 21 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f7099ada..519b949 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -31,10 +31,24 @@
 #include <linux/uio.h>
 #include <linux/dax.h>
 #include <linux/nd.h>
+#include <linux/blk-mq.h>
 #include "pmem.h"
 #include "pfn.h"
 #include "nd.h"
 
+enum {
+	PMEM_Q_BIO = 0,
+	PMEM_Q_MQ = 1,
+};
+
+static int queue_mode = PMEM_Q_BIO;
+module_param(queue_mode, int, 0444);
+MODULE_PARM_DESC(queue_mode, "Pmem Queue Mode (0=BIO, 1=BLK-MQ)");
+
+struct pmem_cmd {
+	struct request *rq;
+};
+
 static struct device *to_dev(struct pmem_device *pmem)
 {
 	/*
@@ -260,9 +274,13 @@ static const struct attribute_group *pmem_attribute_groups[] = {
 	NULL,
 };
 
-static void pmem_release_queue(void *q)
+static void pmem_release_queue(void *data)
 {
-	blk_cleanup_queue(q);
+	struct pmem_device *pmem = (struct pmem_device *)data;
+
+	blk_cleanup_queue(pmem->q);
+	if (queue_mode == PMEM_Q_MQ)
+		blk_mq_free_tag_set(&pmem->tag_set);
 }
 
 static void pmem_freeze_queue(void *q)
@@ -280,6 +298,54 @@ static void pmem_release_disk(void *__pmem)
 	put_disk(pmem->disk);
 }
 
+static int pmem_handle_cmd(struct pmem_cmd *cmd)
+{
+	struct request *req = cmd->rq;
+	struct request_queue *q = req->q;
+	struct pmem_device *pmem = q->queuedata;
+	struct nd_region *nd_region = to_region(pmem);
+	struct bio_vec bvec;
+	struct req_iterator iter;
+	int rc = 0;
+
+	if (req->cmd_flags & REQ_FLUSH)
+		nvdimm_flush(nd_region);
+
+	rq_for_each_segment(bvec, req, iter) {
+		rc = pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len,
+				bvec.bv_offset, op_is_write(req_op(req)),
+				iter.iter.bi_sector);
+		if (rc < 0)
+			break;
+	}
+
+	if (req->cmd_flags & REQ_FUA)
+		nvdimm_flush(nd_region);
+
+	blk_mq_end_request(cmd->rq, rc);
+
+	return rc;
+}
+
+static blk_status_t pmem_queue_rq(struct blk_mq_hw_ctx *hctx,
+		const struct blk_mq_queue_data *bd)
+{
+	struct pmem_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
+
+	cmd->rq = bd->rq;
+
+	blk_mq_start_request(bd->rq);
+
+	if (pmem_handle_cmd(cmd) < 0)
+		return BLK_STS_IOERR;
+	else
+		return BLK_STS_OK;
+}
+
+static const struct blk_mq_ops pmem_mq_ops = {
+	.queue_rq	= pmem_queue_rq,
+};
+
 static int pmem_attach_disk(struct device *dev,
 		struct nd_namespace_common *ndns)
 {
@@ -293,10 +359,10 @@ static int pmem_attach_disk(struct device *dev,
 	struct nd_pfn_sb *pfn_sb;
 	struct pmem_device *pmem;
 	struct resource pfn_res;
-	struct request_queue *q;
 	struct device *gendev;
 	struct gendisk *disk;
 	void *addr;
+	int rc;
 
 	/* while nsio_rw_bytes is active, parse a pfn info block if present */
 	if (is_nd_pfn(dev)) {
@@ -329,17 +395,49 @@ static int pmem_attach_disk(struct device *dev,
 		return -EBUSY;
 	}
 
-	q = blk_alloc_queue_node(GFP_KERNEL, dev_to_node(dev));
-	if (!q)
-		return -ENOMEM;
+	if (queue_mode == PMEM_Q_MQ) {
+		pmem->tag_set.ops = &pmem_mq_ops;
+		pmem->tag_set.nr_hw_queues = nr_online_nodes;
+		pmem->tag_set.queue_depth = 64;
+		pmem->tag_set.numa_node = dev_to_node(dev);
+		pmem->tag_set.cmd_size = sizeof(struct pmem_cmd);
+		pmem->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
+		pmem->tag_set.driver_data = pmem;
+
+		rc = blk_mq_alloc_tag_set(&pmem->tag_set);
+		if (rc < 0)
+			return rc;
+
+		pmem->q = blk_mq_init_queue(&pmem->tag_set);
+		if (IS_ERR(pmem->q)) {
+			blk_mq_free_tag_set(&pmem->tag_set);
+			return -ENOMEM;
+		}
 
-	if (devm_add_action_or_reset(dev, pmem_release_queue, q))
-		return -ENOMEM;
+		if (devm_add_action_or_reset(dev, pmem_release_queue, pmem)) {
+			pmem_release_queue(pmem);
+			return -ENOMEM;
+		}
+	} else if (queue_mode == PMEM_Q_BIO) {
+		pmem->q = blk_alloc_queue_node(GFP_KERNEL, dev_to_node(dev));
+		if (!pmem->q)
+			return -ENOMEM;
+
+		if (devm_add_action_or_reset(dev, pmem_release_queue, pmem)) {
+			pmem_release_queue(pmem);
+			return -ENOMEM;
+		}
+
+		blk_queue_make_request(pmem->q, pmem_make_request);
+	} else {
+		dev_warn(dev, "Invalid queue mode: %d\n", queue_mode);
+		return -EINVAL;
+	}
 
 	pmem->pfn_flags = PFN_DEV;
 	if (is_nd_pfn(dev)) {
-		addr = devm_memremap_pages(dev, &pfn_res, &q->q_usage_counter,
-				altmap);
+		addr = devm_memremap_pages(dev, &pfn_res,
+				&pmem->q->q_usage_counter, altmap);
 		pfn_sb = nd_pfn->pfn_sb;
 		pmem->data_offset = le64_to_cpu(pfn_sb->dataoff);
 		pmem->pfn_pad = resource_size(res) - resource_size(&pfn_res);
@@ -348,7 +446,7 @@ static int pmem_attach_disk(struct device *dev,
 		res->start += pmem->data_offset;
 	} else if (pmem_should_map_pages(dev)) {
 		addr = devm_memremap_pages(dev, &nsio->res,
-				&q->q_usage_counter, NULL);
+				&pmem->q->q_usage_counter, NULL);
 		pmem->pfn_flags |= PFN_MAP;
 	} else
 		addr = devm_memremap(dev, pmem->phys_addr,
@@ -358,21 +456,20 @@ static int pmem_attach_disk(struct device *dev,
 	 * At release time the queue must be frozen before
 	 * devm_memremap_pages is unwound
 	 */
-	if (devm_add_action_or_reset(dev, pmem_freeze_queue, q))
+	if (devm_add_action_or_reset(dev, pmem_freeze_queue, pmem->q))
 		return -ENOMEM;
 
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);
 	pmem->virt_addr = addr;
 
-	blk_queue_write_cache(q, wbc, fua);
-	blk_queue_make_request(q, pmem_make_request);
-	blk_queue_physical_block_size(q, PAGE_SIZE);
-	blk_queue_logical_block_size(q, pmem_sector_size(ndns));
-	blk_queue_max_hw_sectors(q, UINT_MAX);
-	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
-	queue_flag_set_unlocked(QUEUE_FLAG_DAX, q);
-	q->queuedata = pmem;
+	blk_queue_write_cache(pmem->q, wbc, fua);
+	blk_queue_physical_block_size(pmem->q, PAGE_SIZE);
+	blk_queue_logical_block_size(pmem->q, pmem_sector_size(ndns));
+	blk_queue_max_hw_sectors(pmem->q, UINT_MAX);
+	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, pmem->q);
+	queue_flag_set_unlocked(QUEUE_FLAG_DAX, pmem->q);
+	pmem->q->queuedata = pmem;
 
 	disk = alloc_disk_node(0, nid);
 	if (!disk)
@@ -380,7 +477,7 @@ static int pmem_attach_disk(struct device *dev,
 	pmem->disk = disk;
 
 	disk->fops		= &pmem_fops;
-	disk->queue		= q;
+	disk->queue		= pmem->q;
 	disk->flags		= GENHD_FL_EXT_DEVT;
 	nvdimm_namespace_disk_name(ndns, disk->disk_name);
 	set_capacity(disk, (pmem->size - pmem->pfn_pad - pmem->data_offset)
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 5434321..bbbe982 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -4,6 +4,7 @@
 #include <linux/types.h>
 #include <linux/pfn_t.h>
 #include <linux/fs.h>
+#include <linux/blk-mq.h>
 
 #ifdef CONFIG_ARCH_HAS_PMEM_API
 #define ARCH_MEMREMAP_PMEM MEMREMAP_WB
@@ -35,6 +36,8 @@ struct pmem_device {
 	struct badblocks	bb;
 	struct dax_device	*dax_dev;
 	struct gendisk		*disk;
+	struct blk_mq_tag_set	tag_set;
+	struct request_queue	*q;
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,


* [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq
  2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
                   ` (6 preceding siblings ...)
  2017-08-03 23:32 ` [PATCH v3 7/8] libnvdimm: Adding blk-mq support to the pmem driver Dave Jiang
@ 2017-08-03 23:32 ` Dave Jiang
  2017-08-04 12:55   ` kbuild test robot
  7 siblings, 1 reply; 11+ messages in thread
From: Dave Jiang @ 2017-08-03 23:32 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams; +Cc: dmaengine, linux-nvdimm

Add DMA support for pmem blk-mq I/O. This provides a significant CPU
utilization reduction for large transfers while maintaining good
performance. DMA is only triggered for requests that pass the
bio_multiple_segments() test, so small I/Os (4k or less) are still
performed by the CPU in order to reduce latency. By default the pmem
driver will now use blk-mq with DMA.

Numbers below are measured against pmem simulated via DRAM using
memmap=NN!SS. The DMA engine used is ioatdma on an Intel Skylake Xeon
platform. Keep in mind that performance for actual persistent memory
will differ. Fio 2.21 was used.
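
A representative fio invocation for the 64k, 16-task, queue-depth-16 case
might look like the following (illustrative only; the device path, I/O
engine and direct I/O settings are assumptions, the exact job files are
not part of this posting):

  fio --name=pmem --filename=/dev/pmem0 --rw=read --bs=64k \
      --numjobs=16 --iodepth=16 --ioengine=libaio --direct=1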

64k: 1 task queuedepth=1
CPU Read:  7631 MB/s  99.7% CPU    DMA Read: 2415 MB/s  54% CPU
CPU Write: 3552 MB/s  100% CPU     DMA Write 2173 MB/s  54% CPU

64k: 16 tasks queuedepth=16
CPU Read: 36800 MB/s  1593% CPU    DMA Read:  29100 MB/s  607% CPU
CPU Write 20900 MB/s  1589% CPU    DMA Write: 23400 MB/s  585% CPU

2M: 1 task queuedepth=1
CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU

2M: 16 tasks queuedepth=16
CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/nvdimm/pmem.c |  214 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 204 insertions(+), 10 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 519b949..3ad78e3 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -32,6 +32,8 @@
 #include <linux/dax.h>
 #include <linux/nd.h>
 #include <linux/blk-mq.h>
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
 #include "pmem.h"
 #include "pfn.h"
 #include "nd.h"
@@ -41,12 +43,24 @@ enum {
 	PMEM_Q_MQ = 1,
 };
 
-static int queue_mode = PMEM_Q_BIO;
+static int queue_mode = PMEM_Q_MQ;
 module_param(queue_mode, int, 0444);
-MODULE_PARM_DESC(queue_mode, "Pmem Queue Mode (0=BIO, 1=BLK-MQ)");
+MODULE_PARM_DESC(queue_mode, "Pmem Queue Mode (0=BIO, 1=BLK-MQ & DMA)");
+
+static int queue_depth = 128;
+module_param(queue_depth, int, 0444);
+MODULE_PARM_DESC(queue_depth, "I/O Queue Depth for multi queue mode");
+
+/* typically maps to number of DMA channels/devices per socket */
+static int q_per_node = 8;
+module_param(q_per_node, int, 0444);
+MODULE_PARM_DESC(q_per_node, "Hardware queues per node\n");
 
 struct pmem_cmd {
 	struct request *rq;
+	struct dma_chan *chan;
+	int sg_nents;
+	struct scatterlist sg[];
 };
 
 static struct device *to_dev(struct pmem_device *pmem)
@@ -298,6 +312,159 @@ static void pmem_release_disk(void *__pmem)
 	put_disk(pmem->disk);
 }
 
+static void nd_pmem_dma_callback(void *data,
+		const struct dmaengine_result *res)
+{
+	struct pmem_cmd *cmd = data;
+	struct request *req = cmd->rq;
+	struct request_queue *q = req->q;
+	struct pmem_device *pmem = q->queuedata;
+	struct nd_region *nd_region = to_region(pmem);
+	struct device *dev = to_dev(pmem);
+	int rc = 0;
+
+	if (res) {
+		enum dmaengine_tx_result dma_err = res->result;
+
+		switch (dma_err) {
+		case DMA_TRANS_READ_FAILED:
+		case DMA_TRANS_WRITE_FAILED:
+		case DMA_TRANS_ABORTED:
+			dev_dbg(dev, "bio failed\n");
+			rc = -ENXIO;
+			break;
+		case DMA_TRANS_NOERROR:
+		default:
+			break;
+		}
+	}
+
+	if (req->cmd_flags & REQ_FUA)
+		nvdimm_flush(nd_region);
+
+	blk_mq_end_request(cmd->rq, rc);
+}
+
+static int pmem_handle_cmd_dma(struct pmem_cmd *cmd, bool is_write)
+{
+	struct request *req = cmd->rq;
+	struct request_queue *q = req->q;
+	struct pmem_device *pmem = q->queuedata;
+	struct device *dev = to_dev(pmem);
+	phys_addr_t pmem_off = blk_rq_pos(req) * 512 + pmem->data_offset;
+	void *pmem_addr = pmem->virt_addr + pmem_off;
+	struct nd_region *nd_region = to_region(pmem);
+	size_t len;
+	struct dma_device *dma = cmd->chan->device;
+	struct dmaengine_unmap_data *unmap;
+	dma_cookie_t cookie;
+	struct dma_async_tx_descriptor *txd;
+	struct page *page;
+	unsigned int off;
+	int rc;
+	enum dma_data_direction dir;
+	dma_addr_t dma_addr;
+
+	if (req->cmd_flags & REQ_FLUSH)
+		nvdimm_flush(nd_region);
+
+	unmap = dmaengine_get_unmap_data(dma->dev, 2, GFP_NOWAIT);
+	if (!unmap) {
+		dev_dbg(dev, "failed to get dma unmap data\n");
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	/*
+	 * If reading from pmem, writing to scatterlist,
+	 * and if writing to pmem, reading from scatterlist.
+	 */
+	dir = is_write ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+	cmd->sg_nents = blk_rq_map_sg(req->q, req, cmd->sg);
+	if (cmd->sg_nents < 1) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	if (cmd->sg_nents > 128) {
+		rc = -ENOMEM;
+		dev_warn(dev, "Number of sg greater than allocated\n");
+		goto err;
+	}
+
+	rc = dma_map_sg(dma->dev, cmd->sg, cmd->sg_nents, dir);
+	if (rc < 1) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	len = blk_rq_payload_bytes(req);
+	page = virt_to_page(pmem_addr);
+	off = offset_in_page(pmem_addr);
+	dir = is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+	dma_addr = dma_map_page(dma->dev, page, off, len, dir);
+	if (dma_mapping_error(dma->dev, unmap->addr[0])) {
+		dev_dbg(dma->dev, "src DMA mapping error\n");
+		rc = -ENXIO;
+		goto err_unmap_sg;
+	}
+
+	unmap->len = len;
+
+	if (is_write) {
+		unmap->addr[0] = dma_addr;
+		unmap->addr[1] = (dma_addr_t)cmd->sg;
+		unmap->to_cnt = 1;
+		unmap->from_cnt = 0;
+		dma_unmap_data_sg_from_nents(unmap, 2) = cmd->sg_nents;
+	} else {
+		unmap->addr[0] = (dma_addr_t)cmd->sg;
+		unmap->addr[1] = dma_addr;
+		unmap->from_cnt = 1;
+		unmap->to_cnt = 0;
+		dma_unmap_data_sg_to_nents(unmap, 2) = cmd->sg_nents;
+	}
+
+	txd = dma->device_prep_dma_memcpy_sg(cmd->chan,
+				cmd->sg, cmd->sg_nents, dma_addr,
+				!is_write, DMA_PREP_INTERRUPT);
+	if (!txd) {
+		dev_dbg(dma->dev, "dma prep failed\n");
+		rc = -ENXIO;
+		goto err_unmap_buffer;
+	}
+
+	txd->callback_result = nd_pmem_dma_callback;
+	txd->callback_param = cmd;
+	dma_set_unmap(txd, unmap);
+	cookie = dmaengine_submit(txd);
+	if (dma_submit_error(cookie)) {
+		dev_dbg(dma->dev, "dma submit error\n");
+		rc = -ENXIO;
+		goto err_set_unmap;
+	}
+
+	dmaengine_unmap_put(unmap);
+	dma_async_issue_pending(cmd->chan);
+
+	return 0;
+
+err_set_unmap:
+	dmaengine_unmap_put(unmap);
+err_unmap_buffer:
+	dma_unmap_page(dev, dma_addr, len, dir);
+err_unmap_sg:
+	if (dir == DMA_TO_DEVICE)
+		dir = DMA_FROM_DEVICE;
+	else
+		dir = DMA_TO_DEVICE;
+	dma_unmap_sg(dev, cmd->sg, cmd->sg_nents, dir);
+	dmaengine_unmap_put(unmap);
+err:
+	blk_mq_end_request(cmd->rq, rc);
+	return rc;
+}
+
 static int pmem_handle_cmd(struct pmem_cmd *cmd)
 {
 	struct request *req = cmd->rq;
@@ -331,12 +498,18 @@ static blk_status_t pmem_queue_rq(struct blk_mq_hw_ctx *hctx,
 		const struct blk_mq_queue_data *bd)
 {
 	struct pmem_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
+	struct request *req;
+	int rc;
 
-	cmd->rq = bd->rq;
-
-	blk_mq_start_request(bd->rq);
+	req = cmd->rq = bd->rq;
+	cmd->chan = dma_find_channel(DMA_MEMCPY_SG);
+	blk_mq_start_request(req);
 
-	if (pmem_handle_cmd(cmd) < 0)
+	if (cmd->chan && bio_multiple_segments(req->bio))
+		rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
+	else
+		rc = pmem_handle_cmd(cmd);
+	if (rc < 0)
 		return BLK_STS_IOERR;
 	else
 		return BLK_STS_OK;
@@ -363,6 +536,7 @@ static int pmem_attach_disk(struct device *dev,
 	struct gendisk *disk;
 	void *addr;
 	int rc;
+	struct dma_chan *chan;
 
 	/* while nsio_rw_bytes is active, parse a pfn info block if present */
 	if (is_nd_pfn(dev)) {
@@ -396,11 +570,20 @@ static int pmem_attach_disk(struct device *dev,
 	}
 
 	if (queue_mode == PMEM_Q_MQ) {
+		chan = dma_find_channel(DMA_MEMCPY_SG);
+		if (!chan) {
+			queue_mode = PMEM_Q_BIO;
+			dev_warn(dev, "Forced back to PMEM_Q_BIO, no DMA\n");
+		}
+	}
+
+	if (queue_mode == PMEM_Q_MQ) {
 		pmem->tag_set.ops = &pmem_mq_ops;
-		pmem->tag_set.nr_hw_queues = nr_online_nodes;
-		pmem->tag_set.queue_depth = 64;
+		pmem->tag_set.nr_hw_queues = nr_online_nodes * q_per_node;
+		pmem->tag_set.queue_depth = queue_depth;
 		pmem->tag_set.numa_node = dev_to_node(dev);
-		pmem->tag_set.cmd_size = sizeof(struct pmem_cmd);
+		pmem->tag_set.cmd_size = sizeof(struct pmem_cmd) +
+			sizeof(struct scatterlist) * 128;
 		pmem->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
 		pmem->tag_set.driver_data = pmem;
 
@@ -466,7 +649,11 @@ static int pmem_attach_disk(struct device *dev,
 	blk_queue_write_cache(pmem->q, wbc, fua);
 	blk_queue_physical_block_size(pmem->q, PAGE_SIZE);
 	blk_queue_logical_block_size(pmem->q, pmem_sector_size(ndns));
-	blk_queue_max_hw_sectors(pmem->q, UINT_MAX);
+	if (queue_mode == PMEM_Q_MQ) {
+		blk_queue_max_hw_sectors(pmem->q, 0x200000);
+		blk_queue_max_segments(pmem->q, 128);
+	} else
+		blk_queue_max_hw_sectors(pmem->q, UINT_MAX);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, pmem->q);
 	queue_flag_set_unlocked(QUEUE_FLAG_DAX, pmem->q);
 	pmem->q->queuedata = pmem;
@@ -628,15 +815,22 @@ static struct nd_device_driver nd_pmem_driver = {
 
 static int __init pmem_init(void)
 {
+	if (queue_mode == PMEM_Q_MQ)
+		dmaengine_get();
+
 	return nd_driver_register(&nd_pmem_driver);
 }
 module_init(pmem_init);
 
 static void pmem_exit(void)
 {
+	if (queue_mode == PMEM_Q_MQ)
+		dmaengine_put();
+
 	driver_unregister(&nd_pmem_driver.drv);
 }
 module_exit(pmem_exit);
 
+MODULE_SOFTDEP("pre: dmaengine");
 MODULE_AUTHOR("Ross Zwisler <ross.zwisler@linux.intel.com>");
 MODULE_LICENSE("GPL v2");


* Re: [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq
  2017-08-03 23:32 ` [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq Dave Jiang
@ 2017-08-04 12:55   ` kbuild test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2017-08-04 12:55 UTC (permalink / raw)
  To: Dave Jiang; +Cc: linux-nvdimm, vinod.koul, kbuild-all, dmaengine

Hi Dave,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc3]
[cannot apply to next-20170804]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dave-Jiang/Adding-blk-mq-and-DMA-support-to-pmem-block-driver/20170804-191719
config: i386-randconfig-x018-201731 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   drivers//nvdimm/pmem.c: In function 'pmem_handle_cmd_dma':
>> drivers//nvdimm/pmem.c:416:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      unmap->addr[1] = (dma_addr_t)cmd->sg;
                       ^
   drivers//nvdimm/pmem.c:421:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      unmap->addr[0] = (dma_addr_t)cmd->sg;
                       ^

vim +416 drivers//nvdimm/pmem.c

   347	
   348	static int pmem_handle_cmd_dma(struct pmem_cmd *cmd, bool is_write)
   349	{
   350		struct request *req = cmd->rq;
   351		struct request_queue *q = req->q;
   352		struct pmem_device *pmem = q->queuedata;
   353		struct device *dev = to_dev(pmem);
   354		phys_addr_t pmem_off = blk_rq_pos(req) * 512 + pmem->data_offset;
   355		void *pmem_addr = pmem->virt_addr + pmem_off;
   356		struct nd_region *nd_region = to_region(pmem);
   357		size_t len;
   358		struct dma_device *dma = cmd->chan->device;
   359		struct dmaengine_unmap_data *unmap;
   360		dma_cookie_t cookie;
   361		struct dma_async_tx_descriptor *txd;
   362		struct page *page;
   363		unsigned int off;
   364		int rc;
   365		enum dma_data_direction dir;
   366		dma_addr_t dma_addr;
   367	
   368		if (req->cmd_flags & REQ_FLUSH)
   369			nvdimm_flush(nd_region);
   370	
   371		unmap = dmaengine_get_unmap_data(dma->dev, 2, GFP_NOWAIT);
   372		if (!unmap) {
   373			dev_dbg(dev, "failed to get dma unmap data\n");
   374			rc = -ENOMEM;
   375			goto err;
   376		}
   377	
   378		/*
   379		 * If reading from pmem, writing to scatterlist,
   380		 * and if writing to pmem, reading from scatterlist.
   381		 */
   382		dir = is_write ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
   383		cmd->sg_nents = blk_rq_map_sg(req->q, req, cmd->sg);
   384		if (cmd->sg_nents < 1) {
   385			rc = -EINVAL;
   386			goto err;
   387		}
   388	
   389		if (cmd->sg_nents > 128) {
   390			rc = -ENOMEM;
   391			dev_warn(dev, "Number of sg greater than allocated\n");
   392			goto err;
   393		}
   394	
   395		rc = dma_map_sg(dma->dev, cmd->sg, cmd->sg_nents, dir);
   396		if (rc < 1) {
   397			rc = -ENXIO;
   398			goto err;
   399		}
   400	
   401		len = blk_rq_payload_bytes(req);
   402		page = virt_to_page(pmem_addr);
   403		off = offset_in_page(pmem_addr);
   404		dir = is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
   405		dma_addr = dma_map_page(dma->dev, page, off, len, dir);
   406		if (dma_mapping_error(dma->dev, unmap->addr[0])) {
   407			dev_dbg(dma->dev, "src DMA mapping error\n");
   408			rc = -ENXIO;
   409			goto err_unmap_sg;
   410		}
   411	
   412		unmap->len = len;
   413	
   414		if (is_write) {
   415			unmap->addr[0] = dma_addr;
 > 416			unmap->addr[1] = (dma_addr_t)cmd->sg;
   417			unmap->to_cnt = 1;
   418			unmap->from_cnt = 0;
   419			dma_unmap_data_sg_from_nents(unmap, 2) = cmd->sg_nents;
   420		} else {
   421			unmap->addr[0] = (dma_addr_t)cmd->sg;
   422			unmap->addr[1] = dma_addr;
   423			unmap->from_cnt = 1;
   424			unmap->to_cnt = 0;
   425			dma_unmap_data_sg_to_nents(unmap, 2) = cmd->sg_nents;
   426		}
   427	
   428		txd = dma->device_prep_dma_memcpy_sg(cmd->chan,
   429					cmd->sg, cmd->sg_nents, dma_addr,
   430					!is_write, DMA_PREP_INTERRUPT);
   431		if (!txd) {
   432			dev_dbg(dma->dev, "dma prep failed\n");
   433			rc = -ENXIO;
   434			goto err_unmap_buffer;
   435		}
   436	
   437		txd->callback_result = nd_pmem_dma_callback;
   438		txd->callback_param = cmd;
   439		dma_set_unmap(txd, unmap);
   440		cookie = dmaengine_submit(txd);
   441		if (dma_submit_error(cookie)) {
   442			dev_dbg(dma->dev, "dma submit error\n");
   443			rc = -ENXIO;
   444			goto err_set_unmap;
   445		}
   446	
   447		dmaengine_unmap_put(unmap);
   448		dma_async_issue_pending(cmd->chan);
   449	
   450		return 0;
   451	
   452	err_set_unmap:
   453		dmaengine_unmap_put(unmap);
   454	err_unmap_buffer:
   455		dma_unmap_page(dev, dma_addr, len, dir);
   456	err_unmap_sg:
   457		if (dir == DMA_TO_DEVICE)
   458			dir = DMA_FROM_DEVICE;
   459		else
   460			dir = DMA_TO_DEVICE;
   461		dma_unmap_sg(dev, cmd->sg, cmd->sg_nents, dir);
   462		dmaengine_unmap_put(unmap);
   463	err:
   464		blk_mq_end_request(cmd->rq, rc);
   465		return rc;
   466	}
   467	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

* Re: [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap
  2017-08-03 23:32 ` [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap Dave Jiang
@ 2017-08-04 13:24   ` kbuild test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2017-08-04 13:24 UTC (permalink / raw)
  To: Dave Jiang; +Cc: linux-nvdimm, vinod.koul, kbuild-all, dmaengine

Hi Dave,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc3]
[cannot apply to next-20170804]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dave-Jiang/Adding-blk-mq-and-DMA-support-to-pmem-block-driver/20170804-191719
config: i386-randconfig-x003-201731 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   drivers//dma/dmaengine.c: In function 'dmaengine_unmap':
>> drivers//dma/dmaengine.c:1140:8: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      sg = (struct scatterlist *)unmap->addr[i];
           ^
   drivers//dma/dmaengine.c:1151:8: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      sg = (struct scatterlist *)unmap->addr[i];
           ^
   Cyclomatic Complexity 5 include/linux/compiler.h:__read_once_size
   Cyclomatic Complexity 5 include/linux/compiler.h:__write_once_size
   Cyclomatic Complexity 3 arch/x86/include/asm/bitops.h:set_bit
   Cyclomatic Complexity 3 arch/x86/include/asm/bitops.h:clear_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u32
   Cyclomatic Complexity 1 include/linux/list.h:__list_add_valid
   Cyclomatic Complexity 1 include/linux/list.h:__list_del_entry_valid
   Cyclomatic Complexity 1 include/linux/list.h:__list_del
   Cyclomatic Complexity 2 include/linux/list.h:__list_del_entry
   Cyclomatic Complexity 1 include/linux/err.h:ERR_PTR
   Cyclomatic Complexity 1 include/linux/err.h:PTR_ERR
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_set
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_dec
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order
   Cyclomatic Complexity 1 arch/x86/include/asm/processor.h:rep_nop
   Cyclomatic Complexity 1 arch/x86/include/asm/processor.h:cpu_relax
   Cyclomatic Complexity 1 include/linux/jiffies.h:_msecs_to_jiffies
   Cyclomatic Complexity 5 include/linux/jiffies.h:msecs_to_jiffies
   Cyclomatic Complexity 1 include/linux/refcount.h:refcount_set
   Cyclomatic Complexity 1 include/linux/kref.h:kref_init
   Cyclomatic Complexity 1 include/linux/kobject.h:kobject_name
   Cyclomatic Complexity 1 include/linux/rculist.h:list_del_rcu
   Cyclomatic Complexity 1 include/linux/device.h:dev_to_node
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_unmap_page
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_unmap_sg
   Cyclomatic Complexity 1 include/linux/dma-mapping.h:valid_dma_direction
   Cyclomatic Complexity 1 arch/x86/include/asm/dma-mapping.h:get_arch_dma_ops
   Cyclomatic Complexity 1 include/linux/module.h:__module_get
   Cyclomatic Complexity 1 include/linux/module.h:try_module_get
   Cyclomatic Complexity 1 include/linux/module.h:module_put
   Cyclomatic Complexity 1 include/linux/dmaengine.h:txd_lock
   Cyclomatic Complexity 1 include/linux/dmaengine.h:txd_unlock
   Cyclomatic Complexity 1 include/linux/dmaengine.h:txd_clear_parent
   Cyclomatic Complexity 1 include/linux/dmaengine.h:txd_clear_next
   Cyclomatic Complexity 1 include/linux/dmaengine.h:txd_next
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_set
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_clear
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_async_issue_pending
   Cyclomatic Complexity 56 include/linux/slab.h:kmalloc_index
   Cyclomatic Complexity 67 include/linux/slab.h:kmalloc_large
   Cyclomatic Complexity 9 include/linux/slab.h:kmalloc
   Cyclomatic Complexity 1 include/linux/slab.h:kzalloc
   Cyclomatic Complexity 1 drivers//dma/dmaengine.c:dma_chan_to_owner
   Cyclomatic Complexity 2 drivers//dma/dmaengine.c:balance_ref_count
   Cyclomatic Complexity 5 include/linux/dmaengine.h:dma_async_is_tx_complete
   Cyclomatic Complexity 2 include/linux/dmaengine.h:__dma_has_cap
   Cyclomatic Complexity 7 drivers//dma/dmaengine.c:device_has_all_tx_types
   Cyclomatic Complexity 9 drivers//dma/dmaengine.c:dma_chan_get
   Cyclomatic Complexity 3 include/linux/device.h:dev_name
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_chan_name
   Cyclomatic Complexity 10 include/linux/dma-mapping.h:get_dma_ops
   Cyclomatic Complexity 3 include/linux/bitops.h:get_count_order
   Cyclomatic Complexity 9 include/linux/bitmap.h:bitmap_fill
   Cyclomatic Complexity 7 drivers//dma/dmaengine.c:dma_channel_table_init
   Cyclomatic Complexity 9 include/linux/bitmap.h:bitmap_zero
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_zero
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_lock
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_unlock
   Cyclomatic Complexity 1 include/linux/err.h:IS_ERR
   Cyclomatic Complexity 3 include/linux/err.h:IS_ERR_OR_NULL
   Cyclomatic Complexity 3 include/acpi/acpi_bus.h:is_acpi_device_node
   Cyclomatic Complexity 1 include/linux/acpi.h:has_acpi_companion
   Cyclomatic Complexity 2 include/linux/cpumask.h:cpumask_check
   Cyclomatic Complexity 1 include/linux/cpumask.h:cpumask_test_cpu
   Cyclomatic Complexity 3 drivers//dma/dmaengine.c:dma_chan_is_local
   Cyclomatic Complexity 3 include/linux/cpumask.h:cpumask_next
   Cyclomatic Complexity 9 include/linux/bitmap.h:bitmap_and
   Cyclomatic Complexity 17 include/linux/bitmap.h:bitmap_equal
   Cyclomatic Complexity 1 drivers//dma/dmaengine.c:__dma_device_satisfies_mask
   Cyclomatic Complexity 27 drivers//dma/dmaengine.c:private_candidate
   Cyclomatic Complexity 11 drivers//dma/dmaengine.c:find_candidate
   Cyclomatic Complexity 7 drivers//dma/dmaengine.c:dma_filter_match
   Cyclomatic Complexity 3 include/linux/dmaengine.h:dmaengine_synchronize
   Cyclomatic Complexity 19 drivers//dma/dmaengine.c:dma_chan_put
   Cyclomatic Complexity 36 drivers//dma/dmaengine.c:min_chan
   Cyclomatic Complexity 18 drivers//dma/dmaengine.c:dma_channel_rebalance
   Cyclomatic Complexity 3 include/linux/dma-mapping.h:dma_unmap_sg_attrs
   Cyclomatic Complexity 3 include/linux/dma-mapping.h:dma_unmap_page_attrs
   Cyclomatic Complexity 1 drivers//dma/dmaengine.c:__get_unmap_pool
   Cyclomatic Complexity 4 drivers//dma/dmaengine.c:chan_dev_release
   Cyclomatic Complexity 3 drivers//dma/dmaengine.c:dev_to_dma_chan
   Cyclomatic Complexity 3 drivers//dma/dmaengine.c:in_use_show
   Cyclomatic Complexity 4 drivers//dma/dmaengine.c:bytes_transferred_show
   Cyclomatic Complexity 4 drivers//dma/dmaengine.c:memcpy_count_show
   Cyclomatic Complexity 1 include/linux/idr.h:ida_get_new
   Cyclomatic Complexity 3 drivers//dma/dmaengine.c:get_dma_id
   Cyclomatic Complexity 4 include/linux/rculist.h:__list_add_rcu
   Cyclomatic Complexity 1 include/linux/rculist.h:list_add_tail_rcu
   Cyclomatic Complexity 12 drivers//dma/dmaengine.c:dmaengine_unmap

vim +1140 drivers//dma/dmaengine.c

  1128	
  1129	static void dmaengine_unmap(struct kref *kref)
  1130	{
  1131		struct dmaengine_unmap_data *unmap = container_of(kref, typeof(*unmap), kref);
  1132		struct device *dev = unmap->dev;
  1133		int cnt, i, sg_nents;
  1134		struct scatterlist *sg;
  1135	
  1136		sg_nents = dma_unmap_data_sg_to_nents(unmap, unmap->map_cnt);
  1137		if (sg_nents) {
  1138			i = 0;
  1139			cnt = 1;
> 1140			sg = (struct scatterlist *)unmap->addr[i];
  1141			dma_unmap_sg(dev, sg, sg_nents, DMA_TO_DEVICE);
  1142		} else {
  1143			cnt = unmap->to_cnt;
  1144			for (i = 0; i < cnt; i++)
  1145				dma_unmap_page(dev, unmap->addr[i], unmap->len,
  1146						DMA_TO_DEVICE);
  1147		}
  1148	
  1149		sg_nents = dma_unmap_data_sg_from_nents(unmap, unmap->map_cnt);
  1150		if (sg_nents) {
  1151			sg = (struct scatterlist *)unmap->addr[i];
  1152			dma_unmap_sg(dev, sg, sg_nents, DMA_FROM_DEVICE);
  1153			cnt++;
  1154			i++;
  1155		} else {
  1156			cnt += unmap->from_cnt;
  1157			for (; i < cnt; i++)
  1158				dma_unmap_page(dev, unmap->addr[i], unmap->len,
  1159						DMA_FROM_DEVICE);
  1160		}
  1161	
  1162		cnt += unmap->bidi_cnt;
  1163		for (; i < cnt; i++) {
  1164			if (unmap->addr[i] == 0)
  1165				continue;
  1166			dma_unmap_page(dev, unmap->addr[i], unmap->len,
  1167				       DMA_BIDIRECTIONAL);
  1168		}
  1169		cnt = unmap->map_cnt;
  1170		mempool_free(unmap, __get_unmap_pool(cnt)->pool);
  1171	}
  1172	
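The two flagged lines read a struct scatterlist pointer back out of a
dma_addr_t slot; on this i386 randconfig dma_addr_t is evidently 64 bits
wide (hence the size mismatch), so the direct cast narrows a 64-bit integer
into a 32-bit pointer and gcc emits -Wint-to-pointer-cast. Purely as a
sketch of one conventional way to keep such a round trip warning-free, and
not the change the series itself makes, the helpers below funnel the
conversion through uintptr_t; the helper names are hypothetical.

#include <linux/types.h>
#include <linux/dmaengine.h>
#include <linux/scatterlist.h>

/*
 * Illustrative only: going through uintptr_t turns the flagged
 * conversion into a plain integer narrowing followed by a same-width
 * integer-to-pointer cast, neither of which gcc flags by default, and
 * the value is preserved because it started life as a 32-bit pointer.
 */
static inline dma_addr_t sg_to_unmap_cookie(struct scatterlist *sg)
{
	return (dma_addr_t)(uintptr_t)sg;
}

static inline struct scatterlist *unmap_cookie_to_sg(dma_addr_t addr)
{
	return (struct scatterlist *)(uintptr_t)addr;
}

A dedicated scatterlist pointer field in struct dmaengine_unmap_data would
avoid the cast entirely, at the cost of growing the structure; either
approach is only sketched here, not taken from the posted patches.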

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-08-04 13:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-03 23:31 [PATCH v3 0/8] Adding blk-mq and DMA support to pmem block driver Dave Jiang
2017-08-03 23:31 ` [PATCH v3 1/8] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels Dave Jiang
2017-08-03 23:31 ` [PATCH v3 2/8] dmaengine: change transaction type DMA_SG to DMA_SG_SG Dave Jiang
2017-08-03 23:32 ` [PATCH v3 3/8] dmaengine: Add DMA_MEMCPY_SG transaction op Dave Jiang
2017-08-03 23:32 ` [PATCH v3 4/8] dmaengine: add verification of DMA_MEMSET_SG in dmaengine Dave Jiang
2017-08-03 23:32 ` [PATCH v3 5/8] dmaengine: ioatdma: dma_prep_memcpy_sg support Dave Jiang
2017-08-03 23:32 ` [PATCH v3 6/8] dmaengine: add SG support to dmaengine_unmap Dave Jiang
2017-08-04 13:24   ` kbuild test robot
2017-08-03 23:32 ` [PATCH v3 7/8] libnvdimm: Adding blk-mq support to the pmem driver Dave Jiang
2017-08-03 23:32 ` [PATCH v3 8/8] libnvdimm: add DMA support for pmem blk-mq Dave Jiang
2017-08-04 12:55   ` kbuild test robot

This is a public inbox; see the mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as the URLs for its NNTP newsgroup(s).