* [PATCH 0/5] mmc: core: modernize ioctl() requests
@ 2017-05-10  8:24 Linus Walleij
  2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

This is a series that starts to untangle the MMC "big host lock",
i.e. the lock taken by issuing mmc_claim_host(), which usually
happens through mmc_get_card().

The host lock is standing in the way of a bunch of modernizations,
because the block layer interface takes this lock when a new request
arrives, then will not release it until the queue is empty, which
is detected by getting NULL from blk_fetch_request().

This does not work with the new MQ block layer, because it issues
requests asynchronously and there is no way of telling whether the
submit queue is empty or not: block requests must always be served.
Trying to introduce an interface for figuring out if the queue is
empty is probably a bad idea in general, as it would require
cross-talk between all CPUs, and then we hit Amdahl's Law again
when scaling up the number of CPUs.

So Arnd Bergmann suggested that whatever parallel requests the
host lock is protecting could perhaps be funneled through the
block layer itself.

It turns out that most block drivers already do this, by using the
"special" block requests REQ_OP_DRV_IN and REQ_OP_DRV_OUT. And
it is what we should have done from the beginning.

To do this, the per-request extra data (which I think is also
referred to as the "tag") needs to be handled the same way all
other block drivers do it: allocated through the block layer.
In the MMC stack, this extra per-request data (tag) is called
struct mmc_queue_req.

So the second patch converts to using the generic block layer
mechanism to allocate tags. This makes the core simpler IMO, and
that is why the patch series has a negative line count: we move
stuff over to the block layer.

The first patch cleans out the bounce buffer configurability and
moves it to be a property of the host rather than a Kconfig option,
in order to make the rest of the refactoring cleaner.

The last two patches move the ioctl() calls over to the new
per-request tags and remove two instances of mmc_get_card(),
so we start to untangle the big host lock.

This approach can be used to get rid of the debugfs mmc_get_card()
as well but I want to start with the ioctl()s.

It already fixes a problem: userspace and the block layer could
essentially starve each other by hammering the host lock with
requests from either block access or ioctl()s. With this,
ioctl() operations get scheduled by the block layer along with
everything else. Not that I know how the block layer prioritizes
REQ_OP_DRV_IN and REQ_OP_DRV_OUT requests, but I am confident
it does a better job than the first-come-first-served host lock.

This drives a truck through mine and Adrian's patch sets for
multiqueue and command queueing respectively, but I think both
of our patch series will get easier to implement if we use these
patches as a base for future work, so if they get positive
reviews and do not make everything explode, I would suggest we
merge them as a starter for the v4.13 kernel cycle.

Linus Walleij (5):
  mmc: core: Delete bounce buffer Kconfig option
  mmc: core: Allocate per-request data using the block layer core
  mmc: block: Tag is_rpmb as bool
  mmc: block: move single ioctl() commands to block requests
  mmc: block: move multi-ioctl() to use block layer

 drivers/mmc/core/Kconfig  |  18 ----
 drivers/mmc/core/block.c  | 126 +++++++++++++++----------
 drivers/mmc/core/queue.c  | 235 ++++++++++++----------------------------------
 drivers/mmc/core/queue.h  |  26 +++--
 drivers/mmc/host/cavium.c |   3 +
 drivers/mmc/host/pxamci.c |   6 ++
 include/linux/mmc/card.h  |   2 -
 include/linux/mmc/host.h  |   1 +
 8 files changed, 161 insertions(+), 256 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
@ 2017-05-10  8:24 ` Linus Walleij
  2017-05-15 11:55   ` Ulf Hansson
  2017-05-15 14:04   ` Bartlomiej Zolnierkiewicz
  2017-05-10  8:24 ` [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core Linus Walleij
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

This option is activated by all the multiplatform configs and
suchlike, so we almost always have it turned on, and the memory
it saves is negligible, even more so moving forward. The actual
bounce buffer only gets allocated when used; the only thing the
ifdefs save is a little bit of code.

It is highly improper to have this as a Kconfig option that gets
turned on by default; make this a pure runtime thing and let the
host decide whether we use bounce buffers. We add a new property
"disable_bounce" to the host struct.

Notice that mmc_queue_calc_bouncesz() already disables the
bounce buffers if host->max_segs != 1, so any arch that has a
maximum number of segments higher than 1 will have bounce
buffers disabled.

The option CONFIG_MMC_BLOCK_BOUNCE defaults to y, so the majority
of platforms in the kernel already have it on, and it then gets
turned off at runtime since most of them have host->max_segs > 1.
The few exceptions that have host->max_segs == 1 and still turn
off the bounce buffering are those that disable it in their
defconfig.

Those are the following:

arch/arm/configs/colibri_pxa300_defconfig
arch/arm/configs/zeus_defconfig
- Uses MMC_PXA, drivers/mmc/host/pxamci.c
- Sets host->max_segs = NR_SG, which is 1
- This needs its bounce buffer deactivated so we set
  host->disable_bounce to true in the host driver

arch/arm/configs/davinci_all_defconfig
- Uses MMC_DAVINCI, drivers/mmc/host/davinci_mmc.c
- This driver sets host->max_segs to MAX_NR_SG, which is 16
- That means this driver anyway disables bounce buffers
- No special action needed for this platform

arch/arm/configs/lpc32xx_defconfig
arch/arm/configs/nhk8815_defconfig
arch/arm/configs/u300_defconfig
- Uses MMC_ARMMMCI, drivers/mmc/host/mmci.[c|h]
- This driver by default sets host->max_segs to NR_SG,
  which is 128, unless a DMA engine is used, and in that case
  the number of segments is also > 1
- That means this driver already disables bounce buffers
- No special action needed for these platforms

arch/arm/configs/sama5_defconfig
- Uses MMC_SDHCI, MMC_SDHCI_PLTFM, MMC_SDHCI_OF_AT91, MMC_ATMELMCI
- Uses drivers/mmc/host/sdhci.c
- Normally sets host->max_segs to SDHCI_MAX_SEGS which is 128 and
  thus disables bounce buffers
- Sets host->max_segs to 1 if SDHCI_USE_SDMA is set
- SDHCI_USE_SDMA is only set by SDHCI on PCI adapters
- That means that for this platform bounce buffers are already
  disabled at runtime
- No special action needed for this platform

arch/blackfin/configs/CM-BF533_defconfig
arch/blackfin/configs/CM-BF537E_defconfig
- Uses MMC_SPI (a simple MMC card connected on SPI pins)
- Uses drivers/mmc/host/mmc_spi.c
- Sets host->max_segs to MMC_SPI_BLOCKSATONCE which is 128
- That means this platform already disables bounce buffers at
  runtime
- No special action needed for these platforms

arch/mips/configs/cavium_octeon_defconfig
- Uses MMC_CAVIUM_OCTEON, drivers/mmc/host/cavium.c
- Sets host->max_segs to 16 or 1
- We set host->disable_bounce to be sure for the max_segs == 1 case

arch/mips/configs/qi_lb60_defconfig
- Uses MMC_JZ4740, drivers/mmc/host/jz4740_mmc.c
- This sets host->max_segs to 128 so bounce buffers are
  already runtime disabled
- No action needed for this platform

It would be interesting to come up with a list of the platforms
that actually end up using bounce buffers. I have not been able
to infer such a list, but bouncing occurs when
host->max_segs == 1 and the bounce buffering is not explicitly
disabled.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/Kconfig  | 18 ------------------
 drivers/mmc/core/queue.c  | 15 +--------------
 drivers/mmc/host/cavium.c |  3 +++
 drivers/mmc/host/pxamci.c |  6 ++++++
 include/linux/mmc/host.h  |  1 +
 5 files changed, 11 insertions(+), 32 deletions(-)

diff --git a/drivers/mmc/core/Kconfig b/drivers/mmc/core/Kconfig
index fc1ecdaaa9ca..42e89060cd41 100644
--- a/drivers/mmc/core/Kconfig
+++ b/drivers/mmc/core/Kconfig
@@ -61,24 +61,6 @@ config MMC_BLOCK_MINORS
 
 	  If unsure, say 8 here.
 
-config MMC_BLOCK_BOUNCE
-	bool "Use bounce buffer for simple hosts"
-	depends on MMC_BLOCK
-	default y
-	help
-	  SD/MMC is a high latency protocol where it is crucial to
-	  send large requests in order to get high performance. Many
-	  controllers, however, are restricted to continuous memory
-	  (i.e. they can't do scatter-gather), something the kernel
-	  rarely can provide.
-
-	  Say Y here to help these restricted hosts by bouncing
-	  requests back and forth from a large buffer. You will get
-	  a big performance gain at the cost of up to 64 KiB of
-	  physical memory.
-
-	  If unsure, say Y here.
-
 config SDIO_UART
 	tristate "SDIO UART/GPS class support"
 	depends on TTY
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 5c37b6be3e7b..545466342fb1 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -219,7 +219,6 @@ static struct mmc_queue_req *mmc_queue_alloc_mqrqs(int qdepth)
 	return mqrq;
 }
 
-#ifdef CONFIG_MMC_BLOCK_BOUNCE
 static int mmc_queue_alloc_bounce_bufs(struct mmc_queue_req *mqrq, int qdepth,
 				       unsigned int bouncesz)
 {
@@ -258,7 +257,7 @@ static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
 {
 	unsigned int bouncesz = MMC_QUEUE_BOUNCESZ;
 
-	if (host->max_segs != 1)
+	if (host->max_segs != 1 || host->disable_bounce)
 		return 0;
 
 	if (bouncesz > host->max_req_size)
@@ -273,18 +272,6 @@ static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
 
 	return bouncesz;
 }
-#else
-static inline bool mmc_queue_alloc_bounce(struct mmc_queue_req *mqrq,
-					  int qdepth, unsigned int bouncesz)
-{
-	return false;
-}
-
-static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
-{
-	return 0;
-}
-#endif
 
 static int mmc_queue_alloc_sgs(struct mmc_queue_req *mqrq, int qdepth,
 			       int max_segs)
diff --git a/drivers/mmc/host/cavium.c b/drivers/mmc/host/cavium.c
index 58b51ba6aabd..66066f73e477 100644
--- a/drivers/mmc/host/cavium.c
+++ b/drivers/mmc/host/cavium.c
@@ -1050,6 +1050,9 @@ int cvm_mmc_of_slot_probe(struct device *dev, struct cvm_mmc_host *host)
 	else
 		mmc->max_segs = 1;
 
+	/* Disable bounce buffers for max_segs = 1 */
+	mmc->disable_bounce = true;
+
 	/* DMA size field can address up to 8 MB */
 	mmc->max_seg_size = 8 * 1024 * 1024;
 	mmc->max_req_size = mmc->max_seg_size;
diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
index c763b404510f..d3b5e6376504 100644
--- a/drivers/mmc/host/pxamci.c
+++ b/drivers/mmc/host/pxamci.c
@@ -666,6 +666,12 @@ static int pxamci_probe(struct platform_device *pdev)
 	mmc->max_segs = NR_SG;
 
 	/*
+	 * This architecture used to disable bounce buffers through its
+	 * defconfig, now it is done at runtime as a host property.
+	 */
+	mmc->disable_bounce = true;
+
+	/*
 	 * Our hardware DMA can handle a maximum of one page per SG entry.
 	 */
 	mmc->max_seg_size = PAGE_SIZE;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 21385ac0c9b1..b53c0e18a33b 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -310,6 +310,7 @@ struct mmc_host {
 	/* host specific block data */
 	unsigned int		max_seg_size;	/* see blk_queue_max_segment_size */
 	unsigned short		max_segs;	/* see blk_queue_max_segments */
+	bool			disable_bounce; /* disable bounce buffers */
 	unsigned short		unused;
 	unsigned int		max_req_size;	/* maximum number of bytes in one req */
 	unsigned int		max_blk_size;	/* maximum size of one mmc block */
-- 
2.9.3


* [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
  2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
@ 2017-05-10  8:24 ` Linus Walleij
  2017-05-16  9:02   ` Ulf Hansson
  2017-05-16 11:54   ` Adrian Hunter
  2017-05-10  8:24 ` [PATCH 3/5] mmc: block: Tag is_rpmb as bool Linus Walleij
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

The mmc_queue_req is a per-request state container the MMC core
uses to carry bounce buffers, pointers to asynchronous requests
and so on. These are currently allocated as a static array of
objects; as a request comes in, an mmc_queue_req is assigned to
it and used during the lifetime of the request.

This is backwards compared to how other block layer drivers work:
they usually let the block core provide a per-request struct that
gets allocated right behind the struct request, and which can be
obtained using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in
this function name is misleading: it is used by both the old and
the MQ block layer.)

The per-request struct gets allocated with the size stored in the
queue's .cmd_size variable, and is initialized using .init_rq_fn()
and cleaned up using .exit_rq_fn().

This patch makes the MMC core rely on this block layer mechanism
to allocate the per-request mmc_queue_req state container.

Doing this makes a lot of complicated queue handling go away. We
only need to keep the .qcnt counter of how many requests are
currently being processed by the MMC layer. The MQ block layer
will replace this too once we transition to it.

Doing this refactoring is necessary to move the ioctl() operations
into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
instead of the custom code using the BigMMCHostLock that we have
today: those require that per-request data be obtainable easily from
a request after creating a custom request with e.g.:

struct request *rq = blk_get_request(q, REQ_OP_DRV_IN, __GFP_RECLAIM);
struct mmc_queue_req *mq_rq = req_to_mq_rq(rq);

This is not possible with the current construction: the request
is not immediately assigned the per-request state container;
instead it gets assigned when the request finally enters the MMC
queue, which is way too late for custom requests.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c |  38 ++------
 drivers/mmc/core/queue.c | 222 +++++++++++++----------------------------------
 drivers/mmc/core/queue.h |  22 ++---
 include/linux/mmc/card.h |   2 -
 4 files changed, 80 insertions(+), 204 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 8273b078686d..be782b8d4a0d 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -129,13 +129,6 @@ static inline int mmc_blk_part_switch(struct mmc_card *card,
 				      struct mmc_blk_data *md);
 static int get_card_status(struct mmc_card *card, u32 *status, int retries);
 
-static void mmc_blk_requeue(struct request_queue *q, struct request *req)
-{
-	spin_lock_irq(q->queue_lock);
-	blk_requeue_request(q, req);
-	spin_unlock_irq(q->queue_lock);
-}
-
 static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
 {
 	struct mmc_blk_data *md;
@@ -1642,7 +1635,7 @@ static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
 	if (mmc_card_removed(card))
 		req->rq_flags |= RQF_QUIET;
 	while (blk_end_request(req, -EIO, blk_rq_cur_bytes(req)));
-	mmc_queue_req_free(mq, mqrq);
+	mq->qcnt--;
 }
 
 /**
@@ -1662,7 +1655,7 @@ static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
 	if (mmc_card_removed(mq->card)) {
 		req->rq_flags |= RQF_QUIET;
 		blk_end_request_all(req, -EIO);
-		mmc_queue_req_free(mq, mqrq);
+		mq->qcnt--; /* FIXME: just set to 0? */
 		return;
 	}
 	/* Else proceed and try to restart the current async request */
@@ -1685,12 +1678,8 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 	bool req_pending = true;
 
 	if (new_req) {
-		mqrq_cur = mmc_queue_req_find(mq, new_req);
-		if (!mqrq_cur) {
-			WARN_ON(1);
-			mmc_blk_requeue(mq->queue, new_req);
-			new_req = NULL;
-		}
+		mqrq_cur = req_to_mq_rq(new_req);
+		mq->qcnt++;
 	}
 
 	if (!mq->qcnt)
@@ -1764,12 +1753,12 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 				if (req_pending)
 					mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
 				else
-					mmc_queue_req_free(mq, mq_rq);
+					mq->qcnt--;
 				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
 				return;
 			}
 			if (!req_pending) {
-				mmc_queue_req_free(mq, mq_rq);
+				mq->qcnt--;
 				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
 				return;
 			}
@@ -1814,7 +1803,7 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			req_pending = blk_end_request(old_req, -EIO,
 						      brq->data.blksz);
 			if (!req_pending) {
-				mmc_queue_req_free(mq, mq_rq);
+				mq->qcnt--;
 				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
 				return;
 			}
@@ -1844,7 +1833,7 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 		}
 	} while (req_pending);
 
-	mmc_queue_req_free(mq, mq_rq);
+	mq->qcnt--;
 }
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
@@ -2166,7 +2155,6 @@ static int mmc_blk_probe(struct mmc_card *card)
 {
 	struct mmc_blk_data *md, *part_md;
 	char cap_str[10];
-	int ret;
 
 	/*
 	 * Check that the card supports the command class(es) we need.
@@ -2176,15 +2164,9 @@ static int mmc_blk_probe(struct mmc_card *card)
 
 	mmc_fixup_device(card, mmc_blk_fixups);
 
-	ret = mmc_queue_alloc_shared_queue(card);
-	if (ret)
-		return ret;
-
 	md = mmc_blk_alloc(card);
-	if (IS_ERR(md)) {
-		mmc_queue_free_shared_queue(card);
+	if (IS_ERR(md))
 		return PTR_ERR(md);
-	}
 
 	string_get_size((u64)get_capacity(md->disk), 512, STRING_UNITS_2,
 			cap_str, sizeof(cap_str));
@@ -2222,7 +2204,6 @@ static int mmc_blk_probe(struct mmc_card *card)
  out:
 	mmc_blk_remove_parts(card, md);
 	mmc_blk_remove_req(md);
-	mmc_queue_free_shared_queue(card);
 	return 0;
 }
 
@@ -2240,7 +2221,6 @@ static void mmc_blk_remove(struct mmc_card *card)
 	pm_runtime_put_noidle(&card->dev);
 	mmc_blk_remove_req(md);
 	dev_set_drvdata(&card->dev, NULL);
-	mmc_queue_free_shared_queue(card);
 }
 
 static int _mmc_blk_suspend(struct mmc_card *card)
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 545466342fb1..65a8e0e63012 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -40,35 +40,6 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 	return BLKPREP_OK;
 }
 
-struct mmc_queue_req *mmc_queue_req_find(struct mmc_queue *mq,
-					 struct request *req)
-{
-	struct mmc_queue_req *mqrq;
-	int i = ffz(mq->qslots);
-
-	if (i >= mq->qdepth)
-		return NULL;
-
-	mqrq = &mq->mqrq[i];
-	WARN_ON(mqrq->req || mq->qcnt >= mq->qdepth ||
-		test_bit(mqrq->task_id, &mq->qslots));
-	mqrq->req = req;
-	mq->qcnt += 1;
-	__set_bit(mqrq->task_id, &mq->qslots);
-
-	return mqrq;
-}
-
-void mmc_queue_req_free(struct mmc_queue *mq,
-			struct mmc_queue_req *mqrq)
-{
-	WARN_ON(!mqrq->req || mq->qcnt < 1 ||
-		!test_bit(mqrq->task_id, &mq->qslots));
-	mqrq->req = NULL;
-	mq->qcnt -= 1;
-	__clear_bit(mqrq->task_id, &mq->qslots);
-}
-
 static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
@@ -149,11 +120,11 @@ static void mmc_request_fn(struct request_queue *q)
 		wake_up_process(mq->thread);
 }
 
-static struct scatterlist *mmc_alloc_sg(int sg_len)
+static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
 {
 	struct scatterlist *sg;
 
-	sg = kmalloc_array(sg_len, sizeof(*sg), GFP_KERNEL);
+	sg = kmalloc_array(sg_len, sizeof(*sg), gfp);
 	if (sg)
 		sg_init_table(sg, sg_len);
 
@@ -179,80 +150,6 @@ static void mmc_queue_setup_discard(struct request_queue *q,
 		queue_flag_set_unlocked(QUEUE_FLAG_SECERASE, q);
 }
 
-static void mmc_queue_req_free_bufs(struct mmc_queue_req *mqrq)
-{
-	kfree(mqrq->bounce_sg);
-	mqrq->bounce_sg = NULL;
-
-	kfree(mqrq->sg);
-	mqrq->sg = NULL;
-
-	kfree(mqrq->bounce_buf);
-	mqrq->bounce_buf = NULL;
-}
-
-static void mmc_queue_reqs_free_bufs(struct mmc_queue_req *mqrq, int qdepth)
-{
-	int i;
-
-	for (i = 0; i < qdepth; i++)
-		mmc_queue_req_free_bufs(&mqrq[i]);
-}
-
-static void mmc_queue_free_mqrqs(struct mmc_queue_req *mqrq, int qdepth)
-{
-	mmc_queue_reqs_free_bufs(mqrq, qdepth);
-	kfree(mqrq);
-}
-
-static struct mmc_queue_req *mmc_queue_alloc_mqrqs(int qdepth)
-{
-	struct mmc_queue_req *mqrq;
-	int i;
-
-	mqrq = kcalloc(qdepth, sizeof(*mqrq), GFP_KERNEL);
-	if (mqrq) {
-		for (i = 0; i < qdepth; i++)
-			mqrq[i].task_id = i;
-	}
-
-	return mqrq;
-}
-
-static int mmc_queue_alloc_bounce_bufs(struct mmc_queue_req *mqrq, int qdepth,
-				       unsigned int bouncesz)
-{
-	int i;
-
-	for (i = 0; i < qdepth; i++) {
-		mqrq[i].bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
-		if (!mqrq[i].bounce_buf)
-			return -ENOMEM;
-
-		mqrq[i].sg = mmc_alloc_sg(1);
-		if (!mqrq[i].sg)
-			return -ENOMEM;
-
-		mqrq[i].bounce_sg = mmc_alloc_sg(bouncesz / 512);
-		if (!mqrq[i].bounce_sg)
-			return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static bool mmc_queue_alloc_bounce(struct mmc_queue_req *mqrq, int qdepth,
-				   unsigned int bouncesz)
-{
-	int ret;
-
-	ret = mmc_queue_alloc_bounce_bufs(mqrq, qdepth, bouncesz);
-	if (ret)
-		mmc_queue_reqs_free_bufs(mqrq, qdepth);
-
-	return !ret;
-}
-
 static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
 {
 	unsigned int bouncesz = MMC_QUEUE_BOUNCESZ;
@@ -273,71 +170,62 @@ static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
 	return bouncesz;
 }
 
-static int mmc_queue_alloc_sgs(struct mmc_queue_req *mqrq, int qdepth,
-			       int max_segs)
+/**
+ * mmc_init_request() - initialize the MMC-specific per-request data
+ * @q: the request queue
+ * @req: the request
+ * @gfp: memory allocation policy
+ */
+static int mmc_init_request(struct request_queue *q, struct request *req,
+			    gfp_t gfp)
 {
-	int i;
+	struct mmc_queue_req *mq_rq = req_to_mq_rq(req);
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
 
-	for (i = 0; i < qdepth; i++) {
-		mqrq[i].sg = mmc_alloc_sg(max_segs);
-		if (!mqrq[i].sg)
+	/* FIXME: use req_to_mq_rq() everywhere this is dereferenced */
+	mq_rq->req = req;
+
+	if (card->bouncesz) {
+		mq_rq->bounce_buf = kmalloc(card->bouncesz, gfp);
+		if (!mq_rq->bounce_buf)
+			return -ENOMEM;
+		if (card->bouncesz > 512) {
+			mq_rq->sg = mmc_alloc_sg(1, gfp);
+			if (!mq_rq->sg)
+				return -ENOMEM;
+			mq_rq->bounce_sg = mmc_alloc_sg(card->bouncesz / 512,
+							gfp);
+			if (!mq_rq->bounce_sg)
+				return -ENOMEM;
+		}
+	} else {
+		mq_rq->bounce_buf = NULL;
+		mq_rq->bounce_sg = NULL;
+		mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
+		if (!mq_rq->sg)
 			return -ENOMEM;
 	}
 
 	return 0;
 }
 
-void mmc_queue_free_shared_queue(struct mmc_card *card)
+static void mmc_exit_request(struct request_queue *q, struct request *req)
 {
-	if (card->mqrq) {
-		mmc_queue_free_mqrqs(card->mqrq, card->qdepth);
-		card->mqrq = NULL;
-	}
-}
+	struct mmc_queue_req *mq_rq = req_to_mq_rq(req);
 
-static int __mmc_queue_alloc_shared_queue(struct mmc_card *card, int qdepth)
-{
-	struct mmc_host *host = card->host;
-	struct mmc_queue_req *mqrq;
-	unsigned int bouncesz;
-	int ret = 0;
-
-	if (card->mqrq)
-		return -EINVAL;
-
-	mqrq = mmc_queue_alloc_mqrqs(qdepth);
-	if (!mqrq)
-		return -ENOMEM;
+	/* It is OK to kfree(NULL) so this will be smooth */
+	kfree(mq_rq->bounce_sg);
+	mq_rq->bounce_sg = NULL;
 
-	card->mqrq = mqrq;
-	card->qdepth = qdepth;
+	kfree(mq_rq->bounce_buf);
+	mq_rq->bounce_buf = NULL;
 
-	bouncesz = mmc_queue_calc_bouncesz(host);
-
-	if (bouncesz && !mmc_queue_alloc_bounce(mqrq, qdepth, bouncesz)) {
-		bouncesz = 0;
-		pr_warn("%s: unable to allocate bounce buffers\n",
-			mmc_card_name(card));
-	}
-
-	card->bouncesz = bouncesz;
-
-	if (!bouncesz) {
-		ret = mmc_queue_alloc_sgs(mqrq, qdepth, host->max_segs);
-		if (ret)
-			goto out_err;
-	}
+	kfree(mq_rq->sg);
+	mq_rq->sg = NULL;
 
-	return ret;
-
-out_err:
-	mmc_queue_free_shared_queue(card);
-	return ret;
-}
-
-int mmc_queue_alloc_shared_queue(struct mmc_card *card)
-{
-	return __mmc_queue_alloc_shared_queue(card, 2);
+	mq_rq->req = NULL;
 }
 
 /**
@@ -360,13 +248,21 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 		limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT;
 
 	mq->card = card;
-	mq->queue = blk_init_queue(mmc_request_fn, lock);
+	mq->queue = blk_alloc_queue_node(GFP_KERNEL, NUMA_NO_NODE);
 	if (!mq->queue)
 		return -ENOMEM;
-
-	mq->mqrq = card->mqrq;
-	mq->qdepth = card->qdepth;
+	mq->queue->queue_lock = lock;
+	mq->queue->request_fn = mmc_request_fn;
+	mq->queue->init_rq_fn = mmc_init_request;
+	mq->queue->exit_rq_fn = mmc_exit_request;
+	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
 	mq->queue->queuedata = mq;
+	mq->qcnt = 0;
+	ret = blk_init_allocated_queue(mq->queue);
+	if (ret) {
+		blk_cleanup_queue(mq->queue);
+		return ret;
+	}
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
@@ -374,6 +270,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	if (mmc_can_erase(card))
 		mmc_queue_setup_discard(mq->queue, card);
 
+	card->bouncesz = mmc_queue_calc_bouncesz(host);
 	if (card->bouncesz) {
 		blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 		blk_queue_max_hw_sectors(mq->queue, card->bouncesz / 512);
@@ -400,7 +297,6 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	return 0;
 
 cleanup_queue:
-	mq->mqrq = NULL;
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -421,8 +317,8 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	q->queuedata = NULL;
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
+	blk_cleanup_queue(mq->queue);
 
-	mq->mqrq = NULL;
 	mq->card = NULL;
 }
 EXPORT_SYMBOL(mmc_cleanup_queue);
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 871796c3f406..8aa10ffdf622 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -3,9 +3,15 @@
 
 #include <linux/types.h>
 #include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/mmc/core.h>
 #include <linux/mmc/host.h>
 
+static inline struct mmc_queue_req *req_to_mq_rq(struct request *rq)
+{
+	return blk_mq_rq_to_pdu(rq);
+}
+
 static inline bool mmc_req_is_special(struct request *req)
 {
 	return req &&
@@ -34,7 +40,6 @@ struct mmc_queue_req {
 	struct scatterlist	*bounce_sg;
 	unsigned int		bounce_sg_len;
 	struct mmc_async_req	areq;
-	int			task_id;
 };
 
 struct mmc_queue {
@@ -45,14 +50,15 @@ struct mmc_queue {
 	bool			asleep;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
-	struct mmc_queue_req	*mqrq;
-	int			qdepth;
+	/*
+	 * FIXME: this counter is not a very reliable way of keeping
+	 * track of how many requests that are ongoing. Switch to just
+	 * letting the block core keep track of requests and per-request
+	 * associated mmc_queue_req data.
+	 */
 	int			qcnt;
-	unsigned long		qslots;
 };
 
-extern int mmc_queue_alloc_shared_queue(struct mmc_card *card);
-extern void mmc_queue_free_shared_queue(struct mmc_card *card);
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
 			  const char *);
 extern void mmc_cleanup_queue(struct mmc_queue *);
@@ -66,8 +72,4 @@ extern void mmc_queue_bounce_post(struct mmc_queue_req *);
 
 extern int mmc_access_rpmb(struct mmc_queue *);
 
-extern struct mmc_queue_req *mmc_queue_req_find(struct mmc_queue *,
-						struct request *);
-extern void mmc_queue_req_free(struct mmc_queue *, struct mmc_queue_req *);
-
 #endif
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index aad015e0152b..46c73e97e61f 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -305,9 +305,7 @@ struct mmc_card {
 	struct mmc_part	part[MMC_NUM_PHY_PARTITION]; /* physical partitions */
 	unsigned int    nr_parts;
 
-	struct mmc_queue_req	*mqrq;		/* Shared queue structure */
 	unsigned int		bouncesz;	/* Bounce buffer size */
-	int			qdepth;		/* Shared queue depth */
 };
 
 static inline bool mmc_large_sector(struct mmc_card *card)
-- 
2.9.3


* [PATCH 3/5] mmc: block: Tag is_rpmb as bool
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
  2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
  2017-05-10  8:24 ` [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core Linus Walleij
@ 2017-05-10  8:24 ` Linus Walleij
  2017-05-10  8:24 ` [PATCH 4/5] mmc: block: move single ioctl() commands to block requests Linus Walleij
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

The variable is_rpmb is clearly a bool and is even assigned true
and false, yet it is declared as an int.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index be782b8d4a0d..323f3790b629 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -443,7 +443,7 @@ static int __mmc_blk_ioctl_cmd(struct mmc_card *card, struct mmc_blk_data *md,
 	struct mmc_request mrq = {};
 	struct scatterlist sg;
 	int err;
-	int is_rpmb = false;
+	bool is_rpmb = false;
 	u32 status = 0;
 
 	if (!card || !md || !idata)
-- 
2.9.3


* [PATCH 4/5] mmc: block: move single ioctl() commands to block requests
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
                   ` (2 preceding siblings ...)
  2017-05-10  8:24 ` [PATCH 3/5] mmc: block: Tag is_rpmb as bool Linus Walleij
@ 2017-05-10  8:24 ` Linus Walleij
  2017-05-16  9:15   ` Ulf Hansson
  2017-05-23  8:14     ` Avri Altman
  2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
  2017-05-11 21:08 ` [PATCH 0/5] mmc: core: modernize ioctl() requests Ulf Hansson
  5 siblings, 2 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

This wraps single ioctl() commands into block requests using
the custom block layer request types REQ_OP_DRV_IN and
REQ_OP_DRV_OUT.

By doing this we are loosening the grip on the big host lock,
since two calls to mmc_get_card()/mmc_put_card() are removed.

We store the ioctl() in/out argument as a pointer in the
per-request struct mmc_queue_req container. Since we now let
the block layer allocate this data, blk_get_request() will
allocate it for us and we can immediately dereference it and
use it to pass the argument into the block layer.

Tested on the ux500 with the userspace command:
mmc extcsd read /dev/mmcblk3
which resulted in a successful EXTCSD info dump back to the
console.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 56 ++++++++++++++++++++++++++++++++++++++----------
 drivers/mmc/core/queue.h |  3 +++
 2 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 323f3790b629..640db4f57a31 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -564,8 +564,10 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 {
 	struct mmc_blk_ioc_data *idata;
 	struct mmc_blk_data *md;
+	struct mmc_queue *mq;
 	struct mmc_card *card;
 	int err = 0, ioc_err = 0;
+	struct request *req;
 
 	/*
 	 * The caller must have CAP_SYS_RAWIO, and must be calling this on the
@@ -591,17 +593,18 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 		goto cmd_done;
 	}
 
-	mmc_get_card(card);
-
-	ioc_err = __mmc_blk_ioctl_cmd(card, md, idata);
-
-	/* Always switch back to main area after RPMB access */
-	if (md->area_type & MMC_BLK_DATA_AREA_RPMB)
-		mmc_blk_part_switch(card, dev_get_drvdata(&card->dev));
-
-	mmc_put_card(card);
-
+	/*
+	 * Dispatch the ioctl() into the block request queue.
+	 */
+	mq = &md->queue;
+	req = blk_get_request(mq->queue,
+		idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
+		__GFP_RECLAIM);
+	req_to_mq_rq(req)->idata = idata;
+	blk_execute_rq(mq->queue, NULL, req, 0);
+	ioc_err = req_to_mq_rq(req)->ioc_result;
 	err = mmc_blk_ioctl_copy_to_user(ic_ptr, idata);
+	blk_put_request(req);
 
 cmd_done:
 	mmc_blk_put(md);
@@ -611,6 +614,31 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 	return ioc_err ? ioc_err : err;
 }
 
+/*
+ * The ioctl commands come back from the block layer after it has queued
+ * and processed them together with all other requests, and then they
+ * get issued in this function.
+ */
+static void mmc_blk_ioctl_cmd_issue(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mq_rq;
+	struct mmc_blk_ioc_data *idata;
+	struct mmc_card *card = mq->card;
+	struct mmc_blk_data *md = mq->blkdata;
+	int ioc_err;
+
+	mq_rq = req_to_mq_rq(req);
+	idata = mq_rq->idata;
+	ioc_err = __mmc_blk_ioctl_cmd(card, md, idata);
+	mq_rq->ioc_result = ioc_err;
+
+	/* Always switch back to main area after RPMB access */
+	if (md->area_type & MMC_BLK_DATA_AREA_RPMB)
+		mmc_blk_part_switch(card, dev_get_drvdata(&card->dev));
+
+	blk_end_request_all(req, ioc_err);
+}
+
 static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
 				   struct mmc_ioc_multi_cmd __user *user)
 {
@@ -1854,7 +1882,13 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		goto out;
 	}
 
-	if (req && req_op(req) == REQ_OP_DISCARD) {
+	if (req &&
+	    (req_op(req) == REQ_OP_DRV_IN || req_op(req) == REQ_OP_DRV_OUT)) {
+		/* complete ongoing async transfer before issuing ioctl()s */
+		if (mq->qcnt)
+			mmc_blk_issue_rw_rq(mq, NULL);
+		mmc_blk_ioctl_cmd_issue(mq, req);
+	} else if (req && req_op(req) == REQ_OP_DISCARD) {
 		/* complete ongoing async transfer before issuing discard */
 		if (mq->qcnt)
 			mmc_blk_issue_rw_rq(mq, NULL);
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 8aa10ffdf622..aeb3408dc85e 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -22,6 +22,7 @@ static inline bool mmc_req_is_special(struct request *req)
 
 struct task_struct;
 struct mmc_blk_data;
+struct mmc_blk_ioc_data;
 
 struct mmc_blk_request {
 	struct mmc_request	mrq;
@@ -40,6 +41,8 @@ struct mmc_queue_req {
 	struct scatterlist	*bounce_sg;
 	unsigned int		bounce_sg_len;
 	struct mmc_async_req	areq;
+	int			ioc_result;
+	struct mmc_blk_ioc_data	*idata;
 };
 
 struct mmc_queue {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
                   ` (3 preceding siblings ...)
  2017-05-10  8:24 ` [PATCH 4/5] mmc: block: move single ioctl() commands to block requests Linus Walleij
@ 2017-05-10  8:24 ` Linus Walleij
  2017-05-12 21:09     ` Avri Altman
                     ` (3 more replies)
  2017-05-11 21:08 ` [PATCH 0/5] mmc: core: modernize ioctl() requests Ulf Hansson
  5 siblings, 4 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-10  8:24 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Linus Walleij

This also switches the multiple-command ioctl() call to issue
all ioctl()s through the block layer instead of going directly
to the device.

We extend the passed argument with an argument count and loop
over all passed commands in the ioctl() issue function called
from the block layer.

By doing this we are again loosening the grip on the big host
lock, since two calls to mmc_get_card()/mmc_put_card() are
removed.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 38 +++++++++++++++++++++++++-------------
 drivers/mmc/core/queue.h |  3 ++-
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 640db4f57a31..152de904d5e4 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -563,6 +563,7 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 			     struct mmc_ioc_cmd __user *ic_ptr)
 {
 	struct mmc_blk_ioc_data *idata;
+	struct mmc_blk_ioc_data *idatas[1];
 	struct mmc_blk_data *md;
 	struct mmc_queue *mq;
 	struct mmc_card *card;
@@ -600,7 +601,9 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 	req = blk_get_request(mq->queue,
 		idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
 		__GFP_RECLAIM);
-	req_to_mq_rq(req)->idata = idata;
+	idatas[0] = idata;
+	req_to_mq_rq(req)->idata = idatas;
+	req_to_mq_rq(req)->ioc_count = 1;
 	blk_execute_rq(mq->queue, NULL, req, 0);
 	ioc_err = req_to_mq_rq(req)->ioc_result;
 	err = mmc_blk_ioctl_copy_to_user(ic_ptr, idata);
@@ -622,14 +625,17 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
 static void mmc_blk_ioctl_cmd_issue(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_queue_req *mq_rq;
-	struct mmc_blk_ioc_data *idata;
 	struct mmc_card *card = mq->card;
 	struct mmc_blk_data *md = mq->blkdata;
 	int ioc_err;
+	int i;
 
 	mq_rq = req_to_mq_rq(req);
-	idata = mq_rq->idata;
-	ioc_err = __mmc_blk_ioctl_cmd(card, md, idata);
+	for (i = 0; i < mq_rq->ioc_count; i++) {
+		ioc_err = __mmc_blk_ioctl_cmd(card, md, mq_rq->idata[i]);
+		if (ioc_err)
+			break;
+	}
 	mq_rq->ioc_result = ioc_err;
 
 	/* Always switch back to main area after RPMB access */
@@ -646,8 +652,10 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
 	struct mmc_ioc_cmd __user *cmds = user->cmds;
 	struct mmc_card *card;
 	struct mmc_blk_data *md;
+	struct mmc_queue *mq;
 	int i, err = 0, ioc_err = 0;
 	__u64 num_of_cmds;
+	struct request *req;
 
 	/*
 	 * The caller must have CAP_SYS_RAWIO, and must be calling this on the
@@ -689,21 +697,25 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
 		goto cmd_done;
 	}
 
-	mmc_get_card(card);
-
-	for (i = 0; i < num_of_cmds && !ioc_err; i++)
-		ioc_err = __mmc_blk_ioctl_cmd(card, md, idata[i]);
-
-	/* Always switch back to main area after RPMB access */
-	if (md->area_type & MMC_BLK_DATA_AREA_RPMB)
-		mmc_blk_part_switch(card, dev_get_drvdata(&card->dev));
 
-	mmc_put_card(card);
+	/*
+	 * Dispatch the ioctl()s into the block request queue.
+	 */
+	mq = &md->queue;
+	req = blk_get_request(mq->queue,
+		idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
+		__GFP_RECLAIM);
+	req_to_mq_rq(req)->idata = idata;
+	req_to_mq_rq(req)->ioc_count = num_of_cmds;
+	blk_execute_rq(mq->queue, NULL, req, 0);
+	ioc_err = req_to_mq_rq(req)->ioc_result;
 
 	/* copy to user if data and response */
 	for (i = 0; i < num_of_cmds && !err; i++)
 		err = mmc_blk_ioctl_copy_to_user(&cmds[i], idata[i]);
 
+	blk_put_request(req);
+
 cmd_done:
 	mmc_blk_put(md);
 cmd_err:
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index aeb3408dc85e..7015df6681c3 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -42,7 +42,8 @@ struct mmc_queue_req {
 	unsigned int		bounce_sg_len;
 	struct mmc_async_req	areq;
 	int			ioc_result;
-	struct mmc_blk_ioc_data	*idata;
+	struct mmc_blk_ioc_data	**idata;
+	unsigned int		ioc_count;
 };
 
 struct mmc_queue {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/5] mmc: core: modernize ioctl() requests
  2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
                   ` (4 preceding siblings ...)
  2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
@ 2017-05-11 21:08 ` Ulf Hansson
  5 siblings, 0 replies; 31+ messages in thread
From: Ulf Hansson @ 2017-05-11 21:08 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:
> This is a series that starts to untangle the MMC "big host lock",
> i.e. what is taken by issuing mmc_claim_host() which usually happens
> through mmc_get_card().
>
> The host lock is standing in the way of a bunch of modernizations,
> because the block layer interface takes this lock when a new request
> arrives, then will not release it until the queue is empty, which
> is detected by getting NULL from blk_fetch_request().
>
> This does not work with the new MQ block layer because it issues
> requests asynchronously and there is no way of telling if the submit
> queue is empty or not. Block requests must always be served. Trying
> to introduce an interface for figuring out if the queue is empty
> is probably generally a bad idea as it will require cross-talk
> between all CPUs and then we hit Amdahl's Law again when scaling
> upwards with CPUs.
>
> So Arnd Bergmann suggested that whatever parallel requests the
> host lock is protecting, maybe these requests can be funneled
> through the block layer itself?
>
> It turns out that most block drivers already do this, by using the
> "special" block requests REQ_OP_DRV_IN and REQ_OP_DRV_OUT. And
> it is what we should have done from the beginning.
>
> To do this, the per-request extra data (which I think is also
> referred to as "tag") needs to be handled the same way that all
> other block drivers do it: allocate it through the block layer.
> In the MMC stack, this extra per-request data (tag) is called
> struct mmc_queue_req.
>
> So the second patch converts to using the generic block layer
> mechanism to allocate tags. This makes the core simpler IMO and
> that is why the patch series has negative line count. We move
> stuff over to the block layer.
>
> The first patch cleans out bounce buffer configurability and
> moves that to be a property of the host rather than a Kconfig option
> in order to make it cleaner to do the rest of the refactorings.
>
> The last two patches move the ioctl() calls over to use the
> new per-request tags and removes two instances of mmc_get_card(),
> so we start to untangle the big host lock.
>
> This approach can be used to get rid of the debugfs mmc_get_card()
> as well but I want to start with the ioctl()s.
>
> It already fixes a problem: userspace and the block layer could
> essentially starve each other by bombing requests from either
> block access or ioctl() congesting the host lock. With this,
> ioctl() operations get scheduled by the block layer with
> everything else. Not that I know how the block layer prioritizes
> REQ_OP_DRV_IN and REQ_OP_DRV_OUT requests, but I am confident
> it does a better job than the first-come-first-served host lock.
>
> This drives a truck through mine and Adrian's patch sets for
> multiqueue and command queueing respectively, but I think that
> both of our patch series will get easier to implement if we
> exploit these patches as a base for future work, so if they can
> get positive reviews and do not make everything explode, I
> would suggest we merge them as a starter for the v4.13 kernel
> cycle.
>
> Linus Walleij (5):
>   mmc: core: Delete bounce buffer Kconfig option
>   mmc: core: Allocate per-request data using the block layer core
>   mmc: block: Tag is_rpmb as bool
>   mmc: block: move single ioctl() commands to block requests
>   mmc: block: move multi-ioctl() to use block layer
>
>  drivers/mmc/core/Kconfig  |  18 ----
>  drivers/mmc/core/block.c  | 126 +++++++++++++++----------
>  drivers/mmc/core/queue.c  | 235 ++++++++++++----------------------------------
>  drivers/mmc/core/queue.h  |  26 +++--
>  drivers/mmc/host/cavium.c |   3 +
>  drivers/mmc/host/pxamci.c |   6 ++
>  include/linux/mmc/card.h  |   2 -
>  include/linux/mmc/host.h  |   1 +
>  8 files changed, 161 insertions(+), 256 deletions(-)
>
> --
> 2.9.3
>


Really nice work you have done here! I have looked over the
series and it seems promising. Allow me a couple more days for
review though.

Also, hopefully we can get some more people to help test it, but
I am rather eager to apply this once rc1 is out.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
@ 2017-05-12 21:09     ` Avri Altman
  0 siblings, 0 replies; 31+ messages in thread
From: Avri Altman @ 2017-05-12 21:09 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente



> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Linus Walleij
> Sent: Wednesday, May 10, 2017 11:24 AM
> To: linux-mmc@vger.kernel.org; Ulf Hansson <ulf.hansson@linaro.org>;
> Adrian Hunter <adrian.hunter@intel.com>
> Cc: linux-block@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Christoph
> Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>; Bartlomiej
> Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> <paolo.valente@linaro.org>; Linus Walleij <linus.walleij@linaro.org>
> Subject: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
> 
> This switches also the multiple-command ioctl() call to issue all ioctl()s
> through the block layer instead of going directly to the device.
> 
> We extend the passed argument with an argument count and loop over all
> passed commands in the ioctl() issue function called from the block layer.
> 
> By doing this we are again loosening the grip on the big host lock, since two
> calls to mmc_get_card()/mmc_put_card() are removed.
> 
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/mmc/core/block.c | 38 +++++++++++++++++++++++++-------------
>  drivers/mmc/core/queue.h |  3 ++-
>  2 files changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index
> 640db4f57a31..152de904d5e4 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -563,6 +563,7 @@ static int mmc_blk_ioctl_cmd(struct block_device
> *bdev,
>  			     struct mmc_ioc_cmd __user *ic_ptr)  {
>  	struct mmc_blk_ioc_data *idata;
> +	struct mmc_blk_ioc_data *idatas[1];
>  	struct mmc_blk_data *md;
>  	struct mmc_queue *mq;
>  	struct mmc_card *card;
> @@ -600,7 +601,9 @@ static int mmc_blk_ioctl_cmd(struct block_device
> *bdev,
>  	req = blk_get_request(mq->queue,
>  		idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
>  		__GFP_RECLAIM);
> -	req_to_mq_rq(req)->idata = idata;
> +	idatas[0] = idata;
> +	req_to_mq_rq(req)->idata = idatas;
> +	req_to_mq_rq(req)->ioc_count = 1;
>  	blk_execute_rq(mq->queue, NULL, req, 0);
>  	ioc_err = req_to_mq_rq(req)->ioc_result;
>  	err = mmc_blk_ioctl_copy_to_user(ic_ptr, idata); @@ -622,14 +625,17
> @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,  static void
> mmc_blk_ioctl_cmd_issue(struct mmc_queue *mq, struct request *req)  {
>  	struct mmc_queue_req *mq_rq;
> -	struct mmc_blk_ioc_data *idata;
>  	struct mmc_card *card = mq->card;
>  	struct mmc_blk_data *md = mq->blkdata;
>  	int ioc_err;
> +	int i;
> 
>  	mq_rq = req_to_mq_rq(req);
> -	idata = mq_rq->idata;
> -	ioc_err = __mmc_blk_ioctl_cmd(card, md, idata);
> +	for (i = 0; i < mq_rq->ioc_count; i++) {
> +		ioc_err = __mmc_blk_ioctl_cmd(card, md, mq_rq->idata[i]);
> +		if (ioc_err)
> +			break;
> +	}
>  	mq_rq->ioc_result = ioc_err;
> 
>  	/* Always switch back to main area after RPMB access */ @@ -646,8
> +652,10 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
>  	struct mmc_ioc_cmd __user *cmds = user->cmds;
>  	struct mmc_card *card;
>  	struct mmc_blk_data *md;
> +	struct mmc_queue *mq;
>  	int i, err = 0, ioc_err = 0;
>  	__u64 num_of_cmds;
> +	struct request *req;
> 
>  	/*
>  	 * The caller must have CAP_SYS_RAWIO, and must be calling this on
> the @@ -689,21 +697,25 @@ static int mmc_blk_ioctl_multi_cmd(struct
> block_device *bdev,
>  		goto cmd_done;
>  	}
> 
> -	mmc_get_card(card);
> -
> -	for (i = 0; i < num_of_cmds && !ioc_err; i++)
> -		ioc_err = __mmc_blk_ioctl_cmd(card, md, idata[i]);
> -
> -	/* Always switch back to main area after RPMB access */
> -	if (md->area_type & MMC_BLK_DATA_AREA_RPMB)
> -		mmc_blk_part_switch(card, dev_get_drvdata(&card->dev));
> 
> -	mmc_put_card(card);
> +	/*
> +	 * Dispatch the ioctl()s into the block request queue.
> +	 */
> +	mq = &md->queue;
> +	req = blk_get_request(mq->queue,
> +		idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
> +		__GFP_RECLAIM);
It is possible, e.g. in RPMB access, that some commands are reads and some are writes.
Not sure that it makes any difference, because once it gets back to mmc_blk_ioctl_cmd_issue(),
the correct mmc requests will be issued anyway?

> +	req_to_mq_rq(req)->idata = idata;
> +	req_to_mq_rq(req)->ioc_count = num_of_cmds;
> +	blk_execute_rq(mq->queue, NULL, req, 0);
> +	ioc_err = req_to_mq_rq(req)->ioc_result;
> 
>  	/* copy to user if data and response */
>  	for (i = 0; i < num_of_cmds && !err; i++)
>  		err = mmc_blk_ioctl_copy_to_user(&cmds[i], idata[i]);
> 
> +	blk_put_request(req);
> +
>  cmd_done:
>  	mmc_blk_put(md);
>  cmd_err:
> diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h index
> aeb3408dc85e..7015df6681c3 100644
> --- a/drivers/mmc/core/queue.h
> +++ b/drivers/mmc/core/queue.h
> @@ -42,7 +42,8 @@ struct mmc_queue_req {
>  	unsigned int		bounce_sg_len;
>  	struct mmc_async_req	areq;
>  	int			ioc_result;
> -	struct mmc_blk_ioc_data	*idata;
> +	struct mmc_blk_ioc_data	**idata;
> +	unsigned int		ioc_count;
>  };
> 
>  struct mmc_queue {
> --
> 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the
> body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option
  2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
@ 2017-05-15 11:55   ` Ulf Hansson
  2017-05-15 14:04   ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 31+ messages in thread
From: Ulf Hansson @ 2017-05-15 11:55 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:
> This option is activated by all multiplatform configs and whatnot,
> so we almost always have it turned on, and the memory it saves is
> negligible, even more so moving forward. The actual bounce buffer
> only gets allocated when used; the only thing the ifdefs are
> saving is a little bit of code.
>
> It is highly improper to have this as a Kconfig option that gets
> turned on by Kconfig. Make this a pure runtime thing and let the
> host decide whether we use bounce buffers. We add a new property
> "disable_bounce" to the host struct.
>
> Notice that mmc_queue_calc_bouncesz() already disables the
> bounce buffers if host->max_segs != 1, so any arch that has a
> maximum number of segments higher than 1 will have bounce
> buffers disabled.
>
> The option CONFIG_MMC_BLOCK_BOUNCE is default y so the
> majority of platforms in the kernel already have it on, and
> it then gets turned off at runtime since most of these have
> a host->max_segs > 1. The few exceptions that have
> host->max_segs == 1 and still turn off the bounce buffering
> are those that disable it in their defconfig.
>
> Those are the following:
>
> arch/arm/configs/colibri_pxa300_defconfig
> arch/arm/configs/zeus_defconfig
> - Uses MMC_PXA, drivers/mmc/host/pxamci.c
> - Sets host->max_segs = NR_SG, which is 1
> - This needs its bounce buffer deactivated so we set
>   host->disable_bounce to true in the host driver
>
> arch/arm/configs/davinci_all_defconfig
> - Uses MMC_DAVINCI, drivers/mmc/host/davinci_mmc.c
> - This driver sets host->max_segs to MAX_NR_SG, which is 16
> - That means this driver anyways disabled bounce buffers
> - No special action needed for this platform
>
> arch/arm/configs/lpc32xx_defconfig
> arch/arm/configs/nhk8815_defconfig
> arch/arm/configs/u300_defconfig
> - Uses MMC_ARMMMCI, drivers/mmc/host/mmci.[c|h]
> - This driver by default sets host->max_segs to NR_SG,
>   which is 128, unless a DMA engine is used, and in that case
>   the number of segments are also > 1
> - That means this driver already disables bounce buffers
> - No special action needed for these platforms
>
> arch/arm/configs/sama5_defconfig
> - Uses MMC_SDHCI, MMC_SDHCI_PLTFM, MMC_SDHCI_OF_AT91, MMC_ATMELMCI
> - Uses drivers/mmc/host/sdhci.c
> - Normally sets host->max_segs to SDHCI_MAX_SEGS which is 128 and
>   thus disables bounce buffers
> - Sets host->max_segs to 1 if SDHCI_USE_SDMA is set
> - SDHCI_USE_SDMA is only set by SDHCI on PCI adapers
> - That means that for this platform bounce buffers are already
>   disabled at runtime
> - No special action needed for this platform
>
> arch/blackfin/configs/CM-BF533_defconfig
> arch/blackfin/configs/CM-BF537E_defconfig
> - Uses MMC_SPI (a simple MMC card connected on SPI pins)
> - Uses drivers/mmc/host/mmc_spi.c
> - Sets host->max_segs to MMC_SPI_BLOCKSATONCE which is 128
> - That means this platform already disables bounce buffers at
>   runtime
> - No special action needed for these platforms
>
> arch/mips/configs/cavium_octeon_defconfig
> - Uses MMC_CAVIUM_OCTEON, drivers/mmc/host/cavium.c
> - Sets host->max_segs to 16 or 1
> - Setting host->disable_bounce to be sure for the 1 case
>
> arch/mips/configs/qi_lb60_defconfig
> - Uses MMC_JZ4740, drivers/mmc/host/jz4740_mmc.c
> - This sets host->max_segs to 128 so bounce buffers are
>   already runtime disabled
> - No action needed for this platform
>
> It would be interesting to come up with a list of the platforms
> that actually end up using bounce buffers. I have not been
> able to infer such a list, but it occurs when
> host->max_segs == 1 and the bounce buffering is not explicitly
> disabled.

Thanks for the descriptive change-log!

[...]

> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index 21385ac0c9b1..b53c0e18a33b 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -310,6 +310,7 @@ struct mmc_host {
>         /* host specific block data */
>         unsigned int            max_seg_size;   /* see blk_queue_max_segment_size */
>         unsigned short          max_segs;       /* see blk_queue_max_segments */
> +       bool                    disable_bounce; /* disable bounce buffers */

Instead of adding a separate host configuration flag, let's use one of
capabilities bit-field instead (->caps or ->caps2). Then I suggest we
then add a new bit, MMC_CAP_NO_BOUNCE_BUFF.

>         unsigned short          unused;
>         unsigned int            max_req_size;   /* maximum number of bytes in one req */
>         unsigned int            max_blk_size;   /* maximum size of one mmc block */
> --
> 2.9.3
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option
  2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
  2017-05-15 11:55   ` Ulf Hansson
@ 2017-05-15 14:04   ` Bartlomiej Zolnierkiewicz
  2017-05-18  7:48     ` Linus Walleij
  1 sibling, 1 reply; 31+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2017-05-15 14:04 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Ulf Hansson, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Paolo Valente, Daniel Mack,
	Marc Zyngier, Steven J. Hill


Hi,

[ I added Daniel, Marc & Steven to cc: ]

On Wednesday, May 10, 2017 10:24:14 AM Linus Walleij wrote:
> This option is activated by all multiplatform configs and whatnot,
> so we almost always have it turned on, and the memory it saves is
> negligible, even more so moving forward. The actual bounce buffer
> only gets allocated when used; the only thing the ifdefs are
> saving is a little bit of code.
> 
> It is highly improper to have this as a Kconfig option that gets
> turned on by Kconfig. Make this a pure runtime thing and let the
> host decide whether we use bounce buffers. We add a new property
> "disable_bounce" to the host struct.
> 
> Notice that mmc_queue_calc_bouncesz() already disables the
> bounce buffers if host->max_segs != 1, so any arch that has a
> maximum number of segments higher than 1 will have bounce
> buffers disabled.
> 
> The option CONFIG_MMC_BLOCK_BOUNCE is default y so the
> majority of platforms in the kernel already have it on, and
> it then gets turned off at runtime since most of these have
> a host->max_segs > 1. The few exceptions that have
> host->max_segs == 1 and still turn off the bounce buffering
> are those that disable it in their defconfig.
> 
> Those are the following:
> 
> arch/arm/configs/colibri_pxa300_defconfig
> arch/arm/configs/zeus_defconfig
> - Uses MMC_PXA, drivers/mmc/host/pxamci.c
> - Sets host->max_segs = NR_SG, which is 1
> - This needs its bounce buffer deactivated so we set
>   host->disable_bounce to true in the host driver

[...]

> arch/mips/configs/cavium_octeon_defconfig
> - Uses MMC_CAVIUM_OCTEON, drivers/mmc/host/cavium.c
> - Sets host->max_segs to 16 or 1
> - Setting host->disable_bounce to be sure for the 1 case

From looking at the code it seems that bounce buffering should
always be beneficial to MMC performance when host->max_segs == 1.

It would be useful to know why these specific defconfigs
disable bounce buffering (to save memory?). Maybe the defaults
should be changed nowadays?

[...]

>  drivers/mmc/core/Kconfig  | 18 ------------------
>  drivers/mmc/core/queue.c  | 15 +--------------
>  drivers/mmc/host/cavium.c |  3 +++
>  drivers/mmc/host/pxamci.c |  6 ++++++
>  include/linux/mmc/host.h  |  1 +
>  5 files changed, 11 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/mmc/core/Kconfig b/drivers/mmc/core/Kconfig
> index fc1ecdaaa9ca..42e89060cd41 100644
> --- a/drivers/mmc/core/Kconfig
> +++ b/drivers/mmc/core/Kconfig
> @@ -61,24 +61,6 @@ config MMC_BLOCK_MINORS
>  
>  	  If unsure, say 8 here.
>  
> -config MMC_BLOCK_BOUNCE
> -	bool "Use bounce buffer for simple hosts"
> -	depends on MMC_BLOCK
> -	default y
> -	help
> -	  SD/MMC is a high latency protocol where it is crucial to
> -	  send large requests in order to get high performance. Many
> -	  controllers, however, are restricted to continuous memory
> -	  (i.e. they can't do scatter-gather), something the kernel
> -	  rarely can provide.
> -
> -	  Say Y here to help these restricted hosts by bouncing
> -	  requests back and forth from a large buffer. You will get
> -	  a big performance gain at the cost of up to 64 KiB of
> -	  physical memory.
> -
> -	  If unsure, say Y here.
> -
>  config SDIO_UART
>  	tristate "SDIO UART/GPS class support"
>  	depends on TTY
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 5c37b6be3e7b..545466342fb1 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -219,7 +219,6 @@ static struct mmc_queue_req *mmc_queue_alloc_mqrqs(int qdepth)
>  	return mqrq;
>  }
>  
> -#ifdef CONFIG_MMC_BLOCK_BOUNCE
>  static int mmc_queue_alloc_bounce_bufs(struct mmc_queue_req *mqrq, int qdepth,
>  				       unsigned int bouncesz)
>  {
> @@ -258,7 +257,7 @@ static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
>  {
>  	unsigned int bouncesz = MMC_QUEUE_BOUNCESZ;
>  
> -	if (host->max_segs != 1)
> +	if (host->max_segs != 1 || host->disable_bounce)
>  		return 0;
>  
>  	if (bouncesz > host->max_req_size)
> @@ -273,18 +272,6 @@ static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
>  
>  	return bouncesz;
>  }
> -#else
> -static inline bool mmc_queue_alloc_bounce(struct mmc_queue_req *mqrq,
> -					  int qdepth, unsigned int bouncesz)
> -{
> -	return false;
> -}
> -
> -static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
> -{
> -	return 0;
> -}
> -#endif
>  
>  static int mmc_queue_alloc_sgs(struct mmc_queue_req *mqrq, int qdepth,
>  			       int max_segs)
> diff --git a/drivers/mmc/host/cavium.c b/drivers/mmc/host/cavium.c
> index 58b51ba6aabd..66066f73e477 100644
> --- a/drivers/mmc/host/cavium.c
> +++ b/drivers/mmc/host/cavium.c
> @@ -1050,6 +1050,9 @@ int cvm_mmc_of_slot_probe(struct device *dev, struct cvm_mmc_host *host)
>  	else
>  		mmc->max_segs = 1;
>  
> +	/* Disable bounce buffers for max_segs = 1 */
> +	mmc->disable_bounce = true;
> +
>  	/* DMA size field can address up to 8 MB */
>  	mmc->max_seg_size = 8 * 1024 * 1024;
>  	mmc->max_req_size = mmc->max_seg_size;
> diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
> index c763b404510f..d3b5e6376504 100644
> --- a/drivers/mmc/host/pxamci.c
> +++ b/drivers/mmc/host/pxamci.c
> @@ -666,6 +666,12 @@ static int pxamci_probe(struct platform_device *pdev)
>  	mmc->max_segs = NR_SG;
>  
>  	/*
> +	 * This architecture used to disable bounce buffers through its
> +	 * defconfig, now it is done at runtime as a host property.
> +	 */
> +	mmc->disable_bounce = true;
> +
> +	/*
>  	 * Our hardware DMA can handle a maximum of one page per SG entry.
>  	 */
>  	mmc->max_seg_size = PAGE_SIZE;
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index 21385ac0c9b1..b53c0e18a33b 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -310,6 +310,7 @@ struct mmc_host {
>  	/* host specific block data */
>  	unsigned int		max_seg_size;	/* see blk_queue_max_segment_size */
>  	unsigned short		max_segs;	/* see blk_queue_max_segments */
> +	bool			disable_bounce; /* disable bounce buffers */
>  	unsigned short		unused;
>  	unsigned int		max_req_size;	/* maximum number of bytes in one req */
>  	unsigned int		max_blk_size;	/* maximum size of one mmc block */

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-10  8:24 ` [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core Linus Walleij
@ 2017-05-16  9:02   ` Ulf Hansson
  2017-05-18  8:01     ` Linus Walleij
  2017-05-16 11:54   ` Adrian Hunter
  1 sibling, 1 reply; 31+ messages in thread
From: Ulf Hansson @ 2017-05-16  9:02 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:
> The mmc_queue_req is a per-request state container the MMC core uses
> to carry bounce buffers, pointers to asynchronous requests and so on.
> Currently allocated as a static array of objects, then as a request
> comes in, a mmc_queue_req is assigned to it, and used during the
> lifetime of the request.
>
> This is backwards compared to how other block layer drivers work:
> they usually let the block core provide a per-request struct that
> gets allocated right behind the struct request, and which can be
> obtained using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in
> this function name is misleading: it is used by both the old and
> the MQ block layer.)
>
> The per-request struct gets allocated to the size stored in the
> queue variable .cmd_size, initialized using .init_rq_fn() and
> cleaned up using .exit_rq_fn().
>
> The block layer code makes the MMC core rely on this mechanism to
> allocate the per-request mmc_queue_req state container.
>
> Doing this makes a lot of complicated queue handling go away. We
> only need to keep the .qcnt that keeps count of how many requests
> are currently being processed by the MMC layer. The MQ block layer
> will replace this too once we transition to it.
>
> Doing this refactoring is necessary to move the ioctl() operations
> into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
> instead of the custom code using the BigMMCHostLock that we have
> today: those require that per-request data be obtainable easily from
> a request after creating a custom request with e.g.:
>
> struct request *rq = blk_get_request(q, REQ_OP_DRV_IN, __GFP_RECLAIM);
> struct mmc_queue_req *mq_rq = req_to_mq_rq(rq);
>
> And this is not possible with the current construction, as the request
> is not immediately assigned the per-request state container, but
> instead it gets assigned when the request finally enters the MMC
> queue, which is way too late for custom requests.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/mmc/core/block.c |  38 ++------
>  drivers/mmc/core/queue.c | 222 +++++++++++++----------------------------------
>  drivers/mmc/core/queue.h |  22 ++---
>  include/linux/mmc/card.h |   2 -
>  4 files changed, 80 insertions(+), 204 deletions(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index 8273b078686d..be782b8d4a0d 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c

[...]

> @@ -1662,7 +1655,7 @@ static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
>         if (mmc_card_removed(mq->card)) {
>                 req->rq_flags |= RQF_QUIET;
>                 blk_end_request_all(req, -EIO);
> -               mmc_queue_req_free(mq, mqrq);
> +               mq->qcnt--; /* FIXME: just set to 0? */

As mentioned below, perhaps this FIXME is fine to add, as I assume
you soon intend to take care of it, right?

[...]

> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 545466342fb1..65a8e0e63012 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c

[...]

> +/**
> + * mmc_init_request() - initialize the MMC-specific per-request data
> + * @q: the request queue
> + * @req: the request
> + * @gfp: memory allocation policy
> + */
> +static int mmc_init_request(struct request_queue *q, struct request *req,
> +                           gfp_t gfp)
>  {
> -       int i;
> +       struct mmc_queue_req *mq_rq = req_to_mq_rq(req);
> +       struct mmc_queue *mq = q->queuedata;
> +       struct mmc_card *card = mq->card;
> +       struct mmc_host *host = card->host;
>
> -       for (i = 0; i < qdepth; i++) {
> -               mqrq[i].sg = mmc_alloc_sg(max_segs);
> -               if (!mqrq[i].sg)
> +       /* FIXME: use req_to_mq_rq() everywhere this is dereferenced */

Why not do that right now, instead of adding a FIXME comment?

> +       mq_rq->req = req;
> +
> +       if (card->bouncesz) {
> +               mq_rq->bounce_buf = kmalloc(card->bouncesz, gfp);
> +               if (!mq_rq->bounce_buf)
> +                       return -ENOMEM;
> +               if (card->bouncesz > 512) {
> +                       mq_rq->sg = mmc_alloc_sg(1, gfp);
> +                       if (!mq_rq->sg)
> +                               return -ENOMEM;
> +                       mq_rq->bounce_sg = mmc_alloc_sg(card->bouncesz / 512,
> +                                                       gfp);
> +                       if (!mq_rq->bounce_sg)
> +                               return -ENOMEM;
> +               }
> +       } else {
> +               mq_rq->bounce_buf = NULL;
> +               mq_rq->bounce_sg = NULL;
> +               mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
> +               if (!mq_rq->sg)
>                         return -ENOMEM;
>         }
>
>         return 0;
>  }
>

[...]

> @@ -360,13 +248,21 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
>                 limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT;
>
>         mq->card = card;
> -       mq->queue = blk_init_queue(mmc_request_fn, lock);
> +       mq->queue = blk_alloc_queue_node(GFP_KERNEL, NUMA_NO_NODE);

Seems like we should use blk_alloc_queue() instead, as it calls
blk_alloc_queue_node(gfp_mask, NUMA_NO_NODE) for us.

>         if (!mq->queue)
>                 return -ENOMEM;
> -
> -       mq->mqrq = card->mqrq;
> -       mq->qdepth = card->qdepth;
> +       mq->queue->queue_lock = lock;
> +       mq->queue->request_fn = mmc_request_fn;
> +       mq->queue->init_rq_fn = mmc_init_request;
> +       mq->queue->exit_rq_fn = mmc_exit_request;
> +       mq->queue->cmd_size = sizeof(struct mmc_queue_req);
>         mq->queue->queuedata = mq;
> +       mq->qcnt = 0;
> +       ret = blk_init_allocated_queue(mq->queue);
> +       if (ret) {
> +               blk_cleanup_queue(mq->queue);
> +               return ret;
> +       }
>
>         blk_queue_prep_rq(mq->queue, mmc_prep_request);
>         queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);

[...]

> @@ -421,8 +317,8 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
>         q->queuedata = NULL;
>         blk_start_queue(q);
>         spin_unlock_irqrestore(q->queue_lock, flags);
> +       blk_cleanup_queue(mq->queue);
>
> -       mq->mqrq = NULL;
>         mq->card = NULL;
>  }
>  EXPORT_SYMBOL(mmc_cleanup_queue);
> diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
> index 871796c3f406..8aa10ffdf622 100644
> --- a/drivers/mmc/core/queue.h
> +++ b/drivers/mmc/core/queue.h
> @@ -3,9 +3,15 @@
>
>  #include <linux/types.h>
>  #include <linux/blkdev.h>
> +#include <linux/blk-mq.h>
>  #include <linux/mmc/core.h>
>  #include <linux/mmc/host.h>
>
> +static inline struct mmc_queue_req *req_to_mq_rq(struct request *rq)

To be more consistent with existing function names, perhaps rename this to:
req_to_mmc_queue_req()

> +{
> +       return blk_mq_rq_to_pdu(rq);
> +}
> +

[...]

>  struct mmc_queue {
> @@ -45,14 +50,15 @@ struct mmc_queue {
>         bool                    asleep;
>         struct mmc_blk_data     *blkdata;
>         struct request_queue    *queue;
> -       struct mmc_queue_req    *mqrq;
> -       int                     qdepth;
> +       /*
> +        * FIXME: this counter is not a very reliable way of keeping
> +        * track of how many requests that are ongoing. Switch to just
> +        * letting the block core keep track of requests and per-request
> +        * associated mmc_queue_req data.
> +        */
>         int                     qcnt;

I am not very fond of FIXME comments; however, perhaps this one
really deserves to be a FIXME because you intend to fix it asap,
right?

> -       unsigned long           qslots;
>  };
>

[...]

Besides my minor nitpicks, this is really an impressive cleanup!
Thanks for working on this.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/5] mmc: block: move single ioctl() commands to block requests
  2017-05-10  8:24 ` [PATCH 4/5] mmc: block: move single ioctl() commands to block requests Linus Walleij
@ 2017-05-16  9:15   ` Ulf Hansson
  2017-05-23  8:14     ` Avri Altman
  1 sibling, 0 replies; 31+ messages in thread
From: Ulf Hansson @ 2017-05-16  9:15 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:
> This wraps single ioctl() commands into block requests using
> the custom block layer request types REQ_OP_DRV_IN and
> REQ_OP_DRV_OUT.
>
> By doing this we are loosening the grip on the big host lock,
> since two calls to mmc_get_card()/mmc_put_card() are removed.
>
> We are storing the ioctl() in/out argument as a pointer in
> the per-request struct mmc_blk_request container. Since we
> now let the block layer allocate this data, blk_get_request()
> will allocate it for us and we can immediately dereference
> it and use it to pass the argument into the block layer.
>
> Tested on the ux500 with the userspace:
> mmc extcsd read /dev/mmcblk3
> resulting in a successful EXTCSD info dump back to the
> console.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/mmc/core/block.c | 56 ++++++++++++++++++++++++++++++++++++++----------

[...]

> @@ -1854,7 +1882,13 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
>                 goto out;
>         }
>
> -       if (req && req_op(req) == REQ_OP_DISCARD) {
> +       if (req &&
> +           (req_op(req) == REQ_OP_DRV_IN || req_op(req) == REQ_OP_DRV_OUT)) {
> +               /* complete ongoing async transfer before issuing ioctl()s */
> +               if (mq->qcnt)
> +                       mmc_blk_issue_rw_rq(mq, NULL);
> +               mmc_blk_ioctl_cmd_issue(mq, req);
> +       } else if (req && req_op(req) == REQ_OP_DISCARD) {

While you are at it, would you mind converting this if-else-if to a
switch clause instead?

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
  2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
  2017-05-12 21:09     ` Avri Altman
@ 2017-05-16  9:21   ` Ulf Hansson
  2017-05-23  9:21     ` Avri Altman
  2017-05-23 10:51     ` Avri Altman
  3 siblings, 0 replies; 31+ messages in thread
From: Ulf Hansson @ 2017-05-16  9:21 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:
> This also switches the multiple-command ioctl() call to issue all
> ioctl()s through the block layer instead of going directly to the
> device.
>
> We extend the passed argument with an argument count and loop
> over all passed commands in the ioctl() issue function called
> from the block layer.
>
> By doing this we are again loosening the grip on the big host
> lock, since two calls to mmc_get_card()/mmc_put_card() are
> removed.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

I have reviewed the series and it looks good to me. Some minor
nitpicks were all I found; please address them and then we can apply
this.

Kind regards
Uffe

> ---
>  drivers/mmc/core/block.c | 38 +++++++++++++++++++++++++-------------
>  drivers/mmc/core/queue.h |  3 ++-
>  2 files changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index 640db4f57a31..152de904d5e4 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -563,6 +563,7 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
>                              struct mmc_ioc_cmd __user *ic_ptr)
>  {
>         struct mmc_blk_ioc_data *idata;
> +       struct mmc_blk_ioc_data *idatas[1];
>         struct mmc_blk_data *md;
>         struct mmc_queue *mq;
>         struct mmc_card *card;
> @@ -600,7 +601,9 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
>         req = blk_get_request(mq->queue,
>                 idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
>                 __GFP_RECLAIM);
> -       req_to_mq_rq(req)->idata = idata;
> +       idatas[0] = idata;
> +       req_to_mq_rq(req)->idata = idatas;
> +       req_to_mq_rq(req)->ioc_count = 1;
>         blk_execute_rq(mq->queue, NULL, req, 0);
>         ioc_err = req_to_mq_rq(req)->ioc_result;
>         err = mmc_blk_ioctl_copy_to_user(ic_ptr, idata);
> @@ -622,14 +625,17 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev,
>  static void mmc_blk_ioctl_cmd_issue(struct mmc_queue *mq, struct request *req)
>  {
>         struct mmc_queue_req *mq_rq;
> -       struct mmc_blk_ioc_data *idata;
>         struct mmc_card *card = mq->card;
>         struct mmc_blk_data *md = mq->blkdata;
>         int ioc_err;
> +       int i;
>
>         mq_rq = req_to_mq_rq(req);
> -       idata = mq_rq->idata;
> -       ioc_err = __mmc_blk_ioctl_cmd(card, md, idata);
> +       for (i = 0; i < mq_rq->ioc_count; i++) {
> +               ioc_err = __mmc_blk_ioctl_cmd(card, md, mq_rq->idata[i]);
> +               if (ioc_err)
> +                       break;
> +       }
>         mq_rq->ioc_result = ioc_err;
>
>         /* Always switch back to main area after RPMB access */
> @@ -646,8 +652,10 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
>         struct mmc_ioc_cmd __user *cmds = user->cmds;
>         struct mmc_card *card;
>         struct mmc_blk_data *md;
> +       struct mmc_queue *mq;
>         int i, err = 0, ioc_err = 0;
>         __u64 num_of_cmds;
> +       struct request *req;
>
>         /*
>          * The caller must have CAP_SYS_RAWIO, and must be calling this on the
> @@ -689,21 +697,25 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev,
>                 goto cmd_done;
>         }
>
> -       mmc_get_card(card);
> -
> -       for (i = 0; i < num_of_cmds && !ioc_err; i++)
> -               ioc_err = __mmc_blk_ioctl_cmd(card, md, idata[i]);
> -
> -       /* Always switch back to main area after RPMB access */
> -       if (md->area_type & MMC_BLK_DATA_AREA_RPMB)
> -               mmc_blk_part_switch(card, dev_get_drvdata(&card->dev));
>
> -       mmc_put_card(card);
> +       /*
> +        * Dispatch the ioctl()s into the block request queue.
> +        */
> +       mq = &md->queue;
> +       req = blk_get_request(mq->queue,
> +               idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
> +               __GFP_RECLAIM);
> +       req_to_mq_rq(req)->idata = idata;
> +       req_to_mq_rq(req)->ioc_count = num_of_cmds;
> +       blk_execute_rq(mq->queue, NULL, req, 0);
> +       ioc_err = req_to_mq_rq(req)->ioc_result;
>
>         /* copy to user if data and response */
>         for (i = 0; i < num_of_cmds && !err; i++)
>                 err = mmc_blk_ioctl_copy_to_user(&cmds[i], idata[i]);
>
> +       blk_put_request(req);
> +
>  cmd_done:
>         mmc_blk_put(md);
>  cmd_err:
> diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
> index aeb3408dc85e..7015df6681c3 100644
> --- a/drivers/mmc/core/queue.h
> +++ b/drivers/mmc/core/queue.h
> @@ -42,7 +42,8 @@ struct mmc_queue_req {
>         unsigned int            bounce_sg_len;
>         struct mmc_async_req    areq;
>         int                     ioc_result;
> -       struct mmc_blk_ioc_data *idata;
> +       struct mmc_blk_ioc_data **idata;
> +       unsigned int            ioc_count;
>  };
>
>  struct mmc_queue {
> --
> 2.9.3
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-10  8:24 ` [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core Linus Walleij
  2017-05-16  9:02   ` Ulf Hansson
@ 2017-05-16 11:54   ` Adrian Hunter
  2017-05-18  8:21     ` Linus Walleij
  1 sibling, 1 reply; 31+ messages in thread
From: Adrian Hunter @ 2017-05-16 11:54 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente

On 10/05/17 11:24, Linus Walleij wrote:
> The mmc_queue_req is a per-request state container the MMC core uses
> to carry bounce buffers, pointers to asynchronous requests and so on.
> Currently allocated as a static array of objects, then as a request
> comes in, a mmc_queue_req is assigned to it, and used during the
> lifetime of the request.
> 
> This is backwards compared to how other block layer drivers work:
> they usually let the block core provide a per-request struct that
> gets allocated right behind the struct request, and which can be
> obtained using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in
> this function name is misleading: it is used by both the old and
> the MQ block layer.)
> 
> The per-request struct gets allocated to the size stored in the
> queue variable .cmd_size, initialized using .init_rq_fn() and
> cleaned up using .exit_rq_fn().
> 
> The block layer code makes the MMC core rely on this mechanism to
> allocate the per-request mmc_queue_req state container.
> 
> Doing this makes a lot of complicated queue handling go away.

Isn't that at the expense of increased memory allocation?

Have you compared the number of allocations?  It looks to me like the block
layer allocates a minimum of 4 requests in the memory pool which will
increase if there are more in the I/O scheduler, plus 1 for flush.  There
are often 4 queues per eMMC (2x boot, RPMB and main area), so that is 20
requests minimum, up from 2 allocations previously.  For someone using 64K
bounce buffers, you have increased memory allocation by at least 18x64 =
1152k.  However the I/O scheduler could allocate a lot more.

> Doing this refactoring is necessary to move the ioctl() operations
> into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]

Obviously you could create a per-request data structure with only the
reference to the IOCTL data, and without putting all the memory allocations
there as well.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option
  2017-05-15 14:04   ` Bartlomiej Zolnierkiewicz
@ 2017-05-18  7:48     ` Linus Walleij
  2017-05-19  7:30       ` [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option Steven J. Hill
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Walleij @ 2017-05-18  7:48 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: linux-mmc, Ulf Hansson, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Paolo Valente, Daniel Mack,
	Marc Zyngier, Steven J. Hill

On Mon, May 15, 2017 at 4:04 PM, Bartlomiej Zolnierkiewicz
<b.zolnierkie@samsung.com> wrote:

> [ I added Daniel, Marc & Steven to cc: ]

>> The option CONFIG_MMC_BLOCK_BOUNCE is default y so the
>> majority of platforms in the kernel already have it on, and
>> it then gets turned off at runtime since most of these have
>> a host->max_segs > 1. The few exceptions that have
>> host->max_segs == 1 and still turn off the bounce buffering
>> are those that disable it in their defconfig.
>>
>> Those are the following:
>>
>> arch/arm/configs/colibri_pxa300_defconfig
>> arch/arm/configs/zeus_defconfig
>> - Uses MMC_PXA, drivers/mmc/host/pxamci.c
>> - Sets host->max_segs = NR_SG, which is 1
>> - This needs its bounce buffer deactivated so we set
>>   host->disable_bounce to true in the host driver
>
> [...]
>
>> arch/mips/configs/cavium_octeon_defconfig
>> - Uses MMC_CAVIUM_OCTEON, drivers/mmc/host/cavium.c
>> - Sets host->max_segs to 16 or 1
>> - Setting host->disable_bounce to be sure for the 1 case
>
> From looking at the code it seems that bounce buffering should
> always be beneficial to MMC performance when host->max_segs == 1.
>
> It would be useful to know why these specific defconfigs
> disable bounce buffering (to save memory?). Maybe the defaults
> should be changed nowadays?

I agree. But I can't test on pxamci and cavium, so I would
appreciate it if the driver maintainers could try removing the
flag (MMC_CAP_NO_BOUNCE_BUFF in the v2 patch set) and see what
happens.

I would be happy to cut this special flag altogether, but I am
also afraid of screwing up some config :/

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-16  9:02   ` Ulf Hansson
@ 2017-05-18  8:01     ` Linus Walleij
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-18  8:01 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On Tue, May 16, 2017 at 11:02 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 10 May 2017 at 10:24, Linus Walleij <linus.walleij@linaro.org> wrote:

>> @@ -1662,7 +1655,7 @@ static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
>>         if (mmc_card_removed(mq->card)) {
>>                 req->rq_flags |= RQF_QUIET;
>>                 blk_end_request_all(req, -EIO);
>> -               mmc_queue_req_free(mq, mqrq);
>> +               mq->qcnt--; /* FIXME: just set to 0? */
>
> As mentioned below, perhaps this FIXME is fine to add. As I assume you
> soon intend to take care of it, right?

Yes, that goes away with my MQ patches (not yet rebased): we stop
trying to detect when the queue is empty and just issue requests
asynchronously. I just wanted to point out that this counter is
kind of fragile and scary to me.

>> -       for (i = 0; i < qdepth; i++) {
>> -               mqrq[i].sg = mmc_alloc_sg(max_segs);
>> -               if (!mqrq[i].sg)
>> +       /* FIXME: use req_to_mq_rq() everywhere this is dereferenced */
>
> Why not do that right now, instead of adding a FIXME comment?

This comment is wrong, just a development artifact; I will delete it.

>>         mq->card = card;
>> -       mq->queue = blk_init_queue(mmc_request_fn, lock);
>> +       mq->queue = blk_alloc_queue_node(GFP_KERNEL, NUMA_NO_NODE);
>
> Seems like we should use blk_alloc_queue() instead, as it calls
> blk_alloc_queue_node(gfp_mask, NUMA_NO_NODE) for us.

OK

>> +static inline struct mmc_queue_req *req_to_mq_rq(struct request *rq)
>
> To be more consistent with existing function names, perhaps rename this to:
> req_to_mmc_queue_req()
>
>> +{
>> +       return blk_mq_rq_to_pdu(rq);
>> +}
>> +
>
> [...]
>
>>  struct mmc_queue {
>> @@ -45,14 +50,15 @@ struct mmc_queue {
>>         bool                    asleep;
>>         struct mmc_blk_data     *blkdata;
>>         struct request_queue    *queue;
>> -       struct mmc_queue_req    *mqrq;
>> -       int                     qdepth;
>> +       /*
>> +        * FIXME: this counter is not a very reliable way of keeping
>> +        * track of how many requests that are ongoing. Switch to just
>> +        * letting the block core keep track of requests and per-request
>> +        * associated mmc_queue_req data.
>> +        */
>>         int                     qcnt;
>
> I am not very fond of FIXME comments, however perhaps this one really
> deserves to be a FIXME because you intend to fix this asap, right?

Same as the first comment. It is fragile and I don't like it; with
asynchronous issuing in MQ this goes away.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-16 11:54   ` Adrian Hunter
@ 2017-05-18  8:21     ` Linus Walleij
  2017-05-18 12:42       ` Adrian Hunter
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Walleij @ 2017-05-18  8:21 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On Tue, May 16, 2017 at 1:54 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 10/05/17 11:24, Linus Walleij wrote:
>> The mmc_queue_req is a per-request state container the MMC core uses
>> to carry bounce buffers, pointers to asynchronous requests and so on.
>> Currently allocated as a static array of objects, then as a request
>> comes in, a mmc_queue_req is assigned to it, and used during the
>> lifetime of the request.
>>
>> This is backwards compared to how other block layer drivers work:
>> they usually let the block core provide a per-request struct that
>> gets allocated right behind the struct request, and which can be
>> obtained using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in
>> this function name is misleading: it is used by both the old and
>> the MQ block layer.)
>>
>> The per-request struct gets allocated to the size stored in the
>> queue variable .cmd_size, initialized using .init_rq_fn() and
>> cleaned up using .exit_rq_fn().
>>
>> The block layer code makes the MMC core rely on this mechanism to
>> allocate the per-request mmc_queue_req state container.
>>
>> Doing this makes a lot of complicated queue handling go away.
>
> Isn't that at the expense of increased memory allocation?
>
> Have you compared the number of allocations?  It looks to me like the block
> layer allocates a minimum of 4 requests in the memory pool which will
> increase if there are more in the I/O scheduler, plus 1 for flush.  There
> are often 4 queues per eMMC (2x boot, RPMB and main area), so that is 20
> requests minimum, up from 2 allocations previously.  For someone using 64K
> bounce buffers, you have increased memory allocation by at least 18x64 =
> 1152k.  However the I/O scheduler could allocate a lot more.

That is not a realistic example.

As pointed out in patch #1, bounce buffers are used on old systems
which have max_segs == 1. No modern hardware has that: they all
have host controllers capable of handling multiple segments, and
often DMA engines as well.

Old systems with max_segs == 1 also have:

- One SD or MMC slot
- No eMMC (because it was not yet invented in those times)
- So no RPMB or Boot partitions, just main area

If you can point me to a system that has max_segs == 1 and an
eMMC mounted, I can look into it and ask the driver maintainers to
check if it disturbs them, but I think those simply do not exist.

>> Doing this refactoring is necessary to move the ioctl() operations
>> into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
>
> Obviously you could create a per-request data structure with only the
> reference to the IOCTL data, and without putting all the memory allocations
> there as well.

Not easily, and this is the way all IDE, ATA, SCSI disks etc. do
this, so why would we try to be different and maintain a lot of
deviant code?

The allocation of extra data is done by the block layer when issuing
blk_get_request(), so trying to keep the old mechanism of a list of
struct mmc_queue_req and trying to pair these with incoming requests
inevitably means a lot of extra work, possibly deepening that list or
creating out-of-list extra entries and whatnot.

It's better to do what everyone else does and let the core do this
allocation of extra data (tag) instead.
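The layout Linus describes can be modeled in plain userspace C. This is a toy sketch, not the kernel API: `toy_request`, `toy_get_request()` and `toy_rq_to_pdu()` are hypothetical stand-ins for `struct request`, `blk_get_request()` and `blk_mq_rq_to_pdu()`; the point is only that the driver's per-request data lives in the same allocation, directly behind the request:

```c
#include <stdlib.h>

/* Toy stand-in for struct request. */
struct toy_request {
	int tag;
};

/* Toy stand-in for struct mmc_queue_req (the driver "pdu"). */
struct toy_mqrq {
	int sg_len;
};

/*
 * The block core allocates the request plus .cmd_size extra bytes in
 * one chunk, so no separate lookup or pairing is needed later.
 */
static struct toy_request *toy_get_request(size_t cmd_size)
{
	return calloc(1, sizeof(struct toy_request) + cmd_size);
}

/* Like blk_mq_rq_to_pdu(): the pdu sits right behind the request. */
static void *toy_rq_to_pdu(struct toy_request *rq)
{
	return rq + 1;
}
```

With this layout, mapping a request to its per-request container is a single pointer addition instead of pairing requests against a static array of preallocated objects.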

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
  2017-05-12 21:09     ` Avri Altman
  (?)
@ 2017-05-18  9:26     ` Linus Walleij
  -1 siblings, 0 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-18  9:26 UTC (permalink / raw)
  To: Avri Altman
  Cc: linux-mmc, Ulf Hansson, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On Fri, May 12, 2017 at 11:09 PM, Avri Altman <Avri.Altman@sandisk.com> wrote:

>> +     req = blk_get_request(mq->queue,
>> +             idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
>> +             __GFP_RECLAIM);
>
> It is possible, e.g. as in RPMB access, that some commands are read and some are write.
> Not sure that it makes any difference, because once it get back to mmc_blk_ioctl_cmd_issue(),
> The correct mmc requests will be issued anyway?

The OP type (REQ_OP_DRV_OUT or REQ_OP_DRV_IN) has no semantic
effect in the MMC/SD stack, I just need to set it to something reasonable.
They will all be handled the same when issueing the request later.

The only semantic effect would be if the block layer prioritize these types
of requests differently compared to each other or to other requests, in
which case I think this rough guess is not a big issue.
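Since the OP type carries no meaning inside the MMC/SD stack, the issue side amounts to a dispatcher that treats both driver-private ops identically. A toy model with made-up names (`toy_op`, `toy_issue`), not the actual mmc_blk code:

```c
/* Hypothetical op types mirroring REQ_OP_DRV_IN/OUT and a normal read. */
enum toy_op { TOY_OP_DRV_IN, TOY_OP_DRV_OUT, TOY_OP_READ };

/* Returns 1 for the driver-private (ioctl) path, 0 for normal I/O. */
static int toy_issue(enum toy_op op)
{
	switch (op) {
	case TOY_OP_DRV_IN:
	case TOY_OP_DRV_OUT:
		/* Both driver-private ops fall through to the same
		 * ioctl issue routine; the IN/OUT distinction is
		 * only a rough hint to the block layer. */
		return 1;
	default:
		return 0;
	}
}
```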

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-18  8:21     ` Linus Walleij
@ 2017-05-18 12:42       ` Adrian Hunter
  2017-05-18 13:31         ` Linus Walleij
  0 siblings, 1 reply; 31+ messages in thread
From: Adrian Hunter @ 2017-05-18 12:42 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On 18/05/17 11:21, Linus Walleij wrote:
> On Tue, May 16, 2017 at 1:54 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 10/05/17 11:24, Linus Walleij wrote:
>>> The mmc_queue_req is a per-request state container the MMC core uses
>>> to carry bounce buffers, pointers to asynchronous requests and so on.
>>> Currently allocated as a static array of objects, then as a request
>>> comes in, a mmc_queue_req is assigned to it, and used during the
>>> lifetime of the request.
>>>
>>> This is backwards compared to how other block layer drivers work:
>>> they usually let the block core provide a per-request struct that gets
>>> allocated right behind the struct request, and which can be obtained
>>> using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in this function
>>> name is misleading: it is used by both the old and the MQ block
>>> layer.)
>>>
>>> The per-request struct gets allocated to the size stored in the queue
>>> variable .cmd_size, initialized using the .init_rq_fn() callback and
>>> cleaned up using .exit_rq_fn().
>>>
>>> The block layer code makes the MMC core rely on this mechanism to
>>> allocate the per-request mmc_queue_req state container.
>>>
>>> Doing this makes a lot of complicated queue handling go away.
>>
>> Isn't that at the expense of increased memory allocation?
>>
>> Have you compared the number of allocations?  It looks to me like the block
>> layer allocates a minimum of 4 requests in the memory pool which will
>> increase if there are more in the I/O scheduler, plus 1 for flush.  There
>> are often 4 queues per eMMC (2x boot, RPMB and main area), so that is 20
>> requests minimum, up from 2 allocations previously.  For someone using 64K
>> bounce buffers, you have increased memory allocation by at least 18x64 =
>> 1152k.  However the I/O scheduler could allocate a lot more.
> 
> That is not a realistic example.
> 
> As pointed out in patch #1, bounce buffers are used on old systems
> which have max_segs == 1. No modern hardware has that: they all
> have host controllers capable of handling multiple segments, and
> often DMA engines as well.
> 
> Old systems with max_segs == 1 also have:
> 
> - One SD or MMC slot
> - No eMMC (because it was not yet invented in those times)
> - So no RPMB or Boot partitions, just main area
> 
> If you can point me to a system that has max_segs == 1 and an
> eMMC mounted, I can look into it and ask the driver maintainers to
> check if it disturbs them, but I think those simply do not exist.
> 
>>> Doing this refactoring is necessary to move the ioctl() operations
>>> into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
>>
>> Obviously you could create a per-request data structure with only the
>> reference to the IOCTL data, and without putting all the memory allocations
>> there as well.
> 
> Not easily, and this is the way all IDE, ATA, SCSI disks etc. do
> this, so why would we try to be different and maintain a lot of
> deviant code?
> 
> The allocation of extra data is done by the block layer when issuing
> blk_get_request(), so trying to keep the old mechanism of a list of
> struct mmc_queue_req and trying to pair these with incoming requests
> inevitably means a lot of extra work, possibly deepening that list or
> creating out-of-list extra entries and whatnot.
> 
> It's better to do what everyone else does and let the core do this
> allocation of extra data (tag) instead.

I agree it is much nicer, but the extra bounce buffer allocations still seem
gratuitous.  Maybe we should allocate them as needed from a memory pool,
instead of for every request.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core
  2017-05-18 12:42       ` Adrian Hunter
@ 2017-05-18 13:31         ` Linus Walleij
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-18 13:31 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente

On Thu, May 18, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 18/05/17 11:21, Linus Walleij wrote:

>> It's better to do what everyone else does and let the core do this
>> allocation of extra data (tag) instead.
>
> I agree it is much nicer, but the extra bounce buffer allocations still seem
> gratuitous.  Maybe we should allocate them as needed from a memory pool,
> instead of for every request.

Incidentally IIRC that is what happens when we migrate to MQ.
In the old block layer, the per-request data is indeed initialized for
every request as you say, but in MQ the same struct request *'s
are reused from a pool; they are only initialized once, i.e. when
you add the block device.

(If I remember my logs right.)
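The reuse pattern being described can be sketched as a tiny fixed pool: objects are allocated and initialized once, when the device is added, then only recycled afterwards. Purely illustrative userspace C, not the kernel's tag-set code:

```c
#include <stdlib.h>

#define TOY_POOL_DEPTH 4

/* Toy request pool: one-time allocation, then get/put recycling. */
struct toy_pool {
	void *slots[TOY_POOL_DEPTH];
	int free_count;
	int init_count;	/* how many one-time initializations happened */
};

static void toy_pool_init(struct toy_pool *p, size_t cmd_size)
{
	p->free_count = TOY_POOL_DEPTH;
	p->init_count = 0;
	for (int i = 0; i < TOY_POOL_DEPTH; i++) {
		p->slots[i] = calloc(1, cmd_size);	/* init once */
		p->init_count++;
	}
}

static void *toy_pool_get(struct toy_pool *p)
{
	if (p->free_count == 0)
		return NULL;
	return p->slots[--p->free_count];	/* reuse, no re-init */
}

static void toy_pool_put(struct toy_pool *p, void *req)
{
	p->slots[p->free_count++] = req;
}
```

Under this scheme the per-request bounce buffer, if any, would be paid for once per queue slot rather than once per incoming request.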

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option.
  2017-05-18  7:48     ` Linus Walleij
@ 2017-05-19  7:30       ` Steven J. Hill
  2017-05-23  9:05         ` Linus Walleij
  0 siblings, 1 reply; 31+ messages in thread
From: Steven J. Hill @ 2017-05-19  7:30 UTC (permalink / raw)
  To: Linus Walleij, Bartlomiej Zolnierkiewicz
  Cc: linux-mmc, Ulf Hansson, Adrian Hunter, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Paolo Valente, Daniel Mack,
	Marc Zyngier

Remove MMC bounce buffer config option and associated code. This
is proposed in addition to Linus' changes to remove the config
option. I have tested this on our Octeon hardware platforms.

Signed-off-by: Steven J. Hill <Steven.Hill@cavium.com>
---
 arch/arm/configs/colibri_pxa300_defconfig |   1 -
 arch/arm/configs/davinci_all_defconfig    |   1 -
 arch/arm/configs/lpc32xx_defconfig        |   1 -
 arch/arm/configs/nhk8815_defconfig        |   1 -
 arch/arm/configs/sama5_defconfig          |   1 -
 arch/arm/configs/u300_defconfig           |   1 -
 arch/arm/configs/zeus_defconfig           |   1 -
 arch/blackfin/configs/CM-BF533_defconfig  |   1 -
 arch/blackfin/configs/CM-BF537E_defconfig |   1 -
 arch/mips/configs/cavium_octeon_defconfig |   1 -
 arch/mips/configs/qi_lb60_defconfig       |   1 -
 drivers/mmc/core/Kconfig                  |  18 -----
 drivers/mmc/core/queue.c                  | 109 +++---------------------------
 drivers/mmc/host/davinci_mmc.c            |   6 +-
 include/linux/mmc/card.h                  |   1 -
 15 files changed, 9 insertions(+), 136 deletions(-)

diff --git a/arch/arm/configs/colibri_pxa300_defconfig b/arch/arm/configs/colibri_pxa300_defconfig
index be02fe2..10c940c 100644
--- a/arch/arm/configs/colibri_pxa300_defconfig
+++ b/arch/arm/configs/colibri_pxa300_defconfig
@@ -51,7 +51,6 @@ CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
 CONFIG_USB_MON=y
 CONFIG_USB_STORAGE=y
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_PXA=y
 CONFIG_EXT3_FS=y
 CONFIG_INOTIFY=y
diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig
index 67db829..8a8f086 100644
--- a/arch/arm/configs/davinci_all_defconfig
+++ b/arch/arm/configs/davinci_all_defconfig
@@ -188,7 +188,6 @@ CONFIG_USB_G_SERIAL=m
 CONFIG_USB_G_PRINTER=m
 CONFIG_USB_CDC_COMPOSITE=m
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_DAVINCI=y
 CONFIG_NEW_LEDS=y
 CONFIG_LEDS_CLASS=m
diff --git a/arch/arm/configs/lpc32xx_defconfig b/arch/arm/configs/lpc32xx_defconfig
index 6ba430d..655edfd 100644
--- a/arch/arm/configs/lpc32xx_defconfig
+++ b/arch/arm/configs/lpc32xx_defconfig
@@ -146,7 +146,6 @@ CONFIG_USB_LPC32XX=y
 CONFIG_USB_MASS_STORAGE=m
 CONFIG_USB_G_SERIAL=m
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_ARMMMCI=y
 CONFIG_MMC_SPI=y
 CONFIG_NEW_LEDS=y
diff --git a/arch/arm/configs/nhk8815_defconfig b/arch/arm/configs/nhk8815_defconfig
index 7d2ad30..e113d4a 100644
--- a/arch/arm/configs/nhk8815_defconfig
+++ b/arch/arm/configs/nhk8815_defconfig
@@ -95,7 +95,6 @@ CONFIG_GPIO_STMPE=y
 CONFIG_MFD_STMPE=y
 CONFIG_REGULATOR=y
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_ARMMMCI=y
 CONFIG_NEW_LEDS=y
 CONFIG_LEDS_CLASS=y
diff --git a/arch/arm/configs/sama5_defconfig b/arch/arm/configs/sama5_defconfig
index 777c9e9..c330071 100644
--- a/arch/arm/configs/sama5_defconfig
+++ b/arch/arm/configs/sama5_defconfig
@@ -180,7 +180,6 @@ CONFIG_USB_GADGET=y
 CONFIG_USB_ATMEL_USBA=y
 CONFIG_USB_G_SERIAL=y
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_SDHCI=y
 CONFIG_MMC_SDHCI_PLTFM=y
 CONFIG_MMC_SDHCI_OF_AT91=y
diff --git a/arch/arm/configs/u300_defconfig b/arch/arm/configs/u300_defconfig
index aaa95ab..d6eb9fe 100644
--- a/arch/arm/configs/u300_defconfig
+++ b/arch/arm/configs/u300_defconfig
@@ -50,7 +50,6 @@ CONFIG_BACKLIGHT_CLASS_DEVICE=y
 # CONFIG_USB_SUPPORT is not set
 CONFIG_MMC=y
 CONFIG_MMC_UNSAFE_RESUME=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_ARMMMCI=y
 CONFIG_RTC_CLASS=y
 # CONFIG_RTC_HCTOSYS is not set
diff --git a/arch/arm/configs/zeus_defconfig b/arch/arm/configs/zeus_defconfig
index cd11da8..a126f7b 100644
--- a/arch/arm/configs/zeus_defconfig
+++ b/arch/arm/configs/zeus_defconfig
@@ -146,7 +146,6 @@ CONFIG_USB_MASS_STORAGE=m
 CONFIG_USB_G_SERIAL=m
 CONFIG_USB_G_PRINTER=m
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_PXA=y
 CONFIG_NEW_LEDS=y
 CONFIG_LEDS_CLASS=m
diff --git a/arch/blackfin/configs/CM-BF533_defconfig b/arch/blackfin/configs/CM-BF533_defconfig
index 9a5716d..7a40109 100644
--- a/arch/blackfin/configs/CM-BF533_defconfig
+++ b/arch/blackfin/configs/CM-BF533_defconfig
@@ -59,7 +59,6 @@ CONFIG_SPI_BFIN5XX=y
 # CONFIG_HWMON is not set
 # CONFIG_USB_SUPPORT is not set
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_SPI=m
 # CONFIG_DNOTIFY is not set
 CONFIG_VFAT_FS=y
diff --git a/arch/blackfin/configs/CM-BF537E_defconfig b/arch/blackfin/configs/CM-BF537E_defconfig
index 6845928..b47ced8 100644
--- a/arch/blackfin/configs/CM-BF537E_defconfig
+++ b/arch/blackfin/configs/CM-BF537E_defconfig
@@ -83,7 +83,6 @@ CONFIG_GPIO_SYSFS=y
 CONFIG_USB_GADGET=m
 CONFIG_USB_ETH=m
 CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_SPI=m
 CONFIG_EXT2_FS=y
 CONFIG_EXT2_FS_XATTR=y
diff --git a/arch/mips/configs/cavium_octeon_defconfig b/arch/mips/configs/cavium_octeon_defconfig
index d4fda41..7f337cf 100644
--- a/arch/mips/configs/cavium_octeon_defconfig
+++ b/arch/mips/configs/cavium_octeon_defconfig
@@ -130,7 +130,6 @@ CONFIG_USB_OHCI_HCD_PLATFORM=m
 CONFIG_MMC=y
 # CONFIG_PWRSEQ_EMMC is not set
 # CONFIG_PWRSEQ_SIMPLE is not set
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_CAVIUM_OCTEON=y
 CONFIG_RTC_CLASS=y
 CONFIG_RTC_DRV_DS1307=y
diff --git a/arch/mips/configs/qi_lb60_defconfig b/arch/mips/configs/qi_lb60_defconfig
index d7bb8cc..df067bc 100644
--- a/arch/mips/configs/qi_lb60_defconfig
+++ b/arch/mips/configs/qi_lb60_defconfig
@@ -110,7 +110,6 @@ CONFIG_USB_ETH=y
 # CONFIG_USB_ETH_RNDIS is not set
 CONFIG_MMC=y
 CONFIG_MMC_UNSAFE_RESUME=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
 CONFIG_MMC_JZ4740=y
 CONFIG_RTC_CLASS=y
 CONFIG_RTC_DRV_JZ4740=y
diff --git a/drivers/mmc/core/Kconfig b/drivers/mmc/core/Kconfig
index fc1ecda..42e8906 100644
--- a/drivers/mmc/core/Kconfig
+++ b/drivers/mmc/core/Kconfig
@@ -61,24 +61,6 @@ config MMC_BLOCK_MINORS

 	  If unsure, say 8 here.

-config MMC_BLOCK_BOUNCE
-	bool "Use bounce buffer for simple hosts"
-	depends on MMC_BLOCK
-	default y
-	help
-	  SD/MMC is a high latency protocol where it is crucial to
-	  send large requests in order to get high performance. Many
-	  controllers, however, are restricted to continuous memory
-	  (i.e. they can't do scatter-gather), something the kernel
-	  rarely can provide.
-
-	  Say Y here to help these restricted hosts by bouncing
-	  requests back and forth from a large buffer. You will get
-	  a big performance gain at the cost of up to 64 KiB of
-	  physical memory.
-
-	  If unsure, say Y here.
-
 config SDIO_UART
 	tristate "SDIO UART/GPS class support"
 	depends on TTY
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 5c37b6b..0dcf31d 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -23,8 +23,6 @@
 #include "core.h"
 #include "card.h"

-#define MMC_QUEUE_BOUNCESZ	65536
-
 /*
  * Prepare a MMC request. This just filters out odd stuff.
  */
@@ -219,73 +217,6 @@ static struct mmc_queue_req *mmc_queue_alloc_mqrqs(int qdepth)
 	return mqrq;
 }

-#ifdef CONFIG_MMC_BLOCK_BOUNCE
-static int mmc_queue_alloc_bounce_bufs(struct mmc_queue_req *mqrq, int qdepth,
-				       unsigned int bouncesz)
-{
-	int i;
-
-	for (i = 0; i < qdepth; i++) {
-		mqrq[i].bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
-		if (!mqrq[i].bounce_buf)
-			return -ENOMEM;
-
-		mqrq[i].sg = mmc_alloc_sg(1);
-		if (!mqrq[i].sg)
-			return -ENOMEM;
-
-		mqrq[i].bounce_sg = mmc_alloc_sg(bouncesz / 512);
-		if (!mqrq[i].bounce_sg)
-			return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static bool mmc_queue_alloc_bounce(struct mmc_queue_req *mqrq, int qdepth,
-				   unsigned int bouncesz)
-{
-	int ret;
-
-	ret = mmc_queue_alloc_bounce_bufs(mqrq, qdepth, bouncesz);
-	if (ret)
-		mmc_queue_reqs_free_bufs(mqrq, qdepth);
-
-	return !ret;
-}
-
-static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
-{
-	unsigned int bouncesz = MMC_QUEUE_BOUNCESZ;
-
-	if (host->max_segs != 1)
-		return 0;
-
-	if (bouncesz > host->max_req_size)
-		bouncesz = host->max_req_size;
-	if (bouncesz > host->max_seg_size)
-		bouncesz = host->max_seg_size;
-	if (bouncesz > host->max_blk_count * 512)
-		bouncesz = host->max_blk_count * 512;
-
-	if (bouncesz <= 512)
-		return 0;
-
-	return bouncesz;
-}
-#else
-static inline bool mmc_queue_alloc_bounce(struct mmc_queue_req *mqrq,
-					  int qdepth, unsigned int bouncesz)
-{
-	return false;
-}
-
-static unsigned int mmc_queue_calc_bouncesz(struct mmc_host *host)
-{
-	return 0;
-}
-#endif
-
 static int mmc_queue_alloc_sgs(struct mmc_queue_req *mqrq, int qdepth,
 			       int max_segs)
 {
@@ -312,7 +243,6 @@ static int __mmc_queue_alloc_shared_queue(struct mmc_card *card, int qdepth)
 {
 	struct mmc_host *host = card->host;
 	struct mmc_queue_req *mqrq;
-	unsigned int bouncesz;
 	int ret = 0;

 	if (card->mqrq)
@@ -325,26 +255,10 @@ static int __mmc_queue_alloc_shared_queue(struct mmc_card *card, int qdepth)
 	card->mqrq = mqrq;
 	card->qdepth = qdepth;

-	bouncesz = mmc_queue_calc_bouncesz(host);
-
-	if (bouncesz && !mmc_queue_alloc_bounce(mqrq, qdepth, bouncesz)) {
-		bouncesz = 0;
-		pr_warn("%s: unable to allocate bounce buffers\n",
-			mmc_card_name(card));
-	}
-
-	card->bouncesz = bouncesz;
-
-	if (!bouncesz) {
-		ret = mmc_queue_alloc_sgs(mqrq, qdepth, host->max_segs);
-		if (ret)
-			goto out_err;
-	}
-
-	return ret;
+	ret = mmc_queue_alloc_sgs(mqrq, qdepth, host->max_segs);
+	if (ret)
+		mmc_queue_free_shared_queue(card);

-out_err:
-	mmc_queue_free_shared_queue(card);
 	return ret;
 }

@@ -387,18 +301,11 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	if (mmc_can_erase(card))
 		mmc_queue_setup_discard(mq->queue, card);

-	if (card->bouncesz) {
-		blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
-		blk_queue_max_hw_sectors(mq->queue, card->bouncesz / 512);
-		blk_queue_max_segments(mq->queue, card->bouncesz / 512);
-		blk_queue_max_segment_size(mq->queue, card->bouncesz);
-	} else {
-		blk_queue_bounce_limit(mq->queue, limit);
-		blk_queue_max_hw_sectors(mq->queue,
-			min(host->max_blk_count, host->max_req_size / 512));
-		blk_queue_max_segments(mq->queue, host->max_segs);
-		blk_queue_max_segment_size(mq->queue, host->max_seg_size);
-	}
+	blk_queue_bounce_limit(mq->queue, limit);
+	blk_queue_max_hw_sectors(mq->queue,
+		min(host->max_blk_count, host->max_req_size / 512));
+	blk_queue_max_segments(mq->queue, host->max_segs);
+	blk_queue_max_segment_size(mq->queue, host->max_seg_size);

 	sema_init(&mq->thread_sem, 1);

diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
index 621ce47..2c6d131 100644
--- a/drivers/mmc/host/davinci_mmc.c
+++ b/drivers/mmc/host/davinci_mmc.c
@@ -149,11 +149,7 @@

 /*
  * One scatterlist dma "segment" is at most MAX_CCNT rw_threshold units,
- * and we handle up to MAX_NR_SG segments.  MMC_BLOCK_BOUNCE kicks in only
- * for drivers with max_segs == 1, making the segments bigger (64KB)
- * than the page or two that's otherwise typical. nr_sg (passed from
- * platform data) == 16 gives at least the same throughput boost, using
- * EDMA transfer linkage instead of spending CPU time copying pages.
+ * and we handle up to MAX_NR_SG segments.
  */
 #define MAX_CCNT	((1 << 16) - 1)

diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index aad015e..23653c7 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -306,7 +306,6 @@ struct mmc_card {
 	unsigned int    nr_parts;

 	struct mmc_queue_req	*mqrq;		/* Shared queue structure */
-	unsigned int		bouncesz;	/* Bounce buffer size */
 	int			qdepth;		/* Shared queue depth */
 };

-- 
2.1.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* RE: [PATCH 4/5] mmc: block: move single ioctl() commands to block requests
  2017-05-10  8:24 ` [PATCH 4/5] mmc: block: move single ioctl() commands to block requests Linus Walleij
@ 2017-05-23  8:14     ` Avri Altman
  2017-05-23  8:14     ` Avri Altman
  1 sibling, 0 replies; 31+ messages in thread
From: Avri Altman @ 2017-05-23  8:14 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente



> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Linus Walleij
> Sent: Wednesday, May 10, 2017 11:24 AM
> To: linux-mmc@vger.kernel.org; Ulf Hansson <ulf.hansson@linaro.org>;
> Adrian Hunter <adrian.hunter@intel.com>
> Cc: linux-block@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Christoph
> Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>; Bartlomiej
> Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> <paolo.valente@linaro.org>; Linus Walleij <linus.walleij@linaro.org>
> Subject: [PATCH 4/5] mmc: block: move single ioctl() commands to block
> requests
> 
> This wraps single ioctl() commands into block requests using the custom
> block layer request types REQ_OP_DRV_IN and REQ_OP_DRV_OUT.
> 
> By doing this we are loosening the grip on the big host lock, since two calls to
> mmc_get_card()/mmc_put_card() are removed.
> 
> We are storing the ioctl() in/out argument as a pointer in the per-request
> struct mmc_blk_request container. Since we now let the block layer allocate
> this data, blk_get_request() will allocate it for us and we can immediately
> dereference it and use it to pass the argument into the block layer.
> 
> Tested on the ux500 with the userspace:
> mmc extcsd read /dev/mmcblk3
> resulting in a successful EXTCSD info dump back to the console.
> 
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Avri Altman <Avri.Altman@sandisk.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option.
  2017-05-19  7:30       ` [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option Steven J. Hill
@ 2017-05-23  9:05         ` Linus Walleij
  2017-05-23 10:08           ` Arnd Bergmann
  2017-05-23 18:24           ` Pierre Ossman
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Walleij @ 2017-05-23  9:05 UTC (permalink / raw)
  To: Steven J. Hill, Pierre Ossman, Vinod Koul
  Cc: Bartlomiej Zolnierkiewicz, linux-mmc, Ulf Hansson, Adrian Hunter,
	linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Paolo Valente, Daniel Mack, Marc Zyngier

On Fri, May 19, 2017 at 9:30 AM, Steven J. Hill <Steven.Hill@cavium.com> wrote:

> Remove MMC bounce buffer config option and associated code. This
> is proposed in addition to Linus' changes to remove the config
> option. I have tested this on our Octeon hardware platforms.
>
> Signed-off-by: Steven J. Hill <Steven.Hill@cavium.com>

This would have to be rebased as Ulf merged my patch making this
a per-host runtime config option. (The Kconfig is gone for example.)

Bounce buffers were added by Pierre Ossman for kernel 2.6.23 in
commit 98ccf14909ba02a41c5925b0b2c92aeeef23d3b9
"mmc: bounce requests for simple hosts"

Quote:

    Some hosts cannot do scatter/gather in hardware. Since not doing sg
    is such a big performance hit, we (optionally) bounce the requests
    to a simple linear buffer that we hand over to the driver.

    Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

So this runs the risk of reducing performance on simple MMC/SD
controllers. Notice: simple, not old.

We need to know if people are deploying simple controllers still
and if this is something that really affects their performance.

That said: this was put in place because the kernel was sending
SG lists that the host DMA could not manage.

Nowadays we have two mechanisms:

- DMA engine and DMA-API that help out in managing bounce
  buffers when used. This means bounce buffering is only useful for
  hardware that does autonomous DMA, without any separate DMA engine.

- CMA that can actually allocate a big chunk of memory: I think
  this original code is restricted to a 64KB segment because
  kmalloc() will only guarantee contiguous physical memory up to
  64-128KiB or so. Now we could actually allocate a big badass
  CMA buffer if that improves the performance, and that would be
  a per-host setting.
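Whichever mechanism ends up providing the memory, the job of the bounce code itself is to linearize a scattered request into one contiguous buffer for a max_segs == 1 host. A minimal userspace sketch of that copy, with hypothetical names (`toy_seg`, `toy_bounce_to_buf`) standing in for the scatterlist handling in drivers/mmc/core/queue.c:

```c
#include <stddef.h>
#include <string.h>

/* Toy scatterlist segment. */
struct toy_seg {
	const char *buf;
	size_t len;
};

/*
 * Linearize a scattered write into one contiguous bounce buffer, the
 * essence of what the MMC bounce code did for max_segs == 1 hosts.
 * Returns the number of bytes copied, or 0 if it would overflow.
 */
static size_t toy_bounce_to_buf(char *bounce, size_t bouncesz,
				const struct toy_seg *sg, int nsegs)
{
	size_t off = 0;

	for (int i = 0; i < nsegs; i++) {
		if (off + sg[i].len > bouncesz)
			return 0;
		memcpy(bounce + off, sg[i].buf, sg[i].len);
		off += sg[i].len;
	}
	return off;
}
```

The host then sees a single segment; for reads the copy runs in the other direction after the transfer completes, which is where the performance cost of bouncing comes from.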

It would be good to hear from people seeing benefits from bounce
buffers about this. What hardware is there that acually sees a
significant improvement with bounce buffers?

Pierre, what host were you developing this for? Maybe I can try
to get the same and test it.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
  2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
@ 2017-05-23  9:21     ` Avri Altman
  2017-05-16  9:21   ` Ulf Hansson
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Avri Altman @ 2017-05-23  9:21 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente



> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Linus Walleij
> Sent: Wednesday, May 10, 2017 11:24 AM
> To: linux-mmc@vger.kernel.org; Ulf Hansson <ulf.hansson@linaro.org>;
> Adrian Hunter <adrian.hunter@intel.com>
> Cc: linux-block@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Christoph
> Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>; Bartlomiej
> Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> <paolo.valente@linaro.org>; Linus Walleij <linus.walleij@linaro.org>
> Subject: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
> 
> This switches also the multiple-command ioctl() call to issue all ioctl()s
> through the block layer instead of going directly to the device.
> 
> We extend the passed argument with an argument count and loop over all
> passed commands in the ioctl() issue function called from the block layer.
> 
> By doing this we are again loosening the grip on the big host lock, since two
> calls to mmc_get_card()/mmc_put_card() are removed.
> 
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Avri Altman <Avri.Altman@sandisk.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option.
  2017-05-23  9:05         ` Linus Walleij
@ 2017-05-23 10:08           ` Arnd Bergmann
  2017-05-23 18:24           ` Pierre Ossman
  1 sibling, 0 replies; 31+ messages in thread
From: Arnd Bergmann @ 2017-05-23 10:08 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Steven J. Hill, Pierre Ossman, Vinod Koul,
	Bartlomiej Zolnierkiewicz, linux-mmc, Ulf Hansson, Adrian Hunter,
	linux-block, Jens Axboe, Christoph Hellwig, Paolo Valente,
	Daniel Mack, Marc Zyngier

On Tue, May 23, 2017 at 11:05 AM, Linus Walleij
<linus.walleij@linaro.org> wrote:
> On Fri, May 19, 2017 at 9:30 AM, Steven J. Hill <Steven.Hill@cavium.com> wrote:
>
>> Remove MMC bounce buffer config option and associated code. This
>> is proposed in addition to Linus' changes to remove the config
>> option. I have tested this on our Octeon hardware platforms.
>>
>> Signed-off-by: Steven J. Hill <Steven.Hill@cavium.com>
>
> This would have to be rebased as Ulf merged my patch making this
> a per-host runtime config option. (The Kconfig is gone for example.)
>
> Bounce buffers were added by Pierre Ossman for kernel 2.6.23 in
> commit 98ccf14909ba02a41c5925b0b2c92aeeef23d3b9
> "mmc: bounce requests for simple hosts"
>
> Quote:
>
>     Some hosts cannot do scatter/gather in hardware. Since not doing sg
>     is such a big performance hit, we (optionally) bounce the requests
>     to a simple linear buffer that we hand over to the driver.
>
>     Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
>
> So this runs the risk of reducing performance on simple MMC/SD
> controllers. Notice: simple, not old.

Right, we definitely need to have something that allows us
to send larger commands to the device for scattered requests.

It wouldn't hurt to reconsider whether the current method is
the most efficient way of doing this, but that doesn't seem to
be a priority to me. If the MMC bounce buffers get in the way
of the blk-mq conversion, we could try to come up with a
different implementation as part of that conversion.

> We need to know if people are deploying simple controllers still
> and if this is something that really affects their performance.
>
> That said: this was put in place because the kernel was sending
> SG lists that the host DMA could not manage.
>
> Nowadays we have two mechanisms:
>
> - DMA engine and DMA-API that help out in managing bounce
>   buffers when used. This means this only is useful for hardware
>   that does autonomous DMA, without any separate DMA engine.

I think the bounce buffers in the DMA-API (swiotlb) should be
kept separate since it has a completely different purpose of
bouncing pages within the dma mask, rather than linearizing
a sglist.

If I read this right, we used to use blk-bounce for highmem
pages and mmc-bounce for lowmem until 2ff1fa679115 ("mmc_block:
bounce buffer highmem support") in 2008, now we don't
use blk-bounce at all if mmc-bounce is in use.

> - CMA that can actually allocate a big chunk of memory: I think
>   this original code is restricted to a 64KB segment because
>   kmalloc() will only guarantee contiguous physical memory up to
>   64-128KiB or so. Now we could actually allocate a big badass
>   CMA buffer if that improves the performance, and that would be
>   a per-host setting.
>
> It would be good to hear from people seeing benefits from bounce
> buffers about this. What hardware is there that actually sees a
> significant improvement with bounce buffers?

The mmc-bounce code is active on hosts that have 'max_segs=1'
(the default unless they override it). These set it to '1'
explicitly:

drivers/mmc/host/au1xmmc.c:     mmc->max_segs = AU1XMMC_DESCRIPTOR_COUNT;
drivers/mmc/host/bfin_sdh.c:    mmc->max_segs = 1; /* only BF51x */
drivers/mmc/host/sdhci.c:               mmc->max_segs = 1; /* with SDHCI_USE_SDMA */
drivers/mmc/host/ushc.c:        mmc->max_segs      = 1;
drivers/mmc/host/via-sdmmc.c:   mmc->max_segs = 1;

These have max_segs=1 but ask for bounce buffers to be
disabled, which will severely limit their performance:

drivers/mmc/host/cavium.c:              mmc->max_segs = 1; /* cavium,octeon-6130-mmc */
drivers/mmc/host/pxamci.c: mmc->max_segs = NR_SG;

These ones don't seem to set it at all, defaulting to '1':
drivers/mmc/host/cb710-mmc.c
drivers/mmc/host/moxart-mmc.c
drivers/mmc/host/sdricoh_cs.c
drivers/mmc/host/toshsd.c

       Arnd

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
  2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
@ 2017-05-23 10:51     ` Avri Altman
  2017-05-16  9:21   ` Ulf Hansson
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Avri Altman @ 2017-05-23 10:51 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Adrian Hunter
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente



> -----Original Message-----
> From: Avri Altman
> Sent: Tuesday, May 23, 2017 12:21 PM
> To: 'Linus Walleij' <linus.walleij@linaro.org>; linux-mmc@vger.kernel.org; Ulf
> Hansson <ulf.hansson@linaro.org>; Adrian Hunter
> <adrian.hunter@intel.com>
> Cc: linux-block@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Christoph
> Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>; Bartlomiej
> Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> <paolo.valente@linaro.org>
> Subject: RE: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
> 
> 
> 
> > -----Original Message-----
> > From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> > owner@vger.kernel.org] On Behalf Of Linus Walleij
> > Sent: Wednesday, May 10, 2017 11:24 AM
> > To: linux-mmc@vger.kernel.org; Ulf Hansson <ulf.hansson@linaro.org>;
> > Adrian Hunter <adrian.hunter@intel.com>
> > Cc: linux-block@vger.kernel.org; Jens Axboe <axboe@kernel.dk>;
> > Christoph Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>;
> > Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> > <paolo.valente@linaro.org>; Linus Walleij <linus.walleij@linaro.org>
> > Subject: [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer
> >
> > This switches also the multiple-command ioctl() call to issue all
> > ioctl()s through the block layer instead of going directly to the device.
> >
> > We extend the passed argument with an argument count and loop over all
> > passed commands in the ioctl() issue function called from the block layer.
> >
> > By doing this we are again loosening the grip on the big host lock,
> > since two calls to mmc_get_card()/mmc_put_card() are removed.
> >
> > Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> Tested-by: Avri Altman <Avri.Altman@sandisk.com>

A few more words - we used mmc_utils to run FFU (field firmware update), which uses the multi-ioctl.
We used an Intel NUC with SanDisk storage - this way we could verify that indeed our fw is updated properly.

Cheers,
Avri

^ permalink raw reply	[flat|nested] 31+ messages in thread


* Re: [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option.
  2017-05-23  9:05         ` Linus Walleij
  2017-05-23 10:08           ` Arnd Bergmann
@ 2017-05-23 18:24           ` Pierre Ossman
  1 sibling, 0 replies; 31+ messages in thread
From: Pierre Ossman @ 2017-05-23 18:24 UTC (permalink / raw)
  To: Linus Walleij, Steven J. Hill, Vinod Koul
  Cc: Bartlomiej Zolnierkiewicz, linux-mmc, Ulf Hansson, Adrian Hunter,
	linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Paolo Valente, Daniel Mack, Marc Zyngier

On 23/05/17 11:05, Linus Walleij wrote:
> 
> Pierre, what host were you developing this for? Maybe I can try
> to get the same and test it.
> 

I think the work primarily was done with the Texas Instruments SDHCI 
host that HP used in their laptops. Ricoh controllers were also popular, 
but they had some proprietary mode that you had to get them out of.

If I recall correctly, all early SDHCI controllers needed this feature. 
I think scatter/gather was a later addition to the spec.

I take it updates to the SD specification haven't fixed this performance
issue?

Regards
-- 
Pierre Ossman

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-05-23 18:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-10  8:24 [PATCH 0/5] mmc: core: modernize ioctl() requests Linus Walleij
2017-05-10  8:24 ` [PATCH 1/5] mmc: core: Delete bounce buffer Kconfig option Linus Walleij
2017-05-15 11:55   ` Ulf Hansson
2017-05-15 14:04   ` Bartlomiej Zolnierkiewicz
2017-05-18  7:48     ` Linus Walleij
2017-05-19  7:30       ` [RFC PATCH] mmc: core: Remove CONFIG_MMC_BLOCK_BOUNCE option Steven J. Hill
2017-05-23  9:05         ` Linus Walleij
2017-05-23 10:08           ` Arnd Bergmann
2017-05-23 18:24           ` Pierre Ossman
2017-05-10  8:24 ` [PATCH 2/5] mmc: core: Allocate per-request data using the block layer core Linus Walleij
2017-05-16  9:02   ` Ulf Hansson
2017-05-18  8:01     ` Linus Walleij
2017-05-16 11:54   ` Adrian Hunter
2017-05-18  8:21     ` Linus Walleij
2017-05-18 12:42       ` Adrian Hunter
2017-05-18 13:31         ` Linus Walleij
2017-05-10  8:24 ` [PATCH 3/5] mmc: block: Tag is_rpmb as bool Linus Walleij
2017-05-10  8:24 ` [PATCH 4/5] mmc: block: move single ioctl() commands to block requests Linus Walleij
2017-05-16  9:15   ` Ulf Hansson
2017-05-23  8:14   ` Avri Altman
2017-05-23  8:14     ` Avri Altman
2017-05-10  8:24 ` [PATCH 5/5] mmc: block: move multi-ioctl() to use block layer Linus Walleij
2017-05-12 21:09   ` Avri Altman
2017-05-12 21:09     ` Avri Altman
2017-05-18  9:26     ` Linus Walleij
2017-05-16  9:21   ` Ulf Hansson
2017-05-23  9:21   ` Avri Altman
2017-05-23  9:21     ` Avri Altman
2017-05-23 10:51   ` Avri Altman
2017-05-23 10:51     ` Avri Altman
2017-05-11 21:08 ` [PATCH 0/5] mmc: core: modernize ioctl() requests Ulf Hansson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.