* [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
From: Per Forlin @ 2011-04-06 19:07 UTC
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

How significant is the cache maintenance overhead?
It depends. eMMC devices are much faster now than they were a few
years ago, while cache maintenance costs more due to multiple cache
levels and speculative cache pre-fetch. Relative to the transfer time,
the cost of handling the caches has increased and is now a bottleneck
when dealing with fast eMMC together with DMA.

The intention of introducing non-blocking mmc requests is to minimize
the time between one mmc request ending and the next one starting. In
the current implementation the MMC controller is idle while dma_map_sg
and dma_unmap_sg are running. Introducing non-blocking mmc requests
makes it possible to prepare the caches for the next job in parallel
with an active mmc request, as sketched below.
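
A rough sketch of the intended pipelining (illustration only, drawn
from the description above):

  blocking:      |map 1|xfer 1|unmap 1|map 2|xfer 2|unmap 2| ...

  non-blocking:  |map 1|xfer 1  |xfer 2  |xfer 3  | ...
                       |map 2   |map 3   |
                                |unmap 1 |unmap 2 |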

This is done by making issue_rw_rq() non-blocking.
The increase in throughput is proportional to the time it takes to
prepare a request (the major part of the preparation is dma_map_sg and
dma_unmap_sg) and to how fast the memory device is. The faster the
MMC/SD is, the more significant the prepare-request time becomes.
Measurements on U5500 and Panda with eMMC and SD show a significant
performance gain for large reads when running in DMA mode. In the PIO
case the performance is unchanged.

There are two optional hooks, pre_req() and post_req(), that the host
driver may implement in order to move work to before and after the
actual mmc_request function is called. In the DMA case pre_req() may do
dma_map_sg() and prepare the dma descriptor, and post_req() runs
dma_unmap_sg().
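
As a minimal sketch of the calling side (modelled on the mmc_test loop
added in patch 03, not a verbatim excerpt; more_requests() and
prepare_mrq() are hypothetical helpers), a caller rotating between two
struct mmc_request buffers could look like:

	struct mmc_request mrq[2];	/* two rotating requests */
	struct mmc_request *cur = &mrq[0], *prev = NULL;

	while (more_requests()) {	/* hypothetical helper */
		prepare_mrq(cur);	/* hypothetical helper */
		/* cache maintenance for 'cur' while 'prev' is active */
		mmc_pre_req(host, cur, !prev);

		if (prev)
			mmc_wait_for_req_done(prev);

		mmc_start_req(host, cur);

		/* clean up 'prev' only after 'cur' has started */
		if (prev)
			mmc_post_req(host, prev, 0);

		prev = cur;
		cur = (cur == &mrq[0]) ? &mrq[1] : &mrq[0];
	}
	if (prev) {
		mmc_wait_for_req_done(prev);
		mmc_post_req(host, prev, 0);
	}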

Details on measurements from IOZone and mmc_test:
https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Specs/StoragePerfMMC-async-req

Changes since v1:
 * Add support for omap_hsmmc
 * Add tests in mmc_test to compare performance with
   and without non-blocking requests.
 * Add random fault injection in mmc core to exercise error
   handling in the mmc block code.
 * Fix several issues in the mmc block error handling.
 * Add a host_cookie member in mmc_data to be used by
   pre_req to mark the data. The host driver will then
   check this mark to see if the data is prepared or not.
 * Previous patch subject was
   "add double buffering for mmc block requests".

Per Forlin (12):
  mmc: add non-blocking mmc request function
  mmc: mmc_test: add debugfs file to list all tests
  mmc: mmc_test: add test for non-blocking transfers
  mmc: add member in mmc queue struct to hold request data
  mmc: add a block request prepare function
  mmc: move error code in mmc_block_issue_rw_rq to a separate function.
  mmc: add a second mmc queue request member
  mmc: add handling for two parallel block requests in issue_rw_rq
  mmc: test: add random fault injection in core.c
  omap_hsmmc: use original sg_len for dma_unmap_sg
  omap_hsmmc: add support for pre_req and post_req
  mmci: implement pre_req() and post_req()

 drivers/mmc/card/block.c      |  493 +++++++++++++++++++++++++++--------------
 drivers/mmc/card/mmc_test.c   |  342 ++++++++++++++++++++++++++++-
 drivers/mmc/card/queue.c      |  171 +++++++++------
 drivers/mmc/card/queue.h      |   31 ++-
 drivers/mmc/core/core.c       |  132 ++++++++++-
 drivers/mmc/core/debugfs.c    |    5 +
 drivers/mmc/host/mmci.c       |  146 +++++++++++-
 drivers/mmc/host/mmci.h       |    8 +
 drivers/mmc/host/omap_hsmmc.c |   90 +++++++-
 include/linux/mmc/core.h      |    9 +-
 include/linux/mmc/host.h      |   13 +-
 lib/Kconfig.debug             |   11 +
 12 files changed, 1172 insertions(+), 279 deletions(-)

-- 
1.7.4.1


* [PATCH v2 01/12] mmc: add non-blocking mmc request function
From: Per Forlin @ 2011-04-06 19:07 UTC
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Previously there was only one function, mmc_wait_for_req,
to start and wait for a request. This patch adds:
 * mmc_start_req - starts a request without waiting
 * mmc_wait_for_req_done - waits until the request is done
 * mmc_pre_req - asks the host driver to prepare for the next job
 * mmc_post_req - asks the host driver to clean up after a completed job

The intention is to use pre_req() and post_req() to do cache maintenance
while a request is active. pre_req() can be called while a request is
active to minimize the latency of starting the next job. post_req() can
be called after the next job is started, to clean up the previous
request. This minimizes the host driver's request-end latency.
post_req() is typically used before ending the block request and
handing the buffer over to the block layer.

Add a host-private member in mmc_data to be used by
pre_req() to mark the data. The host driver will then
check this mark to see whether the data is prepared or not.
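
For illustration only (a sketch along the lines of the mmci and
omap_hsmmc patches later in this series, not code taken from them;
my_pre_req(), my_post_req() and MY_COOKIE_MAPPED are made up), a
DMA-based host driver could implement the hooks roughly as:

	#define MY_COOKIE_MAPPED	1	/* any non-zero value */

	static void my_pre_req(struct mmc_host *host,
			       struct mmc_request *mrq, bool is_first_req)
	{
		struct mmc_data *data = mrq->data;

		if (!data || data->host_cookie)
			return;

		/* cache maintenance done here, possibly in parallel
		 * with an active transfer */
		if (dma_map_sg(mmc_dev(host), data->sg, data->sg_len,
			       (data->flags & MMC_DATA_WRITE) ?
			       DMA_TO_DEVICE : DMA_FROM_DEVICE))
			data->host_cookie = MY_COOKIE_MAPPED;
	}

	static void my_post_req(struct mmc_host *host,
				struct mmc_request *mrq, int err)
	{
		struct mmc_data *data = mrq->data;

		if (data && data->host_cookie == MY_COOKIE_MAPPED) {
			dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len,
				     (data->flags & MMC_DATA_WRITE) ?
				     DMA_TO_DEVICE : DMA_FROM_DEVICE);
			data->host_cookie = 0;
		}
	}

The driver's request() callback then checks data->host_cookie and only
maps the sg list itself when the data was not already prepared.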

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/core/core.c  |   78 ++++++++++++++++++++++++++++++++++++++++------
 include/linux/mmc/core.h |    9 +++++-
 include/linux/mmc/host.h |    9 +++++
 3 files changed, 85 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 1f453ac..e88dd36 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -198,30 +198,88 @@ mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 
 static void mmc_wait_done(struct mmc_request *mrq)
 {
-	complete(mrq->done_data);
+	complete(&mrq->completion);
 }
 
 /**
- *	mmc_wait_for_req - start a request and wait for completion
+ *	mmc_pre_req - Prepare for a new request
+ *	@host: MMC host to prepare command
+ *	@mrq: MMC request to prepare for
+ *	@is_first_req: true if there is no previously started request
+ *                     that may run in parallel to this call, otherwise false
+ *
+ *	mmc_pre_req() is called prior to mmc_start_req() to let the
+ *	host prepare for the new request. Preparation of a request may be
+ *	performed while another request is running on the host.
+ */
+void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq,
+		 bool is_first_req)
+{
+	if (host->ops->pre_req)
+		host->ops->pre_req(host, mrq, is_first_req);
+}
+EXPORT_SYMBOL(mmc_pre_req);
+
+/**
+ *	mmc_post_req - Post process a completed request
+ *	@host: MMC host to post process command
+ *	@mrq: MMC request to post process for
+ *	@err: error; if non-zero, clean up any resources made in pre_req
+ *
+ *	Let the host post process a completed request. Post processing of
+ *	a request may be performed while another request is running.
+ */
+void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq, int err)
+{
+	if (host->ops->post_req)
+		host->ops->post_req(host, mrq, err);
+}
+EXPORT_SYMBOL(mmc_post_req);
+
+/**
+ *	mmc_start_req - start a request
  *	@host: MMC host to start command
  *	@mrq: MMC request to start
  *
- *	Start a new MMC custom command request for a host, and wait
- *	for the command to complete. Does not attempt to parse the
- *	response.
+ *	Start a new MMC custom command request for a host.
+ *	Does not wait for the command to complete.
  */
-void mmc_wait_for_req(struct mmc_host *host, struct mmc_request *mrq)
+void mmc_start_req(struct mmc_host *host, struct mmc_request *mrq)
 {
-	DECLARE_COMPLETION_ONSTACK(complete);
-
-	mrq->done_data = &complete;
+	init_completion(&mrq->completion);
 	mrq->done = mmc_wait_done;
 
 	mmc_start_request(host, mrq);
+}
+EXPORT_SYMBOL(mmc_start_req);
 
-	wait_for_completion(&complete);
+/**
+ *	mmc_wait_for_req_done - wait for completion of request
+ *	@mrq: MMC request to wait for
+ *
+ *	Wait for the command to complete. Does not attempt to parse the
+ *	response.
+ */
+void mmc_wait_for_req_done(struct mmc_request *mrq)
+{
+	wait_for_completion(&mrq->completion);
 }
+EXPORT_SYMBOL(mmc_wait_for_req_done);
 
+/**
+ *	mmc_wait_for_req - start a request and wait for completion
+ *	@host: MMC host to start command
+ *	@mrq: MMC request to start
+ *
+ *	Start a new MMC custom command request for a host, and wait
+ *	for the command to complete. Does not attempt to parse the
+ *	response.
+ */
+void mmc_wait_for_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	mmc_start_req(host, mrq);
+	mmc_wait_for_req_done(mrq);
+}
 EXPORT_SYMBOL(mmc_wait_for_req);
 
 /**
diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index 07f27af..5bbfb71 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -117,6 +117,7 @@ struct mmc_data {
 
 	unsigned int		sg_len;		/* size of scatter list */
 	struct scatterlist	*sg;		/* I/O scatter list */
+	s32			host_cookie;	/* host private data */
 };
 
 struct mmc_request {
@@ -124,13 +125,19 @@ struct mmc_request {
 	struct mmc_data		*data;
 	struct mmc_command	*stop;
 
-	void			*done_data;	/* completion data */
+	struct completion	completion;
 	void			(*done)(struct mmc_request *);/* completion function */
 };
 
 struct mmc_host;
 struct mmc_card;
 
+extern void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq,
+			bool is_first_req);
+extern void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
+			 int err);
+extern void mmc_start_req(struct mmc_host *host, struct mmc_request *mrq);
+extern void mmc_wait_for_req_done(struct mmc_request *mrq);
 extern void mmc_wait_for_req(struct mmc_host *, struct mmc_request *);
 extern int mmc_wait_for_cmd(struct mmc_host *, struct mmc_command *, int);
 extern int mmc_wait_for_app_cmd(struct mmc_host *, struct mmc_card *,
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index bcb793e..c056a3d 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -88,6 +88,15 @@ struct mmc_host_ops {
 	 */
 	int (*enable)(struct mmc_host *host);
 	int (*disable)(struct mmc_host *host, int lazy);
+	/*
+	 * It is optional for the host to implement pre_req and post_req in
+	 * order to support double buffering of requests (prepare one
+	 * request while another request is active).
+	 */
+	void	(*post_req)(struct mmc_host *host, struct mmc_request *req,
+			    int err);
+	void	(*pre_req)(struct mmc_host *host, struct mmc_request *req,
+			   bool is_first_req);
 	void	(*request)(struct mmc_host *host, struct mmc_request *req);
 	/*
 	 * Avoid calling these three functions too often or in a "fast path",
-- 
1.7.4.1


* [PATCH v2 02/12] mmc: mmc_test: add debugfs file to list all tests
From: Per Forlin @ 2011-04-06 19:07 UTC
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Add a debugfs file "testlist" to print all available tests
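
With debugfs mounted, the list can then be read from the card's debugfs
directory, e.g. /sys/kernel/debug/mmc0/<card>/testlist (the exact path
depends on the host and card).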

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/mmc_test.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index f5cedec..466cdb5 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -2447,6 +2447,32 @@ static const struct file_operations mmc_test_fops_test = {
 	.release	= single_release,
 };
 
+static int mtf_testlist_show(struct seq_file *sf, void *data)
+{
+	int i;
+
+	mutex_lock(&mmc_test_lock);
+
+	for (i = 0; i < ARRAY_SIZE(mmc_test_cases); i++)
+		seq_printf(sf, "%d:\t%s\n", i+1, mmc_test_cases[i].name);
+
+	mutex_unlock(&mmc_test_lock);
+
+	return 0;
+}
+
+static int mtf_testlist_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mtf_testlist_show, inode->i_private);
+}
+
+static const struct file_operations mmc_test_fops_testlist = {
+	.open		= mtf_testlist_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 static void mmc_test_free_file_test(struct mmc_card *card)
 {
 	struct mmc_test_dbgfs_file *df, *dfs;
@@ -2476,6 +2502,10 @@ static int mmc_test_register_file_test(struct mmc_card *card)
 		file = debugfs_create_file("test", S_IWUSR | S_IRUGO,
 			card->debugfs_root, card, &mmc_test_fops_test);
 
+	if (card->debugfs_root)
+		file = debugfs_create_file("testlist", S_IRUGO,
+			card->debugfs_root, card, &mmc_test_fops_testlist);
+
 	if (IS_ERR_OR_NULL(file)) {
 		dev_err(&card->dev,
 			"Can't create file. Perhaps debugfs is disabled.\n");
-- 
1.7.4.1


* [PATCH v2 03/12] mmc: mmc_test: add test for non-blocking transfers
From: Per Forlin @ 2011-04-06 19:07 UTC
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Add four tests for read and write performance at
different transfer sizes, 4k to 4M:
 * Read using blocking mmc requests
 * Read using non-blocking mmc requests
 * Write using blocking mmc requests
 * Write using non-blocking mmc requests

The host driver must support pre_req() and post_req()
in order to run the non-blocking test cases.
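
The tests are run in the usual mmc_test way: with debugfs mounted,
write the test number (as listed by the "testlist" file from the
previous patch) to the card's "test" debugfs file, and read the
results from the kernel log.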

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/mmc_test.c |  312 +++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 304 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 466cdb5..1000383 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -22,6 +22,7 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/seq_file.h>
+#include <linux/random.h>
 
 #define RESULT_OK		0
 #define RESULT_FAIL		1
@@ -51,10 +52,12 @@ struct mmc_test_pages {
  * struct mmc_test_mem - allocated memory.
  * @arr: array of allocations
  * @cnt: number of allocations
+ * @size_min_cmn: smallest size among the allocations in @arr
  */
 struct mmc_test_mem {
 	struct mmc_test_pages *arr;
 	unsigned int cnt;
+	unsigned int size_min_cmn;
 };
 
 /**
@@ -148,6 +151,21 @@ struct mmc_test_card {
 	struct mmc_test_general_result	*gr;
 };
 
+enum mmc_test_prep_media {
+	MMC_TEST_PREP_NONE = 0,
+	MMC_TEST_PREP_WRITE_FULL = 1 << 0,
+	MMC_TEST_PREP_ERASE = 1 << 1,
+};
+
+struct mmc_test_multiple_rw {
+	unsigned int *bs;
+	unsigned int len;
+	unsigned int size;
+	bool do_write;
+	bool do_nonblock_req;
+	enum mmc_test_prep_media prepare;
+};
+
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -307,6 +325,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 	unsigned long max_seg_page_cnt = DIV_ROUND_UP(max_seg_sz, PAGE_SIZE);
 	unsigned long page_cnt = 0;
 	unsigned long limit = nr_free_buffer_pages() >> 4;
+	unsigned int min_cmn = 0;
 	struct mmc_test_mem *mem;
 
 	if (max_page_cnt > limit)
@@ -350,6 +369,12 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 		mem->arr[mem->cnt].page = page;
 		mem->arr[mem->cnt].order = order;
 		mem->cnt += 1;
+		if (!min_cmn)
+			min_cmn = PAGE_SIZE << order;
+		else
+			min_cmn = min(min_cmn,
+				      (unsigned int) (PAGE_SIZE << order));
+
 		if (max_page_cnt <= (1UL << order))
 			break;
 		max_page_cnt -= 1UL << order;
@@ -360,6 +385,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 			break;
 		}
 	}
+	mem->size_min_cmn = min_cmn;
 
 	return mem;
 
@@ -386,7 +412,6 @@ static int mmc_test_map_sg(struct mmc_test_mem *mem, unsigned long sz,
 	do {
 		for (i = 0; i < mem->cnt; i++) {
 			unsigned long len = PAGE_SIZE << mem->arr[i].order;
-
 			if (len > sz)
 				len = sz;
 			if (len > max_seg_sz)
@@ -725,6 +750,94 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 }
 
 /*
+ * Tests nonblock transfer with certain parameters
+ */
+static void mmc_test_nonblock_reset(struct mmc_request *mrq,
+				    struct mmc_command *cmd,
+				    struct mmc_command *stop,
+				    struct mmc_data *data)
+{
+	memset(mrq, 0, sizeof(struct mmc_request));
+	memset(cmd, 0, sizeof(struct mmc_command));
+	memset(data, 0, sizeof(struct mmc_data));
+	memset(stop, 0, sizeof(struct mmc_command));
+
+	mrq->cmd = cmd;
+	mrq->data = data;
+	mrq->stop = stop;
+}
+static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
+				      struct scatterlist *sg, unsigned sg_len,
+				      unsigned dev_addr, unsigned blocks,
+				      unsigned blksz, int write, int count)
+{
+	struct mmc_request mrq1;
+	struct mmc_command cmd1;
+	struct mmc_command stop1;
+	struct mmc_data data1;
+
+	struct mmc_request mrq2;
+	struct mmc_command cmd2;
+	struct mmc_command stop2;
+	struct mmc_data data2;
+
+	struct mmc_request *cur_mrq;
+	struct mmc_request *prev_mrq;
+	int i;
+	int ret = 0;
+
+	if (!test->card->host->ops->pre_req ||
+		!test->card->host->ops->post_req)
+		return -RESULT_UNSUP_HOST;
+
+	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+	mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+
+	cur_mrq = &mrq1;
+	prev_mrq = NULL;
+
+	for (i = 0; i < count; i++) {
+		mmc_test_prepare_mrq(test, cur_mrq, sg, sg_len, dev_addr,
+				blocks, blksz, write);
+		mmc_pre_req(test->card->host, cur_mrq, !prev_mrq);
+
+		if (prev_mrq) {
+			mmc_wait_for_req_done(prev_mrq);
+			mmc_test_wait_busy(test);
+			ret = mmc_test_check_result(test, prev_mrq);
+			if (ret)
+				goto err;
+		}
+
+		mmc_start_req(test->card->host, cur_mrq);
+
+		if (prev_mrq)
+			mmc_post_req(test->card->host, prev_mrq, 0);
+
+		prev_mrq = cur_mrq;
+		if (cur_mrq == &mrq1) {
+			mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+			cur_mrq = &mrq2;
+		} else {
+			mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+			cur_mrq = &mrq1;
+		}
+		dev_addr += blocks;
+	}
+
+	mmc_wait_for_req_done(prev_mrq);
+	mmc_test_wait_busy(test);
+	ret = mmc_test_check_result(test, prev_mrq);
+	if (ret)
+		goto err;
+	mmc_post_req(test->card->host, prev_mrq, 0);
+
+	return ret;
+err:
+	return ret;
+}
+
+/*
  * Tests a basic transfer with certain parameters
  */
 static int mmc_test_simple_transfer(struct mmc_test_card *test,
@@ -1351,14 +1464,17 @@ static int mmc_test_area_transfer(struct mmc_test_card *test,
 }
 
 /*
- * Map and transfer bytes.
+ * Map and transfer bytes for multiple transfers.
  */
-static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
-			    unsigned int dev_addr, int write, int max_scatter,
-			    int timed)
+static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
+				unsigned int dev_addr, int write,
+				int max_scatter, int timed, int count,
+				bool nonblock)
 {
 	struct timespec ts1, ts2;
-	int ret;
+	int ret = 0;
+	int i;
+	struct mmc_test_area *t = &test->area;
 
 	/*
 	 * In the case of a maximally scattered transfer, the maximum transfer
@@ -1382,8 +1498,15 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 
 	if (timed)
 		getnstimeofday(&ts1);
+	if (nonblock)
+		ret = mmc_test_nonblock_transfer(test, t->sg, t->sg_len,
+				 dev_addr, t->blocks, 512, write, count);
+	else
+		for (i = 0; i < count && ret == 0; i++) {
+			ret = mmc_test_area_transfer(test, dev_addr, write);
+			dev_addr += sz >> 9;
+		}
 
-	ret = mmc_test_area_transfer(test, dev_addr, write);
 	if (ret)
 		return ret;
 
@@ -1391,11 +1514,19 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 		getnstimeofday(&ts2);
 
 	if (timed)
-		mmc_test_print_rate(test, sz, &ts1, &ts2);
+		mmc_test_print_avg_rate(test, sz, count, &ts1, &ts2);
 
 	return 0;
 }
 
+static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
+			    unsigned int dev_addr, int write, int max_scatter,
+			    int timed)
+{
+	return mmc_test_area_io_seq(test, sz, dev_addr, write, max_scatter,
+				    timed, 1, false);
+}
+
 /*
  * Write the test area entirely.
  */
@@ -1956,6 +2087,144 @@ static int mmc_test_large_seq_write_perf(struct mmc_test_card *test)
 	return mmc_test_large_seq_perf(test, 1);
 }
 
+static int mmc_test_rw_multiple(struct mmc_test_card *test,
+				struct mmc_test_multiple_rw *tdata,
+				unsigned int reqsize, unsigned int size)
+{
+	unsigned int dev_addr;
+	struct mmc_test_area *t = &test->area;
+	int ret = 0;
+	int max_reqsize = max(t->mem->size_min_cmn *
+			      min(t->max_segs, t->mem->cnt), t->max_tfr);
+
+	/* Set up test area */
+	if (size > mmc_test_capacity(test->card) / 2 * 512)
+		size = mmc_test_capacity(test->card) / 2 * 512;
+	if (reqsize > max_reqsize)
+		reqsize = max_reqsize;
+	dev_addr = mmc_test_capacity(test->card) / 4;
+	if ((dev_addr & 0xffff0000))
+		dev_addr &= 0xffff0000; /* Round to 64MiB boundary */
+	else
+		dev_addr &= 0xfffff800; /* Round to 1MiB boundary */
+	if (!dev_addr)
+		goto err;
+
+	/* prepare test area */
+	if (mmc_can_erase(test->card) &&
+	    tdata->prepare & MMC_TEST_PREP_ERASE) {
+		ret = mmc_erase(test->card, dev_addr,
+				size / 512, MMC_SECURE_ERASE_ARG);
+		if (ret)
+			ret = mmc_erase(test->card, dev_addr,
+					size / 512, MMC_ERASE_ARG);
+		if (ret)
+			goto err;
+	}
+
+	/* Run test */
+	ret = mmc_test_area_io_seq(test, reqsize, dev_addr,
+				   tdata->do_write, 0, 1, size / reqsize,
+				   tdata->do_nonblock_req);
+	if (ret)
+		goto err;
+
+	return ret;
+ err:
+	printk(KERN_INFO "[%s] error\n", __func__);
+	return ret;
+}
+
+static int mmc_test_rw_multiple_size(struct mmc_test_card *test,
+				     struct mmc_test_multiple_rw *rw)
+{
+	int ret = 0;
+	int i;
+
+	for (i = 0 ; i < rw->len && ret == 0; i++) {
+		ret = mmc_test_rw_multiple(test, rw, rw->bs[i], rw->size);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
+/*
+ * Multiple blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+};
+
+/*
+ * Multiple non-blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple non-blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
 static const struct mmc_test_case mmc_test_cases[] = {
 	{
 		.name = "Basic write (no data verification)",
@@ -2223,6 +2492,33 @@ static const struct mmc_test_case mmc_test_cases[] = {
 		.cleanup = mmc_test_area_cleanup,
 	},
 
+	{
+		.name = "Write performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Write performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
 };
 
 static DEFINE_MUTEX(mmc_test_lock);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linaro-dev-cunTk1MwBs8s++Sfvej+rw
  Cc: Chris Ball

Add four tests for read and write performance per
different transfer size, 4k to 4M.
 * Read using blocking mmc request
 * Read using none blocking mmc request
 * Write using blocking mmc request
 * Write using none blocking mmc request

The host dirver must support pre_req() and post_req()
in order to run the none blocking test cases.

Signed-off-by: Per Forlin <per.forlin-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/mmc/card/mmc_test.c |  312 +++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 304 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 466cdb5..1000383 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -22,6 +22,7 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/seq_file.h>
+#include <linux/random.h>
 
 #define RESULT_OK		0
 #define RESULT_FAIL		1
@@ -51,10 +52,12 @@ struct mmc_test_pages {
  * struct mmc_test_mem - allocated memory.
  * @arr: array of allocations
  * @cnt: number of allocations
+ * @size_min_cmn: lowest common size in array of allocations
  */
 struct mmc_test_mem {
 	struct mmc_test_pages *arr;
 	unsigned int cnt;
+	unsigned int size_min_cmn;
 };
 
 /**
@@ -148,6 +151,21 @@ struct mmc_test_card {
 	struct mmc_test_general_result	*gr;
 };
 
+enum mmc_test_prep_media {
+	MMC_TEST_PREP_NONE = 0,
+	MMC_TEST_PREP_WRITE_FULL = 1 << 0,
+	MMC_TEST_PREP_ERASE = 1 << 1,
+};
+
+struct mmc_test_multiple_rw {
+	unsigned int *bs;
+	unsigned int len;
+	unsigned int size;
+	bool do_write;
+	bool do_nonblock_req;
+	enum mmc_test_prep_media prepare;
+};
+
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -307,6 +325,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 	unsigned long max_seg_page_cnt = DIV_ROUND_UP(max_seg_sz, PAGE_SIZE);
 	unsigned long page_cnt = 0;
 	unsigned long limit = nr_free_buffer_pages() >> 4;
+	unsigned int min_cmn = 0;
 	struct mmc_test_mem *mem;
 
 	if (max_page_cnt > limit)
@@ -350,6 +369,12 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 		mem->arr[mem->cnt].page = page;
 		mem->arr[mem->cnt].order = order;
 		mem->cnt += 1;
+		if (!min_cmn)
+			min_cmn = PAGE_SIZE << order;
+		else
+			min_cmn = min(min_cmn,
+				      (unsigned int) (PAGE_SIZE << order));
+
 		if (max_page_cnt <= (1UL << order))
 			break;
 		max_page_cnt -= 1UL << order;
@@ -360,6 +385,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 			break;
 		}
 	}
+	mem->size_min_cmn = min_cmn;
 
 	return mem;
 
@@ -386,7 +412,6 @@ static int mmc_test_map_sg(struct mmc_test_mem *mem, unsigned long sz,
 	do {
 		for (i = 0; i < mem->cnt; i++) {
 			unsigned long len = PAGE_SIZE << mem->arr[i].order;
-
 			if (len > sz)
 				len = sz;
 			if (len > max_seg_sz)
@@ -725,6 +750,94 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 }
 
 /*
+ * Tests nonblock transfer with certain parameters
+ */
+static void mmc_test_nonblock_reset(struct mmc_request *mrq,
+				    struct mmc_command *cmd,
+				    struct mmc_command *stop,
+				    struct mmc_data *data)
+{
+	memset(mrq, 0, sizeof(struct mmc_request));
+	memset(cmd, 0, sizeof(struct mmc_command));
+	memset(data, 0, sizeof(struct mmc_data));
+	memset(stop, 0, sizeof(struct mmc_command));
+
+	mrq->cmd = cmd;
+	mrq->data = data;
+	mrq->stop = stop;
+}
+static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
+				      struct scatterlist *sg, unsigned sg_len,
+				      unsigned dev_addr, unsigned blocks,
+				      unsigned blksz, int write, int count)
+{
+	struct mmc_request mrq1;
+	struct mmc_command cmd1;
+	struct mmc_command stop1;
+	struct mmc_data data1;
+
+	struct mmc_request mrq2;
+	struct mmc_command cmd2;
+	struct mmc_command stop2;
+	struct mmc_data data2;
+
+	struct mmc_request *cur_mrq;
+	struct mmc_request *prev_mrq;
+	int i;
+	int ret = 0;
+
+	if (!test->card->host->ops->pre_req ||
+		!test->card->host->ops->post_req)
+		return -RESULT_UNSUP_HOST;
+
+	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+	mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+
+	cur_mrq = &mrq1;
+	prev_mrq = NULL;
+
+	for (i = 0; i < count; i++) {
+		mmc_test_prepare_mrq(test, cur_mrq, sg, sg_len, dev_addr,
+				blocks, blksz, write);
+		mmc_pre_req(test->card->host, cur_mrq, !prev_mrq);
+
+		if (prev_mrq) {
+			mmc_wait_for_req_done(prev_mrq);
+			mmc_test_wait_busy(test);
+			ret = mmc_test_check_result(test, prev_mrq);
+			if (ret)
+				goto err;
+		}
+
+		mmc_start_req(test->card->host, cur_mrq);
+
+		if (prev_mrq)
+			mmc_post_req(test->card->host, prev_mrq, 0);
+
+		prev_mrq = cur_mrq;
+		if (cur_mrq == &mrq1) {
+			mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+			cur_mrq = &mrq2;
+		} else {
+			mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+			cur_mrq = &mrq1;
+		}
+		dev_addr += blocks;
+	}
+
+	mmc_wait_for_req_done(prev_mrq);
+	mmc_test_wait_busy(test);
+	ret = mmc_test_check_result(test, prev_mrq);
+	if (ret)
+		goto err;
+	mmc_post_req(test->card->host, prev_mrq, 0);
+
+	return ret;
+err:
+	return ret;
+}
+
+/*
  * Tests a basic transfer with certain parameters
  */
 static int mmc_test_simple_transfer(struct mmc_test_card *test,
@@ -1351,14 +1464,17 @@ static int mmc_test_area_transfer(struct mmc_test_card *test,
 }
 
 /*
- * Map and transfer bytes.
+ * Map and transfer bytes for multiple transfers.
  */
-static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
-			    unsigned int dev_addr, int write, int max_scatter,
-			    int timed)
+static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
+				unsigned int dev_addr, int write,
+				int max_scatter, int timed, int count,
+				bool nonblock)
 {
 	struct timespec ts1, ts2;
-	int ret;
+	int ret = 0;
+	int i;
+	struct mmc_test_area *t = &test->area;
 
 	/*
 	 * In the case of a maximally scattered transfer, the maximum transfer
@@ -1382,8 +1498,15 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 
 	if (timed)
 		getnstimeofday(&ts1);
+	if (nonblock)
+		ret = mmc_test_nonblock_transfer(test, t->sg, t->sg_len,
+				 dev_addr, t->blocks, 512, write, count);
+	else
+		for (i = 0; i < count && ret == 0; i++) {
+			ret = mmc_test_area_transfer(test, dev_addr, write);
+			dev_addr += sz >> 9;
+		}
 
-	ret = mmc_test_area_transfer(test, dev_addr, write);
 	if (ret)
 		return ret;
 
@@ -1391,11 +1514,19 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 		getnstimeofday(&ts2);
 
 	if (timed)
-		mmc_test_print_rate(test, sz, &ts1, &ts2);
+		mmc_test_print_avg_rate(test, sz, count, &ts1, &ts2);
 
 	return 0;
 }
 
+static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
+			    unsigned int dev_addr, int write, int max_scatter,
+			    int timed)
+{
+	return mmc_test_area_io_seq(test, sz, dev_addr, write, max_scatter,
+				    timed, 1, false);
+}
+
 /*
  * Write the test area entirely.
  */
@@ -1956,6 +2087,144 @@ static int mmc_test_large_seq_write_perf(struct mmc_test_card *test)
 	return mmc_test_large_seq_perf(test, 1);
 }
 
+static int mmc_test_rw_multiple(struct mmc_test_card *test,
+				struct mmc_test_multiple_rw *tdata,
+				unsigned int reqsize, unsigned int size)
+{
+	unsigned int dev_addr;
+	struct mmc_test_area *t = &test->area;
+	int ret = 0;
+	int max_reqsize = max(t->mem->size_min_cmn *
+			      min(t->max_segs, t->mem->cnt), t->max_tfr);
+
+	/* Set up test area */
+	if (size > mmc_test_capacity(test->card) / 2 * 512)
+		size = mmc_test_capacity(test->card) / 2 * 512;
+	if (reqsize > max_reqsize)
+		reqsize = max_reqsize;
+	dev_addr = mmc_test_capacity(test->card) / 4;
+	if ((dev_addr & 0xffff0000))
+		dev_addr &= 0xffff0000; /* Round to 64MiB boundary */
+	else
+		dev_addr &= 0xfffff800; /* Round to 1MiB boundary */
+	if (!dev_addr)
+		goto err;
+
+	/* prepare test area */
+	if (mmc_can_erase(test->card) &&
+	    tdata->prepare & MMC_TEST_PREP_ERASE) {
+		ret = mmc_erase(test->card, dev_addr,
+				size / 512, MMC_SECURE_ERASE_ARG);
+		if (ret)
+			ret = mmc_erase(test->card, dev_addr,
+					size / 512, MMC_ERASE_ARG);
+		if (ret)
+			goto err;
+	}
+
+	/* Run test */
+	ret = mmc_test_area_io_seq(test, reqsize, dev_addr,
+				   tdata->do_write, 0, 1, size / reqsize,
+				   tdata->do_nonblock_req);
+	if (ret)
+		goto err;
+
+	return ret;
+ err:
+	printk(KERN_INFO "[%s] error\n", __func__);
+	return ret;
+}
+
+static int mmc_test_rw_multiple_size(struct mmc_test_card *test,
+				     struct mmc_test_multiple_rw *rw)
+{
+	int ret = 0;
+	int i;
+
+	for (i = 0 ; i < rw->len && ret == 0; i++) {
+		ret = mmc_test_rw_multiple(test, rw, rw->bs[i], rw->size);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
+/*
+ * Multiple blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+};
+
+/*
+ * Multiple none blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple none blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
 static const struct mmc_test_case mmc_test_cases[] = {
 	{
 		.name = "Basic write (no data verification)",
@@ -2223,6 +2492,33 @@ static const struct mmc_test_case mmc_test_cases[] = {
 		.cleanup = mmc_test_area_cleanup,
 	},
 
+	{
+		.name = "Write performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Write performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
 };
 
 static DEFINE_MUTEX(mmc_test_lock);
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

Add four tests for read and write performance per
different transfer size, 4k to 4M.
 * Read using blocking mmc request
 * Read using none blocking mmc request
 * Write using blocking mmc request
 * Write using none blocking mmc request

The host dirver must support pre_req() and post_req()
in order to run the none blocking test cases.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/mmc_test.c |  312 +++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 304 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 466cdb5..1000383 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -22,6 +22,7 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/seq_file.h>
+#include <linux/random.h>
 
 #define RESULT_OK		0
 #define RESULT_FAIL		1
@@ -51,10 +52,12 @@ struct mmc_test_pages {
  * struct mmc_test_mem - allocated memory.
  * @arr: array of allocations
  * @cnt: number of allocations
+ * @size_min_cmn: lowest common size in array of allocations
  */
 struct mmc_test_mem {
 	struct mmc_test_pages *arr;
 	unsigned int cnt;
+	unsigned int size_min_cmn;
 };
 
 /**
@@ -148,6 +151,21 @@ struct mmc_test_card {
 	struct mmc_test_general_result	*gr;
 };
 
+enum mmc_test_prep_media {
+	MMC_TEST_PREP_NONE = 0,
+	MMC_TEST_PREP_WRITE_FULL = 1 << 0,
+	MMC_TEST_PREP_ERASE = 1 << 1,
+};
+
+struct mmc_test_multiple_rw {
+	unsigned int *bs;
+	unsigned int len;
+	unsigned int size;
+	bool do_write;
+	bool do_nonblock_req;
+	enum mmc_test_prep_media prepare;
+};
+
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -307,6 +325,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 	unsigned long max_seg_page_cnt = DIV_ROUND_UP(max_seg_sz, PAGE_SIZE);
 	unsigned long page_cnt = 0;
 	unsigned long limit = nr_free_buffer_pages() >> 4;
+	unsigned int min_cmn = 0;
 	struct mmc_test_mem *mem;
 
 	if (max_page_cnt > limit)
@@ -350,6 +369,12 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 		mem->arr[mem->cnt].page = page;
 		mem->arr[mem->cnt].order = order;
 		mem->cnt += 1;
+		if (!min_cmn)
+			min_cmn = PAGE_SIZE << order;
+		else
+			min_cmn = min(min_cmn,
+				      (unsigned int) (PAGE_SIZE << order));
+
 		if (max_page_cnt <= (1UL << order))
 			break;
 		max_page_cnt -= 1UL << order;
@@ -360,6 +385,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 			break;
 		}
 	}
+	mem->size_min_cmn = min_cmn;
 
 	return mem;
 
@@ -386,7 +412,6 @@ static int mmc_test_map_sg(struct mmc_test_mem *mem, unsigned long sz,
 	do {
 		for (i = 0; i < mem->cnt; i++) {
 			unsigned long len = PAGE_SIZE << mem->arr[i].order;
-
 			if (len > sz)
 				len = sz;
 			if (len > max_seg_sz)
@@ -725,6 +750,94 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 }
 
 /*
+ * Tests nonblock transfer with certain parameters
+ */
+static void mmc_test_nonblock_reset(struct mmc_request *mrq,
+				    struct mmc_command *cmd,
+				    struct mmc_command *stop,
+				    struct mmc_data *data)
+{
+	memset(mrq, 0, sizeof(struct mmc_request));
+	memset(cmd, 0, sizeof(struct mmc_command));
+	memset(data, 0, sizeof(struct mmc_data));
+	memset(stop, 0, sizeof(struct mmc_command));
+
+	mrq->cmd = cmd;
+	mrq->data = data;
+	mrq->stop = stop;
+}
+static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
+				      struct scatterlist *sg, unsigned sg_len,
+				      unsigned dev_addr, unsigned blocks,
+				      unsigned blksz, int write, int count)
+{
+	struct mmc_request mrq1;
+	struct mmc_command cmd1;
+	struct mmc_command stop1;
+	struct mmc_data data1;
+
+	struct mmc_request mrq2;
+	struct mmc_command cmd2;
+	struct mmc_command stop2;
+	struct mmc_data data2;
+
+	struct mmc_request *cur_mrq;
+	struct mmc_request *prev_mrq;
+	int i;
+	int ret = 0;
+
+	if (!test->card->host->ops->pre_req ||
+		!test->card->host->ops->post_req)
+		return -RESULT_UNSUP_HOST;
+
+	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+	mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+
+	cur_mrq = &mrq1;
+	prev_mrq = NULL;
+
+	for (i = 0; i < count; i++) {
+		mmc_test_prepare_mrq(test, cur_mrq, sg, sg_len, dev_addr,
+				blocks, blksz, write);
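+		/* prepare cur_mrq (e.g. dma_map_sg()) while prev_mrq may still run */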
+		mmc_pre_req(test->card->host, cur_mrq, !prev_mrq);
+
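+		/* reap the request issued in the previous loop iteration */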
+		if (prev_mrq) {
+			mmc_wait_for_req_done(prev_mrq);
+			mmc_test_wait_busy(test);
+			ret = mmc_test_check_result(test, prev_mrq);
+			if (ret)
+				goto err;
+		}
+
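+		/* start cur_mrq on the host without waiting for completion */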
+		mmc_start_req(test->card->host, cur_mrq);
+
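+		/* cur_mrq is active; release the completed prev_mrq */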
+		if (prev_mrq)
+			mmc_post_req(test->card->host, prev_mrq, 0);
+
+		prev_mrq = cur_mrq;
+		if (cur_mrq == &mrq1) {
+			mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+			cur_mrq = &mrq2;
+		} else {
+			mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+			cur_mrq = &mrq1;
+		}
+		dev_addr += blocks;
+	}
+
+	mmc_wait_for_req_done(prev_mrq);
+	mmc_test_wait_busy(test);
+	ret = mmc_test_check_result(test, prev_mrq);
+	/* release the last request even if its result check failed */
+	mmc_post_req(test->card->host, prev_mrq, 0);
+
+err:
+	return ret;
+}
+
+/*
  * Tests a basic transfer with certain parameters
  */
 static int mmc_test_simple_transfer(struct mmc_test_card *test,
@@ -1351,14 +1464,17 @@ static int mmc_test_area_transfer(struct mmc_test_card *test,
 }
 
 /*
- * Map and transfer bytes.
+ * Map and transfer bytes for a sequence of one or more transfers.
  */
-static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
-			    unsigned int dev_addr, int write, int max_scatter,
-			    int timed)
+static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
+				unsigned int dev_addr, int write,
+				int max_scatter, int timed, int count,
+				bool nonblock)
 {
 	struct timespec ts1, ts2;
-	int ret;
+	int ret = 0;
+	int i;
+	struct mmc_test_area *t = &test->area;
 
 	/*
 	 * In the case of a maximally scattered transfer, the maximum transfer
@@ -1382,8 +1498,15 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 
 	if (timed)
 		getnstimeofday(&ts1);
+	if (nonblock)
+		ret = mmc_test_nonblock_transfer(test, t->sg, t->sg_len,
+				 dev_addr, t->blocks, 512, write, count);
+	else
+		for (i = 0; i < count && ret == 0; i++) {
+			ret = mmc_test_area_transfer(test, dev_addr, write);
+			dev_addr += sz >> 9;
+		}
 
-	ret = mmc_test_area_transfer(test, dev_addr, write);
 	if (ret)
 		return ret;
 
@@ -1391,11 +1514,19 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 		getnstimeofday(&ts2);
 
 	if (timed)
-		mmc_test_print_rate(test, sz, &ts1, &ts2);
+		mmc_test_print_avg_rate(test, sz, count, &ts1, &ts2);
 
 	return 0;
 }
 
+static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
+			    unsigned int dev_addr, int write, int max_scatter,
+			    int timed)
+{
+	return mmc_test_area_io_seq(test, sz, dev_addr, write, max_scatter,
+				    timed, 1, false);
+}
+
 /*
  * Write the test area entirely.
  */
@@ -1956,6 +2087,144 @@ static int mmc_test_large_seq_write_perf(struct mmc_test_card *test)
 	return mmc_test_large_seq_perf(test, 1);
 }
 
+static int mmc_test_rw_multiple(struct mmc_test_card *test,
+				struct mmc_test_multiple_rw *tdata,
+				unsigned int reqsize, unsigned int size)
+{
+	unsigned int dev_addr;
+	struct mmc_test_area *t = &test->area;
+	int ret = 0;
+	int max_reqsize = max(t->mem->size_min_cmn *
+			      min(t->max_segs, t->mem->cnt), t->max_tfr);
+
+	/* Set up test area */
+	if (size > mmc_test_capacity(test->card) / 2 * 512)
+		size = mmc_test_capacity(test->card) / 2 * 512;
+	if (reqsize > max_reqsize)
+		reqsize = max_reqsize;
+	dev_addr = mmc_test_capacity(test->card) / 4;
+	if (dev_addr & 0xffff0000)
+		dev_addr &= 0xffff0000; /* Round to 32MiB boundary (0x10000 sectors) */
+	else
+		dev_addr &= 0xfffff800; /* Round to 1MiB boundary (0x800 sectors) */
+	if (!dev_addr) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	/* prepare test area */
+	if (mmc_can_erase(test->card) &&
+	    tdata->prepare & MMC_TEST_PREP_ERASE) {
+		ret = mmc_erase(test->card, dev_addr,
+				size / 512, MMC_SECURE_ERASE_ARG);
+		if (ret)
+			ret = mmc_erase(test->card, dev_addr,
+					size / 512, MMC_ERASE_ARG);
+		if (ret)
+			goto err;
+	}
+
+	/* Run test */
+	ret = mmc_test_area_io_seq(test, reqsize, dev_addr,
+				   tdata->do_write, 0, 1, size / reqsize,
+				   tdata->do_nonblock_req);
+	if (ret)
+		goto err;
+
+	return ret;
+ err:
+	printk(KERN_INFO "[%s] error %d\n", __func__, ret);
+	return ret;
+}
+
+static int mmc_test_rw_multiple_size(struct mmc_test_card *test,
+				     struct mmc_test_multiple_rw *rw)
+{
+	int ret = 0;
+	int i;
+
+	for (i = 0; i < rw->len; i++) {
+		ret = mmc_test_rw_multiple(test, rw, rw->bs[i], rw->size);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
+/*
+ * Multiple blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple non-blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple non-blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
 static const struct mmc_test_case mmc_test_cases[] = {
 	{
 		.name = "Basic write (no data verification)",
@@ -2223,6 +2492,33 @@ static const struct mmc_test_case mmc_test_cases[] = {
 		.cleanup = mmc_test_area_cleanup,
 	},
 
+	{
+		.name = "Write performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Write performance with non-blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with non-blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
 };
 
 static DEFINE_MUTEX(mmc_test_lock);
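
The loop in mmc_test_nonblock_transfer() above is the pattern the whole
series builds on: prepare request n+1 while request n is still running,
and release a request only once its successor has been started. A
minimal standalone sketch of that pipeline in plain C follows;
prepare()/submit()/wait_done()/release() are hypothetical stand-ins for
mmc_pre_req(), mmc_start_req(), mmc_wait_for_req_done() and
mmc_post_req(), not real API:

#include <stddef.h>

/* All names here are illustrative stand-ins, not MMC core API. */
struct xfer { int id; };

static void prepare(struct xfer *x)   { (void)x; /* e.g. dma_map_sg() */ }
static void submit(struct xfer *x)    { (void)x; /* hand to controller */ }
static int  wait_done(struct xfer *x) { (void)x; return 0; /* 0 = ok */ }
static void release(struct xfer *x)   { (void)x; /* e.g. dma_unmap_sg() */ }

/* usage: struct xfer slots[2]; ret = pipeline(slots, 8); */
static int pipeline(struct xfer slots[2], int count)
{
	struct xfer *cur = &slots[0];
	struct xfer *prev = NULL;
	int i, ret;

	for (i = 0; i < count; i++) {
		prepare(cur);			/* overlaps prev's transfer */
		if (prev) {
			ret = wait_done(prev);	/* reap previous request */
			if (ret) {
				release(prev);
				release(cur);	/* prepared, never started */
				return ret;
			}
		}
		submit(cur);			/* start without blocking */
		if (prev)
			release(prev);		/* prev completed above */
		prev = cur;
		cur = (cur == &slots[0]) ? &slots[1] : &slots[0];
	}
	if (!prev)
		return 0;
	ret = wait_done(prev);
	release(prev);
	return ret;
}

At most two requests own prepared resources at any point, which is why
two statically allocated mrq/cmd/stop/data sets are enough in the test.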
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 04/12] mmc: add member in mmc queue struct to hold request data
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

The way the request data is organized in the mmc queue struct
allows processing of only one request at a time.
This patch adds a new struct to hold mmc queue request data such as
the sg list, request, blk request and bounce buffers, and updates all
functions depending on the mmc queue struct. This lays the groundwork
for using multiple active requests on one mmc queue.
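
As an illustration of where this is heading (a toy model only; the
field names and the second slot below are not part of this patch), the
slot rotation that multiple active requests make possible looks like:

#include <stdio.h>

/* "slot" stands in for struct mmc_queue_req; req_id 0 means idle. */
struct slot { int req_id; };

struct queue {
	struct slot slots[2];	/* cf. mqrq[]; this patch has one slot */
	struct slot *cur;	/* request being built/issued */
	struct slot *prev;	/* request in flight on the host */
};

int main(void)
{
	struct queue q = { .cur = &q.slots[0], .prev = &q.slots[1] };
	struct slot *tmp;
	int id;

	for (id = 1; id <= 3; id++) {
		q.cur->req_id = id;	/* build request in the free slot */
		printf("issue req %d while req %d is in flight\n",
		       q.cur->req_id, q.prev->req_id);
		tmp = q.cur;		/* issued request becomes "prev" */
		q.cur = q.prev;
		q.prev = tmp;
	}
	return 0;
}

This patch only introduces the container struct and the mqrq_cur
pointer; the second slot and the rotation arrive later in the series.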

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/block.c |  105 +++++++++++++++++--------------------
 drivers/mmc/card/queue.c |  129 ++++++++++++++++++++++++----------------------
 drivers/mmc/card/queue.h |   30 ++++++++---
 3 files changed, 138 insertions(+), 126 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..ec4e432 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -165,13 +165,6 @@ static const struct block_device_operations mmc_bdops = {
 	.owner			= THIS_MODULE,
 };
 
-struct mmc_blk_request {
-	struct mmc_request	mrq;
-	struct mmc_command	cmd;
-	struct mmc_command	stop;
-	struct mmc_data		data;
-};
-
 static u32 mmc_sd_num_wr_blocks(struct mmc_card *card)
 {
 	int err;
@@ -335,7 +328,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
 	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request brq;
+	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
 	int ret = 1, disable_multi = 0;
 
 	mmc_claim_host(card->host);
@@ -344,72 +337,72 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		struct mmc_command cmd;
 		u32 readcmd, writecmd, status = 0;
 
-		memset(&brq, 0, sizeof(struct mmc_blk_request));
-		brq.mrq.cmd = &brq.cmd;
-		brq.mrq.data = &brq.data;
+		memset(brq, 0, sizeof(struct mmc_blk_request));
+		brq->mrq.cmd = &brq->cmd;
+		brq->mrq.data = &brq->data;
 
-		brq.cmd.arg = blk_rq_pos(req);
+		brq->cmd.arg = blk_rq_pos(req);
 		if (!mmc_card_blockaddr(card))
-			brq.cmd.arg <<= 9;
-		brq.cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
-		brq.data.blksz = 512;
-		brq.stop.opcode = MMC_STOP_TRANSMISSION;
-		brq.stop.arg = 0;
-		brq.stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		brq.data.blocks = blk_rq_sectors(req);
+			brq->cmd.arg <<= 9;
+		brq->cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
+		brq->data.blksz = 512;
+		brq->stop.opcode = MMC_STOP_TRANSMISSION;
+		brq->stop.arg = 0;
+		brq->stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
+		brq->data.blocks = blk_rq_sectors(req);
 
 		/*
 		 * The block layer doesn't support all sector count
 		 * restrictions, so we need to be prepared for too big
 		 * requests.
 		 */
-		if (brq.data.blocks > card->host->max_blk_count)
-			brq.data.blocks = card->host->max_blk_count;
+		if (brq->data.blocks > card->host->max_blk_count)
+			brq->data.blocks = card->host->max_blk_count;
 
 		/*
 		 * After a read error, we redo the request one sector at a time
 		 * in order to accurately determine which sectors can be read
 		 * successfully.
 		 */
-		if (disable_multi && brq.data.blocks > 1)
-			brq.data.blocks = 1;
+		if (disable_multi && brq->data.blocks > 1)
+			brq->data.blocks = 1;
 
-		if (brq.data.blocks > 1) {
+		if (brq->data.blocks > 1) {
 			/* SPI multiblock writes terminate using a special
 			 * token, not a STOP_TRANSMISSION request.
 			 */
 			if (!mmc_host_is_spi(card->host)
 					|| rq_data_dir(req) == READ)
-				brq.mrq.stop = &brq.stop;
+				brq->mrq.stop = &brq->stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
 			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
 		} else {
-			brq.mrq.stop = NULL;
+			brq->mrq.stop = NULL;
 			readcmd = MMC_READ_SINGLE_BLOCK;
 			writecmd = MMC_WRITE_BLOCK;
 		}
 		if (rq_data_dir(req) == READ) {
-			brq.cmd.opcode = readcmd;
-			brq.data.flags |= MMC_DATA_READ;
+			brq->cmd.opcode = readcmd;
+			brq->data.flags |= MMC_DATA_READ;
 		} else {
-			brq.cmd.opcode = writecmd;
-			brq.data.flags |= MMC_DATA_WRITE;
+			brq->cmd.opcode = writecmd;
+			brq->data.flags |= MMC_DATA_WRITE;
 		}
 
-		mmc_set_data_timeout(&brq.data, card);
+		mmc_set_data_timeout(&brq->data, card);
 
-		brq.data.sg = mq->sg;
-		brq.data.sg_len = mmc_queue_map_sg(mq);
+		brq->data.sg = mq->mqrq_cur->sg;
+		brq->data.sg_len = mmc_queue_map_sg(mq, mq->mqrq_cur);
 
 		/*
 		 * Adjust the sg list so it is the same size as the
 		 * request.
 		 */
-		if (brq.data.blocks != blk_rq_sectors(req)) {
-			int i, data_size = brq.data.blocks << 9;
+		if (brq->data.blocks != blk_rq_sectors(req)) {
+			int i, data_size = brq->data.blocks << 9;
 			struct scatterlist *sg;
 
-			for_each_sg(brq.data.sg, sg, brq.data.sg_len, i) {
+			for_each_sg(brq->data.sg, sg, brq->data.sg_len, i) {
 				data_size -= sg->length;
 				if (data_size <= 0) {
 					sg->length += data_size;
@@ -417,22 +410,22 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 					break;
 				}
 			}
-			brq.data.sg_len = i;
+			brq->data.sg_len = i;
 		}
 
-		mmc_queue_bounce_pre(mq);
+		mmc_queue_bounce_pre(mq->mqrq_cur);
 
-		mmc_wait_for_req(card->host, &brq.mrq);
+		mmc_wait_for_req(card->host, &brq->mrq);
 
-		mmc_queue_bounce_post(mq);
+		mmc_queue_bounce_post(mq->mqrq_cur);
 
 		/*
 		 * Check for errors here, but don't jump to cmd_err
 		 * until later as we need to wait for the card to leave
 		 * programming mode even when things go wrong.
 		 */
-		if (brq.cmd.error || brq.data.error || brq.stop.error) {
-			if (brq.data.blocks > 1 && rq_data_dir(req) == READ) {
+		if (brq->cmd.error || brq->data.error || brq->stop.error) {
+			if (brq->data.blocks > 1 && rq_data_dir(req) == READ) {
 				/* Redo read one sector at a time */
 				printk(KERN_WARNING "%s: retrying using single "
 				       "block read\n", req->rq_disk->disk_name);
@@ -442,29 +435,29 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			status = get_card_status(card, req);
 		}
 
-		if (brq.cmd.error) {
+		if (brq->cmd.error) {
 			printk(KERN_ERR "%s: error %d sending read/write "
 			       "command, response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.cmd.error,
-			       brq.cmd.resp[0], status);
+			       req->rq_disk->disk_name, brq->cmd.error,
+			       brq->cmd.resp[0], status);
 		}
 
-		if (brq.data.error) {
-			if (brq.data.error == -ETIMEDOUT && brq.mrq.stop)
+		if (brq->data.error) {
+			if (brq->data.error == -ETIMEDOUT && brq->mrq.stop)
 				/* 'Stop' response contains card status */
-				status = brq.mrq.stop->resp[0];
+				status = brq->mrq.stop->resp[0];
 			printk(KERN_ERR "%s: error %d transferring data,"
 			       " sector %u, nr %u, card status %#x\n",
-			       req->rq_disk->disk_name, brq.data.error,
+			       req->rq_disk->disk_name, brq->data.error,
 			       (unsigned)blk_rq_pos(req),
 			       (unsigned)blk_rq_sectors(req), status);
 		}
 
-		if (brq.stop.error) {
+		if (brq->stop.error) {
 			printk(KERN_ERR "%s: error %d sending stop command, "
 			       "response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.stop.error,
-			       brq.stop.resp[0], status);
+			       req->rq_disk->disk_name, brq->stop.error,
+			       brq->stop.resp[0], status);
 		}
 
 		if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
@@ -497,7 +490,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 #endif
 		}
 
-		if (brq.cmd.error || brq.stop.error || brq.data.error) {
+		if (brq->cmd.error || brq->stop.error || brq->data.error) {
 			if (rq_data_dir(req) == READ) {
 				/*
 				 * After an error, we redo I/O one sector at a
@@ -505,7 +498,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 				 * read a single sector.
 				 */
 				spin_lock_irq(&md->lock);
-				ret = __blk_end_request(req, -EIO, brq.data.blksz);
+				ret = __blk_end_request(req, -EIO, brq->data.blksz);
 				spin_unlock_irq(&md->lock);
 				continue;
 			}
@@ -516,7 +509,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		 * A block was successfully transferred.
 		 */
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	} while (ret);
 
@@ -544,7 +537,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		}
 	} else {
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	}
 
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 2ae7275..40e18b5 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -56,7 +56,7 @@ static int mmc_queue_thread(void *d)
 		spin_lock_irq(q->queue_lock);
 		set_current_state(TASK_INTERRUPTIBLE);
 		req = blk_fetch_request(q);
-		mq->req = req;
+		mq->mqrq_cur->req = req;
 		spin_unlock_irq(q->queue_lock);
 
 		if (!req) {
@@ -97,10 +97,25 @@ static void mmc_request(struct request_queue *q)
 		return;
 	}
 
-	if (!mq->req)
+	if (!mq->mqrq_cur->req)
 		wake_up_process(mq->thread);
 }
 
+static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
+{
+	struct scatterlist *sg;
+
+	sg = kmalloc(sizeof(struct scatterlist)*sg_len, GFP_KERNEL);
+	if (!sg)
+		*err = -ENOMEM;
+	else {
+		*err = 0;
+		sg_init_table(sg, sg_len);
+	}
+
+	return sg;
+}
+
 /**
  * mmc_init_queue - initialise a queue structure.
  * @mq: mmc queue
@@ -114,6 +129,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	struct mmc_host *host = card->host;
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret;
+	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
@@ -123,8 +139,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	if (!mq->queue)
 		return -ENOMEM;
 
+	memset(mq->mqrq, 0, sizeof(mq->mqrq));
+	mq->mqrq_cur = mqrq_cur;
 	mq->queue->queuedata = mq;
-	mq->req = NULL;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
@@ -158,53 +175,44 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 			bouncesz = host->max_blk_count * 512;
 
 		if (bouncesz > 512) {
-			mq->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
-			if (!mq->bounce_buf) {
+			mqrq_cur->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
+			if (!mqrq_cur->bounce_buf) {
 				printk(KERN_WARNING "%s: unable to "
-					"allocate bounce buffer\n",
+					"allocate current bounce buffer\n",
 					mmc_card_name(card));
 			}
 		}
 
-		if (mq->bounce_buf) {
+		if (mqrq_cur->bounce_buf) {
 			blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 			blk_queue_max_hw_sectors(mq->queue, bouncesz / 512);
 			blk_queue_max_segments(mq->queue, bouncesz / 512);
 			blk_queue_max_segment_size(mq->queue, bouncesz);
 
-			mq->sg = kmalloc(sizeof(struct scatterlist),
-				GFP_KERNEL);
-			if (!mq->sg) {
-				ret = -ENOMEM;
+			mqrq_cur->sg = mmc_alloc_sg(1, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->sg, 1);
 
-			mq->bounce_sg = kmalloc(sizeof(struct scatterlist) *
-				bouncesz / 512, GFP_KERNEL);
-			if (!mq->bounce_sg) {
-				ret = -ENOMEM;
+			mqrq_cur->bounce_sg =
+				mmc_alloc_sg(bouncesz / 512, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->bounce_sg, bouncesz / 512);
+
 		}
 	}
 #endif
 
-	if (!mq->bounce_buf) {
+	if (!mqrq_cur->bounce_buf) {
 		blk_queue_bounce_limit(mq->queue, limit);
 		blk_queue_max_hw_sectors(mq->queue,
 			min(host->max_blk_count, host->max_req_size / 512));
 		blk_queue_max_segments(mq->queue, host->max_segs);
 		blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-		mq->sg = kmalloc(sizeof(struct scatterlist) *
-			host->max_segs, GFP_KERNEL);
-		if (!mq->sg) {
-			ret = -ENOMEM;
+		mqrq_cur->sg = mmc_alloc_sg(host->max_segs, &ret);
+		if (ret)
 			goto cleanup_queue;
-		}
-		sg_init_table(mq->sg, host->max_segs);
+
 	}
 
 	sema_init(&mq->thread_sem, 1);
@@ -219,16 +227,15 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 
 	return 0;
  free_bounce_sg:
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
+
  cleanup_queue:
- 	if (mq->sg)
-		kfree(mq->sg);
-	mq->sg = NULL;
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
+
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -237,6 +244,7 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
+	struct mmc_queue_req *mqrq_cur = mq->mqrq_cur;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
@@ -250,16 +258,14 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
 
-	kfree(mq->sg);
-	mq->sg = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
 
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
 
 	mq->card = NULL;
 }
@@ -312,27 +318,27 @@ void mmc_queue_resume(struct mmc_queue *mq)
 /*
 * Prepare the sg list(s) to be handed off to the host driver
  */
-unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
+unsigned int mmc_queue_map_sg(struct mmc_queue *mq, struct mmc_queue_req *mqrq)
 {
 	unsigned int sg_len;
 	size_t buflen;
 	struct scatterlist *sg;
 	int i;
 
-	if (!mq->bounce_buf)
-		return blk_rq_map_sg(mq->queue, mq->req, mq->sg);
+	if (!mqrq->bounce_buf)
+		return blk_rq_map_sg(mq->queue, mqrq->req, mqrq->sg);
 
-	BUG_ON(!mq->bounce_sg);
+	BUG_ON(!mqrq->bounce_sg);
 
-	sg_len = blk_rq_map_sg(mq->queue, mq->req, mq->bounce_sg);
+	sg_len = blk_rq_map_sg(mq->queue, mqrq->req, mqrq->bounce_sg);
 
-	mq->bounce_sg_len = sg_len;
+	mqrq->bounce_sg_len = sg_len;
 
 	buflen = 0;
-	for_each_sg(mq->bounce_sg, sg, sg_len, i)
+	for_each_sg(mqrq->bounce_sg, sg, sg_len, i)
 		buflen += sg->length;
 
-	sg_init_one(mq->sg, mq->bounce_buf, buflen);
+	sg_init_one(mqrq->sg, mqrq->bounce_buf, buflen);
 
 	return 1;
 }
@@ -341,19 +347,19 @@ unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
  * If writing, bounce the data to the buffer before the request
  * is sent to the host driver
  */
-void mmc_queue_bounce_pre(struct mmc_queue *mq)
+void mmc_queue_bounce_pre(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != WRITE)
+	if (rq_data_dir(mqrq->req) != WRITE)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_to_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_to_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
 
@@ -361,19 +367,18 @@ void mmc_queue_bounce_pre(struct mmc_queue *mq)
  * If reading, bounce the data from the buffer after the request
  * has been handled by the host driver
  */
-void mmc_queue_bounce_post(struct mmc_queue *mq)
+void mmc_queue_bounce_post(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != READ)
+	if (rq_data_dir(mqrq->req) != READ)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_from_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_from_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
-
diff --git a/drivers/mmc/card/queue.h b/drivers/mmc/card/queue.h
index 64e66e0..468044f 100644
--- a/drivers/mmc/card/queue.h
+++ b/drivers/mmc/card/queue.h
@@ -4,19 +4,32 @@
 struct request;
 struct task_struct;
 
+struct mmc_blk_request {
+	struct mmc_request	mrq;
+	struct mmc_command	cmd;
+	struct mmc_command	stop;
+	struct mmc_data		data;
+};
+
+struct mmc_queue_req {
+	struct request		*req;
+	struct mmc_blk_request	brq;
+	struct scatterlist	*sg;
+	char			*bounce_buf;
+	struct scatterlist	*bounce_sg;
+	unsigned int		bounce_sg_len;
+};
+
 struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
 	unsigned int		flags;
-	struct request		*req;
 	int			(*issue_fn)(struct mmc_queue *, struct request *);
 	void			*data;
 	struct request_queue	*queue;
-	struct scatterlist	*sg;
-	char			*bounce_buf;
-	struct scatterlist	*bounce_sg;
-	unsigned int		bounce_sg_len;
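+	/* one request slot for now; groundwork for multiple active requests */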
+	struct mmc_queue_req	mqrq[1];
+	struct mmc_queue_req	*mqrq_cur;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *);
@@ -24,8 +37,9 @@ extern void mmc_cleanup_queue(struct mmc_queue *);
 extern void mmc_queue_suspend(struct mmc_queue *);
 extern void mmc_queue_resume(struct mmc_queue *);
 
-extern unsigned int mmc_queue_map_sg(struct mmc_queue *);
-extern void mmc_queue_bounce_pre(struct mmc_queue *);
-extern void mmc_queue_bounce_post(struct mmc_queue *);
+extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
+				     struct mmc_queue_req *);
+extern void mmc_queue_bounce_pre(struct mmc_queue_req *);
+extern void mmc_queue_bounce_post(struct mmc_queue_req *);
 
 #endif
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 04/12] mmc: add member in mmc queue struct to hold request data
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linaro-dev-cunTk1MwBs8s++Sfvej+rw
  Cc: Chris Ball

The way the request data is organized in the mmc queue struct
it only allows processing of one request at the time.
This patch adds a new struct to hold mmc queue request data such as
sg list, request, blk request and bounce buffers, and updates any functions
depending on the mmc queue struct. This lies the ground for
using multiple active request for one mmc queue.

Signed-off-by: Per Forlin <per.forlin-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/mmc/card/block.c |  105 +++++++++++++++++--------------------
 drivers/mmc/card/queue.c |  129 ++++++++++++++++++++++++----------------------
 drivers/mmc/card/queue.h |   30 ++++++++---
 3 files changed, 138 insertions(+), 126 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..ec4e432 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -165,13 +165,6 @@ static const struct block_device_operations mmc_bdops = {
 	.owner			= THIS_MODULE,
 };
 
-struct mmc_blk_request {
-	struct mmc_request	mrq;
-	struct mmc_command	cmd;
-	struct mmc_command	stop;
-	struct mmc_data		data;
-};
-
 static u32 mmc_sd_num_wr_blocks(struct mmc_card *card)
 {
 	int err;
@@ -335,7 +328,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
 	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request brq;
+	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
 	int ret = 1, disable_multi = 0;
 
 	mmc_claim_host(card->host);
@@ -344,72 +337,72 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		struct mmc_command cmd;
 		u32 readcmd, writecmd, status = 0;
 
-		memset(&brq, 0, sizeof(struct mmc_blk_request));
-		brq.mrq.cmd = &brq.cmd;
-		brq.mrq.data = &brq.data;
+		memset(brq, 0, sizeof(struct mmc_blk_request));
+		brq->mrq.cmd = &brq->cmd;
+		brq->mrq.data = &brq->data;
 
-		brq.cmd.arg = blk_rq_pos(req);
+		brq->cmd.arg = blk_rq_pos(req);
 		if (!mmc_card_blockaddr(card))
-			brq.cmd.arg <<= 9;
-		brq.cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
-		brq.data.blksz = 512;
-		brq.stop.opcode = MMC_STOP_TRANSMISSION;
-		brq.stop.arg = 0;
-		brq.stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		brq.data.blocks = blk_rq_sectors(req);
+			brq->cmd.arg <<= 9;
+		brq->cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
+		brq->data.blksz = 512;
+		brq->stop.opcode = MMC_STOP_TRANSMISSION;
+		brq->stop.arg = 0;
+		brq->stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
+		brq->data.blocks = blk_rq_sectors(req);
 
 		/*
 		 * The block layer doesn't support all sector count
 		 * restrictions, so we need to be prepared for too big
 		 * requests.
 		 */
-		if (brq.data.blocks > card->host->max_blk_count)
-			brq.data.blocks = card->host->max_blk_count;
+		if (brq->data.blocks > card->host->max_blk_count)
+			brq->data.blocks = card->host->max_blk_count;
 
 		/*
 		 * After a read error, we redo the request one sector at a time
 		 * in order to accurately determine which sectors can be read
 		 * successfully.
 		 */
-		if (disable_multi && brq.data.blocks > 1)
-			brq.data.blocks = 1;
+		if (disable_multi && brq->data.blocks > 1)
+			brq->data.blocks = 1;
 
-		if (brq.data.blocks > 1) {
+		if (brq->data.blocks > 1) {
 			/* SPI multiblock writes terminate using a special
 			 * token, not a STOP_TRANSMISSION request.
 			 */
 			if (!mmc_host_is_spi(card->host)
 					|| rq_data_dir(req) == READ)
-				brq.mrq.stop = &brq.stop;
+				brq->mrq.stop = &brq->stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
 			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
 		} else {
-			brq.mrq.stop = NULL;
+			brq->mrq.stop = NULL;
 			readcmd = MMC_READ_SINGLE_BLOCK;
 			writecmd = MMC_WRITE_BLOCK;
 		}
 		if (rq_data_dir(req) == READ) {
-			brq.cmd.opcode = readcmd;
-			brq.data.flags |= MMC_DATA_READ;
+			brq->cmd.opcode = readcmd;
+			brq->data.flags |= MMC_DATA_READ;
 		} else {
-			brq.cmd.opcode = writecmd;
-			brq.data.flags |= MMC_DATA_WRITE;
+			brq->cmd.opcode = writecmd;
+			brq->data.flags |= MMC_DATA_WRITE;
 		}
 
-		mmc_set_data_timeout(&brq.data, card);
+		mmc_set_data_timeout(&brq->data, card);
 
-		brq.data.sg = mq->sg;
-		brq.data.sg_len = mmc_queue_map_sg(mq);
+		brq->data.sg = mq->mqrq_cur->sg;
+		brq->data.sg_len = mmc_queue_map_sg(mq, mq->mqrq_cur);
 
 		/*
 		 * Adjust the sg list so it is the same size as the
 		 * request.
 		 */
-		if (brq.data.blocks != blk_rq_sectors(req)) {
-			int i, data_size = brq.data.blocks << 9;
+		if (brq->data.blocks != blk_rq_sectors(req)) {
+			int i, data_size = brq->data.blocks << 9;
 			struct scatterlist *sg;
 
-			for_each_sg(brq.data.sg, sg, brq.data.sg_len, i) {
+			for_each_sg(brq->data.sg, sg, brq->data.sg_len, i) {
 				data_size -= sg->length;
 				if (data_size <= 0) {
 					sg->length += data_size;
@@ -417,22 +410,22 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 					break;
 				}
 			}
-			brq.data.sg_len = i;
+			brq->data.sg_len = i;
 		}
 
-		mmc_queue_bounce_pre(mq);
+		mmc_queue_bounce_pre(mq->mqrq_cur);
 
-		mmc_wait_for_req(card->host, &brq.mrq);
+		mmc_wait_for_req(card->host, &brq->mrq);
 
-		mmc_queue_bounce_post(mq);
+		mmc_queue_bounce_post(mq->mqrq_cur);
 
 		/*
 		 * Check for errors here, but don't jump to cmd_err
 		 * until later as we need to wait for the card to leave
 		 * programming mode even when things go wrong.
 		 */
-		if (brq.cmd.error || brq.data.error || brq.stop.error) {
-			if (brq.data.blocks > 1 && rq_data_dir(req) == READ) {
+		if (brq->cmd.error || brq->data.error || brq->stop.error) {
+			if (brq->data.blocks > 1 && rq_data_dir(req) == READ) {
 				/* Redo read one sector at a time */
 				printk(KERN_WARNING "%s: retrying using single "
 				       "block read\n", req->rq_disk->disk_name);
@@ -442,29 +435,29 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			status = get_card_status(card, req);
 		}
 
-		if (brq.cmd.error) {
+		if (brq->cmd.error) {
 			printk(KERN_ERR "%s: error %d sending read/write "
 			       "command, response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.cmd.error,
-			       brq.cmd.resp[0], status);
+			       req->rq_disk->disk_name, brq->cmd.error,
+			       brq->cmd.resp[0], status);
 		}
 
-		if (brq.data.error) {
-			if (brq.data.error == -ETIMEDOUT && brq.mrq.stop)
+		if (brq->data.error) {
+			if (brq->data.error == -ETIMEDOUT && brq->mrq.stop)
 				/* 'Stop' response contains card status */
-				status = brq.mrq.stop->resp[0];
+				status = brq->mrq.stop->resp[0];
 			printk(KERN_ERR "%s: error %d transferring data,"
 			       " sector %u, nr %u, card status %#x\n",
-			       req->rq_disk->disk_name, brq.data.error,
+			       req->rq_disk->disk_name, brq->data.error,
 			       (unsigned)blk_rq_pos(req),
 			       (unsigned)blk_rq_sectors(req), status);
 		}
 
-		if (brq.stop.error) {
+		if (brq->stop.error) {
 			printk(KERN_ERR "%s: error %d sending stop command, "
 			       "response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.stop.error,
-			       brq.stop.resp[0], status);
+			       req->rq_disk->disk_name, brq->stop.error,
+			       brq->stop.resp[0], status);
 		}
 
 		if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
@@ -497,7 +490,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 #endif
 		}
 
-		if (brq.cmd.error || brq.stop.error || brq.data.error) {
+		if (brq->cmd.error || brq->stop.error || brq->data.error) {
 			if (rq_data_dir(req) == READ) {
 				/*
 				 * After an error, we redo I/O one sector at a
@@ -505,7 +498,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 				 * read a single sector.
 				 */
 				spin_lock_irq(&md->lock);
-				ret = __blk_end_request(req, -EIO, brq.data.blksz);
+				ret = __blk_end_request(req, -EIO, brq->data.blksz);
 				spin_unlock_irq(&md->lock);
 				continue;
 			}
@@ -516,7 +509,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		 * A block was successfully transferred.
 		 */
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	} while (ret);
 
@@ -544,7 +537,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		}
 	} else {
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	}
 
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 2ae7275..40e18b5 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -56,7 +56,7 @@ static int mmc_queue_thread(void *d)
 		spin_lock_irq(q->queue_lock);
 		set_current_state(TASK_INTERRUPTIBLE);
 		req = blk_fetch_request(q);
-		mq->req = req;
+		mq->mqrq_cur->req = req;
 		spin_unlock_irq(q->queue_lock);
 
 		if (!req) {
@@ -97,10 +97,25 @@ static void mmc_request(struct request_queue *q)
 		return;
 	}
 
-	if (!mq->req)
+	if (!mq->mqrq_cur->req)
 		wake_up_process(mq->thread);
 }
 
+struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
+{
+	struct scatterlist *sg;
+
+	sg = kmalloc(sizeof(struct scatterlist)*sg_len, GFP_KERNEL);
+	if (!sg)
+		*err = -ENOMEM;
+	else {
+		*err = 0;
+		sg_init_table(sg, sg_len);
+	}
+
+	return sg;
+}
+
 /**
  * mmc_init_queue - initialise a queue structure.
  * @mq: mmc queue
@@ -114,6 +129,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	struct mmc_host *host = card->host;
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret;
+	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
@@ -123,8 +139,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	if (!mq->queue)
 		return -ENOMEM;
 
+	memset(&mq->mqrq_cur, 0, sizeof(mq->mqrq_cur));
+	mq->mqrq_cur = mqrq_cur;
 	mq->queue->queuedata = mq;
-	mq->req = NULL;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
@@ -158,53 +175,44 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 			bouncesz = host->max_blk_count * 512;
 
 		if (bouncesz > 512) {
-			mq->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
-			if (!mq->bounce_buf) {
+			mqrq_cur->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
+			if (!mqrq_cur->bounce_buf) {
 				printk(KERN_WARNING "%s: unable to "
-					"allocate bounce buffer\n",
+					"allocate bounce cur buffer\n",
 					mmc_card_name(card));
 			}
 		}
 
-		if (mq->bounce_buf) {
+		if (mqrq_cur->bounce_buf) {
 			blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 			blk_queue_max_hw_sectors(mq->queue, bouncesz / 512);
 			blk_queue_max_segments(mq->queue, bouncesz / 512);
 			blk_queue_max_segment_size(mq->queue, bouncesz);
 
-			mq->sg = kmalloc(sizeof(struct scatterlist),
-				GFP_KERNEL);
-			if (!mq->sg) {
-				ret = -ENOMEM;
+			mqrq_cur->sg = mmc_alloc_sg(1, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->sg, 1);
 
-			mq->bounce_sg = kmalloc(sizeof(struct scatterlist) *
-				bouncesz / 512, GFP_KERNEL);
-			if (!mq->bounce_sg) {
-				ret = -ENOMEM;
+			mqrq_cur->bounce_sg =
+				mmc_alloc_sg(bouncesz / 512, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->bounce_sg, bouncesz / 512);
+
 		}
 	}
 #endif
 
-	if (!mq->bounce_buf) {
+	if (!mqrq_cur->bounce_buf) {
 		blk_queue_bounce_limit(mq->queue, limit);
 		blk_queue_max_hw_sectors(mq->queue,
 			min(host->max_blk_count, host->max_req_size / 512));
 		blk_queue_max_segments(mq->queue, host->max_segs);
 		blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-		mq->sg = kmalloc(sizeof(struct scatterlist) *
-			host->max_segs, GFP_KERNEL);
-		if (!mq->sg) {
-			ret = -ENOMEM;
+		mqrq_cur->sg = mmc_alloc_sg(host->max_segs, &ret);
+		if (ret)
 			goto cleanup_queue;
-		}
-		sg_init_table(mq->sg, host->max_segs);
+
 	}
 
 	sema_init(&mq->thread_sem, 1);
@@ -219,16 +227,15 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 
 	return 0;
  free_bounce_sg:
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
+
  cleanup_queue:
- 	if (mq->sg)
-		kfree(mq->sg);
-	mq->sg = NULL;
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
+
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -237,6 +244,7 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
+	struct mmc_queue_req *mqrq_cur = mq->mqrq_cur;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
@@ -250,16 +258,14 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
 
-	kfree(mq->sg);
-	mq->sg = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
 
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
 
 	mq->card = NULL;
 }
@@ -312,27 +318,27 @@ void mmc_queue_resume(struct mmc_queue *mq)
 /*
  * Prepare the sg list(s) to be handed of to the host driver
  */
-unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
+unsigned int mmc_queue_map_sg(struct mmc_queue *mq, struct mmc_queue_req *mqrq)
 {
 	unsigned int sg_len;
 	size_t buflen;
 	struct scatterlist *sg;
 	int i;
 
-	if (!mq->bounce_buf)
-		return blk_rq_map_sg(mq->queue, mq->req, mq->sg);
+	if (!mqrq->bounce_buf)
+		return blk_rq_map_sg(mq->queue, mqrq->req, mqrq->sg);
 
-	BUG_ON(!mq->bounce_sg);
+	BUG_ON(!mqrq->bounce_sg);
 
-	sg_len = blk_rq_map_sg(mq->queue, mq->req, mq->bounce_sg);
+	sg_len = blk_rq_map_sg(mq->queue, mqrq->req, mqrq->bounce_sg);
 
-	mq->bounce_sg_len = sg_len;
+	mqrq->bounce_sg_len = sg_len;
 
 	buflen = 0;
-	for_each_sg(mq->bounce_sg, sg, sg_len, i)
+	for_each_sg(mqrq->bounce_sg, sg, sg_len, i)
 		buflen += sg->length;
 
-	sg_init_one(mq->sg, mq->bounce_buf, buflen);
+	sg_init_one(mqrq->sg, mqrq->bounce_buf, buflen);
 
 	return 1;
 }
@@ -341,19 +347,19 @@ unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
  * If writing, bounce the data to the buffer before the request
  * is sent to the host driver
  */
-void mmc_queue_bounce_pre(struct mmc_queue *mq)
+void mmc_queue_bounce_pre(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != WRITE)
+	if (rq_data_dir(mqrq->req) != WRITE)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_to_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_to_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
 
@@ -361,19 +367,18 @@ void mmc_queue_bounce_pre(struct mmc_queue *mq)
  * If reading, bounce the data from the buffer after the request
  * has been handled by the host driver
  */
-void mmc_queue_bounce_post(struct mmc_queue *mq)
+void mmc_queue_bounce_post(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != READ)
+	if (rq_data_dir(mqrq->req) != READ)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_from_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_from_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
-
diff --git a/drivers/mmc/card/queue.h b/drivers/mmc/card/queue.h
index 64e66e0..468044f 100644
--- a/drivers/mmc/card/queue.h
+++ b/drivers/mmc/card/queue.h
@@ -4,19 +4,32 @@
 struct request;
 struct task_struct;
 
+struct mmc_blk_request {
+	struct mmc_request	mrq;
+	struct mmc_command	cmd;
+	struct mmc_command	stop;
+	struct mmc_data		data;
+};
+
+struct mmc_queue_req {
+	struct request		*req;
+	struct mmc_blk_request	brq;
+	struct scatterlist	*sg;
+	char			*bounce_buf;
+	struct scatterlist	*bounce_sg;
+	unsigned int		bounce_sg_len;
+};
+
 struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
 	unsigned int		flags;
-	struct request		*req;
 	int			(*issue_fn)(struct mmc_queue *, struct request *);
 	void			*data;
 	struct request_queue	*queue;
-	struct scatterlist	*sg;
-	char			*bounce_buf;
-	struct scatterlist	*bounce_sg;
-	unsigned int		bounce_sg_len;
+	struct mmc_queue_req	mqrq[1];
+	struct mmc_queue_req	*mqrq_cur;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *);
@@ -24,8 +37,9 @@ extern void mmc_cleanup_queue(struct mmc_queue *);
 extern void mmc_queue_suspend(struct mmc_queue *);
 extern void mmc_queue_resume(struct mmc_queue *);
 
-extern unsigned int mmc_queue_map_sg(struct mmc_queue *);
-extern void mmc_queue_bounce_pre(struct mmc_queue *);
-extern void mmc_queue_bounce_post(struct mmc_queue *);
+extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
+				     struct mmc_queue_req *);
+extern void mmc_queue_bounce_pre(struct mmc_queue_req *);
+extern void mmc_queue_bounce_post(struct mmc_queue_req *);
 
 #endif
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 04/12] mmc: add member in mmc queue struct to hold request data
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

The way the request data is organized in the mmc queue struct
it only allows processing of one request at the time.
This patch adds a new struct to hold mmc queue request data such as
sg list, request, blk request and bounce buffers, and updates any functions
depending on the mmc queue struct. This lies the ground for
using multiple active request for one mmc queue.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/block.c |  105 +++++++++++++++++--------------------
 drivers/mmc/card/queue.c |  129 ++++++++++++++++++++++++----------------------
 drivers/mmc/card/queue.h |   30 ++++++++---
 3 files changed, 138 insertions(+), 126 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..ec4e432 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -165,13 +165,6 @@ static const struct block_device_operations mmc_bdops = {
 	.owner			= THIS_MODULE,
 };
 
-struct mmc_blk_request {
-	struct mmc_request	mrq;
-	struct mmc_command	cmd;
-	struct mmc_command	stop;
-	struct mmc_data		data;
-};
-
 static u32 mmc_sd_num_wr_blocks(struct mmc_card *card)
 {
 	int err;
@@ -335,7 +328,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
 	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request brq;
+	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
 	int ret = 1, disable_multi = 0;
 
 	mmc_claim_host(card->host);
@@ -344,72 +337,72 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		struct mmc_command cmd;
 		u32 readcmd, writecmd, status = 0;
 
-		memset(&brq, 0, sizeof(struct mmc_blk_request));
-		brq.mrq.cmd = &brq.cmd;
-		brq.mrq.data = &brq.data;
+		memset(brq, 0, sizeof(struct mmc_blk_request));
+		brq->mrq.cmd = &brq->cmd;
+		brq->mrq.data = &brq->data;
 
-		brq.cmd.arg = blk_rq_pos(req);
+		brq->cmd.arg = blk_rq_pos(req);
 		if (!mmc_card_blockaddr(card))
-			brq.cmd.arg <<= 9;
-		brq.cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
-		brq.data.blksz = 512;
-		brq.stop.opcode = MMC_STOP_TRANSMISSION;
-		brq.stop.arg = 0;
-		brq.stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		brq.data.blocks = blk_rq_sectors(req);
+			brq->cmd.arg <<= 9;
+		brq->cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
+		brq->data.blksz = 512;
+		brq->stop.opcode = MMC_STOP_TRANSMISSION;
+		brq->stop.arg = 0;
+		brq->stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
+		brq->data.blocks = blk_rq_sectors(req);
 
 		/*
 		 * The block layer doesn't support all sector count
 		 * restrictions, so we need to be prepared for too big
 		 * requests.
 		 */
-		if (brq.data.blocks > card->host->max_blk_count)
-			brq.data.blocks = card->host->max_blk_count;
+		if (brq->data.blocks > card->host->max_blk_count)
+			brq->data.blocks = card->host->max_blk_count;
 
 		/*
 		 * After a read error, we redo the request one sector at a time
 		 * in order to accurately determine which sectors can be read
 		 * successfully.
 		 */
-		if (disable_multi && brq.data.blocks > 1)
-			brq.data.blocks = 1;
+		if (disable_multi && brq->data.blocks > 1)
+			brq->data.blocks = 1;
 
-		if (brq.data.blocks > 1) {
+		if (brq->data.blocks > 1) {
 			/* SPI multiblock writes terminate using a special
 			 * token, not a STOP_TRANSMISSION request.
 			 */
 			if (!mmc_host_is_spi(card->host)
 					|| rq_data_dir(req) == READ)
-				brq.mrq.stop = &brq.stop;
+				brq->mrq.stop = &brq->stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
 			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
 		} else {
-			brq.mrq.stop = NULL;
+			brq->mrq.stop = NULL;
 			readcmd = MMC_READ_SINGLE_BLOCK;
 			writecmd = MMC_WRITE_BLOCK;
 		}
 		if (rq_data_dir(req) == READ) {
-			brq.cmd.opcode = readcmd;
-			brq.data.flags |= MMC_DATA_READ;
+			brq->cmd.opcode = readcmd;
+			brq->data.flags |= MMC_DATA_READ;
 		} else {
-			brq.cmd.opcode = writecmd;
-			brq.data.flags |= MMC_DATA_WRITE;
+			brq->cmd.opcode = writecmd;
+			brq->data.flags |= MMC_DATA_WRITE;
 		}
 
-		mmc_set_data_timeout(&brq.data, card);
+		mmc_set_data_timeout(&brq->data, card);
 
-		brq.data.sg = mq->sg;
-		brq.data.sg_len = mmc_queue_map_sg(mq);
+		brq->data.sg = mq->mqrq_cur->sg;
+		brq->data.sg_len = mmc_queue_map_sg(mq, mq->mqrq_cur);
 
 		/*
 		 * Adjust the sg list so it is the same size as the
 		 * request.
 		 */
-		if (brq.data.blocks != blk_rq_sectors(req)) {
-			int i, data_size = brq.data.blocks << 9;
+		if (brq->data.blocks != blk_rq_sectors(req)) {
+			int i, data_size = brq->data.blocks << 9;
 			struct scatterlist *sg;
 
-			for_each_sg(brq.data.sg, sg, brq.data.sg_len, i) {
+			for_each_sg(brq->data.sg, sg, brq->data.sg_len, i) {
 				data_size -= sg->length;
 				if (data_size <= 0) {
 					sg->length += data_size;
@@ -417,22 +410,22 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 					break;
 				}
 			}
-			brq.data.sg_len = i;
+			brq->data.sg_len = i;
 		}
 
-		mmc_queue_bounce_pre(mq);
+		mmc_queue_bounce_pre(mq->mqrq_cur);
 
-		mmc_wait_for_req(card->host, &brq.mrq);
+		mmc_wait_for_req(card->host, &brq->mrq);
 
-		mmc_queue_bounce_post(mq);
+		mmc_queue_bounce_post(mq->mqrq_cur);
 
 		/*
 		 * Check for errors here, but don't jump to cmd_err
 		 * until later as we need to wait for the card to leave
 		 * programming mode even when things go wrong.
 		 */
-		if (brq.cmd.error || brq.data.error || brq.stop.error) {
-			if (brq.data.blocks > 1 && rq_data_dir(req) == READ) {
+		if (brq->cmd.error || brq->data.error || brq->stop.error) {
+			if (brq->data.blocks > 1 && rq_data_dir(req) == READ) {
 				/* Redo read one sector at a time */
 				printk(KERN_WARNING "%s: retrying using single "
 				       "block read\n", req->rq_disk->disk_name);
@@ -442,29 +435,29 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			status = get_card_status(card, req);
 		}
 
-		if (brq.cmd.error) {
+		if (brq->cmd.error) {
 			printk(KERN_ERR "%s: error %d sending read/write "
 			       "command, response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.cmd.error,
-			       brq.cmd.resp[0], status);
+			       req->rq_disk->disk_name, brq->cmd.error,
+			       brq->cmd.resp[0], status);
 		}
 
-		if (brq.data.error) {
-			if (brq.data.error == -ETIMEDOUT && brq.mrq.stop)
+		if (brq->data.error) {
+			if (brq->data.error == -ETIMEDOUT && brq->mrq.stop)
 				/* 'Stop' response contains card status */
-				status = brq.mrq.stop->resp[0];
+				status = brq->mrq.stop->resp[0];
 			printk(KERN_ERR "%s: error %d transferring data,"
 			       " sector %u, nr %u, card status %#x\n",
-			       req->rq_disk->disk_name, brq.data.error,
+			       req->rq_disk->disk_name, brq->data.error,
 			       (unsigned)blk_rq_pos(req),
 			       (unsigned)blk_rq_sectors(req), status);
 		}
 
-		if (brq.stop.error) {
+		if (brq->stop.error) {
 			printk(KERN_ERR "%s: error %d sending stop command, "
 			       "response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq.stop.error,
-			       brq.stop.resp[0], status);
+			       req->rq_disk->disk_name, brq->stop.error,
+			       brq->stop.resp[0], status);
 		}
 
 		if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
@@ -497,7 +490,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 #endif
 		}
 
-		if (brq.cmd.error || brq.stop.error || brq.data.error) {
+		if (brq->cmd.error || brq->stop.error || brq->data.error) {
 			if (rq_data_dir(req) == READ) {
 				/*
 				 * After an error, we redo I/O one sector at a
@@ -505,7 +498,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 				 * read a single sector.
 				 */
 				spin_lock_irq(&md->lock);
-				ret = __blk_end_request(req, -EIO, brq.data.blksz);
+				ret = __blk_end_request(req, -EIO, brq->data.blksz);
 				spin_unlock_irq(&md->lock);
 				continue;
 			}
@@ -516,7 +509,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		 * A block was successfully transferred.
 		 */
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	} while (ret);
 
@@ -544,7 +537,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		}
 	} else {
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
+		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	}
 
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 2ae7275..40e18b5 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -56,7 +56,7 @@ static int mmc_queue_thread(void *d)
 		spin_lock_irq(q->queue_lock);
 		set_current_state(TASK_INTERRUPTIBLE);
 		req = blk_fetch_request(q);
-		mq->req = req;
+		mq->mqrq_cur->req = req;
 		spin_unlock_irq(q->queue_lock);
 
 		if (!req) {
@@ -97,10 +97,25 @@ static void mmc_request(struct request_queue *q)
 		return;
 	}
 
-	if (!mq->req)
+	if (!mq->mqrq_cur->req)
 		wake_up_process(mq->thread);
 }
 
+struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
+{
+	struct scatterlist *sg;
+
+	sg = kmalloc(sizeof(struct scatterlist)*sg_len, GFP_KERNEL);
+	if (!sg)
+		*err = -ENOMEM;
+	else {
+		*err = 0;
+		sg_init_table(sg, sg_len);
+	}
+
+	return sg;
+}
+
 /**
  * mmc_init_queue - initialise a queue structure.
  * @mq: mmc queue
@@ -114,6 +129,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	struct mmc_host *host = card->host;
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret;
+	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
@@ -123,8 +139,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	if (!mq->queue)
 		return -ENOMEM;
 
+	memset(&mq->mqrq_cur, 0, sizeof(mq->mqrq_cur));
+	mq->mqrq_cur = mqrq_cur;
 	mq->queue->queuedata = mq;
-	mq->req = NULL;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
@@ -158,53 +175,44 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 			bouncesz = host->max_blk_count * 512;
 
 		if (bouncesz > 512) {
-			mq->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
-			if (!mq->bounce_buf) {
+			mqrq_cur->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
+			if (!mqrq_cur->bounce_buf) {
 				printk(KERN_WARNING "%s: unable to "
-					"allocate bounce buffer\n",
+					"allocate bounce cur buffer\n",
 					mmc_card_name(card));
 			}
 		}
 
-		if (mq->bounce_buf) {
+		if (mqrq_cur->bounce_buf) {
 			blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 			blk_queue_max_hw_sectors(mq->queue, bouncesz / 512);
 			blk_queue_max_segments(mq->queue, bouncesz / 512);
 			blk_queue_max_segment_size(mq->queue, bouncesz);
 
-			mq->sg = kmalloc(sizeof(struct scatterlist),
-				GFP_KERNEL);
-			if (!mq->sg) {
-				ret = -ENOMEM;
+			mqrq_cur->sg = mmc_alloc_sg(1, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->sg, 1);
 
-			mq->bounce_sg = kmalloc(sizeof(struct scatterlist) *
-				bouncesz / 512, GFP_KERNEL);
-			if (!mq->bounce_sg) {
-				ret = -ENOMEM;
+			mqrq_cur->bounce_sg =
+				mmc_alloc_sg(bouncesz / 512, &ret);
+			if (ret)
 				goto cleanup_queue;
-			}
-			sg_init_table(mq->bounce_sg, bouncesz / 512);
+
 		}
 	}
 #endif
 
-	if (!mq->bounce_buf) {
+	if (!mqrq_cur->bounce_buf) {
 		blk_queue_bounce_limit(mq->queue, limit);
 		blk_queue_max_hw_sectors(mq->queue,
 			min(host->max_blk_count, host->max_req_size / 512));
 		blk_queue_max_segments(mq->queue, host->max_segs);
 		blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-		mq->sg = kmalloc(sizeof(struct scatterlist) *
-			host->max_segs, GFP_KERNEL);
-		if (!mq->sg) {
-			ret = -ENOMEM;
+		mqrq_cur->sg = mmc_alloc_sg(host->max_segs, &ret);
+		if (ret)
 			goto cleanup_queue;
-		}
-		sg_init_table(mq->sg, host->max_segs);
+
 	}
 
 	sema_init(&mq->thread_sem, 1);
@@ -219,16 +227,15 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 
 	return 0;
  free_bounce_sg:
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
+
  cleanup_queue:
- 	if (mq->sg)
-		kfree(mq->sg);
-	mq->sg = NULL;
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
+
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -237,6 +244,7 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
+	struct mmc_queue_req *mqrq_cur = mq->mqrq_cur;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
@@ -250,16 +258,14 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
- 	if (mq->bounce_sg)
- 		kfree(mq->bounce_sg);
- 	mq->bounce_sg = NULL;
+	kfree(mqrq_cur->bounce_sg);
+	mqrq_cur->bounce_sg = NULL;
 
-	kfree(mq->sg);
-	mq->sg = NULL;
+	kfree(mqrq_cur->sg);
+	mqrq_cur->sg = NULL;
 
-	if (mq->bounce_buf)
-		kfree(mq->bounce_buf);
-	mq->bounce_buf = NULL;
+	kfree(mqrq_cur->bounce_buf);
+	mqrq_cur->bounce_buf = NULL;
 
 	mq->card = NULL;
 }
@@ -312,27 +318,27 @@ void mmc_queue_resume(struct mmc_queue *mq)
 /*
  * Prepare the sg list(s) to be handed off to the host driver
  */
-unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
+unsigned int mmc_queue_map_sg(struct mmc_queue *mq, struct mmc_queue_req *mqrq)
 {
 	unsigned int sg_len;
 	size_t buflen;
 	struct scatterlist *sg;
 	int i;
 
-	if (!mq->bounce_buf)
-		return blk_rq_map_sg(mq->queue, mq->req, mq->sg);
+	if (!mqrq->bounce_buf)
+		return blk_rq_map_sg(mq->queue, mqrq->req, mqrq->sg);
 
-	BUG_ON(!mq->bounce_sg);
+	BUG_ON(!mqrq->bounce_sg);
 
-	sg_len = blk_rq_map_sg(mq->queue, mq->req, mq->bounce_sg);
+	sg_len = blk_rq_map_sg(mq->queue, mqrq->req, mqrq->bounce_sg);
 
-	mq->bounce_sg_len = sg_len;
+	mqrq->bounce_sg_len = sg_len;
 
 	buflen = 0;
-	for_each_sg(mq->bounce_sg, sg, sg_len, i)
+	for_each_sg(mqrq->bounce_sg, sg, sg_len, i)
 		buflen += sg->length;
 
-	sg_init_one(mq->sg, mq->bounce_buf, buflen);
+	sg_init_one(mqrq->sg, mqrq->bounce_buf, buflen);
 
 	return 1;
 }
@@ -341,19 +347,19 @@ unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
  * If writing, bounce the data to the buffer before the request
  * is sent to the host driver
  */
-void mmc_queue_bounce_pre(struct mmc_queue *mq)
+void mmc_queue_bounce_pre(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != WRITE)
+	if (rq_data_dir(mqrq->req) != WRITE)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_to_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_to_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
 
@@ -361,19 +367,18 @@ void mmc_queue_bounce_pre(struct mmc_queue *mq)
  * If reading, bounce the data from the buffer after the request
  * has been handled by the host driver
  */
-void mmc_queue_bounce_post(struct mmc_queue *mq)
+void mmc_queue_bounce_post(struct mmc_queue_req *mqrq)
 {
 	unsigned long flags;
 
-	if (!mq->bounce_buf)
+	if (!mqrq->bounce_buf)
 		return;
 
-	if (rq_data_dir(mq->req) != READ)
+	if (rq_data_dir(mqrq->req) != READ)
 		return;
 
 	local_irq_save(flags);
-	sg_copy_from_buffer(mq->bounce_sg, mq->bounce_sg_len,
-		mq->bounce_buf, mq->sg[0].length);
+	sg_copy_from_buffer(mqrq->bounce_sg, mqrq->bounce_sg_len,
+		mqrq->bounce_buf, mqrq->sg[0].length);
 	local_irq_restore(flags);
 }
-
diff --git a/drivers/mmc/card/queue.h b/drivers/mmc/card/queue.h
index 64e66e0..468044f 100644
--- a/drivers/mmc/card/queue.h
+++ b/drivers/mmc/card/queue.h
@@ -4,19 +4,32 @@
 struct request;
 struct task_struct;
 
+struct mmc_blk_request {
+	struct mmc_request	mrq;
+	struct mmc_command	cmd;
+	struct mmc_command	stop;
+	struct mmc_data		data;
+};
+
+struct mmc_queue_req {
+	struct request		*req;
+	struct mmc_blk_request	brq;
+	struct scatterlist	*sg;
+	char			*bounce_buf;
+	struct scatterlist	*bounce_sg;
+	unsigned int		bounce_sg_len;
+};
+
 struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
 	unsigned int		flags;
-	struct request		*req;
 	int			(*issue_fn)(struct mmc_queue *, struct request *);
 	void			*data;
 	struct request_queue	*queue;
-	struct scatterlist	*sg;
-	char			*bounce_buf;
-	struct scatterlist	*bounce_sg;
-	unsigned int		bounce_sg_len;
+	struct mmc_queue_req	mqrq[1];
+	struct mmc_queue_req	*mqrq_cur;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *);
@@ -24,8 +37,9 @@ extern void mmc_cleanup_queue(struct mmc_queue *);
 extern void mmc_queue_suspend(struct mmc_queue *);
 extern void mmc_queue_resume(struct mmc_queue *);
 
-extern unsigned int mmc_queue_map_sg(struct mmc_queue *);
-extern void mmc_queue_bounce_pre(struct mmc_queue *);
-extern void mmc_queue_bounce_post(struct mmc_queue *);
+extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
+				     struct mmc_queue_req *);
+extern void mmc_queue_bounce_pre(struct mmc_queue_req *);
+extern void mmc_queue_bounce_post(struct mmc_queue_req *);
 
 #endif
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread
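
The mmc_queue_bounce_pre() and mmc_queue_bounce_post() pair above implement a
classic bounce buffer: before a write, the scattered segments are flattened
into one physically contiguous buffer, and after a read, the data is copied
back out to the segments. A minimal stand-alone C sketch of that copy-through
pattern, assuming plain pointer/length pairs in place of struct scatterlist:

#include <stdio.h>
#include <string.h>

/* A fake scatter-gather segment: just a pointer and a length. */
struct seg {
	char *buf;
	size_t len;
};

/* "bounce pre": flatten all segments into one contiguous buffer. */
static size_t bounce_pre(const struct seg *sg, int n, char *bounce)
{
	size_t off = 0;
	int i;

	for (i = 0; i < n; i++) {
		memcpy(bounce + off, sg[i].buf, sg[i].len);
		off += sg[i].len;
	}
	return off;	/* total bytes staged for the transfer */
}

/* "bounce post": scatter the contiguous buffer back to the segments. */
static void bounce_post(struct seg *sg, int n, const char *bounce)
{
	size_t off = 0;
	int i;

	for (i = 0; i < n; i++) {
		memcpy(sg[i].buf, bounce + off, sg[i].len);
		off += sg[i].len;
	}
}

int main(void)
{
	char a[4] = "abc", b[4] = "def", bounce[8];
	struct seg sg[2] = { { a, 3 }, { b, 3 } };

	printf("staged %zu bytes\n", bounce_pre(sg, 2, bounce));
	bounce_post(sg, 2, bounce);
	return 0;
}

The kernel versions above do the same two copies with sg_copy_to_buffer()
and sg_copy_from_buffer() under local_irq_save().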

* [PATCH v2 05/12] mmc: add a block request prepare function
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Break out code from mmc_blk_issue_rw_rq to create a
block request prepare function. This doesn't change
any functionality. It helps when handling more
than one active block request.
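
As an illustration of the split, here is a minimal stand-alone C sketch of
the same idea; blk_request, rq_prep() and rq_issue() are purely illustrative
names, not the kernel API. All descriptor construction moves into a
side-effect-free prepare step, so a later patch can overlap preparing the
next request with an active transfer:

#include <stdio.h>
#include <string.h>

struct blk_request {
	unsigned int start_sector;
	unsigned int nr_sectors;
	int is_write;
};

/* Pure set-up: fill in the request descriptor, no I/O happens here. */
static void rq_prep(struct blk_request *rq, unsigned int start,
		    unsigned int nr, int is_write)
{
	memset(rq, 0, sizeof(*rq));
	rq->start_sector = start;
	rq->nr_sectors = nr;
	rq->is_write = is_write;
}

/* Issue step: only submits what rq_prep() built. */
static void rq_issue(const struct blk_request *rq)
{
	printf("%s %u sectors at %u\n",
	       rq->is_write ? "write" : "read",
	       rq->nr_sectors, rq->start_sector);
}

int main(void)
{
	struct blk_request rq;

	rq_prep(&rq, 2048, 8, 0);	/* prepare first... */
	rq_issue(&rq);			/* ...then issue */
	return 0;
}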

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/block.c |  170 ++++++++++++++++++++++++---------------------
 1 files changed, 91 insertions(+), 79 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index ec4e432..e606dec 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -324,97 +324,109 @@ out:
 	return err ? 0 : 1;
 }
 
-static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+			       struct mmc_card *card,
+			       int disable_multi,
+			       struct mmc_queue *mq)
 {
-	struct mmc_blk_data *md = mq->data;
-	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
-	int ret = 1, disable_multi = 0;
+	u32 readcmd, writecmd;
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct request *req = mqrq->req;
 
-	mmc_claim_host(card->host);
+	memset(brq, 0, sizeof(struct mmc_blk_request));
 
-	do {
-		struct mmc_command cmd;
-		u32 readcmd, writecmd, status = 0;
-
-		memset(brq, 0, sizeof(struct mmc_blk_request));
-		brq->mrq.cmd = &brq->cmd;
-		brq->mrq.data = &brq->data;
-
-		brq->cmd.arg = blk_rq_pos(req);
-		if (!mmc_card_blockaddr(card))
-			brq->cmd.arg <<= 9;
-		brq->cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
-		brq->data.blksz = 512;
-		brq->stop.opcode = MMC_STOP_TRANSMISSION;
-		brq->stop.arg = 0;
-		brq->stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		brq->data.blocks = blk_rq_sectors(req);
+	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.data = &brq->data;
 
-		/*
-		 * The block layer doesn't support all sector count
-		 * restrictions, so we need to be prepared for too big
-		 * requests.
-		 */
-		if (brq->data.blocks > card->host->max_blk_count)
-			brq->data.blocks = card->host->max_blk_count;
+	brq->cmd.arg = blk_rq_pos(req);
+	if (!mmc_card_blockaddr(card))
+		brq->cmd.arg <<= 9;
+	brq->cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
+	brq->data.blksz = 512;
+	brq->stop.opcode = MMC_STOP_TRANSMISSION;
+	brq->stop.arg = 0;
+	brq->stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
+	brq->data.blocks = blk_rq_sectors(req);
 
-		/*
-		 * After a read error, we redo the request one sector at a time
-		 * in order to accurately determine which sectors can be read
-		 * successfully.
+	/*
+	 * The block layer doesn't support all sector count
+	 * restrictions, so we need to be prepared for too big
+	 * requests.
+	 */
+	if (brq->data.blocks > card->host->max_blk_count)
+		brq->data.blocks = card->host->max_blk_count;
+
+	/*
+	 * After a read error, we redo the request one sector at a time
+	 * in order to accurately determine which sectors can be read
+	 * successfully.
+	 */
+	if (disable_multi && brq->data.blocks > 1)
+		brq->data.blocks = 1;
+
+	if (brq->data.blocks > 1) {
+		/* SPI multiblock writes terminate using a special
+		 * token, not a STOP_TRANSMISSION request.
 		 */
-		if (disable_multi && brq->data.blocks > 1)
-			brq->data.blocks = 1;
-
-		if (brq->data.blocks > 1) {
-			/* SPI multiblock writes terminate using a special
-			 * token, not a STOP_TRANSMISSION request.
-			 */
-			if (!mmc_host_is_spi(card->host)
-					|| rq_data_dir(req) == READ)
-				brq->mrq.stop = &brq->stop;
-			readcmd = MMC_READ_MULTIPLE_BLOCK;
-			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
-		} else {
-			brq->mrq.stop = NULL;
-			readcmd = MMC_READ_SINGLE_BLOCK;
-			writecmd = MMC_WRITE_BLOCK;
-		}
-		if (rq_data_dir(req) == READ) {
-			brq->cmd.opcode = readcmd;
-			brq->data.flags |= MMC_DATA_READ;
-		} else {
-			brq->cmd.opcode = writecmd;
-			brq->data.flags |= MMC_DATA_WRITE;
-		}
+		if (!mmc_host_is_spi(card->host)
+		    || rq_data_dir(req) == READ)
+			brq->mrq.stop = &brq->stop;
+		readcmd = MMC_READ_MULTIPLE_BLOCK;
+		writecmd = MMC_WRITE_MULTIPLE_BLOCK;
+	} else {
+		brq->mrq.stop = NULL;
+		readcmd = MMC_READ_SINGLE_BLOCK;
+		writecmd = MMC_WRITE_BLOCK;
+	}
+	if (rq_data_dir(req) == READ) {
+		brq->cmd.opcode = readcmd;
+		brq->data.flags |= MMC_DATA_READ;
+	} else {
+		brq->cmd.opcode = writecmd;
+		brq->data.flags |= MMC_DATA_WRITE;
+	}
 
-		mmc_set_data_timeout(&brq->data, card);
+	mmc_set_data_timeout(&brq->data, card);
 
-		brq->data.sg = mq->mqrq_cur->sg;
-		brq->data.sg_len = mmc_queue_map_sg(mq, mq->mqrq_cur);
+	brq->data.sg = mqrq->sg;
+	brq->data.sg_len = mmc_queue_map_sg(mq, mqrq);
 
-		/*
-		 * Adjust the sg list so it is the same size as the
-		 * request.
-		 */
-		if (brq->data.blocks != blk_rq_sectors(req)) {
-			int i, data_size = brq->data.blocks << 9;
-			struct scatterlist *sg;
-
-			for_each_sg(brq->data.sg, sg, brq->data.sg_len, i) {
-				data_size -= sg->length;
-				if (data_size <= 0) {
-					sg->length += data_size;
-					i++;
-					break;
-				}
+	/*
+	 * Adjust the sg list so it is the same size as the
+	 * request.
+	 */
+	if (brq->data.blocks != blk_rq_sectors(req)) {
+		int i, data_size = brq->data.blocks << 9;
+		struct scatterlist *sg;
+
+		for_each_sg(brq->data.sg, sg, brq->data.sg_len, i) {
+			data_size -= sg->length;
+			if (data_size <= 0) {
+				sg->length += data_size;
+				i++;
+				break;
 			}
-			brq->data.sg_len = i;
 		}
+		brq->data.sg_len = i;
+	}
 
-		mmc_queue_bounce_pre(mq->mqrq_cur);
+	mmc_queue_bounce_pre(mqrq);
+}
+
+static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->data;
+	struct mmc_card *card = md->queue.card;
+	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
+	int ret = 1, disable_multi = 0;
+
+	mmc_claim_host(card->host);
+
+	do {
+		struct mmc_command cmd;
+		u32 status = 0;
 
+		mmc_blk_rw_rq_prep(mq->mqrq_cur, card, disable_multi, mq);
 		mmc_wait_for_req(card->host, &brq->mrq);
 
 		mmc_queue_bounce_post(mq->mqrq_cur);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 06/12] mmc: move error code in mmc_block_issue_rw_rq to a separate function.
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Break out code without functional changes. This simplifies the code and
makes way for handling two parallel requests.
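
The shape of the change is a familiar one: open-coded error handling becomes
a helper that classifies the outcome into an enum, and the issuing loop turns
into a switch on that classification. A stand-alone C sketch of the pattern,
using illustrative names rather than the kernel's:

#include <stdio.h>

enum blk_status {
	BLK_SUCCESS = 0,
	BLK_RETRY,
	BLK_DATA_ERR,
	BLK_CMD_ERR,
};

/* Classify one transfer outcome; all the diagnostic logic lives here. */
static enum blk_status get_status(int cmd_err, int data_err, int multi_read)
{
	if (!cmd_err && !data_err)
		return BLK_SUCCESS;
	if (multi_read)
		return BLK_RETRY;	/* redo one sector at a time */
	return data_err ? BLK_DATA_ERR : BLK_CMD_ERR;
}

int main(void)
{
	/* The caller only dispatches on the classification. */
	switch (get_status(0, 1, 1)) {
	case BLK_SUCCESS:
		puts("request done");
		break;
	case BLK_RETRY:
		puts("retry with single-block reads");
		break;
	case BLK_DATA_ERR:
		puts("fail this sector, continue");
		break;
	case BLK_CMD_ERR:
		puts("abort the request");
		break;
	}
	return 0;
}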

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/block.c |  225 +++++++++++++++++++++++++++-------------------
 1 files changed, 132 insertions(+), 93 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index e606dec..f5db000 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -79,6 +79,13 @@ struct mmc_blk_data {
 
 static DEFINE_MUTEX(open_lock);
 
+enum mmc_blk_status {
+	MMC_BLK_SUCCESS = 0,
+	MMC_BLK_RETRY,
+	MMC_BLK_DATA_ERR,
+	MMC_BLK_CMD_ERR,
+};
+
 module_param(perdev_minors, int, 0444);
 	MODULE_PARM_DESC(perdev_minors, "Minor numbers to allocate per device");
 
@@ -413,116 +420,148 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	mmc_queue_bounce_pre(mqrq);
 }
 
+static enum mmc_blk_status mmc_blk_get_status(struct mmc_blk_request *brq,
+					      struct request *req,
+					      struct mmc_card *card,
+					      struct mmc_blk_data *md)
+{
+	struct mmc_command cmd;
+	u32 status;
+	enum mmc_blk_status ret = MMC_BLK_SUCCESS;
+
+	/*
+	 * Check for errors here, but don't jump to cmd_err
+	 * until later as we need to wait for the card to leave
+	 * programming mode even when things go wrong.
+	 */
+	if (brq->cmd.error || brq->data.error || brq->stop.error) {
+		if (brq->data.blocks > 1 && rq_data_dir(req) == READ) {
+			/* Redo read one sector at a time */
+			printk(KERN_WARNING "%s: retrying using single "
+			       "block read, brq %p\n",
+			       req->rq_disk->disk_name, brq);
+			ret = MMC_BLK_RETRY;
+			goto out;
+		}
+		status = get_card_status(card, req);
+	}
+
+	if (brq->cmd.error) {
+		printk(KERN_ERR "%s: error %d sending read/write "
+		       "command, response %#x, card status %#x\n",
+		       req->rq_disk->disk_name, brq->cmd.error,
+		       brq->cmd.resp[0], status);
+	}
+
+	if (brq->data.error) {
+		if (brq->data.error == -ETIMEDOUT && brq->mrq.stop)
+			/* 'Stop' response contains card status */
+			status = brq->mrq.stop->resp[0];
+		printk(KERN_ERR "%s: error %d transferring data,"
+		       " sector %u, nr %u, card status %#x\n",
+		       req->rq_disk->disk_name, brq->data.error,
+		       (unsigned)blk_rq_pos(req),
+		       (unsigned)blk_rq_sectors(req), status);
+	}
+
+	if (brq->stop.error) {
+		printk(KERN_ERR "%s: error %d sending stop command, "
+		       "response %#x, card status %#x\n",
+		       req->rq_disk->disk_name, brq->stop.error,
+		       brq->stop.resp[0], status);
+	}
+
+	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
+		do {
+			int err;
+
+			cmd.opcode = MMC_SEND_STATUS;
+			cmd.arg = card->rca << 16;
+			cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
+			err = mmc_wait_for_cmd(card->host, &cmd, 5);
+			if (err) {
+				printk(KERN_ERR "%s: error %d requesting status\n",
+				       req->rq_disk->disk_name, err);
+				ret = MMC_BLK_CMD_ERR;
+				goto out;
+			}
+			/*
+			 * Some cards mishandle the status bits,
+			 * so make sure to check both the busy
+			 * indication and the card state.
+			 */
+		} while (!(cmd.resp[0] & R1_READY_FOR_DATA) ||
+			 (R1_CURRENT_STATE(cmd.resp[0]) == 7));
+
+#if 0
+		if (cmd.resp[0] & ~0x00000900)
+			printk(KERN_ERR "%s: status = %08x\n",
+			       req->rq_disk->disk_name, cmd.resp[0]);
+		if (mmc_decode_status(cmd.resp)) {
+			ret = MMC_BLK_CMD_ERR;
+			goto out;
+		}
+
+#endif
+	}
+
+	if (brq->cmd.error || brq->stop.error || brq->data.error) {
+		if (rq_data_dir(req) == READ)
+			ret = MMC_BLK_DATA_ERR;
+		else
+			ret = MMC_BLK_CMD_ERR;
+	}
+ out:
+	return ret;
+
+}
+
 static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
 	struct mmc_card *card = md->queue.card;
 	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
 	int ret = 1, disable_multi = 0;
+	enum mmc_blk_status status;
 
 	mmc_claim_host(card->host);
 
 	do {
-		struct mmc_command cmd;
-		u32 status = 0;
-
 		mmc_blk_rw_rq_prep(mq->mqrq_cur, card, disable_multi, mq);
 		mmc_wait_for_req(card->host, &brq->mrq);
 
 		mmc_queue_bounce_post(mq->mqrq_cur);
+		status = mmc_blk_get_status(brq, req, card, md);
 
-		/*
-		 * Check for errors here, but don't jump to cmd_err
-		 * until later as we need to wait for the card to leave
-		 * programming mode even when things go wrong.
-		 */
-		if (brq->cmd.error || brq->data.error || brq->stop.error) {
-			if (brq->data.blocks > 1 && rq_data_dir(req) == READ) {
-				/* Redo read one sector at a time */
-				printk(KERN_WARNING "%s: retrying using single "
-				       "block read\n", req->rq_disk->disk_name);
-				disable_multi = 1;
-				continue;
-			}
-			status = get_card_status(card, req);
-		}
-
-		if (brq->cmd.error) {
-			printk(KERN_ERR "%s: error %d sending read/write "
-			       "command, response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq->cmd.error,
-			       brq->cmd.resp[0], status);
-		}
-
-		if (brq->data.error) {
-			if (brq->data.error == -ETIMEDOUT && brq->mrq.stop)
-				/* 'Stop' response contains card status */
-				status = brq->mrq.stop->resp[0];
-			printk(KERN_ERR "%s: error %d transferring data,"
-			       " sector %u, nr %u, card status %#x\n",
-			       req->rq_disk->disk_name, brq->data.error,
-			       (unsigned)blk_rq_pos(req),
-			       (unsigned)blk_rq_sectors(req), status);
-		}
-
-		if (brq->stop.error) {
-			printk(KERN_ERR "%s: error %d sending stop command, "
-			       "response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, brq->stop.error,
-			       brq->stop.resp[0], status);
-		}
-
-		if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
-			do {
-				int err;
-
-				cmd.opcode = MMC_SEND_STATUS;
-				cmd.arg = card->rca << 16;
-				cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
-				err = mmc_wait_for_cmd(card->host, &cmd, 5);
-				if (err) {
-					printk(KERN_ERR "%s: error %d requesting status\n",
-					       req->rq_disk->disk_name, err);
-					goto cmd_err;
-				}
-				/*
-				 * Some cards mishandle the status bits,
-				 * so make sure to check both the busy
-				 * indication and the card state.
-				 */
-			} while (!(cmd.resp[0] & R1_READY_FOR_DATA) ||
-				(R1_CURRENT_STATE(cmd.resp[0]) == 7));
-
-#if 0
-			if (cmd.resp[0] & ~0x00000900)
-				printk(KERN_ERR "%s: status = %08x\n",
-				       req->rq_disk->disk_name, cmd.resp[0]);
-			if (mmc_decode_status(cmd.resp))
-				goto cmd_err;
-#endif
-		}
-
-		if (brq->cmd.error || brq->stop.error || brq->data.error) {
-			if (rq_data_dir(req) == READ) {
-				/*
-				 * After an error, we redo I/O one sector at a
-				 * time, so we only reach here after trying to
-				 * read a single sector.
-				 */
-				spin_lock_irq(&md->lock);
-				ret = __blk_end_request(req, -EIO, brq->data.blksz);
-				spin_unlock_irq(&md->lock);
-				continue;
-			}
+		switch (status) {
+		case MMC_BLK_CMD_ERR:
 			goto cmd_err;
-		}
+			break;
+		case MMC_BLK_RETRY:
+			disable_multi = 1;
+			ret = 1;
+			break;
+		case MMC_BLK_DATA_ERR:
+			/*
+			 * After an error, we redo I/O one sector at a
+			 * time, so we only reach here after trying to
+			 * read a single sector.
+			 */
+			spin_lock_irq(&md->lock);
+			ret = __blk_end_request(req, -EIO,
+						brq->data.blksz);
+			spin_unlock_irq(&md->lock);
 
-		/*
-		 * A block was successfully transferred.
-		 */
-		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
-		spin_unlock_irq(&md->lock);
+			break;
+		case MMC_BLK_SUCCESS:
+			/*
+			 * A block was successfully transferred.
+			 */
+			spin_lock_irq(&md->lock);
+			ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
+			spin_unlock_irq(&md->lock);
+			break;
+		}
 	} while (ret);
 
 	mmc_release_host(card->host);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 07/12] mmc: add a second mmc queue request member
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Add an additional mmc queue request instance to make way for
two active block requests. One request may be active while the
other request is being prepared.
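
Keeping two statically allocated request slots and alternating between them
is plain double buffering: while one slot's request is in flight, the other
can be prepared. A stand-alone C sketch of the pointer-swap idea, independent
of the kernel structures:

#include <stdio.h>

struct slot {
	int id;		/* which of the two slots this is */
};

int main(void)
{
	struct slot slots[2] = { { 0 }, { 1 } };
	struct slot *cur = &slots[0];
	struct slot *prev = &slots[1];
	struct slot *tmp;
	int rq;

	for (rq = 0; rq < 3; rq++) {
		/* prepare into cur while prev is (conceptually) in flight */
		printf("prepare rq %d in slot %d, slot %d busy\n",
		       rq, cur->id, prev->id);
		/* hand cur to the hardware and reclaim prev */
		tmp = cur;
		cur = prev;
		prev = tmp;
	}
	return 0;
}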

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/queue.c |   44 ++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/card/queue.h |    3 ++-
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 40e18b5..eef3510 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -130,6 +130,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret;
 	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
+	struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
@@ -140,7 +141,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 		return -ENOMEM;
 
 	memset(mqrq_cur, 0, sizeof(*mqrq_cur));
+	memset(mqrq_prev, 0, sizeof(*mqrq_prev));
 	mq->mqrq_cur = mqrq_cur;
+	mq->mqrq_prev = mqrq_prev;
 	mq->queue->queuedata = mq;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
@@ -181,9 +184,17 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 					"allocate bounce cur buffer\n",
 					mmc_card_name(card));
 			}
+			mqrq_prev->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
+			if (!mqrq_prev->bounce_buf) {
+				printk(KERN_WARNING "%s: unable to "
+					"allocate bounce prev buffer\n",
+					mmc_card_name(card));
+				kfree(mqrq_cur->bounce_buf);
+				mqrq_cur->bounce_buf = NULL;
+			}
 		}
 
-		if (mqrq_cur->bounce_buf) {
+		if (mqrq_cur->bounce_buf && mqrq_prev->bounce_buf) {
 			blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 			blk_queue_max_hw_sectors(mq->queue, bouncesz / 512);
 			blk_queue_max_segments(mq->queue, bouncesz / 512);
@@ -198,11 +209,19 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 			if (ret)
 				goto cleanup_queue;
 
+			mqrq_prev->sg = mmc_alloc_sg(1, &ret);
+			if (ret)
+				goto cleanup_queue;
+
+			mqrq_prev->bounce_sg =
+				mmc_alloc_sg(bouncesz / 512, &ret);
+			if (ret)
+				goto cleanup_queue;
 		}
 	}
 #endif
 
-	if (!mqrq_cur->bounce_buf) {
+	if (!mqrq_cur->bounce_buf && !mqrq_prev->bounce_buf) {
 		blk_queue_bounce_limit(mq->queue, limit);
 		blk_queue_max_hw_sectors(mq->queue,
 			min(host->max_blk_count, host->max_req_size / 512));
@@ -213,6 +232,10 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 		if (ret)
 			goto cleanup_queue;
 
+
+		mqrq_prev->sg = mmc_alloc_sg(host->max_segs, &ret);
+		if (ret)
+			goto cleanup_queue;
 	}
 
 	sema_init(&mq->thread_sem, 1);
@@ -229,6 +252,8 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
  free_bounce_sg:
 	kfree(mqrq_cur->bounce_sg);
 	mqrq_cur->bounce_sg = NULL;
+	kfree(mqrq_prev->bounce_sg);
+	mqrq_prev->bounce_sg = NULL;
 
  cleanup_queue:
 	kfree(mqrq_cur->sg);
@@ -236,6 +261,11 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	kfree(mqrq_cur->bounce_buf);
 	mqrq_cur->bounce_buf = NULL;
 
+	kfree(mqrq_prev->sg);
+	mqrq_prev->sg = NULL;
+	kfree(mqrq_prev->bounce_buf);
+	mqrq_prev->bounce_buf = NULL;
+
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -245,6 +275,7 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
 	struct mmc_queue_req *mqrq_cur = mq->mqrq_cur;
+	struct mmc_queue_req *mqrq_prev = mq->mqrq_prev;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
@@ -267,6 +298,15 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	kfree(mqrq_cur->bounce_buf);
 	mqrq_cur->bounce_buf = NULL;
 
+	kfree(mqrq_prev->bounce_sg);
+	mqrq_prev->bounce_sg = NULL;
+
+	kfree(mqrq_prev->sg);
+	mqrq_prev->sg = NULL;
+
+	kfree(mqrq_prev->bounce_buf);
+	mqrq_prev->bounce_buf = NULL;
+
 	mq->card = NULL;
 }
 EXPORT_SYMBOL(mmc_cleanup_queue);
diff --git a/drivers/mmc/card/queue.h b/drivers/mmc/card/queue.h
index 468044f..0e65807 100644
--- a/drivers/mmc/card/queue.h
+++ b/drivers/mmc/card/queue.h
@@ -28,8 +28,9 @@ struct mmc_queue {
 	int			(*issue_fn)(struct mmc_queue *, struct request *);
 	void			*data;
 	struct request_queue	*queue;
-	struct mmc_queue_req	mqrq[1];
+	struct mmc_queue_req	mqrq[2];
 	struct mmc_queue_req	*mqrq_cur;
+	struct mmc_queue_req	*mqrq_prev;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 07/12] mmc: add a second mmc queue request member
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linaro-dev-cunTk1MwBs8s++Sfvej+rw
  Cc: Chris Ball

Add an additional mmc queue request instance to make way for
two active block requests. One request may be active while the
other request is being prepared.

Signed-off-by: Per Forlin <per.forlin-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/mmc/card/queue.c |   44 ++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/card/queue.h |    3 ++-
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 40e18b5..eef3510 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -130,6 +130,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret;
 	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
+	struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
@@ -140,7 +141,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 		return -ENOMEM;
 
 	memset(&mq->mqrq_cur, 0, sizeof(mq->mqrq_cur));
+	memset(&mq->mqrq_prev, 0, sizeof(mq->mqrq_prev));
 	mq->mqrq_cur = mqrq_cur;
+	mq->mqrq_prev = mqrq_prev;
 	mq->queue->queuedata = mq;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
@@ -181,9 +184,17 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 					"allocate bounce cur buffer\n",
 					mmc_card_name(card));
 			}
+			mqrq_prev->bounce_buf = kmalloc(bouncesz, GFP_KERNEL);
+			if (!mqrq_prev->bounce_buf) {
+				printk(KERN_WARNING "%s: unable to "
+					"allocate bounce prev buffer\n",
+					mmc_card_name(card));
+				kfree(mqrq_cur->bounce_buf);
+				mqrq_cur->bounce_buf = NULL;
+			}
 		}
 
-		if (mqrq_cur->bounce_buf) {
+		if (mqrq_cur->bounce_buf && mqrq_prev->bounce_buf) {
 			blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_ANY);
 			blk_queue_max_hw_sectors(mq->queue, bouncesz / 512);
 			blk_queue_max_segments(mq->queue, bouncesz / 512);
@@ -198,11 +209,19 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 			if (ret)
 				goto cleanup_queue;
 
+			mqrq_prev->sg = mmc_alloc_sg(1, &ret);
+			if (ret)
+				goto cleanup_queue;
+
+			mqrq_prev->bounce_sg =
+				mmc_alloc_sg(bouncesz / 512, &ret);
+			if (ret)
+				goto cleanup_queue;
 		}
 	}
 #endif
 
-	if (!mqrq_cur->bounce_buf) {
+	if (!mqrq_cur->bounce_buf && !mqrq_prev->bounce_buf) {
 		blk_queue_bounce_limit(mq->queue, limit);
 		blk_queue_max_hw_sectors(mq->queue,
 			min(host->max_blk_count, host->max_req_size / 512));
@@ -213,6 +232,10 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 		if (ret)
 			goto cleanup_queue;
 
+
+		mqrq_prev->sg = mmc_alloc_sg(host->max_segs, &ret);
+		if (ret)
+			goto cleanup_queue;
 	}
 
 	sema_init(&mq->thread_sem, 1);
@@ -229,6 +252,8 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
  free_bounce_sg:
 	kfree(mqrq_cur->bounce_sg);
 	mqrq_cur->bounce_sg = NULL;
+	kfree(mqrq_prev->bounce_sg);
+	mqrq_prev->bounce_sg = NULL;
 
  cleanup_queue:
 	kfree(mqrq_cur->sg);
@@ -236,6 +261,11 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	kfree(mqrq_cur->bounce_buf);
 	mqrq_cur->bounce_buf = NULL;
 
+	kfree(mqrq_prev->sg);
+	mqrq_prev->sg = NULL;
+	kfree(mqrq_prev->bounce_buf);
+	mqrq_prev->bounce_buf = NULL;
+
 	blk_cleanup_queue(mq->queue);
 	return ret;
 }
@@ -245,6 +275,7 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
 	struct mmc_queue_req *mqrq_cur = mq->mqrq_cur;
+	struct mmc_queue_req *mqrq_prev = mq->mqrq_prev;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
@@ -267,6 +298,15 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	kfree(mqrq_cur->bounce_buf);
 	mqrq_cur->bounce_buf = NULL;
 
+	kfree(mqrq_prev->bounce_sg);
+	mqrq_prev->bounce_sg = NULL;
+
+	kfree(mqrq_prev->sg);
+	mqrq_prev->sg = NULL;
+
+	kfree(mqrq_prev->bounce_buf);
+	mqrq_prev->bounce_buf = NULL;
+
 	mq->card = NULL;
 }
 EXPORT_SYMBOL(mmc_cleanup_queue);
diff --git a/drivers/mmc/card/queue.h b/drivers/mmc/card/queue.h
index 468044f..0e65807 100644
--- a/drivers/mmc/card/queue.h
+++ b/drivers/mmc/card/queue.h
@@ -28,8 +28,9 @@ struct mmc_queue {
 	int			(*issue_fn)(struct mmc_queue *, struct request *);
 	void			*data;
 	struct request_queue	*queue;
-	struct mmc_queue_req	mqrq[1];
+	struct mmc_queue_req	mqrq[2];
 	struct mmc_queue_req	*mqrq_cur;
+	struct mmc_queue_req	*mqrq_prev;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *);
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 08/12] mmc: add handling for two parallel block requests in issue_rw_rq
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Change mmc_blk_issue_rw_rq() to become asynchronous.
The execution flow looks like this:
The mmc-queue calls issue_rw_rq(), which sends the request
to the host and returns to the mmc-queue. The mmc-queue calls
issue_rw_rq() again with a new request. This new request is prepared
in issue_rw_rq(), then it waits for the active request to complete
before pushing it to the host. When the mmc-queue is empty it calls
issue_rw_rq() with req=NULL to finish off the active request
without starting a new request.
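
In rough pseudo-C, one invocation looks like this (a sketch using the
names from the diff below, not the complete error handling):

	if (rqc) {
		/* prepare the new request (dma_map_sg() etc.) */
		mmc_blk_rw_rq_prep(mq->mqrq_cur, card, 0, mq);
		mmc_pre_req(card->host, &brqc->mrq, !rqp);
	}
	if (rqp)
		/* wait for the previous request to finish */
		mmc_wait_for_req_done(&brqp->mrq);
	if (rqc)
		/* start the new request */
		mmc_start_req(card->host, &brqc->mrq);
	/* post-process the previous request while the new one runs */
	mmc_post_req(card->host, &brqp->mrq, 0);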

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/block.c |  157 +++++++++++++++++++++++++++++++++++++++-------
 drivers/mmc/card/queue.c |    2 +-
 2 files changed, 134 insertions(+), 25 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index f5db000..4b530ae 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -516,24 +516,75 @@ static enum mmc_blk_status mmc_blk_get_status(struct mmc_blk_request *brq,
 
 }
 
-static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *rqc)
 {
 	struct mmc_blk_data *md = mq->data;
 	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq = &mq->mqrq_cur->brq;
-	int ret = 1, disable_multi = 0;
+	struct mmc_blk_request *brqc = &mq->mqrq_cur->brq;
+	struct mmc_blk_request *brqp = &mq->mqrq_prev->brq;
+	struct mmc_queue_req  *mqrqp = mq->mqrq_prev;
+	struct request *rqp = mqrqp->req;
+	int ret = 0;
+	int disable_multi = 0;
 	enum mmc_blk_status status;
 
-	mmc_claim_host(card->host);
+	if (!rqc && !rqp)
+		return 0;
 
-	do {
-		mmc_blk_rw_rq_prep(mq->mqrq_cur, card, disable_multi, mq);
-		mmc_wait_for_req(card->host, &brq->mrq);
+	if (rqc) {
+		/* Claim host for the first request in a series of requests */
+		if (!rqp)
+			mmc_claim_host(card->host);
 
-		mmc_queue_bounce_post(mq->mqrq_cur);
-		status = mmc_blk_get_status(brq, req, card, md);
+		/* Prepare a new request */
+		mmc_blk_rw_rq_prep(mq->mqrq_cur, card, 0, mq);
+		mmc_pre_req(card->host, &brqc->mrq, !rqp);
+	}
+	do {
+		/*
+		 * If there is an ongoing request, indicated by rqp, wait for
+		 * it to finish before starting a new one.
+		 */
+		if (rqp)
+			mmc_wait_for_req_done(&brqp->mrq);
+		else {
+			/* start a new asynchronous request */
+			mmc_start_req(card->host, &brqc->mrq);
+			goto out;
+		}
+		status = mmc_blk_get_status(brqp, rqp, card, md);
+		if (status != MMC_BLK_SUCCESS) {
+			mmc_post_req(card->host, &brqp->mrq, -EINVAL);
+			mmc_queue_bounce_post(mqrqp);
+			if (rqc)
+				mmc_post_req(card->host, &brqc->mrq, -EINVAL);
+		}
 
 		switch (status) {
+		case MMC_BLK_SUCCESS:
+			/*
+			 * A block was successfully transferred.
+			 */
+
+			/*
+			 * All data is transferred without errors.
+			 * Defer mmc post processing and _blk_end_request
+			 * until after the new request is started.
+			 */
+			if (blk_rq_bytes(rqp) == brqp->data.bytes_xfered)
+				break;
+
+			mmc_post_req(card->host, &brqp->mrq, 0);
+			mmc_queue_bounce_post(mqrqp);
+
+			spin_lock_irq(&md->lock);
+			ret = __blk_end_request(rqp, 0,
+						brqp->data.bytes_xfered);
+			spin_unlock_irq(&md->lock);
+
+			if (rqc)
+				mmc_post_req(card->host, &brqc->mrq, -EINVAL);
+			break;
 		case MMC_BLK_CMD_ERR:
 			goto cmd_err;
 			break;
@@ -548,27 +599,73 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			 * read a single sector.
 			 */
 			spin_lock_irq(&md->lock);
-			ret = __blk_end_request(req, -EIO,
-						brq->data.blksz);
+			ret = __blk_end_request(rqp, -EIO, brqp->data.blksz);
 			spin_unlock_irq(&md->lock);
-
+			if (rqc && !ret)
+				mmc_pre_req(card->host, &brqc->mrq, false);
 			break;
-		case MMC_BLK_SUCCESS:
+		}
+
+		if (ret) {
 			/*
-			 * A block was successfully transferred.
+			 * In case of an incomplete request,
+			 * prepare it again and resend.
 			 */
-			spin_lock_irq(&md->lock);
-			ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
-			spin_unlock_irq(&md->lock);
-			break;
+			mmc_blk_rw_rq_prep(mqrqp, card, disable_multi, mq);
+			mmc_pre_req(card->host, &brqp->mrq, true);
+			mmc_start_req(card->host, &brqp->mrq);
+			if (rqc)
+				mmc_pre_req(card->host, &brqc->mrq, false);
 		}
 	} while (ret);
 
-	mmc_release_host(card->host);
+	/* Previous request is completed, start the new request if any */
+	if (rqc)
+		mmc_start_req(card->host, &brqc->mrq);
+
+	/*
+	 * Post process the previous request while the new request is active.
+	 * In case of error the request is already ended.
+	 */
+	if (status == MMC_BLK_SUCCESS) {
+		mmc_post_req(card->host, &brqp->mrq, 0);
+		mmc_queue_bounce_post(mqrqp);
+
+		spin_lock_irq(&md->lock);
+		ret = __blk_end_request(rqp, 0, brqp->data.bytes_xfered);
+		spin_unlock_irq(&md->lock);
+
+		if (ret) {
+			/* If this happens it is a bug */
+			printk(KERN_ERR "[%s] BUG: rq_bytes %d xfered %d\n",
+			       __func__, blk_rq_bytes(rqp),
+			       brqp->data.bytes_xfered);
+			goto cmd_err;
+		}
+	}
+
+	/* 1 indicates one request has been completed */
+	ret = 1;
+ out:
+	/*
+	 * TODO: Find out if it is OK to only release the host after the
+	 *       last request. For the last request the current request
+	 *       is NULL, which means no requests are pending.
+	 */
+	/* Release host for the last request in a series of requests */
+	if (!rqc)
+		mmc_release_host(card->host);
 
-	return 1;
+	/* Current request becomes previous request and vice versa. */
+	mqrqp->brq.mrq.data = NULL;
+	mqrqp->req = NULL;
+	mq->mqrq_prev = mq->mqrq_cur;
+	mq->mqrq_cur = mqrqp;
+
+	return ret;
 
  cmd_err:
+
  	/*
  	 * If this is an SD card and we're writing, we can first
  	 * mark the known good sectors as ok.
@@ -583,12 +680,12 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		blocks = mmc_sd_num_wr_blocks(card);
 		if (blocks != (u32)-1) {
 			spin_lock_irq(&md->lock);
-			ret = __blk_end_request(req, 0, blocks << 9);
+			ret = __blk_end_request(rqp, 0, blocks << 9);
 			spin_unlock_irq(&md->lock);
 		}
 	} else {
 		spin_lock_irq(&md->lock);
-		ret = __blk_end_request(req, 0, brq->data.bytes_xfered);
+		ret = __blk_end_request(rqp, 0, brqp->data.bytes_xfered);
 		spin_unlock_irq(&md->lock);
 	}
 
@@ -596,15 +693,27 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 
 	spin_lock_irq(&md->lock);
 	while (ret)
-		ret = __blk_end_request(req, -EIO, blk_rq_cur_bytes(req));
+		ret = __blk_end_request(rqp, -EIO, blk_rq_cur_bytes(rqp));
 	spin_unlock_irq(&md->lock);
 
+	if (rqc) {
+		mmc_claim_host(card->host);
+		mmc_pre_req(card->host, &brqc->mrq, false);
+		mmc_start_req(card->host, &brqc->mrq);
+	}
+
+	/* Current request becomes previous request and vice versa. */
+	mqrqp->brq.mrq.data = NULL;
+	mqrqp->req = NULL;
+	mq->mqrq_prev = mq->mqrq_cur;
+	mq->mqrq_cur = mqrqp;
+
 	return 0;
 }
 
 static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 {
-	if (req->cmd_flags & REQ_DISCARD) {
+	if (req && req->cmd_flags & REQ_DISCARD) {
 		if (req->cmd_flags & REQ_SECURE)
 			return mmc_blk_issue_secdiscard_rq(mq, req);
 		else
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index eef3510..2b14d1c 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -59,6 +59,7 @@ static int mmc_queue_thread(void *d)
 		mq->mqrq_cur->req = req;
 		spin_unlock_irq(q->queue_lock);
 
+		mq->issue_fn(mq, req);
 		if (!req) {
 			if (kthread_should_stop()) {
 				set_current_state(TASK_RUNNING);
@@ -71,7 +72,6 @@ static int mmc_queue_thread(void *d)
 		}
 		set_current_state(TASK_RUNNING);
 
-		mq->issue_fn(mq, req);
 	} while (1);
 	up(&mq->thread_sem);
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 09/12] mmc: test: add random fault injection in core.c
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

This simple fault injection proved to be very useful for
testing the error handling in the block.c rw_rq(). It may
also be useful for testing whether the host driver handles
pre_req() and post_req() correctly in case of errors.
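
For reference, the knobs follow the generic fault-injection framework
(an assumption based on the setup_fault_attr()/init_fault_attr_dentries()
calls below, not spelled out in this message), so the boot parameter
takes the usual form:

	fail_mmc_request=<interval>,<probability>,<space>,<times>

combined with the per-host debugfs switch added below, e.g.
/sys/kernel/debug/mmc0/make-it-fail.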

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/core/core.c    |   54 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/core/debugfs.c |    5 ++++
 include/linux/mmc/host.h   |    4 ++-
 lib/Kconfig.debug          |   11 +++++++++
 4 files changed, 73 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index e88dd36..85296df 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -23,6 +23,8 @@
 #include <linux/log2.h>
 #include <linux/regulator/consumer.h>
 #include <linux/pm_runtime.h>
+#include <linux/fault-inject.h>
+#include <linux/random.h>
 
 #include <linux/mmc/card.h>
 #include <linux/mmc/host.h>
@@ -82,6 +84,56 @@ static void mmc_flush_scheduled_work(void)
 	flush_workqueue(workqueue);
 }
 
+#ifdef CONFIG_FAIL_MMC_REQUEST
+
+static DECLARE_FAULT_ATTR(fail_mmc_request);
+
+static int __init setup_fail_mmc_request(char *str)
+{
+	return setup_fault_attr(&fail_mmc_request, str);
+}
+__setup("fail_mmc_request=", setup_fail_mmc_request);
+
+static void mmc_should_fail_request(struct mmc_host *host,
+				    struct mmc_request *mrq)
+{
+	struct mmc_command *cmd = mrq->cmd;
+	struct mmc_data *data = mrq->data;
+	static const int data_errors[] = {
+		-ETIMEDOUT,
+		-EILSEQ,
+		-EIO,
+	};
+
+	if (!data)
+		return;
+
+	if (cmd->error || data->error || !host->make_it_fail ||
+	    !should_fail(&fail_mmc_request, data->blksz * data->blocks))
+		return;
+
+	data->error = data_errors[random32() % ARRAY_SIZE(data_errors)];
+	data->bytes_xfered = (random32() % (data->bytes_xfered >> 9)) << 9;
+}
+
+static int __init fail_mmc_request_debugfs(void)
+{
+	return init_fault_attr_dentries(&fail_mmc_request,
+					"fail_mmc_request");
+}
+
+late_initcall(fail_mmc_request_debugfs);
+
+#else /* CONFIG_FAIL_MMC_REQUEST */
+
+static inline void mmc_should_fail_request(struct mmc_host *host,
+					   struct mmc_request *mrq)
+{
+}
+
+#endif /* CONFIG_FAIL_MMC_REQUEST */
+
+
 /**
  *	mmc_request_done - finish processing an MMC request
  *	@host: MMC host which completed request
@@ -108,6 +160,8 @@ void mmc_request_done(struct mmc_host *host, struct mmc_request *mrq)
 		cmd->error = 0;
 		host->ops->request(host, mrq);
 	} else {
+		mmc_should_fail_request(host, mrq);
+
 		led_trigger_event(host->led, LED_OFF);
 
 		pr_debug("%s: req done (CMD%u): %d: %08x %08x %08x %08x\n",
diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index 998797e..588e76f 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -188,6 +188,11 @@ void mmc_add_host_debugfs(struct mmc_host *host)
 				root, &host->clk_delay))
 		goto err_node;
 #endif
+#ifdef CONFIG_FAIL_MMC_REQUEST
+	if (!debugfs_create_u8("make-it-fail", S_IRUSR | S_IWUSR,
+			       root, &host->make_it_fail))
+		goto err_node;
+#endif
 	return;
 
 err_node:
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index c056a3d..8b2b44b 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -251,7 +251,9 @@ struct mmc_host {
 #endif
 
 	struct dentry		*debugfs_root;
-
+#ifdef CONFIG_FAIL_MMC_REQUEST
+	u8			make_it_fail;
+#endif
 	unsigned long		private[0] ____cacheline_aligned;
 };
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index df9234c..180620b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1059,6 +1059,17 @@ config FAIL_IO_TIMEOUT
 	  Only works with drivers that use the generic timeout handling,
 	  for others it wont do anything.
 
+config FAIL_MMC_REQUEST
+	bool "Fault-injection capability for MMC IO"
+	select DEBUG_FS
+	depends on FAULT_INJECTION
+	help
+	  Provide fault-injection capability for MMC IO.
+	  This will make the mmc core return data errors. This is
+	  useful for testing the error handling in the mmc block device
+	  and how the mmc host driver handles retries from
+	  the block device.
+
 config FAULT_INJECTION_DEBUG_FS
 	bool "Debugfs entries for fault-injection capabilities"
 	depends on FAULT_INJECTION && SYSFS && DEBUG_FS
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 10/12] omap_hsmmc: use original sg_len for dma_unmap_sg
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

Don't use the sg_len returned by dma_map_sg() as the nents
parameter to dma_unmap_sg(). Use the original sg_len for both
dma_map_sg() and dma_unmap_sg().
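
The DMA API contract being applied, as a minimal sketch (not code from
this patch):

	int nents;

	/* dma_map_sg() may coalesce entries and return nents < sg_len */
	nents = dma_map_sg(dev, sg, sg_len, dir);

	/* ... run the transfer over the nents mapped entries ... */

	/* unmap with the original sg_len, never the returned nents */
	dma_unmap_sg(dev, sg, sg_len, dir);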

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/host/omap_hsmmc.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index 259ece0..ad3731a 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -959,7 +959,8 @@ static void omap_hsmmc_dma_cleanup(struct omap_hsmmc_host *host, int errno)
 	spin_unlock(&host->irq_lock);
 
 	if (host->use_dma && dma_ch != -1) {
-		dma_unmap_sg(mmc_dev(host->mmc), host->data->sg, host->dma_len,
+		dma_unmap_sg(mmc_dev(host->mmc), host->data->sg,
+			host->data->sg_len,
 			omap_hsmmc_get_dma_dir(host, host->data));
 		omap_free_dma(dma_ch);
 	}
@@ -1343,7 +1344,7 @@ static void omap_hsmmc_dma_cb(int lch, u16 ch_status, void *cb_data)
 		return;
 	}
 
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg, host->dma_len,
+	dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
 		omap_hsmmc_get_dma_dir(host, data));
 
 	req_in_progress = host->req_in_progress;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 11/12] omap_hsmmc: add support for pre_req and post_req
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
If pre_req() is not called before omap_hsmmc_request(), request()
will prepare the cache just as it did before.
Using pre_req() and post_req() is optional.
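
Seen from the core, the hooks bracket a request roughly like this
(sketch only; the exact call sites are in the core patches earlier in
this series):

	/* while the previous transfer still runs: map the next one */
	mmc_pre_req(host, &next->mrq, is_first_req);	/* dma_map_sg() */

	/* next->mrq is issued once the previous request completes */

	/* after completion, while a newer request is already running */
	mmc_post_req(host, &next->mrq, err);		/* dma_unmap_sg() */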

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/host/omap_hsmmc.c |   87 +++++++++++++++++++++++++++++++++++++++--
 1 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index ad3731a..2116c09 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -141,6 +141,11 @@
 #define OMAP_HSMMC_WRITE(base, reg, val) \
 	__raw_writel((val), (base) + OMAP_HSMMC_##reg)
 
+struct omap_hsmmc_next {
+	unsigned int	dma_len;
+	s32		cookie;
+};
+
 struct omap_hsmmc_host {
 	struct	device		*dev;
 	struct	mmc_host	*mmc;
@@ -184,6 +189,7 @@ struct omap_hsmmc_host {
 	int			reqs_blocked;
 	int			use_reg;
 	int			req_in_progress;
+	struct omap_hsmmc_next	next_data;
 
 	struct	omap_mmc_platform_data	*pdata;
 };
@@ -1344,8 +1350,9 @@ static void omap_hsmmc_dma_cb(int lch, u16 ch_status, void *cb_data)
 		return;
 	}
 
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
-		omap_hsmmc_get_dma_dir(host, data));
+	if (!data->host_cookie)
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     omap_hsmmc_get_dma_dir(host, data));
 
 	req_in_progress = host->req_in_progress;
 	dma_ch = host->dma_ch;
@@ -1363,6 +1370,45 @@ static void omap_hsmmc_dma_cb(int lch, u16 ch_status, void *cb_data)
 	}
 }
 
+static int omap_hsmmc_pre_dma_transfer(struct omap_hsmmc_host *host,
+				       struct mmc_data *data,
+				       struct omap_hsmmc_next *next)
+{
+	int dma_len;
+
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next ||
+	    (!next && data->host_cookie != host->next_data.cookie)) {
+		dma_len = dma_map_sg(mmc_dev(host->mmc), data->sg,
+				     data->sg_len,
+				     omap_hsmmc_get_dma_dir(host, data));
+
+	} else {
+		dma_len = host->next_data.dma_len;
+		host->next_data.dma_len = 0;
+	}
+
+
+	if (dma_len == 0)
+		return -EINVAL;
+
+	if (next) {
+		next->dma_len = dma_len;
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+	} else
+		host->dma_len = dma_len;
+
+	return 0;
+}
+
 /*
  * Routine to configure and start DMA for the MMC card
  */
@@ -1396,9 +1442,10 @@ static int omap_hsmmc_start_dma_transfer(struct omap_hsmmc_host *host,
 			mmc_hostname(host->mmc), ret);
 		return ret;
 	}
+	ret = omap_hsmmc_pre_dma_transfer(host, data, NULL);
+	if (ret)
+		return ret;
 
-	host->dma_len = dma_map_sg(mmc_dev(host->mmc), data->sg,
-			data->sg_len, omap_hsmmc_get_dma_dir(host, data));
 	host->dma_ch = dma_ch;
 	host->dma_sg_idx = 0;
 
@@ -1478,6 +1525,35 @@ omap_hsmmc_prepare_data(struct omap_hsmmc_host *host, struct mmc_request *req)
 	return 0;
 }
 
+static void omap_hsmmc_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+				int err)
+{
+	struct omap_hsmmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (host->use_dma) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     omap_hsmmc_get_dma_dir(host, data));
+		data->host_cookie = 0;
+	}
+}
+
+static void omap_hsmmc_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			       bool is_first_req)
+{
+	struct omap_hsmmc_host *host = mmc_priv(mmc);
+
+	if (mrq->data->host_cookie) {
+		mrq->data->host_cookie = 0;
+		return;
+	}
+
+	if (host->use_dma)
+		if (omap_hsmmc_pre_dma_transfer(host, mrq->data,
+						&host->next_data))
+			mrq->data->host_cookie = 0;
+}
+
 /*
  * Request function. for read/write operation
  */
@@ -1926,6 +2002,8 @@ static int omap_hsmmc_disable_fclk(struct mmc_host *mmc, int lazy)
 static const struct mmc_host_ops omap_hsmmc_ops = {
 	.enable = omap_hsmmc_enable_fclk,
 	.disable = omap_hsmmc_disable_fclk,
+	.post_req = omap_hsmmc_post_req,
+	.pre_req = omap_hsmmc_pre_req,
 	.request = omap_hsmmc_request,
 	.set_ios = omap_hsmmc_set_ios,
 	.get_cd = omap_hsmmc_get_cd,
@@ -2075,6 +2153,7 @@ static int __init omap_hsmmc_probe(struct platform_device *pdev)
 	host->mapbase	= res->start;
 	host->base	= ioremap(host->mapbase, SZ_4K);
 	host->power_mode = MMC_POWER_OFF;
+	host->next_data.cookie = 1;
 
 	platform_set_drvdata(pdev, host);
 	INIT_WORK(&host->mmc_carddetect_work, omap_hsmmc_detect);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 11/12] omap_hsmmc: add support for pre_req and post_req
@ 2011-04-06 19:07   ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

pre_req() runs dma_map_sg() post_req() runs dma_unmap_sg.
If not calling pre_req() before omap_hsmmc_request(), request()
will prepare the cache just like it did it before.
It is optional to use pre_req() and post_req().

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/host/omap_hsmmc.c |   87 +++++++++++++++++++++++++++++++++++++++--
 1 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index ad3731a..2116c09 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -141,6 +141,11 @@
 #define OMAP_HSMMC_WRITE(base, reg, val) \
 	__raw_writel((val), (base) + OMAP_HSMMC_##reg)
 
+struct omap_hsmmc_next {
+	unsigned int	dma_len;
+	s32		cookie;
+};
+
 struct omap_hsmmc_host {
 	struct	device		*dev;
 	struct	mmc_host	*mmc;
@@ -184,6 +189,7 @@ struct omap_hsmmc_host {
 	int			reqs_blocked;
 	int			use_reg;
 	int			req_in_progress;
+	struct omap_hsmmc_next	next_data;
 
 	struct	omap_mmc_platform_data	*pdata;
 };
@@ -1344,8 +1350,9 @@ static void omap_hsmmc_dma_cb(int lch, u16 ch_status, void *cb_data)
 		return;
 	}
 
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
-		omap_hsmmc_get_dma_dir(host, data));
+	if (!data->host_cookie)
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     omap_hsmmc_get_dma_dir(host, data));
 
 	req_in_progress = host->req_in_progress;
 	dma_ch = host->dma_ch;
@@ -1363,6 +1370,45 @@ static void omap_hsmmc_dma_cb(int lch, u16 ch_status, void *cb_data)
 	}
 }
 
+static int omap_hsmmc_pre_dma_transfer(struct omap_hsmmc_host *host,
+				       struct mmc_data *data,
+				       struct omap_hsmmc_next *next)
+{
+	int dma_len;
+
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next ||
+	    (!next && data->host_cookie != host->next_data.cookie)) {
+		dma_len = dma_map_sg(mmc_dev(host->mmc), data->sg,
+				     data->sg_len,
+				     omap_hsmmc_get_dma_dir(host, data));
+
+	} else {
+		dma_len = host->next_data.dma_len;
+		host->next_data.dma_len = 0;
+	}
+
+
+	if (dma_len == 0)
+		return -EINVAL;
+
+	if (next) {
+		next->dma_len = dma_len;
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+	} else
+		host->dma_len = dma_len;
+
+	return 0;
+}
+
 /*
  * Routine to configure and start DMA for the MMC card
  */
@@ -1396,9 +1442,10 @@ static int omap_hsmmc_start_dma_transfer(struct omap_hsmmc_host *host,
 			mmc_hostname(host->mmc), ret);
 		return ret;
 	}
+	ret = omap_hsmmc_pre_dma_transfer(host, data, NULL);
+	if (ret)
+		return ret;
 
-	host->dma_len = dma_map_sg(mmc_dev(host->mmc), data->sg,
-			data->sg_len, omap_hsmmc_get_dma_dir(host, data));
 	host->dma_ch = dma_ch;
 	host->dma_sg_idx = 0;
 
@@ -1478,6 +1525,35 @@ omap_hsmmc_prepare_data(struct omap_hsmmc_host *host, struct mmc_request *req)
 	return 0;
 }
 
+static void omap_hsmmc_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+				int err)
+{
+	struct omap_hsmmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (host->use_dma) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     omap_hsmmc_get_dma_dir(host, data));
+		data->host_cookie = 0;
+	}
+}
+
+static void omap_hsmmc_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			       bool is_first_req)
+{
+	struct omap_hsmmc_host *host = mmc_priv(mmc);
+
+	if (mrq->data->host_cookie) {
+		mrq->data->host_cookie = 0;
+		return ;
+	}
+
+	if (host->use_dma)
+		if (omap_hsmmc_pre_dma_transfer(host, mrq->data,
+						&host->next_data))
+			mrq->data->host_cookie = 0;
+}
+
 /*
  * Request function. for read/write operation
  */
@@ -1926,6 +2002,8 @@ static int omap_hsmmc_disable_fclk(struct mmc_host *mmc, int lazy)
 static const struct mmc_host_ops omap_hsmmc_ops = {
 	.enable = omap_hsmmc_enable_fclk,
 	.disable = omap_hsmmc_disable_fclk,
+	.post_req = omap_hsmmc_post_req,
+	.pre_req = omap_hsmmc_pre_req,
 	.request = omap_hsmmc_request,
 	.set_ios = omap_hsmmc_set_ios,
 	.get_cd = omap_hsmmc_get_cd,
@@ -2075,6 +2153,7 @@ static int __init omap_hsmmc_probe(struct platform_device *pdev)
 	host->mapbase	= res->start;
 	host->base	= ioremap(host->mapbase, SZ_4K);
 	host->power_mode = MMC_POWER_OFF;
+	host->next_data.cookie = 1;
 
 	platform_set_drvdata(pdev, host);
 	INIT_WORK(&host->mmc_carddetect_work, omap_hsmmc_detect);
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH v2 12/12] mmci: implement pre_req() and post_req()
  2011-04-06 19:07 ` Per Forlin
@ 2011-04-06 19:07   ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

pre_req() runs dma_map_sg() and prepares the dma descriptor
for the next mmc data transfer. post_req() runs dma_unmap_sg().
If pre_req() is not called before mmci_request(), mmci_request()
will prepare the cache and DMA just as it did before.
Using pre_req() and post_req() is optional for mmci.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/host/mmci.c |  146 ++++++++++++++++++++++++++++++++++++++++++----
 drivers/mmc/host/mmci.h |    8 +++
 2 files changed, 141 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index b4a7e4f..985c77d 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -320,7 +320,8 @@ static void mmci_dma_unmap(struct mmci_host *host, struct mmc_data *data)
 		dir = DMA_FROM_DEVICE;
 	}
 
-	dma_unmap_sg(chan->device->dev, data->sg, data->sg_len, dir);
+	if (!data->host_cookie)
+		dma_unmap_sg(chan->device->dev, data->sg, data->sg_len, dir);
 
 	/*
 	 * Use of DMA with scatter-gather is impossible.
@@ -338,7 +339,8 @@ static void mmci_dma_data_error(struct mmci_host *host)
 	dmaengine_terminate_all(host->dma_current);
 }
 
-static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
+static int mmci_dma_prep_data(struct mmci_host *host, struct mmc_data *data,
+			      struct mmci_host_next *next)
 {
 	struct variant_data *variant = host->variant;
 	struct dma_slave_config conf = {
@@ -349,13 +351,20 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 		.src_maxburst = variant->fifohalfsize >> 2, /* # of words */
 		.dst_maxburst = variant->fifohalfsize >> 2, /* # of words */
 	};
-	struct mmc_data *data = host->data;
 	struct dma_chan *chan;
 	struct dma_device *device;
 	struct dma_async_tx_descriptor *desc;
 	int nr_sg;
 
-	host->dma_current = NULL;
+	/* Check if next job is already prepared */
+	if (data->host_cookie && !next &&
+	    host->dma_current && host->dma_desc_current)
+		return 0;
+
+	if (!next) {
+		host->dma_current = NULL;
+		host->dma_desc_current = NULL;
+	}
 
 	if (data->flags & MMC_DATA_READ) {
 		conf.direction = DMA_FROM_DEVICE;
@@ -370,7 +379,7 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 		return -EINVAL;
 
 	/* If less than or equal to the fifo size, don't bother with DMA */
-	if (host->size <= variant->fifosize)
+	if (data->blksz * data->blocks <= variant->fifosize)
 		return -EINVAL;
 
 	device = chan->device;
@@ -384,14 +393,38 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 	if (!desc)
 		goto unmap_exit;
 
-	/* Okay, go for it. */
-	host->dma_current = chan;
+	if (next) {
+		next->dma_chan = chan;
+		next->dma_desc = desc;
+	} else {
+		host->dma_current = chan;
+		host->dma_desc_current = desc;
+	}
+
+	return 0;
 
+ unmap_exit:
+	if (!next)
+		dmaengine_terminate_all(chan);
+	dma_unmap_sg(device->dev, data->sg, data->sg_len, conf.direction);
+	return -ENOMEM;
+}
+
+static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
+{
+	int ret;
+	struct mmc_data *data = host->data;
+
+	ret = mmci_dma_prep_data(host, host->data, NULL);
+	if (ret)
+		return ret;
+
+	/* Okay, go for it. */
 	dev_vdbg(mmc_dev(host->mmc),
 		 "Submit MMCI DMA job, sglen %d blksz %04x blks %04x flags %08x\n",
 		 data->sg_len, data->blksz, data->blocks, data->flags);
-	dmaengine_submit(desc);
-	dma_async_issue_pending(chan);
+	dmaengine_submit(host->dma_desc_current);
+	dma_async_issue_pending(host->dma_current);
 
 	datactrl |= MCI_DPSM_DMAENABLE;
 
@@ -406,14 +439,90 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 	writel(readl(host->base + MMCIMASK0) | MCI_DATAENDMASK,
 	       host->base + MMCIMASK0);
 	return 0;
+}
 
-unmap_exit:
-	dmaengine_terminate_all(chan);
-	dma_unmap_sg(device->dev, data->sg, data->sg_len, conf.direction);
-	return -ENOMEM;
+static void mmci_get_next_data(struct mmci_host *host, struct mmc_data *data)
+{
+	struct mmci_host_next *next = &host->next_data;
+
+	if (data->host_cookie && data->host_cookie != next->cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	if (!data->host_cookie)
+		return;
+
+	host->dma_desc_current = next->dma_desc;
+	host->dma_current = next->dma_chan;
+
+	next->dma_desc = NULL;
+	next->dma_chan = NULL;
 }
+
+static void mmci_pre_request(struct mmc_host *mmc, struct mmc_request *mrq,
+			     bool is_first_req)
+{
+	struct mmci_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+	struct mmci_host_next *nd = &host->next_data;
+
+	if (!data)
+		return;
+
+	if (data->host_cookie) {
+		data->host_cookie = 0;
+		return;
+	}
+
+	/* if config for dma */
+	if (((data->flags & MMC_DATA_WRITE) && host->dma_tx_channel) ||
+	    ((data->flags & MMC_DATA_READ) && host->dma_rx_channel)) {
+		if (mmci_dma_prep_data(host, data, nd))
+			data->host_cookie = 0;
+		else
+			data->host_cookie = ++nd->cookie < 0 ? 1 : nd->cookie;
+	}
+}
+
+static void mmci_post_request(struct mmc_host *mmc, struct mmc_request *mrq,
+			      int err)
+{
+	struct mmci_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+	struct dma_chan *chan;
+	enum dma_data_direction dir;
+
+	if (!data)
+		return;
+
+	if (data->flags & MMC_DATA_READ) {
+		dir = DMA_FROM_DEVICE;
+		chan = host->dma_rx_channel;
+	} else {
+		dir = DMA_TO_DEVICE;
+		chan = host->dma_tx_channel;
+	}
+
+
+	/* if config for dma */
+	if (chan) {
+		if (err)
+			dmaengine_terminate_all(chan);
+		if (err || data->host_cookie)
+			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+				     data->sg_len, dir);
+		mrq->data->host_cookie = 0;
+	}
+}
+
 #else
 /* Blank functions if the DMA engine is not available */
+static void mmci_get_next_data(struct mmci_host *host, struct mmc_data *data)
+{
+}
 static inline void mmci_dma_setup(struct mmci_host *host)
 {
 }
@@ -434,6 +543,10 @@ static inline int mmci_dma_start_data(struct mmci_host *host, unsigned int datac
 {
 	return -ENOSYS;
 }
+
+#define mmci_pre_request NULL
+#define mmci_post_request NULL
+
 #endif
 
 static void mmci_start_data(struct mmci_host *host, struct mmc_data *data)
@@ -852,6 +965,9 @@ static void mmci_request(struct mmc_host *mmc, struct mmc_request *mrq)
 
 	host->mrq = mrq;
 
+	if (mrq->data)
+		mmci_get_next_data(host, mrq->data);
+
 	if (mrq->data && mrq->data->flags & MMC_DATA_READ)
 		mmci_start_data(host, mrq->data);
 
@@ -966,6 +1082,8 @@ static irqreturn_t mmci_cd_irq(int irq, void *dev_id)
 
 static const struct mmc_host_ops mmci_ops = {
 	.request	= mmci_request,
+	.pre_req	= mmci_pre_request,
+	.post_req	= mmci_post_request,
 	.set_ios	= mmci_set_ios,
 	.get_ro		= mmci_get_ro,
 	.get_cd		= mmci_get_cd,
@@ -1003,6 +1121,8 @@ static int __devinit mmci_probe(struct amba_device *dev,
 	host->gpio_cd = -ENOSYS;
 	host->gpio_cd_irq = -1;
 
+	host->next_data.cookie = 1;
+
 	host->hw_designer = amba_manf(dev);
 	host->hw_revision = amba_rev(dev);
 	dev_dbg(mmc_dev(mmc), "designer ID = 0x%02x\n", host->hw_designer);
diff --git a/drivers/mmc/host/mmci.h b/drivers/mmc/host/mmci.h
index ec9a7bc6..e21d850 100644
--- a/drivers/mmc/host/mmci.h
+++ b/drivers/mmc/host/mmci.h
@@ -150,6 +150,12 @@ struct clk;
 struct variant_data;
 struct dma_chan;
 
+struct mmci_host_next {
+	struct dma_async_tx_descriptor	*dma_desc;
+	struct dma_chan			*dma_chan;
+	s32				cookie;
+};
+
 struct mmci_host {
 	phys_addr_t		phybase;
 	void __iomem		*base;
@@ -187,6 +193,8 @@ struct mmci_host {
 	struct dma_chan		*dma_current;
 	struct dma_chan		*dma_rx_channel;
 	struct dma_chan		*dma_tx_channel;
+	struct dma_async_tx_descriptor	*dma_desc_current;
+	struct mmci_host_next	next_data;
 
 #define dma_inprogress(host)	((host)->dma_current)
 #else
-- 
1.7.4.1
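
To make the calling convention concrete, here is a minimal sketch of the
overlap these hooks enable, assuming the pre_req()/post_req() signatures
introduced above; wait_for_cur_to_complete() is a made-up placeholder for
the core's completion handling, not a real API.

/*
 * Illustrative sketch only, not part of the series. Assumes the
 * pre_req()/post_req() hook signatures introduced above;
 * wait_for_cur_to_complete() is a hypothetical placeholder.
 */
static void issue_overlapped(struct mmc_host *mmc,
			     struct mmc_request *cur,
			     struct mmc_request *next)
{
	if (mmc->ops->pre_req)
		mmc->ops->pre_req(mmc, cur, true);	/* map cur's sg list */

	mmc->ops->request(mmc, cur);		/* cur is now in flight */

	if (mmc->ops->pre_req)
		mmc->ops->pre_req(mmc, next, false);	/* overlap: map next */

	wait_for_cur_to_complete(mmc);		/* hypothetical wait */

	if (mmc->ops->post_req)
		mmc->ops->post_req(mmc, cur, 0);	/* unmap cur's sg list */
}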


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-06 19:07 ` Per Forlin
@ 2011-04-08 16:49   ` Linus Walleij
  -1 siblings, 0 replies; 129+ messages in thread
From: Linus Walleij @ 2011-04-08 16:49 UTC (permalink / raw)
  To: Per Forlin
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On Wed, Apr 6, 2011 at 9:07 PM, Per Forlin <per.forlin@linaro.org> wrote:

> The intention for introducing none blocking mmc requests is to minimize the
> time between a mmc request ends and another mmc request starts.

Acked-by: Linus Walleij <linus.walleij@linaro.org> FWIW, I've looked at the
patches so many times I'm already blind for any remaining bugs...

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-08 16:49   ` Linus Walleij
@ 2011-04-09 11:55     ` Jae hoon Chung
  -1 siblings, 0 replies; 129+ messages in thread
From: Jae hoon Chung @ 2011-04-09 11:55 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Per Forlin, linux-mmc, linux-arm-kernel, linux-kernel,
	linaro-dev, Chris Ball

Hi Per,

I applied your patch and sent a patch for dw_mmc.c.
I think this approach is good.

Tested-by: Jaehoon Chung <jh80.chung@samsung.com>

Regards,
Jaehoon Chung

2011/4/9 Linus Walleij <linus.walleij@linaro.org>:
> On Wed, Apr 6, 2011 at 9:07 PM, Per Forlin <per.forlin@linaro.org> wrote:
>
>> The intention for introducing none blocking mmc requests is to minimize the
>> time between a mmc request ends and another mmc request starts.
>
> Acked-by: Linus Walleij <linus.walleij@linaro.org> FWIW, I've looked at the
> patches so many times I'm already blind for any remaining bugs...
>
> Yours,
> Linus Walleij
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
@ 2011-04-10  3:33         ` anish singh
  0 siblings, 0 replies; 129+ messages in thread
From: anish singh @ 2011-04-10  3:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Apr 9, 2011 at 5:25 PM, Jae hoon Chung <jh80.chung@gmail.com> wrote:

> Hi Per,
>
> I applied your patch and sent a patch for dw_mmc.c.
> I think this approach is good.
>
Per, I too want to test this patch of yours, but I have a Samsung board
with an SDHCI host controller, so I guess I just need to write pre_req
and post_req functions and that would enable this (nonblock mmc request)
on my board. I hope I am right?

>
> Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
>
> Regards,
> Jaehoon Chung
>
> 2011/4/9 Linus Walleij <linus.walleij@linaro.org>:
> > On Wed, Apr 6, 2011 at 9:07 PM, Per Forlin <per.forlin@linaro.org>
> wrote:
> >
> >> The intention for introducing none blocking mmc requests is to minimize
> the
> >> time between a mmc request ends and another mmc request starts.
> >
> > Acked-by: Linus Walleij <linus.walleij@linaro.org> FWIW, I've looked at
> the
> > patches so many times I'm already blind for any remaining bugs...
> >
> > Yours,
> > Linus Walleij
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-10  3:33         ` anish singh
@ 2011-04-11  9:03           ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-11  9:03 UTC (permalink / raw)
  To: anish singh
  Cc: Jae hoon Chung, Linus Walleij, linux-mmc, linux-arm-kernel,
	linux-kernel, linaro-dev, Chris Ball

On 10 April 2011 05:33, anish singh <anish198519851985@gmail.com> wrote:
>
>
> On Sat, Apr 9, 2011 at 5:25 PM, Jae hoon Chung <jh80.chung@gmail.com> wrote:
>>
>> Hi Per,
>>
>> I applied your patch and sent a patch for dw_mmc.c.
>> I think this approach is good.
>
> Per, I too want to test this patch of yours, but I have a Samsung board
> with an SDHCI host controller, so I guess I just need to write pre_req
> and post_req functions and that would enable this (nonblock mmc request)
> on my board. I hope I am right?
You are right. The SDHCI driver should work just fine without
implementing pre_req and post_req, but you won't see any performance
increase for DMA usage.
In order to benefit from the patchset you need to implement pre_req
and post_req.

If you run into trouble implementing those hooks or if you have any
questions please let me know.

Regards,
Per
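
For anyone writing these hooks from scratch, a bare-bones skeleton is
sketched below, mirroring the host_cookie pattern that omap_hsmmc and
mmci use in this series; struct my_host and its members are invented
for illustration and are not part of any real driver.

/* Illustrative skeleton only; my_host and its members are made up. */
static void my_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
		       bool is_first_req)
{
	struct my_host *host = mmc_priv(mmc);
	struct mmc_data *data = mrq->data;

	if (!data || data->host_cookie)
		return;

	/* Map the sg list ahead of time; the cookie marks it prepared. */
	host->next_sg_count = dma_map_sg(mmc_dev(mmc), data->sg,
					 data->sg_len,
					 (data->flags & MMC_DATA_WRITE) ?
					 DMA_TO_DEVICE : DMA_FROM_DEVICE);
	if (host->next_sg_count)
		data->host_cookie = ++host->cookie < 0 ? 1 : host->cookie;
}

static void my_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
			int err)
{
	struct mmc_data *data = mrq->data;

	if (data && data->host_cookie) {
		/* Undo the mapping done in my_pre_req(). */
		dma_unmap_sg(mmc_dev(mmc), data->sg, data->sg_len,
			     (data->flags & MMC_DATA_WRITE) ?
			     DMA_TO_DEVICE : DMA_FROM_DEVICE);
		data->host_cookie = 0;
	}
}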

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
@ 2011-04-11  9:07               ` Sachin Nikam
  0 siblings, 0 replies; 129+ messages in thread
From: Sachin Nikam @ 2011-04-11  9:07 UTC (permalink / raw)
  To: linux-arm-kernel

Does anybody know which kernel version will have support for the SD 4.0
spec's UHS-II?


On Mon, Apr 11, 2011 at 2:33 PM, Per Forlin <per.forlin@linaro.org> wrote:

> On 10 April 2011 05:33, anish singh <anish198519851985@gmail.com> wrote:
> >
> >
> > On Sat, Apr 9, 2011 at 5:25 PM, Jae hoon Chung <jh80.chung@gmail.com>
> wrote:
> >>
> >> Hi Per,
> >>
> >> I applied your patch and sent a patch for dw_mmc.c.
> >> I think this approach is good.
> >
> > Per, I too want to test this patch of yours, but I have a Samsung board
> > with an SDHCI host controller, so I guess I just need to write pre_req
> > and post_req functions and that would enable this (nonblock mmc request)
> > on my board. I hope I am right?
> You are right. The SDHCI driver should work just fine without
> implementing pre_req and post_req, but you won't see any performance
> increase for DMA usage.
> In order to benefit from the patchset you need to implement pre_req
> and post_req.
>
> If you run into trouble implementing those hooks or if you have any
> questions please let me know.
>
> Regards,
> Per
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Have a nice time !!!

Cheers,
Sachin.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-09 11:55     ` Jae hoon Chung
@ 2011-04-11  9:08       ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-11  9:08 UTC (permalink / raw)
  To: Jae hoon Chung
  Cc: Linus Walleij, linux-mmc, linux-arm-kernel, linux-kernel,
	linaro-dev, Chris Ball

On 9 April 2011 13:55, Jae hoon Chung <jh80.chung@gmail.com> wrote:
> Hi Per,
>
> I applied your patch and sent a patch for dw_mmc.c.
> I think this approach is good.
>
Do you have any test results from the mmc_tests I added?
I am interested in the results.

Regards,
Per

> Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
>
> Regards,
> Jaehoon Chung
>
> 2011/4/9 Linus Walleij <linus.walleij@linaro.org>:
>> On Wed, Apr 6, 2011 at 9:07 PM, Per Forlin <per.forlin@linaro.org> wrote:
>>
>>> The intention for introducing none blocking mmc requests is to minimize the
>>> time between a mmc request ends and another mmc request starts.
>>
>> Acked-by: Linus Walleij <linus.walleij@linaro.org> FWIW, I've looked at the
>> patches so many times I'm already blind for any remaining bugs...
>>
>> Yours,
>> Linus Walleij
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 01/12] mmc: add none blocking mmc request function
  2011-04-06 19:07   ` Per Forlin
@ 2011-04-15 10:34     ` David Vrabel
  -1 siblings, 0 replies; 129+ messages in thread
From: David Vrabel @ 2011-04-15 10:34 UTC (permalink / raw)
  To: Per Forlin
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 06/04/11 20:07, Per Forlin wrote:
> Previously there has only been one function mmc_wait_for_req
> to start and wait for a request. This patch adds
>  * mmc_start_req - starts a request without waiting
>  * mmc_wait_for_req_done - waits until request is done
>  * mmc_pre_req - asks the host driver to prepare for the next job
>  * mmc_post_req - asks the host driver to clean up after a completed job

If MMC core had a queue of requests internally you wouldn't need to
provide mmc_pre_req() and mmc_post_req() functions outside of the core.
 i.e., the mmc block driver would just need to queue up two mmc requests
and the core would take care of calling pre_req and post_req at the
correct time.

Using a MMC request queue has other benefits -- it allows multiple users
without having to claim/release the host.  This would be useful for
(especially multi-function) SDIO.

David
-- 
David Vrabel, Senior Software Engineer, Drivers
CSR, Churchill House, Cambridge Business Park,  Tel: +44 (0)1223 692562
Cowley Road, Cambridge, CB4 0WZ                 http://www.csr.com/


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
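
A rough sketch of that suggestion, purely for illustration (none of this
exists in the series; the queue type and function names are invented):

/*
 * Conceptual two-slot queue inside the core; invented names. The core,
 * rather than the block driver, decides when to call the hooks.
 */
struct mmc_core_queue {
	struct mmc_request *cur;	/* request in flight */
	struct mmc_request *next;	/* prepared, waiting to start */
};

static void mmc_core_issue(struct mmc_host *host,
			   struct mmc_core_queue *q,
			   struct mmc_request *mrq)
{
	if (host->ops->pre_req)
		host->ops->pre_req(host, mrq, q->cur == NULL);

	if (!q->cur) {
		q->cur = mrq;
		host->ops->request(host, mrq);	/* host idle: start now */
	} else {
		q->next = mrq;	/* started from cur's completion path */
	}
}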

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-06 19:07 ` Per Forlin
@ 2011-04-16 15:48   ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-16 15:48 UTC (permalink / raw)
  To: Per Forlin
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

Hi Per,

On Wed, Apr 06, 2011 at 09:07:01PM +0200, Per Forlin wrote:
[...]
> 
> Per Forlin (12):
>   mmc: add none blocking mmc request function
>   mmc: mmc_test: add debugfs file to list all tests
>   mmc: mmc_test: add test for none blocking transfers
>   mmc: add member in mmc queue struct to hold request data
>   mmc: add a block request prepare function
>   mmc: move error code in mmc_block_issue_rw_rq to a separate function.
>   mmc: add a second mmc queue request member
>   mmc: add handling for two parallel block requests in issue_rw_rq
>   mmc: test: add random fault injection in core.c
>   omap_hsmmc: use original sg_len for dma_unmap_sg
>   omap_hsmmc: add support for pre_req and post_req
>   mmci: implement pre_req() and post_req()
> 
>  drivers/mmc/card/block.c      |  493 +++++++++++++++++++++++++++--------------
>  drivers/mmc/card/mmc_test.c   |  342 ++++++++++++++++++++++++++++-
>  drivers/mmc/card/queue.c      |  171 +++++++++------
>  drivers/mmc/card/queue.h      |   31 ++-
>  drivers/mmc/core/core.c       |  132 ++++++++++-
>  drivers/mmc/core/debugfs.c    |    5 +
>  drivers/mmc/host/mmci.c       |  146 +++++++++++-
>  drivers/mmc/host/mmci.h       |    8 +
>  drivers/mmc/host/omap_hsmmc.c |   90 +++++++-
>  include/linux/mmc/core.h      |    9 +-
>  include/linux/mmc/host.h      |   13 +-
>  lib/Kconfig.debug             |   11 +
>  12 files changed, 1172 insertions(+), 279 deletions(-)
> 
I'm playing with the patch set and seeing the following warnings.

  CC      drivers/mmc/card/block.o
drivers/mmc/card/block.c: In function ‘mmc_blk_issue_rq’:
drivers/mmc/card/block.c:429:6: warning: ‘status’ may be used uninitialized in this function

  CC      drivers/mmc/core/core.o
drivers/mmc/core/core.c: In function ‘mmc_request_done’:
drivers/mmc/core/core.c:163:3: warning: passing argument 2 of ‘mmc_should_fail_request’ from incompatible pointer type
drivers/mmc/core/core.c:129:20: note: expected ‘struct mmc_data *’ but argument is of type ‘struct mmc_request *’

-- 
Regards,
Shawn
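
The second warning suggests the fault-injection call site passes the
request where mmc_should_fail_request() expects the data; if so, the
likely one-line fix in mmc_request_done() (unverified against the tree)
would be:

-	mmc_should_fail_request(host, mrq);
+	mmc_should_fail_request(host, mrq->data);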


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-06 19:07 ` Per Forlin
@ 2011-04-16 16:48   ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-16 16:48 UTC (permalink / raw)
  To: linux-mmc
  Cc: linux-arm-kernel, linaro-kernel, patches, cjb, per.forlin, Shawn Guo

pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
If pre_req() is not called before sdhci_request(), request()
will prepare the cache just as it did before.
Using pre_req() and post_req() is optional.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
I worked out the patch by referring to Per's patch below.

 omap_hsmmc: add support for pre_req and post_req

It adds pre_req and post_req support for sdhci based host drivers to
work with Per's non-blocking optimization.  But I only have imx esdhc
based hardware to test.  Unfortunately, I cannot measure the
performance gain using mmc_test, because the current esdhc driver on
mainline fails on the test.  So I just did a quick test using 'dd',
but sadly, I did not see a noticeable performance gain here.  The
following are possible reasons I can think of right away.

* The patch did not add pre_req and post_req correctly.  Please help
  review it to catch the mistakes, if any.
* The imx esdhc driver uses SDHCI_SDMA (max_segs is 1) rather than
  SDHCI_ADMA (max_segs is 128), due to the broken ADMA support on imx
  esdhc.  So could people with other sdhci based hardware give the
  patch a try?

Hopefully, I can find some time to have a closer look at the mmc_test
failure and the broken ADMA with imx esdhc.

Regards,
Shawn

 drivers/mmc/host/sdhci.c  |   95 ++++++++++++++++++++++++++++++++++++++------
 include/linux/mmc/sdhci.h |    7 +++
 2 files changed, 89 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 9e15f41..becce9a 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -408,6 +408,44 @@ static void sdhci_set_adma_desc(u8 *desc, u32 addr, int len, unsigned cmd)
 	dataddr[0] = cpu_to_le32(addr);
 }
 
+static int sdhci_pre_dma_transfer(struct sdhci_host *host,
+				  struct mmc_data *data,
+				  struct sdhci_next *next)
+{
+	int sg_count;
+
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next ||
+	    (!next && data->host_cookie != host->next_data.cookie)) {
+		sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg,
+				      data->sg_len,
+				      (data->flags & MMC_DATA_WRITE) ?
+				      DMA_TO_DEVICE : DMA_FROM_DEVICE);
+	} else {
+		sg_count = host->next_data.sg_count;
+		host->next_data.sg_count = 0;
+	}
+
+	if (sg_count == 0)
+		return -EINVAL;
+
+	if (next) {
+		next->sg_count = sg_count;
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+	} else
+		host->sg_count = sg_count;
+
+	return sg_count;
+}
+
 static int sdhci_adma_table_pre(struct sdhci_host *host,
 	struct mmc_data *data)
 {
@@ -445,9 +483,8 @@ static int sdhci_adma_table_pre(struct sdhci_host *host,
 		goto fail;
 	BUG_ON(host->align_addr & 0x3);
 
-	host->sg_count = dma_map_sg(mmc_dev(host->mmc),
-		data->sg, data->sg_len, direction);
-	if (host->sg_count == 0)
+	host->sg_count = sdhci_pre_dma_transfer(host, data, NULL);
+	if (host->sg_count < 0)
 		goto unmap_align;
 
 	desc = host->adma_desc;
@@ -587,8 +624,9 @@ static void sdhci_adma_table_post(struct sdhci_host *host,
 		}
 	}
 
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-		data->sg_len, direction);
+	if (!data->host_cookie)
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     direction);
 }
 
 static u8 sdhci_calc_timeout(struct sdhci_host *host, struct mmc_data *data)
@@ -757,11 +795,7 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_data *data)
 		} else {
 			int sg_cnt;
 
-			sg_cnt = dma_map_sg(mmc_dev(host->mmc),
-					data->sg, data->sg_len,
-					(data->flags & MMC_DATA_READ) ?
-						DMA_FROM_DEVICE :
-						DMA_TO_DEVICE);
+			sg_cnt = sdhci_pre_dma_transfer(host, data, NULL);
 			if (sg_cnt == 0) {
 				/*
 				 * This only happens when someone fed
@@ -850,9 +884,11 @@ static void sdhci_finish_data(struct sdhci_host *host)
 		if (host->flags & SDHCI_USE_ADMA)
 			sdhci_adma_table_post(host, data);
 		else {
-			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-				data->sg_len, (data->flags & MMC_DATA_READ) ?
-					DMA_FROM_DEVICE : DMA_TO_DEVICE);
+			if (!data->host_cookie)
+				dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+					     data->sg_len,
+					     (data->flags & MMC_DATA_READ) ?
+					     DMA_FROM_DEVICE : DMA_TO_DEVICE);
 		}
 	}
 
@@ -1116,6 +1152,35 @@ static void sdhci_set_power(struct sdhci_host *host, unsigned short power)
  *                                                                           *
 \*****************************************************************************/
 
+static void sdhci_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			  bool is_first_req)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+
+	if (mrq->data->host_cookie) {
+		mrq->data->host_cookie = 0;
+		return;
+	}
+
+	if (host->flags & SDHCI_REQ_USE_DMA)
+		if (sdhci_pre_dma_transfer(host, mrq->data, &host->next_data) < 0)
+			mrq->data->host_cookie = 0;
+}
+
+static void sdhci_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			   int err)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (host->flags & SDHCI_REQ_USE_DMA) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			     (data->flags & MMC_DATA_WRITE) ?
+			     DMA_TO_DEVICE : DMA_FROM_DEVICE);
+		data->host_cookie = 0;
+	}
+}
+
 static void sdhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
 {
 	struct sdhci_host *host;
@@ -1285,6 +1350,8 @@ out:
 }
 
 static const struct mmc_host_ops sdhci_ops = {
+	.pre_req	= sdhci_pre_req,
+	.post_req	= sdhci_post_req,
 	.request	= sdhci_request,
 	.set_ios	= sdhci_set_ios,
 	.get_ro		= sdhci_get_ro,
@@ -1867,6 +1934,8 @@ int sdhci_add_host(struct sdhci_host *host)
 	if (caps & SDHCI_TIMEOUT_CLK_UNIT)
 		host->timeout_clk *= 1000;
 
+	host->next_data.cookie = 1;
+
 	/*
 	 * Set host parameters.
 	 */
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index 83bd9f7..924e84b 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -17,6 +17,11 @@
 #include <linux/io.h>
 #include <linux/mmc/host.h>
 
+struct sdhci_next {
+	unsigned int sg_count;
+	s32 cookie;
+};
+
 struct sdhci_host {
 	/* Data set by hardware interface driver */
 	const char *hw_name;	/* Hardware bus name */
@@ -145,6 +150,8 @@ struct sdhci_host {
 	unsigned int            ocr_avail_sd;
 	unsigned int            ocr_avail_mmc;
 
+	struct sdhci_next next_data;
+
 	unsigned long private[0] ____cacheline_aligned;
 };
 #endif /* __SDHCI_H */
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-16 16:48   ` Shawn Guo
@ 2011-04-16 23:06     ` Andrei Warkentin
  -1 siblings, 0 replies; 129+ messages in thread
From: Andrei Warkentin @ 2011-04-16 23:06 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-mmc, linux-arm-kernel, linaro-kernel, patches, cjb, per.forlin

Hi Shawn,

On Sat, Apr 16, 2011 at 11:48 AM, Shawn Guo <shawn.guo@linaro.org> wrote:
> pre_req() runs dma_map_sg() and post_req() runs dma_unmap_sg().
> If pre_req() is not called before sdhci_request(), request()
> will prepare the cache just as it did before.
> Using pre_req() and post_req() is optional.
>
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
> I worked out the patch by referring to Per's patch below.
>
>  omap_hsmmc: add support for pre_req and post_req
>
> It adds pre_req and post_req support for sdhci based host drivers to
> work with Per's non-blocking optimization.  But I only have imx esdhc
> based hardware to test with.  Unfortunately, I cannot measure the
> performance gain using mmc_test, because the current esdhc driver in
> mainline fails the test.  So I just did a quick test using 'dd',
> but sadly I did not see a noticeable performance gain.  The
> following are possible reasons I can think of right away.
>
> * The patch did not add pre_req and post_req correctly.  Please help
>   review it to catch any mistakes.
> * The imx esdhc driver uses SDHCI_SDMA (max_segs is 1) rather than
>   SDHCI_ADMA (max_segs is 128), due to the broken ADMA support on imx
>   esdhc.  So could people with other sdhci based hardware give the
>   patch a try?
>
> Hopefully, I can find some time to have a close look at the mmc_test
> failure and the broken ADMA with imx esdhc.
>

I'll try it out...

A

^ permalink raw reply	[flat|nested] 129+ messages in thread

* RE: [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
  2011-04-06 19:07   ` Per Forlin
@ 2011-04-17  7:09     ` Lin Tony-B19295
  -1 siblings, 0 replies; 129+ messages in thread
From: Lin Tony-B19295 @ 2011-04-17  7:09 UTC (permalink / raw)
  To: Per Forlin, linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball

Hi Per

	I just had a glance at your patch; good thinking. But I have a question about it. You modified mmc_test to test your driver. Does that mean the driver's performance enhancement depends on the application? The caller must know the next request in advance so that the driver can prepare it during the current transfer. So testing the driver with blocking versus non-blocking requests will show different throughput, due to the different application mechanisms.
	Thanks

BR
Tony

-----Original Message-----
From: linux-arm-kernel-bounces@lists.infradead.org [mailto:linux-arm-kernel-bounces@lists.infradead.org] On Behalf Of Per Forlin
Sent: Thursday, April 07, 2011 3:07 AM
To: linux-mmc@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; linaro-dev@lists.linaro.org
Cc: Chris Ball; Per Forlin
Subject: [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers

Add four tests for read and write performance across different transfer sizes, 4k to 4M.
 * Read using blocking mmc request
 * Read using non-blocking mmc request
 * Write using blocking mmc request
 * Write using non-blocking mmc request

The host driver must support pre_req() and post_req() in order to run the non-blocking test cases.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
---
 drivers/mmc/card/mmc_test.c |  312 +++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 304 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 466cdb5..1000383 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -22,6 +22,7 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/seq_file.h>
+#include <linux/random.h>
 
 #define RESULT_OK		0
 #define RESULT_FAIL		1
@@ -51,10 +52,12 @@ struct mmc_test_pages {
  * struct mmc_test_mem - allocated memory.
  * @arr: array of allocations
  * @cnt: number of allocations
+ * @size_min_cmn: lowest common size in array of allocations
  */
 struct mmc_test_mem {
 	struct mmc_test_pages *arr;
 	unsigned int cnt;
+	unsigned int size_min_cmn;
 };
 
 /**
@@ -148,6 +151,21 @@ struct mmc_test_card {
 	struct mmc_test_general_result	*gr;
 };
 
+enum mmc_test_prep_media {
+	MMC_TEST_PREP_NONE = 0,
+	MMC_TEST_PREP_WRITE_FULL = 1 << 0,
+	MMC_TEST_PREP_ERASE = 1 << 1,
+};
+
+struct mmc_test_multiple_rw {
+	unsigned int *bs;
+	unsigned int len;
+	unsigned int size;
+	bool do_write;
+	bool do_nonblock_req;
+	enum mmc_test_prep_media prepare;
+};
+
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -307,6 +325,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 	unsigned long max_seg_page_cnt = DIV_ROUND_UP(max_seg_sz, PAGE_SIZE);
 	unsigned long page_cnt = 0;
 	unsigned long limit = nr_free_buffer_pages() >> 4;
+	unsigned int min_cmn = 0;
 	struct mmc_test_mem *mem;
 
 	if (max_page_cnt > limit)
@@ -350,6 +369,12 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 		mem->arr[mem->cnt].page = page;
 		mem->arr[mem->cnt].order = order;
 		mem->cnt += 1;
+		if (!min_cmn)
+			min_cmn = PAGE_SIZE << order;
+		else
+			min_cmn = min(min_cmn,
+				      (unsigned int) (PAGE_SIZE << order));
+
 		if (max_page_cnt <= (1UL << order))
 			break;
 		max_page_cnt -= 1UL << order;
@@ -360,6 +385,7 @@ static struct mmc_test_mem *mmc_test_alloc_mem(unsigned long min_sz,
 			break;
 		}
 	}
+	mem->size_min_cmn = min_cmn;
 
 	return mem;
 
@@ -386,7 +412,6 @@ static int mmc_test_map_sg(struct mmc_test_mem *mem, unsigned long sz,
 	do {
 		for (i = 0; i < mem->cnt; i++) {
 			unsigned long len = PAGE_SIZE << mem->arr[i].order;
-
 			if (len > sz)
 				len = sz;
 			if (len > max_seg_sz)
@@ -725,6 +750,94 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 }
 
 /*
+ * Tests nonblock transfer with certain parameters
+ */
+static void mmc_test_nonblock_reset(struct mmc_request *mrq,
+				    struct mmc_command *cmd,
+				    struct mmc_command *stop,
+				    struct mmc_data *data)
+{
+	memset(mrq, 0, sizeof(struct mmc_request));
+	memset(cmd, 0, sizeof(struct mmc_command));
+	memset(data, 0, sizeof(struct mmc_data));
+	memset(stop, 0, sizeof(struct mmc_command));
+
+	mrq->cmd = cmd;
+	mrq->data = data;
+	mrq->stop = stop;
+}
+static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
+				      struct scatterlist *sg, unsigned sg_len,
+				      unsigned dev_addr, unsigned blocks,
+				      unsigned blksz, int write, int count)
+{
+	struct mmc_request mrq1;
+	struct mmc_command cmd1;
+	struct mmc_command stop1;
+	struct mmc_data data1;
+
+	struct mmc_request mrq2;
+	struct mmc_command cmd2;
+	struct mmc_command stop2;
+	struct mmc_data data2;
+
+	struct mmc_request *cur_mrq;
+	struct mmc_request *prev_mrq;
+	int i;
+	int ret = 0;
+
+	if (!test->card->host->ops->pre_req ||
+		!test->card->host->ops->post_req)
+		return -RESULT_UNSUP_HOST;
+
+	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+	mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+
+	cur_mrq = &mrq1;
+	prev_mrq = NULL;
+
+	for (i = 0; i < count; i++) {
+		mmc_test_prepare_mrq(test, cur_mrq, sg, sg_len, dev_addr,
+				blocks, blksz, write);
+		mmc_pre_req(test->card->host, cur_mrq, !prev_mrq);
+
+		if (prev_mrq) {
+			mmc_wait_for_req_done(prev_mrq);
+			mmc_test_wait_busy(test);
+			ret = mmc_test_check_result(test, prev_mrq);
+			if (ret)
+				goto err;
+		}
+
+		mmc_start_req(test->card->host, cur_mrq);
+
+		if (prev_mrq)
+			mmc_post_req(test->card->host, prev_mrq, 0);
+
+		prev_mrq = cur_mrq;
+		if (cur_mrq == &mrq1) {
+			mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+			cur_mrq = &mrq2;
+		} else {
+			mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
+			cur_mrq = &mrq1;
+		}
+		dev_addr += blocks;
+	}
+
+	mmc_wait_for_req_done(prev_mrq);
+	mmc_test_wait_busy(test);
+	ret = mmc_test_check_result(test, prev_mrq);
+	if (ret)
+		goto err;
+	mmc_post_req(test->card->host, prev_mrq, 0);
+
+	return ret;
+err:
+	return ret;
+}
+
+/*
  * Tests a basic transfer with certain parameters
  */
 static int mmc_test_simple_transfer(struct mmc_test_card *test,
@@ -1351,14 +1464,17 @@ static int mmc_test_area_transfer(struct mmc_test_card *test,
 }
 
 /*
- * Map and transfer bytes.
+ * Map and transfer bytes for multiple transfers.
  */
-static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
-			    unsigned int dev_addr, int write, int max_scatter,
-			    int timed)
+static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
+				unsigned int dev_addr, int write,
+				int max_scatter, int timed, int count,
+				bool nonblock)
 {
 	struct timespec ts1, ts2;
-	int ret;
+	int ret = 0;
+	int i;
+	struct mmc_test_area *t = &test->area;
 
 	/*
 	 * In the case of a maximally scattered transfer, the maximum transfer
@@ -1382,8 +1498,15 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 
 	if (timed)
 		getnstimeofday(&ts1);
+	if (nonblock)
+		ret = mmc_test_nonblock_transfer(test, t->sg, t->sg_len,
+				 dev_addr, t->blocks, 512, write, count);
+	else
+		for (i = 0; i < count && ret == 0; i++) {
+			ret = mmc_test_area_transfer(test, dev_addr, write);
+			dev_addr += sz >> 9;
+		}
 
-	ret = mmc_test_area_transfer(test, dev_addr, write);
 	if (ret)
 		return ret;
 
@@ -1391,11 +1514,19 @@ static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
 		getnstimeofday(&ts2);
 
 	if (timed)
-		mmc_test_print_rate(test, sz, &ts1, &ts2);
+		mmc_test_print_avg_rate(test, sz, count, &ts1, &ts2);
 
 	return 0;
 }
 
+static int mmc_test_area_io(struct mmc_test_card *test, unsigned long sz,
+			    unsigned int dev_addr, int write, int max_scatter,
+			    int timed)
+{
+	return mmc_test_area_io_seq(test, sz, dev_addr, write, max_scatter,
+				    timed, 1, false);
+}
+
 /*
  * Write the test area entirely.
  */
@@ -1956,6 +2087,144 @@ static int mmc_test_large_seq_write_perf(struct mmc_test_card *test)
 	return mmc_test_large_seq_perf(test, 1);
 }
 
+static int mmc_test_rw_multiple(struct mmc_test_card *test,
+				struct mmc_test_multiple_rw *tdata,
+				unsigned int reqsize, unsigned int size)
+{
+	unsigned int dev_addr;
+	struct mmc_test_area *t = &test->area;
+	int ret = 0;
+	int max_reqsize = max(t->mem->size_min_cmn *
+			      min(t->max_segs, t->mem->cnt), t->max_tfr);
+
+	/* Set up test area */
+	if (size > mmc_test_capacity(test->card) / 2 * 512)
+		size = mmc_test_capacity(test->card) / 2 * 512;
+	if (reqsize > max_reqsize)
+		reqsize = max_reqsize;
+	dev_addr = mmc_test_capacity(test->card) / 4;
+	if ((dev_addr & 0xffff0000))
+		dev_addr &= 0xffff0000; /* Round to 64MiB boundary */
+	else
+		dev_addr &= 0xfffff800; /* Round to 1MiB boundary */
+	if (!dev_addr)
+		goto err;
+
+	/* prepare test area */
+	if (mmc_can_erase(test->card) &&
+	    tdata->prepare & MMC_TEST_PREP_ERASE) {
+		ret = mmc_erase(test->card, dev_addr,
+				size / 512, MMC_SECURE_ERASE_ARG);
+		if (ret)
+			ret = mmc_erase(test->card, dev_addr,
+					size / 512, MMC_ERASE_ARG);
+		if (ret)
+			goto err;
+	}
+
+	/* Run test */
+	ret = mmc_test_area_io_seq(test, reqsize, dev_addr,
+				   tdata->do_write, 0, 1, size / reqsize,
+				   tdata->do_nonblock_req);
+	if (ret)
+		goto err;
+
+	return ret;
+ err:
+	printk(KERN_INFO "[%s] error\n", __func__);
+	return ret;
+}
+
+static int mmc_test_rw_multiple_size(struct mmc_test_card *test,
+				     struct mmc_test_multiple_rw *rw)
+{
+	int ret = 0;
+	int i;
+
+	for (i = 0 ; i < rw->len && ret == 0; i++) {
+		ret = mmc_test_rw_multiple(test, rw, rw->bs[i], rw->size);
+		if (ret)
+			break;
+	}
+	return ret;
+}
+
+/*
+ * Multiple blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple none blocking write 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_write_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = true,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_ERASE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_blocking_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = false,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
+/*
+ * Multiple none blocking read 4k to 4 MB chunks
+ */
+static int mmc_test_profile_mult_read_nonblock_perf(struct mmc_test_card *test)
+{
+	unsigned int bs[] = {1 << 12, 1 << 13, 1 << 14, 1 << 15, 1 << 16,
+			     1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 22};
+	struct mmc_test_multiple_rw test_data = {
+		.bs = bs,
+		.size = 128*1024*1024,
+		.len = ARRAY_SIZE(bs),
+		.do_write = false,
+		.do_nonblock_req = true,
+		.prepare = MMC_TEST_PREP_NONE,
+	};
+
+	return mmc_test_rw_multiple_size(test, &test_data);
+}
+
 static const struct mmc_test_case mmc_test_cases[] = {
 	{
 		.name = "Basic write (no data verification)", @@ -2223,6 +2492,33 @@ static const struct mmc_test_case mmc_test_cases[] = {
 		.cleanup = mmc_test_area_cleanup,
 	},
 
+	{
+		.name = "Write performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Write performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_write_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_blocking_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
+
+	{
+		.name = "Read performance with none blocking req 4k to 4MB",
+		.prepare = mmc_test_area_prepare,
+		.run = mmc_test_profile_mult_read_nonblock_perf,
+		.cleanup = mmc_test_area_cleanup,
+	},
 };
 
 static DEFINE_MUTEX(mmc_test_lock);
--
1.7.4.1


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
@ 2011-04-17 15:46     ` Shawn Guo
  0 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-17 15:46 UTC (permalink / raw)
  To: Per Forlin
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On Wed, Apr 06, 2011 at 09:07:04PM +0200, Per Forlin wrote:
[...]
> +static int mmc_test_rw_multiple(struct mmc_test_card *test,
> +				struct mmc_test_multiple_rw *tdata,
> +				unsigned int reqsize, unsigned int size)
> +{
> +	unsigned int dev_addr;
> +	struct mmc_test_area *t = &test->area;
> +	int ret = 0;
> +	int max_reqsize = max(t->mem->size_min_cmn *
> +			      min(t->max_segs, t->mem->cnt), t->max_tfr);
> +
The 'max(..., t->max_tfr)' probably should be 'min(..., t->max_tfr)'.
Otherwise, I see mmc_test failure on my mxs-mmc setup.

mmc0: Test case 37. Write performance with blocking req 4k to 4MB...
mmc0: Transfer of 64 x 2048 sectors (64 x 1024 KiB) took 5.412563314 seconds (12398 kB/s, 12108 KiB/s, 11.82 IOPS)
mmc0: Failed to map sg list
[mmc_test_rw_multiple] error
mmc0: Result: ERROR (-22)
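
Spelled out, the suggested one-line change (an untested sketch) would
read:

	int max_reqsize = min(t->mem->size_min_cmn *
			      min(t->max_segs, t->mem->cnt), t->max_tfr);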

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-06 19:07 ` Per Forlin
@ 2011-04-17 16:33   ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-17 16:33 UTC (permalink / raw)
  To: linux-mmc
  Cc: linux-arm-kernel, linaro-kernel, patches, cjb, per.forlin, Shawn Guo

pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
If pre_req() is not called before mxs_mmc_request(), request()
will prepare the cache just as it did before.
Using pre_req() and post_req() is optional.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
index 99d39a6..63c2ae2 100644
--- a/drivers/mmc/host/mxs-mmc.c
+++ b/drivers/mmc/host/mxs-mmc.c
@@ -137,6 +137,10 @@
 
 #define SSP_PIO_NUM	3
 
+struct mxs_mmc_next {
+	s32 cookie;
+};
+
 struct mxs_mmc_host {
 	struct mmc_host			*mmc;
 	struct mmc_request		*mrq;
@@ -154,6 +158,7 @@ struct mxs_mmc_host {
 	struct mxs_dma_data		dma_data;
 	unsigned int			dma_dir;
 	u32				ssp_pio_words[SSP_PIO_NUM];
+	struct mxs_mmc_next		next_data;
 
 	unsigned int			version;
 	unsigned char			bus_width;
@@ -302,6 +307,31 @@ static irqreturn_t mxs_mmc_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static int mxs_mmc_prep_dma_data(struct mxs_mmc_host *host,
+				struct mmc_data *data,
+				struct mxs_mmc_next *next)
+{
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next || (!next && data->host_cookie != host->next_data.cookie))
+		if (dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			       (data->flags & MMC_DATA_WRITE) ?
+			       DMA_TO_DEVICE : DMA_FROM_DEVICE) == 0)
+			return -EINVAL;
+
+	if (next)
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+
+	return 0;
+}
+
 static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 	struct mxs_mmc_host *host, unsigned int append)
 {
@@ -312,8 +342,8 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 
 	if (data) {
 		/* data */
-		dma_map_sg(mmc_dev(host->mmc), data->sg,
-			   data->sg_len, host->dma_dir);
+		if (mxs_mmc_prep_dma_data(host, data, NULL))
+			return NULL;
 		sgl = data->sg;
 		sg_len = data->sg_len;
 	} else {
@@ -328,9 +358,11 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 		desc->callback = mxs_mmc_dma_irq_callback;
 		desc->callback_param = host;
 	} else {
-		if (data)
+		if (data) {
 			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
 				     data->sg_len, host->dma_dir);
+			data->host_cookie = 0;
+		}
 	}
 
 	return desc;
@@ -553,6 +585,40 @@ static void mxs_mmc_start_cmd(struct mxs_mmc_host *host,
 	}
 }
 
+static void mxs_mmc_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			    bool is_first_req)
+{
+	struct mxs_mmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return;
+
+	if (data->host_cookie) {
+		data->host_cookie = 0;
+		return;
+	}
+
+	if (mxs_mmc_prep_dma_data(host, data, &host->next_data))
+		data->host_cookie = 0;
+}
+
+static void mxs_mmc_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			     int err)
+{
+	struct mxs_mmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return;
+
+	if (data->host_cookie) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+			     data->sg_len, host->dma_dir);
+		data->host_cookie = 0;
+	}
+}
+
 static void mxs_mmc_request(struct mmc_host *mmc, struct mmc_request *mrq)
 {
 	struct mxs_mmc_host *host = mmc_priv(mmc);
@@ -644,6 +710,8 @@ static void mxs_mmc_enable_sdio_irq(struct mmc_host *mmc, int enable)
 }
 
 static const struct mmc_host_ops mxs_mmc_ops = {
+	.pre_req = mxs_mmc_pre_req,
+	.post_req = mxs_mmc_post_req,
 	.request = mxs_mmc_request,
 	.get_ro = mxs_mmc_get_ro,
 	.get_cd = mxs_mmc_get_cd,
@@ -708,6 +776,7 @@ static int mxs_mmc_probe(struct platform_device *pdev)
 	host->dma_res = dmares;
 	host->irq = irq_err;
 	host->sdio_irq_en = 0;
+	host->next_data.cookie = 1;
 
 	host->clk = clk_get(&pdev->dev, NULL);
 	if (IS_ERR(host->clk)) {
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-17 16:33   ` Shawn Guo
@ 2011-04-17 16:48     ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-17 16:48 UTC (permalink / raw)
  To: Shawn Guo; +Cc: linux-mmc, linaro-kernel, patches, cjb, linux-arm-kernel

On Mon, Apr 18, 2011 at 12:33:30AM +0800, Shawn Guo wrote:
> pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
> If pre_req() is not called before mxs_mmc_request(), request()
> will prepare the cache just as it did before.
> Using pre_req() and post_req() is optional.
> 
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 72 insertions(+), 3 deletions(-)
> 

Here are the results of mmc_test cases 37 ~ 40, which are designed to
show the performance improvement introduced by the non-blocking changes.

Honestly, the improvement is not that impressive.  I am not sure the
mxs-mmc pre_req and post_req patch was produced correctly, so please
help review ...

mmc0: Test case 37. Write performance with blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 76.370031249 seconds (1757 kB/s, 1716 KiB/s, 429.06 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 34.951875000 seconds (3840 kB/s, 3750 KiB/s, 468.75 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 19.097406250 seconds (7028 kB/s, 6863 KiB/s, 428.95 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 14.393937500 seconds (9324 kB/s, 9106 KiB/s, 284.56 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 12.519875000 seconds (10720 kB/s, 10469 KiB/s, 163.57 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 11.535156250 seconds (11635 kB/s, 11362 KiB/s, 88.77 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 11.165375000 seconds (12020 kB/s, 11739 KiB/s, 45.85 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 10.922375001 seconds (12288 kB/s, 12000 KiB/s, 23.43 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 10.791811701 seconds (12436 kB/s, 12145 KiB/s, 11.86 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 10.723858316 seconds (12345 kB/s, 12055 KiB/s, 3.63 IOPS)
mmc0: Result: OK
mmc0: Test case 38. Write performance with none blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 75.940425898 seconds (1767 kB/s, 1725 KiB/s, 431.49 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 34.650031250 seconds (3873 kB/s, 3782 KiB/s, 472.84 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 18.854781250 seconds (7118 kB/s, 6951 KiB/s, 434.47 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 14.183781250 seconds (9462 kB/s, 9240 KiB/s, 288.78 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 12.349375000 seconds (10868 kB/s, 10613 KiB/s, 165.83 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 11.373031250 seconds (11801 kB/s, 11524 KiB/s, 90.03 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 10.991343750 seconds (12211 kB/s, 11925 KiB/s, 46.58 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 10.759218749 seconds (12474 kB/s, 12182 KiB/s, 23.79 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 10.628342707 seconds (12628 kB/s, 12332 KiB/s, 12.04 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 10.547394289 seconds (12551 kB/s, 12257 KiB/s, 3.69 IOPS)
mmc0: Result: OK
mmc0: Test case 39. Read performance with blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 23.516885650 seconds (5707 kB/s, 5573 KiB/s, 1393.38 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 13.651000000 seconds (9832 kB/s, 9601 KiB/s, 1200.20 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 9.048625000 seconds (14832 kB/s, 14485 KiB/s, 905.33 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 6.589500000 seconds (20368 kB/s, 19891 KiB/s, 621.59 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 5.292437500 seconds (25360 kB/s, 24765 KiB/s, 386.96 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 4.646156250 seconds (28887 kB/s, 28210 KiB/s, 220.39 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 4.319437500 seconds (31072 kB/s, 30344 KiB/s, 118.53 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 4.158187500 seconds (32277 kB/s, 31521 KiB/s, 61.56 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 4.076250000 seconds (32926 kB/s, 32155 KiB/s, 31.40 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 3.966816444 seconds (33373 kB/s, 32591 KiB/s, 9.83 IOPS)
mmc0: Result: OK
mmc0: Test case 40. Read performance with none blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 23.251465475 seconds (5772 kB/s, 5637 KiB/s, 1409.28 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 13.411468750 seconds (10007 kB/s, 9773 KiB/s, 1221.64 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 8.822875000 seconds (15212 kB/s, 14855 KiB/s, 928.49 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 6.413406250 seconds (20927 kB/s, 20437 KiB/s, 638.66 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 5.127875000 seconds (26174 kB/s, 25560 KiB/s, 399.38 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 4.486593750 seconds (29915 kB/s, 29214 KiB/s, 228.23 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 4.178312500 seconds (32122 kB/s, 31369 KiB/s, 122.53 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 4.010281250 seconds (33468 kB/s, 32683 KiB/s, 63.83 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 3.927437500 seconds (34174 kB/s, 33373 KiB/s, 32.59 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 3.823653744 seconds (34623 kB/s, 33811 KiB/s, 10.19 IOPS)
mmc0: Result: OK
mmc0: Tests completed.

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-11  9:08       ` Per Forlin
@ 2011-04-19 14:30         ` Jae hoon Chung
  -1 siblings, 0 replies; 129+ messages in thread
From: Jae hoon Chung @ 2011-04-19 14:30 UTC (permalink / raw)
  To: Per Forlin
  Cc: Linus Walleij, linux-mmc, linux-arm-kernel, linux-kernel,
	linaro-dev, Chris Ball, Jaehoon Chung

Hi Per,

2011/4/11 Per Forlin <per.forlin@linaro.org>:
> On 9 April 2011 13:55, Jae hoon Chung <jh80.chung@gmail.com> wrote:
>> Hi Per,
>>
>> I've applied your patch and sent a patch for dw_mmc.c.
>> I think this approach is good.
>>
> Do you have any test results from the mmc_tests I added?
> I am interested in the results.

I didn't test with mmc_test, but I tested with IOzone.
I'll test with mmc_test and then share the results.

Regards,
Jaehoon Chung

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 01/12] mmc: add none blocking mmc request function
  2011-04-15 10:34     ` David Vrabel
@ 2011-04-20  7:17       ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  7:17 UTC (permalink / raw)
  To: David Vrabel
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

Hi,

On 15 April 2011 12:34, David Vrabel <david.vrabel@csr.com> wrote:
> On 06/04/11 20:07, Per Forlin wrote:
>> Previously there has only been one function mmc_wait_for_req
>> to start and wait for a request. This patch adds
>>  * mmc_start_req - starts a request wihtout waiting
>>  * mmc_wait_for_req_done - waits until request is done
>>  * mmc_pre_req - asks the host driver to prepare for the next job
>>  * mmc_post_req - asks the host driver to clean up after a completed job
>
> If MMC core had a queue of requests internally you wouldn't need to
> provide mmc_pre_req() and mmc_post_req() functions outside of the core.
>  i.e., the mmc block driver would just need to queue up two mmc requests
> and the core would take care of calling pre_req and post_req at the
> correct time.
>
Sorry for the late response; I have been out of the office for a couple of days.
Yes, it would be nice to not expose those hooks outside the core. I
will look into this in detail to see what it would take to implement
this and if there are any complications.

> Using a MMC request queue has other benefits -- it allows multiple users
> without having to claim/release the host.  This would be useful for
> (especially multi-function) SDIO.
You mean claim and release would be done only within the mmc core? The
time saved here would equal the time it takes to release and claim
the host.
Claim and release can also be used for power management, to indicate
whether any client is using the host; if no one is, the power can be
switched off.

> (especially multi-function) SDIO.
I just want to make sure I understand the multi-function SDIO case; I
haven't done any work with SDIO.
Can the SDIO functions compete for the same claim_host at the same time?
Example: if function 1 claims the host, functions 2 and 3 that also
want to claim the host have to wait for function 1 to release it.
What is the extra benefit of having the internal request queue for
multi-function SDIO?
Must the functions still wait for request completion before
acknowledging the SDIO client? Or could the functions acknowledge
immediately after the request is queued up in the mmc core?

>
> David
> --
Thanks for your feedback,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
  2011-04-17  7:09     ` Lin Tony-B19295
@ 2011-04-20  7:30       ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  7:30 UTC (permalink / raw)
  To: Lin Tony-B19295
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 17 April 2011 09:09, Lin Tony-B19295 <B19295@freescale.com> wrote:
> Hi Per
>
>        I just had a glance at your patch; good thinking. But I have a question about this patch. You modified mmc_test to test your driver. Does it mean your driver's performance enhancement depends on the application?
I added those tests in mmc_test to compare the performance of blocking
and non-blocking requests. Basically they measure the performance gain
from running dma_map and dma_unmap in parallel with the transfer,
compared to running dma_map and dma_unmap in series with the transfer.

> The caller must know the next request, so that the driver can prepare it during the current transfer?
mmc_test measures the ideal performance gain (all requests are linked together).

> So testing your driver with blocking and non-blocking requests will show different throughput due to the different application mechanisms.
Yes.
I added support for non-blocking mmc requests in the mmc block device,
but the performance gain there depends on how the FS requests are
propagated down to the mmc block device. If the requests are added one
by one, waiting for the last request to complete before adding the new
request, there will be no performance gain.
To test the mmc block performance I have run IOZone in user space.
>        Thanks
>
> BR
> Tony
BR
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers
  2011-04-17 15:46     ` Shawn Guo
@ 2011-04-20  7:41       ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  7:41 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 17 April 2011 17:46, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Wed, Apr 06, 2011 at 09:07:04PM +0200, Per Forlin wrote:
> [...]
>> +static int mmc_test_rw_multiple(struct mmc_test_card *test,
>> +                             struct mmc_test_multiple_rw *tdata,
>> +                             unsigned int reqsize, unsigned int size)
>> +{
>> +     unsigned int dev_addr;
>> +     struct mmc_test_area *t = &test->area;
>> +     int ret = 0;
>> +     int max_reqsize = max(t->mem->size_min_cmn *
>> +                           min(t->max_segs, t->mem->cnt), t->max_tfr);
>> +
> The 'max(..., t->max_tfr)' probably should be 'min(..., t->max_tfr)'.
> Otherwise, I see an mmc_test failure on my mxs-mmc setup.
>
> mmc0: Test case 37. Write performance with blocking req 4k to 4MB...
> mmc0: Transfer of 64 x 2048 sectors (64 x 1024 KiB) took 5.412563314 seconds (12
> 398 kB/s, 12108 KiB/s, 11.82 IOPS)
> mmc0: Failed to map sg list
> [mmc_test_rw_multiple] error
> mmc0: Result: ERROR (-22)
Thanks for letting me know. I think I should simplify it and use
t->max_tfr only.
I tried to optimize for one of my cases, where max_tfr is only 1 MiB
but the mmc request size is 32 MiB.
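
That simplification would reduce the computation to (sketch):

	int max_reqsize = t->max_tfr;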

>
> --
> Regards,
> Shawn
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-17 16:33   ` Shawn Guo
@ 2011-04-20  7:58     ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  7:58 UTC (permalink / raw)
  To: Shawn Guo; +Cc: linux-mmc, linux-arm-kernel, linaro-kernel, patches, cjb

On 17 April 2011 18:33, Shawn Guo <shawn.guo@linaro.org> wrote:
> pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
> If pre_req() is not called before mxs_mmc_request(), request()
> will prepare the cache just as it did before.
> Using pre_req() and post_req() is optional.
>
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 72 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
> index 99d39a6..63c2ae2 100644
> --- a/drivers/mmc/host/mxs-mmc.c
> +++ b/drivers/mmc/host/mxs-mmc.c
> @@ -137,6 +137,10 @@
>
>  #define SSP_PIO_NUM    3
>
> +struct mxs_mmc_next {
> +       s32 cookie;
> +};
> +
>  struct mxs_mmc_host {
>        struct mmc_host                 *mmc;
>        struct mmc_request              *mrq;
> @@ -154,6 +158,7 @@ struct mxs_mmc_host {
>        struct mxs_dma_data             dma_data;
>        unsigned int                    dma_dir;
>        u32                             ssp_pio_words[SSP_PIO_NUM];
> +       struct mxs_mmc_next             next_data;
>
>        unsigned int                    version;
>        unsigned char                   bus_width;
> @@ -302,6 +307,31 @@ static irqreturn_t mxs_mmc_irq_handler(int irq, void *dev_id)
>        return IRQ_HANDLED;
>  }
>
> +static int mxs_mmc_prep_dma_data(struct mxs_mmc_host *host,
> +                               struct mmc_data *data,
> +                               struct mxs_mmc_next *next)
> +{
> +       if (!next && data->host_cookie &&
> +           data->host_cookie != host->next_data.cookie) {
> +               printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
> +                      " host->next_data.cookie %d\n",
> +                      __func__, data->host_cookie, host->next_data.cookie);
> +               data->host_cookie = 0;
> +       }
> +
> +       /* Check if next job is already prepared */
> +       if (next || (!next && data->host_cookie != host->next_data.cookie))
> +               if (dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
> +                              (data->flags & MMC_DATA_WRITE) ?
> +                              DMA_TO_DEVICE : DMA_FROM_DEVICE) == 0)
> +                       return -EINVAL;
> +
> +       if (next)
> +               data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
> +
> +       return 0;
> +}
> +
>  static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
>        struct mxs_mmc_host *host, unsigned int append)
>  {
> @@ -312,8 +342,8 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
>
>        if (data) {
>                /* data */
> -               dma_map_sg(mmc_dev(host->mmc), data->sg,
> -                          data->sg_len, host->dma_dir);
> +               if (mxs_mmc_prep_dma_data(host, data, NULL))
> +                       return NULL;
>                sgl = data->sg;
>                sg_len = data->sg_len;
>        } else {
> @@ -328,9 +358,11 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
>                desc->callback = mxs_mmc_dma_irq_callback;
>                desc->callback_param = host;
>        } else {
> -               if (data)
> +               if (data) {
>                        dma_unmap_sg(mmc_dev(host->mmc), data->sg,
>                                     data->sg_len, host->dma_dir);
> +                       data->host_cookie = 0;
> +               }
When is dma_unmap_sg() called? If host_cookie is set, dma_unmap_sg()
should only be called from post_req().
My guess is the check should be:
+ if (data && !data->host_cookie) {
It looks like only dma_map_sg() is run in parallel with the transfer,
but not dma_unmap_sg(). This may explain the numbers.
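
Spelled out, the error path in the quoted hunk would then look like this
(a sketch only; the surrounding function is unchanged):

	} else {
		/* Only unmap buffers that were mapped in this call path;
		 * a non-zero host_cookie means pre_req() mapped them and
		 * post_req() will unmap them after the next transfer has
		 * started. */
		if (data && !data->host_cookie) {
			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
				     data->sg_len, host->dma_dir);
		}
	}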

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-17 16:48     ` Shawn Guo
@ 2011-04-20  8:01       ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  8:01 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 17 April 2011 18:48, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Mon, Apr 18, 2011 at 12:33:30AM +0800, Shawn Guo wrote:
>> pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
>> If pre_req() is not called before mxs_mmc_request(), request()
>> will prepare the cache just as it did before.
>> Using pre_req() and post_req() is optional.
>>
>> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>> ---
>>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
>>  1 files changed, 72 insertions(+), 3 deletions(-)
>>
>
> Here are the results of mmc_test cases 37 ~ 40, which are designed to
> show the performance improvement introduced by the non-blocking changes.
>
> Honestly, the improvement is not that impressive.  I am not sure the
> mxs-mmc pre_req and post_req patch was produced correctly, so please
> help review ...
My guess is that dma_unmap_sg() is not run in parallel with the transfer.
Please look at my reply to the patch.

>
> --
> Regards,
> Shawn
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20  7:58     ` Per Forlin
@ 2011-04-20  8:17       ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-20  8:17 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, linux-mmc, linux-arm-kernel, linaro-kernel, patches, cjb

On Wed, Apr 20, 2011 at 09:58:48AM +0200, Per Forlin wrote:
> >  static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
> >        struct mxs_mmc_host *host, unsigned int append)
> >  {
> > @@ -312,8 +342,8 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
> >
> >        if (data) {
> >                /* data */
> > -               dma_map_sg(mmc_dev(host->mmc), data->sg,
> > -                          data->sg_len, host->dma_dir);
> > +               if (mxs_mmc_prep_dma_data(host, data, NULL))
> > +                       return NULL;
> >                sgl = data->sg;
> >                sg_len = data->sg_len;
> >        } else {
> > @@ -328,9 +358,11 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
> >                desc->callback = mxs_mmc_dma_irq_callback;
> >                desc->callback_param = host;
> >        } else {
> > -               if (data)
> > +               if (data) {
> >                        dma_unmap_sg(mmc_dev(host->mmc), data->sg,
> >                                     data->sg_len, host->dma_dir);
> > +                       data->host_cookie = 0;
> > +               }
> When is dma_unmap_sg called? If host_cookie is set dma_unmap() should
> only be called from post_req.
> My guess is
> + if (data && !data->host_cookie) {
> It looks like only dma_map is run in parallel with transfer but not
> dma_unmap. This may explain the numbers.

Good catch.  I forgot to patch mxs_mmc_request_done(), where
dma_unmap_sg() is called.  I will correct and retest ...
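
The missing piece would be along these lines (a sketch; the body of
mxs_mmc_request_done() is not quoted in this thread, so the context is
assumed):

	/* In mxs_mmc_request_done(): leave pre_req()-mapped buffers for
	 * post_req() to unmap, so the unmap overlaps the next transfer. */
	if (data && !data->host_cookie)
		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
			     data->sg_len, host->dma_dir);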

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
  2011-04-16 15:48   ` Shawn Guo
@ 2011-04-20  8:19     ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20  8:19 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 16 April 2011 17:48, Shawn Guo <shawn.guo@freescale.com> wrote:
> Hi Per,
>
> On Wed, Apr 06, 2011 at 09:07:01PM +0200, Per Forlin wrote:
> [...]
>>
>> Per Forlin (12):
>>   mmc: add none blocking mmc request function
>>   mmc: mmc_test: add debugfs file to list all tests
>>   mmc: mmc_test: add test for none blocking transfers
>>   mmc: add member in mmc queue struct to hold request data
>>   mmc: add a block request prepare function
>>   mmc: move error code in mmc_block_issue_rw_rq to a separate function.
>>   mmc: add a second mmc queue request member
>>   mmc: add handling for two parallel block requests in issue_rw_rq
>>   mmc: test: add random fault injection in core.c
>>   omap_hsmmc: use original sg_len for dma_unmap_sg
>>   omap_hsmmc: add support for pre_req and post_req
>>   mmci: implement pre_req() and post_req()
>>
>>  drivers/mmc/card/block.c      |  493 +++++++++++++++++++++++++++--------------
>>  drivers/mmc/card/mmc_test.c   |  342 ++++++++++++++++++++++++++++-
>>  drivers/mmc/card/queue.c      |  171 +++++++++------
>>  drivers/mmc/card/queue.h      |   31 ++-
>>  drivers/mmc/core/core.c       |  132 ++++++++++-
>>  drivers/mmc/core/debugfs.c    |    5 +
>>  drivers/mmc/host/mmci.c       |  146 +++++++++++-
>>  drivers/mmc/host/mmci.h       |    8 +
>>  drivers/mmc/host/omap_hsmmc.c |   90 +++++++-
>>  include/linux/mmc/core.h      |    9 +-
>>  include/linux/mmc/host.h      |   13 +-
>>  lib/Kconfig.debug             |   11 +
>>  12 files changed, 1172 insertions(+), 279 deletions(-)
>>
> I'm playing the patch set and seeing the following warnings.
>
>  CC      drivers/mmc/card/block.o
> drivers/mmc/card/block.c: In function ‘mmc_blk_issue_rq’:
> drivers/mmc/card/block.c:429:6: warning: ‘status’ may be used uninitialized in this function
>
My bad, it should be:
+ int status = 0;

>  CC      drivers/mmc/core/core.o
> drivers/mmc/core/core.c: In function ‘mmc_request_done’:
> drivers/mmc/core/core.c:163:3: warning: passing argument 2 of ‘mmc_should_fail_request’ from incompatible pointer type
> drivers/mmc/core/core.c:129:20: note: expected ‘struct mmc_data *’ but argument is of type ‘struct mmc_request *’
The function within #ifdef CONFIG_FAIL_MMC_REQUEST is the correct one.
I will update this in the next version.
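
Judging from the warning text alone, the call site fix would presumably
pass the mmc_data rather than the whole request (assumed call form):

	/* In mmc_request_done(): match the CONFIG_FAIL_MMC_REQUEST
	 * definition, which expects a struct mmc_data *. */
	mmc_should_fail_request(host, mrq->data);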


>
> --
> Regards,
> Shawn
Thanks,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 08/12] mmc: add handling for two parallel block requests in issue_rw_rq
  2011-04-06 19:07   ` Per Forlin
@ 2011-04-20 11:32     ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20 11:32 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

On 6 April 2011 21:07, Per Forlin <per.forlin@linaro.org> wrote:
> Change mmc_blk_issue_rw_rq() to become asynchronous.
> The execution flow looks like this:
> The mmc-queue calls issue_rw_rq(), which sends the request
> to the host and returns to the mmc-queue. The mmc-queue calls
> issue_rw_rq() again with a new request. This new request is prepared
> in issue_rw_rq(); it then waits for the active request to complete before
> pushing it to the host. When the mmc-queue is empty it will call
> issue_rw_rq() with req=NULL to finish off the active request
> without starting a new request.
>
> Signed-off-by: Per Forlin <per.forlin@linaro.org>
> ---
>  drivers/mmc/card/block.c |  157 +++++++++++++++++++++++++++++++++++++++-------
>  drivers/mmc/card/queue.c |    2 +-
>  2 files changed, 134 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index eef3510..2b14d1c 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -59,6 +59,7 @@ static int mmc_queue_thread(void *d)
>                mq->mqrq_cur->req = req;
>                spin_unlock_irq(q->queue_lock);
>
Call set_current_state(TASK_RUNNING) before issue_fn(); otherwise
issue_fn() will run in TASK_INTERRUPTIBLE state. This will be fixed
in version 3 of this patchset; a sketch of the corrected ordering is
shown below.
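
A minimal sketch of the corrected thread loop (abbreviated from the
quoted code; the mqrq_prev pointer tracking the still-active request
is an assumption based on the "second mmc queue request member"
patch, so treat the exact names as illustrative):

	do {
		struct request *req = NULL;

		spin_lock_irq(q->queue_lock);
		set_current_state(TASK_INTERRUPTIBLE);
		req = blk_fetch_request(q);
		mq->mqrq_cur->req = req;
		spin_unlock_irq(q->queue_lock);

		if (req || mq->mqrq_prev->req) {
			/* leave TASK_INTERRUPTIBLE before issuing */
			set_current_state(TASK_RUNNING);
			mq->issue_fn(mq, req);	/* req == NULL finishes
						 * off the active request */
		} else {
			if (kthread_should_stop()) {
				set_current_state(TASK_RUNNING);
				break;
			}
			up(&mq->thread_sem);
			schedule();	/* still TASK_INTERRUPTIBLE: sleep */
			down(&mq->thread_sem);
		}
	} while (1);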

> +               mq->issue_fn(mq, req);
>                if (!req) {
>                        if (kthread_should_stop()) {
>                                set_current_state(TASK_RUNNING);
> @@ -71,7 +72,6 @@ static int mmc_queue_thread(void *d)
>                }
>                set_current_state(TASK_RUNNING);
>
> -               mq->issue_fn(mq, req);
>        } while (1);
>        up(&mq->thread_sem);
>
> --
> 1.7.4.1
>
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH v2] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-17 16:33   ` Shawn Guo
@ 2011-04-20 13:51     ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-20 13:51 UTC (permalink / raw)
  To: linux-mmc; +Cc: linux-arm-kernel, linaro-kernel, patches, Shawn Guo

pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
If pre_req() is not called before mxs_mmc_request(), request()
will prepare the cache just as it did before.
Using pre_req() and post_req() is optional.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
Changes since v1:
* Make the dma_unmap_sg() call in mxs_mmc_request_done() non-blocking

 drivers/mmc/host/mxs-mmc.c |   81 +++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 76 insertions(+), 5 deletions(-)
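
For context, the calling sequence the core is expected to drive looks
roughly like the following (a hypothetical sketch only; the variable
names and completion handling are illustrative, not the actual core
code):

	/* prepare the next request while the current one transfers */
	host->ops->pre_req(host, next_mrq, false);

	/* current request completes */
	wait_for_completion(&cur_done);

	/* start the next transfer ... */
	host->ops->request(host, next_mrq);

	/* ... and unmap the finished request in parallel with it */
	host->ops->post_req(host, cur_mrq, 0);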

diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
index 99d39a6..235acfa 100644
--- a/drivers/mmc/host/mxs-mmc.c
+++ b/drivers/mmc/host/mxs-mmc.c
@@ -137,6 +137,10 @@
 
 #define SSP_PIO_NUM	3
 
+struct mxs_mmc_next {
+	s32 cookie;
+};
+
 struct mxs_mmc_host {
 	struct mmc_host			*mmc;
 	struct mmc_request		*mrq;
@@ -154,6 +158,7 @@ struct mxs_mmc_host {
 	struct mxs_dma_data		dma_data;
 	unsigned int			dma_dir;
 	u32				ssp_pio_words[SSP_PIO_NUM];
+	struct mxs_mmc_next		next_data;
 
 	unsigned int			version;
 	unsigned char			bus_width;
@@ -236,8 +241,10 @@ static void mxs_mmc_request_done(struct mxs_mmc_host *host)
 	}
 
 	if (data) {
-		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-			     data->sg_len, host->dma_dir);
+		if (!data->host_cookie)
+			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+				     data->sg_len, host->dma_dir);
+
 		/*
 		 * If there was an error on any block, we mark all
 		 * data blocks as being in error.
@@ -302,6 +309,31 @@ static irqreturn_t mxs_mmc_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static int mxs_mmc_prep_dma_data(struct mxs_mmc_host *host,
+				struct mmc_data *data,
+				struct mxs_mmc_next *next)
+{
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		printk(KERN_WARNING "[%s] invalid cookie: data->host_cookie %d"
+		       " host->next_data.cookie %d\n",
+		       __func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next || (!next && data->host_cookie != host->next_data.cookie))
+		if (dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			       (data->flags & MMC_DATA_WRITE) ?
+			       DMA_TO_DEVICE : DMA_FROM_DEVICE) == 0)
+			return -EINVAL;
+
+	if (next)
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+
+	return 0;
+}
+
 static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 	struct mxs_mmc_host *host, unsigned int append)
 {
@@ -312,8 +344,8 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 
 	if (data) {
 		/* data */
-		dma_map_sg(mmc_dev(host->mmc), data->sg,
-			   data->sg_len, host->dma_dir);
+		if (mxs_mmc_prep_dma_data(host, data, NULL))
+			return NULL;
 		sgl = data->sg;
 		sg_len = data->sg_len;
 	} else {
@@ -328,9 +360,11 @@ static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
 		desc->callback = mxs_mmc_dma_irq_callback;
 		desc->callback_param = host;
 	} else {
-		if (data)
+		if (data) {
 			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
 				     data->sg_len, host->dma_dir);
+			data->host_cookie = 0;
+		}
 	}
 
 	return desc;
@@ -553,6 +587,40 @@ static void mxs_mmc_start_cmd(struct mxs_mmc_host *host,
 	}
 }
 
+static void mxs_mmc_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			    bool is_first_req)
+{
+	struct mxs_mmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return;
+
+	if (data->host_cookie) {
+		data->host_cookie = 0;
+		return;
+	}
+
+	if (mxs_mmc_prep_dma_data(host, data, &host->next_data))
+		data->host_cookie = 0;
+}
+
+static void mxs_mmc_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			     int err)
+{
+	struct mxs_mmc_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return;
+
+	if (data->host_cookie) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+			     data->sg_len, host->dma_dir);
+		data->host_cookie = 0;
+	}
+}
+
 static void mxs_mmc_request(struct mmc_host *mmc, struct mmc_request *mrq)
 {
 	struct mxs_mmc_host *host = mmc_priv(mmc);
@@ -644,6 +712,8 @@ static void mxs_mmc_enable_sdio_irq(struct mmc_host *mmc, int enable)
 }
 
 static const struct mmc_host_ops mxs_mmc_ops = {
+	.pre_req = mxs_mmc_pre_req,
+	.post_req = mxs_mmc_post_req,
 	.request = mxs_mmc_request,
 	.get_ro = mxs_mmc_get_ro,
 	.get_cd = mxs_mmc_get_cd,
@@ -708,6 +778,7 @@ static int mxs_mmc_probe(struct platform_device *pdev)
 	host->dma_res = dmares;
 	host->irq = irq_err;
 	host->sdio_irq_en = 0;
+	host->next_data.cookie = 1;
 
 	host->clk = clk_get(&pdev->dev, NULL);
 	if (IS_ERR(host->clk)) {
-- 
1.7.4.1



^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20  8:01       ` Per Forlin
@ 2011-04-20 14:01         ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-20 14:01 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On Wed, Apr 20, 2011 at 10:01:22AM +0200, Per Forlin wrote:
> On 17 April 2011 18:48, Shawn Guo <shawn.guo@freescale.com> wrote:
> > On Mon, Apr 18, 2011 at 12:33:30AM +0800, Shawn Guo wrote:
> >> pre_req() runs dma_map_sg() post_req() runs dma_unmap_sg.
> >> If not calling pre_req() before mxs_mmc_request(), request()
> >> will prepare the cache just like it did it before.
> >> It is optional to use pre_req() and post_req().
> >>
> >> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> >> ---
> >>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
> >>  1 files changed, 72 insertions(+), 3 deletions(-)
> >>
> >
> > Here is the result of mmc_test case 37 ~ 40, which are designed to see
> > the performance improvement introduced by non-blocking changes.
> >
> > Honestly, the improvement is not so impressive.  Not sure if the patch
> > for mxs-mmc pre_req and post_req support was correctly produced.  So
> > please help review ...
> My guess is that dma_unmap is not run in parallel with transfer.
> Please look at my patch reply.
> 
Got it fixed in v2, posted just now.  Please take another look.
Unfortunately, I do not see a noticeable difference from v1 in the
mmc_test results.

Do you have omap_hsmmc and mmci mmc_test result data to share?

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20 14:01         ` Shawn Guo
@ 2011-04-20 15:22           ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20 15:22 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 20 April 2011 16:01, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Wed, Apr 20, 2011 at 10:01:22AM +0200, Per Forlin wrote:
>> On 17 April 2011 18:48, Shawn Guo <shawn.guo@freescale.com> wrote:
>> > On Mon, Apr 18, 2011 at 12:33:30AM +0800, Shawn Guo wrote:
>> >> pre_req() runs dma_map_sg() post_req() runs dma_unmap_sg.
>> >> If not calling pre_req() before mxs_mmc_request(), request()
>> >> will prepare the cache just like it did it before.
>> >> It is optional to use pre_req() and post_req().
>> >>
>> >> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>> >> ---
>> >>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
>> >>  1 files changed, 72 insertions(+), 3 deletions(-)
>> >>
>> >
>> > Here is the result of mmc_test case 37 ~ 40, which are designed to see
>> > the performance improvement introduced by non-blocking changes.
>> >
>> > Honestly, the improvement is not so impressive.  Not sure if the patch
>> > for mxs-mmc pre_req and post_req support was correctly produced.  So
>> > please help review ...
>> My guess is that dma_unmap is not run in parallel with transfer.
>> Please look at my patch reply.
>>
> Got it fixed in v2 posted just now.  Please take another look.
> Unfortunately, I do not see noticeable difference than v1 in terms
> of mmc_test result.
>
> Do you have omap_hsmmc and mmci mmc_test result data to share?
>
I keep them here:
https://wiki.linaro.org/WorkingGroups/Kernel/Specs/StoragePerfMMC-async-req

> --
> Regards,
> Shawn
>
>
/Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20 14:01         ` Shawn Guo
@ 2011-04-20 15:30           ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-20 15:30 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 20 April 2011 16:01, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Wed, Apr 20, 2011 at 10:01:22AM +0200, Per Forlin wrote:
>> On 17 April 2011 18:48, Shawn Guo <shawn.guo@freescale.com> wrote:
>> > On Mon, Apr 18, 2011 at 12:33:30AM +0800, Shawn Guo wrote:
>> >> pre_req() runs dma_map_sg() post_req() runs dma_unmap_sg.
>> >> If not calling pre_req() before mxs_mmc_request(), request()
>> >> will prepare the cache just like it did it before.
>> >> It is optional to use pre_req() and post_req().
>> >>
>> >> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>> >> ---
>> >>  drivers/mmc/host/mxs-mmc.c |   75 ++++++++++++++++++++++++++++++++++++++++++--
>> >>  1 files changed, 72 insertions(+), 3 deletions(-)
>> >>
>> >
>> > Here is the result of mmc_test case 37 ~ 40, which are designed to see
>> > the performance improvement introduced by non-blocking changes.
>> >
>> > Honestly, the improvement is not so impressive.  Not sure if the patch
>> > for mxs-mmc pre_req and post_req support was correctly produced.  So
>> > please help review ...
>> My guess is that dma_unmap is not run in parallel with transfer.
>> Please look at my patch reply.
>>
> Got it fixed in v2 posted just now.  Please take another look.
> Unfortunately, I do not see noticeable difference than v1 in terms
> of mmc_test result.
>
Remove dma_map and dma_unmap from your host driver and run the tests
(obviously non-blocking and blocking will then give the same results).
If there is still no performance gain, the cache penalty is very small
on your platform, and non-blocking therefore doesn't improve things
much. Please let me know the result; a sketch of the temporary hack is
shown below.
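
A hypothetical illustration of the hack in the host's DMA setup path
(timing-only; with no cache maintenance the driver is not functionally
correct, so only the mmc_test durations are meaningful):

	if (data) {
	#if 0	/* test hack: skip cache maintenance to measure its cost */
		dma_map_sg(mmc_dev(host->mmc), data->sg,
			   data->sg_len, host->dma_dir);
	#endif
		sgl = data->sg;
		sg_len = data->sg_len;
	}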

BR,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20 15:22           ` Per Forlin
@ 2011-04-21  6:25             ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-21  6:25 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On Wed, Apr 20, 2011 at 05:22:34PM +0200, Per Forlin wrote:
[...]
> > Do you have omap_hsmmc and mmci mmc_test result data to share?
> >
> I keep the here:
> https://wiki.linaro.org/WorkingGroups/Kernel/Specs/StoragePerfMMC-async-req
> 
Actually, I have seen this page before.  I was wondering if you have
the raw mmc_test data (test logs for cases #37 ~ #40) to share.

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-20 15:30           ` Per Forlin
@ 2011-04-21  6:29             ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-21  6:29 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
[...]
> Remove dma_map and dma_unmap from your host driver and run the tests
> (obviously nonblocking and blocking will have the same results). If
> there is still no performance gain the cache penalty is very small on
> your platform and therefore nonblocking doesn't improve things much.
> Please let me know the result.
> 
Sorry, I do not understand.  What is the point of running the test
when the driver is broken?  Removing dma_map_sg and dma_unmap_sg makes
the mxs-mmc host driver broken.

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  6:29             ` Shawn Guo
@ 2011-04-21  8:46               ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-21  8:46 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 21 April 2011 08:29, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
> [...]
>> Remove dma_map and dma_unmap from your host driver and run the tests
>> (obviously nonblocking and blocking will have the same results). If
>> there is still no performance gain the cache penalty is very small on
>> your platform and therefore nonblocking doesn't improve things much.
>> Please let me know the result.
>>
> Sorry, I could not understand.  What's the point to run the test when
> the driver is even broken.  The removal of  dma_map_sg and
> dma_unmap_sg makes mxs-mmc host driver broken.
The point is only to measure the cost of handling dma_map_sg and
dma_unmap_sg; this is the maximum time non-blocking mmc requests
can save.
The non-blocking mmc_test should save the total time of dma_map_sg and
dma_unmap_sg, if the pre_req and post_req hooks are implemented
correctly.
Running without dma_map_sg and dma_unmap_sg will confirm whether the
pre_req and post_req hooks are implemented correctly; the timeline
sketch below shows where the saving comes from.
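
To make the expected saving concrete, a rough timeline (a conceptual
sketch only, not driver code):

	/*
	 * blocking:      map(A) xfer(A) unmap(A) map(B) xfer(B) unmap(B)
	 *
	 * non-blocking:  map(A) xfer(A)          xfer(B)
	 *                        \ map(B)         \ unmap(A)    unmap(B)
	 *
	 * map(B) overlaps the transfer of A, and unmap(A) overlaps the
	 * transfer of B, so the best-case saving per request pair is the
	 * summed dma_map_sg()/dma_unmap_sg() time.
	 */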

>
> --
> Regards,
> Shawn
>
>
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  6:25             ` Shawn Guo
@ 2011-04-21  8:52               ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-21  8:52 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 21 April 2011 08:25, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Wed, Apr 20, 2011 at 05:22:34PM +0200, Per Forlin wrote:
> [...]
>> > Do you have omap_hsmmc and mmci mmc_test result data to share?
>> >
>> I keep the here:
>> https://wiki.linaro.org/WorkingGroups/Kernel/Specs/StoragePerfMMC-async-req
>>
> Actually, I have seen this page before.  I was wondering if you have
> mmc_test raw data (test log of cases #37 ~ #40) to share.
My test numbers are 39 - 42, but they are the same tests.

PANDA - omap_hsmmc:
mmc0: Test case 39. Multiple write performance with sync req 4k to 4MB...
 mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 18.987487792
seconds (7068 kB/s, 6903 KiB/s)
 mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took
11.947631837 seconds (11233 kB/s, 10970 KiB/s)
 mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 9.472808839
seconds (14168 kB/s, 13836 KiB/s)
 mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 8.260650636
seconds (16247 kB/s, 15867 KiB/s)
 mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 7.622741699
seconds (17607 kB/s, 17194 KiB/s)
 mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took
7.323181152 seconds (18327 kB/s, 17898 KiB/s)
 mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 6.744140626
seconds (19901 kB/s, 19434 KiB/s)
 mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 6.679138184
seconds (20095 kB/s, 19624 KiB/s)
 mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took
6.625518800 seconds (20257 kB/s, 19782 KiB/s)
 mmc0: Transfer of 32 x 8192 sectors (32 x 4096 KiB) took 6.620574950
seconds (20272 kB/s, 19797 KiB/s)
 mmc0: Result: OK
 mmc0: Tests completed.
 mmc0: Starting tests of card mmc0:80ca...
 mmc0: Test case 40. Multiple write performance with async req 4k to 4MB...
 mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 18.647644043
seconds (7197 kB/s, 7028 KiB/s)
 mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took
11.624816894 seconds (11545 kB/s, 11275 KiB/s)
 mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 9.170440675
seconds (14635 kB/s, 14292 KiB/s)
 mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 7.965667725
seconds (16849 kB/s, 16454 KiB/s)
 mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 7.318023681
seconds (18340 kB/s, 17910 KiB/s)
 mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took
7.040740969 seconds (19063 kB/s, 18616 KiB/s)
 mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 6.444641113
seconds (20826 kB/s, 20338 KiB/s)
 mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 6.380249023
seconds (21036 kB/s, 20543 KiB/s)
 mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took
6.333343506 seconds (21192 kB/s, 20695 KiB/s)
 mmc0: Transfer of 32 x 8192 sectors (32 x 4096 KiB) took 6.328002930
seconds (21210 kB/s, 20713 KiB/s)
 mmc0: Result: OK
 mmc0: Tests completed.
 mmc0: Starting tests of card mmc0:80ca...
 mmc0: Test case 41. Multiple read performance with sync req 4k to 4MB...
 mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 20.567749024
seconds (6525 kB/s, 6372 KiB/s)
 mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took
12.770507813 seconds (10509 kB/s, 10263 KiB/s)
 mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 10.003143311
seconds (13417 kB/s, 13103 KiB/s)
 mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 7.775665284
seconds (17261 kB/s, 16856 KiB/s)
 mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 6.960906982
seconds (19281 kB/s, 18829 KiB/s)
 mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took
6.661651612 seconds (20147 kB/s, 19675 KiB/s)
 mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 6.510711672
seconds (20614 kB/s, 20131 KiB/s)
 mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 6.434722900
seconds (20858 kB/s, 20369 KiB/s)
 mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took
6.396850587 seconds (20981 kB/s, 20490 KiB/s)
 mmc0: Transfer of 32 x 8192 sectors (32 x 4096 KiB) took 6.368560791
seconds (21075 kB/s, 20581 KiB/s)
 mmc0: Result: OK
 mmc0: Tests completed.
 mmc0: Starting tests of card mmc0:80ca...
 mmc0: Test case 42. Multiple read performance with async req 4k to 4MB...
 mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 20.650848389
seconds (6499 kB/s, 6347 KiB/s)
 mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took
12.796203613 seconds (10488 kB/s, 10243 KiB/s)
 mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 10.019592286
seconds (13395 kB/s, 13081 KiB/s)
 mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 7.603942870
seconds (17651 kB/s, 17237 KiB/s)
 mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 6.751251223
seconds (19880 kB/s, 19414 KiB/s)
 mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took
6.252929687 seconds (21464 kB/s, 20961 KiB/s)
 mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 5.956726076
seconds (22532 kB/s, 22004 KiB/s)
 mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 5.863586425
seconds (22890 kB/s, 22353 KiB/s)
 mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took
5.818939210 seconds (23065 kB/s, 22525 KiB/s)
 mmc0: Transfer of 32 x 8192 sectors (32 x 4096 KiB) took 5.797515869
seconds (23150 kB/s, 22608 KiB/s)
 mmc0: Result: OK
 mmc0: Tests completed.


U5500 - mmci
mmc0: Test case 38. Multiple write performance with sync req 4k to 4MB...
mmc0: Transfer of 65536 x 8 sectors (65536 x 4 KiB) took 94.638215679
seconds (2836 kB/s, 2769 KiB/s)
mmc0: Transfer of 32768 x 16 sectors (32768 x 8 KiB) took 42.323242363
seconds (6342 kB/s, 6193 KiB/s)
mmc0: Transfer of 16384 x 32 sectors (16384 x 16 KiB) took
26.845500458 seconds (9999 kB/s, 9764 KiB/s)
mmc0: Transfer of 8192 x 64 sectors (8192 x 32 KiB) took 19.154612107
seconds (14014 kB/s, 13685 KiB/s)
mmc0: Transfer of 4096 x 128 sectors (4096 x 64 KiB) took 15.318183875
seconds (17523 kB/s, 17113 KiB/s)
mmc0: Transfer of 2048 x 256 sectors (2048 x 128 KiB) took
13.373840666 seconds (20071 kB/s, 19601 KiB/s)
mmc0: Transfer of 1024 x 512 sectors (1024 x 256 KiB) took
12.335949719 seconds (21760 kB/s, 21250 KiB/s)
mmc0: Transfer of 512 x 1024 sectors (512 x 512 KiB) took 11.886310118
seconds (22583 kB/s, 22054 KiB/s)
mmc0: Transfer of 256 x 2048 sectors (256 x 1024 KiB) took
11.703806675 seconds (22935 kB/s, 22398 KiB/s)
mmc0: Transfer of 64 x 8192 sectors (64 x 4096 KiB) took 11.632018242
seconds (23077 kB/s, 22536 KiB/s)
mmc0: Result: OK
mmc0: Tests completed.
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 39. Multiple write performance with async req 4k to 4MB...
mmc0: Transfer of 65536 x 8 sectors (65536 x 4 KiB) took 94.567817239
seconds (2838 kB/s, 2772 KiB/s)
mmc0: Transfer of 32768 x 16 sectors (32768 x 8 KiB) took 42.043969582
seconds (6384 kB/s, 6234 KiB/s)
mmc0: Transfer of 16384 x 32 sectors (16384 x 16 KiB) took
26.483716374 seconds (10135 kB/s, 9898 KiB/s)
mmc0: Transfer of 8192 x 64 sectors (8192 x 32 KiB) took 18.694318999
seconds (14359 kB/s, 14022 KiB/s)
mmc0: Transfer of 4096 x 128 sectors (4096 x 64 KiB) took 14.832822022
seconds (18097 kB/s, 17673 KiB/s)
mmc0: Transfer of 2048 x 256 sectors (2048 x 128 KiB) took
12.792243128 seconds (20984 kB/s, 20492 KiB/s)
mmc0: Transfer of 1024 x 512 sectors (1024 x 256 KiB) took
11.793732364 seconds (22760 kB/s, 22227 KiB/s)
mmc0: Transfer of 512 x 1024 sectors (512 x 512 KiB) took 11.299139279
seconds (23757 kB/s, 23200 KiB/s)
mmc0: Transfer of 256 x 2048 sectors (256 x 1024 KiB) took
11.160603255 seconds (24052 kB/s, 23488 KiB/s)
mmc0: Transfer of 64 x 8192 sectors (64 x 4096 KiB) took 11.036530115
seconds (24322 kB/s, 23752 KiB/s)
mmc0: Result: OK
mmc0: Tests completed.
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 40. Multiple read performance with sync req 4k to 4MB...
mmc0: Transfer of 65536 x 8 sectors (65536 x 4 KiB) took 24.830932935
seconds (10810 kB/s, 10557 KiB/s)
mmc0: Transfer of 32768 x 16 sectors (32768 x 8 KiB) took 15.896942145
seconds (16885 kB/s, 16490 KiB/s)
mmc0: Transfer of 16384 x 32 sectors (16384 x 16 KiB) took
11.452273667 seconds (23439 kB/s, 22890 KiB/s)
mmc0: Transfer of 8192 x 64 sectors (8192 x 32 KiB) took 9.247484468
seconds (29027 kB/s, 28347 KiB/s)
mmc0: Transfer of 4096 x 128 sectors (4096 x 64 KiB) took 8.117996552
seconds (33066 kB/s, 32291 KiB/s)
mmc0: Transfer of 2048 x 256 sectors (2048 x 128 KiB) took 7.672241687
seconds (34987 kB/s, 34167 KiB/s)
mmc0: Transfer of 1024 x 512 sectors (1024 x 256 KiB) took 7.588627916
seconds (35373 kB/s, 34544 KiB/s)
mmc0: Transfer of 512 x 1024 sectors (512 x 512 KiB) took 7.545239751
seconds (35576 kB/s, 34742 KiB/s)
mmc0: Transfer of 256 x 2048 sectors (256 x 1024 KiB) took 7.558409240
seconds (35514 kB/s, 34682 KiB/s)
mmc0: Transfer of 64 x 8192 sectors (64 x 4096 KiB) took 7.567300503
seconds (35473 kB/s, 34641 KiB/s)
mmc0: Result: OK
mmc0: Tests completed.
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 41. Multiple read performance with async req 4k to 4MB...
mmc0: Transfer of 65536 x 8 sectors (65536 x 4 KiB) took 24.077512698
seconds (11148 kB/s, 10887 KiB/s)
mmc0: Transfer of 32768 x 16 sectors (32768 x 8 KiB) took 15.280992666
seconds (17566 kB/s, 17154 KiB/s)
mmc0: Transfer of 16384 x 32 sectors (16384 x 16 KiB) took
10.892908603 seconds (24643 kB/s, 24065 KiB/s)
mmc0: Transfer of 8192 x 64 sectors (8192 x 32 KiB) took 8.698435941
seconds (30860 kB/s, 30136 KiB/s)
mmc0: Transfer of 4096 x 128 sectors (4096 x 64 KiB) took 7.617525358
seconds (35239 kB/s, 34413 KiB/s)
mmc0: Transfer of 2048 x 256 sectors (2048 x 128 KiB) took 7.058797630
seconds (38028 kB/s, 37137 KiB/s)
mmc0: Transfer of 1024 x 512 sectors (1024 x 256 KiB) took 6.779599610
seconds (39594 kB/s, 38666 KiB/s)
mmc0: Transfer of 512 x 1024 sectors (512 x 512 KiB) took 6.639430826
seconds (40430 kB/s, 39482 KiB/s)
mmc0: Transfer of 256 x 2048 sectors (256 x 1024 KiB) took 6.588963683
seconds (40740 kB/s, 39785 KiB/s)
mmc0: Transfer of 64 x 8192 sectors (64 x 4096 KiB) took 6.561933369
seconds (40907 kB/s, 39949 KiB/s)



>
> --
> Regards,
> Shawn
>
>
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  8:46               ` Per Forlin
@ 2011-04-21  9:11                 ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-21  9:11 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On Thu, Apr 21, 2011 at 10:46:18AM +0200, Per Forlin wrote:
> On 21 April 2011 08:29, Shawn Guo <shawn.guo@freescale.com> wrote:
> > On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
> > [...]
> >> Remove dma_map and dma_unmap from your host driver and run the tests
> >> (obviously nonblocking and blocking will have the same results). If
> >> there is still no performance gain the cache penalty is very small on
> >> your platform and therefore nonblocking doesn't improve things much.
> >> Please let me know the result.
> >>
> > Sorry, I could not understand.  What's the point to run the test when
> > the driver is even broken.  The removal of  dma_map_sg and
> > dma_unmap_sg makes mxs-mmc host driver broken.
> The point is only to get a measurement of the cost of handling
> dma_map_sg and dma_unmap_sg, this is the maximum time mmc nonblocking
> can save.
> The nonblocking mmc_test should save the total time of dma_map_sg and
> dma_unmap_sg, if the pre_req and post_req hooks are implemented
> correctly.
> Running without dma_map_sg and dma_unmap_sg will confirm if the
> pre_req and post_req hooks are implemented correctly.
> 
With dma_map_sg and dma_unmap_sg removed, the mmc_test gave very low
numbers, though the blocking and non-blocking numbers are the same.
Is that an indication that the pre_req and post_req hooks are not
implemented correctly?  If so, can you please help catch the mistakes?

Thanks.

mmc0: Test case 39. Read performance with blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 56.875013015 seconds (2359 kB/s, 2304 KiB/s, 576.14 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 47.407562500 seconds (2831 kB/s, 2764 KiB/s, 345.59 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 42.708718750 seconds (3142 kB/s, 3068 KiB/s, 191.81 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 40.227125000 seconds (3336 kB/s, 3258 KiB/s, 101.82 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 38.915750000 seconds (3448 kB/s, 3368 KiB/s, 52.62 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 38.249562498 seconds (3509 kB/s, 3426 KiB/s, 26.77 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 37.912342548 seconds (3540 kB/s, 3457 KiB/s, 13.50 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 37.743876391 seconds (3556 kB/s, 3472 KiB/s, 6.78 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 37.658104019 seconds (3564 kB/s, 3480 KiB/s, 3.39 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 37.086429038 seconds (3569 kB/s, 3486 KiB/s, 1.05 IOPS)
mmc0: Result: OK
mmc0: Test case 40. Read performance with none blocking req 4k to 4MB...
mmc0: Transfer of 32768 x 8 sectors (32768 x 4 KiB) took 56.732932555 seconds (2365 kB/s, 2310 KiB/s, 577.58 IOPS)
mmc0: Transfer of 16384 x 16 sectors (16384 x 8 KiB) took 47.342812500 seconds (2835 kB/s, 2768 KiB/s, 346.07 IOPS)
mmc0: Transfer of 8192 x 32 sectors (8192 x 16 KiB) took 42.673906250 seconds (3145 kB/s, 3071 KiB/s, 191.96 IOPS)
mmc0: Transfer of 4096 x 64 sectors (4096 x 32 KiB) took 40.208218750 seconds (3338 kB/s, 3259 KiB/s, 101.86 IOPS)
mmc0: Transfer of 2048 x 128 sectors (2048 x 64 KiB) took 38.906750000 seconds (3449 kB/s, 3368 KiB/s, 52.63 IOPS)
mmc0: Transfer of 1024 x 256 sectors (1024 x 128 KiB) took 38.244749999 seconds (3509 kB/s, 3427 KiB/s, 26.77 IOPS)
mmc0: Transfer of 512 x 512 sectors (512 x 256 KiB) took 37.909719946 seconds (3540 kB/s, 3457 KiB/s, 13.50 IOPS)
mmc0: Transfer of 256 x 1024 sectors (256 x 512 KiB) took 37.741834105 seconds (3556 kB/s, 3472 KiB/s, 6.78 IOPS)
mmc0: Transfer of 128 x 2048 sectors (128 x 1024 KiB) took 37.657555456 seconds (3564 kB/s, 3480 KiB/s, 3.39 IOPS)
mmc0: Transfer of 39 x 6630 sectors (39 x 3315 KiB) took 37.086351431 seconds (3569 kB/s, 3486 KiB/s, 1.05 IOPS)
mmc0: Result: OK
mmc0: Tests completed.

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  9:11                 ` Shawn Guo
@ 2011-04-21  9:47                   ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-21  9:47 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 21 April 2011 11:11, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Thu, Apr 21, 2011 at 10:46:18AM +0200, Per Forlin wrote:
>> On 21 April 2011 08:29, Shawn Guo <shawn.guo@freescale.com> wrote:
>> > On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
>> > [...]
>> >> Remove dma_map and dma_unmap from your host driver and run the tests
>> >> (obviously nonblocking and blocking will have the same results). If
>> >> there is still no performance gain the cache penalty is very small on
>> >> your platform and therefore nonblocking doesn't improve things much.
>> >> Please let me know the result.
>> >>
>> > Sorry, I could not understand.  What's the point to run the test when
>> > the driver is even broken.  The removal of  dma_map_sg and
>> > dma_unmap_sg makes mxs-mmc host driver broken.
>> The point is only to get a measurement of the cost of handling
>> dma_map_sg and dma_unmap_sg; this is the maximum time nonblocking mmc
>> requests can save.
>> The nonblocking mmc_test should save the total time of dma_map_sg and
>> dma_unmap_sg, if the pre_req and post_req hooks are implemented
>> correctly.
>> Running without dma_map_sg and dma_unmap_sg will confirm if the
>> pre_req and post_req hooks are implemented correctly.
>>
> With dma_map_sg and dma_unmap_sg removed, the mmc_test gave very low
> numbers, though the blocking and non-blocking numbers are the same.  Is it
> an indication that the pre_req and post_req hooks are not implemented
> correctly?
I think you could get the same numbers for the nonblocking case with
dma_map and dma_unmap in place.

>  If yes, can you please help to catch the mistakes?
I will take a look.

> --
> Regards,
> Shawn
>
>
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  9:47                   ` Per Forlin
@ 2011-04-21 10:15                     ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-21 10:15 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 21 April 2011 11:47, Per Forlin <per.forlin@linaro.org> wrote:
> On 21 April 2011 11:11, Shawn Guo <shawn.guo@freescale.com> wrote:
>> On Thu, Apr 21, 2011 at 10:46:18AM +0200, Per Forlin wrote:
>>> On 21 April 2011 08:29, Shawn Guo <shawn.guo@freescale.com> wrote:
>>> > On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
>>> > [...]
>>> >> Remove dma_map and dma_unmap from your host driver and run the tests
>>> >> (obviously nonblocking and blocking will have the same results). If
>>> >> there is still no performance gain the cache penalty is very small on
>>> >> your platform and therefore nonblocking doesn't improve things much.
>>> >> Please let me know the result.
>>> >>
>>> > Sorry, I could not understand.  What's the point to run the test when
>>> > the driver is even broken.  The removal of  dma_map_sg and
>>> > dma_unmap_sg makes mxs-mmc host driver broken.
>>> The point is only to get a measurement of the cost of handling
>>> dma_map_sg and dma_unmap_sg; this is the maximum time nonblocking mmc
>>> requests can save.
>>> The nonblocking mmc_test should save the total time of dma_map_sg and
>>> dma_unmap_sg, if the pre_req and post_req hooks are implemented
>>> correctly.
>>> Running without dma_map_sg and dma_unmap_sg will confirm if the
>>> pre_req and post_req hooks are implemented correctly.
>>>
>> With dma_map_sg and dma_unmap_sg removed, the mmc_test gave very low
>> numbers, though the blocking and non-blocking numbers are the same.  Is it
>> an indication that the pre_req and post_req hooks are not implemented
>> correctly?
> I think you could get the same numbers for the nonblocking case with
> dma_map and dma_unmap in place.
>
>>  If yes, can you please help to catch the mistakes?
> I will take a look.
>
I found something:
static void mxs_mmc_request(struct mmc_host *mmc, struct mmc_request *mrq)
{
	struct mxs_mmc_host *host = mmc_priv(mmc);

	WARN_ON(host->mrq != NULL);
	host->mrq = mrq;
	mxs_mmc_start_cmd(host, mrq->cmd);
}

This is the execution flow:
pre_req()
mxs_mmc_request()
post_req()
wait_for_completion()
pre_req()

mxs_mmc_request() returns before the prepared value is used.
post_req() will run dma_unmap and set the cookie to 0; this means in
your case dma_unmap_sg will be called twice.
You need to store away the prepared data in mxs_mmc_request().
Look at my patch for mmci, function mmci_get_next_data. That function
deals with this issue.
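
Roughly like this (just a sketch to show the idea; the premapped field
and the mxs_mmc_dma_done() helper are made up here, the real reference
is still mmci_get_next_data()):

static void mxs_mmc_request(struct mmc_host *mmc, struct mmc_request *mrq)
{
	struct mxs_mmc_host *host = mmc_priv(mmc);

	WARN_ON(host->mrq != NULL);
	host->mrq = mrq;
	/* host_cookie != 0 means pre_req() already did dma_map_sg() */
	host->premapped = mrq->data && mrq->data->host_cookie;
	mxs_mmc_start_cmd(host, mrq->cmd);
}

static void mxs_mmc_dma_done(struct mxs_mmc_host *host, struct mmc_data *data)
{
	/* only unmap here if the data was mapped in the request path;
	 * pre-mapped data must be left for post_req() to unmap */
	if (!host->premapped)
		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
			     (data->flags & MMC_DATA_WRITE) ?
			     DMA_TO_DEVICE : DMA_FROM_DEVICE);
}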

I didn't see this issue when I only looked at the patch, since no
changes were made in the request function.

>> --
>> Regards,
>> Shawn
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-16 23:06     ` Andrei Warkentin
@ 2011-04-22 11:01       ` Jaehoon Chung
  -1 siblings, 0 replies; 129+ messages in thread
From: Jaehoon Chung @ 2011-04-22 11:01 UTC (permalink / raw)
  To: Andrei Warkentin
  Cc: Shawn Guo, linux-mmc, linux-arm-kernel, linaro-kernel, patches,
	cjb, per.forlin, Kyungmin Park

Hi Andrei,

Did you test this patch with ADMA?
I wonder whether it increased performance or anything else.

Regards,
Jaehoon Chung

Andrei Warkentin wrote:
> Hi Shawn,
> 
> On Sat, Apr 16, 2011 at 11:48 AM, Shawn Guo <shawn.guo@linaro.org> wrote:
>> pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
>> If pre_req() is not called before sdhci_request(), request()
>> will prepare the cache just like it did before.
>> It is optional to use pre_req() and post_req().
>>
>> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>> ---
>> I worked out the patch by referring to Per's patch below.
>>
>>  omap_hsmmc: add support for pre_req and post_req
>>
>> It adds pre_req and post_req support for sdhci based host drivers to
>> work with Per's non-blocking optimization.  But I only have imx esdhc
>> based hardware to test.  Unfortunately, I can not measure the
>> performance gain using mmc_test, because the current esdhc driver on
>> mainline fails on the test.  So I just did a quick test using 'dd',
>> but sadly, I did not see a noticeable performance gain here.  The
>> following are possible reasons I can think of right away.
>>
>> * The patch did not add pre_req and post_req correctly.  Please help
>>  review to catch the mistakes if any.
>> * The imx esdhc driver uses SDHCI_SDMA (max_segs is 1) rather than
>>  SDHCI_ADMA (max_segs is 128), due to the broken ADMA support on imx
>>  esdhc.  So can people holding other sdhci based hardware give the
>>  patch a try?
>>
>> Hopefully, I can find some time to have a close look at the mmc_test
>> failure and the broken ADMA with imx esdhc.
>>
> 
> I'll try it out...
> 
> A
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-16 23:06     ` Andrei Warkentin
@ 2011-04-26  1:26       ` Jaehoon Chung
  -1 siblings, 0 replies; 129+ messages in thread
From: Jaehoon Chung @ 2011-04-26  1:26 UTC (permalink / raw)
  To: Andrei Warkentin
  Cc: Shawn Guo, linux-mmc, linux-arm-kernel, linaro-kernel, patches,
	cjb, per.forlin, Kyungmin Park

Hi Shawn,

I tested ADMA with your patch (benchmark: IOzone),
but I didn't get any performance improvement with ADMA
(I can see a performance improvement with SDMA).

I want to know what you think about this.

Regards,
Jaehoon Chung

Andrei Warkentin wrote:
> Hi Shawn,
> 
> On Sat, Apr 16, 2011 at 11:48 AM, Shawn Guo <shawn.guo@linaro.org> wrote:
>> pre_req() runs dma_map_sg(); post_req() runs dma_unmap_sg().
>> If pre_req() is not called before sdhci_request(), request()
>> will prepare the cache just like it did before.
>> It is optional to use pre_req() and post_req().
>>
>> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>> ---
>> I worked out the patch by referring to Per's patch below.
>>
>>  omap_hsmmc: add support for pre_req and post_req
>>
>> It adds pre_req and post_req support for sdhci based host drivers to
>> work with Per's non-blocking optimization.  But I only have imx esdhc
>> based hardware to test.  Unfortunately, I can not measure the
>> performance gain using mmc_test, because the current esdhc driver on
>> mainline fails on the test.  So I just did a quick test using 'dd',
>> but sadly, I did not see a noticeable performance gain here.  The
>> following are possible reasons I can think of right away.
>>
>> * The patch did not add pre_req and post_req correctly.  Please help
>>  review to catch the mistakes if any.
>> * The imx esdhc driver uses SDHCI_SDMA (max_segs is 1) rather than
>>  SDHCI_ADMA (max_segs is 128), due to the broken ADMA support on imx
>>  esdhc.  So can people holding other sdhci based hardware give the
>>  patch a try?
>>
>> Hopefully, I can find some time to have a close look at the mmc_test
>> failure and the broken ADMA with imx esdhc.
>>
> 
> I'll try it out...
> 
> A
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-26  1:26       ` Jaehoon Chung
@ 2011-04-26  2:47         ` Shawn Guo
  -1 siblings, 0 replies; 129+ messages in thread
From: Shawn Guo @ 2011-04-26  2:47 UTC (permalink / raw)
  To: Jaehoon Chung
  Cc: Andrei Warkentin, Shawn Guo, linux-mmc, linux-arm-kernel,
	linaro-kernel, patches, cjb, per.forlin, Kyungmin Park

On Tue, Apr 26, 2011 at 10:26:01AM +0900, Jaehoon Chung wrote:
> Hi Shawn
> 
> I tested ADMA with your patch (benchmark: IOzone),
> but I didn't get any performance improvement with ADMA
> (I can see a performance improvement with SDMA).
> 
> I want to know what you think about this.
> 
It's still an open question whether pre_req and post_req were added
correctly, even though you have seen an improvement in the SDMA case
with IOzone.  I would leave the question to Per Forlin.

Given your result, I'm interested in trying the IOzone test with esdhc
to see if there is any difference with SDMA.

BTW, what performance improvement numbers are you seeing?
And do you have mmc_test results to share?

-- 
Regards,
Shawn


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-26  2:47         ` Shawn Guo
@ 2011-04-26 10:21           ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-26 10:21 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Jaehoon Chung, Andrei Warkentin, Shawn Guo, linux-mmc,
	linux-arm-kernel, linaro-kernel, patches, cjb, Kyungmin Park

On 26 April 2011 04:47, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Tue, Apr 26, 2011 at 10:26:01AM +0900, Jaehoon Chung wrote:
>> Hi Shawn
>>
>> I tested ADMA with your patch (benchmark: IOzone),
>> but I didn't get any performance improvement with ADMA
>> (I can see a performance improvement with SDMA).
>>
>> I want to know what you think about this.
>>
> It's still an open question whether pre_req and post_req were added
> correctly, even though you have seen an improvement in the SDMA case
> with IOzone.  I would leave the question to Per Forlin.
>
Performance numbers from user space may vary.
Currently I am looking into how the block layer adds requests to the
mmc blockdev. I can see that it is common for one read request to be
pushed to the mmc blockdev queue only after the previous request has
already finished. For this scenario there will be no improvement. If
running IOzone with large record sizes, multiple requests are queued up
in the mmc blockdev queue, and this results in an increase in bandwidth.
The mmc_tests are intended to help verify that the pre_req and
post_req hooks are implemented correctly and to give a number for the
maximum performance gain.
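
The core of the nonblocking flow in mmc_test is roughly this
(pseudo-kernel-C, just a sketch; get_next_request() and
wait_for_request_done() are made-up helpers, not the actual core
functions):

	struct mmc_request *prev = NULL, *cur;

	while ((cur = get_next_request())) {
		/* map and prepare cur while prev is still on the bus */
		host->ops->pre_req(host, cur, prev == NULL);
		if (prev) {
			wait_for_request_done(prev);
			host->ops->post_req(host, prev, 0); /* dma_unmap_sg etc. */
		}
		mmc_start_request(host, cur);
		prev = cur;
	}
	if (prev) {
		wait_for_request_done(prev);
		host->ops->post_req(host, prev, 0);
	}

Only when two requests are in flight back to back does the pre_req of
the next one overlap the transfer of the previous one; a queue depth of
one degenerates to the blocking case, which matches what I see from the
block layer above.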

> --
> Regards,
> Shawn
Regards,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 01/12] mmc: add none blocking mmc request function
  2011-04-20  7:17       ` Per Forlin
  (?)
@ 2011-04-26 13:29         ` David Vrabel
  -1 siblings, 0 replies; 129+ messages in thread
From: David Vrabel @ 2011-04-26 13:29 UTC (permalink / raw)
  To: Per Forlin
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 20/04/11 08:17, Per Forlin wrote:
> 
>> Using a MMC request queue has other benefits -- it allows multiple users
>> without having to claim/release the host.  This would be useful for
>> (especially multi-function) SDIO.
>
> You mean claim and release would be done only within the mmc core. The
> time saved here would equal the time it takes to release and claim
> the host.
> Claim and release can also be used for power management to indicate if
> any client is using the host, if not the power can be switched off.

Isn't there a separate runtime power management API that is different
from claim/release?

> I just want to make sure I understand the multi-function SDIO case, I
> haven't done any work with SDIO.
> Can the SDIO functions compete over the same claim_host at the same time?
> Example: if function 1 claims the host, function 2 and function 3 also
> want to claim the host but have to wait for function 1 to release the
> host.

This is the case.   Each function driver has to claim exclusive access
to the host.

> What is the extra benefit of having the internal request queue for
> multi function SDIO?

It reduces the delays between commands if multiple drivers are sending
commands.  I estimated performance improvements of 2-3% from just
removing the need to claim/release in one particular SDIO function
driver.  Performance improvements for multi-function cards would be a
bit more (5% perhaps?).

The more important benefit is the simplification of the API.
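
For reference, this is the pattern every function driver has to repeat
today (a sketch; my_func_read_status() is a made-up example, but
sdio_claim_host(), sdio_readb() and sdio_release_host() are the real
helpers), and it is exactly what a core-side request queue would drop
from the hot path:

#include <linux/mmc/sdio_func.h>

static int my_func_read_status(struct sdio_func *func, unsigned int addr)
{
	int err;
	u8 val;

	sdio_claim_host(func);		/* may sleep until other functions release */
	val = sdio_readb(func, addr, &err);
	sdio_release_host(func);	/* hand the host to the next claimant */

	return err ? err : val;
}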

David
-- 
David Vrabel, Senior Software Engineer, Drivers
CSR, Churchill House, Cambridge Business Park,  Tel: +44 (0)1223 692562
Cowley Road, Cambridge, CB4 0WZ                 http://www.csr.com/



^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH v2 01/12] mmc: add none blocking mmc request function
@ 2011-04-26 14:22           ` Per Forlin
  0 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-26 14:22 UTC (permalink / raw)
  To: David Vrabel
  Cc: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev, Chris Ball

On 26 April 2011 15:29, David Vrabel <david.vrabel@csr.com> wrote:
> On 20/04/11 08:17, Per Forlin wrote:
>>
>>> Using a MMC request queue has other benefits -- it allows multiple users
>>> without having to claim/release the host.  This would be useful for
>>> (especially multi-function) SDIO.
>>
>> You mean claim and release would be done only within the mmc core. The
>> time saved here would equal the time it takes to release and claim
>> the host.
>> Claim and release can also be used for power management to indicate if
>> any client is using the host, if not the power can be switched off.
>
> Isn't there a separate runtime power management API that is different
> from claim/release?
>
I misunderstood. I thought you meant that claim() and release()
would not be needed if there were an internal request queue in core.c.
Please disregard my comment.

>> I just want to make sure I understand the multi-function SDIO case, I
>> haven't done any work with SDIO.
>> Can the SDIO functions compete over the same claim_host at the same time?
>> Example: if function 1 claims the host, function 2 and function 3 also
>> want to claim the host but have to wait for function 1 to release the
>> host.
>
> This is the case.   Each function driver has to claim exclusive access
> to the host.
>
>> What is the extra benefit of having the internal request queue for
>> multi function SDIO?
>
> It reduces the delays between commands if multiple drivers are sending
> commands.  I estimated performance improvements of 2-3% from just
> removing the need to claim/release in one particular SDIO function
> driver.  Performance improvements for multi-function cards would be a
> bit more (5% perhaps?).
>
Your estimates are promising.

> The more important benefit is the simplification of the API.
I agree. I will make a prototype for this. I don't think I will be
able to find time for this until the middle of May. I will let you
know when I have patches.

>
> David
Thanks,
Per

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: sdhci: add support for pre_req and post_req
  2011-04-22 11:01       ` Jaehoon Chung
@ 2011-04-27  0:59         ` Andrei Warkentin
  -1 siblings, 0 replies; 129+ messages in thread
From: Andrei Warkentin @ 2011-04-27  0:59 UTC (permalink / raw)
  To: Jaehoon Chung
  Cc: Shawn Guo, linux-mmc, linux-arm-kernel, linaro-kernel, patches,
	cjb, per.forlin, Kyungmin Park

Hi,

On Fri, Apr 22, 2011 at 6:01 AM, Jaehoon Chung <jh80.chung@samsung.com> wrote:
> Hi Andrei..
>
> Did you test this patch with ADMA?
> I wonder whether it increased performance or anything else.

FWIW...

ADMA

With changes

adb shell "echo 0 > /sys/module/sdhci/parameters/no_prepost"
time adb shell "iozone -a -f /cache/file -g 4m > /mnt/obb/with"

real	0m37.245s
user	0m0.010s
sys	0m0.000s

Without changes

adb shell "echo 1 > /sys/module/sdhci/parameters/no_prepost"
time adb shell "iozone -a -f /cache/file -g 4m > /mnt/obb/without"

real	0m38.400s
user	0m0.000s
sys	0m0.010s

SDMA plus BOUNCE_BUFFER

With changes

adb shell "echo 0 > /sys/module/sdhci/parameters/no_prepost"
time adb shell "iozone -a -f /cache/file -g 4m > /mnt/obb/with"

real	0m37.999s
user	0m0.000s
sys	0m0.010s

Without changes

adb shell "echo 1 > /sys/module/sdhci/parameters/no_prepost"
time adb shell "iozone -a -f /cache/file -g 4m > /mnt/obb/without"

real	0m39.717s
user	0m0.000s
sys	0m0.010s

Collected data using this patch on top of Shawn's...

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 3320c75..f698586 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -39,6 +39,7 @@
 #endif

 static unsigned int debug_quirks = 0;
+static unsigned int no_prepost = 0;

 static void sdhci_prepare_data(struct sdhci_host *, struct mmc_data *);
 static void sdhci_finish_data(struct sdhci_host *);
@@ -1140,6 +1141,8 @@ static void sdhci_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
                          bool is_first_req)
 {
        struct sdhci_host *host = mmc_priv(mmc);
+       if (no_prepost)
+               return;

        if (mrq->data->host_cookie) {
                mrq->data->host_cookie = 0;
@@ -1157,6 +1160,9 @@ static void sdhci_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
        struct sdhci_host *host = mmc_priv(mmc);
        struct mmc_data *data = mrq->data;

+       if (no_prepost)
+               return;
+
        if (host->flags & SDHCI_REQ_USE_DMA) {
                dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
                             (data->flags & MMC_DATA_WRITE) ?
@@ -2163,6 +2169,7 @@ module_init(sdhci_drv_init);
 module_exit(sdhci_drv_exit);

 module_param(debug_quirks, uint, 0444);
+module_param(no_prepost, uint, 0644);

 MODULE_AUTHOR("Pierre Ossman <pierre@ossman.eu>");
 MODULE_DESCRIPTION("Secure Digital Host Controller Interface core driver");


A

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-21  9:47                   ` Per Forlin
@ 2011-04-28  7:52                     ` Per Forlin
  -1 siblings, 0 replies; 129+ messages in thread
From: Per Forlin @ 2011-04-28  7:52 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Shawn Guo, linaro-kernel, linux-mmc, cjb, linux-arm-kernel, patches

On 21 April 2011 11:47, Per Forlin <per.forlin@linaro.org> wrote:
> On 21 April 2011 11:11, Shawn Guo <shawn.guo@freescale.com> wrote:
>> On Thu, Apr 21, 2011 at 10:46:18AM +0200, Per Forlin wrote:
>>> On 21 April 2011 08:29, Shawn Guo <shawn.guo@freescale.com> wrote:
>>> > On Wed, Apr 20, 2011 at 05:30:22PM +0200, Per Forlin wrote:
>>> > [...]
>>> >> Remove dma_map and dma_unmap from your host driver and run the tests
>>> >> (obviously nonblocking and blocking will have the same results). If
>>> >> there is still no performance gain the cache penalty is very small on
>>> >> your platform and therefore nonblocking doesn't improve things much.
>>> >> Please let me know the result.
>>> >>
>>> > Sorry, I could not understand.  What's the point to run the test when
>>> > the driver is even broken.  The removal of  dma_map_sg and
>>> > dma_unmap_sg makes mxs-mmc host driver broken.
>>> The point is only to get a measurement of the cost of handling
>>> dma_map_sg and dma_unmap_sg; this is the maximum time nonblocking mmc
>>> requests can save.
>>> The nonblocking mmc_test should save the total time of dma_map_sg and
>>> dma_unmap_sg, if the pre_req and post_req hooks are implemented
>>> correctly.
>>> Running without dma_map_sg and dma_unmap_sg will confirm if the
>>> pre_req and post_req hooks are implemented correctly.
>>>
>> With dma_map_sg and dma_unmap_sg removed, the mmc_test gave very low
>> numbers, though the blocking and non-blocking numbers are the same.  Is it
>> an indication that the pre_req and post_req hooks are not implemented
>> correctly?
> I think you could get the same numbers for the nonblocking case with
> dma_map and dma_unmap in place.
>
I wanted to test the performance without the cache penalty, but removing
dma_map_sg may not work since it produces the physically mapped sg list.
This is not as simple as I first thought. Make a copy of dma_map_sg
(call it dma_map_sg_no_cache) and modify it to not clean/invalidate the
cache. Replace dma_map_sg with dma_map_sg_no_cache in the mxs-mmc
driver.
Removing dma_unmap should be ok for this test case.
Do you still get very low numbers?
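
Something along these lines (sketch only; it assumes the DMA address
equals the physical address, i.e. no IOMMU, which should hold on mxs):

#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>

static int dma_map_sg_no_cache(struct device *dev, struct scatterlist *sgl,
			       int nents, enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	/* set up the DMA addresses the way dma_map_sg() would,
	 * but skip the cache clean/invalidate entirely */
	for_each_sg(sgl, sg, nents, i) {
		sg->dma_address = sg_phys(sg);
		sg_dma_len(sg) = sg->length;
	}
	return nents;
}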

>>  If yes, can you please help to catch the mistakes?
> I will take a look.
>
>> --
>> Regards,
>> Shawn
>>
>>
> Regards,
> Per
>

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] mmc: mxs-mmc: add support for pre_req and post_req
  2011-04-28  7:52                     ` Per Forlin
@ 2011-04-28 10:10                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 129+ messages in thread
From: Russell King - ARM Linux @ 2011-04-28 10:10 UTC (permalink / raw)
  To: Per Forlin
  Cc: Shawn Guo, cjb, patches, linux-mmc, linaro-kernel, Shawn Guo,
	linux-arm-kernel

On Thu, Apr 28, 2011 at 09:52:17AM +0200, Per Forlin wrote:
> I wanted to test the performance without the cache penalty, but removing
> dma_map_sg may not work since it produces the physically mapped sg list.
> This is not as simple as I first thought. Make a copy of dma_map_sg
> (call it dma_map_sg_no_cache) and modify it to not clean/invalidate the
> cache. Replace dma_map_sg with dma_map_sg_no_cache in the mxs-mmc
> driver.
> Removing dma_unmap should be ok for this test case.
> Do you still get very low numbers?

We can live in the hope that this is gathering evidence to illustrate
why DMA-incoherent caches are bad news for performance, and that one
day ARM Ltd will give us proper DMA cache coherency for all devices.

^ permalink raw reply	[flat|nested] 129+ messages in thread

end of thread, other threads:[~2011-04-28 10:10 UTC | newest]

Thread overview: 129+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-06 19:07 [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency Per Forlin
2011-04-06 19:07 ` [PATCH v2 01/12] mmc: add none blocking mmc request function Per Forlin
2011-04-15 10:34   ` David Vrabel
2011-04-20  7:17     ` Per Forlin
2011-04-26 13:29       ` David Vrabel
2011-04-26 14:22         ` Per Forlin
2011-04-06 19:07 ` [PATCH v2 02/12] mmc: mmc_test: add debugfs file to list all tests Per Forlin
2011-04-06 19:07 ` [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers Per Forlin
2011-04-17  7:09   ` Lin Tony-B19295
2011-04-20  7:30     ` Per Forlin
2011-04-17 15:46   ` Shawn Guo
2011-04-20  7:41     ` Per Forlin
2011-04-06 19:07 ` [PATCH v2 04/12] mmc: add member in mmc queue struct to hold request data Per Forlin
2011-04-06 19:07 ` [PATCH v2 05/12] mmc: add a block request prepare function Per Forlin
2011-04-06 19:07 ` [PATCH v2 06/12] mmc: move error code in mmc_block_issue_rw_rq to a separate function Per Forlin
2011-04-06 19:07 ` [PATCH v2 07/12] mmc: add a second mmc queue request member Per Forlin
2011-04-06 19:07 ` [PATCH v2 08/12] mmc: add handling for two parallel block requests in issue_rw_rq Per Forlin
2011-04-20 11:32   ` Per Forlin
2011-04-06 19:07 ` [PATCH v2 09/12] mmc: test: add random fault injection in core.c Per Forlin
2011-04-06 19:07 ` [PATCH v2 10/12] omap_hsmmc: use original sg_len for dma_unmap_sg Per Forlin
2011-04-06 19:07 ` [PATCH v2 11/12] omap_hsmmc: add support for pre_req and post_req Per Forlin
2011-04-06 19:07 ` [PATCH v2 12/12] mmci: implement pre_req() and post_req() Per Forlin
2011-04-08 16:49 ` [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency Linus Walleij
2011-04-09 11:55   ` Jae hoon Chung
     [not found]     ` <BANLkTikVeXvfSBS-xLDXVdesKJpKdtUVqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-04-10  3:33       ` anish singh
2011-04-11  9:03         ` Per Forlin
     [not found]           ` <BANLkTikoj6UTx08ntZaMM15taKRXjrU_Mg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-04-11  9:07             ` Sachin Nikam
2011-04-11  9:08     ` Per Forlin
2011-04-19 14:30       ` Jae hoon Chung
2011-04-16 15:48 ` Shawn Guo
2011-04-20  8:19   ` Per Forlin
2011-04-16 16:48 ` [PATCH] mmc: sdhci: add support for pre_req and post_req Shawn Guo
2011-04-16 23:06   ` Andrei Warkentin
2011-04-22 11:01     ` Jaehoon Chung
2011-04-27  0:59       ` Andrei Warkentin
2011-04-26  1:26     ` Jaehoon Chung
2011-04-26  2:47       ` Shawn Guo
2011-04-26 10:21         ` Per Forlin
2011-04-17 16:33 ` [PATCH] mmc: mxs-mmc: " Shawn Guo
2011-04-17 16:48   ` Shawn Guo
2011-04-20  8:01     ` Per Forlin
2011-04-20 14:01       ` Shawn Guo
2011-04-20 15:22         ` Per Forlin
2011-04-21  6:25           ` Shawn Guo
2011-04-21  8:52             ` Per Forlin
2011-04-20 15:30         ` Per Forlin
2011-04-21  6:29           ` Shawn Guo
2011-04-21  8:46             ` Per Forlin
2011-04-21  9:11               ` Shawn Guo
2011-04-21  9:47                 ` Per Forlin
2011-04-21 10:15                   ` Per Forlin
2011-04-28  7:52                   ` Per Forlin
2011-04-28 10:10                     ` Russell King - ARM Linux
2011-04-20  7:58   ` Per Forlin
2011-04-20  8:17     ` Shawn Guo
2011-04-20 13:51   ` [PATCH v2] " Shawn Guo
