kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 0/5] s390: more vfio-ccw code rework
@ 2019-06-18 20:23 Eric Farman
  2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

A couple little improvements to the malloc load in vfio-ccw.
Really, there were just (the first) two patches, but then I
got excited and added a few stylistic ones to the end.

The routine ccwchain_calc_length() has this basic structure:

  ccwchain_calc_length
    a0 = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1))
    copy_ccw_from_iova(a0, src)
      copy_from_iova
        pfn_array_alloc
          b = kcalloc(len, sizeof(*pa_iova_pfn + *pa_pfn)
        pfn_array_pin
          vfio_pin_pages
        memcpy(a0, src)
        pfn_array_unpin_free
          vfio_unpin_pages
          kfree(b)
    kfree(a0)

We do this EVERY time we process a new channel program chain,
meaning at least once per SSCH and more if TICs are involved,
to figure out how many CCWs are chained together.  Once that
is determined, a new piece of memory is allocated (call it a1)
and then passed to copy_ccw_from_iova() again, but for the
value calculated by ccwchain_calc_length().

This seems inefficient.

Patch 1 moves the malloc of a0 from the CCW processor to the
initialization of the device.  Since only one SSCH can be
handled concurrently, we can use this space safely to
determine how long the chain being processed actually is.

Patch 2 then removes the second copy_ccw_from_iova() call
entirely, and replaces it with a memcpy from a0 to a1.  This
is done before we process a TIC and thus a second chain, so
there is no overlap in the storage in channel_program.

Patches 3-5 clean up some things that aren't as clear as I'd
like, but didn't want to pollute the first two changes.
For example, patch 3 moves the population of guest_cp to the
same routine that copies from it, rather than in a called
function.  Meanwhile, patch 4 (and thus, 5) was something I
had lying around for quite some time, because it looked to
be structured weird.  Maybe that's one bridge too far.

Eric Farman (5):
  vfio-ccw: Move guest_cp storage into common struct
  vfio-ccw: Skip second copy of guest cp to host
  vfio-ccw: Copy CCW data outside length calculation
  vfio-ccw: Factor out the ccw0-to-ccw1 transition
  vfio-ccw: Remove copy_ccw_from_iova()

 drivers/s390/cio/vfio_ccw_cp.c  | 108 +++++++++++---------------------
 drivers/s390/cio/vfio_ccw_cp.h  |   7 +++
 drivers/s390/cio/vfio_ccw_drv.c |   7 +++
 3 files changed, 52 insertions(+), 70 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
@ 2019-06-18 20:23 ` Eric Farman
  2019-06-19  8:14   ` Cornelia Huck
  2019-06-19 20:13   ` Farhan Ali
  2019-06-18 20:23 ` [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host Eric Farman
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

Rather than allocating/freeing a piece of memory every time
we try to figure out how long a CCW chain is, let's use a piece
of memory allocated for each device.

The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
event that accesses/uses this space, so there should be no race
concerns with another CPU attempting an (unexpected) SSCH for the
same device.

Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
Conny, your suggestion [1] did not go unnoticed.  :)

[1] https://patchwork.kernel.org/comment/22312659/
---
 drivers/s390/cio/vfio_ccw_cp.c  | 23 ++++-------------------
 drivers/s390/cio/vfio_ccw_cp.h  |  7 +++++++
 drivers/s390/cio/vfio_ccw_drv.c |  7 +++++++
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 90d86e1354c1..f358502376be 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -16,12 +16,6 @@
 
 #include "vfio_ccw_cp.h"
 
-/*
- * Max length for ccw chain.
- * XXX: Limit to 256, need to check more?
- */
-#define CCWCHAIN_LEN_MAX	256
-
 struct pfn_array {
 	/* Starting guest physical I/O address. */
 	unsigned long		pa_iova;
@@ -386,7 +380,7 @@ static void ccwchain_cda_free(struct ccwchain *chain, int idx)
  */
 static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 {
-	struct ccw1 *ccw, *p;
+	struct ccw1 *ccw = cp->guest_cp;
 	int cnt;
 
 	/*
@@ -394,15 +388,9 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 	 * Currently the chain length is limited to CCWCHAIN_LEN_MAX (256).
 	 * So copying 2K is enough (safe).
 	 */
-	p = ccw = kcalloc(CCWCHAIN_LEN_MAX, sizeof(*ccw), GFP_KERNEL);
-	if (!ccw)
-		return -ENOMEM;
-
 	cnt = copy_ccw_from_iova(cp, ccw, iova, CCWCHAIN_LEN_MAX);
-	if (cnt) {
-		kfree(ccw);
+	if (cnt)
 		return cnt;
-	}
 
 	cnt = 0;
 	do {
@@ -413,10 +401,8 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 		 * orb specified one of the unsupported formats, we defer
 		 * checking for IDAWs in unsupported formats to here.
 		 */
-		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw)) {
-			kfree(p);
+		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
 			return -EOPNOTSUPP;
-		}
 
 		/*
 		 * We want to keep counting if the current CCW has the
@@ -435,7 +421,6 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 	if (cnt == CCWCHAIN_LEN_MAX + 1)
 		cnt = -EINVAL;
 
-	kfree(p);
 	return cnt;
 }
 
@@ -461,7 +446,7 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	struct ccwchain *chain;
 	int len, ret;
 
-	/* Get chain length. */
+	/* Copy the chain from cda to cp, and count the CCWs in it */
 	len = ccwchain_calc_length(cda, cp);
 	if (len < 0)
 		return len;
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
index 3c20cd208da5..7cdc38049033 100644
--- a/drivers/s390/cio/vfio_ccw_cp.h
+++ b/drivers/s390/cio/vfio_ccw_cp.h
@@ -16,6 +16,12 @@
 
 #include "orb.h"
 
+/*
+ * Max length for ccw chain.
+ * XXX: Limit to 256, need to check more?
+ */
+#define CCWCHAIN_LEN_MAX	256
+
 /**
  * struct channel_program - manage information for channel program
  * @ccwchain_list: list head of ccwchains
@@ -32,6 +38,7 @@ struct channel_program {
 	union orb orb;
 	struct device *mdev;
 	bool initialized;
+	struct ccw1 *guest_cp;
 };
 
 extern int cp_init(struct channel_program *cp, struct device *mdev,
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 66a66ac1f3d1..34a9a5e3fd36 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -129,6 +129,11 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 	if (!private)
 		return -ENOMEM;
 
+	private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
+				       GFP_KERNEL);
+	if (!private->cp.guest_cp)
+		goto out_free;
+
 	private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
 					       GFP_KERNEL | GFP_DMA);
 	if (!private->io_region)
@@ -169,6 +174,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 		kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
 	if (private->io_region)
 		kmem_cache_free(vfio_ccw_io_region, private->io_region);
+	kfree(private->cp.guest_cp);
 	kfree(private);
 	return ret;
 }
@@ -185,6 +191,7 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
 
 	kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
 	kmem_cache_free(vfio_ccw_io_region, private->io_region);
+	kfree(private->cp.guest_cp);
 	kfree(private);
 
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
  2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
@ 2019-06-18 20:23 ` Eric Farman
  2019-06-19  8:17   ` Cornelia Huck
  2019-06-18 20:23 ` [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation Eric Farman
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

We already pinned/copied/unpinned 2K (256 CCWs) of guest memory
to the host space anchored off vfio_ccw_private.  There's no need
to do that again once we have the length calculated, when we could
just copy the section we need to the "permanent" space for the I/O.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index f358502376be..37d513e86530 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -444,7 +444,7 @@ static int ccwchain_loop_tic(struct ccwchain *chain,
 static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 {
 	struct ccwchain *chain;
-	int len, ret;
+	int len;
 
 	/* Copy the chain from cda to cp, and count the CCWs in it */
 	len = ccwchain_calc_length(cda, cp);
@@ -457,12 +457,8 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 		return -ENOMEM;
 	chain->ch_iova = cda;
 
-	/* Copy the new chain from user. */
-	ret = copy_ccw_from_iova(cp, chain->ch_ccw, cda, len);
-	if (ret) {
-		ccwchain_free(chain);
-		return ret;
-	}
+	/* Copy the actual CCWs into the new chain */
+	memcpy(chain->ch_ccw, cp->guest_cp, len * sizeof(struct ccw1));
 
 	/* Loop for tics on this new chain. */
 	return ccwchain_loop_tic(chain, cp);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
  2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
  2019-06-18 20:23 ` [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host Eric Farman
@ 2019-06-18 20:23 ` Eric Farman
  2019-06-19  8:18   ` Cornelia Huck
  2019-06-18 20:23 ` [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition Eric Farman
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

It doesn't make much sense to "hide" the copy to the channel_program
struct inside a routine that calculates the length of the chain.

Let's move it to the calling routine, which will later copy from
channel_program to the memory it allocated itself.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 37d513e86530..a55f8d110920 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -381,18 +381,8 @@ static void ccwchain_cda_free(struct ccwchain *chain, int idx)
 static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 {
 	struct ccw1 *ccw = cp->guest_cp;
-	int cnt;
+	int cnt = 0;
 
-	/*
-	 * Copy current chain from guest to host kernel.
-	 * Currently the chain length is limited to CCWCHAIN_LEN_MAX (256).
-	 * So copying 2K is enough (safe).
-	 */
-	cnt = copy_ccw_from_iova(cp, ccw, iova, CCWCHAIN_LEN_MAX);
-	if (cnt)
-		return cnt;
-
-	cnt = 0;
 	do {
 		cnt++;
 
@@ -446,7 +436,12 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	struct ccwchain *chain;
 	int len;
 
-	/* Copy the chain from cda to cp, and count the CCWs in it */
+	/* Copy 2K (the most we support today) of possible CCWs */
+	len = copy_ccw_from_iova(cp, cp->guest_cp, cda, CCWCHAIN_LEN_MAX);
+	if (len)
+		return len;
+
+	/* Count the CCWs in the current chain */
 	len = ccwchain_calc_length(cda, cp);
 	if (len < 0)
 		return len;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
                   ` (2 preceding siblings ...)
  2019-06-18 20:23 ` [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation Eric Farman
@ 2019-06-18 20:23 ` Eric Farman
  2019-06-19  8:22   ` Cornelia Huck
  2019-06-18 20:23 ` [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova() Eric Farman
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

This is a really useful function, but it's buried in the
copy_ccw_from_iova() routine so that ccwchain_calc_length()
can just work with Format-1 CCWs while doing its counting.
But it means we're translating a full 2K of "CCWs" to Format-1,
when in reality there's probably far fewer in that space.

Let's factor it out, so maybe we can do something with it later.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 48 ++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index a55f8d110920..9a8bf06281e0 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -161,6 +161,27 @@ static inline void pfn_array_idal_create_words(
 	idaws[0] += pa->pa_iova & (PAGE_SIZE - 1);
 }
 
+void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
+{
+	struct ccw0 ccw0;
+	struct ccw1 *pccw1 = source;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		ccw0 = *(struct ccw0 *)pccw1;
+		if ((pccw1->cmd_code & 0x0f) == CCW_CMD_TIC) {
+			pccw1->cmd_code = CCW_CMD_TIC;
+			pccw1->flags = 0;
+			pccw1->count = 0;
+		} else {
+			pccw1->cmd_code = ccw0.cmd_code;
+			pccw1->flags = ccw0.flags;
+			pccw1->count = ccw0.count;
+		}
+		pccw1->cda = ccw0.cda;
+		pccw1++;
+	}
+}
 
 /*
  * Within the domain (@mdev), copy @n bytes from a guest physical
@@ -211,32 +232,9 @@ static long copy_ccw_from_iova(struct channel_program *cp,
 			       struct ccw1 *to, u64 iova,
 			       unsigned long len)
 {
-	struct ccw0 ccw0;
-	struct ccw1 *pccw1;
 	int ret;
-	int i;
 
 	ret = copy_from_iova(cp->mdev, to, iova, len * sizeof(struct ccw1));
-	if (ret)
-		return ret;
-
-	if (!cp->orb.cmd.fmt) {
-		pccw1 = to;
-		for (i = 0; i < len; i++) {
-			ccw0 = *(struct ccw0 *)pccw1;
-			if ((pccw1->cmd_code & 0x0f) == CCW_CMD_TIC) {
-				pccw1->cmd_code = CCW_CMD_TIC;
-				pccw1->flags = 0;
-				pccw1->count = 0;
-			} else {
-				pccw1->cmd_code = ccw0.cmd_code;
-				pccw1->flags = ccw0.flags;
-				pccw1->count = ccw0.count;
-			}
-			pccw1->cda = ccw0.cda;
-			pccw1++;
-		}
-	}
 
 	return ret;
 }
@@ -441,6 +439,10 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	if (len)
 		return len;
 
+	/* Convert any Format-0 CCWs to Format-1 */
+	if (!cp->orb.cmd.fmt)
+		convert_ccw0_to_ccw1(cp->guest_cp, len);
+
 	/* Count the CCWs in the current chain */
 	len = ccwchain_calc_length(cda, cp);
 	if (len < 0)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova()
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
                   ` (3 preceding siblings ...)
  2019-06-18 20:23 ` [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition Eric Farman
@ 2019-06-18 20:23 ` Eric Farman
  2019-06-19  8:23   ` Cornelia Huck
  2019-06-19 21:13   ` Farhan Ali
  2019-06-19  8:25 ` [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Cornelia Huck
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Eric Farman @ 2019-06-18 20:23 UTC (permalink / raw)
  To: Cornelia Huck, Farhan Ali; +Cc: Halil Pasic, linux-s390, kvm, Eric Farman

Just to keep things tidy.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 9a8bf06281e0..9cddc1288059 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -228,17 +228,6 @@ static long copy_from_iova(struct device *mdev,
 	return l;
 }
 
-static long copy_ccw_from_iova(struct channel_program *cp,
-			       struct ccw1 *to, u64 iova,
-			       unsigned long len)
-{
-	int ret;
-
-	ret = copy_from_iova(cp->mdev, to, iova, len * sizeof(struct ccw1));
-
-	return ret;
-}
-
 /*
  * Helpers to operate ccwchain.
  */
@@ -435,7 +424,8 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	int len;
 
 	/* Copy 2K (the most we support today) of possible CCWs */
-	len = copy_ccw_from_iova(cp, cp->guest_cp, cda, CCWCHAIN_LEN_MAX);
+	len = copy_from_iova(cp->mdev, cp->guest_cp, cda,
+			     CCWCHAIN_LEN_MAX * sizeof(struct ccw1));
 	if (len)
 		return len;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct
  2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
@ 2019-06-19  8:14   ` Cornelia Huck
  2019-06-19 20:13   ` Farhan Ali
  1 sibling, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:14 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:48 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Rather than allocating/freeing a piece of memory every time
> we try to figure out how long a CCW chain is, let's use a piece
> of memory allocated for each device.
> 
> The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
> the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
> event that accesses/uses this space, so there should be no race
> concerns with another CPU attempting an (unexpected) SSCH for the
> same device.
> 
> Suggested-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> Conny, your suggestion [1] did not go unnoticed.  :)

:)

> 
> [1] https://patchwork.kernel.org/comment/22312659/
> ---
>  drivers/s390/cio/vfio_ccw_cp.c  | 23 ++++-------------------
>  drivers/s390/cio/vfio_ccw_cp.h  |  7 +++++++
>  drivers/s390/cio/vfio_ccw_drv.c |  7 +++++++
>  3 files changed, 18 insertions(+), 19 deletions(-)

Nice!

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host
  2019-06-18 20:23 ` [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host Eric Farman
@ 2019-06-19  8:17   ` Cornelia Huck
  0 siblings, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:17 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:49 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> We already pinned/copied/unpinned 2K (256 CCWs) of guest memory
> to the host space anchored off vfio_ccw_private.  There's no need
> to do that again once we have the length calculated, when we could
> just copy the section we need to the "permanent" space for the I/O.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation
  2019-06-18 20:23 ` [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation Eric Farman
@ 2019-06-19  8:18   ` Cornelia Huck
  0 siblings, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:18 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:50 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> It doesn't make much sense to "hide" the copy to the channel_program
> struct inside a routine that calculates the length of the chain.
> 
> Let's move it to the calling routine, which will later copy from
> channel_program to the memory it allocated itself.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 19 +++++++------------
>  1 file changed, 7 insertions(+), 12 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition
  2019-06-18 20:23 ` [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition Eric Farman
@ 2019-06-19  8:22   ` Cornelia Huck
  0 siblings, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:22 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:51 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> This is a really useful function, but it's buried in the
> copy_ccw_from_iova() routine so that ccwchain_calc_length()
> can just work with Format-1 CCWs while doing its counting.
> But it means we're translating a full 2K of "CCWs" to Format-1,
> when in reality there's probably far fewer in that space.
> 
> Let's factor it out, so maybe we can do something with it later.

Agreed, this looks sensible.

> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 48 ++++++++++++++++++----------------
>  1 file changed, 25 insertions(+), 23 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova()
  2019-06-18 20:23 ` [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova() Eric Farman
@ 2019-06-19  8:23   ` Cornelia Huck
  2019-06-19 21:13   ` Farhan Ali
  1 sibling, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:23 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:52 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Just to keep things tidy.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 0/5] s390: more vfio-ccw code rework
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
                   ` (4 preceding siblings ...)
  2019-06-18 20:23 ` [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova() Eric Farman
@ 2019-06-19  8:25 ` Cornelia Huck
  2019-06-19 11:11   ` Eric Farman
  2019-06-19 21:15 ` Farhan Ali
  2019-06-21 12:25 ` Cornelia Huck
  7 siblings, 1 reply; 19+ messages in thread
From: Cornelia Huck @ 2019-06-19  8:25 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:47 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> A couple little improvements to the malloc load in vfio-ccw.
> Really, there were just (the first) two patches, but then I
> got excited and added a few stylistic ones to the end.
> 
> The routine ccwchain_calc_length() has this basic structure:
> 
>   ccwchain_calc_length
>     a0 = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1))
>     copy_ccw_from_iova(a0, src)
>       copy_from_iova
>         pfn_array_alloc
>           b = kcalloc(len, sizeof(*pa_iova_pfn + *pa_pfn)
>         pfn_array_pin
>           vfio_pin_pages
>         memcpy(a0, src)
>         pfn_array_unpin_free
>           vfio_unpin_pages
>           kfree(b)
>     kfree(a0)
> 
> We do this EVERY time we process a new channel program chain,
> meaning at least once per SSCH and more if TICs are involved,
> to figure out how many CCWs are chained together.  Once that
> is determined, a new piece of memory is allocated (call it a1)
> and then passed to copy_ccw_from_iova() again, but for the
> value calculated by ccwchain_calc_length().
> 
> This seems inefficient.
> 
> Patch 1 moves the malloc of a0 from the CCW processor to the
> initialization of the device.  Since only one SSCH can be
> handled concurrently, we can use this space safely to
> determine how long the chain being processed actually is.
> 
> Patch 2 then removes the second copy_ccw_from_iova() call
> entirely, and replaces it with a memcpy from a0 to a1.  This
> is done before we process a TIC and thus a second chain, so
> there is no overlap in the storage in channel_program.
> 
> Patches 3-5 clean up some things that aren't as clear as I'd
> like, but didn't want to pollute the first two changes.
> For example, patch 3 moves the population of guest_cp to the
> same routine that copies from it, rather than in a called
> function.  Meanwhile, patch 4 (and thus, 5) was something I
> had lying around for quite some time, because it looked to
> be structured weird.  Maybe that's one bridge too far.

I think this is worthwhile.

> 
> Eric Farman (5):
>   vfio-ccw: Move guest_cp storage into common struct
>   vfio-ccw: Skip second copy of guest cp to host
>   vfio-ccw: Copy CCW data outside length calculation
>   vfio-ccw: Factor out the ccw0-to-ccw1 transition
>   vfio-ccw: Remove copy_ccw_from_iova()
> 
>  drivers/s390/cio/vfio_ccw_cp.c  | 108 +++++++++++---------------------
>  drivers/s390/cio/vfio_ccw_cp.h  |   7 +++
>  drivers/s390/cio/vfio_ccw_drv.c |   7 +++
>  3 files changed, 52 insertions(+), 70 deletions(-)
> 

Ok, so I just wanted to take a quick look, and then ended up reviewing
it all :)

Will give others some time to look at this before I queue.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 0/5] s390: more vfio-ccw code rework
  2019-06-19  8:25 ` [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Cornelia Huck
@ 2019-06-19 11:11   ` Eric Farman
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Farman @ 2019-06-19 11:11 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm



On 6/19/19 4:25 AM, Cornelia Huck wrote:
> On Tue, 18 Jun 2019 22:23:47 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> A couple little improvements to the malloc load in vfio-ccw.
>> Really, there were just (the first) two patches, but then I
>> got excited and added a few stylistic ones to the end.
>>
>> The routine ccwchain_calc_length() has this basic structure:
>>
>>   ccwchain_calc_length
>>     a0 = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1))
>>     copy_ccw_from_iova(a0, src)
>>       copy_from_iova
>>         pfn_array_alloc
>>           b = kcalloc(len, sizeof(*pa_iova_pfn + *pa_pfn)
>>         pfn_array_pin
>>           vfio_pin_pages
>>         memcpy(a0, src)
>>         pfn_array_unpin_free
>>           vfio_unpin_pages
>>           kfree(b)
>>     kfree(a0)
>>
>> We do this EVERY time we process a new channel program chain,
>> meaning at least once per SSCH and more if TICs are involved,
>> to figure out how many CCWs are chained together.  Once that
>> is determined, a new piece of memory is allocated (call it a1)
>> and then passed to copy_ccw_from_iova() again, but for the
>> value calculated by ccwchain_calc_length().
>>
>> This seems inefficient.
>>
>> Patch 1 moves the malloc of a0 from the CCW processor to the
>> initialization of the device.  Since only one SSCH can be
>> handled concurrently, we can use this space safely to
>> determine how long the chain being processed actually is.
>>
>> Patch 2 then removes the second copy_ccw_from_iova() call
>> entirely, and replaces it with a memcpy from a0 to a1.  This
>> is done before we process a TIC and thus a second chain, so
>> there is no overlap in the storage in channel_program.
>>
>> Patches 3-5 clean up some things that aren't as clear as I'd
>> like, but didn't want to pollute the first two changes.
>> For example, patch 3 moves the population of guest_cp to the
>> same routine that copies from it, rather than in a called
>> function.  Meanwhile, patch 4 (and thus, 5) was something I
>> had lying around for quite some time, because it looked to
>> be structured weird.  Maybe that's one bridge too far.
> 
> I think this is worthwhile.
> 
>>
>> Eric Farman (5):
>>   vfio-ccw: Move guest_cp storage into common struct
>>   vfio-ccw: Skip second copy of guest cp to host
>>   vfio-ccw: Copy CCW data outside length calculation
>>   vfio-ccw: Factor out the ccw0-to-ccw1 transition
>>   vfio-ccw: Remove copy_ccw_from_iova()
>>
>>  drivers/s390/cio/vfio_ccw_cp.c  | 108 +++++++++++---------------------
>>  drivers/s390/cio/vfio_ccw_cp.h  |   7 +++
>>  drivers/s390/cio/vfio_ccw_drv.c |   7 +++
>>  3 files changed, 52 insertions(+), 70 deletions(-)
>>
> 
> Ok, so I just wanted to take a quick look, and then ended up reviewing
> it all :)

Haha, oops!  :)  Thank you!  That was a nice surprise.

> 
> Will give others some time to look at this before I queue.
> 

Sounds great!  I'll get back to my own reviews (notes the gentle
reminder on qemu :)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct
  2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
  2019-06-19  8:14   ` Cornelia Huck
@ 2019-06-19 20:13   ` Farhan Ali
  2019-06-19 20:53     ` Eric Farman
  1 sibling, 1 reply; 19+ messages in thread
From: Farhan Ali @ 2019-06-19 20:13 UTC (permalink / raw)
  To: Eric Farman, Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm



On 06/18/2019 04:23 PM, Eric Farman wrote:
> Rather than allocating/freeing a piece of memory every time
> we try to figure out how long a CCW chain is, let's use a piece
> of memory allocated for each device.
> 
> The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
> the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
> event that accesses/uses this space, so there should be no race
> concerns with another CPU attempting an (unexpected) SSCH for the
> same device.
> 
> Suggested-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> Conny, your suggestion [1] did not go unnoticed.  :)
> 
> [1] https://patchwork.kernel.org/comment/22312659/
> ---
>   drivers/s390/cio/vfio_ccw_cp.c  | 23 ++++-------------------
>   drivers/s390/cio/vfio_ccw_cp.h  |  7 +++++++
>   drivers/s390/cio/vfio_ccw_drv.c |  7 +++++++
>   3 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 90d86e1354c1..f358502376be 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -16,12 +16,6 @@
>   
>   #include "vfio_ccw_cp.h"
>   
> -/*
> - * Max length for ccw chain.
> - * XXX: Limit to 256, need to check more?
> - */
> -#define CCWCHAIN_LEN_MAX	256
> -
>   struct pfn_array {
>   	/* Starting guest physical I/O address. */
>   	unsigned long		pa_iova;
> @@ -386,7 +380,7 @@ static void ccwchain_cda_free(struct ccwchain *chain, int idx)
>    */
>   static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>   {
> -	struct ccw1 *ccw, *p;
> +	struct ccw1 *ccw = cp->guest_cp;
>   	int cnt;
>   
>   	/*
> @@ -394,15 +388,9 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>   	 * Currently the chain length is limited to CCWCHAIN_LEN_MAX (256).
>   	 * So copying 2K is enough (safe).
>   	 */
> -	p = ccw = kcalloc(CCWCHAIN_LEN_MAX, sizeof(*ccw), GFP_KERNEL);
> -	if (!ccw)
> -		return -ENOMEM;
> -
>   	cnt = copy_ccw_from_iova(cp, ccw, iova, CCWCHAIN_LEN_MAX);

Just a minor concern, should we clear out cp->guest_cp memory before we 
do the copying? Given that the ccwchain_calc_length will also call be 
called during tic handling, it's possible there might be some garbage 
data in guest_cp, no?


> -	if (cnt) {
> -		kfree(ccw);
> +	if (cnt)
>   		return cnt;
> -	}
>   
>   	cnt = 0;
>   	do {
> @@ -413,10 +401,8 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>   		 * orb specified one of the unsupported formats, we defer
>   		 * checking for IDAWs in unsupported formats to here.
>   		 */
> -		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw)) {
> -			kfree(p);
> +		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
>   			return -EOPNOTSUPP;
> -		}
>   
>   		/*
>   		 * We want to keep counting if the current CCW has the
> @@ -435,7 +421,6 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>   	if (cnt == CCWCHAIN_LEN_MAX + 1)
>   		cnt = -EINVAL;
>   
> -	kfree(p);
>   	return cnt;
>   }
>   
> @@ -461,7 +446,7 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
>   	struct ccwchain *chain;
>   	int len, ret;
>   
> -	/* Get chain length. */
> +	/* Copy the chain from cda to cp, and count the CCWs in it */
>   	len = ccwchain_calc_length(cda, cp);
>   	if (len < 0)
>   		return len;
> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
> index 3c20cd208da5..7cdc38049033 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.h
> +++ b/drivers/s390/cio/vfio_ccw_cp.h
> @@ -16,6 +16,12 @@
>   
>   #include "orb.h"
>   
> +/*
> + * Max length for ccw chain.
> + * XXX: Limit to 256, need to check more?
> + */
> +#define CCWCHAIN_LEN_MAX	256
> +
>   /**
>    * struct channel_program - manage information for channel program
>    * @ccwchain_list: list head of ccwchains
> @@ -32,6 +38,7 @@ struct channel_program {
>   	union orb orb;
>   	struct device *mdev;
>   	bool initialized;
> +	struct ccw1 *guest_cp;
>   };
>   
>   extern int cp_init(struct channel_program *cp, struct device *mdev,
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 66a66ac1f3d1..34a9a5e3fd36 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -129,6 +129,11 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
>   	if (!private)
>   		return -ENOMEM;
>   
> +	private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
> +				       GFP_KERNEL);
> +	if (!private->cp.guest_cp)
> +		goto out_free;
> +
>   	private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
>   					       GFP_KERNEL | GFP_DMA);
>   	if (!private->io_region)
> @@ -169,6 +174,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
>   		kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>   	if (private->io_region)
>   		kmem_cache_free(vfio_ccw_io_region, private->io_region);
> +	kfree(private->cp.guest_cp);
>   	kfree(private);
>   	return ret;
>   }
> @@ -185,6 +191,7 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
>   
>   	kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>   	kmem_cache_free(vfio_ccw_io_region, private->io_region);
> +	kfree(private->cp.guest_cp);
>   	kfree(private);
>   
>   	return 0;
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct
  2019-06-19 20:13   ` Farhan Ali
@ 2019-06-19 20:53     ` Eric Farman
  2019-06-19 21:12       ` Farhan Ali
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Farman @ 2019-06-19 20:53 UTC (permalink / raw)
  To: Farhan Ali, Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm



On 6/19/19 4:13 PM, Farhan Ali wrote:
> 
> 
> On 06/18/2019 04:23 PM, Eric Farman wrote:
>> Rather than allocating/freeing a piece of memory every time
>> we try to figure out how long a CCW chain is, let's use a piece
>> of memory allocated for each device.
>>
>> The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
>> the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
>> event that accesses/uses this space, so there should be no race
>> concerns with another CPU attempting an (unexpected) SSCH for the
>> same device.
>>
>> Suggested-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>> Conny, your suggestion [1] did not go unnoticed.  :)
>>
>> [1] https://patchwork.kernel.org/comment/22312659/
>> ---
>>   drivers/s390/cio/vfio_ccw_cp.c  | 23 ++++-------------------
>>   drivers/s390/cio/vfio_ccw_cp.h  |  7 +++++++
>>   drivers/s390/cio/vfio_ccw_drv.c |  7 +++++++
>>   3 files changed, 18 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c
>> b/drivers/s390/cio/vfio_ccw_cp.c
>> index 90d86e1354c1..f358502376be 100644
>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>> @@ -16,12 +16,6 @@
>>     #include "vfio_ccw_cp.h"
>>   -/*
>> - * Max length for ccw chain.
>> - * XXX: Limit to 256, need to check more?
>> - */
>> -#define CCWCHAIN_LEN_MAX    256
>> -
>>   struct pfn_array {
>>       /* Starting guest physical I/O address. */
>>       unsigned long        pa_iova;
>> @@ -386,7 +380,7 @@ static void ccwchain_cda_free(struct ccwchain
>> *chain, int idx)
>>    */
>>   static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>>   {
>> -    struct ccw1 *ccw, *p;
>> +    struct ccw1 *ccw = cp->guest_cp;
>>       int cnt;
>>         /*
>> @@ -394,15 +388,9 @@ static int ccwchain_calc_length(u64 iova, struct
>> channel_program *cp)
>>        * Currently the chain length is limited to CCWCHAIN_LEN_MAX (256).
>>        * So copying 2K is enough (safe).
>>        */
>> -    p = ccw = kcalloc(CCWCHAIN_LEN_MAX, sizeof(*ccw), GFP_KERNEL);
>> -    if (!ccw)
>> -        return -ENOMEM;
>> -
>>       cnt = copy_ccw_from_iova(cp, ccw, iova, CCWCHAIN_LEN_MAX);
> 
> Just a minor concern, should we clear out cp->guest_cp memory before we
> do the copying? Given that the ccwchain_calc_length will also call be
> called during tic handling, it's possible there might be some garbage
> data in guest_cp, no?

Yeah, they'll be garbage there, but I'm not sure it's a problem.  By the
time we get here again (ccwchain_loop_tic() -> ccwchain_handle_ccw()),
we'll have saved the relevant CCWs for the first segment.  And the
second time through we'll be copying a fresh 2K from the target of the
TIC to cp->guest_cp, overwriting all that stale data with new CCWs (and
new garbage data).

> 
> 
>> -    if (cnt) {
>> -        kfree(ccw);
>> +    if (cnt)
>>           return cnt;
>> -    }
>>         cnt = 0;
>>       do {
>> @@ -413,10 +401,8 @@ static int ccwchain_calc_length(u64 iova, struct
>> channel_program *cp)
>>            * orb specified one of the unsupported formats, we defer
>>            * checking for IDAWs in unsupported formats to here.
>>            */
>> -        if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw)) {
>> -            kfree(p);
>> +        if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
>>               return -EOPNOTSUPP;
>> -        }
>>             /*
>>            * We want to keep counting if the current CCW has the
>> @@ -435,7 +421,6 @@ static int ccwchain_calc_length(u64 iova, struct
>> channel_program *cp)
>>       if (cnt == CCWCHAIN_LEN_MAX + 1)
>>           cnt = -EINVAL;
>>   -    kfree(p);
>>       return cnt;
>>   }
>>   @@ -461,7 +446,7 @@ static int ccwchain_handle_ccw(u32 cda, struct
>> channel_program *cp)
>>       struct ccwchain *chain;
>>       int len, ret;
>>   -    /* Get chain length. */
>> +    /* Copy the chain from cda to cp, and count the CCWs in it */
>>       len = ccwchain_calc_length(cda, cp);
>>       if (len < 0)
>>           return len;
>> diff --git a/drivers/s390/cio/vfio_ccw_cp.h
>> b/drivers/s390/cio/vfio_ccw_cp.h
>> index 3c20cd208da5..7cdc38049033 100644
>> --- a/drivers/s390/cio/vfio_ccw_cp.h
>> +++ b/drivers/s390/cio/vfio_ccw_cp.h
>> @@ -16,6 +16,12 @@
>>     #include "orb.h"
>>   +/*
>> + * Max length for ccw chain.
>> + * XXX: Limit to 256, need to check more?
>> + */
>> +#define CCWCHAIN_LEN_MAX    256
>> +
>>   /**
>>    * struct channel_program - manage information for channel program
>>    * @ccwchain_list: list head of ccwchains
>> @@ -32,6 +38,7 @@ struct channel_program {
>>       union orb orb;
>>       struct device *mdev;
>>       bool initialized;
>> +    struct ccw1 *guest_cp;
>>   };
>>     extern int cp_init(struct channel_program *cp, struct device *mdev,
>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c
>> b/drivers/s390/cio/vfio_ccw_drv.c
>> index 66a66ac1f3d1..34a9a5e3fd36 100644
>> --- a/drivers/s390/cio/vfio_ccw_drv.c
>> +++ b/drivers/s390/cio/vfio_ccw_drv.c
>> @@ -129,6 +129,11 @@ static int vfio_ccw_sch_probe(struct subchannel
>> *sch)
>>       if (!private)
>>           return -ENOMEM;
>>   +    private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct
>> ccw1),
>> +                       GFP_KERNEL);
>> +    if (!private->cp.guest_cp)
>> +        goto out_free;
>> +
>>       private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
>>                              GFP_KERNEL | GFP_DMA);
>>       if (!private->io_region)
>> @@ -169,6 +174,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
>>           kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>>       if (private->io_region)
>>           kmem_cache_free(vfio_ccw_io_region, private->io_region);
>> +    kfree(private->cp.guest_cp);
>>       kfree(private);
>>       return ret;
>>   }
>> @@ -185,6 +191,7 @@ static int vfio_ccw_sch_remove(struct subchannel
>> *sch)
>>         kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>>       kmem_cache_free(vfio_ccw_io_region, private->io_region);
>> +    kfree(private->cp.guest_cp);
>>       kfree(private);
>>         return 0;
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct
  2019-06-19 20:53     ` Eric Farman
@ 2019-06-19 21:12       ` Farhan Ali
  0 siblings, 0 replies; 19+ messages in thread
From: Farhan Ali @ 2019-06-19 21:12 UTC (permalink / raw)
  To: Eric Farman, Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm



On 06/19/2019 04:53 PM, Eric Farman wrote:
> 
> 
> On 6/19/19 4:13 PM, Farhan Ali wrote:
>>
>>
>> On 06/18/2019 04:23 PM, Eric Farman wrote:
>>> Rather than allocating/freeing a piece of memory every time
>>> we try to figure out how long a CCW chain is, let's use a piece
>>> of memory allocated for each device.
>>>
>>> The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
>>> the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
>>> event that accesses/uses this space, so there should be no race
>>> concerns with another CPU attempting an (unexpected) SSCH for the
>>> same device.
>>>
>>> Suggested-by: Cornelia Huck <cohuck@redhat.com>
>>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>>> ---
>>> Conny, your suggestion [1] did not go unnoticed.  :)
>>>
>>> [1] https://patchwork.kernel.org/comment/22312659/
>>> ---
>>>    drivers/s390/cio/vfio_ccw_cp.c  | 23 ++++-------------------
>>>    drivers/s390/cio/vfio_ccw_cp.h  |  7 +++++++
>>>    drivers/s390/cio/vfio_ccw_drv.c |  7 +++++++
>>>    3 files changed, 18 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c
>>> b/drivers/s390/cio/vfio_ccw_cp.c
>>> index 90d86e1354c1..f358502376be 100644
>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>>> @@ -16,12 +16,6 @@
>>>      #include "vfio_ccw_cp.h"
>>>    -/*
>>> - * Max length for ccw chain.
>>> - * XXX: Limit to 256, need to check more?
>>> - */
>>> -#define CCWCHAIN_LEN_MAX    256
>>> -
>>>    struct pfn_array {
>>>        /* Starting guest physical I/O address. */
>>>        unsigned long        pa_iova;
>>> @@ -386,7 +380,7 @@ static void ccwchain_cda_free(struct ccwchain
>>> *chain, int idx)
>>>     */
>>>    static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>>>    {
>>> -    struct ccw1 *ccw, *p;
>>> +    struct ccw1 *ccw = cp->guest_cp;
>>>        int cnt;
>>>          /*
>>> @@ -394,15 +388,9 @@ static int ccwchain_calc_length(u64 iova, struct
>>> channel_program *cp)
>>>         * Currently the chain length is limited to CCWCHAIN_LEN_MAX (256).
>>>         * So copying 2K is enough (safe).
>>>         */
>>> -    p = ccw = kcalloc(CCWCHAIN_LEN_MAX, sizeof(*ccw), GFP_KERNEL);
>>> -    if (!ccw)
>>> -        return -ENOMEM;
>>> -
>>>        cnt = copy_ccw_from_iova(cp, ccw, iova, CCWCHAIN_LEN_MAX);
>>
>> Just a minor concern, should we clear out cp->guest_cp memory before we
>> do the copying? Given that the ccwchain_calc_length will also call be
>> called during tic handling, it's possible there might be some garbage
>> data in guest_cp, no?
> 
> Yeah, they'll be garbage there, but I'm not sure it's a problem.  By the
> time we get here again (ccwchain_loop_tic() -> ccwchain_handle_ccw()),
> we'll have saved the relevant CCWs for the first segment.  And the
> second time through we'll be copying a fresh 2K from the target of the
> TIC to cp->guest_cp, overwriting all that stale data with new CCWs (and
> new garbage data).
> 

Yes, you are right. Please disregard my concern :)

>>
>>
>>> -    if (cnt) {
>>> -        kfree(ccw);
>>> +    if (cnt)
>>>            return cnt;
>>> -    }
>>>          cnt = 0;
>>>        do {
>>> @@ -413,10 +401,8 @@ static int ccwchain_calc_length(u64 iova, struct
>>> channel_program *cp)
>>>             * orb specified one of the unsupported formats, we defer
>>>             * checking for IDAWs in unsupported formats to here.
>>>             */
>>> -        if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw)) {
>>> -            kfree(p);
>>> +        if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
>>>                return -EOPNOTSUPP;
>>> -        }
>>>              /*
>>>             * We want to keep counting if the current CCW has the
>>> @@ -435,7 +421,6 @@ static int ccwchain_calc_length(u64 iova, struct
>>> channel_program *cp)
>>>        if (cnt == CCWCHAIN_LEN_MAX + 1)
>>>            cnt = -EINVAL;
>>>    -    kfree(p);
>>>        return cnt;
>>>    }
>>>    @@ -461,7 +446,7 @@ static int ccwchain_handle_ccw(u32 cda, struct
>>> channel_program *cp)
>>>        struct ccwchain *chain;
>>>        int len, ret;
>>>    -    /* Get chain length. */
>>> +    /* Copy the chain from cda to cp, and count the CCWs in it */
>>>        len = ccwchain_calc_length(cda, cp);
>>>        if (len < 0)
>>>            return len;
>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.h
>>> b/drivers/s390/cio/vfio_ccw_cp.h
>>> index 3c20cd208da5..7cdc38049033 100644
>>> --- a/drivers/s390/cio/vfio_ccw_cp.h
>>> +++ b/drivers/s390/cio/vfio_ccw_cp.h
>>> @@ -16,6 +16,12 @@
>>>      #include "orb.h"
>>>    +/*
>>> + * Max length for ccw chain.
>>> + * XXX: Limit to 256, need to check more?
>>> + */
>>> +#define CCWCHAIN_LEN_MAX    256
>>> +
>>>    /**
>>>     * struct channel_program - manage information for channel program
>>>     * @ccwchain_list: list head of ccwchains
>>> @@ -32,6 +38,7 @@ struct channel_program {
>>>        union orb orb;
>>>        struct device *mdev;
>>>        bool initialized;
>>> +    struct ccw1 *guest_cp;
>>>    };
>>>      extern int cp_init(struct channel_program *cp, struct device *mdev,
>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c
>>> b/drivers/s390/cio/vfio_ccw_drv.c
>>> index 66a66ac1f3d1..34a9a5e3fd36 100644
>>> --- a/drivers/s390/cio/vfio_ccw_drv.c
>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c
>>> @@ -129,6 +129,11 @@ static int vfio_ccw_sch_probe(struct subchannel
>>> *sch)
>>>        if (!private)
>>>            return -ENOMEM;
>>>    +    private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct
>>> ccw1),
>>> +                       GFP_KERNEL);
>>> +    if (!private->cp.guest_cp)
>>> +        goto out_free;
>>> +
>>>        private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
>>>                               GFP_KERNEL | GFP_DMA);
>>>        if (!private->io_region)
>>> @@ -169,6 +174,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
>>>            kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>>>        if (private->io_region)
>>>            kmem_cache_free(vfio_ccw_io_region, private->io_region);
>>> +    kfree(private->cp.guest_cp);
>>>        kfree(private);
>>>        return ret;
>>>    }
>>> @@ -185,6 +191,7 @@ static int vfio_ccw_sch_remove(struct subchannel
>>> *sch)
>>>          kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
>>>        kmem_cache_free(vfio_ccw_io_region, private->io_region);
>>> +    kfree(private->cp.guest_cp);
>>>        kfree(private);
>>>          return 0;
>>>
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova()
  2019-06-18 20:23 ` [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova() Eric Farman
  2019-06-19  8:23   ` Cornelia Huck
@ 2019-06-19 21:13   ` Farhan Ali
  1 sibling, 0 replies; 19+ messages in thread
From: Farhan Ali @ 2019-06-19 21:13 UTC (permalink / raw)
  To: Eric Farman, Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm



On 06/18/2019 04:23 PM, Eric Farman wrote:
> Just to keep things tidy.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>   drivers/s390/cio/vfio_ccw_cp.c | 14 ++------------
>   1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 9a8bf06281e0..9cddc1288059 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -228,17 +228,6 @@ static long copy_from_iova(struct device *mdev,
>   	return l;
>   }
>   
> -static long copy_ccw_from_iova(struct channel_program *cp,
> -			       struct ccw1 *to, u64 iova,
> -			       unsigned long len)
> -{
> -	int ret;
> -
> -	ret = copy_from_iova(cp->mdev, to, iova, len * sizeof(struct ccw1));
> -
> -	return ret;
> -}
> -
>   /*
>    * Helpers to operate ccwchain.
>    */
> @@ -435,7 +424,8 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
>   	int len;
>   
>   	/* Copy 2K (the most we support today) of possible CCWs */
> -	len = copy_ccw_from_iova(cp, cp->guest_cp, cda, CCWCHAIN_LEN_MAX);
> +	len = copy_from_iova(cp->mdev, cp->guest_cp, cda,
> +			     CCWCHAIN_LEN_MAX * sizeof(struct ccw1));
>   	if (len)
>   		return len;
>   
> 

This patch probably could be squashed with patch 4. Not a big deal though.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 0/5] s390: more vfio-ccw code rework
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
                   ` (5 preceding siblings ...)
  2019-06-19  8:25 ` [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Cornelia Huck
@ 2019-06-19 21:15 ` Farhan Ali
  2019-06-21 12:25 ` Cornelia Huck
  7 siblings, 0 replies; 19+ messages in thread
From: Farhan Ali @ 2019-06-19 21:15 UTC (permalink / raw)
  To: Eric Farman, Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm



On 06/18/2019 04:23 PM, Eric Farman wrote:
> A couple little improvements to the malloc load in vfio-ccw.
> Really, there were just (the first) two patches, but then I
> got excited and added a few stylistic ones to the end.
> 
> The routine ccwchain_calc_length() has this basic structure:
> 
>    ccwchain_calc_length
>      a0 = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1))
>      copy_ccw_from_iova(a0, src)
>        copy_from_iova
>          pfn_array_alloc
>            b = kcalloc(len, sizeof(*pa_iova_pfn + *pa_pfn)
>          pfn_array_pin
>            vfio_pin_pages
>          memcpy(a0, src)
>          pfn_array_unpin_free
>            vfio_unpin_pages
>            kfree(b)
>      kfree(a0)
> 
> We do this EVERY time we process a new channel program chain,
> meaning at least once per SSCH and more if TICs are involved,
> to figure out how many CCWs are chained together.  Once that
> is determined, a new piece of memory is allocated (call it a1)
> and then passed to copy_ccw_from_iova() again, but for the
> value calculated by ccwchain_calc_length().
> 
> This seems inefficient.
> 
> Patch 1 moves the malloc of a0 from the CCW processor to the
> initialization of the device.  Since only one SSCH can be
> handled concurrently, we can use this space safely to
> determine how long the chain being processed actually is.
> 
> Patch 2 then removes the second copy_ccw_from_iova() call
> entirely, and replaces it with a memcpy from a0 to a1.  This
> is done before we process a TIC and thus a second chain, so
> there is no overlap in the storage in channel_program.
> 
> Patches 3-5 clean up some things that aren't as clear as I'd
> like, but didn't want to pollute the first two changes.
> For example, patch 3 moves the population of guest_cp to the
> same routine that copies from it, rather than in a called
> function.  Meanwhile, patch 4 (and thus, 5) was something I
> had lying around for quite some time, because it looked to
> be structured weird.  Maybe that's one bridge too far.
> 
> Eric Farman (5):
>    vfio-ccw: Move guest_cp storage into common struct
>    vfio-ccw: Skip second copy of guest cp to host
>    vfio-ccw: Copy CCW data outside length calculation
>    vfio-ccw: Factor out the ccw0-to-ccw1 transition
>    vfio-ccw: Remove copy_ccw_from_iova()
> 
>   drivers/s390/cio/vfio_ccw_cp.c  | 108 +++++++++++---------------------
>   drivers/s390/cio/vfio_ccw_cp.h  |   7 +++
>   drivers/s390/cio/vfio_ccw_drv.c |   7 +++
>   3 files changed, 52 insertions(+), 70 deletions(-)
> 

For this series:
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v1 0/5] s390: more vfio-ccw code rework
  2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
                   ` (6 preceding siblings ...)
  2019-06-19 21:15 ` Farhan Ali
@ 2019-06-21 12:25 ` Cornelia Huck
  7 siblings, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2019-06-21 12:25 UTC (permalink / raw)
  To: Eric Farman; +Cc: Farhan Ali, Halil Pasic, linux-s390, kvm

On Tue, 18 Jun 2019 22:23:47 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> A couple little improvements to the malloc load in vfio-ccw.
> Really, there were just (the first) two patches, but then I
> got excited and added a few stylistic ones to the end.
> 
> The routine ccwchain_calc_length() has this basic structure:
> 
>   ccwchain_calc_length
>     a0 = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1))
>     copy_ccw_from_iova(a0, src)
>       copy_from_iova
>         pfn_array_alloc
>           b = kcalloc(len, sizeof(*pa_iova_pfn + *pa_pfn)
>         pfn_array_pin
>           vfio_pin_pages
>         memcpy(a0, src)
>         pfn_array_unpin_free
>           vfio_unpin_pages
>           kfree(b)
>     kfree(a0)
> 
> We do this EVERY time we process a new channel program chain,
> meaning at least once per SSCH and more if TICs are involved,
> to figure out how many CCWs are chained together.  Once that
> is determined, a new piece of memory is allocated (call it a1)
> and then passed to copy_ccw_from_iova() again, but for the
> value calculated by ccwchain_calc_length().
> 
> This seems inefficient.
> 
> Patch 1 moves the malloc of a0 from the CCW processor to the
> initialization of the device.  Since only one SSCH can be
> handled concurrently, we can use this space safely to
> determine how long the chain being processed actually is.
> 
> Patch 2 then removes the second copy_ccw_from_iova() call
> entirely, and replaces it with a memcpy from a0 to a1.  This
> is done before we process a TIC and thus a second chain, so
> there is no overlap in the storage in channel_program.
> 
> Patches 3-5 clean up some things that aren't as clear as I'd
> like, but didn't want to pollute the first two changes.
> For example, patch 3 moves the population of guest_cp to the
> same routine that copies from it, rather than in a called
> function.  Meanwhile, patch 4 (and thus, 5) was something I
> had lying around for quite some time, because it looked to
> be structured weird.  Maybe that's one bridge too far.
> 
> Eric Farman (5):
>   vfio-ccw: Move guest_cp storage into common struct
>   vfio-ccw: Skip second copy of guest cp to host
>   vfio-ccw: Copy CCW data outside length calculation
>   vfio-ccw: Factor out the ccw0-to-ccw1 transition
>   vfio-ccw: Remove copy_ccw_from_iova()
> 
>  drivers/s390/cio/vfio_ccw_cp.c  | 108 +++++++++++---------------------
>  drivers/s390/cio/vfio_ccw_cp.h  |   7 +++
>  drivers/s390/cio/vfio_ccw_drv.c |   7 +++
>  3 files changed, 52 insertions(+), 70 deletions(-)
> 

Thanks, applied.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-06-21 12:25 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-18 20:23 [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Eric Farman
2019-06-18 20:23 ` [RFC PATCH v1 1/5] vfio-ccw: Move guest_cp storage into common struct Eric Farman
2019-06-19  8:14   ` Cornelia Huck
2019-06-19 20:13   ` Farhan Ali
2019-06-19 20:53     ` Eric Farman
2019-06-19 21:12       ` Farhan Ali
2019-06-18 20:23 ` [RFC PATCH v1 2/5] vfio-ccw: Skip second copy of guest cp to host Eric Farman
2019-06-19  8:17   ` Cornelia Huck
2019-06-18 20:23 ` [RFC PATCH v1 3/5] vfio-ccw: Copy CCW data outside length calculation Eric Farman
2019-06-19  8:18   ` Cornelia Huck
2019-06-18 20:23 ` [RFC PATCH v1 4/5] vfio-ccw: Factor out the ccw0-to-ccw1 transition Eric Farman
2019-06-19  8:22   ` Cornelia Huck
2019-06-18 20:23 ` [RFC PATCH v1 5/5] vfio-ccw: Remove copy_ccw_from_iova() Eric Farman
2019-06-19  8:23   ` Cornelia Huck
2019-06-19 21:13   ` Farhan Ali
2019-06-19  8:25 ` [RFC PATCH v1 0/5] s390: more vfio-ccw code rework Cornelia Huck
2019-06-19 11:11   ` Eric Farman
2019-06-19 21:15 ` Farhan Ali
2019-06-21 12:25 ` Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).