All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 00/16] vfio/ccw: channel program cleanup
@ 2022-11-21 21:40 Eric Farman
  2022-11-21 21:40 ` [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary Eric Farman
                   ` (15 more replies)
  0 siblings, 16 replies; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

Hi Matt, et al,

(Apologies for sending this out right before the holiday; but I have
this in a state where it's unlikely to change further.)

(@Heiko, @Vasily, @Alexander: As you might notice in the diffstat,
this is mostly within drivers/s390/cio/vfio_ccw* with the exception
of one change to idals.h (patch 12). I hope that's okay.)

Here is the first batch of the vfio-ccw channel program handler rework,
which I'm referring to as "the IDA code." Most of it is rearrangement to
make it more readable, and to remove restrictions in place since
commit 0a19e61e6d4c ("vfio: ccw: introduce channel program interfaces").
My hope is that with this off the plate, it will be easier to extend the
vfio-ccw channel program handler for newer, more modern, features.

Some background:

A Format-1 Channel Command Word (CCW) contains a 31-bit data address,
meaning any I/O transfer is limited to the first 2GB of memory.
The concept of an Indirect Data Address (IDA) was introduced long ago
to allow for non-contiguous memory to be used for data transfers,
while still using 31-bit data addresses.

The initial z/Architecture extended the definition of ESA/390's IDA
concept to include a new IDA format that allows for 64-bit data
addresses [1]. The result is three distinct IDA flavors:

 - Format-1 IDA (31-bit addresses), each IDAW moves up to 2K of data
 - Format-2 IDA (64-bit addresses), each IDAW moves up to 2K of data
 - Format-2 IDA (64-bit addresses), each IDAW moves up to 4K of data

The selection of these three possibilities is done by bits within the
Operation-Request Block (ORB), though it should not be a surprise
that the last one is far-and-away the most common these days.
Who would operate on 2K at a time?

While newer features can be masked off (by a CPU model or equivalent),
and guarded by a defensive check in a driver (such as our check for
a Transport Mode ORB in fsm_io_request()), all three of these
possibilities are available by the base z/Architecture, and cannot
be "hidden." So while it might be unlikely for a guest to issue
such an I/O, it's not impossible.

vfio-ccw was written to only support the third of these options,
while the first two are rejected by a check buried in
ccwchain_calc_length(). While a Transport Mode ORB gets a
distinct message logged, no such announcement as to the cause
of the problem is done here, except for a generic -EOPNOTSUPP
return code in the prefetch codepath.

The goal of this series is to rework the channel program handler
such that any of the above IDA formats can be processed and sent
to the hardware. Any Format-1 IDA issued by a guest will be
converted to the 2K Format-2 variety, such that it is able to
use the full 64-bit addressing. This is a change from today,
where any direct-addressed CCW is converted to a 4K Format-2 IDA.
This needs to become a 2K Format-2 IDA to maintain compatibility
with potential IDAs in other CCWs in a chain.

The first few patches at the start of this series are improvements
that could stand alone, regardless of the rework that follows.
The remainder of the series is intended to do the code movement
that would enable these older IDA formats, but the restriction
itself isn't removed until the end.

While I developed this on top of the -parent code that you've
seen recently [2], it isn't explicitly dependent on it and should
be usable/reviewable without it. I -did- base this series on the
addressing fixes that were discussed more recently [3], since
there are intersections with this file in general.
Patch 2 addresses the suggestion you made in that series [4].

Thanks in advance, I look forward to the feedback...

Eric

[1] See most recent Principles of Operation, page 1-6
[2] https://lore.kernel.org/kvm/20221104142007.1314999-1-farman@linux.ibm.com/
[3] https://lore.kernel.org/linux-s390/20221121165836.283781-1-farman@linux.ibm.com/
[4] https://lore.kernel.org/linux-s390/c9e7229e-a88d-2185-bb6b-a94e9dac7b7a@linux.ibm.com/

Eric Farman (16):
  vfio/ccw: cleanup some of the mdev commentary
  vfio/ccw: simplify the cp_get_orb interface
  vfio/ccw: allow non-zero storage keys
  vfio/ccw: move where IDA flag is set in ORB
  vfio/ccw: replace copy_from_iova with vfio_dma_rw
  vfio/ccw: simplify CCW chain fetch routines
  vfio/ccw: remove unnecessary malloc alignment
  vfio/ccw: pass page count to page_array struct
  vfio/ccw: populate page_array struct inline
  vfio/ccw: refactor the idaw counter
  vfio/ccw: discard second fmt-1 IDAW
  vfio/ccw: calculate number of IDAWs regardless of format
  vfio/ccw: allocate/populate the guest idal
  vfio/ccw: handle a guest Format-1 IDAL
  vfio/ccw: don't group contiguous pages on 2K IDAWs
  vfio/ccw: remove old IDA format restrictions

 Documentation/s390/vfio-ccw.rst |   4 +-
 arch/s390/include/asm/idals.h   |  12 ++
 drivers/s390/cio/vfio_ccw_cp.c  | 346 +++++++++++++++++---------------
 drivers/s390/cio/vfio_ccw_cp.h  |   3 +-
 drivers/s390/cio/vfio_ccw_fsm.c |   2 +-
 5 files changed, 197 insertions(+), 170 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-11-22 16:12   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface Eric Farman
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

There is no longer an mdev struct accessible via a channel
program struct, but there are some artifacts remaining that
mention it. Clean them up.

Fixes: 0a58795647cd ("vfio/ccw: Remove mdev from struct channel_program")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 5 ++---
 drivers/s390/cio/vfio_ccw_cp.h | 1 -
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index c0a09fa8991a..9e6df1f2fbee 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -121,7 +121,7 @@ static void page_array_unpin(struct page_array *pa,
 /*
  * page_array_pin() - Pin user pages in memory
  * @pa: page_array on which to perform the operation
- * @mdev: the mediated device to perform pin operations
+ * @vdev: the vfio device to perform pin operations
  *
  * Returns number of pages pinned upon success.
  * If the pin request partially succeeds, or fails completely,
@@ -229,7 +229,7 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
 }
 
 /*
- * Within the domain (@mdev), copy @n bytes from a guest physical
+ * Within the domain (@vdev), copy @n bytes from a guest physical
  * address (@iova) to a host physical address (@to).
  */
 static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova,
@@ -665,7 +665,6 @@ static int ccwchain_fetch_one(struct ccwchain *chain,
 /**
  * cp_init() - allocate ccwchains for a channel program.
  * @cp: channel_program on which to perform the operation
- * @mdev: the mediated device to perform pin/unpin operations
  * @orb: control block for the channel program from the guest
  *
  * This creates one or more ccwchain(s), and copies the raw data of
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
index 54d26e242533..16138a654fdd 100644
--- a/drivers/s390/cio/vfio_ccw_cp.h
+++ b/drivers/s390/cio/vfio_ccw_cp.h
@@ -27,7 +27,6 @@
  * struct channel_program - manage information for channel program
  * @ccwchain_list: list head of ccwchains
  * @orb: orb for the currently processed ssch request
- * @mdev: the mediated device to perform page pinning/unpinning
  * @initialized: whether this instance is actually initialized
  *
  * @ccwchain_list is the head of a ccwchain list, that contents the
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
  2022-11-21 21:40 ` [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-11-22 16:13   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys Eric Farman
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

There's no need to send in both the address of the subchannel
struct, and an element within it, to populate the ORB.

Pass the whole pointer and let cp_get_orb() take the pieces
that are needed.

Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c  | 9 ++++-----
 drivers/s390/cio/vfio_ccw_cp.h  | 2 +-
 drivers/s390/cio/vfio_ccw_fsm.c | 2 +-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 9e6df1f2fbee..a0060ef1119e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -816,14 +816,13 @@ int cp_prefetch(struct channel_program *cp)
 /**
  * cp_get_orb() - get the orb of the channel program
  * @cp: channel_program on which to perform the operation
- * @intparm: new intparm for the returned orb
- * @lpm: candidate value of the logical-path mask for the returned orb
+ * @sch: subchannel the operation will be performed against
  *
  * This function returns the address of the updated orb of the channel
  * program. Channel I/O device drivers could use this orb to issue a
  * ssch.
  */
-union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
+union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
 {
 	union orb *orb;
 	struct ccwchain *chain;
@@ -835,12 +834,12 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
 
 	orb = &cp->orb;
 
-	orb->cmd.intparm = intparm;
+	orb->cmd.intparm = (u32)virt_to_phys(sch);
 	orb->cmd.fmt = 1;
 	orb->cmd.key = PAGE_DEFAULT_KEY >> 4;
 
 	if (orb->cmd.lpm == 0)
-		orb->cmd.lpm = lpm;
+		orb->cmd.lpm = sch->lpm;
 
 	chain = list_first_entry(&cp->ccwchain_list, struct ccwchain, next);
 	cpa = chain->ch_ccw;
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
index 16138a654fdd..fc31eb699807 100644
--- a/drivers/s390/cio/vfio_ccw_cp.h
+++ b/drivers/s390/cio/vfio_ccw_cp.h
@@ -43,7 +43,7 @@ struct channel_program {
 int cp_init(struct channel_program *cp, union orb *orb);
 void cp_free(struct channel_program *cp);
 int cp_prefetch(struct channel_program *cp);
-union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm);
+union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch);
 void cp_update_scsw(struct channel_program *cp, union scsw *scsw);
 bool cp_iova_pinned(struct channel_program *cp, u64 iova, u64 length);
 
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 2784a4e4d2be..757b73141246 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -27,7 +27,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 
 	spin_lock_irqsave(sch->lock, flags);
 
-	orb = cp_get_orb(&private->cp, (u32)virt_to_phys(sch), sch->lpm);
+	orb = cp_get_orb(&private->cp, sch);
 	if (!orb) {
 		ret = -EIO;
 		goto out;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
  2022-11-21 21:40 ` [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary Eric Farman
  2022-11-21 21:40 ` [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-15 20:55   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB Eric Farman
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

Currently, vfio-ccw copies the ORB from the io_region to the
channel_program struct being built. It then adjusts various
pieces of that ORB to the values needed to be used by the
SSCH issued by vfio-ccw in the host.

This includes setting the subchannel key to the default,
presumably because Linux doesn't do anything with non-zero
storage keys itself. But it seems wrong to convert every I/O
to the default key if the guest itself requested a non-zero
subchannel (access) key.

Any channel program that sets a non-zero key would expect the
same key returned in the SCSW of the IRB, not zero, so best to
allow that to occur unimpeded.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index a0060ef1119e..268a90252521 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -836,7 +836,6 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
 
 	orb->cmd.intparm = (u32)virt_to_phys(sch);
 	orb->cmd.fmt = 1;
-	orb->cmd.key = PAGE_DEFAULT_KEY >> 4;
 
 	if (orb->cmd.lpm == 0)
 		orb->cmd.lpm = sch->lpm;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (2 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-15 20:55   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw Eric Farman
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The output of vfio_ccw is always a Format-2 IDAL, but the code that
explicitly sets this is buried in cp_init().

In fact the input is often already a Format-2 IDAL, and would be
rejected (via the check in ccwchain_calc_length()) if it weren't,
so explicitly setting it doesn't do much. Setting it way down here
only makes it impossible to make decisions in support of other
IDAL formats.

Let's move that to where the rest of the ORB is set up, so that the
CCW processing in cp_prefetch() is performed according to the
contents of the unmodified guest ORB.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 268a90252521..3a11132b1685 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -707,15 +707,9 @@ int cp_init(struct channel_program *cp, union orb *orb)
 	/* Build a ccwchain for the first CCW segment */
 	ret = ccwchain_handle_ccw(orb->cmd.cpa, cp);
 
-	if (!ret) {
+	if (!ret)
 		cp->initialized = true;
 
-		/* It is safe to force: if it was not set but idals used
-		 * ccwchain_calc_length would have returned an error.
-		 */
-		cp->orb.cmd.c64 = 1;
-	}
-
 	return ret;
 }
 
@@ -837,6 +831,11 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
 	orb->cmd.intparm = (u32)virt_to_phys(sch);
 	orb->cmd.fmt = 1;
 
+	/*
+	 * Everything built by vfio-ccw is a Format-2 IDAL.
+	 */
+	orb->cmd.c64 = 1;
+
 	if (orb->cmd.lpm == 0)
 		orb->cmd.lpm = sch->lpm;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (3 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-11-22  1:41   ` Jason Gunthorpe
  2022-12-15 20:59   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines Eric Farman
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman,
	Jason Gunthorpe

It was suggested [1] that we replace the old copy_from_iova() routine
(which pins a page, does a memcpy, and unpins the page) with the
newer vfio_dma_rw() interface.

This has a modest improvement in the overall time spent through the
fsm_io_request() path, and simplifies some of the code to boot.

[1] https://lore.kernel.org/r/20220706170553.GK693670@nvidia.com/

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 56 +++-------------------------------
 1 file changed, 5 insertions(+), 51 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 3a11132b1685..1eacbb8dc860 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -228,51 +228,6 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
 	}
 }
 
-/*
- * Within the domain (@vdev), copy @n bytes from a guest physical
- * address (@iova) to a host physical address (@to).
- */
-static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova,
-			   unsigned long n)
-{
-	struct page_array pa = {0};
-	int i, ret;
-	unsigned long l, m;
-
-	ret = page_array_alloc(&pa, iova, n);
-	if (ret < 0)
-		return ret;
-
-	ret = page_array_pin(&pa, vdev);
-	if (ret < 0) {
-		page_array_unpin_free(&pa, vdev);
-		return ret;
-	}
-
-	l = n;
-	for (i = 0; i < pa.pa_nr; i++) {
-		void *from = kmap_local_page(pa.pa_page[i]);
-
-		m = PAGE_SIZE;
-		if (i == 0) {
-			from += iova & (PAGE_SIZE - 1);
-			m -= iova & (PAGE_SIZE - 1);
-		}
-
-		m = min(l, m);
-		memcpy(to + (n - l), from, m);
-		kunmap_local(from);
-
-		l -= m;
-		if (l == 0)
-			break;
-	}
-
-	page_array_unpin_free(&pa, vdev);
-
-	return l;
-}
-
 /*
  * Helpers to operate ccwchain.
  */
@@ -471,10 +426,9 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	int len, ret;
 
 	/* Copy 2K (the most we support today) of possible CCWs */
-	len = copy_from_iova(vdev, cp->guest_cp, cda,
-			     CCWCHAIN_LEN_MAX * sizeof(struct ccw1));
-	if (len)
-		return len;
+	ret = vfio_dma_rw(vdev, cda, cp->guest_cp, CCWCHAIN_LEN_MAX * sizeof(struct ccw1), false);
+	if (ret)
+		return ret;
 
 	/* Convert any Format-0 CCWs to Format-1 */
 	if (!cp->orb.cmd.fmt)
@@ -572,7 +526,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 	if (ccw_is_idal(ccw)) {
 		/* Read first IDAW to see if it's 4K-aligned or not. */
 		/* All subsequent IDAws will be 4K-aligned. */
-		ret = copy_from_iova(vdev, &iova, ccw->cda, sizeof(iova));
+		ret = vfio_dma_rw(vdev, ccw->cda, &iova, sizeof(iova), false);
 		if (ret)
 			return ret;
 	} else {
@@ -601,7 +555,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 
 	if (ccw_is_idal(ccw)) {
 		/* Copy guest IDAL into host IDAL */
-		ret = copy_from_iova(vdev, idaws, ccw->cda, idal_len);
+		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
 		if (ret)
 			goto out_unpin;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (4 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-15 21:18   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment Eric Farman
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The act of processing a fetched CCW has two components:

 1) Process a Transfer-in-channel (TIC) CCW
 2) Process any other CCW

The former needs to look at whether the TIC jumps backwards into
the current channel program or forwards into a new one segment,
while the latter just processes the CCW data address itself.

Rather than passing the chain segment and index within it to the
handlers for the above, and requiring each to calculate the
elements it needs, simply pass the needed pointers directly.

For the TIC, that means the CCW being processed and the location
of the entire channel program which holds all segments. For the
other CCWs, the page_array pointer is also needed to perform the
page pinning, etc.

While at it, rename ccwchain_fetch_direct to _ccw, to indicate
what it is. The name "_direct" is historical, when it used to
process a direct-addressed CCW, but IDAs are processed here too.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 1eacbb8dc860..d41d94cecdf8 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -482,11 +482,9 @@ static int ccwchain_loop_tic(struct ccwchain *chain, struct channel_program *cp)
 	return 0;
 }
 
-static int ccwchain_fetch_tic(struct ccwchain *chain,
-			      int idx,
+static int ccwchain_fetch_tic(struct ccw1 *ccw,
 			      struct channel_program *cp)
 {
-	struct ccw1 *ccw = chain->ch_ccw + idx;
 	struct ccwchain *iter;
 	u32 ccw_head;
 
@@ -502,14 +500,12 @@ static int ccwchain_fetch_tic(struct ccwchain *chain,
 	return -EFAULT;
 }
 
-static int ccwchain_fetch_direct(struct ccwchain *chain,
-				 int idx,
-				 struct channel_program *cp)
+static int ccwchain_fetch_ccw(struct ccw1 *ccw,
+			      struct page_array *pa,
+			      struct channel_program *cp)
 {
 	struct vfio_device *vdev =
 		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
-	struct ccw1 *ccw;
-	struct page_array *pa;
 	u64 iova;
 	unsigned long *idaws;
 	int ret;
@@ -517,8 +513,6 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 	int idaw_nr, idal_len;
 	int i;
 
-	ccw = chain->ch_ccw + idx;
-
 	if (ccw->count)
 		bytes = ccw->count;
 
@@ -548,7 +542,6 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 	 * required for the data transfer, since we only only support
 	 * 4K IDAWs today.
 	 */
-	pa = chain->ch_pa + idx;
 	ret = page_array_alloc(pa, iova, bytes);
 	if (ret < 0)
 		goto out_free_idaws;
@@ -604,16 +597,15 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
  * and to get rid of the cda 2G limitiaion of ccw1, we'll translate
  * direct ccws to idal ccws.
  */
-static int ccwchain_fetch_one(struct ccwchain *chain,
-			      int idx,
+static int ccwchain_fetch_one(struct ccw1 *ccw,
+			      struct page_array *pa,
 			      struct channel_program *cp)
-{
-	struct ccw1 *ccw = chain->ch_ccw + idx;
 
+{
 	if (ccw_is_tic(ccw))
-		return ccwchain_fetch_tic(chain, idx, cp);
+		return ccwchain_fetch_tic(ccw, cp);
 
-	return ccwchain_fetch_direct(chain, idx, cp);
+	return ccwchain_fetch_ccw(ccw, pa, cp);
 }
 
 /**
@@ -736,6 +728,8 @@ void cp_free(struct channel_program *cp)
 int cp_prefetch(struct channel_program *cp)
 {
 	struct ccwchain *chain;
+	struct ccw1 *ccw;
+	struct page_array *pa;
 	int len, idx, ret;
 
 	/* this is an error in the caller */
@@ -745,7 +739,10 @@ int cp_prefetch(struct channel_program *cp)
 	list_for_each_entry(chain, &cp->ccwchain_list, next) {
 		len = chain->ch_len;
 		for (idx = 0; idx < len; idx++) {
-			ret = ccwchain_fetch_one(chain, idx, cp);
+			ccw = chain->ch_ccw + idx;
+			pa = chain->ch_pa + idx;
+
+			ret = ccwchain_fetch_one(ccw, pa, cp);
 			if (ret)
 				goto out_err;
 		}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (5 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-16 20:10   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct Eric Farman
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

Everything about this allocation is harder than necessary,
since the memory allocation is already aligned to our needs.
Break them apart for readability, instead of doing the
funky artithmetic.

Of the structures that are involved, only ch_ccw needs the
GFP_DMA flag, so the others can be allocated without it.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index d41d94cecdf8..4b6b5f9dc92d 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -311,40 +311,41 @@ static inline int is_tic_within_range(struct ccw1 *ccw, u32 head, int len)
 static struct ccwchain *ccwchain_alloc(struct channel_program *cp, int len)
 {
 	struct ccwchain *chain;
-	void *data;
-	size_t size;
-
-	/* Make ccw address aligned to 8. */
-	size = ((sizeof(*chain) + 7L) & -8L) +
-		sizeof(*chain->ch_ccw) * len +
-		sizeof(*chain->ch_pa) * len;
-	chain = kzalloc(size, GFP_DMA | GFP_KERNEL);
+
+	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 	if (!chain)
 		return NULL;
 
-	data = (u8 *)chain + ((sizeof(*chain) + 7L) & -8L);
-	chain->ch_ccw = (struct ccw1 *)data;
-
-	data = (u8 *)(chain->ch_ccw) + sizeof(*chain->ch_ccw) * len;
-	chain->ch_pa = (struct page_array *)data;
+	chain->ch_ccw = kcalloc(len, sizeof(*chain->ch_ccw), GFP_DMA | GFP_KERNEL);
+	if (!chain->ch_ccw)
+		goto out_err;
 
-	chain->ch_len = len;
+	chain->ch_pa = kcalloc(len, sizeof(*chain->ch_pa), GFP_KERNEL);
+	if (!chain->ch_pa)
+		goto out_err;
 
 	list_add_tail(&chain->next, &cp->ccwchain_list);
 
 	return chain;
+
+out_err:
+	kfree(chain->ch_ccw);
+	kfree(chain);
+	return NULL;
 }
 
 static void ccwchain_free(struct ccwchain *chain)
 {
 	list_del(&chain->next);
+	kfree(chain->ch_pa);
+	kfree(chain->ch_ccw);
 	kfree(chain);
 }
 
 /* Free resource for a ccw that allocated memory for its cda. */
 static void ccwchain_cda_free(struct ccwchain *chain, int idx)
 {
-	struct ccw1 *ccw = chain->ch_ccw + idx;
+	struct ccw1 *ccw = &chain->ch_ccw[idx];
 
 	if (ccw_is_tic(ccw))
 		return;
@@ -443,6 +444,8 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 	chain = ccwchain_alloc(cp, len);
 	if (!chain)
 		return -ENOMEM;
+
+	chain->ch_len = len;
 	chain->ch_iova = cda;
 
 	/* Copy the actual CCWs into the new chain */
@@ -464,7 +467,7 @@ static int ccwchain_loop_tic(struct ccwchain *chain, struct channel_program *cp)
 	int i, ret;
 
 	for (i = 0; i < chain->ch_len; i++) {
-		tic = chain->ch_ccw + i;
+		tic = &chain->ch_ccw[i];
 
 		if (!ccw_is_tic(tic))
 			continue;
@@ -739,8 +742,8 @@ int cp_prefetch(struct channel_program *cp)
 	list_for_each_entry(chain, &cp->ccwchain_list, next) {
 		len = chain->ch_len;
 		for (idx = 0; idx < len; idx++) {
-			ccw = chain->ch_ccw + idx;
-			pa = chain->ch_pa + idx;
+			ccw = &chain->ch_ccw[idx];
+			pa = &chain->ch_pa[idx];
 
 			ret = ccwchain_fetch_one(ccw, pa, cp);
 			if (ret)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (6 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-16 19:59   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 09/16] vfio/ccw: populate page_array struct inline Eric Farman
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The allocation of our page_array struct calculates the number
of 4K pages that would be needed to hold a certain number of
bytes. But, since the number of pages that will be pinned is
also calculated by the length of the IDAL, this logic is
unnecessary. Let's pass that information in directly, and
avoid the math within the allocator.

Also, let's make this two allocations instead of one,
to make it apparent what's happening within here.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 4b6b5f9dc92d..66e890441163 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -43,7 +43,7 @@ struct ccwchain {
  * page_array_alloc() - alloc memory for page array
  * @pa: page_array on which to perform the operation
  * @iova: target guest physical address
- * @len: number of bytes that should be pinned from @iova
+ * @len: number of pages that should be pinned from @iova
  *
  * Attempt to allocate memory for page array.
  *
@@ -63,18 +63,20 @@ static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
 	if (pa->pa_nr || pa->pa_iova)
 		return -EINVAL;
 
-	pa->pa_nr = ((iova & ~PAGE_MASK) + len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
-	if (!pa->pa_nr)
+	if (!len)
 		return -EINVAL;
 
-	pa->pa_iova = kcalloc(pa->pa_nr,
-			      sizeof(*pa->pa_iova) + sizeof(*pa->pa_page),
-			      GFP_KERNEL);
-	if (unlikely(!pa->pa_iova)) {
-		pa->pa_nr = 0;
+	pa->pa_nr = len;
+
+	pa->pa_iova = kcalloc(len, sizeof(*pa->pa_iova), GFP_KERNEL);
+	if (!pa->pa_iova)
+		return -ENOMEM;
+
+	pa->pa_page = kcalloc(len, sizeof(*pa->pa_page), GFP_KERNEL);
+	if (!pa->pa_page) {
+		kfree(pa->pa_iova);
 		return -ENOMEM;
 	}
-	pa->pa_page = (struct page **)&pa->pa_iova[pa->pa_nr];
 
 	pa->pa_iova[0] = iova;
 	pa->pa_page[0] = NULL;
@@ -167,6 +169,7 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
 static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev)
 {
 	page_array_unpin(pa, vdev, pa->pa_nr);
+	kfree(pa->pa_page);
 	kfree(pa->pa_iova);
 }
 
@@ -545,7 +548,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	 * required for the data transfer, since we only only support
 	 * 4K IDAWs today.
 	 */
-	ret = page_array_alloc(pa, iova, bytes);
+	ret = page_array_alloc(pa, iova, idaw_nr);
 	if (ret < 0)
 		goto out_free_idaws;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 09/16] vfio/ccw: populate page_array struct inline
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (7 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-16 21:05   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 10/16] vfio/ccw: refactor the idaw counter Eric Farman
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

There are two possible ways the list of addresses that get passed
to vfio are calculated. One is from a guest IDAL, which would be
an array of (probably) non-contiguous addresses. The other is
built from contiguous pages that follow the starting address
provided by ccw->cda.

page_array_alloc() attempts to simplify things by pre-populating
this array from the starting address, but that's not needed for
a CCW with an IDAL anyway so doesn't need to be in the allocator.
Move it to the caller in the non-IDAL case, since it will be
overwritten when reading the guest IDAL.

Remove the initialization of the pa_page output pointers,
since it won't be explicitly needed for either case.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 66e890441163..a30f26962750 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -42,7 +42,6 @@ struct ccwchain {
 /*
  * page_array_alloc() - alloc memory for page array
  * @pa: page_array on which to perform the operation
- * @iova: target guest physical address
  * @len: number of pages that should be pinned from @iova
  *
  * Attempt to allocate memory for page array.
@@ -56,10 +55,8 @@ struct ccwchain {
  *   -EINVAL if pa->pa_nr is not initially zero, or pa->pa_iova is not NULL
  *   -ENOMEM if alloc failed
  */
-static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
+static int page_array_alloc(struct page_array *pa, unsigned int len)
 {
-	int i;
-
 	if (pa->pa_nr || pa->pa_iova)
 		return -EINVAL;
 
@@ -78,13 +75,6 @@ static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
 		return -ENOMEM;
 	}
 
-	pa->pa_iova[0] = iova;
-	pa->pa_page[0] = NULL;
-	for (i = 1; i < pa->pa_nr; i++) {
-		pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
-		pa->pa_page[i] = NULL;
-	}
-
 	return 0;
 }
 
@@ -548,7 +538,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	 * required for the data transfer, since we only only support
 	 * 4K IDAWs today.
 	 */
-	ret = page_array_alloc(pa, iova, idaw_nr);
+	ret = page_array_alloc(pa, idaw_nr);
 	if (ret < 0)
 		goto out_free_idaws;
 
@@ -565,11 +555,9 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 		for (i = 0; i < idaw_nr; i++)
 			pa->pa_iova[i] = idaws[i];
 	} else {
-		/*
-		 * No action is required here; the iova addresses in page_array
-		 * were initialized sequentially in page_array_alloc() beginning
-		 * with the contents of ccw->cda.
-		 */
+		pa->pa_iova[0] = iova;
+		for (i = 1; i < pa->pa_nr; i++)
+			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
 	}
 
 	if (ccw_does_data_transfer(ccw)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 10/16] vfio/ccw: refactor the idaw counter
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (8 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 09/16] vfio/ccw: populate page_array struct inline Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 19:16   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW Eric Farman
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The rules of an IDAW are fairly simple: Each one can move no
more than a defined amount of data, must not cross the
boundary defined by that length, and must be aligned to that
length as well. The first IDAW in a list is special, in that
it does not need to adhere to that alignment, but the other
rules still apply. Thus, by reading the first IDAW in a list,
the number of IDAWs that will comprise a data transfer of a
particular size can be calculated.

Let's factor out the reading of that first IDAW with the
logic that calculates the length of the list, to simplify
the rest of the routine that handles the individual IDAWs.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index a30f26962750..34a133d962d1 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -496,23 +496,25 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
 	return -EFAULT;
 }
 
-static int ccwchain_fetch_ccw(struct ccw1 *ccw,
-			      struct page_array *pa,
-			      struct channel_program *cp)
+/*
+ * ccw_count_idaws() - Calculate the number of IDAWs needed to transfer
+ * a specified amount of data
+ *
+ * @ccw: The Channel Command Word being translated
+ * @cp: Channel Program being processed
+ */
+static int ccw_count_idaws(struct ccw1 *ccw,
+			   struct channel_program *cp)
 {
 	struct vfio_device *vdev =
 		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
 	u64 iova;
-	unsigned long *idaws;
 	int ret;
 	int bytes = 1;
-	int idaw_nr, idal_len;
-	int i;
 
 	if (ccw->count)
 		bytes = ccw->count;
 
-	/* Calculate size of IDAL */
 	if (ccw_is_idal(ccw)) {
 		/* Read first IDAW to see if it's 4K-aligned or not. */
 		/* All subsequent IDAws will be 4K-aligned. */
@@ -522,7 +524,26 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	} else {
 		iova = ccw->cda;
 	}
-	idaw_nr = idal_nr_words((void *)iova, bytes);
+
+	return idal_nr_words((void *)iova, bytes);
+}
+
+static int ccwchain_fetch_ccw(struct ccw1 *ccw,
+			      struct page_array *pa,
+			      struct channel_program *cp)
+{
+	struct vfio_device *vdev =
+		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
+	unsigned long *idaws;
+	int ret;
+	int idaw_nr, idal_len;
+	int i;
+
+	/* Calculate size of IDAL */
+	idaw_nr = ccw_count_idaws(ccw, cp);
+	if (idaw_nr < 0)
+		return idaw_nr;
+
 	idal_len = idaw_nr * sizeof(*idaws);
 
 	/* Allocate an IDAL from host storage */
@@ -555,7 +576,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 		for (i = 0; i < idaw_nr; i++)
 			pa->pa_iova[i] = idaws[i];
 	} else {
-		pa->pa_iova[0] = iova;
+		pa->pa_iova[0] = ccw->cda;
 		for (i = 1; i < pa->pa_nr; i++)
 			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (9 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 10/16] vfio/ccw: refactor the idaw counter Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 19:27   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format Eric Farman
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The intention is to read the first IDAW to determine the starting
location of an I/O operation, knowing that the second and any/all
subsequent IDAWs will be aligned per architecture. But, this read
receives 64-bits of data, which is the size of a format-2 IDAW.

In the event that Format-1 IDAWs are presented, discard the lower
32 bits as they contain the second IDAW in such a list.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 34a133d962d1..53246f4f95f7 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -516,11 +516,15 @@ static int ccw_count_idaws(struct ccw1 *ccw,
 		bytes = ccw->count;
 
 	if (ccw_is_idal(ccw)) {
-		/* Read first IDAW to see if it's 4K-aligned or not. */
-		/* All subsequent IDAws will be 4K-aligned. */
+		/* Read first IDAW to check its starting address. */
+		/* All subsequent IDAWs will be 2K- or 4K-aligned. */
 		ret = vfio_dma_rw(vdev, ccw->cda, &iova, sizeof(iova), false);
 		if (ret)
 			return ret;
+
+		/* Format-1 IDAWs only occupy the first int */
+		if (!cp->orb.cmd.c64)
+			iova = iova >> 32;
 	} else {
 		iova = ccw->cda;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (10 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 19:49   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal Eric Farman
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The idal_nr_words() routine works well for 4K IDAWs, but lost its
ability to handle the old 2K formats with the removal of 31-bit
builds in commit 5a79859ae0f3 ("s390: remove 31 bit support").

Since there's nothing preventing a guest from generating this IDAW
format, let's re-introduce the math for them and use both when
calculating the number of IDAWs based on the bits specified in
the ORB.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 arch/s390/include/asm/idals.h  | 12 ++++++++++++
 drivers/s390/cio/vfio_ccw_cp.c | 17 ++++++++++++++++-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/idals.h b/arch/s390/include/asm/idals.h
index 40eae2c08d61..0a05a893aedb 100644
--- a/arch/s390/include/asm/idals.h
+++ b/arch/s390/include/asm/idals.h
@@ -23,6 +23,9 @@
 #define IDA_SIZE_LOG 12 /* 11 for 2k , 12 for 4k */
 #define IDA_BLOCK_SIZE (1L<<IDA_SIZE_LOG)
 
+#define IDA_2K_SIZE_LOG 11
+#define IDA_2K_BLOCK_SIZE (1L << IDA_2K_SIZE_LOG)
+
 /*
  * Test if an address/length pair needs an idal list.
  */
@@ -42,6 +45,15 @@ static inline unsigned int idal_nr_words(void *vaddr, unsigned int length)
 		(IDA_BLOCK_SIZE-1)) >> IDA_SIZE_LOG;
 }
 
+/*
+ * Return the number of 2K IDA words needed for an address/length pair.
+ */
+static inline unsigned int idal_2k_nr_words(void *vaddr, unsigned int length)
+{
+	return ((__pa(vaddr) & (IDA_2K_BLOCK_SIZE-1)) + length +
+		(IDA_2K_BLOCK_SIZE-1)) >> IDA_2K_SIZE_LOG;
+}
+
 /*
  * Create the list of idal words for an address/length pair.
  */
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 53246f4f95f7..6839e7195182 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -502,6 +502,13 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
  *
  * @ccw: The Channel Command Word being translated
  * @cp: Channel Program being processed
+ *
+ * The ORB is examined, since it specifies what IDAWs could actually be
+ * used by any CCW in the channel program, regardless of whether or not
+ * the CCW actually does. An ORB that does not specify Format-2-IDAW
+ * Control could still contain a CCW with an IDAL, which would be
+ * Format-1 and thus only move 2K with each IDAW. Thus all CCWs within
+ * the channel program must follow the same size requirements.
  */
 static int ccw_count_idaws(struct ccw1 *ccw,
 			   struct channel_program *cp)
@@ -529,7 +536,15 @@ static int ccw_count_idaws(struct ccw1 *ccw,
 		iova = ccw->cda;
 	}
 
-	return idal_nr_words((void *)iova, bytes);
+	/* Format-1 IDAWs operate on 2K each */
+	if (!cp->orb.cmd.c64)
+		return idal_2k_nr_words((void *)iova, bytes);
+
+	/* Format-2 IDAWs operate on either 2K or 4K */
+	if (cp->orb.cmd.i2k)
+		return idal_2k_nr_words((void *)iova, bytes);
+	else
+		return idal_nr_words((void *)iova, bytes);
 }
 
 static int ccwchain_fetch_ccw(struct ccw1 *ccw,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (11 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 20:14   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL Eric Farman
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

Today, we allocate memory for a list of IDAWs, and if the CCW
being processed contains an IDAL we read that data from the guest
into that space. We then copy each IDAW into the pa_iova array,
or fabricate that pa_iova array with a list of addresses based
on a direct-addressed CCW.

Combine the reading of the guest IDAL with the creation of a
pseudo-IDAL for direct-addressed CCWs, so that both CCW types
have a "guest" IDAL that can be populated straight into the
pa_iova array.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 72 +++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 22 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 6839e7195182..90685cee85db 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -192,11 +192,12 @@ static inline void page_array_idal_create_words(struct page_array *pa,
 	 * idaw.
 	 */
 
-	for (i = 0; i < pa->pa_nr; i++)
+	for (i = 0; i < pa->pa_nr; i++) {
 		idaws[i] = page_to_phys(pa->pa_page[i]);
 
-	/* Adjust the first IDAW, since it may not start on a page boundary */
-	idaws[0] += pa->pa_iova[0] & (PAGE_SIZE - 1);
+		/* Incorporate any offset from each starting address */
+		idaws[i] += pa->pa_iova[i] & (PAGE_SIZE - 1);
+	}
 }
 
 static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
@@ -496,6 +497,44 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
 	return -EFAULT;
 }
 
+static unsigned long *get_guest_idal(struct ccw1 *ccw,
+				     struct channel_program *cp,
+				     int idaw_nr)
+{
+	struct vfio_device *vdev =
+		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
+	unsigned long *idaws;
+	int idal_len = idaw_nr * sizeof(*idaws);
+	int idaw_size = PAGE_SIZE;
+	int idaw_mask = ~(idaw_size - 1);
+	int i, ret;
+
+	idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA | GFP_KERNEL);
+	if (!idaws)
+		return NULL;
+
+	if (ccw_is_idal(ccw)) {
+		/* Copy IDAL from guest */
+		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
+		if (ret) {
+			kfree(idaws);
+			return NULL;
+		}
+	} else {
+		/* Fabricate an IDAL based off CCW data address */
+		if (cp->orb.cmd.c64) {
+			idaws[0] = ccw->cda;
+			for (i = 1; i < idaw_nr; i++)
+				idaws[i] = (idaws[i - 1] + idaw_size) & idaw_mask;
+		} else {
+			kfree(idaws);
+			return NULL;
+		}
+	}
+
+	return idaws;
+}
+
 /*
  * ccw_count_idaws() - Calculate the number of IDAWs needed to transfer
  * a specified amount of data
@@ -555,7 +594,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
 	unsigned long *idaws;
 	int ret;
-	int idaw_nr, idal_len;
+	int idaw_nr;
 	int i;
 
 	/* Calculate size of IDAL */
@@ -563,10 +602,8 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	if (idaw_nr < 0)
 		return idaw_nr;
 
-	idal_len = idaw_nr * sizeof(*idaws);
-
 	/* Allocate an IDAL from host storage */
-	idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA | GFP_KERNEL);
+	idaws = get_guest_idal(ccw, cp, idaw_nr);
 	if (!idaws) {
 		ret = -ENOMEM;
 		goto out_init;
@@ -582,22 +619,13 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	if (ret < 0)
 		goto out_free_idaws;
 
-	if (ccw_is_idal(ccw)) {
-		/* Copy guest IDAL into host IDAL */
-		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
-		if (ret)
-			goto out_unpin;
-
-		/*
-		 * Copy guest IDAWs into page_array, in case the memory they
-		 * occupy is not contiguous.
-		 */
-		for (i = 0; i < idaw_nr; i++)
+	/*
+	 * Copy guest IDAWs into page_array, in case the memory they
+	 * occupy is not contiguous.
+	 */
+	for (i = 0; i < idaw_nr; i++) {
+		if (cp->orb.cmd.c64)
 			pa->pa_iova[i] = idaws[i];
-	} else {
-		pa->pa_iova[0] = ccw->cda;
-		for (i = 1; i < pa->pa_nr; i++)
-			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
 	}
 
 	if (ccw_does_data_transfer(ccw)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (12 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 20:29   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs Eric Farman
  2022-11-21 21:40 ` [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions Eric Farman
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

There are two scenarios that need to be addressed here.

First, an ORB that does NOT have the Format-2 IDAL bit set could
have both a direct-addressed CCW and an indirect-data-address CCW
chained together. This means that the IDA CCW will contain a
Format-1 IDAL, and can be easily converted to a 2K Format-2 IDAL.
But it also means that the direct-addressed CCW needs to be
converted to the same 2K Format-2 IDAL for consistency with the
ORB settings.

Secondly, a Format-1 IDAL is comprised of 31-bit addresses.
Thus, we need to cast this IDAL to a pointer of ints while
populating the list of addresses that are sent to vfio.

Since the result of both of these is the use of the 2K IDAL
variants, and the output of vfio-ccw is always a Format-2 IDAL
(in order to use 64-bit addresses), make sure that the correct
control bit gets set in the ORB when these scenarios occur.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 90685cee85db..9527f3d8da77 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -222,6 +222,8 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
 	}
 }
 
+#define idal_is_2k(_cp) (!_cp->orb.cmd.c64 || _cp->orb.cmd.i2k)
+
 /*
  * Helpers to operate ccwchain.
  */
@@ -504,8 +506,9 @@ static unsigned long *get_guest_idal(struct ccw1 *ccw,
 	struct vfio_device *vdev =
 		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
 	unsigned long *idaws;
+	unsigned int *idaws_f1;
 	int idal_len = idaw_nr * sizeof(*idaws);
-	int idaw_size = PAGE_SIZE;
+	int idaw_size = idal_is_2k(cp) ? PAGE_SIZE / 2 : PAGE_SIZE;
 	int idaw_mask = ~(idaw_size - 1);
 	int i, ret;
 
@@ -527,8 +530,10 @@ static unsigned long *get_guest_idal(struct ccw1 *ccw,
 			for (i = 1; i < idaw_nr; i++)
 				idaws[i] = (idaws[i - 1] + idaw_size) & idaw_mask;
 		} else {
-			kfree(idaws);
-			return NULL;
+			idaws_f1 = (unsigned int *)idaws;
+			idaws_f1[0] = ccw->cda;
+			for (i = 1; i < idaw_nr; i++)
+				idaws_f1[i] = (idaws_f1[i - 1] + idaw_size) & idaw_mask;
 		}
 	}
 
@@ -593,6 +598,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	struct vfio_device *vdev =
 		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
 	unsigned long *idaws;
+	unsigned int *idaws_f1;
 	int ret;
 	int idaw_nr;
 	int i;
@@ -623,9 +629,12 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	 * Copy guest IDAWs into page_array, in case the memory they
 	 * occupy is not contiguous.
 	 */
+	idaws_f1 = (unsigned int *)idaws;
 	for (i = 0; i < idaw_nr; i++) {
 		if (cp->orb.cmd.c64)
 			pa->pa_iova[i] = idaws[i];
+		else
+			pa->pa_iova[i] = idaws_f1[i];
 	}
 
 	if (ccw_does_data_transfer(ccw)) {
@@ -846,7 +855,11 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
 
 	/*
 	 * Everything built by vfio-ccw is a Format-2 IDAL.
+	 * If the input was a Format-1 IDAL, indicate that
+	 * 2K Format-2 IDAWs were created here.
 	 */
+	if (!orb->cmd.c64)
+		orb->cmd.i2k = 1;
 	orb->cmd.c64 = 1;
 
 	if (orb->cmd.lpm == 0)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (13 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 20:40   ` Matthew Rosato
  2022-11-21 21:40 ` [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions Eric Farman
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

The vfio_pin_pages() interface allows contiguous pages to be
pinned as a single request, which is great for the 4K pages
that are normally processed. Old IDA formats operate on 2K
chunks, which makes this logic more difficult.

Since these formats are rare, let's just invoke the page
pinning one-at-a-time, instead of trying to group them.
We can rework this code at a later date if needed.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 9527f3d8da77..3829c346583c 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -88,7 +88,7 @@ static int page_array_alloc(struct page_array *pa, unsigned int len)
  * otherwise only clear pa->pa_nr
  */
 static void page_array_unpin(struct page_array *pa,
-			     struct vfio_device *vdev, int pa_nr)
+			     struct vfio_device *vdev, int pa_nr, bool unaligned)
 {
 	int unpinned = 0, npage = 1;
 
@@ -97,7 +97,8 @@ static void page_array_unpin(struct page_array *pa,
 		dma_addr_t *last = &first[npage];
 
 		if (unpinned + npage < pa_nr &&
-		    *first + npage * PAGE_SIZE == *last) {
+		    *first + npage * PAGE_SIZE == *last &&
+		    !unaligned) {
 			npage++;
 			continue;
 		}
@@ -119,7 +120,7 @@ static void page_array_unpin(struct page_array *pa,
  * If the pin request partially succeeds, or fails completely,
  * all pages are left unpinned and a negative error value is returned.
  */
-static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
+static int page_array_pin(struct page_array *pa, struct vfio_device *vdev, bool unaligned)
 {
 	int pinned = 0, npage = 1;
 	int ret = 0;
@@ -129,7 +130,8 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
 		dma_addr_t *last = &first[npage];
 
 		if (pinned + npage < pa->pa_nr &&
-		    *first + npage * PAGE_SIZE == *last) {
+		    *first + npage * PAGE_SIZE == *last &&
+		    !unaligned) {
 			npage++;
 			continue;
 		}
@@ -151,14 +153,14 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
 	return ret;
 
 err_out:
-	page_array_unpin(pa, vdev, pinned);
+	page_array_unpin(pa, vdev, pinned, unaligned);
 	return ret;
 }
 
 /* Unpin the pages before releasing the memory. */
-static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev)
+static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev, bool unaligned)
 {
-	page_array_unpin(pa, vdev, pa->pa_nr);
+	page_array_unpin(pa, vdev, pa->pa_nr, unaligned);
 	kfree(pa->pa_page);
 	kfree(pa->pa_iova);
 }
@@ -638,7 +640,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	}
 
 	if (ccw_does_data_transfer(ccw)) {
-		ret = page_array_pin(pa, vdev);
+		ret = page_array_pin(pa, vdev, idal_is_2k(cp));
 		if (ret < 0)
 			goto out_unpin;
 	} else {
@@ -654,7 +656,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
 	return 0;
 
 out_unpin:
-	page_array_unpin_free(pa, vdev);
+	page_array_unpin_free(pa, vdev, idal_is_2k(cp));
 out_free_idaws:
 	kfree(idaws);
 out_init:
@@ -752,7 +754,7 @@ void cp_free(struct channel_program *cp)
 	cp->initialized = false;
 	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
 		for (i = 0; i < chain->ch_len; i++) {
-			page_array_unpin_free(chain->ch_pa + i, vdev);
+			page_array_unpin_free(chain->ch_pa + i, vdev, idal_is_2k(cp));
 			ccwchain_cda_free(chain, i);
 		}
 		ccwchain_free(chain);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions
  2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
                   ` (14 preceding siblings ...)
  2022-11-21 21:40 ` [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs Eric Farman
@ 2022-11-21 21:40 ` Eric Farman
  2022-12-19 20:44   ` Matthew Rosato
  15 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-11-21 21:40 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Eric Farman

By this point, all the pieces are in place to properly support
a 2K Format-2 IDAL, and to convert a guest Format-1 IDAL to
the 2K Format-2 variety. Let's remove the fence that prohibits
them, and allow a guest to submit them if desired.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 Documentation/s390/vfio-ccw.rst | 4 ++--
 drivers/s390/cio/vfio_ccw_cp.c  | 8 --------
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
index ea928a3806f4..53375dc86213 100644
--- a/Documentation/s390/vfio-ccw.rst
+++ b/Documentation/s390/vfio-ccw.rst
@@ -219,8 +219,8 @@ values may occur:
   The operation was successful.
 
 ``-EOPNOTSUPP``
-  The orb specified transport mode or an unidentified IDAW format, or the
-  scsw specified a function other than the start function.
+  The ORB specified transport mode, or the
+  SCSW specified a function other than the start function.
 
 ``-EIO``
   A request was issued while the device was not in a state ready to accept
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 3829c346583c..60e972042979 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -372,14 +372,6 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
 	do {
 		cnt++;
 
-		/*
-		 * As we don't want to fail direct addressing even if the
-		 * orb specified one of the unsupported formats, we defer
-		 * checking for IDAWs in unsupported formats to here.
-		 */
-		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
-			return -EOPNOTSUPP;
-
 		/*
 		 * We want to keep counting if the current CCW has the
 		 * command-chaining flag enabled, or if it is a TIC CCW
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw
  2022-11-21 21:40 ` [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw Eric Farman
@ 2022-11-22  1:41   ` Jason Gunthorpe
  2022-12-15 20:59   ` Matthew Rosato
  1 sibling, 0 replies; 41+ messages in thread
From: Jason Gunthorpe @ 2022-11-22  1:41 UTC (permalink / raw)
  To: Eric Farman
  Cc: Matthew Rosato, Halil Pasic, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Vineeth Vijayan, Peter Oberparleiter, linux-s390, kvm

On Mon, Nov 21, 2022 at 10:40:45PM +0100, Eric Farman wrote:
> It was suggested [1] that we replace the old copy_from_iova() routine
> (which pins a page, does a memcpy, and unpins the page) with the
> newer vfio_dma_rw() interface.
> 
> This has a modest improvement in the overall time spent through the
> fsm_io_request() path, and simplifies some of the code to boot.
> 
> [1] https://lore.kernel.org/r/20220706170553.GK693670@nvidia.com/
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 56 +++-------------------------------
>  1 file changed, 5 insertions(+), 51 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary
  2022-11-21 21:40 ` [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary Eric Farman
@ 2022-11-22 16:12   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-11-22 16:12 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> There is no longer an mdev struct accessible via a channel
> program struct, but there are some artifacts remaining that
> mention it. Clean them up.
> 
> Fixes: 0a58795647cd ("vfio/ccw: Remove mdev from struct channel_program")

Since it's only some changes in code comments, I don't think this needs to go to stable via a fixes tag.

Otherwise:
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 5 ++---
>  drivers/s390/cio/vfio_ccw_cp.h | 1 -
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index c0a09fa8991a..9e6df1f2fbee 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -121,7 +121,7 @@ static void page_array_unpin(struct page_array *pa,
>  /*
>   * page_array_pin() - Pin user pages in memory
>   * @pa: page_array on which to perform the operation
> - * @mdev: the mediated device to perform pin operations
> + * @vdev: the vfio device to perform pin operations
>   *
>   * Returns number of pages pinned upon success.
>   * If the pin request partially succeeds, or fails completely,
> @@ -229,7 +229,7 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
>  }
>  
>  /*
> - * Within the domain (@mdev), copy @n bytes from a guest physical
> + * Within the domain (@vdev), copy @n bytes from a guest physical
>   * address (@iova) to a host physical address (@to).
>   */
>  static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova,
> @@ -665,7 +665,6 @@ static int ccwchain_fetch_one(struct ccwchain *chain,
>  /**
>   * cp_init() - allocate ccwchains for a channel program.
>   * @cp: channel_program on which to perform the operation
> - * @mdev: the mediated device to perform pin/unpin operations
>   * @orb: control block for the channel program from the guest
>   *
>   * This creates one or more ccwchain(s), and copies the raw data of
> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
> index 54d26e242533..16138a654fdd 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.h
> +++ b/drivers/s390/cio/vfio_ccw_cp.h
> @@ -27,7 +27,6 @@
>   * struct channel_program - manage information for channel program
>   * @ccwchain_list: list head of ccwchains
>   * @orb: orb for the currently processed ssch request
> - * @mdev: the mediated device to perform page pinning/unpinning
>   * @initialized: whether this instance is actually initialized
>   *
>   * @ccwchain_list is the head of a ccwchain list, that contents the


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface
  2022-11-21 21:40 ` [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface Eric Farman
@ 2022-11-22 16:13   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-11-22 16:13 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> There's no need to send in both the address of the subchannel
> struct, and an element within it, to populate the ORB.
> 
> Pass the whole pointer and let cp_get_orb() take the pieces
> that are needed.
> 
> Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys
  2022-11-21 21:40 ` [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys Eric Farman
@ 2022-12-15 20:55   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-15 20:55 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> Currently, vfio-ccw copies the ORB from the io_region to the
> channel_program struct being built. It then adjusts various
> pieces of that ORB to the values needed to be used by the
> SSCH issued by vfio-ccw in the host.
> 
> This includes setting the subchannel key to the default,
> presumably because Linux doesn't do anything with non-zero
> storage keys itself. But it seems wrong to convert every I/O
> to the default key if the guest itself requested a non-zero
> subchannel (access) key.
> 
> Any channel program that sets a non-zero key would expect the
> same key returned in the SCSW of the IRB, not zero, so best to
> allow that to occur unimpeded.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index a0060ef1119e..268a90252521 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -836,7 +836,6 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
>  
>  	orb->cmd.intparm = (u32)virt_to_phys(sch);
>  	orb->cmd.fmt = 1;
> -	orb->cmd.key = PAGE_DEFAULT_KEY >> 4;
>  
>  	if (orb->cmd.lpm == 0)
>  		orb->cmd.lpm = sch->lpm;


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB
  2022-11-21 21:40 ` [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB Eric Farman
@ 2022-12-15 20:55   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-15 20:55 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The output of vfio_ccw is always a Format-2 IDAL, but the code that
> explicitly sets this is buried in cp_init().
> 
> In fact the input is often already a Format-2 IDAL, and would be
> rejected (via the check in ccwchain_calc_length()) if it weren't,
> so explicitly setting it doesn't do much. Setting it way down here
> only makes it impossible to make decisions in support of other
> IDAL formats.
> 
> Let's move that to where the rest of the ORB is set up, so that the
> CCW processing in cp_prefetch() is performed according to the
> contents of the unmodified guest ORB.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 268a90252521..3a11132b1685 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -707,15 +707,9 @@ int cp_init(struct channel_program *cp, union orb *orb)
>  	/* Build a ccwchain for the first CCW segment */
>  	ret = ccwchain_handle_ccw(orb->cmd.cpa, cp);
>  
> -	if (!ret) {
> +	if (!ret)
>  		cp->initialized = true;
>  
> -		/* It is safe to force: if it was not set but idals used
> -		 * ccwchain_calc_length would have returned an error.
> -		 */
> -		cp->orb.cmd.c64 = 1;
> -	}
> -
>  	return ret;
>  }
>  
> @@ -837,6 +831,11 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
>  	orb->cmd.intparm = (u32)virt_to_phys(sch);
>  	orb->cmd.fmt = 1;
>  
> +	/*
> +	 * Everything built by vfio-ccw is a Format-2 IDAL.
> +	 */
> +	orb->cmd.c64 = 1;
> +
>  	if (orb->cmd.lpm == 0)
>  		orb->cmd.lpm = sch->lpm;
>  


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw
  2022-11-21 21:40 ` [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw Eric Farman
  2022-11-22  1:41   ` Jason Gunthorpe
@ 2022-12-15 20:59   ` Matthew Rosato
  1 sibling, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-15 20:59 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm, Jason Gunthorpe

On 11/21/22 4:40 PM, Eric Farman wrote:
> It was suggested [1] that we replace the old copy_from_iova() routine
> (which pins a page, does a memcpy, and unpins the page) with the
> newer vfio_dma_rw() interface.
> 
> This has a modest improvement in the overall time spent through the
> fsm_io_request() path, and simplifies some of the code to boot.
> 
> [1] https://lore.kernel.org/r/20220706170553.GK693670@nvidia.com/
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Nice cleanup.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 56 +++-------------------------------
>  1 file changed, 5 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 3a11132b1685..1eacbb8dc860 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -228,51 +228,6 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
>  	}
>  }
>  
> -/*
> - * Within the domain (@vdev), copy @n bytes from a guest physical
> - * address (@iova) to a host physical address (@to).
> - */
> -static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova,
> -			   unsigned long n)
> -{
> -	struct page_array pa = {0};
> -	int i, ret;
> -	unsigned long l, m;
> -
> -	ret = page_array_alloc(&pa, iova, n);
> -	if (ret < 0)
> -		return ret;
> -
> -	ret = page_array_pin(&pa, vdev);
> -	if (ret < 0) {
> -		page_array_unpin_free(&pa, vdev);
> -		return ret;
> -	}
> -
> -	l = n;
> -	for (i = 0; i < pa.pa_nr; i++) {
> -		void *from = kmap_local_page(pa.pa_page[i]);
> -
> -		m = PAGE_SIZE;
> -		if (i == 0) {
> -			from += iova & (PAGE_SIZE - 1);
> -			m -= iova & (PAGE_SIZE - 1);
> -		}
> -
> -		m = min(l, m);
> -		memcpy(to + (n - l), from, m);
> -		kunmap_local(from);
> -
> -		l -= m;
> -		if (l == 0)
> -			break;
> -	}
> -
> -	page_array_unpin_free(&pa, vdev);
> -
> -	return l;
> -}
> -
>  /*
>   * Helpers to operate ccwchain.
>   */
> @@ -471,10 +426,9 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
>  	int len, ret;
>  
>  	/* Copy 2K (the most we support today) of possible CCWs */
> -	len = copy_from_iova(vdev, cp->guest_cp, cda,
> -			     CCWCHAIN_LEN_MAX * sizeof(struct ccw1));
> -	if (len)
> -		return len;
> +	ret = vfio_dma_rw(vdev, cda, cp->guest_cp, CCWCHAIN_LEN_MAX * sizeof(struct ccw1), false);
> +	if (ret)
> +		return ret;
>  
>  	/* Convert any Format-0 CCWs to Format-1 */
>  	if (!cp->orb.cmd.fmt)
> @@ -572,7 +526,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
>  	if (ccw_is_idal(ccw)) {
>  		/* Read first IDAW to see if it's 4K-aligned or not. */
>  		/* All subsequent IDAws will be 4K-aligned. */
> -		ret = copy_from_iova(vdev, &iova, ccw->cda, sizeof(iova));
> +		ret = vfio_dma_rw(vdev, ccw->cda, &iova, sizeof(iova), false);
>  		if (ret)
>  			return ret;
>  	} else {
> @@ -601,7 +555,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
>  
>  	if (ccw_is_idal(ccw)) {
>  		/* Copy guest IDAL into host IDAL */
> -		ret = copy_from_iova(vdev, idaws, ccw->cda, idal_len);
> +		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
>  		if (ret)
>  			goto out_unpin;
>  


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines
  2022-11-21 21:40 ` [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines Eric Farman
@ 2022-12-15 21:18   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-15 21:18 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The act of processing a fetched CCW has two components:
> 
>  1) Process a Transfer-in-channel (TIC) CCW
>  2) Process any other CCW
> 
> The former needs to look at whether the TIC jumps backwards into
> the current channel program or forwards into a new one segment,

unnecessary word 'one'?

> while the latter just processes the CCW data address itself.
> 
> Rather than passing the chain segment and index within it to the
> handlers for the above, and requiring each to calculate the
> elements it needs, simply pass the needed pointers directly.
> 
> For the TIC, that means the CCW being processed and the location
> of the entire channel program which holds all segments. For the
> other CCWs, the page_array pointer is also needed to perform the
> page pinning, etc.
> 
> While at it, rename ccwchain_fetch_direct to _ccw, to indicate
> what it is. The name "_direct" is historical, when it used to
> process a direct-addressed CCW, but IDAs are processed here too.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 1eacbb8dc860..d41d94cecdf8 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -482,11 +482,9 @@ static int ccwchain_loop_tic(struct ccwchain *chain, struct channel_program *cp)
>  	return 0;
>  }
>  
> -static int ccwchain_fetch_tic(struct ccwchain *chain,
> -			      int idx,
> +static int ccwchain_fetch_tic(struct ccw1 *ccw,
>  			      struct channel_program *cp)
>  {
> -	struct ccw1 *ccw = chain->ch_ccw + idx;
>  	struct ccwchain *iter;
>  	u32 ccw_head;
>  
> @@ -502,14 +500,12 @@ static int ccwchain_fetch_tic(struct ccwchain *chain,
>  	return -EFAULT;
>  }
>  
> -static int ccwchain_fetch_direct(struct ccwchain *chain,
> -				 int idx,
> -				 struct channel_program *cp)
> +static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> +			      struct page_array *pa,
> +			      struct channel_program *cp)
>  {
>  	struct vfio_device *vdev =
>  		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
> -	struct ccw1 *ccw;
> -	struct page_array *pa;
>  	u64 iova;
>  	unsigned long *idaws;
>  	int ret;
> @@ -517,8 +513,6 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
>  	int idaw_nr, idal_len;
>  	int i;
>  
> -	ccw = chain->ch_ccw + idx;
> -
>  	if (ccw->count)
>  		bytes = ccw->count;
>  
> @@ -548,7 +542,6 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
>  	 * required for the data transfer, since we only only support
>  	 * 4K IDAWs today.
>  	 */
> -	pa = chain->ch_pa + idx;
>  	ret = page_array_alloc(pa, iova, bytes);
>  	if (ret < 0)
>  		goto out_free_idaws;
> @@ -604,16 +597,15 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
>   * and to get rid of the cda 2G limitiaion of ccw1, we'll translate
>   * direct ccws to idal ccws.
>   */
> -static int ccwchain_fetch_one(struct ccwchain *chain,
> -			      int idx,
> +static int ccwchain_fetch_one(struct ccw1 *ccw,
> +			      struct page_array *pa,
>  			      struct channel_program *cp)
> -{
> -	struct ccw1 *ccw = chain->ch_ccw + idx;
>  
> +{
>  	if (ccw_is_tic(ccw))
> -		return ccwchain_fetch_tic(chain, idx, cp);
> +		return ccwchain_fetch_tic(ccw, cp);
>  
> -	return ccwchain_fetch_direct(chain, idx, cp);
> +	return ccwchain_fetch_ccw(ccw, pa, cp);
>  }
>  
>  /**
> @@ -736,6 +728,8 @@ void cp_free(struct channel_program *cp)
>  int cp_prefetch(struct channel_program *cp)
>  {
>  	struct ccwchain *chain;
> +	struct ccw1 *ccw;
> +	struct page_array *pa;
>  	int len, idx, ret;
>  
>  	/* this is an error in the caller */
> @@ -745,7 +739,10 @@ int cp_prefetch(struct channel_program *cp)
>  	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>  		len = chain->ch_len;
>  		for (idx = 0; idx < len; idx++) {
> -			ret = ccwchain_fetch_one(chain, idx, cp);
> +			ccw = chain->ch_ccw + idx;
> +			pa = chain->ch_pa + idx;
> +
> +			ret = ccwchain_fetch_one(ccw, pa, cp);
>  			if (ret)
>  				goto out_err;
>  		}


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct
  2022-11-21 21:40 ` [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct Eric Farman
@ 2022-12-16 19:59   ` Matthew Rosato
  2022-12-19 16:22     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-16 19:59 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The allocation of our page_array struct calculates the number
> of 4K pages that would be needed to hold a certain number of
> bytes. But, since the number of pages that will be pinned is
> also calculated by the length of the IDAL, this logic is
> unnecessary. Let's pass that information in directly, and
> avoid the math within the allocator.
> 
> Also, let's make this two allocations instead of one,
> to make it apparent what's happening within here.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 4b6b5f9dc92d..66e890441163 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -43,7 +43,7 @@ struct ccwchain {
>   * page_array_alloc() - alloc memory for page array
>   * @pa: page_array on which to perform the operation
>   * @iova: target guest physical address
> - * @len: number of bytes that should be pinned from @iova
> + * @len: number of pages that should be pinned from @iova
>   *
>   * Attempt to allocate memory for page array.
>   *
> @@ -63,18 +63,20 @@ static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
>  	if (pa->pa_nr || pa->pa_iova)
>  		return -EINVAL;
>  
> -	pa->pa_nr = ((iova & ~PAGE_MASK) + len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> -	if (!pa->pa_nr)
> +	if (!len)

Seems like a weird way to check this.  if (len == 0) ?

Otherwise:
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

>  		return -EINVAL;
>  
> -	pa->pa_iova = kcalloc(pa->pa_nr,
> -			      sizeof(*pa->pa_iova) + sizeof(*pa->pa_page),
> -			      GFP_KERNEL);
> -	if (unlikely(!pa->pa_iova)) {
> -		pa->pa_nr = 0;
> +	pa->pa_nr = len;
> +
> +	pa->pa_iova = kcalloc(len, sizeof(*pa->pa_iova), GFP_KERNEL);
> +	if (!pa->pa_iova)
> +		return -ENOMEM;
> +
> +	pa->pa_page = kcalloc(len, sizeof(*pa->pa_page), GFP_KERNEL);
> +	if (!pa->pa_page) {
> +		kfree(pa->pa_iova);
>  		return -ENOMEM;
>  	}
> -	pa->pa_page = (struct page **)&pa->pa_iova[pa->pa_nr];
>  
>  	pa->pa_iova[0] = iova;
>  	pa->pa_page[0] = NULL;
> @@ -167,6 +169,7 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
>  static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev)
>  {
>  	page_array_unpin(pa, vdev, pa->pa_nr);
> +	kfree(pa->pa_page);
>  	kfree(pa->pa_iova);
>  }
>  
> @@ -545,7 +548,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	 * required for the data transfer, since we only only support
>  	 * 4K IDAWs today.
>  	 */
> -	ret = page_array_alloc(pa, iova, bytes);
> +	ret = page_array_alloc(pa, iova, idaw_nr);
>  	if (ret < 0)
>  		goto out_free_idaws;
>  


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment
  2022-11-21 21:40 ` [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment Eric Farman
@ 2022-12-16 20:10   ` Matthew Rosato
  2022-12-19 16:22     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-16 20:10 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> Everything about this allocation is harder than necessary,
> since the memory allocation is already aligned to our needs.
> Break them apart for readability, instead of doing the
> funky artithmetic.
> 
> Of the structures that are involved, only ch_ccw needs the
> GFP_DMA flag, so the others can be allocated without it.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++----------------
>  1 file changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index d41d94cecdf8..4b6b5f9dc92d 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -311,40 +311,41 @@ static inline int is_tic_within_range(struct ccw1 *ccw, u32 head, int len)
>  static struct ccwchain *ccwchain_alloc(struct channel_program *cp, int len)
>  {
>  	struct ccwchain *chain;
> -	void *data;
> -	size_t size;
> -
> -	/* Make ccw address aligned to 8. */
> -	size = ((sizeof(*chain) + 7L) & -8L) +
> -		sizeof(*chain->ch_ccw) * len +
> -		sizeof(*chain->ch_pa) * len;
> -	chain = kzalloc(size, GFP_DMA | GFP_KERNEL);
> +
> +	chain = kzalloc(sizeof(*chain), GFP_KERNEL);

I suppose you could consider a WARN_ONCE here if one of these kzalloc'd addresses has something in the low-order 3 bits; would probably make it more obvious if for some reason the alignment guarantee was broken vs some status after-the-fact in the IRB.  But as per our discussion off-list I think that can only happen if ARCH_KMALLOC_MINALIGN were to change.

>  	if (!chain)
>  		return NULL;
>  
> -	data = (u8 *)chain + ((sizeof(*chain) + 7L) & -8L);
> -	chain->ch_ccw = (struct ccw1 *)data;
> -
> -	data = (u8 *)(chain->ch_ccw) + sizeof(*chain->ch_ccw) * len;
> -	chain->ch_pa = (struct page_array *)data;
> +	chain->ch_ccw = kcalloc(len, sizeof(*chain->ch_ccw), GFP_DMA | GFP_KERNEL);
> +	if (!chain->ch_ccw)
> +		goto out_err;
>  
> -	chain->ch_len = len;
> +	chain->ch_pa = kcalloc(len, sizeof(*chain->ch_pa), GFP_KERNEL);
> +	if (!chain->ch_pa)
> +		goto out_err;
>  
>  	list_add_tail(&chain->next, &cp->ccwchain_list);
>  
>  	return chain;
> +
> +out_err:
> +	kfree(chain->ch_ccw);
> +	kfree(chain);
> +	return NULL;
>  }
>  
>  static void ccwchain_free(struct ccwchain *chain)
>  {
>  	list_del(&chain->next);
> +	kfree(chain->ch_pa);
> +	kfree(chain->ch_ccw);
>  	kfree(chain);
>  }
>  
>  /* Free resource for a ccw that allocated memory for its cda. */
>  static void ccwchain_cda_free(struct ccwchain *chain, int idx)
>  {
> -	struct ccw1 *ccw = chain->ch_ccw + idx;
> +	struct ccw1 *ccw = &chain->ch_ccw[idx];
>  
>  	if (ccw_is_tic(ccw))
>  		return;
> @@ -443,6 +444,8 @@ static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
>  	chain = ccwchain_alloc(cp, len);
>  	if (!chain)
>  		return -ENOMEM;
> +
> +	chain->ch_len = len;
>  	chain->ch_iova = cda;
>  
>  	/* Copy the actual CCWs into the new chain */
> @@ -464,7 +467,7 @@ static int ccwchain_loop_tic(struct ccwchain *chain, struct channel_program *cp)
>  	int i, ret;
>  
>  	for (i = 0; i < chain->ch_len; i++) {
> -		tic = chain->ch_ccw + i;
> +		tic = &chain->ch_ccw[i];

These don't seem equivalent...  Before at each iteration you'd offset tic by i bytes, now you're treating i as an index of 8B ccw1 structs, so it seems like this went from tic = x + i to tic = x + (8 * i)?  Was the old code broken or am I missing something? 

>  
>  		if (!ccw_is_tic(tic))
>  			continue;
> @@ -739,8 +742,8 @@ int cp_prefetch(struct channel_program *cp)
>  	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>  		len = chain->ch_len;
>  		for (idx = 0; idx < len; idx++) {
> -			ccw = chain->ch_ccw + idx;
> -			pa = chain->ch_pa + idx;
> +			ccw = &chain->ch_ccw[idx];
> +			pa = &chain->ch_pa[idx];

Same sort of question re: ch_pa

>  
>  			ret = ccwchain_fetch_one(ccw, pa, cp);
>  			if (ret)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 09/16] vfio/ccw: populate page_array struct inline
  2022-11-21 21:40 ` [PATCH v1 09/16] vfio/ccw: populate page_array struct inline Eric Farman
@ 2022-12-16 21:05   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-16 21:05 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> There are two possible ways the list of addresses that get passed
> to vfio are calculated. One is from a guest IDAL, which would be
> an array of (probably) non-contiguous addresses. The other is
> built from contiguous pages that follow the starting address
> provided by ccw->cda.
> 
> page_array_alloc() attempts to simplify things by pre-populating
> this array from the starting address, but that's not needed for
> a CCW with an IDAL anyway so doesn't need to be in the allocator.
> Move it to the caller in the non-IDAL case, since it will be
> overwritten when reading the guest IDAL.
> 
> Remove the initialization of the pa_page output pointers,
> since it won't be explicitly needed for either case.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 22 +++++-----------------
>  1 file changed, 5 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 66e890441163..a30f26962750 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -42,7 +42,6 @@ struct ccwchain {
>  /*
>   * page_array_alloc() - alloc memory for page array
>   * @pa: page_array on which to perform the operation
> - * @iova: target guest physical address
>   * @len: number of pages that should be pinned from @iova
>   *
>   * Attempt to allocate memory for page array.
> @@ -56,10 +55,8 @@ struct ccwchain {
>   *   -EINVAL if pa->pa_nr is not initially zero, or pa->pa_iova is not NULL
>   *   -ENOMEM if alloc failed
>   */
> -static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
> +static int page_array_alloc(struct page_array *pa, unsigned int len)
>  {
> -	int i;
> -
>  	if (pa->pa_nr || pa->pa_iova)
>  		return -EINVAL;
>  
> @@ -78,13 +75,6 @@ static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len)
>  		return -ENOMEM;
>  	}
>  
> -	pa->pa_iova[0] = iova;
> -	pa->pa_page[0] = NULL;
> -	for (i = 1; i < pa->pa_nr; i++) {
> -		pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
> -		pa->pa_page[i] = NULL;
> -	}
> -
>  	return 0;
>  }
>  
> @@ -548,7 +538,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	 * required for the data transfer, since we only only support
>  	 * 4K IDAWs today.
>  	 */
> -	ret = page_array_alloc(pa, iova, idaw_nr);
> +	ret = page_array_alloc(pa, idaw_nr);
>  	if (ret < 0)
>  		goto out_free_idaws;
>  
> @@ -565,11 +555,9 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  		for (i = 0; i < idaw_nr; i++)
>  			pa->pa_iova[i] = idaws[i];
>  	} else {
> -		/*
> -		 * No action is required here; the iova addresses in page_array
> -		 * were initialized sequentially in page_array_alloc() beginning
> -		 * with the contents of ccw->cda.
> -		 */
> +		pa->pa_iova[0] = iova;
> +		for (i = 1; i < pa->pa_nr; i++)
> +			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
>  	}
>  
>  	if (ccw_does_data_transfer(ccw)) {


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment
  2022-12-16 20:10   ` Matthew Rosato
@ 2022-12-19 16:22     ` Eric Farman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Farman @ 2022-12-19 16:22 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Fri, 2022-12-16 at 15:10 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > Everything about this allocation is harder than necessary,
> > since the memory allocation is already aligned to our needs.
> > Break them apart for readability, instead of doing the
> > funky artithmetic.
> > 
> > Of the structures that are involved, only ch_ccw needs the
> > GFP_DMA flag, so the others can be allocated without it.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++------------
> > ----
> >  1 file changed, 21 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index d41d94cecdf8..4b6b5f9dc92d 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -311,40 +311,41 @@ static inline int is_tic_within_range(struct
> > ccw1 *ccw, u32 head, int len)
> >  static struct ccwchain *ccwchain_alloc(struct channel_program *cp,
> > int len)
> >  {
> >         struct ccwchain *chain;
> > -       void *data;
> > -       size_t size;
> > -
> > -       /* Make ccw address aligned to 8. */
> > -       size = ((sizeof(*chain) + 7L) & -8L) +
> > -               sizeof(*chain->ch_ccw) * len +
> > -               sizeof(*chain->ch_pa) * len;
> > -       chain = kzalloc(size, GFP_DMA | GFP_KERNEL);
> > +
> > +       chain = kzalloc(sizeof(*chain), GFP_KERNEL);
> 
> I suppose you could consider a WARN_ONCE here if one of these
> kzalloc'd addresses has something in the low-order 3 bits; would
> probably make it more obvious if for some reason the alignment
> guarantee was broken vs some status after-the-fact in the IRB.  But
> as per our discussion off-list I think that can only happen if
> ARCH_KMALLOC_MINALIGN were to change.

Yeah, maybe, but the "status after-the-fact" is a program check that
would be generated by the channel, just as would be done if the ORB was
located in a similarly-weird location (which we don't check for
either). Since this is all mainline paths, I don't think it makes sense
to re-check all those possible permutations here.

(And, for what it's worth, it's not this allocation that matters, but
rather the one that gets stuffed into the ORB below [1])

> 
> >         if (!chain)
> >                 return NULL;
> >  
> > -       data = (u8 *)chain + ((sizeof(*chain) + 7L) & -8L);
> > -       chain->ch_ccw = (struct ccw1 *)data;
> > -
> > -       data = (u8 *)(chain->ch_ccw) + sizeof(*chain->ch_ccw) *
> > len;
> > -       chain->ch_pa = (struct page_array *)data;
> > +       chain->ch_ccw = kcalloc(len, sizeof(*chain->ch_ccw),
> > GFP_DMA | GFP_KERNEL);

[1]

> > +       if (!chain->ch_ccw)
> > +               goto out_err;
> >  
> > -       chain->ch_len = len;
> > +       chain->ch_pa = kcalloc(len, sizeof(*chain->ch_pa),
> > GFP_KERNEL);
> > +       if (!chain->ch_pa)
> > +               goto out_err;
> >  
> >         list_add_tail(&chain->next, &cp->ccwchain_list);
> >  
> >         return chain;
> > +
> > +out_err:
> > +       kfree(chain->ch_ccw);
> > +       kfree(chain);
> > +       return NULL;
> >  }
> >  
> >  static void ccwchain_free(struct ccwchain *chain)
> >  {
> >         list_del(&chain->next);
> > +       kfree(chain->ch_pa);
> > +       kfree(chain->ch_ccw);
> >         kfree(chain);
> >  }
> >  
> >  /* Free resource for a ccw that allocated memory for its cda. */
> >  static void ccwchain_cda_free(struct ccwchain *chain, int idx)
> >  {
> > -       struct ccw1 *ccw = chain->ch_ccw + idx;
> > +       struct ccw1 *ccw = &chain->ch_ccw[idx];
> >  
> >         if (ccw_is_tic(ccw))
> >                 return;
> > @@ -443,6 +444,8 @@ static int ccwchain_handle_ccw(u32 cda, struct
> > channel_program *cp)
> >         chain = ccwchain_alloc(cp, len);
> >         if (!chain)
> >                 return -ENOMEM;
> > +
> > +       chain->ch_len = len;
> >         chain->ch_iova = cda;
> >  
> >         /* Copy the actual CCWs into the new chain */
> > @@ -464,7 +467,7 @@ static int ccwchain_loop_tic(struct ccwchain
> > *chain, struct channel_program *cp)
> >         int i, ret;
> >  
> >         for (i = 0; i < chain->ch_len; i++) {
> > -               tic = chain->ch_ccw + i;
> > +               tic = &chain->ch_ccw[i];
> 
> These don't seem equivalent...  Before at each iteration you'd offset
> tic by i bytes, now you're treating i as an index of 8B ccw1 structs,
> so it seems like this went from tic = x + i to tic = x + (8 * i)? 
> Was the old code broken or am I missing something? 

I think the latter. :) The old code did one allocation measured in
bytes, stored it in chain, and then calculated locations within that
for ch_ccw and ch_pa, cast to the respective pointer types. (See the
reference [1] above.)

So any use of "i" was an index into the pointer types and thus already
a "8 * i" addition from your example. My intention here was to remove
the pseudo-assembly above, and changed these along the way as I was un-
tangling everything. Looking at the resulting assembly before/after,
these hunks don't end up changing at all so I'll back these changes
back out. Especially since...

> 
> >  
> >                 if (!ccw_is_tic(tic))
> >                         continue;
> > @@ -739,8 +742,8 @@ int cp_prefetch(struct channel_program *cp)
> >         list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >                 len = chain->ch_len;
> >                 for (idx = 0; idx < len; idx++) {
> > -                       ccw = chain->ch_ccw + idx;
> > -                       pa = chain->ch_pa + idx;
> > +                       ccw = &chain->ch_ccw[idx];
> > +                       pa = &chain->ch_pa[idx];
> 
> Same sort of question re: ch_pa

...this prompted me to notice that I didn't change the users of "chain-
>ch_pa + i" when calling page_array_unpin_free(), so now we have both
flavors which isn't ideal.

BEFORE:
                        ccw = chain->ch_ccw + idx;
                        pa = chain->ch_pa + idx;
    1536:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    153c:       b9 08 00 3b             agr     %r3,%r11
    1540:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    1546:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = chain->ch_ccw + idx;
    154c:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    1552:       e3 20 80 10 00 08       ag      %r2,16(%r8)
AFTER
                        ccw = &chain->ch_ccw[idx];
                        pa = &chain->ch_pa[idx];
    15be:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    15c4:       b9 08 00 3b             agr     %r3,%r11
    15c8:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    15ce:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = &chain->ch_ccw[idx];
    15d4:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    15da:       e3 20 80 10 00 08       ag      %r2,16(%r8)


> 
> >  
> >                         ret = ccwchain_fetch_one(ccw, pa, cp);
> >                         if (ret)
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct
  2022-12-16 19:59   ` Matthew Rosato
@ 2022-12-19 16:22     ` Eric Farman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Farman @ 2022-12-19 16:22 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Fri, 2022-12-16 at 14:59 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > The allocation of our page_array struct calculates the number
> > of 4K pages that would be needed to hold a certain number of
> > bytes. But, since the number of pages that will be pinned is
> > also calculated by the length of the IDAL, this logic is
> > unnecessary. Let's pass that information in directly, and
> > avoid the math within the allocator.
> > 
> > Also, let's make this two allocations instead of one,
> > to make it apparent what's happening within here.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 23 +++++++++++++----------
> >  1 file changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index 4b6b5f9dc92d..66e890441163 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -43,7 +43,7 @@ struct ccwchain {
> >   * page_array_alloc() - alloc memory for page array
> >   * @pa: page_array on which to perform the operation
> >   * @iova: target guest physical address
> > - * @len: number of bytes that should be pinned from @iova
> > + * @len: number of pages that should be pinned from @iova
> >   *
> >   * Attempt to allocate memory for page array.
> >   *
> > @@ -63,18 +63,20 @@ static int page_array_alloc(struct page_array
> > *pa, u64 iova, unsigned int len)
> >         if (pa->pa_nr || pa->pa_iova)
> >                 return -EINVAL;
> >  
> > -       pa->pa_nr = ((iova & ~PAGE_MASK) + len + (PAGE_SIZE - 1))
> > >> PAGE_SHIFT;
> > -       if (!pa->pa_nr)
> > +       if (!len)
> 
> Seems like a weird way to check this.  if (len == 0) ?

Yeah... Weird before, weird now. I'll fix this up.

> 
> Otherwise:
> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

Thanks!

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 10/16] vfio/ccw: refactor the idaw counter
  2022-11-21 21:40 ` [PATCH v1 10/16] vfio/ccw: refactor the idaw counter Eric Farman
@ 2022-12-19 19:16   ` Matthew Rosato
  2022-12-19 19:31     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 19:16 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The rules of an IDAW are fairly simple: Each one can move no
> more than a defined amount of data, must not cross the
> boundary defined by that length, and must be aligned to that
> length as well. The first IDAW in a list is special, in that
> it does not need to adhere to that alignment, but the other
> rules still apply. Thus, by reading the first IDAW in a list,
> the number of IDAWs that will comprise a data transfer of a
> particular size can be calculated.
> 
> Let's factor out the reading of that first IDAW with the
> logic that calculates the length of the list, to simplify
> the rest of the routine that handles the individual IDAWs.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++++++++++--------
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index a30f26962750..34a133d962d1 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -496,23 +496,25 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
>  	return -EFAULT;
>  }
>  
> -static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> -			      struct page_array *pa,
> -			      struct channel_program *cp)
> +/*
> + * ccw_count_idaws() - Calculate the number of IDAWs needed to transfer
> + * a specified amount of data
> + *
> + * @ccw: The Channel Command Word being translated
> + * @cp: Channel Program being processed
> + */
> +static int ccw_count_idaws(struct ccw1 *ccw,
> +			   struct channel_program *cp)
>  {
>  	struct vfio_device *vdev =
>  		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
>  	u64 iova;
> -	unsigned long *idaws;
>  	int ret;
>  	int bytes = 1;
> -	int idaw_nr, idal_len;
> -	int i;
>  
>  	if (ccw->count)
>  		bytes = ccw->count;
>  
> -	/* Calculate size of IDAL */
>  	if (ccw_is_idal(ccw)) {
>  		/* Read first IDAW to see if it's 4K-aligned or not. */
>  		/* All subsequent IDAws will be 4K-aligned. */
> @@ -522,7 +524,26 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	} else {
>  		iova = ccw->cda;
>  	}
> -	idaw_nr = idal_nr_words((void *)iova, bytes);
> +
> +	return idal_nr_words((void *)iova, bytes);
> +}
> +
> +static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> +			      struct page_array *pa,
> +			      struct channel_program *cp)
> +{
> +	struct vfio_device *vdev =
> +		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
> +	unsigned long *idaws;
> +	int ret;
> +	int idaw_nr, idal_len;
> +	int i;
> +
> +	/* Calculate size of IDAL */
> +	idaw_nr = ccw_count_idaws(ccw, cp);
> +	if (idaw_nr < 0)
> +		return idaw_nr;
> +

What about if we get a 0 back from ccw_count_idaws?   The next thing we're going to do (not shown here) is kcalloc(0, sizeof(*idaws)), which I think means you'll get back ZERO_SIZE_PTR, not a null pointer.

>  	idal_len = idaw_nr * sizeof(*idaws);
>  
>  	/* Allocate an IDAL from host storage */
> @@ -555,7 +576,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  		for (i = 0; i < idaw_nr; i++)
>  			pa->pa_iova[i] = idaws[i];
>  	} else {
> -		pa->pa_iova[0] = iova;
> +		pa->pa_iova[0] = ccw->cda;
>  		for (i = 1; i < pa->pa_nr; i++)
>  			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
>  	}


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW
  2022-11-21 21:40 ` [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW Eric Farman
@ 2022-12-19 19:27   ` Matthew Rosato
  2022-12-19 20:27     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 19:27 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The intention is to read the first IDAW to determine the starting
> location of an I/O operation, knowing that the second and any/all
> subsequent IDAWs will be aligned per architecture. But, this read
> receives 64-bits of data, which is the size of a format-2 IDAW.
> 
> In the event that Format-1 IDAWs are presented, discard the lower
> 32 bits as they contain the second IDAW in such a list.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 34a133d962d1..53246f4f95f7 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -516,11 +516,15 @@ static int ccw_count_idaws(struct ccw1 *ccw,
>  		bytes = ccw->count;
>  
>  	if (ccw_is_idal(ccw)) {
> -		/* Read first IDAW to see if it's 4K-aligned or not. */
> -		/* All subsequent IDAws will be 4K-aligned. */
> +		/* Read first IDAW to check its starting address. */
> +		/* All subsequent IDAWs will be 2K- or 4K-aligned. */
>  		ret = vfio_dma_rw(vdev, ccw->cda, &iova, sizeof(iova), false);
>  		if (ret)
>  			return ret;
> +
> +		/* Format-1 IDAWs only occupy the first int */
> +		if (!cp->orb.cmd.c64)
> +			iova = iova >> 32;

Rather than read 8B and discarding 4B, can't we check this format value first and only read 4B for format-1?

>  	} else {
>  		iova = ccw->cda;
>  	}


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 10/16] vfio/ccw: refactor the idaw counter
  2022-12-19 19:16   ` Matthew Rosato
@ 2022-12-19 19:31     ` Eric Farman
  2022-12-19 19:40       ` Matthew Rosato
  0 siblings, 1 reply; 41+ messages in thread
From: Eric Farman @ 2022-12-19 19:31 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Mon, 2022-12-19 at 14:16 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > The rules of an IDAW are fairly simple: Each one can move no
> > more than a defined amount of data, must not cross the
> > boundary defined by that length, and must be aligned to that
> > length as well. The first IDAW in a list is special, in that
> > it does not need to adhere to that alignment, but the other
> > rules still apply. Thus, by reading the first IDAW in a list,
> > the number of IDAWs that will comprise a data transfer of a
> > particular size can be calculated.
> > 
> > Let's factor out the reading of that first IDAW with the
> > logic that calculates the length of the list, to simplify
> > the rest of the routine that handles the individual IDAWs.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++++++++++----
> > ----
> >  1 file changed, 30 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index a30f26962750..34a133d962d1 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -496,23 +496,25 @@ static int ccwchain_fetch_tic(struct ccw1
> > *ccw,
> >         return -EFAULT;
> >  }
> >  
> > -static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> > -                             struct page_array *pa,
> > -                             struct channel_program *cp)
> > +/*
> > + * ccw_count_idaws() - Calculate the number of IDAWs needed to
> > transfer
> > + * a specified amount of data
> > + *
> > + * @ccw: The Channel Command Word being translated
> > + * @cp: Channel Program being processed
> > + */
> > +static int ccw_count_idaws(struct ccw1 *ccw,
> > +                          struct channel_program *cp)
> >  {
> >         struct vfio_device *vdev =
> >                 &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> >         u64 iova;
> > -       unsigned long *idaws;
> >         int ret;
> >         int bytes = 1;
> > -       int idaw_nr, idal_len;
> > -       int i;
> >  
> >         if (ccw->count)
> >                 bytes = ccw->count;
> >  
> > -       /* Calculate size of IDAL */
> >         if (ccw_is_idal(ccw)) {
> >                 /* Read first IDAW to see if it's 4K-aligned or
> > not. */
> >                 /* All subsequent IDAws will be 4K-aligned. */
> > @@ -522,7 +524,26 @@ static int ccwchain_fetch_ccw(struct ccw1
> > *ccw,
> >         } else {
> >                 iova = ccw->cda;
> >         }
> > -       idaw_nr = idal_nr_words((void *)iova, bytes);
> > +
> > +       return idal_nr_words((void *)iova, bytes);
> > +}
> > +
> > +static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> > +                             struct page_array *pa,
> > +                             struct channel_program *cp)
> > +{
> > +       struct vfio_device *vdev =
> > +               &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> > +       unsigned long *idaws;
> > +       int ret;
> > +       int idaw_nr, idal_len;
> > +       int i;
> > +
> > +       /* Calculate size of IDAL */
> > +       idaw_nr = ccw_count_idaws(ccw, cp);
> > +       if (idaw_nr < 0)
> > +               return idaw_nr;
> > +
> 
> What about if we get a 0 back from ccw_count_idaws?   The next thing
> we're going to do (not shown here) is kcalloc(0, sizeof(*idaws)),
> which I think means you'll get back ZERO_SIZE_PTR, not a null
> pointer.

While it's true that the idal_nr_words routines could return zero, I
don't see how the ccw_count_idaws routine which calls it could do the
same. We added a check for a zero data count with commit 453eac312445e
("s390/cio: Allow zero-length CCWs in vfio-ccw"), such that a CCW that
has no length will cause us to allocate -something- that would be valid
for the channel to use, even if it's not going to put anything in/out
of it.

> 
> >         idal_len = idaw_nr * sizeof(*idaws);
> >  
> >         /* Allocate an IDAL from host storage */
> > @@ -555,7 +576,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> >                 for (i = 0; i < idaw_nr; i++)
> >                         pa->pa_iova[i] = idaws[i];
> >         } else {
> > -               pa->pa_iova[0] = iova;
> > +               pa->pa_iova[0] = ccw->cda;
> >                 for (i = 1; i < pa->pa_nr; i++)
> >                         pa->pa_iova[i] = pa->pa_iova[i - 1] +
> > PAGE_SIZE;
> >         }
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 10/16] vfio/ccw: refactor the idaw counter
  2022-12-19 19:31     ` Eric Farman
@ 2022-12-19 19:40       ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 19:40 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 12/19/22 2:31 PM, Eric Farman wrote:
> On Mon, 2022-12-19 at 14:16 -0500, Matthew Rosato wrote:
>> On 11/21/22 4:40 PM, Eric Farman wrote:
>>> The rules of an IDAW are fairly simple: Each one can move no
>>> more than a defined amount of data, must not cross the
>>> boundary defined by that length, and must be aligned to that
>>> length as well. The first IDAW in a list is special, in that
>>> it does not need to adhere to that alignment, but the other
>>> rules still apply. Thus, by reading the first IDAW in a list,
>>> the number of IDAWs that will comprise a data transfer of a
>>> particular size can be calculated.
>>>
>>> Let's factor out the reading of that first IDAW with the
>>> logic that calculates the length of the list, to simplify
>>> the rest of the routine that handles the individual IDAWs.
>>>
>>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>>> ---
>>>  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++++++++++----
>>> ----
>>>  1 file changed, 30 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c
>>> b/drivers/s390/cio/vfio_ccw_cp.c
>>> index a30f26962750..34a133d962d1 100644
>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>>> @@ -496,23 +496,25 @@ static int ccwchain_fetch_tic(struct ccw1
>>> *ccw,
>>>         return -EFAULT;
>>>  }
>>>  
>>> -static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>>> -                             struct page_array *pa,
>>> -                             struct channel_program *cp)
>>> +/*
>>> + * ccw_count_idaws() - Calculate the number of IDAWs needed to
>>> transfer
>>> + * a specified amount of data
>>> + *
>>> + * @ccw: The Channel Command Word being translated
>>> + * @cp: Channel Program being processed
>>> + */
>>> +static int ccw_count_idaws(struct ccw1 *ccw,
>>> +                          struct channel_program *cp)
>>>  {
>>>         struct vfio_device *vdev =
>>>                 &container_of(cp, struct vfio_ccw_private, cp)-
>>>> vdev;
>>>         u64 iova;
>>> -       unsigned long *idaws;
>>>         int ret;
>>>         int bytes = 1;
>>> -       int idaw_nr, idal_len;
>>> -       int i;
>>>  
>>>         if (ccw->count)
>>>                 bytes = ccw->count;
>>>  
>>> -       /* Calculate size of IDAL */
>>>         if (ccw_is_idal(ccw)) {
>>>                 /* Read first IDAW to see if it's 4K-aligned or
>>> not. */
>>>                 /* All subsequent IDAws will be 4K-aligned. */
>>> @@ -522,7 +524,26 @@ static int ccwchain_fetch_ccw(struct ccw1
>>> *ccw,
>>>         } else {
>>>                 iova = ccw->cda;
>>>         }
>>> -       idaw_nr = idal_nr_words((void *)iova, bytes);
>>> +
>>> +       return idal_nr_words((void *)iova, bytes);
>>> +}
>>> +
>>> +static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>>> +                             struct page_array *pa,
>>> +                             struct channel_program *cp)
>>> +{
>>> +       struct vfio_device *vdev =
>>> +               &container_of(cp, struct vfio_ccw_private, cp)-
>>>> vdev;
>>> +       unsigned long *idaws;
>>> +       int ret;
>>> +       int idaw_nr, idal_len;
>>> +       int i;
>>> +
>>> +       /* Calculate size of IDAL */
>>> +       idaw_nr = ccw_count_idaws(ccw, cp);
>>> +       if (idaw_nr < 0)
>>> +               return idaw_nr;
>>> +
>>
>> What about if we get a 0 back from ccw_count_idaws?   The next thing
>> we're going to do (not shown here) is kcalloc(0, sizeof(*idaws)),
>> which I think means you'll get back ZERO_SIZE_PTR, not a null
>> pointer.
> 
> While it's true that the idal_nr_words routines could return zero, I
> don't see how the ccw_count_idaws routine which calls it could do the
> same. We added a check for a zero data count with commit 453eac312445e
> ("s390/cio: Allow zero-length CCWs in vfio-ccw"), such that a CCW that
> has no length will cause us to allocate -something- that would be valid
> for the channel to use, even if it's not going to put anything in/out
> of it.

Ah, yeah I was specifically looking at idal_nr_words and I missed the 'int bytes = 1;' business at the top of ccw_count_idaws. 

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> 
>>
>>>         idal_len = idaw_nr * sizeof(*idaws);
>>>  
>>>         /* Allocate an IDAL from host storage */
>>> @@ -555,7 +576,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>>>                 for (i = 0; i < idaw_nr; i++)
>>>                         pa->pa_iova[i] = idaws[i];
>>>         } else {
>>> -               pa->pa_iova[0] = iova;
>>> +               pa->pa_iova[0] = ccw->cda;
>>>                 for (i = 1; i < pa->pa_nr; i++)
>>>                         pa->pa_iova[i] = pa->pa_iova[i - 1] +
>>> PAGE_SIZE;
>>>         }
>>
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format
  2022-11-21 21:40 ` [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format Eric Farman
@ 2022-12-19 19:49   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 19:49 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The idal_nr_words() routine works well for 4K IDAWs, but lost its
> ability to handle the old 2K formats with the removal of 31-bit
> builds in commit 5a79859ae0f3 ("s390: remove 31 bit support").
> 
> Since there's nothing preventing a guest from generating this IDAW
> format, let's re-introduce the math for them and use both when
> calculating the number of IDAWs based on the bits specified in
> the ORB.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  arch/s390/include/asm/idals.h  | 12 ++++++++++++
>  drivers/s390/cio/vfio_ccw_cp.c | 17 ++++++++++++++++-
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/idals.h b/arch/s390/include/asm/idals.h
> index 40eae2c08d61..0a05a893aedb 100644
> --- a/arch/s390/include/asm/idals.h
> +++ b/arch/s390/include/asm/idals.h
> @@ -23,6 +23,9 @@
>  #define IDA_SIZE_LOG 12 /* 11 for 2k , 12 for 4k */
>  #define IDA_BLOCK_SIZE (1L<<IDA_SIZE_LOG)
>  
> +#define IDA_2K_SIZE_LOG 11
> +#define IDA_2K_BLOCK_SIZE (1L << IDA_2K_SIZE_LOG)
> +
>  /*
>   * Test if an address/length pair needs an idal list.
>   */
> @@ -42,6 +45,15 @@ static inline unsigned int idal_nr_words(void *vaddr, unsigned int length)
>  		(IDA_BLOCK_SIZE-1)) >> IDA_SIZE_LOG;
>  }
>  
> +/*
> + * Return the number of 2K IDA words needed for an address/length pair.
> + */
> +static inline unsigned int idal_2k_nr_words(void *vaddr, unsigned int length)
> +{
> +	return ((__pa(vaddr) & (IDA_2K_BLOCK_SIZE-1)) + length +
> +		(IDA_2K_BLOCK_SIZE-1)) >> IDA_2K_SIZE_LOG;
> +}
> +
>  /*
>   * Create the list of idal words for an address/length pair.
>   */
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 53246f4f95f7..6839e7195182 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -502,6 +502,13 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
>   *
>   * @ccw: The Channel Command Word being translated
>   * @cp: Channel Program being processed
> + *
> + * The ORB is examined, since it specifies what IDAWs could actually be
> + * used by any CCW in the channel program, regardless of whether or not
> + * the CCW actually does. An ORB that does not specify Format-2-IDAW
> + * Control could still contain a CCW with an IDAL, which would be
> + * Format-1 and thus only move 2K with each IDAW. Thus all CCWs within
> + * the channel program must follow the same size requirements.
>   */
>  static int ccw_count_idaws(struct ccw1 *ccw,
>  			   struct channel_program *cp)
> @@ -529,7 +536,15 @@ static int ccw_count_idaws(struct ccw1 *ccw,
>  		iova = ccw->cda;
>  	}
>  
> -	return idal_nr_words((void *)iova, bytes);
> +	/* Format-1 IDAWs operate on 2K each */
> +	if (!cp->orb.cmd.c64)
> +		return idal_2k_nr_words((void *)iova, bytes);
> +
> +	/* Format-2 IDAWs operate on either 2K or 4K */
> +	if (cp->orb.cmd.i2k)
> +		return idal_2k_nr_words((void *)iova, bytes);
> +	else

Nit: The else is unnecessary, just unconditionally return idal_nr_words if you reach the end of the function.

Either way:
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> +		return idal_nr_words((void *)iova, bytes);
>  }
>  
>  static int ccwchain_fetch_ccw(struct ccw1 *ccw,


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal
  2022-11-21 21:40 ` [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal Eric Farman
@ 2022-12-19 20:14   ` Matthew Rosato
  2022-12-19 21:00     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 20:14 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> Today, we allocate memory for a list of IDAWs, and if the CCW
> being processed contains an IDAL we read that data from the guest
> into that space. We then copy each IDAW into the pa_iova array,
> or fabricate that pa_iova array with a list of addresses based
> on a direct-addressed CCW.
> 
> Combine the reading of the guest IDAL with the creation of a
> pseudo-IDAL for direct-addressed CCWs, so that both CCW types
> have a "guest" IDAL that can be populated straight into the
> pa_iova array.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 72 +++++++++++++++++++++++-----------
>  1 file changed, 50 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 6839e7195182..90685cee85db 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -192,11 +192,12 @@ static inline void page_array_idal_create_words(struct page_array *pa,
>  	 * idaw.
>  	 */
>  
> -	for (i = 0; i < pa->pa_nr; i++)
> +	for (i = 0; i < pa->pa_nr; i++) {
>  		idaws[i] = page_to_phys(pa->pa_page[i]);
>  
> -	/* Adjust the first IDAW, since it may not start on a page boundary */
> -	idaws[0] += pa->pa_iova[0] & (PAGE_SIZE - 1);
> +		/* Incorporate any offset from each starting address */
> +		idaws[i] += pa->pa_iova[i] & (PAGE_SIZE - 1);
> +	}
>  }
>  
>  static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
> @@ -496,6 +497,44 @@ static int ccwchain_fetch_tic(struct ccw1 *ccw,
>  	return -EFAULT;
>  }
>  
> +static unsigned long *get_guest_idal(struct ccw1 *ccw,
> +				     struct channel_program *cp,
> +				     int idaw_nr)
> +{
> +	struct vfio_device *vdev =
> +		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
> +	unsigned long *idaws;
> +	int idal_len = idaw_nr * sizeof(*idaws);
> +	int idaw_size = PAGE_SIZE;
> +	int idaw_mask = ~(idaw_size - 1);
> +	int i, ret;
> +
> +	idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA | GFP_KERNEL);
> +	if (!idaws)
> +		return NULL;
> +
> +	if (ccw_is_idal(ccw)) {
> +		/* Copy IDAL from guest */
> +		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
> +		if (ret) {
> +			kfree(idaws);
> +			return NULL;

As discussed off-list, for debug purposes consider using something like ERR_PTR of the vfio_dma_rw error return here rather than NULL. 

> +		}
> +	} else {
> +		/* Fabricate an IDAL based off CCW data address */
> +		if (cp->orb.cmd.c64) {
> +			idaws[0] = ccw->cda;
> +			for (i = 1; i < idaw_nr; i++)
> +				idaws[i] = (idaws[i - 1] + idaw_size) & idaw_mask;
> +		} else {

If anyone else is reviewing and stumbles on this, I was initially wondering why we bail here with no obvious explanation - was going to ask for a comment here but it looks like this else gets replaced next patch with implementation for format-1.

> +			kfree(idaws);
> +			return NULL;
> +		}
> +	}
> +
> +	return idaws;
> +}
> +
>  /*
>   * ccw_count_idaws() - Calculate the number of IDAWs needed to transfer
>   * a specified amount of data
> @@ -555,7 +594,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
>  	unsigned long *idaws;
>  	int ret;
> -	int idaw_nr, idal_len;
> +	int idaw_nr;
>  	int i;
>  
>  	/* Calculate size of IDAL */
> @@ -563,10 +602,8 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	if (idaw_nr < 0)
>  		return idaw_nr;
>  
> -	idal_len = idaw_nr * sizeof(*idaws);
> -
>  	/* Allocate an IDAL from host storage */
> -	idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA | GFP_KERNEL);
> +	idaws = get_guest_idal(ccw, cp, idaw_nr);
>  	if (!idaws) {
>  		ret = -ENOMEM;
>  		goto out_init;
> @@ -582,22 +619,13 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	if (ret < 0)
>  		goto out_free_idaws;
>  
> -	if (ccw_is_idal(ccw)) {
> -		/* Copy guest IDAL into host IDAL */
> -		ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len, false);
> -		if (ret)
> -			goto out_unpin;
> -
> -		/*
> -		 * Copy guest IDAWs into page_array, in case the memory they
> -		 * occupy is not contiguous.
> -		 */
> -		for (i = 0; i < idaw_nr; i++)
> +	/*
> +	 * Copy guest IDAWs into page_array, in case the memory they
> +	 * occupy is not contiguous.
> +	 */
> +	for (i = 0; i < idaw_nr; i++) {
> +		if (cp->orb.cmd.c64)
>  			pa->pa_iova[i] = idaws[i];
> -	} else {
> -		pa->pa_iova[0] = ccw->cda;
> -		for (i = 1; i < pa->pa_nr; i++)
> -			pa->pa_iova[i] = pa->pa_iova[i - 1] + PAGE_SIZE;
>  	}
>  
>  	if (ccw_does_data_transfer(ccw)) {


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW
  2022-12-19 19:27   ` Matthew Rosato
@ 2022-12-19 20:27     ` Eric Farman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Farman @ 2022-12-19 20:27 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Mon, 2022-12-19 at 14:27 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > The intention is to read the first IDAW to determine the starting
> > location of an I/O operation, knowing that the second and any/all
> > subsequent IDAWs will be aligned per architecture. But, this read
> > receives 64-bits of data, which is the size of a format-2 IDAW.
> > 
> > In the event that Format-1 IDAWs are presented, discard the lower
> > 32 bits as they contain the second IDAW in such a list.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index 34a133d962d1..53246f4f95f7 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -516,11 +516,15 @@ static int ccw_count_idaws(struct ccw1 *ccw,
> >                 bytes = ccw->count;
> >  
> >         if (ccw_is_idal(ccw)) {
> > -               /* Read first IDAW to see if it's 4K-aligned or
> > not. */
> > -               /* All subsequent IDAws will be 4K-aligned. */
> > +               /* Read first IDAW to check its starting address.
> > */
> > +               /* All subsequent IDAWs will be 2K- or 4K-aligned.
> > */
> >                 ret = vfio_dma_rw(vdev, ccw->cda, &iova,
> > sizeof(iova), false);
> >                 if (ret)
> >                         return ret;
> > +
> > +               /* Format-1 IDAWs only occupy the first int */
> > +               if (!cp->orb.cmd.c64)
> > +                       iova = iova >> 32;
> 
> Rather than read 8B and discarding 4B, can't we check this format
> value first and only read 4B for format-1?

Erp, yeah. I think I had that in one point; I'll work something in
here. Thanks for the top.

> 
> >         } else {
> >                 iova = ccw->cda;
> >         }
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL
  2022-11-21 21:40 ` [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL Eric Farman
@ 2022-12-19 20:29   ` Matthew Rosato
  2022-12-19 21:04     ` Eric Farman
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 20:29 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> There are two scenarios that need to be addressed here.
> 
> First, an ORB that does NOT have the Format-2 IDAL bit set could
> have both a direct-addressed CCW and an indirect-data-address CCW
> chained together. This means that the IDA CCW will contain a
> Format-1 IDAL, and can be easily converted to a 2K Format-2 IDAL.
> But it also means that the direct-addressed CCW needs to be
> converted to the same 2K Format-2 IDAL for consistency with the
> ORB settings.
> 
> Secondly, a Format-1 IDAL is comprised of 31-bit addresses.
> Thus, we need to cast this IDAL to a pointer of ints while
> populating the list of addresses that are sent to vfio.
> 
> Since the result of both of these is the use of the 2K IDAL
> variants, and the output of vfio-ccw is always a Format-2 IDAL
> (in order to use 64-bit addresses), make sure that the correct
> control bit gets set in the ORB when these scenarios occur.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 90685cee85db..9527f3d8da77 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -222,6 +222,8 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
>  	}
>  }
>  
> +#define idal_is_2k(_cp) (!_cp->orb.cmd.c64 || _cp->orb.cmd.i2k)
> +
>  /*
>   * Helpers to operate ccwchain.
>   */
> @@ -504,8 +506,9 @@ static unsigned long *get_guest_idal(struct ccw1 *ccw,
>  	struct vfio_device *vdev =
>  		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
>  	unsigned long *idaws;
> +	unsigned int *idaws_f1;

I wonder if we should be using explicit u64/u32 here since we are dealing with hardware-architected data sizes and specifically want to index by 32- or 64-bits.  Honestly, there's probably a number of other spots in vfio-ccw where that might make sense so it would also be OK to look into that as a follow-on.

Otherwise, LGTM

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

>  	int idal_len = idaw_nr * sizeof(*idaws);
> -	int idaw_size = PAGE_SIZE;
> +	int idaw_size = idal_is_2k(cp) ? PAGE_SIZE / 2 : PAGE_SIZE;
>  	int idaw_mask = ~(idaw_size - 1);
>  	int i, ret;
>  
> @@ -527,8 +530,10 @@ static unsigned long *get_guest_idal(struct ccw1 *ccw,
>  			for (i = 1; i < idaw_nr; i++)
>  				idaws[i] = (idaws[i - 1] + idaw_size) & idaw_mask;
>  		} else {
> -			kfree(idaws);
> -			return NULL;
> +			idaws_f1 = (unsigned int *)idaws;
> +			idaws_f1[0] = ccw->cda;
> +			for (i = 1; i < idaw_nr; i++)
> +				idaws_f1[i] = (idaws_f1[i - 1] + idaw_size) & idaw_mask;
>  		}
>  	}
>  
> @@ -593,6 +598,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	struct vfio_device *vdev =
>  		&container_of(cp, struct vfio_ccw_private, cp)->vdev;
>  	unsigned long *idaws;
> +	unsigned int *idaws_f1;
>  	int ret;
>  	int idaw_nr;
>  	int i;
> @@ -623,9 +629,12 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	 * Copy guest IDAWs into page_array, in case the memory they
>  	 * occupy is not contiguous.
>  	 */
> +	idaws_f1 = (unsigned int *)idaws;
>  	for (i = 0; i < idaw_nr; i++) {
>  		if (cp->orb.cmd.c64)
>  			pa->pa_iova[i] = idaws[i];
> +		else
> +			pa->pa_iova[i] = idaws_f1[i];
>  	}
>  
>  	if (ccw_does_data_transfer(ccw)) {
> @@ -846,7 +855,11 @@ union orb *cp_get_orb(struct channel_program *cp, struct subchannel *sch)
>  
>  	/*
>  	 * Everything built by vfio-ccw is a Format-2 IDAL.
> +	 * If the input was a Format-1 IDAL, indicate that
> +	 * 2K Format-2 IDAWs were created here.
>  	 */
> +	if (!orb->cmd.c64)
> +		orb->cmd.i2k = 1;
>  	orb->cmd.c64 = 1;
>  
>  	if (orb->cmd.lpm == 0)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs
  2022-11-21 21:40 ` [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs Eric Farman
@ 2022-12-19 20:40   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 20:40 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> The vfio_pin_pages() interface allows contiguous pages to be
> pinned as a single request, which is great for the 4K pages
> that are normally processed. Old IDA formats operate on 2K
> chunks, which makes this logic more difficult.
> 
> Since these formats are rare, let's just invoke the page
> pinning one-at-a-time, instead of trying to group them.
> We can rework this code at a later date if needed.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 9527f3d8da77..3829c346583c 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -88,7 +88,7 @@ static int page_array_alloc(struct page_array *pa, unsigned int len)
>   * otherwise only clear pa->pa_nr
>   */
>  static void page_array_unpin(struct page_array *pa,
> -			     struct vfio_device *vdev, int pa_nr)
> +			     struct vfio_device *vdev, int pa_nr, bool unaligned)

Please add 'unaligned' the comment block above this function with a short explanation...

>  {
>  	int unpinned = 0, npage = 1;
>  
> @@ -97,7 +97,8 @@ static void page_array_unpin(struct page_array *pa,
>  		dma_addr_t *last = &first[npage];
>  
>  		if (unpinned + npage < pa_nr &&
> -		    *first + npage * PAGE_SIZE == *last) {
> +		    *first + npage * PAGE_SIZE == *last &&
> +		    !unaligned) {
>  			npage++;
>  			continue;
>  		}
> @@ -119,7 +120,7 @@ static void page_array_unpin(struct page_array *pa,
>   * If the pin request partially succeeds, or fails completely,
>   * all pages are left unpinned and a negative error value is returned.
>   */
> -static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
> +static int page_array_pin(struct page_array *pa, struct vfio_device *vdev, bool unaligned)

...  Also here.  Otherwise, I agree re-work can be done later since this is not a common case.

With those changes:

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

>  {
>  	int pinned = 0, npage = 1;
>  	int ret = 0;
> @@ -129,7 +130,8 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
>  		dma_addr_t *last = &first[npage];
>  
>  		if (pinned + npage < pa->pa_nr &&
> -		    *first + npage * PAGE_SIZE == *last) {
> +		    *first + npage * PAGE_SIZE == *last &&
> +		    !unaligned) {
>  			npage++;
>  			continue;
>  		}
> @@ -151,14 +153,14 @@ static int page_array_pin(struct page_array *pa, struct vfio_device *vdev)
>  	return ret;
>  
>  err_out:
> -	page_array_unpin(pa, vdev, pinned);
> +	page_array_unpin(pa, vdev, pinned, unaligned);
>  	return ret;
>  }
>  
>  /* Unpin the pages before releasing the memory. */
> -static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev)
> +static void page_array_unpin_free(struct page_array *pa, struct vfio_device *vdev, bool unaligned)
>  {
> -	page_array_unpin(pa, vdev, pa->pa_nr);
> +	page_array_unpin(pa, vdev, pa->pa_nr, unaligned);
>  	kfree(pa->pa_page);
>  	kfree(pa->pa_iova);
>  }
> @@ -638,7 +640,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	}
>  
>  	if (ccw_does_data_transfer(ccw)) {
> -		ret = page_array_pin(pa, vdev);
> +		ret = page_array_pin(pa, vdev, idal_is_2k(cp));
>  		if (ret < 0)
>  			goto out_unpin;
>  	} else {
> @@ -654,7 +656,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
>  	return 0;
>  
>  out_unpin:
> -	page_array_unpin_free(pa, vdev);
> +	page_array_unpin_free(pa, vdev, idal_is_2k(cp));
>  out_free_idaws:
>  	kfree(idaws);
>  out_init:
> @@ -752,7 +754,7 @@ void cp_free(struct channel_program *cp)
>  	cp->initialized = false;
>  	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
>  		for (i = 0; i < chain->ch_len; i++) {
> -			page_array_unpin_free(chain->ch_pa + i, vdev);
> +			page_array_unpin_free(chain->ch_pa + i, vdev, idal_is_2k(cp));
>  			ccwchain_cda_free(chain, i);
>  		}
>  		ccwchain_free(chain);


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions
  2022-11-21 21:40 ` [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions Eric Farman
@ 2022-12-19 20:44   ` Matthew Rosato
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Rosato @ 2022-12-19 20:44 UTC (permalink / raw)
  To: Eric Farman, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On 11/21/22 4:40 PM, Eric Farman wrote:
> By this point, all the pieces are in place to properly support
> a 2K Format-2 IDAL, and to convert a guest Format-1 IDAL to
> the 2K Format-2 variety. Let's remove the fence that prohibits
> them, and allow a guest to submit them if desired.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  Documentation/s390/vfio-ccw.rst | 4 ++--
>  drivers/s390/cio/vfio_ccw_cp.c  | 8 --------
>  2 files changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
> index ea928a3806f4..53375dc86213 100644
> --- a/Documentation/s390/vfio-ccw.rst
> +++ b/Documentation/s390/vfio-ccw.rst
> @@ -219,8 +219,8 @@ values may occur:
>    The operation was successful.
>  
>  ``-EOPNOTSUPP``
> -  The orb specified transport mode or an unidentified IDAW format, or the
> -  scsw specified a function other than the start function.
> +  The ORB specified transport mode, or the

Nit:  Drop unnecessary comma

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> +  SCSW specified a function other than the start function.
>  
>  ``-EIO``
>    A request was issued while the device was not in a state ready to accept
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 3829c346583c..60e972042979 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -372,14 +372,6 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
>  	do {
>  		cnt++;
>  
> -		/*
> -		 * As we don't want to fail direct addressing even if the
> -		 * orb specified one of the unsupported formats, we defer
> -		 * checking for IDAWs in unsupported formats to here.
> -		 */
> -		if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
> -			return -EOPNOTSUPP;
> -
>  		/*
>  		 * We want to keep counting if the current CCW has the
>  		 * command-chaining flag enabled, or if it is a TIC CCW


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal
  2022-12-19 20:14   ` Matthew Rosato
@ 2022-12-19 21:00     ` Eric Farman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Farman @ 2022-12-19 21:00 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Mon, 2022-12-19 at 15:14 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > Today, we allocate memory for a list of IDAWs, and if the CCW
> > being processed contains an IDAL we read that data from the guest
> > into that space. We then copy each IDAW into the pa_iova array,
> > or fabricate that pa_iova array with a list of addresses based
> > on a direct-addressed CCW.
> > 
> > Combine the reading of the guest IDAL with the creation of a
> > pseudo-IDAL for direct-addressed CCWs, so that both CCW types
> > have a "guest" IDAL that can be populated straight into the
> > pa_iova array.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 72 +++++++++++++++++++++++-------
> > ----
> >  1 file changed, 50 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index 6839e7195182..90685cee85db 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -192,11 +192,12 @@ static inline void
> > page_array_idal_create_words(struct page_array *pa,
> >          * idaw.
> >          */
> >  
> > -       for (i = 0; i < pa->pa_nr; i++)
> > +       for (i = 0; i < pa->pa_nr; i++) {
> >                 idaws[i] = page_to_phys(pa->pa_page[i]);
> >  
> > -       /* Adjust the first IDAW, since it may not start on a page
> > boundary */
> > -       idaws[0] += pa->pa_iova[0] & (PAGE_SIZE - 1);
> > +               /* Incorporate any offset from each starting
> > address */
> > +               idaws[i] += pa->pa_iova[i] & (PAGE_SIZE - 1);
> > +       }
> >  }
> >  
> >  static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned
> > long len)
> > @@ -496,6 +497,44 @@ static int ccwchain_fetch_tic(struct ccw1
> > *ccw,
> >         return -EFAULT;
> >  }
> >  
> > +static unsigned long *get_guest_idal(struct ccw1 *ccw,
> > +                                    struct channel_program *cp,
> > +                                    int idaw_nr)
> > +{
> > +       struct vfio_device *vdev =
> > +               &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> > +       unsigned long *idaws;
> > +       int idal_len = idaw_nr * sizeof(*idaws);
> > +       int idaw_size = PAGE_SIZE;
> > +       int idaw_mask = ~(idaw_size - 1);
> > +       int i, ret;
> > +
> > +       idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA |
> > GFP_KERNEL);
> > +       if (!idaws)
> > +               return NULL;
> > +
> > +       if (ccw_is_idal(ccw)) {
> > +               /* Copy IDAL from guest */
> > +               ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len,
> > false);
> > +               if (ret) {
> > +                       kfree(idaws);
> > +                       return NULL;
> 
> As discussed off-list, for debug purposes consider using something
> like ERR_PTR of the vfio_dma_rw error return here rather than NULL. 

Yes, good idea.

> 
> > +               }
> > +       } else {
> > +               /* Fabricate an IDAL based off CCW data address */
> > +               if (cp->orb.cmd.c64) {
> > +                       idaws[0] = ccw->cda;
> > +                       for (i = 1; i < idaw_nr; i++)
> > +                               idaws[i] = (idaws[i - 1] +
> > idaw_size) & idaw_mask;
> > +               } else {
> 
> If anyone else is reviewing and stumbles on this, I was initially
> wondering why we bail here with no obvious explanation - was going to
> ask for a comment here but it looks like this else gets replaced next
> patch with implementation for format-1.

As you note, this goes away in the next patch so I didn't put a comment
in place. But while putting in the ERR_PTR/PTR_ERR stuff, I opted to
return EOPNOTSUPP here instead of NULL, so we get an error equal to the
current fence instead of an ENOMEM. So that'll be in v2.

Thanks,
Eric

> 
> > +                       kfree(idaws);
> > +                       return NULL;
> > +               }
> > +       }
> > +
> > +       return idaws;
> > +}
> > +
> >  /*
> >   * ccw_count_idaws() - Calculate the number of IDAWs needed to
> > transfer
> >   * a specified amount of data
> > @@ -555,7 +594,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> >                 &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> >         unsigned long *idaws;
> >         int ret;
> > -       int idaw_nr, idal_len;
> > +       int idaw_nr;
> >         int i;
> >  
> >         /* Calculate size of IDAL */
> > @@ -563,10 +602,8 @@ static int ccwchain_fetch_ccw(struct ccw1
> > *ccw,
> >         if (idaw_nr < 0)
> >                 return idaw_nr;
> >  
> > -       idal_len = idaw_nr * sizeof(*idaws);
> > -
> >         /* Allocate an IDAL from host storage */
> > -       idaws = kcalloc(idaw_nr, sizeof(*idaws), GFP_DMA |
> > GFP_KERNEL);
> > +       idaws = get_guest_idal(ccw, cp, idaw_nr);
> >         if (!idaws) {
> >                 ret = -ENOMEM;
> >                 goto out_init;
> > @@ -582,22 +619,13 @@ static int ccwchain_fetch_ccw(struct ccw1
> > *ccw,
> >         if (ret < 0)
> >                 goto out_free_idaws;
> >  
> > -       if (ccw_is_idal(ccw)) {
> > -               /* Copy guest IDAL into host IDAL */
> > -               ret = vfio_dma_rw(vdev, ccw->cda, idaws, idal_len,
> > false);
> > -               if (ret)
> > -                       goto out_unpin;
> > -
> > -               /*
> > -                * Copy guest IDAWs into page_array, in case the
> > memory they
> > -                * occupy is not contiguous.
> > -                */
> > -               for (i = 0; i < idaw_nr; i++)
> > +       /*
> > +        * Copy guest IDAWs into page_array, in case the memory
> > they
> > +        * occupy is not contiguous.
> > +        */
> > +       for (i = 0; i < idaw_nr; i++) {
> > +               if (cp->orb.cmd.c64)
> >                         pa->pa_iova[i] = idaws[i];
> > -       } else {
> > -               pa->pa_iova[0] = ccw->cda;
> > -               for (i = 1; i < pa->pa_nr; i++)
> > -                       pa->pa_iova[i] = pa->pa_iova[i - 1] +
> > PAGE_SIZE;
> >         }
> >  
> >         if (ccw_does_data_transfer(ccw)) {
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL
  2022-12-19 20:29   ` Matthew Rosato
@ 2022-12-19 21:04     ` Eric Farman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Farman @ 2022-12-19 21:04 UTC (permalink / raw)
  To: Matthew Rosato, Halil Pasic
  Cc: Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Vineeth Vijayan,
	Peter Oberparleiter, linux-s390, kvm

On Mon, 2022-12-19 at 15:29 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > There are two scenarios that need to be addressed here.
> > 
> > First, an ORB that does NOT have the Format-2 IDAL bit set could
> > have both a direct-addressed CCW and an indirect-data-address CCW
> > chained together. This means that the IDA CCW will contain a
> > Format-1 IDAL, and can be easily converted to a 2K Format-2 IDAL.
> > But it also means that the direct-addressed CCW needs to be
> > converted to the same 2K Format-2 IDAL for consistency with the
> > ORB settings.
> > 
> > Secondly, a Format-1 IDAL is comprised of 31-bit addresses.
> > Thus, we need to cast this IDAL to a pointer of ints while
> > populating the list of addresses that are sent to vfio.
> > 
> > Since the result of both of these is the use of the 2K IDAL
> > variants, and the output of vfio-ccw is always a Format-2 IDAL
> > (in order to use 64-bit addresses), make sure that the correct
> > control bit gets set in the ORB when these scenarios occur.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index 90685cee85db..9527f3d8da77 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -222,6 +222,8 @@ static void convert_ccw0_to_ccw1(struct ccw1
> > *source, unsigned long len)
> >         }
> >  }
> >  
> > +#define idal_is_2k(_cp) (!_cp->orb.cmd.c64 || _cp->orb.cmd.i2k)
> > +
> >  /*
> >   * Helpers to operate ccwchain.
> >   */
> > @@ -504,8 +506,9 @@ static unsigned long *get_guest_idal(struct
> > ccw1 *ccw,
> >         struct vfio_device *vdev =
> >                 &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> >         unsigned long *idaws;
> > +       unsigned int *idaws_f1;
> 
> I wonder if we should be using explicit u64/u32 here since we are
> dealing with hardware-architected data sizes and specifically want to
> index by 32- or 64-bits.  Honestly, there's probably a number of
> other spots in vfio-ccw where that might make sense so it would also
> be OK to look into that as a follow-on.

This is a fair point. I have some follow-on work to this series for
after the holidays, so I'm going to put this suggestion on that todo
list.

> 
> Otherwise, LGTM
> 
> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
> 
> >         int idal_len = idaw_nr * sizeof(*idaws);
> > -       int idaw_size = PAGE_SIZE;
> > +       int idaw_size = idal_is_2k(cp) ? PAGE_SIZE / 2 : PAGE_SIZE;
> >         int idaw_mask = ~(idaw_size - 1);
> >         int i, ret;
> >  
> > @@ -527,8 +530,10 @@ static unsigned long *get_guest_idal(struct
> > ccw1 *ccw,
> >                         for (i = 1; i < idaw_nr; i++)
> >                                 idaws[i] = (idaws[i - 1] +
> > idaw_size) & idaw_mask;
> >                 } else {
> > -                       kfree(idaws);
> > -                       return NULL;
> > +                       idaws_f1 = (unsigned int *)idaws;
> > +                       idaws_f1[0] = ccw->cda;
> > +                       for (i = 1; i < idaw_nr; i++)
> > +                               idaws_f1[i] = (idaws_f1[i - 1] +
> > idaw_size) & idaw_mask;
> >                 }
> >         }
> >  
> > @@ -593,6 +598,7 @@ static int ccwchain_fetch_ccw(struct ccw1 *ccw,
> >         struct vfio_device *vdev =
> >                 &container_of(cp, struct vfio_ccw_private, cp)-
> > >vdev;
> >         unsigned long *idaws;
> > +       unsigned int *idaws_f1;
> >         int ret;
> >         int idaw_nr;
> >         int i;
> > @@ -623,9 +629,12 @@ static int ccwchain_fetch_ccw(struct ccw1
> > *ccw,
> >          * Copy guest IDAWs into page_array, in case the memory
> > they
> >          * occupy is not contiguous.
> >          */
> > +       idaws_f1 = (unsigned int *)idaws;
> >         for (i = 0; i < idaw_nr; i++) {
> >                 if (cp->orb.cmd.c64)
> >                         pa->pa_iova[i] = idaws[i];
> > +               else
> > +                       pa->pa_iova[i] = idaws_f1[i];
> >         }
> >  
> >         if (ccw_does_data_transfer(ccw)) {
> > @@ -846,7 +855,11 @@ union orb *cp_get_orb(struct channel_program
> > *cp, struct subchannel *sch)
> >  
> >         /*
> >          * Everything built by vfio-ccw is a Format-2 IDAL.
> > +        * If the input was a Format-1 IDAL, indicate that
> > +        * 2K Format-2 IDAWs were created here.
> >          */
> > +       if (!orb->cmd.c64)
> > +               orb->cmd.i2k = 1;
> >         orb->cmd.c64 = 1;
> >  
> >         if (orb->cmd.lpm == 0)
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2022-12-19 21:57 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 21:40 [PATCH v1 00/16] vfio/ccw: channel program cleanup Eric Farman
2022-11-21 21:40 ` [PATCH v1 01/16] vfio/ccw: cleanup some of the mdev commentary Eric Farman
2022-11-22 16:12   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 02/16] vfio/ccw: simplify the cp_get_orb interface Eric Farman
2022-11-22 16:13   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 03/16] vfio/ccw: allow non-zero storage keys Eric Farman
2022-12-15 20:55   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 04/16] vfio/ccw: move where IDA flag is set in ORB Eric Farman
2022-12-15 20:55   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 05/16] vfio/ccw: replace copy_from_iova with vfio_dma_rw Eric Farman
2022-11-22  1:41   ` Jason Gunthorpe
2022-12-15 20:59   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 06/16] vfio/ccw: simplify CCW chain fetch routines Eric Farman
2022-12-15 21:18   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment Eric Farman
2022-12-16 20:10   ` Matthew Rosato
2022-12-19 16:22     ` Eric Farman
2022-11-21 21:40 ` [PATCH v1 08/16] vfio/ccw: pass page count to page_array struct Eric Farman
2022-12-16 19:59   ` Matthew Rosato
2022-12-19 16:22     ` Eric Farman
2022-11-21 21:40 ` [PATCH v1 09/16] vfio/ccw: populate page_array struct inline Eric Farman
2022-12-16 21:05   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 10/16] vfio/ccw: refactor the idaw counter Eric Farman
2022-12-19 19:16   ` Matthew Rosato
2022-12-19 19:31     ` Eric Farman
2022-12-19 19:40       ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 11/16] vfio/ccw: discard second fmt-1 IDAW Eric Farman
2022-12-19 19:27   ` Matthew Rosato
2022-12-19 20:27     ` Eric Farman
2022-11-21 21:40 ` [PATCH v1 12/16] vfio/ccw: calculate number of IDAWs regardless of format Eric Farman
2022-12-19 19:49   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 13/16] vfio/ccw: allocate/populate the guest idal Eric Farman
2022-12-19 20:14   ` Matthew Rosato
2022-12-19 21:00     ` Eric Farman
2022-11-21 21:40 ` [PATCH v1 14/16] vfio/ccw: handle a guest Format-1 IDAL Eric Farman
2022-12-19 20:29   ` Matthew Rosato
2022-12-19 21:04     ` Eric Farman
2022-11-21 21:40 ` [PATCH v1 15/16] vfio/ccw: don't group contiguous pages on 2K IDAWs Eric Farman
2022-12-19 20:40   ` Matthew Rosato
2022-11-21 21:40 ` [PATCH v1 16/16] vfio/ccw: remove old IDA format restrictions Eric Farman
2022-12-19 20:44   ` Matthew Rosato

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.