All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores
@ 2024-04-25 16:56 Gerd Bayer
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Gerd Bayer

Hi all,

this all started with a single patch by Ben to enable writing a user-mode
driver for a PCI device that requires 64bit register read/writes on s390.
A quick grep showed that there are several other drivers for PCI devices
in the kernel that use readq/writeq and eventually could use this, too.
So we decided to propose this for general inclusion.

A couple of suggestions for refactorizations by Jason Gunthorpe and Alex
Williamson later [1], I arrived at this little series that avoids some
code duplication and structures the different-size accesses in
vfio_pci_core_do_io_rw() in a way that the conditional compile of
8-byte accesses no longer creates an odd split of "else-if".

The initial version was tested with a PCI device on s390. This version
has only been tested for reads of 1..8 byte sizes and only been compile
tested for a 32bit architecture (arm).

Thank you,
Gerd Bayer

[1] https://lore.kernel.org/all/20240422153508.2355844-1-gbayer@linux.ibm.com/

Changes v2 -> v3:
- Introduce macro to generate body of different-size accesses in
  vfio_pci_core_do_io_rw (courtesy Alex Williamson).
- Convert if-else if chain to a switch-case construct to better
  accommodate conditional compiles.

Changes v1 -> v2:
- On non 64bit architecture use at most 32bit accesses in
  vfio_pci_core_do_io_rw and describe that in the commit message.
- Drop the run-time error on 32bit architectures.
- The #endif splitting the "else if" is not really fortunate, but I'm
  open to suggestions.

Ben Segal (1):
  vfio/pci: Support 8-byte PCI loads and stores

Gerd Bayer (2):
  vfio/pci: Extract duplicated code into macro
  vfio/pci: Continue to refactor vfio_pci_core_do_io_rw

 drivers/vfio/pci/vfio_pci_rdwr.c | 154 +++++++++++++++++--------------
 include/linux/vfio_pci_core.h    |   3 +
 2 files changed, 90 insertions(+), 67 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
@ 2024-04-25 16:56 ` Gerd Bayer
  2024-04-29 16:31   ` Alex Williamson
                     ` (2 more replies)
  2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
  2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
  2 siblings, 3 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Gerd Bayer

vfio_pci_core_do_io_rw() repeats the same code for multiple access
widths. Factor this out into a macro

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
 1 file changed, 46 insertions(+), 60 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 03b8f7ada1ac..3335f1b868b1 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -90,6 +90,40 @@ VFIO_IOREAD(8)
 VFIO_IOREAD(16)
 VFIO_IOREAD(32)
 
+#define VFIO_IORDWR(size)						\
+static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
+				bool iswrite, bool test_mem,		\
+				void __iomem *io, char __user *buf,	\
+				loff_t off, size_t *filled)		\
+{									\
+	u##size val;							\
+	int ret;							\
+									\
+	if (iswrite) {							\
+		if (copy_from_user(&val, buf, sizeof(val)))		\
+			return -EFAULT;					\
+									\
+		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
+						  val, io + off);	\
+		if (ret)						\
+			return ret;					\
+	} else {							\
+		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
+						 &val, io + off);	\
+		if (ret)						\
+			return ret;					\
+									\
+		if (copy_to_user(buf, &val, sizeof(val)))		\
+			return -EFAULT;					\
+	}								\
+									\
+	*filled = sizeof(val);						\
+	return 0;							\
+}									\
+
+VFIO_IORDWR(8)
+VFIO_IORDWR(16)
+VFIO_IORDWR(32)
 /*
  * Read or write from an __iomem region (MMIO or I/O port) with an excluded
  * range which is inaccessible.  The excluded range drops writes and fills
@@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
 			fillable = 0;
 
 		if (fillable >= 4 && !(off % 4)) {
-			u32 val;
-
-			if (iswrite) {
-				if (copy_from_user(&val, buf, 4))
-					return -EFAULT;
-
-				ret = vfio_pci_core_iowrite32(vdev, test_mem,
-							      val, io + off);
-				if (ret)
-					return ret;
-			} else {
-				ret = vfio_pci_core_ioread32(vdev, test_mem,
-							     &val, io + off);
-				if (ret)
-					return ret;
-
-				if (copy_to_user(buf, &val, 4))
-					return -EFAULT;
-			}
+			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
+						     io, buf, off, &filled);
+			if (ret)
+				return ret;
 
-			filled = 4;
 		} else if (fillable >= 2 && !(off % 2)) {
-			u16 val;
-
-			if (iswrite) {
-				if (copy_from_user(&val, buf, 2))
-					return -EFAULT;
-
-				ret = vfio_pci_core_iowrite16(vdev, test_mem,
-							      val, io + off);
-				if (ret)
-					return ret;
-			} else {
-				ret = vfio_pci_core_ioread16(vdev, test_mem,
-							     &val, io + off);
-				if (ret)
-					return ret;
-
-				if (copy_to_user(buf, &val, 2))
-					return -EFAULT;
-			}
+			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
+						     io, buf, off, &filled);
+			if (ret)
+				return ret;
 
-			filled = 2;
 		} else if (fillable) {
-			u8 val;
-
-			if (iswrite) {
-				if (copy_from_user(&val, buf, 1))
-					return -EFAULT;
-
-				ret = vfio_pci_core_iowrite8(vdev, test_mem,
-							     val, io + off);
-				if (ret)
-					return ret;
-			} else {
-				ret = vfio_pci_core_ioread8(vdev, test_mem,
-							    &val, io + off);
-				if (ret)
-					return ret;
-
-				if (copy_to_user(buf, &val, 1))
-					return -EFAULT;
-			}
+			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
+						    io, buf, off, &filled);
+			if (ret)
+				return ret;
 
-			filled = 1;
 		} else {
 			/* Fill reads with -1, drop writes */
 			filled = min(count, (size_t)(x_end - off));
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
@ 2024-04-25 16:56 ` Gerd Bayer
  2024-04-29 16:31   ` Alex Williamson
  2024-05-17 10:29   ` Ramesh Thomas
  2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
  2 siblings, 2 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal, Gerd Bayer

From: Ben Segal <bpsegal@us.ibm.com>

Many PCI adapters can benefit or even require full 64bit read
and write access to their registers. In order to enable work on
user-space drivers for these devices add two new variations
vfio_pci_core_io{read|write}64 of the existing access methods
when the architecture supports 64-bit ioreads and iowrites.

Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
 include/linux/vfio_pci_core.h    |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 3335f1b868b1..8ed06edaee23 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
 VFIO_IOREAD(8)
 VFIO_IOREAD(16)
 VFIO_IOREAD(32)
+#ifdef ioread64
+VFIO_IOREAD(64)
+#endif
 
 #define VFIO_IORDWR(size)						\
 static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
@@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
 VFIO_IORDWR(8)
 VFIO_IORDWR(16)
 VFIO_IORDWR(32)
+#if defined(ioread64) && defined(iowrite64)
+VFIO_IORDWR(64)
+#endif
+
 /*
  * Read or write from an __iomem region (MMIO or I/O port) with an excluded
  * range which is inaccessible.  The excluded range drops writes and fills
@@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
 		else
 			fillable = 0;
 
+#if defined(ioread64) && defined(iowrite64)
+		if (fillable >= 8 && !(off % 8)) {
+			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
+						     io, buf, off, &filled);
+			if (ret)
+				return ret;
+
+		} else
+#endif /* defined(ioread64) && defined(iowrite64) */
 		if (fillable >= 4 && !(off % 4)) {
 			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
 						     io, buf, off, &filled);
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index a2c8b8bba711..f4cf5fd2350c 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev,	\
 VFIO_IOREAD_DECLATION(8)
 VFIO_IOREAD_DECLATION(16)
 VFIO_IOREAD_DECLATION(32)
+#ifdef ioread64
+VFIO_IOREAD_DECLATION(64)
+#endif
 
 #endif /* VFIO_PCI_CORE_H */
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
  2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
  2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
@ 2024-04-25 16:56 ` Gerd Bayer
  2024-04-28  6:59   ` Tian, Kevin
                     ` (2 more replies)
  2 siblings, 3 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Gerd Bayer

Convert if-elseif-chain into switch-case.
Separate out and generalize the code from the if-clauses so the result
can be used in the switch statement.

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 8ed06edaee23..634c00b03c71 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -131,6 +131,20 @@ VFIO_IORDWR(32)
 VFIO_IORDWR(64)
 #endif
 
+static int fill_size(size_t fillable, loff_t off)
+{
+	unsigned int fill_size;
+#if defined(ioread64) && defined(iowrite64)
+	for (fill_size = 8; fill_size >= 0; fill_size /= 2) {
+#else
+	for (fill_size = 4; fill_size >= 0; fill_size /= 2) {
+#endif /* defined(ioread64) && defined(iowrite64) */
+		if (fillable >= fill_size && !(off % fill_size))
+			return fill_size;
+	}
+	return -1;
+}
+
 /*
  * Read or write from an __iomem region (MMIO or I/O port) with an excluded
  * range which is inaccessible.  The excluded range drops writes and fills
@@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
 		else
 			fillable = 0;
 
+		switch (fill_size(fillable, off)) {
 #if defined(ioread64) && defined(iowrite64)
-		if (fillable >= 8 && !(off % 8)) {
+		case 8:
 			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
 						     io, buf, off, &filled);
 			if (ret)
 				return ret;
+			break;
 
-		} else
 #endif /* defined(ioread64) && defined(iowrite64) */
-		if (fillable >= 4 && !(off % 4)) {
+		case 4:
 			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
 						     io, buf, off, &filled);
 			if (ret)
 				return ret;
+			break;
 
-		} else if (fillable >= 2 && !(off % 2)) {
+		case 2:
 			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
 						     io, buf, off, &filled);
 			if (ret)
 				return ret;
+			break;
 
-		} else if (fillable) {
+		case 1:
 			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
 						    io, buf, off, &filled);
 			if (ret)
 				return ret;
+			break;
 
-		} else {
+		default:
 			/* Fill reads with -1, drop writes */
 			filled = min(count, (size_t)(x_end - off));
 			if (!iswrite) {
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* RE: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
  2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
@ 2024-04-28  6:59   ` Tian, Kevin
  2024-04-29 16:32   ` Alex Williamson
  2024-05-17 11:41   ` Ramesh Thomas
  2 siblings, 0 replies; 25+ messages in thread
From: Tian, Kevin @ 2024-04-28  6:59 UTC (permalink / raw)
  To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess

> From: Gerd Bayer <gbayer@linux.ibm.com>
> Sent: Friday, April 26, 2024 12:56 AM
> 
> @@ -131,6 +131,20 @@ VFIO_IORDWR(32)
>  VFIO_IORDWR(64)
>  #endif
> 
> +static int fill_size(size_t fillable, loff_t off)
> +{
> +	unsigned int fill_size;
> +#if defined(ioread64) && defined(iowrite64)
> +	for (fill_size = 8; fill_size >= 0; fill_size /= 2) {
> +#else
> +	for (fill_size = 4; fill_size >= 0; fill_size /= 2) {
> +#endif /* defined(ioread64) && defined(iowrite64) */
> +		if (fillable >= fill_size && !(off % fill_size))
> +			return fill_size;
> +	}
> +	return -1;

this is unreachable with "fill_size >= 0" in the loop.

shouldn't it be:

#if defined(ioread64) && defined(iowrite64)
	for (fill_size = 8; fill_size > 0; fill_size /= 2) {
#else
	for (fill_size = 4; fill_size > 0; fill_size /= 2) {
#endif /* defined(ioread64) && defined(iowrite64) */
		if (fillable >= fill_size && !(off % fill_size))
			break;
	}
	return fill_size;


> 
> +		switch (fill_size(fillable, off)) {
>  #if defined(ioread64) && defined(iowrite64)
> -		if (fillable >= 8 && !(off % 8)) {
> +		case 8:
>  			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
>  						     io, buf, off, &filled);
>  			if (ret)
>  				return ret;
> +			break;

the check on 'ret' can be moved out of the switch to be generic.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
@ 2024-04-29 16:31   ` Alex Williamson
  2024-05-17 14:22     ` Gerd Bayer
  2024-04-29 20:09   ` Jason Gunthorpe
  2024-05-17 10:47   ` Ramesh Thomas
  2 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2024-04-29 16:31 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Thu, 25 Apr 2024 18:56:02 +0200
Gerd Bayer <gbayer@linux.ibm.com> wrote:

> vfio_pci_core_do_io_rw() repeats the same code for multiple access
> widths. Factor this out into a macro
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
>  1 file changed, 46 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 03b8f7ada1ac..3335f1b868b1 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
>  VFIO_IOREAD(16)
>  VFIO_IOREAD(32)
>  
> +#define VFIO_IORDWR(size)						\
> +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> +				bool iswrite, bool test_mem,		\
> +				void __iomem *io, char __user *buf,	\
> +				loff_t off, size_t *filled)		\

I realized later after proposing this that we should drop 'core' from
the name since the resulting functions are not currently exported.  It
also helps with the wordiness.  Thanks,

Alex

> +{									\
> +	u##size val;							\
> +	int ret;							\
> +									\
> +	if (iswrite) {							\
> +		if (copy_from_user(&val, buf, sizeof(val)))		\
> +			return -EFAULT;					\
> +									\
> +		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
> +						  val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +	} else {							\
> +		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
> +						 &val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +									\
> +		if (copy_to_user(buf, &val, sizeof(val)))		\
> +			return -EFAULT;					\
> +	}								\
> +									\
> +	*filled = sizeof(val);						\
> +	return 0;							\
> +}									\
> +
> +VFIO_IORDWR(8)
> +VFIO_IORDWR(16)
> +VFIO_IORDWR(32)
>  /*
>   * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>   * range which is inaccessible.  The excluded range drops writes and fills
> @@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>  			fillable = 0;
>  
>  		if (fillable >= 4 && !(off % 4)) {
> -			u32 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 4))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite32(vdev, test_mem,
> -							      val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread32(vdev, test_mem,
> -							     &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 4))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>  
> -			filled = 4;
>  		} else if (fillable >= 2 && !(off % 2)) {
> -			u16 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 2))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite16(vdev, test_mem,
> -							      val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread16(vdev, test_mem,
> -							     &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 2))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>  
> -			filled = 2;
>  		} else if (fillable) {
> -			u8 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 1))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite8(vdev, test_mem,
> -							     val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread8(vdev, test_mem,
> -							    &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 1))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
> +						    io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>  
> -			filled = 1;
>  		} else {
>  			/* Fill reads with -1, drop writes */
>  			filled = min(count, (size_t)(x_end - off));


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
@ 2024-04-29 16:31   ` Alex Williamson
  2024-05-21 15:50     ` Gerd Bayer
  2024-05-17 10:29   ` Ramesh Thomas
  1 sibling, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2024-04-29 16:31 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal

On Thu, 25 Apr 2024 18:56:03 +0200
Gerd Bayer <gbayer@linux.ibm.com> wrote:

> From: Ben Segal <bpsegal@us.ibm.com>
> 
> Many PCI adapters can benefit or even require full 64bit read
> and write access to their registers. In order to enable work on
> user-space drivers for these devices add two new variations
> vfio_pci_core_io{read|write}64 of the existing access methods
> when the architecture supports 64-bit ioreads and iowrites.
> 
> Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
> Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
>  include/linux/vfio_pci_core.h    |  3 +++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 3335f1b868b1..8ed06edaee23 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
>  VFIO_IOREAD(8)
>  VFIO_IOREAD(16)
>  VFIO_IOREAD(32)
> +#ifdef ioread64
> +VFIO_IOREAD(64)
> +#endif
>  
>  #define VFIO_IORDWR(size)						\
>  static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
>  VFIO_IORDWR(8)
>  VFIO_IORDWR(16)
>  VFIO_IORDWR(32)
> +#if defined(ioread64) && defined(iowrite64)
> +VFIO_IORDWR(64)
> +#endif
> +
>  /*
>   * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>   * range which is inaccessible.  The excluded range drops writes and fills
> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>  		else
>  			fillable = 0;
>  
> +#if defined(ioread64) && defined(iowrite64)

Nit, #ifdef vfio_pci_core_iordwr64

> +		if (fillable >= 8 && !(off % 8)) {
> +			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
> +
> +		} else
> +#endif /* defined(ioread64) && defined(iowrite64) */

AFAIK, the comment appended to the #endif is really only suggested when
the code block is too long to reasonable fit in a terminal.  That's no
longer the case with the new helper.  Thanks,

Alex

>  		if (fillable >= 4 && !(off % 4)) {
>  			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
>  						     io, buf, off, &filled);
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index a2c8b8bba711..f4cf5fd2350c 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev,	\
>  VFIO_IOREAD_DECLATION(8)
>  VFIO_IOREAD_DECLATION(16)
>  VFIO_IOREAD_DECLATION(32)
> +#ifdef ioread64
> +VFIO_IOREAD_DECLATION(64)
> +#endif
>  
>  #endif /* VFIO_PCI_CORE_H */


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
  2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
  2024-04-28  6:59   ` Tian, Kevin
@ 2024-04-29 16:32   ` Alex Williamson
  2024-05-21 16:43     ` Gerd Bayer
  2024-05-17 11:41   ` Ramesh Thomas
  2 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2024-04-29 16:32 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Thu, 25 Apr 2024 18:56:04 +0200
Gerd Bayer <gbayer@linux.ibm.com> wrote:

> Convert if-elseif-chain into switch-case.
> Separate out and generalize the code from the if-clauses so the result
> can be used in the switch statement.
> 
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 8ed06edaee23..634c00b03c71 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -131,6 +131,20 @@ VFIO_IORDWR(32)
>  VFIO_IORDWR(64)

#define MAX_FILL_SIZE 8
#else
#define MAX_FILL_SIZE 4

>  #endif
>  
> +static int fill_size(size_t fillable, loff_t off)
> +{
> +	unsigned int fill_size;

	unsigned int fill_size = MAX_FILL_SIZE;

> +#if defined(ioread64) && defined(iowrite64)
> +	for (fill_size = 8; fill_size >= 0; fill_size /= 2) {
> +#else
> +	for (fill_size = 4; fill_size >= 0; fill_size /= 2) {
> +#endif /* defined(ioread64) && defined(iowrite64) */
> +		if (fillable >= fill_size && !(off % fill_size))
> +			return fill_size;
> +	}
> +	return -1;
> +}
> +
>  /*
>   * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>   * range which is inaccessible.  The excluded range drops writes and fills
> @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>  		else
>  			fillable = 0;
>  
> +		switch (fill_size(fillable, off)) {
>  #if defined(ioread64) && defined(iowrite64)
> -		if (fillable >= 8 && !(off % 8)) {
> +		case 8:
>  			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
>  						     io, buf, off, &filled);
>  			if (ret)
>  				return ret;
> +			break;
>  
> -		} else

AFAICT, avoiding this dangling else within the #ifdef is really the
only tangible advantage of conversion to a switch statement.  Getting
rid of that alone while keeping, and actually increasing, the inline
ifdefs in the code doesn't seem worthwhile to me.  I'd probably only go
this route if we could make vfio_pci_iordwr64() stubbed as a BUG_ON
when we don't have the ioread64 and iowrite64 accessors, in which case
the switch helper would never return 8 and the function would be
unreachable.

>  #endif /* defined(ioread64) && defined(iowrite64) */
> -		if (fillable >= 4 && !(off % 4)) {
> +		case 4:
>  			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
>  						     io, buf, off, &filled);
>  			if (ret)
>  				return ret;
> +			break;
>  
> -		} else if (fillable >= 2 && !(off % 2)) {
> +		case 2:
>  			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
>  						     io, buf, off, &filled);
>  			if (ret)
>  				return ret;
> +			break;
>  
> -		} else if (fillable) {
> +		case 1:
>  			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
>  						    io, buf, off, &filled);
>  			if (ret)
>  				return ret;
> +			break;
>  
> -		} else {
> +		default:

This condition also seems a little more obfuscated without being
preceded by the 'if (fillable)' test, which might warrant handling
separate from the switch if we continue to pursue the switch
conversion.  Thanks,

Alex

>  			/* Fill reads with -1, drop writes */
>  			filled = min(count, (size_t)(x_end - off));
>  			if (!iswrite) {


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
  2024-04-29 16:31   ` Alex Williamson
@ 2024-04-29 20:09   ` Jason Gunthorpe
  2024-04-29 22:11     ` Alex Williamson
  2024-05-17 10:47   ` Ramesh Thomas
  2 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2024-04-29 20:09 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Alex Williamson, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote:
> vfio_pci_core_do_io_rw() repeats the same code for multiple access
> widths. Factor this out into a macro
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
>  1 file changed, 46 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 03b8f7ada1ac..3335f1b868b1 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
>  VFIO_IOREAD(16)
>  VFIO_IOREAD(32)
>  
> +#define VFIO_IORDWR(size)						\
> +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> +				bool iswrite, bool test_mem,		\
> +				void __iomem *io, char __user *buf,	\
> +				loff_t off, size_t *filled)		\
> +{									\
> +	u##size val;							\
> +	int ret;							\
> +									\
> +	if (iswrite) {							\
> +		if (copy_from_user(&val, buf, sizeof(val)))		\
> +			return -EFAULT;					\
> +									\
> +		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
> +						  val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +	} else {							\
> +		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
> +						 &val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +									\
> +		if (copy_to_user(buf, &val, sizeof(val)))		\
> +			return -EFAULT;					\
> +	}								\
> +									\
> +	*filled = sizeof(val);						\
> +	return 0;							\
> +}									\
> +
> +VFIO_IORDWR(8)
> +VFIO_IORDWR(16)
> +VFIO_IORDWR(32)

I'd suggest to try writing this without so many macros.

This isn't very performance optimal already, we take a lock on every
iteration, so there isn't much point in inlining multiple copies of
everything to save an branch.

Push the sizing switch down to the bottom, start with a function like:

static void __iowrite(const void *val, void __iomem *io, size_t len)
{
	switch (len) {
	case 8: {
#ifdef iowrite64 // NOTE this doesn't seem to work on x86?
		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
			return iowrite64be(*(const u64 *)val, io);
		return iowrite64(*(const u64 *)val, io);
#else
		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) {
			iowrite32be(*(const u32 *)val, io);
			iowrite32be(*(const u32 *)(val + 4), io + 4);
		} else {
			iowrite32(*(const u32 *)val, io);
			iowrite32(*(const u32 *)(val + 4), io + 4);
		}
		return;
#endif
	}

	case 4:
		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
			return iowrite32be(*(const u32 *)val, io);
		return iowrite32(*(const u32 *)val, io);
	case 2:
		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
			return iowrite16be(*(const u16 *)val, io);
		return iowrite16(*(const u16 *)val, io);

	case 1:
		return iowrite8(*(const u8 *)val, io);
	}
}

And then wrap it with the copy and the lock:

static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem,
		     const void __user *buf, void __iomem *io, size_t len,
		     bool iswrite)
{
	u64 val;

	if (iswrite && copy_from_user(&val, buf, len))
		return -EFAULT;

	if (test_mem) {
		down_read(&vdev->memory_lock);
		if (!__vfio_pci_memory_enabled(vdev)) {
			up_read(&vdev->memory_lock);
			return -EIO;
		}
	}

	if (iswrite)
		__iowrite(&val, io, len);
	else
		__ioread(&val, io, len);

	if (test_mem)
		up_read(&vdev->memory_lock);

	if (!iswrite && copy_to_user(buf, &val, len))
		return -EFAULT;

	return 0;
}

And then the loop can be simple:

		if (fillable) {
			filled = num_bytes(fillable, off);
			ret = do_iordwr(vdev, test_mem, buf, io + off, filled,
					iswrite);
			if (ret)
				return ret;
		} else {
			filled = min(count, (size_t)(x_end - off));
			/* Fill reads with -1, drop writes */
			ret = fill_err(buf, filled);
			if (ret)
				return ret;
		}

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-29 20:09   ` Jason Gunthorpe
@ 2024-04-29 22:11     ` Alex Williamson
  2024-04-29 22:33       ` Jason Gunthorpe
  2024-04-30  8:16       ` liulongfang
  0 siblings, 2 replies; 25+ messages in thread
From: Alex Williamson @ 2024-04-29 22:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Mon, 29 Apr 2024 17:09:10 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote:
> > vfio_pci_core_do_io_rw() repeats the same code for multiple access
> > widths. Factor this out into a macro
> > 
> > Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
> >  1 file changed, 46 insertions(+), 60 deletions(-)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 03b8f7ada1ac..3335f1b868b1 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
> >  VFIO_IOREAD(16)
> >  VFIO_IOREAD(32)
> >  
> > +#define VFIO_IORDWR(size)						\
> > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> > +				bool iswrite, bool test_mem,		\
> > +				void __iomem *io, char __user *buf,	\
> > +				loff_t off, size_t *filled)		\
> > +{									\
> > +	u##size val;							\
> > +	int ret;							\
> > +									\
> > +	if (iswrite) {							\
> > +		if (copy_from_user(&val, buf, sizeof(val)))		\
> > +			return -EFAULT;					\
> > +									\
> > +		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
> > +						  val, io + off);	\
> > +		if (ret)						\
> > +			return ret;					\
> > +	} else {							\
> > +		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
> > +						 &val, io + off);	\
> > +		if (ret)						\
> > +			return ret;					\
> > +									\
> > +		if (copy_to_user(buf, &val, sizeof(val)))		\
> > +			return -EFAULT;					\
> > +	}								\
> > +									\
> > +	*filled = sizeof(val);						\
> > +	return 0;							\
> > +}									\
> > +
> > +VFIO_IORDWR(8)
> > +VFIO_IORDWR(16)
> > +VFIO_IORDWR(32)  
> 
> I'd suggest to try writing this without so many macros.
> 
> This isn't very performance optimal already, we take a lock on every
> iteration, so there isn't much point in inlining multiple copies of
> everything to save an branch.

These macros are to reduce duplicate code blocks and the errors that
typically come from such duplication, as well as to provide type safe
functions in the spirit of the ioread# and iowrite# helpers.  It really
has nothing to do with, nor is it remotely effective at saving a branch.
Thanks,

Alex

> Push the sizing switch down to the bottom, start with a function like:
> 
> static void __iowrite(const void *val, void __iomem *io, size_t len)
> {
> 	switch (len) {
> 	case 8: {
> #ifdef iowrite64 // NOTE this doesn't seem to work on x86?
> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> 			return iowrite64be(*(const u64 *)val, io);
> 		return iowrite64(*(const u64 *)val, io);
> #else
> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) {
> 			iowrite32be(*(const u32 *)val, io);
> 			iowrite32be(*(const u32 *)(val + 4), io + 4);
> 		} else {
> 			iowrite32(*(const u32 *)val, io);
> 			iowrite32(*(const u32 *)(val + 4), io + 4);
> 		}
> 		return;
> #endif
> 	}
> 
> 	case 4:
> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> 			return iowrite32be(*(const u32 *)val, io);
> 		return iowrite32(*(const u32 *)val, io);
> 	case 2:
> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> 			return iowrite16be(*(const u16 *)val, io);
> 		return iowrite16(*(const u16 *)val, io);
> 
> 	case 1:
> 		return iowrite8(*(const u8 *)val, io);
> 	}
> }
> 
> And then wrap it with the copy and the lock:
> 
> static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem,
> 		     const void __user *buf, void __iomem *io, size_t len,
> 		     bool iswrite)
> {
> 	u64 val;
> 
> 	if (iswrite && copy_from_user(&val, buf, len))
> 		return -EFAULT;
> 
> 	if (test_mem) {
> 		down_read(&vdev->memory_lock);
> 		if (!__vfio_pci_memory_enabled(vdev)) {
> 			up_read(&vdev->memory_lock);
> 			return -EIO;
> 		}
> 	}
> 
> 	if (iswrite)
> 		__iowrite(&val, io, len);
> 	else
> 		__ioread(&val, io, len);
> 
> 	if (test_mem)
> 		up_read(&vdev->memory_lock);
> 
> 	if (!iswrite && copy_to_user(buf, &val, len))
> 		return -EFAULT;
> 
> 	return 0;
> }
> 
> And then the loop can be simple:
> 
> 		if (fillable) {
> 			filled = num_bytes(fillable, off);
> 			ret = do_iordwr(vdev, test_mem, buf, io + off, filled,
> 					iswrite);
> 			if (ret)
> 				return ret;
> 		} else {
> 			filled = min(count, (size_t)(x_end - off));
> 			/* Fill reads with -1, drop writes */
> 			ret = fill_err(buf, filled);
> 			if (ret)
> 				return ret;
> 		}
> 
> Jason
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-29 22:11     ` Alex Williamson
@ 2024-04-29 22:33       ` Jason Gunthorpe
  2024-05-21 15:47         ` Gerd Bayer
  2024-04-30  8:16       ` liulongfang
  1 sibling, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2024-04-29 22:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Mon, Apr 29, 2024 at 04:11:03PM -0600, Alex Williamson wrote:
> > This isn't very performance optimal already, we take a lock on every
> > iteration, so there isn't much point in inlining multiple copies of
> > everything to save an branch.
> 
> These macros are to reduce duplicate code blocks and the errors that
> typically come from such duplication, 

But there is still quite a bit of repetition here..

> as well as to provide type safe functions in the spirit of the
> ioread# and iowrite# helpers.

But it never really takes any advantage of type safety? It is making a
memcpy..

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-29 22:11     ` Alex Williamson
  2024-04-29 22:33       ` Jason Gunthorpe
@ 2024-04-30  8:16       ` liulongfang
  1 sibling, 0 replies; 25+ messages in thread
From: liulongfang @ 2024-04-30  8:16 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe
  Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On 2024/4/30 6:11, Alex Williamson wrote:
> On Mon, 29 Apr 2024 17:09:10 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
> 
>> On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote:
>>> vfio_pci_core_do_io_rw() repeats the same code for multiple access
>>> widths. Factor this out into a macro
>>>
>>> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
>>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>>> ---
>>>  drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
>>>  1 file changed, 46 insertions(+), 60 deletions(-)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
>>> index 03b8f7ada1ac..3335f1b868b1 100644
>>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
>>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
>>> @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
>>>  VFIO_IOREAD(16)
>>>  VFIO_IOREAD(32)
>>>  
>>> +#define VFIO_IORDWR(size)						\
>>> +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
>>> +				bool iswrite, bool test_mem,		\
>>> +				void __iomem *io, char __user *buf,	\
>>> +				loff_t off, size_t *filled)		\
>>> +{									\
>>> +	u##size val;							\
>>> +	int ret;							\
>>> +									\
>>> +	if (iswrite) {							\
>>> +		if (copy_from_user(&val, buf, sizeof(val)))		\
>>> +			return -EFAULT;					\
>>> +									\
>>> +		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
>>> +						  val, io + off);	\
>>> +		if (ret)						\
>>> +			return ret;					\
>>> +	} else {							\
>>> +		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
>>> +						 &val, io + off);	\
>>> +		if (ret)						\
>>> +			return ret;					\
>>> +									\
>>> +		if (copy_to_user(buf, &val, sizeof(val)))		\
>>> +			return -EFAULT;					\
>>> +	}								\
>>> +									\
>>> +	*filled = sizeof(val);						\
>>> +	return 0;							\
>>> +}									\
>>> +
>>> +VFIO_IORDWR(8)
>>> +VFIO_IORDWR(16)
>>> +VFIO_IORDWR(32)  
>>
>> I'd suggest to try writing this without so many macros.
>>
>> This isn't very performance optimal already, we take a lock on every
>> iteration, so there isn't much point in inlining multiple copies of
>> everything to save an branch.
> 
> These macros are to reduce duplicate code blocks and the errors that

Although simple and straightforward writing will result in more lines of code.
But it's not easy to squeeze in "extra" code.
The backdoor of "XZ Utils" is implanted through code complication.

Thanks.
Longfang.

> typically come from such duplication, as well as to provide type safe
> functions in the spirit of the ioread# and iowrite# helpers.  It really
> has nothing to do with, nor is it remotely effective at saving a branch.
> Thanks,
> 
> Alex
> 
>> Push the sizing switch down to the bottom, start with a function like:
>>
>> static void __iowrite(const void *val, void __iomem *io, size_t len)
>> {
>> 	switch (len) {
>> 	case 8: {
>> #ifdef iowrite64 // NOTE this doesn't seem to work on x86?
>> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
>> 			return iowrite64be(*(const u64 *)val, io);
>> 		return iowrite64(*(const u64 *)val, io);
>> #else
>> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) {
>> 			iowrite32be(*(const u32 *)val, io);
>> 			iowrite32be(*(const u32 *)(val + 4), io + 4);
>> 		} else {
>> 			iowrite32(*(const u32 *)val, io);
>> 			iowrite32(*(const u32 *)(val + 4), io + 4);
>> 		}
>> 		return;
>> #endif
>> 	}
>>
>> 	case 4:
>> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
>> 			return iowrite32be(*(const u32 *)val, io);
>> 		return iowrite32(*(const u32 *)val, io);
>> 	case 2:
>> 		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
>> 			return iowrite16be(*(const u16 *)val, io);
>> 		return iowrite16(*(const u16 *)val, io);
>>
>> 	case 1:
>> 		return iowrite8(*(const u8 *)val, io);
>> 	}
>> }
>>
>> And then wrap it with the copy and the lock:
>>
>> static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem,
>> 		     const void __user *buf, void __iomem *io, size_t len,
>> 		     bool iswrite)
>> {
>> 	u64 val;
>>
>> 	if (iswrite && copy_from_user(&val, buf, len))
>> 		return -EFAULT;
>>
>> 	if (test_mem) {
>> 		down_read(&vdev->memory_lock);
>> 		if (!__vfio_pci_memory_enabled(vdev)) {
>> 			up_read(&vdev->memory_lock);
>> 			return -EIO;
>> 		}
>> 	}
>>
>> 	if (iswrite)
>> 		__iowrite(&val, io, len);
>> 	else
>> 		__ioread(&val, io, len);
>>
>> 	if (test_mem)
>> 		up_read(&vdev->memory_lock);
>>
>> 	if (!iswrite && copy_to_user(buf, &val, len))
>> 		return -EFAULT;
>>
>> 	return 0;
>> }
>>
>> And then the loop can be simple:
>>
>> 		if (fillable) {
>> 			filled = num_bytes(fillable, off);
>> 			ret = do_iordwr(vdev, test_mem, buf, io + off, filled,
>> 					iswrite);
>> 			if (ret)
>> 				return ret;
>> 		} else {
>> 			filled = min(count, (size_t)(x_end - off));
>> 			/* Fill reads with -1, drop writes */
>> 			ret = fill_err(buf, filled);
>> 			if (ret)
>> 				return ret;
>> 		}
>>
>> Jason
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
  2024-04-29 16:31   ` Alex Williamson
@ 2024-05-17 10:29   ` Ramesh Thomas
  2024-05-20  9:02     ` Tian, Kevin
  2024-05-21 16:40     ` Gerd Bayer
  1 sibling, 2 replies; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-17 10:29 UTC (permalink / raw)
  To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal, Thomas, Ramesh

On 4/25/2024 9:56 AM, Gerd Bayer wrote:
> From: Ben Segal <bpsegal@us.ibm.com>
> 
> Many PCI adapters can benefit or even require full 64bit read
> and write access to their registers. In order to enable work on
> user-space drivers for these devices add two new variations
> vfio_pci_core_io{read|write}64 of the existing access methods
> when the architecture supports 64-bit ioreads and iowrites.

This is indeed necessary as back to back 32 bit may not be optimal in 
some devices.

> 
> Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
> Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>   drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
>   include/linux/vfio_pci_core.h    |  3 +++
>   2 files changed, 19 insertions(+)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 3335f1b868b1..8ed06edaee23 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
>   VFIO_IOREAD(8)
>   VFIO_IOREAD(16)
>   VFIO_IOREAD(32)
> +#ifdef ioread64
> +VFIO_IOREAD(64)
> +#endif
>   
>   #define VFIO_IORDWR(size)						\
>   static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
>   VFIO_IORDWR(8)
>   VFIO_IORDWR(16)
>   VFIO_IORDWR(32)
> +#if defined(ioread64) && defined(iowrite64)
> +VFIO_IORDWR(64)
> +#endif
> +
>   /*
>    * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>    * range which is inaccessible.  The excluded range drops writes and fills
> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>   		else
>   			fillable = 0;
>   
> +#if defined(ioread64) && defined(iowrite64)

Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and 
iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is 
defined and this check always fails. In include/asm-generic/io.h, 
asm-generic/iomap.h gets included which declares them as extern functions.

One more thing to consider io-64-nonatomic-hi-lo.h and 
io-64-nonatomic-lo-hi.h, if included would define it as a macro that 
calls a function that rw 32 bits back to back.

> +		if (fillable >= 8 && !(off % 8)) {
> +			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
> +
> +		} else
> +#endif /* defined(ioread64) && defined(iowrite64) */
>   		if (fillable >= 4 && !(off % 4)) {
>   			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
>   						     io, buf, off, &filled);
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index a2c8b8bba711..f4cf5fd2350c 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev,	\
>   VFIO_IOREAD_DECLATION(8)
>   VFIO_IOREAD_DECLATION(16)
>   VFIO_IOREAD_DECLATION(32)
> +#ifdef ioread64
> +VFIO_IOREAD_DECLATION(64)
nit: This macro is referenced only in this file. Can the typo be 
corrected (_DECLARATION)?

> +#endif
>   
>   #endif /* VFIO_PCI_CORE_H */


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
  2024-04-29 16:31   ` Alex Williamson
  2024-04-29 20:09   ` Jason Gunthorpe
@ 2024-05-17 10:47   ` Ramesh Thomas
  2 siblings, 0 replies; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-17 10:47 UTC (permalink / raw)
  To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Thomas, Ramesh

On 4/25/2024 9:56 AM, Gerd Bayer wrote:
> vfio_pci_core_do_io_rw() repeats the same code for multiple access
> widths. Factor this out into a macro
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>   drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-----------------
>   1 file changed, 46 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 03b8f7ada1ac..3335f1b868b1 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
>   VFIO_IOREAD(16)
>   VFIO_IOREAD(32)
>   
> +#define VFIO_IORDWR(size)						\
> +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\
> +				bool iswrite, bool test_mem,		\
> +				void __iomem *io, char __user *buf,	\
> +				loff_t off, size_t *filled)		\
> +{									\
> +	u##size val;							\
> +	int ret;							\
> +									\
> +	if (iswrite) {							\
> +		if (copy_from_user(&val, buf, sizeof(val)))		\

Another way to get the size is (size)/8
"if (copy_from_user(&val, buf, (size)/8))"

> +			return -EFAULT;					\
> +									\
> +		ret = vfio_pci_core_iowrite##size(vdev, test_mem,	\
> +						  val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +	} else {							\
> +		ret = vfio_pci_core_ioread##size(vdev, test_mem,	\
> +						 &val, io + off);	\
> +		if (ret)						\
> +			return ret;					\
> +									\
> +		if (copy_to_user(buf, &val, sizeof(val)))		\
> +			return -EFAULT;					\
> +	}								\
> +									\
> +	*filled = sizeof(val);						\
> +	return 0;							\
> +}									\
> +
> +VFIO_IORDWR(8)
> +VFIO_IORDWR(16)
> +VFIO_IORDWR(32)
>   /*
>    * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>    * range which is inaccessible.  The excluded range drops writes and fills
> @@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>   			fillable = 0;
>   
>   		if (fillable >= 4 && !(off % 4)) {
> -			u32 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 4))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite32(vdev, test_mem,
> -							      val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread32(vdev, test_mem,
> -							     &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 4))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>   
> -			filled = 4;
>   		} else if (fillable >= 2 && !(off % 2)) {
> -			u16 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 2))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite16(vdev, test_mem,
> -							      val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread16(vdev, test_mem,
> -							     &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 2))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
> +						     io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>   
> -			filled = 2;
>   		} else if (fillable) {
> -			u8 val;
> -
> -			if (iswrite) {
> -				if (copy_from_user(&val, buf, 1))
> -					return -EFAULT;
> -
> -				ret = vfio_pci_core_iowrite8(vdev, test_mem,
> -							     val, io + off);
> -				if (ret)
> -					return ret;
> -			} else {
> -				ret = vfio_pci_core_ioread8(vdev, test_mem,
> -							    &val, io + off);
> -				if (ret)
> -					return ret;
> -
> -				if (copy_to_user(buf, &val, 1))
> -					return -EFAULT;
> -			}
> +			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
> +						    io, buf, off, &filled);
> +			if (ret)
> +				return ret;
>   
> -			filled = 1;
>   		} else {
>   			/* Fill reads with -1, drop writes */
>   			filled = min(count, (size_t)(x_end - off));


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
  2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
  2024-04-28  6:59   ` Tian, Kevin
  2024-04-29 16:32   ` Alex Williamson
@ 2024-05-17 11:41   ` Ramesh Thomas
  2 siblings, 0 replies; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-17 11:41 UTC (permalink / raw)
  To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Thomas, Ramesh

On 4/25/2024 9:56 AM, Gerd Bayer wrote:
> Convert if-elseif-chain into switch-case.
> Separate out and generalize the code from the if-clauses so the result
> can be used in the switch statement.
> 
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>   drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------
>   1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 8ed06edaee23..634c00b03c71 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -131,6 +131,20 @@ VFIO_IORDWR(32)
>   VFIO_IORDWR(64)
>   #endif
>   
> +static int fill_size(size_t fillable, loff_t off)
> +{
> +	unsigned int fill_size;
> +#if defined(ioread64) && defined(iowrite64)
> +	for (fill_size = 8; fill_size >= 0; fill_size /= 2) {
> +#else
> +	for (fill_size = 4; fill_size >= 0; fill_size /= 2) {

0 anyway reaches default case, so loop condition can be "fill_size > 0" 
or at the start of function return 0 or -1 if fillable <= 0

> +#endif /* defined(ioread64) && defined(iowrite64) */
> +		if (fillable >= fill_size && !(off % fill_size))
> +			return fill_size;
> +	}
> +	return -1;
> +}
> +
>   /*
>    * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>    * range which is inaccessible.  The excluded range drops writes and fills
> @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
>   		else
>   			fillable = 0;
>   
> +		switch (fill_size(fillable, off)) {
>   #if defined(ioread64) && defined(iowrite64)
> -		if (fillable >= 8 && !(off % 8)) {
> +		case 8:
>   			ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem,
>   						     io, buf, off, &filled);
>   			if (ret)
>   				return ret;
This check and returning is common for all cases except default. Maybe 
ret can be initialized to 0 before the switch block and do the check and 
return after the switch block.

> +			break;
>   
> -		} else
>   #endif /* defined(ioread64) && defined(iowrite64) */
> -		if (fillable >= 4 && !(off % 4)) {
> +		case 4:
>   			ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem,
>   						     io, buf, off, &filled);
>   			if (ret)
>   				return ret;
> +			break;
>   
> -		} else if (fillable >= 2 && !(off % 2)) {
> +		case 2:
>   			ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem,
>   						     io, buf, off, &filled);
>   			if (ret)
>   				return ret;
> +			break;
>   
> -		} else if (fillable) {
> +		case 1:
>   			ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem,
>   						    io, buf, off, &filled);
>   			if (ret)
>   				return ret;
> +			break;
>   
> -		} else {
> +		default:
>   			/* Fill reads with -1, drop writes */
>   			filled = min(count, (size_t)(x_end - off));
>   			if (!iswrite) {


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-29 16:31   ` Alex Williamson
@ 2024-05-17 14:22     ` Gerd Bayer
  0 siblings, 0 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-05-17 14:22 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Mon, 2024-04-29 at 10:31 -0600, Alex Williamson wrote:
> On Thu, 25 Apr 2024 18:56:02 +0200
> Gerd Bayer <gbayer@linux.ibm.com> wrote:
> 
> > vfio_pci_core_do_io_rw() repeats the same code for multiple access
> > widths. Factor this out into a macro
> > 
> > Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++-------------
> > ----
> >  1 file changed, 46 insertions(+), 60 deletions(-)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c
> > b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 03b8f7ada1ac..3335f1b868b1 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -90,6 +90,40 @@ VFIO_IOREAD(8)
> >  VFIO_IOREAD(16)
> >  VFIO_IOREAD(32)
> >  
> > +#define
> > VFIO_IORDWR(size)						\
> > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device
> > *vdev,\
> > +				bool iswrite, bool
> > test_mem,		\
> > +				void __iomem *io, char __user
> > *buf,	\
> > +				loff_t off, size_t
> > *filled)		\
> 
> I realized later after proposing this that we should drop 'core' from
> the name since the resulting functions are not currently exported. 
> It also helps with the wordiness.  Thanks,
> 
> Alex
> 
> 
Sure that's easy enough.

Thanks, Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-17 10:29   ` Ramesh Thomas
@ 2024-05-20  9:02     ` Tian, Kevin
  2024-05-23  0:11       ` Ramesh Thomas
  2024-05-21 16:40     ` Gerd Bayer
  1 sibling, 1 reply; 25+ messages in thread
From: Tian, Kevin @ 2024-05-20  9:02 UTC (permalink / raw)
  To: Thomas, Ramesh, Gerd Bayer, Alex Williamson, Jason Gunthorpe,
	Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal, Thomas, Ramesh

> From: Ramesh Thomas <ramesh.thomas@intel.com>
> Sent: Friday, May 17, 2024 6:30 PM
> 
> On 4/25/2024 9:56 AM, Gerd Bayer wrote:
> > From: Ben Segal <bpsegal@us.ibm.com>
> >
> > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
> vfio_pci_core_device *vdev, bool test_mem,
> >   		else
> >   			fillable = 0;
> >	
> > +#if defined(ioread64) && defined(iowrite64)
> 
> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and
> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
> defined and this check always fails. In include/asm-generic/io.h,
> asm-generic/iomap.h gets included which declares them as extern functions.
> 
> One more thing to consider io-64-nonatomic-hi-lo.h and
> io-64-nonatomic-lo-hi.h, if included would define it as a macro that
> calls a function that rw 32 bits back to back.

I don't see the problem here. when the defined check fails it falls
back to back-to-back vfio_pci_core_iordwr32(). there is no need to
do it in an indirect way via including io-64-nonatomic-hi-lo.h.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro
  2024-04-29 22:33       ` Jason Gunthorpe
@ 2024-05-21 15:47         ` Gerd Bayer
  0 siblings, 0 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-05-21 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas,
	Halil Pasic, Julian Ruess

On Mon, 2024-04-29 at 19:33 -0300, Jason Gunthorpe wrote:
> On Mon, Apr 29, 2024 at 04:11:03PM -0600, Alex Williamson wrote:
> > > This isn't very performance optimal already, we take a lock on
> > > every
> > > iteration, so there isn't much point in inlining multiple copies
> > > of
> > > everything to save an branch.
> > 
> > These macros are to reduce duplicate code blocks and the errors
> > that typically come from such duplication, 
> 
> But there is still quite a bit of repetition here..

I appears like duplications, I agree - but the vfio_pci_core_ioreadX
and vfio_pci_core_iowriteX accessors are exported as such, or might be
reused by way of vfio_pci_iordwrX in vfio_pci_core_do_io_rw for
arbitrarily sized read/writes, too.

> > as well as to provide type safe functions in the spirit of the
> > ioread# and iowrite# helpers.
> 
> But it never really takes any advantage of type safety? It is making
> a memcpy..

At first, I was overwhelmed by the macro definitions, too. But after a
while I started to like the strict typing once the value came out of
memcpy or until it is memcpy'd.

> 
> Jason
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-04-29 16:31   ` Alex Williamson
@ 2024-05-21 15:50     ` Gerd Bayer
  0 siblings, 0 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-05-21 15:50 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal

On Mon, 2024-04-29 at 10:31 -0600, Alex Williamson wrote:
> On Thu, 25 Apr 2024 18:56:03 +0200
> Gerd Bayer <gbayer@linux.ibm.com> wrote:
> 
> > From: Ben Segal <bpsegal@us.ibm.com>
> > 
> > Many PCI adapters can benefit or even require full 64bit read
> > and write access to their registers. In order to enable work on
> > user-space drivers for these devices add two new variations
> > vfio_pci_core_io{read|write}64 of the existing access methods
> > when the architecture supports 64-bit ioreads and iowrites.
> > 
> > Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
> > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
> >  include/linux/vfio_pci_core.h    |  3 +++
> >  2 files changed, 19 insertions(+)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c
> > b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 3335f1b868b1..8ed06edaee23 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
> >  VFIO_IOREAD(8)
> >  VFIO_IOREAD(16)
> >  VFIO_IOREAD(32)
> > +#ifdef ioread64
> > +VFIO_IOREAD(64)
> > +#endif
> >  
> >  #define
> > VFIO_IORDWR(size)						\
> >  static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device
> > *vdev,\
> > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct
> > vfio_pci_core_device *vdev,\
> >  VFIO_IORDWR(8)
> >  VFIO_IORDWR(16)
> >  VFIO_IORDWR(32)
> > +#if defined(ioread64) && defined(iowrite64)
> > +VFIO_IORDWR(64)
> > +#endif
> > +
> >  /*
> >   * Read or write from an __iomem region (MMIO or I/O port) with an
> > excluded
> >   * range which is inaccessible.  The excluded range drops writes
> > and fills
> > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
> > vfio_pci_core_device *vdev, bool test_mem,
> >  		else
> >  			fillable = 0;
> >  
> > +#if defined(ioread64) && defined(iowrite64)
> 
> Nit, #ifdef vfio_pci_core_iordwr64

I'm sorry, but I'm not sure how I should check for the expanded symbol
here. I think I'll have to stick to checking the same condition as for
whether VFIO_IORDWR(64) should be expanded.

> > +		if (fillable >= 8 && !(off % 8)) {
> > +			ret = vfio_pci_core_iordwr64(vdev,
> > iswrite, test_mem,
> > +						     io, buf, off,
> > &filled);
> > +			if (ret)
> > +				return ret;
> > +
> > +		} else
> > +#endif /* defined(ioread64) && defined(iowrite64) */
> 
> AFAIK, the comment appended to the #endif is really only suggested
> when the code block is too long to reasonable fit in a terminal. 
> That's no longer the case with the new helper.

Yes, I'll change that.

> Thanks,
> 
> Alex
> 
> 

Thanks, Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-17 10:29   ` Ramesh Thomas
  2024-05-20  9:02     ` Tian, Kevin
@ 2024-05-21 16:40     ` Gerd Bayer
  2024-05-22 13:48       ` Jason Gunthorpe
  2024-05-22 23:57       ` Ramesh Thomas
  1 sibling, 2 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-05-21 16:40 UTC (permalink / raw)
  To: Ramesh Thomas, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal

On Fri, 2024-05-17 at 03:29 -0700, Ramesh Thomas wrote:
> On 4/25/2024 9:56 AM, Gerd Bayer wrote:
> > From: Ben Segal <bpsegal@us.ibm.com>
> > 
> > Many PCI adapters can benefit or even require full 64bit read
> > and write access to their registers. In order to enable work on
> > user-space drivers for these devices add two new variations
> > vfio_pci_core_io{read|write}64 of the existing access methods
> > when the architecture supports 64-bit ioreads and iowrites.
> 
> This is indeed necessary as back to back 32 bit may not be optimal in
> some devices.
> 
> > 
> > Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
> > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >   drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
> >   include/linux/vfio_pci_core.h    |  3 +++
> >   2 files changed, 19 insertions(+)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c
> > b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 3335f1b868b1..8ed06edaee23 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
> >   VFIO_IOREAD(8)
> >   VFIO_IOREAD(16)
> >   VFIO_IOREAD(32)
> > +#ifdef ioread64
> > +VFIO_IOREAD(64)
> > +#endif
> >   
> >   #define
> > VFIO_IORDWR(size)						\
> >   static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device
> > *vdev,\
> > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct
> > vfio_pci_core_device *vdev,\
> >   VFIO_IORDWR(8)
> >   VFIO_IORDWR(16)
> >   VFIO_IORDWR(32)
> > +#if defined(ioread64) && defined(iowrite64)
> > +VFIO_IORDWR(64)
> > +#endif
> > +
> >   /*
> >    * Read or write from an __iomem region (MMIO or I/O port) with
> > an excluded
> >    * range which is inaccessible.  The excluded range drops writes
> > and fills
> > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
> > vfio_pci_core_device *vdev, bool test_mem,
> >   		else
> >   			fillable = 0;
> >   
> > +#if defined(ioread64) && defined(iowrite64)
> 
> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and 
> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
> defined and this check always fails. In include/asm-generic/io.h, 
> asm-generic/iomap.h gets included which declares them as extern
> functions.

I thinks that should be possible - since ioread64/iowrite64 depend on
CONFIG_64BIT themselves.

> One more thing to consider io-64-nonatomic-hi-lo.h and 
> io-64-nonatomic-lo-hi.h, if included would define it as a macro that 
> calls a function that rw 32 bits back to back.

Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users
that register accesses will be done in the granularity they've thought
to use. The vfs layer may coalesce the accesses and this code will then
read/write the largest naturally aligned chunks. I've witnessed in my
testing that one device driver was doing 8-byte writes through the 8-
byte capable vfio layer all of a sudden when run in a KVM guest.

So higher-level code needs to consider how to split register accesses
appropriately to get the intended side-effects. Thus, I'd rather stay
away from splitting 64bit accesses into two 32bit accesses - and decide
if high or low order values should be written first.


> > +		if (fillable >= 8 && !(off % 8)) {
> > +			ret = vfio_pci_core_iordwr64(vdev,
> > iswrite, test_mem,
> > +						     io, buf, off,
> > &filled);
> > +			if (ret)
> > +				return ret;
> > +
> > +		} else
> > +#endif /* defined(ioread64) && defined(iowrite64) */
> >   		if (fillable >= 4 && !(off % 4)) {
> >   			ret = vfio_pci_core_iordwr32(vdev,
> > iswrite, test_mem,
> >   						     io, buf, off,
> > &filled);
> > diff --git a/include/linux/vfio_pci_core.h
> > b/include/linux/vfio_pci_core.h
> > index a2c8b8bba711..f4cf5fd2350c 100644
> > --- a/include/linux/vfio_pci_core.h
> > +++ b/include/linux/vfio_pci_core.h
> > @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct
> > vfio_pci_core_device *vdev,	\
> >   VFIO_IOREAD_DECLATION(8)
> >   VFIO_IOREAD_DECLATION(16)
> >   VFIO_IOREAD_DECLATION(32)
> > +#ifdef ioread64
> > +VFIO_IOREAD_DECLATION(64)
> nit: This macro is referenced only in this file. Can the typo be 
> corrected (_DECLARATION)?

Sure thanks for pointing this out!
I'll single this editorial change out into a separate patch of the
series, though.

> 
> > +#endif
> >   
> >   #endif /* VFIO_PCI_CORE_H */
> 
> 

Thanks, Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
  2024-04-29 16:32   ` Alex Williamson
@ 2024-05-21 16:43     ` Gerd Bayer
  0 siblings, 0 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-05-21 16:43 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal,
	Yishai Hadas, Halil Pasic, Julian Ruess

On Mon, 2024-04-29 at 10:32 -0600, Alex Williamson wrote:
> On Thu, 25 Apr 2024 18:56:04 +0200
> Gerd Bayer <gbayer@linux.ibm.com> wrote:
> 
> > Convert if-elseif-chain into switch-case.
> > Separate out and generalize the code from the if-clauses so the
> > result
> > can be used in the switch statement.
> > 
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++----
> > --
> >  1 file changed, 24 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c
> > b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 8ed06edaee23..634c00b03c71 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -131,6 +131,20 @@ VFIO_IORDWR(32)
> >  VFIO_IORDWR(64)
> 
> #define MAX_FILL_SIZE 8
> #else
> #define MAX_FILL_SIZE 4
> 
> >  #endif
> >  
> > +static int fill_size(size_t fillable, loff_t off)
> > +{
> > +	unsigned int fill_size;
> 
> 	unsigned int fill_size = MAX_FILL_SIZE;
> 
> > +#if defined(ioread64) && defined(iowrite64)
> > +	for (fill_size = 8; fill_size >= 0; fill_size /= 2) {
> > +#else
> > +	for (fill_size = 4; fill_size >= 0; fill_size /= 2) {
> > +#endif /* defined(ioread64) && defined(iowrite64) */
> > +		if (fillable >= fill_size && !(off % fill_size))
> > +			return fill_size;
> > +	}
> > +	return -1;
> > +}
> > +
> >  /*
> >   * Read or write from an __iomem region (MMIO or I/O port) with an
> > excluded
> >   * range which is inaccessible.  The excluded range drops writes
> > and fills
> > @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct
> > vfio_pci_core_device *vdev, bool test_mem,
> >  		else
> >  			fillable = 0;
> >  
> > +		switch (fill_size(fillable, off)) {
> >  #if defined(ioread64) && defined(iowrite64)
> > -		if (fillable >= 8 && !(off % 8)) {
> > +		case 8:
> >  			ret = vfio_pci_core_iordwr64(vdev,
> > iswrite, test_mem,
> >  						     io, buf, off,
> > &filled);
> >  			if (ret)
> >  				return ret;
> > +			break;
> >  
> > -		} else
> 
> AFAICT, avoiding this dangling else within the #ifdef is really the
> only tangible advantage of conversion to a switch statement.  Getting
> rid of that alone while keeping, and actually increasing, the inline
> ifdefs in the code doesn't seem worthwhile to me.  I'd probably only
> go this route if we could make vfio_pci_iordwr64() stubbed as a
> BUG_ON when we don't have the ioread64 and iowrite64 accessors, in
> which case the switch helper would never return 8 and the function
> would be unreachable.
> 
> >  #endif /* defined(ioread64) && defined(iowrite64) */
> > -		if (fillable >= 4 && !(off % 4)) {
> > +		case 4:
> >  			ret = vfio_pci_core_iordwr32(vdev,
> > iswrite, test_mem,
> >  						     io, buf, off,
> > &filled);
> >  			if (ret)
> >  				return ret;
> > +			break;
> >  
> > -		} else if (fillable >= 2 && !(off % 2)) {
> > +		case 2:
> >  			ret = vfio_pci_core_iordwr16(vdev,
> > iswrite, test_mem,
> >  						     io, buf, off,
> > &filled);
> >  			if (ret)
> >  				return ret;
> > +			break;
> >  
> > -		} else if (fillable) {
> > +		case 1:
> >  			ret = vfio_pci_core_iordwr8(vdev, iswrite,
> > test_mem,
> >  						    io, buf, off,
> > &filled);
> >  			if (ret)
> >  				return ret;
> > +			break;
> >  
> > -		} else {
> > +		default:
> 
> This condition also seems a little more obfuscated without being
> preceded by the 'if (fillable)' test, which might warrant handling
> separate from the switch if we continue to pursue the switch
> conversion.  Thanks,
> 
> Alex
> 
> >  			/* Fill reads with -1, drop writes */
> >  			filled = min(count, (size_t)(x_end -
> > off));
> >  			if (!iswrite) {
> 
> 

Well, overall this sounds like it creates more headaches than it tries
to solve - and that is a strong hint to not do it.

I'll drop this further refactoring in the next version.

Thanks, Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-21 16:40     ` Gerd Bayer
@ 2024-05-22 13:48       ` Jason Gunthorpe
  2024-05-22 23:57       ` Ramesh Thomas
  1 sibling, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2024-05-22 13:48 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Ramesh Thomas, Alex Williamson, Niklas Schnelle, kvm, linux-s390,
	Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess,
	Ben Segal

On Tue, May 21, 2024 at 06:40:13PM +0200, Gerd Bayer wrote:
> > > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
> > > vfio_pci_core_device *vdev, bool test_mem,
> > >   		else
> > >   			fillable = 0;
> > >   
> > > +#if defined(ioread64) && defined(iowrite64)
> > 
> > Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and 
> > iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
> > defined and this check always fails. In include/asm-generic/io.h, 
> > asm-generic/iomap.h gets included which declares them as extern
> > functions.
> 
> I thinks that should be possible - since ioread64/iowrite64 depend on
> CONFIG_64BIT themselves.
> 
> > One more thing to consider io-64-nonatomic-hi-lo.h and 
> > io-64-nonatomic-lo-hi.h, if included would define it as a macro that 
> > calls a function that rw 32 bits back to back.

This might be a better way to go than trying to have vfio provide its
own emulation.

> Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users
> that register accesses will be done in the granularity they've thought
> to use. The vfs layer may coalesce the accesses and this code will then
> read/write the largest naturally aligned chunks. I've witnessed in my
> testing that one device driver was doing 8-byte writes through the 8-
> byte capable vfio layer all of a sudden when run in a KVM guest.

Sure, KVM has emulation for various byte sizes, and does invoke vfio
with the raw size it got from guest, including larger than 8 sizes
from things like SSE instructions. This has nothing to do with the VFS
layer.

> So higher-level code needs to consider how to split register accesses
> appropriately to get the intended side-effects. Thus, I'd rather stay
> away from splitting 64bit accesses into two 32bit accesses - and decide
> if high or low order values should be written first.

The VFIO API is a byte for byte memcpy. VFIO should try to do the
largest single instruction accesses it knows how to do because some HW
is sensitive to that. Otherwise it does a memcpy loop.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-21 16:40     ` Gerd Bayer
  2024-05-22 13:48       ` Jason Gunthorpe
@ 2024-05-22 23:57       ` Ramesh Thomas
  1 sibling, 0 replies; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-22 23:57 UTC (permalink / raw)
  To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal

On 5/21/2024 9:40 AM, Gerd Bayer wrote:
> On Fri, 2024-05-17 at 03:29 -0700, Ramesh Thomas wrote:
>> On 4/25/2024 9:56 AM, Gerd Bayer wrote:
>>> From: Ben Segal <bpsegal@us.ibm.com>
>>>
>>> Many PCI adapters can benefit or even require full 64bit read
>>> and write access to their registers. In order to enable work on
>>> user-space drivers for these devices add two new variations
>>> vfio_pci_core_io{read|write}64 of the existing access methods
>>> when the architecture supports 64-bit ioreads and iowrites.
>>
>> This is indeed necessary as back to back 32 bit may not be optimal in
>> some devices.
>>
>>>
>>> Signed-off-by: Ben Segal <bpsegal@us.ibm.com>
>>> Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com>
>>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>>> ---
>>>    drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++
>>>    include/linux/vfio_pci_core.h    |  3 +++
>>>    2 files changed, 19 insertions(+)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c
>>> b/drivers/vfio/pci/vfio_pci_rdwr.c
>>> index 3335f1b868b1..8ed06edaee23 100644
>>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
>>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
>>> @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size);
>>>    VFIO_IOREAD(8)
>>>    VFIO_IOREAD(16)
>>>    VFIO_IOREAD(32)
>>> +#ifdef ioread64
>>> +VFIO_IOREAD(64)
>>> +#endif
>>>    
>>>    #define
>>> VFIO_IORDWR(size)						\
>>>    static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device
>>> *vdev,\
>>> @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct
>>> vfio_pci_core_device *vdev,\
>>>    VFIO_IORDWR(8)
>>>    VFIO_IORDWR(16)
>>>    VFIO_IORDWR(32)
>>> +#if defined(ioread64) && defined(iowrite64)
>>> +VFIO_IORDWR(64)
>>> +#endif
>>> +
>>>    /*
>>>     * Read or write from an __iomem region (MMIO or I/O port) with
>>> an excluded
>>>     * range which is inaccessible.  The excluded range drops writes
>>> and fills
>>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
>>> vfio_pci_core_device *vdev, bool test_mem,
>>>    		else
>>>    			fillable = 0;
>>>    
>>> +#if defined(ioread64) && defined(iowrite64)
>>
>> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and
>> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
>> defined and this check always fails. In include/asm-generic/io.h,
>> asm-generic/iomap.h gets included which declares them as extern
>> functions.
> 
> I thinks that should be possible - since ioread64/iowrite64 depend on
> CONFIG_64BIT themselves.
> 
>> One more thing to consider io-64-nonatomic-hi-lo.h and
>> io-64-nonatomic-lo-hi.h, if included would define it as a macro that
>> calls a function that rw 32 bits back to back.
> 
> Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users
> that register accesses will be done in the granularity they've thought
> to use. The vfs layer may coalesce the accesses and this code will then
> read/write the largest naturally aligned chunks. I've witnessed in my
> testing that one device driver was doing 8-byte writes through the 8-
> byte capable vfio layer all of a sudden when run in a KVM guest.
> 
> So higher-level code needs to consider how to split register accesses
> appropriately to get the intended side-effects. Thus, I'd rather stay
> away from splitting 64bit accesses into two 32bit accesses - and decide
> if high or low order values should be written first.

Sorry, I was not clear. The main reason to include 
io-64-nonatomic-hi-lo.h or io-64-nonatomic-lo-hi.h is to get the 
iowrite64 and ioread64 macros defined mapping to functions that 
implement them. They define and map them to generic functions in 
lib/iomap.c The 64 bit rw functions (readq/writeq) get called if 
present. If any architecture has not implemented them, then it maps them 
to functions that do 32 bit back to back.

> 
> 
>>> +		if (fillable >= 8 && !(off % 8)) {
>>> +			ret = vfio_pci_core_iordwr64(vdev,
>>> iswrite, test_mem,
>>> +						     io, buf, off,
>>> &filled);
>>> +			if (ret)
>>> +				return ret;
>>> +
>>> +		} else
>>> +#endif /* defined(ioread64) && defined(iowrite64) */
>>>    		if (fillable >= 4 && !(off % 4)) {
>>>    			ret = vfio_pci_core_iordwr32(vdev,
>>> iswrite, test_mem,
>>>    						     io, buf, off,
>>> &filled);
>>> diff --git a/include/linux/vfio_pci_core.h
>>> b/include/linux/vfio_pci_core.h
>>> index a2c8b8bba711..f4cf5fd2350c 100644
>>> --- a/include/linux/vfio_pci_core.h
>>> +++ b/include/linux/vfio_pci_core.h
>>> @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct
>>> vfio_pci_core_device *vdev,	\
>>>    VFIO_IOREAD_DECLATION(8)
>>>    VFIO_IOREAD_DECLATION(16)
>>>    VFIO_IOREAD_DECLATION(32)
>>> +#ifdef ioread64
>>> +VFIO_IOREAD_DECLATION(64)
>> nit: This macro is referenced only in this file. Can the typo be
>> corrected (_DECLARATION)?
> 
> Sure thanks for pointing this out!
> I'll single this editorial change out into a separate patch of the
> series, though.
> 
>>
>>> +#endif
>>>    
>>>    #endif /* VFIO_PCI_CORE_H */
>>
>>
> 
> Thanks, Gerd


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-20  9:02     ` Tian, Kevin
@ 2024-05-23  0:11       ` Ramesh Thomas
  2024-05-23 21:52         ` Ramesh Thomas
  0 siblings, 1 reply; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-23  0:11 UTC (permalink / raw)
  To: Tian, Kevin, Gerd Bayer, Alex Williamson, Jason Gunthorpe,
	Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal

On 5/20/2024 2:02 AM, Tian, Kevin wrote:
>> From: Ramesh Thomas <ramesh.thomas@intel.com>
>> Sent: Friday, May 17, 2024 6:30 PM
>>
>> On 4/25/2024 9:56 AM, Gerd Bayer wrote:
>>> From: Ben Segal <bpsegal@us.ibm.com>
>>>
>>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
>> vfio_pci_core_device *vdev, bool test_mem,
>>>    		else
>>>    			fillable = 0;
>>> 	
>>> +#if defined(ioread64) && defined(iowrite64)
>>
>> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and
>> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
>> defined and this check always fails. In include/asm-generic/io.h,
>> asm-generic/iomap.h gets included which declares them as extern functions.
>>
>> One more thing to consider io-64-nonatomic-hi-lo.h and
>> io-64-nonatomic-lo-hi.h, if included would define it as a macro that
>> calls a function that rw 32 bits back to back.
> 
> I don't see the problem here. when the defined check fails it falls
> back to back-to-back vfio_pci_core_iordwr32(). there is no need to
> do it in an indirect way via including io-64-nonatomic-hi-lo.h.

The issue is iowrite64 and iowrite64 was not getting defined when 
CONFIG_GENERIC_IOMAP was not defined, even though the architecture 
implemented the 64 bit rw functions readq and writeq. 
io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h define them and map 
them to generic implementations in lib/iomap.c. The implementation calls 
the 64 bit rw functions if present, otherwise does 32 bit back to back 
rw. Besides it also has sanity checks for port numbers in the iomap 
path. I think it is better to rely on this existing generic method than 
implementing the checks at places where iowrite64 and ioread64 get 
called, at least in the IOMAP path.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores
  2024-05-23  0:11       ` Ramesh Thomas
@ 2024-05-23 21:52         ` Ramesh Thomas
  0 siblings, 0 replies; 25+ messages in thread
From: Ramesh Thomas @ 2024-05-23 21:52 UTC (permalink / raw)
  To: Tian, Kevin, Gerd Bayer, Alex Williamson, Jason Gunthorpe,
	Niklas Schnelle
  Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
	Julian Ruess, Ben Segal

On 5/22/2024 5:11 PM, Ramesh Thomas wrote:
> On 5/20/2024 2:02 AM, Tian, Kevin wrote:
>>> From: Ramesh Thomas <ramesh.thomas@intel.com>
>>> Sent: Friday, May 17, 2024 6:30 PM
>>>
>>> On 4/25/2024 9:56 AM, Gerd Bayer wrote:
>>>> From: Ben Segal <bpsegal@us.ibm.com>
>>>>
>>>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct
>>> vfio_pci_core_device *vdev, bool test_mem,
>>>>            else
>>>>                fillable = 0;
>>>>
>>>> +#if defined(ioread64) && defined(iowrite64)
>>>
>>> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and
>>> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is
>>> defined and this check always fails. In include/asm-generic/io.h,
>>> asm-generic/iomap.h gets included which declares them as extern 
>>> functions.
>>>
>>> One more thing to consider io-64-nonatomic-hi-lo.h and
>>> io-64-nonatomic-lo-hi.h, if included would define it as a macro that
>>> calls a function that rw 32 bits back to back.
>>
>> I don't see the problem here. when the defined check fails it falls
>> back to back-to-back vfio_pci_core_iordwr32(). there is no need to
>> do it in an indirect way via including io-64-nonatomic-hi-lo.h.
> 
> The issue is iowrite64 and iowrite64 was not getting defined when 
> CONFIG_GENERIC_IOMAP was not defined, even though the architecture 

Sorry, I meant they were not getting defined when CONFIG_GENERIC_IOMAP 
*was defined*. The only definitions of ioread64 and iowrite64 in the 
code path are in asm/io.h where they are surrounded by #ifndef 
CONFIG_GENERIC_IOMAP

> implemented the 64 bit rw functions readq and writeq. 
> io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h define them and map 
> them to generic implementations in lib/iomap.c. The implementation calls 
> the 64 bit rw functions if present, otherwise does 32 bit back to back 
> rw. Besides it also has sanity checks for port numbers in the iomap 
> path. I think it is better to rely on this existing generic method than 
> implementing the checks at places where iowrite64 and ioread64 get 
> called, at least in the IOMAP path.


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-05-23 21:53 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
2024-04-29 16:31   ` Alex Williamson
2024-05-17 14:22     ` Gerd Bayer
2024-04-29 20:09   ` Jason Gunthorpe
2024-04-29 22:11     ` Alex Williamson
2024-04-29 22:33       ` Jason Gunthorpe
2024-05-21 15:47         ` Gerd Bayer
2024-04-30  8:16       ` liulongfang
2024-05-17 10:47   ` Ramesh Thomas
2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer
2024-04-29 16:31   ` Alex Williamson
2024-05-21 15:50     ` Gerd Bayer
2024-05-17 10:29   ` Ramesh Thomas
2024-05-20  9:02     ` Tian, Kevin
2024-05-23  0:11       ` Ramesh Thomas
2024-05-23 21:52         ` Ramesh Thomas
2024-05-21 16:40     ` Gerd Bayer
2024-05-22 13:48       ` Jason Gunthorpe
2024-05-22 23:57       ` Ramesh Thomas
2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer
2024-04-28  6:59   ` Tian, Kevin
2024-04-29 16:32   ` Alex Williamson
2024-05-21 16:43     ` Gerd Bayer
2024-05-17 11:41   ` Ramesh Thomas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.