* [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-03 10:04 ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Shunsuke Mie, Frank Li, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

This patchset introduces a virtio-net EP device function. It provides a new
option to communicate between a PCIe host and an endpoint over IP. The
advantage of this option is that the driver fully uses the PCIe embedded DMA
to transport data directly between the virtio rings on both sides, so better
throughput can be expected.
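
The data path is built on the standard dmaengine slave API. The following is
only a rough sketch of a single copy between a local buffer and a host-side
PCI address; the real code is epf_vnet_dma_single() in patch 4, and the names
chan, host_pci_addr, local_dma_addr and len here are illustrative, with error
handling and completion callbacks omitted:

  struct dma_slave_config sconf = {};
  struct dma_async_tx_descriptor *desc;

  /* Program the remote (PCI) address, then prepare a slave transfer. */
  sconf.dst_addr = host_pci_addr;
  dmaengine_slave_config(chan, &sconf);

  desc = dmaengine_prep_slave_single(chan, local_dma_addr, len,
                                     DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
  dmaengine_submit(desc);
  dma_async_issue_pending(chan);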

To realize the function, this patchset makes a few changes to virtio and
introduces new virtio-related APIs to the PCI EP framework. Furthermore, it
depends on the following patchsets that are still under discussion:
- [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
- [RFC PATCH 0/3] Deal with alignment restriction on EP side
link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
- [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/

This patchset consists of 4 patches. The first two patches are small changes
to virtio. The third patch adds APIs to easily access virtio data structures
located in PCIe host side memory. The last one introduces the virtio-net EP
device function. Details are in the respective commit messages.

Currently those network devices have been tested with ping only. I'll add
performance evaluation results using iperf and other tools in a future
version of this patchset.

Shunsuke Mie (4):
  virtio_pci: add a definition of queue flag in ISR
  virtio_ring: remove const from vring getter
  PCI: endpoint: Introduce virtio library for EP functions
  PCI: endpoint: function: Add EP function driver to provide virtio net
    device

 drivers/pci/endpoint/Kconfig                  |   7 +
 drivers/pci/endpoint/Makefile                 |   1 +
 drivers/pci/endpoint/functions/Kconfig        |  12 +
 drivers/pci/endpoint/functions/Makefile       |   1 +
 .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
 .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
 drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
 drivers/virtio/virtio_ring.c                  |   2 +-
 include/linux/pci-epf-virtio.h                |  25 +
 include/linux/virtio.h                        |   2 +-
 include/uapi/linux/virtio_pci.h               |   2 +
 13 files changed, 1590 insertions(+), 2 deletions(-)
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
 create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
 create mode 100644 include/linux/pci-epf-virtio.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 1/4] virtio_pci: add a definition of queue flag in ISR
  2023-02-03 10:04 ` Shunsuke Mie
@ 2023-02-03 10:04   ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Shunsuke Mie, Frank Li, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

The config changed flag of the ISR is already defined, but the queue flag is
not. Add a macro for it.
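
For reference only (not part of this patch), the bit is meant to be used the
same way as the existing config bit: the endpoint function sets it in the ISR
area before raising a legacy interrupt, and the driver side checks it when it
reads (and thereby clears) the ISR. The names isr, vq and vdev below are
illustrative:

  /* Device (EP) side: mark a used-ring update, then raise a legacy IRQ. */
  iowrite8(ioread8(isr) | VIRTIO_PCI_ISR_QUEUE, isr);
  pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
                    PCI_EPC_IRQ_LEGACY, 0);

  /* Driver side: dispatch based on the ISR bits. */
  u8 status = ioread8(isr);
  if (status & VIRTIO_PCI_ISR_QUEUE)
          vring_interrupt(0, vq);
  if (status & VIRTIO_PCI_ISR_CONFIG)
          virtio_config_changed(vdev);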

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Takanari Hayama <taki@igel.co.jp>
---
 include/uapi/linux/virtio_pci.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index f703afc7ad31..fa82afd6171a 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -94,6 +94,8 @@
 
 #endif /* VIRTIO_PCI_NO_LEGACY */
 
+/* The bit of the ISR which indicates a queue entry update. */
+#define VIRTIO_PCI_ISR_QUEUE		0x1
 /* The bit of the ISR which indicates a device configuration change. */
 #define VIRTIO_PCI_ISR_CONFIG		0x2
 /* Vector value used to disable MSI for queue */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [RFC PATCH 2/4] virtio_ring: remove const from vring getter
  2023-02-03 10:04 ` Shunsuke Mie
@ 2023-02-03 10:04   ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Shunsuke Mie, Frank Li, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

There are several ways to manage a virtio ring in the Linux kernel, e.g.
vhost and vringh. Remove const from the getter so that the vring can be
controlled with other APIs, such as vringh.
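
For example, the EP-side code in this series lays a vringh over the vring of
an already created virtqueue, which requires non-const desc/avail/used
pointers. A minimal sketch, using the same call form as patch 4 (which relies
on the vringh changes referenced in the cover letter); vq, features and vrh
are illustrative:

  struct vring *vring = virtqueue_get_vring(vq);
  struct vringh vrh;

  vringh_init_kern(&vrh, features, virtqueue_get_vring_size(vq), true,
                   GFP_KERNEL, vring->desc, vring->avail, vring->used);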

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Takanari Hayama <taki@igel.co.jp>
---
 drivers/virtio/virtio_ring.c | 2 +-
 include/linux/virtio.h       | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 2e7689bb933b..aa0c455d402b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2857,7 +2857,7 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *_vq)
 EXPORT_SYMBOL_GPL(virtqueue_get_used_addr);
 
 /* Only available for split ring */
-const struct vring *virtqueue_get_vring(struct virtqueue *vq)
+struct vring *virtqueue_get_vring(struct virtqueue *vq)
 {
 	return &to_vvq(vq)->split.vring;
 }
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index dcab9c7e8784..83530b7bc2e9 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -88,7 +88,7 @@ unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
 
 bool virtqueue_is_broken(struct virtqueue *vq);
 
-const struct vring *virtqueue_get_vring(struct virtqueue *vq);
+struct vring *virtqueue_get_vring(struct virtqueue *vq);
 dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [RFC PATCH 3/4] PCI: endpoint: Introduce virtio library for EP functions
  2023-02-03 10:04 ` Shunsuke Mie
@ 2023-02-03 10:04   ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Shunsuke Mie, Frank Li, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

Add a new library to access a virtio ring located in PCIe host memory. The
library allocates a struct pci_epf_vringh, which is introduced in this patch.
The struct has a vringh member, so the vringh APIs can be used to access the
virtio ring.
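
A rough usage sketch from an EP function driver (epf, features, pfn and size
are illustrative, riov/wiov are vringh_kiov structures initialized beforehand
with vringh_kiov_init(), and the accessors follow the vringh series
referenced in the cover letter):

  struct pci_epf_vringh *evrh;
  u16 head;

  evrh = pci_epf_virtio_alloc_vringh(epf, features, pfn, size);
  if (IS_ERR(evrh))
          return PTR_ERR(evrh);

  /* Pop one descriptor chain from the host-side ring and complete it. */
  if (vringh_getdesc(&evrh->vrh, &riov, &wiov, &head) > 0) {
          /* ... access the buffers described by riov/wiov ... */
          vringh_complete(&evrh->vrh, head, vringh_kiov_length(&riov));
  }

  pci_epf_virtio_free_vringh(epf, evrh);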

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Takanari Hayama <taki@igel.co.jp>
---
 drivers/pci/endpoint/Kconfig          |   7 ++
 drivers/pci/endpoint/Makefile         |   1 +
 drivers/pci/endpoint/pci-epf-virtio.c | 113 ++++++++++++++++++++++++++
 include/linux/pci-epf-virtio.h        |  25 ++++++
 4 files changed, 146 insertions(+)
 create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
 create mode 100644 include/linux/pci-epf-virtio.h

diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
index 17bbdc9bbde0..07276dcc43c8 100644
--- a/drivers/pci/endpoint/Kconfig
+++ b/drivers/pci/endpoint/Kconfig
@@ -28,6 +28,13 @@ config PCI_ENDPOINT_CONFIGFS
 	   configure the endpoint function and used to bind the
 	   function with a endpoint controller.
 
+config PCI_ENDPOINT_VIRTIO
+	tristate
+	depends on PCI_ENDPOINT
+	select VHOST_IOMEM
+	help
+	  Virtio support library for PCI Endpoint function drivers.
+
 source "drivers/pci/endpoint/functions/Kconfig"
 
 endmenu
diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
index 95b2fe47e3b0..95712f0a13d1 100644
--- a/drivers/pci/endpoint/Makefile
+++ b/drivers/pci/endpoint/Makefile
@@ -4,5 +4,6 @@
 #
 
 obj-$(CONFIG_PCI_ENDPOINT_CONFIGFS)	+= pci-ep-cfs.o
+obj-$(CONFIG_PCI_ENDPOINT_VIRTIO)	+= pci-epf-virtio.o
 obj-$(CONFIG_PCI_ENDPOINT)		+= pci-epc-core.o pci-epf-core.o\
 					   pci-epc-mem.o functions/
diff --git a/drivers/pci/endpoint/pci-epf-virtio.c b/drivers/pci/endpoint/pci-epf-virtio.c
new file mode 100644
index 000000000000..7134ca407a03
--- /dev/null
+++ b/drivers/pci/endpoint/pci-epf-virtio.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtio library for PCI Endpoint function
+ */
+#include <linux/kernel.h>
+#include <linux/pci-epf-virtio.h>
+#include <linux/pci-epc.h>
+#include <linux/virtio_pci.h>
+
+static void __iomem *epf_virtio_map_vq(struct pci_epf *epf, u32 pfn,
+				       size_t size, phys_addr_t *vq_phys)
+{
+	int err;
+	phys_addr_t vq_addr;
+	size_t vq_size;
+	void __iomem *vq_virt;
+
+	vq_addr = (phys_addr_t)pfn << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+	vq_size = vring_size(size, VIRTIO_PCI_VRING_ALIGN) + 100;
+
+	vq_virt = pci_epc_mem_alloc_addr(epf->epc, vq_phys, vq_size);
+	if (!vq_virt) {
+		pr_err("Failed to allocate epc memory\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, *vq_phys,
+			       vq_addr, vq_size);
+	if (err) {
+		pr_err("Failed to map virtqueue to local memory\n");
+		goto err_free;
+	}
+
+	return vq_virt;
+
+err_free:
+	pci_epc_mem_free_addr(epf->epc, *vq_phys, vq_virt, vq_size);
+
+	return ERR_PTR(err);
+}
+
+static void epf_virtio_unmap_vq(struct pci_epf *epf, void __iomem *vq_virt,
+				phys_addr_t vq_phys, size_t size)
+{
+	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, vq_phys);
+	pci_epc_mem_free_addr(epf->epc, vq_phys, vq_virt,
+			      vring_size(size, VIRTIO_PCI_VRING_ALIGN));
+}
+
+/**
+ * pci_epf_virtio_alloc_vringh() - allocate epf vringh from @pfn
+ * @epf: the EPF device that communicates with the host virtio driver
+ * @features: the virtio features of the device
+ * @pfn: page frame number of the virtqueue located on host memory. It is
+ *		passed during virtqueue negotiation.
+ * @size: the length of the virtqueue
+ */
+struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
+						   u64 features, u32 pfn,
+						   size_t size)
+{
+	int err;
+	struct vring vring;
+	struct pci_epf_vringh *evrh;
+
+	evrh = kmalloc(sizeof(*evrh), GFP_KERNEL);
+	if (!evrh)
+		return ERR_PTR(-ENOMEM);
+
+	evrh->size = size;
+
+	evrh->virt = epf_virtio_map_vq(epf, pfn, size, &evrh->phys);
+	if (IS_ERR(evrh->virt)) {
+		err = PTR_ERR(evrh->virt);
+		goto err_free_evrh;
+	}
+
+	vring_init(&vring, size, evrh->virt, VIRTIO_PCI_VRING_ALIGN);
+
+	err = vringh_init_iomem(&evrh->vrh, features, size, false, GFP_KERNEL,
+				vring.desc, vring.avail, vring.used);
+	if (err)
+		goto err_unmap_vq;
+
+	return evrh;
+
+err_unmap_vq:
+	epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
+
+err_free_evrh:
+	kfree(evrh);
+
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(pci_epf_virtio_alloc_vringh);
+
+/**
+ * pci_epf_virtio_free_vringh() - release allocated epf vring
+ * @epf: the EPF device that communicates with the host virtio driver
+ * @evrh: epf vringh to free
+ */
+void pci_epf_virtio_free_vringh(struct pci_epf *epf,
+				struct pci_epf_vringh *evrh)
+{
+	epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
+	kfree(evrh);
+}
+EXPORT_SYMBOL_GPL(pci_epf_virtio_free_vringh);
+
+MODULE_DESCRIPTION("PCI EP Virtio Library");
+MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/pci-epf-virtio.h b/include/linux/pci-epf-virtio.h
new file mode 100644
index 000000000000..ae09087919a9
--- /dev/null
+++ b/include/linux/pci-epf-virtio.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * PCI Endpoint Function (EPF) for virtio definitions
+ */
+#ifndef __LINUX_PCI_EPF_VIRTIO_H
+#define __LINUX_PCI_EPF_VIRTIO_H
+
+#include <linux/types.h>
+#include <linux/vringh.h>
+#include <linux/pci-epf.h>
+
+struct pci_epf_vringh {
+	struct vringh vrh;
+	void __iomem *virt;
+	phys_addr_t phys;
+	size_t size;
+};
+
+struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
+						   u64 features, u32 pfn,
+						   size_t size);
+void pci_epf_virtio_free_vringh(struct pci_epf *epf,
+				struct pci_epf_vringh *evrh);
+
+#endif // __LINUX_PCI_EPF_VIRTIO_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:04 ` Shunsuke Mie
@ 2023-02-03 10:04   ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Shunsuke Mie, Frank Li, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

Add a new endpoint (EP) function driver that provides a virtio-net device.
This function not only exposes a virtio-net device to the PCIe host system,
but also provides a virtio-net device to the EP side (local) system. Those
network devices are virtually connected, so they can be used to communicate
over IP like a simple NIC.

The architecture overview is as follows:

to Host       |	                to Endpoint
network stack |                 network stack
      |       |                       |
+-----------+ |	+-----------+   +-----------+
|virtio-net | |	|virtio-net |   |virtio-net |
|driver     | |	|EP function|---|driver     |
+-----------+ |	+-----------+   +-----------+
      |       |	      |
+-----------+ | +-----------+
|PCIeC      | | |PCIeC      |
|Rootcomplex|-|-|Endpoint   |
+-----------+ | +-----------+
  Host side   |          Endpoint side

This driver uses the PCIe EP framework to expose a virtio-net (PCI) device
to the host side, and generates a virtual virtio-net device and registers it
on the EP side. Communication data is transported directly at the virtqueue
level between the two sides using the PCIe embedded DMA controller.
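
For example, forwarding in the host -> EP direction simply drains the host TX
vring into the local RX vring with a transfer helper; this mirrors
epf_vnet_rc_tx_handler() in this patch, and the EP -> host direction is
symmetric:

  while (epf_vnet_transfer(vnet, &vnet->rc.txvrh->vrh, &vnet->ep.rxvrh,
                           &vnet->rc.tx_iov, &vnet->ep.rx_iov,
                           DMA_DEV_TO_MEM) > 0)
          ;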

Due to limitations of the hardware and the Linux EP framework, this function
follows the virtio legacy specification.
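
Concretely, the function emulates the legacy register layout in BAR0 using
the existing definitions from include/uapi/linux/virtio_pci.h (listed here
for reference only):

  VIRTIO_PCI_HOST_FEATURES	/* device features */
  VIRTIO_PCI_QUEUE_PFN		/* virtqueue address (page frame number) */
  VIRTIO_PCI_QUEUE_NUM		/* virtqueue size */
  VIRTIO_PCI_QUEUE_SEL		/* virtqueue selector */
  VIRTIO_PCI_QUEUE_NOTIFY	/* queue notify */
  VIRTIO_PCI_STATUS		/* device status */
  VIRTIO_PCI_ISR		/* interrupt status */
  VIRTIO_PCI_CONFIG_OFF(false)	/* start of struct virtio_net_config */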

This function driver has beed tested on S4 Rcar (r8a779fa-spider) board but
just use the PCIe EP framework and depends on the PCIe EDMA.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Takanari Hayama <taki@igel.co.jp>
---
 drivers/pci/endpoint/functions/Kconfig        |  12 +
 drivers/pci/endpoint/functions/Makefile       |   1 +
 .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
 .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
 6 files changed, 1440 insertions(+)
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h

diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
index 9fd560886871..f88d8baaf689 100644
--- a/drivers/pci/endpoint/functions/Kconfig
+++ b/drivers/pci/endpoint/functions/Kconfig
@@ -37,3 +37,15 @@ config PCI_EPF_VNTB
 	  between PCI Root Port and PCIe Endpoint.
 
 	  If in doubt, say "N" to disable Endpoint NTB driver.
+
+config PCI_EPF_VNET
+	tristate "PCI Endpoint virtio-net driver"
+	depends on PCI_ENDPOINT
+	select PCI_ENDPOINT_VIRTIO
+	select VHOST_RING
+	select VHOST_IOMEM
+	help
+	  PCIe Endpoint virtio-net function implementation. This module exposes
+	  a virtio-net PCI device to the PCIe host side and another virtio-net
+	  device to the local machine. Those devices can communicate with each
+	  other.
diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
index 5c13001deaba..74cc4c330c62 100644
--- a/drivers/pci/endpoint/functions/Makefile
+++ b/drivers/pci/endpoint/functions/Makefile
@@ -6,3 +6,4 @@
 obj-$(CONFIG_PCI_EPF_TEST)		+= pci-epf-test.o
 obj-$(CONFIG_PCI_EPF_NTB)		+= pci-epf-ntb.o
 obj-$(CONFIG_PCI_EPF_VNTB) 		+= pci-epf-vntb.o
+obj-$(CONFIG_PCI_EPF_VNET)		+= pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
new file mode 100644
index 000000000000..93b7e00e8d06
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Functions for the endpoint (local) side using the EPF framework
+ */
+#include <linux/pci-epc.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
+
+#include "pci-epf-vnet.h"
+
+static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
+{
+	return container_of(vdev, struct epf_vnet, ep.vdev);
+}
+
+static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
+{
+	vnet->ep.net_config_status |= status;
+}
+
+static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
+{
+	vnet->ep.net_config_status &= ~status;
+}
+
+static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
+{
+	virtio_config_changed(&vnet->ep.vdev);
+}
+
+void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
+{
+	epf_vnet_ep_set_status(vnet,
+			       VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
+	epf_vnet_ep_raise_config_irq(vnet);
+}
+
+void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
+{
+	vring_interrupt(0, vq);
+}
+
+static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
+{
+	struct vringh *vrh = &vnet->ep.ctlvrh;
+	struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
+	struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
+	struct virtio_net_ctrl_hdr *hdr;
+	virtio_net_ctrl_ack *ack;
+	int err;
+	u16 head;
+	size_t len;
+
+	err = vringh_getdesc(vrh, riov, wiov, &head);
+	if (err <= 0)
+		return err;
+
+	len = vringh_kiov_length(riov);
+	if (len < sizeof(*hdr)) {
+		pr_debug("Command is too short: %ld\n", len);
+		err = -EIO;
+		goto done;
+	}
+
+	if (vringh_kiov_length(wiov) < sizeof(*ack)) {
+		pr_debug("Space for ack is not enough\n");
+		err = -EIO;
+		goto done;
+	}
+
+	hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
+	ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
+
+	switch (hdr->class) {
+	case VIRTIO_NET_CTRL_ANNOUNCE:
+		if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+			pr_debug("Invalid command: announce: %d\n", hdr->cmd);
+			goto done;
+		}
+
+		epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
+		*ack = VIRTIO_NET_OK;
+		break;
+	default:
+		pr_debug("Found not supported class: %d\n", hdr->class);
+		err = -EIO;
+	}
+
+done:
+	vringh_complete(vrh, head, len);
+	return err;
+}
+
+static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	return vnet->virtio_features;
+}
+
+static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	if (vdev->features != vnet->virtio_features)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
+					unsigned int offset, void *buf,
+					unsigned int len)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+	const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
+	const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
+	unsigned int copy_len;
+
+	switch (offset) {
+	case offsetof(struct virtio_net_config, mac):
+		/* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
+		 * clear the buffer.
+		 */
+		copy_len = len >= mac_len ? mac_len : len;
+		memset(buf, 0x00, copy_len);
+		len -= copy_len;
+		buf += copy_len;
+		fallthrough;
+	case offsetof(struct virtio_net_config, status):
+		copy_len = len >= status_len ? status_len : len;
+		memcpy(buf, &vnet->ep.net_config_status, copy_len);
+		len -= copy_len;
+		buf += copy_len;
+		fallthrough;
+	default:
+		if (offset > sizeof(vnet->vnet_cfg)) {
+			memset(buf, 0x00, len);
+			break;
+		}
+		memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
+	}
+}
+
+static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
+					unsigned int offset, const void *buf,
+					unsigned int len)
+{
+	/* Do nothing, because all of virtio net config space is readonly. */
+}
+
+static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
+{
+	return 0;
+}
+
+static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	if (status & VIRTIO_CONFIG_S_DRIVER_OK)
+		epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
+}
+
+static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
+{
+	pr_debug("reset is not supported yet\n");
+}
+
+static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
+	struct vringh *tx_vrh = &vnet->ep.txvrh;
+	struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
+	struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
+	struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
+	int err;
+
+	/* Support only one queue pair */
+	switch (vq->index) {
+	case 0: // rx queue
+		break;
+	case 1: // tx queue
+		while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
+						rx_iov, DMA_MEM_TO_DEV)) > 0)
+			;
+		if (err < 0)
+			pr_debug("Failed to transmit: EP -> Host: %d\n", err);
+		break;
+	case 2: // control queue
+		epf_vnet_ep_process_ctrlq_entry(vnet);
+		break;
+	default:
+		return false;
+	}
+
+	return true;
+}
+
+static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
+				     unsigned int nvqs, struct virtqueue *vqs[],
+				     vq_callback_t *callback[],
+				     const char *const names[], const bool *ctx,
+				     struct irq_affinity *desc)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+	const size_t vq_size = epf_vnet_get_vq_size();
+	int i;
+	int err;
+	int qidx;
+
+	for (qidx = 0, i = 0; i < nvqs; i++) {
+		struct virtqueue *vq;
+		struct vring *vring;
+		struct vringh *vrh;
+
+		if (!names[i]) {
+			vqs[i] = NULL;
+			continue;
+		}
+
+		vq = vring_create_virtqueue(qidx++, vq_size,
+					    VIRTIO_PCI_VRING_ALIGN, vdev, true,
+					    false, ctx ? ctx[i] : false,
+					    epf_vnet_ep_vdev_vq_notify,
+					    callback[i], names[i]);
+		if (!vq) {
+			err = -ENOMEM;
+			goto err_del_vqs;
+		}
+
+		vqs[i] = vq;
+		vring = virtqueue_get_vring(vq);
+
+		switch (i) {
+		case 0: // rx
+			vrh = &vnet->ep.rxvrh;
+			vnet->ep.rxvq = vq;
+			break;
+		case 1: // tx
+			vrh = &vnet->ep.txvrh;
+			vnet->ep.txvq = vq;
+			break;
+		case 2: // control
+			vrh = &vnet->ep.ctlvrh;
+			vnet->ep.ctlvq = vq;
+			break;
+		default:
+			err = -EIO;
+			goto err_del_vqs;
+		}
+
+		err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
+				       true, GFP_KERNEL, vring->desc,
+				       vring->avail, vring->used);
+		if (err) {
+			pr_err("failed to init vringh for vring %d\n", i);
+			goto err_del_vqs;
+		}
+	}
+
+	err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
+	if (err)
+		goto err_free_kiov;
+
+	return 0;
+
+err_free_kiov:
+	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
+
+err_del_vqs:
+	for (; i >= 0; i--) {
+		if (!names[i])
+			continue;
+
+		if (!vqs[i])
+			continue;
+
+		vring_del_virtqueue(vqs[i]);
+	}
+	return err;
+}
+
+static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
+{
+	struct virtqueue *vq, *n;
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
+		vring_del_virtqueue(vq);
+
+	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
+}
+
+static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
+	.get_features = epf_vnet_ep_vdev_get_features,
+	.finalize_features = epf_vnet_ep_vdev_finalize_features,
+	.get = epf_vnet_ep_vdev_get_config,
+	.set = epf_vnet_ep_vdev_set_config,
+	.get_status = epf_vnet_ep_vdev_get_status,
+	.set_status = epf_vnet_ep_vdev_set_status,
+	.reset = epf_vnet_ep_vdev_reset,
+	.find_vqs = epf_vnet_ep_vdev_find_vqs,
+	.del_vqs = epf_vnet_ep_vdev_del_vqs,
+};
+
+void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
+{
+	unregister_virtio_device(&vnet->ep.vdev);
+}
+
+int epf_vnet_ep_setup(struct epf_vnet *vnet)
+{
+	int err;
+	struct virtio_device *vdev = &vnet->ep.vdev;
+
+	vdev->dev.parent = vnet->epf->epc->dev.parent;
+	vdev->config = &epf_vnet_ep_vdev_config_ops;
+	vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
+	vdev->id.device = VIRTIO_ID_NET;
+
+	err = register_virtio_device(vdev);
+	if (err)
+		return err;
+
+	return 0;
+}
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
new file mode 100644
index 000000000000..2ca0245a9134
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
@@ -0,0 +1,635 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Functions for the PCIe host (remote) side using the EPF framework.
+ */
+#include <linux/pci-epf.h>
+#include <linux/pci-epc.h>
+#include <linux/pci_ids.h>
+#include <linux/sched.h>
+#include <linux/virtio_pci.h>
+
+#include "pci-epf-vnet.h"
+
+#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
+
+/* Returns the number of queues, also used as an out-of-range queue index. */
+static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
+
+{
+	/* number of queue pairs and control queue */
+	return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
+}
+
+static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
+				      void *buf, size_t len)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	memcpy_toio(base, buf, len);
+}
+
+static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
+				    u8 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite8(ioread8(base) | config, base);
+}
+
+static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
+				     u16 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite16(ioread16(base) | config, base);
+}
+
+static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
+				       u16 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite16(ioread16(base) & ~config, base);
+}
+
+static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
+				     u32 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite32(ioread32(base) | config, base);
+}
+
+static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
+{
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
+	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
+}
+
+void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
+{
+	epf_vnet_rc_set_config16(vnet,
+				 VIRTIO_PCI_CONFIG_OFF(false) +
+					 offsetof(struct virtio_net_config,
+						  status),
+				 VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
+	epf_vnet_rc_raise_config_irq(vnet);
+}
+
+/*
+ * For the PCIe host, this driver exposes a legacy virtio-net device. The
+ * virtio structure PCI capabilities are mandatory for a modern virtio device,
+ * but there is no PCIe EP hardware that can be configured with arbitrary PCI
+ * capabilities, and the Linux PCIe EP framework doesn't support them.
+ */
+static struct pci_epf_header epf_vnet_pci_header = {
+	.vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
+	.deviceid = VIRTIO_TRANS_ID_NET,
+	.subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
+	.subsys_id = VIRTIO_ID_NET,
+	.revid = 0,
+	.baseclass_code = PCI_BASE_CLASS_NETWORK,
+	.interrupt_pin = PCI_INTERRUPT_PIN,
+};
+
+static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
+				      void __iomem *cfg_base)
+{
+	u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
+
+	epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
+				 vnet->virtio_features);
+
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
+	/*
+	 * Initialize the queue notify and selector registers to a value outside
+	 * the valid virtqueue index range. This is used to detect changes by
+	 * polling, since there is no other way to detect the host side driver
+	 * updating those values.
+	 */
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
+	/* This pfn is also set to 0 for the polling as well */
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
+
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
+				 epf_vnet_get_vq_size());
+	epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
+	epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
+				  &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
+}
+
+static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
+{
+	struct pci_epf *epf = vnet->epf;
+
+	pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
+			  &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
+	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
+			   PRIMARY_INTERFACE);
+}
+
+static int epf_vnet_setup_bar(struct epf_vnet *vnet)
+{
+	int err;
+	size_t cfg_bar_size =
+		VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
+	struct pci_epf *epf = vnet->epf;
+	const struct pci_epc_features *features;
+	struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
+
+	features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
+	if (!features) {
+		pr_debug("Failed to get PCI EPC features\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
+		pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
+		if (cfg_bar_size >
+		    features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
+			pr_debug("PCI BAR size is not enough\n");
+			return -ENOMEM;
+		}
+	}
+
+	config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
+
+	vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
+						VIRTIO_NET_LEGACY_CFG_BAR,
+						features->align,
+						PRIMARY_INTERFACE);
+	if (!vnet->rc.cfg_base) {
+		pr_debug("Failed to allocate virtio-net config memory\n");
+		return -ENOMEM;
+	}
+
+	epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
+
+	err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
+			      config_bar);
+	if (err) {
+		pr_debug("Failed to set PCI BAR");
+		goto err_free_space;
+	}
+
+	return 0;
+
+err_free_space:
+	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
+			   PRIMARY_INTERFACE);
+	return err;
+}
+
+static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
+					 u32 *rxpfn, u32 *ctlpfn)
+{
+	const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
+	const u16 default_sel = nqueues;
+	u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
+	u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
+	u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
+	u32 pfn;
+	u16 sel;
+	struct {
+		u32 pfn;
+		u16 sel;
+	} tmp[3] = {};
+	int tmp_index = 0;
+
+	*rxpfn = *txpfn = *ctlpfn = 0;
+
+	/* To avoid missing the pfn and selector values written for each
+	 * virtqueue by the host driver, we need fast polling that saves each
+	 * value as soon as it is seen.
+	 *
+	 * This implementation assumes that the host driver writes the pfn only
+	 * once for each queue.
+	 */
+	while (tmp_index < nqueues) {
+		pfn = ioread32(queue_pfn);
+		if (pfn == 0)
+			continue;
+
+		iowrite32(0, queue_pfn);
+
+		sel = ioread16(queue_sel);
+		if (sel == default_sel)
+			continue;
+
+		tmp[tmp_index].pfn = pfn;
+		tmp[tmp_index].sel = sel;
+		tmp_index++;
+	}
+
+	while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
+		;
+
+	for (int i = 0; i < nqueues; ++i) {
+		switch (tmp[i].sel) {
+		case 0:
+			*rxpfn = tmp[i].pfn;
+			break;
+		case 1:
+			*txpfn = tmp[i].pfn;
+			break;
+		case 2:
+			*ctlpfn = tmp[i].pfn;
+			break;
+		}
+	}
+
+	if (!*rxpfn || !*txpfn || !*ctlpfn)
+		return -EIO;
+
+	return 0;
+}
+
+static int epf_vnet_rc_monitor_notify(void *data)
+{
+	struct epf_vnet *vnet = data;
+	u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
+	const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
+
+	epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
+
+	/* Poll to detect a change of the queue_notify register. Sometimes this
+	 * polling misses a change, so check every virtqueue each time.
+	 */
+	while (true) {
+		while (ioread16(queue_notify) == notify_default)
+			;
+		iowrite16(notify_default, queue_notify);
+
+		queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
+		queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
+	}
+
+	return 0;
+}
+
+static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
+{
+	vnet->rc.notify_monitor_task =
+		kthread_create(epf_vnet_rc_monitor_notify, vnet,
+			       "pci-epf-vnet/cfg_negotiator");
+	if (IS_ERR(vnet->rc.notify_monitor_task))
+		return PTR_ERR(vnet->rc.notify_monitor_task);
+
+	/* Change the thread priority to high for polling. */
+	sched_set_fifo(vnet->rc.notify_monitor_task);
+	wake_up_process(vnet->rc.notify_monitor_task);
+
+	return 0;
+}
+
+static int epf_vnet_rc_device_setup(void *data)
+{
+	struct epf_vnet *vnet = data;
+	struct pci_epf *epf = vnet->epf;
+	u32 txpfn, rxpfn, ctlpfn;
+	const size_t vq_size = epf_vnet_get_vq_size();
+	int err;
+
+	err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
+	if (err) {
+		pr_debug("Failed to negotiate configs with the driver\n");
+		return err;
+	}
+
+	/* Polling phase is finished. This thread backs to normal priority. */
+	sched_set_normal(vnet->rc.device_setup_task, 19);
+
+	vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
+						     txpfn, vq_size);
+	if (IS_ERR(vnet->rc.txvrh)) {
+		pr_debug("Failed to setup virtqueue for tx\n");
+		return PTR_ERR(vnet->rc.txvrh);
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
+	if (err)
+		goto err_free_epf_tx_vringh;
+
+	vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
+						     rxpfn, vq_size);
+	if (IS_ERR(vnet->rc.rxvrh)) {
+		pr_debug("Failed to setup virtqueue for rx\n");
+		err = PTR_ERR(vnet->rc.rxvrh);
+		goto err_deinit_tx_kiov;
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
+	if (err)
+		goto err_free_epf_rx_vringh;
+
+	vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
+		epf, vnet->virtio_features, ctlpfn, vq_size);
+	if (IS_ERR(vnet->rc.ctlvrh)) {
+		pr_err("failed to setup virtqueue\n");
+		err = PTR_ERR(vnet->rc.ctlvrh);
+		goto err_deinit_rx_kiov;
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
+	if (err)
+		goto err_free_epf_ctl_vringh;
+
+	err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
+	if (err)
+		goto err_deinit_ctl_riov;
+
+	err = epf_vnet_rc_spawn_notify_monitor(vnet);
+	if (err) {
+		pr_debug("Failed to create notify monitor thread\n");
+		goto err_deinit_ctl_wiov;
+	}
+
+	return 0;
+
+err_deinit_ctl_wiov:
+	epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
+err_deinit_ctl_riov:
+	epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
+err_free_epf_ctl_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
+err_deinit_rx_kiov:
+	epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
+err_free_epf_rx_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
+err_deinit_tx_kiov:
+	epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
+err_free_epf_tx_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
+
+	return err;
+}
+
+static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
+{
+	vnet->rc.device_setup_task = kthread_create(
+		epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
+	if (IS_ERR(vnet->rc.device_setup_task))
+		return PTR_ERR(vnet->rc.device_setup_task);
+
+	/* Change the thread priority to high for the polling. */
+	sched_set_fifo(vnet->rc.device_setup_task);
+	wake_up_process(vnet->rc.device_setup_task);
+
+	return 0;
+}
+
+static void epf_vnet_rc_tx_handler(struct work_struct *work)
+{
+	struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
+	struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
+	struct vringh *rx_vrh = &vnet->ep.rxvrh;
+	struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
+	struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
+
+	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
+				 DMA_DEV_TO_MEM) > 0)
+		;
+}
+
+static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
+{
+	struct epf_vnet *vnet =
+		container_of(work, struct epf_vnet, rc.raise_irq_work);
+	struct pci_epf *epf = vnet->epf;
+
+	pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
+			  PCI_EPC_IRQ_LEGACY, 0);
+}
+
+struct epf_vnet_rc_meminfo {
+	void __iomem *addr, *virt;
+	phys_addr_t phys;
+	size_t len;
+};
+
+/* Utility function to access PCIe host side memory from the local CPU. */
+static struct epf_vnet_rc_meminfo *
+epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
+{
+	int err;
+	phys_addr_t aaddr, phys_addr;
+	size_t asize, offset;
+	void __iomem *virt_addr;
+	struct epf_vnet_rc_meminfo *meminfo;
+
+	err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
+	if (err) {
+		pr_debug("Failed to get EPC align: %d\n", err);
+		return NULL;
+	}
+
+	offset = pci_addr - aaddr;
+
+	virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
+	if (!virt_addr) {
+		pr_debug("Failed to allocate epc memory\n");
+		return NULL;
+	}
+
+	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
+			       aaddr, asize);
+	if (err) {
+		pr_debug("Failed to map epc memory\n");
+		goto err_epc_free_addr;
+	}
+
+	meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
+	if (!meminfo)
+		goto err_epc_unmap_addr;
+
+	meminfo->virt = virt_addr;
+	meminfo->phys = phys_addr;
+	meminfo->len = len;
+	meminfo->addr = virt_addr + offset;
+
+	return meminfo;
+
+err_epc_unmap_addr:
+	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr);
+err_epc_free_addr:
+	pci_epc_mem_free_addr(epf->epc, phys_addr, virt_addr, asize);
+
+	return NULL;
+}
+
+static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
+				   struct epf_vnet_rc_meminfo *meminfo)
+{
+	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
+			   meminfo->phys);
+	pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
+			      meminfo->len);
+	kfree(meminfo);
+}
+
+static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
+{
+	struct vringh_kiov *riov = &vnet->rc.ctl_riov;
+	struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
+	struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
+	struct pci_epf *epf = vnet->epf;
+	struct epf_vnet_rc_meminfo *rmem, *wmem;
+	struct virtio_net_ctrl_hdr *hdr;
+	int err;
+	u16 head;
+	size_t total_len;
+	u8 class, cmd;
+
+	err = vringh_getdesc(vrh, riov, wiov, &head);
+	if (err <= 0)
+		return err;
+
+	total_len = vringh_kiov_length(riov);
+
+	rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
+				    riov->iov[riov->i].iov_len);
+	if (!rmem) {
+		err = -ENOMEM;
+		goto err_abandon_descs;
+	}
+
+	wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
+				    wiov->iov[wiov->i].iov_len);
+	if (!wmem) {
+		err = -ENOMEM;
+		goto err_epc_unmap_rmem;
+	}
+
+	hdr = rmem->addr;
+	class = ioread8(&hdr->class);
+	cmd = ioread8(&hdr->cmd);
+	switch (ioread8(&hdr->class)) {
+	case VIRTIO_NET_CTRL_ANNOUNCE:
+		if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+			pr_err("Found invalid command: announce: %d\n", cmd);
+			break;
+		}
+		epf_vnet_rc_clear_config16(
+			vnet,
+			VIRTIO_PCI_CONFIG_OFF(false) +
+				offsetof(struct virtio_net_config, status),
+			VIRTIO_NET_S_ANNOUNCE);
+		epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
+					   VIRTIO_PCI_ISR_CONFIG);
+
+		iowrite8(VIRTIO_NET_OK, wmem->addr);
+		break;
+	default:
+		pr_err("Found unsupported class in control queue: %d\n", class);
+		break;
+	}
+
+	epf_vnet_rc_epc_munmap(epf, rmem);
+	epf_vnet_rc_epc_munmap(epf, wmem);
+	vringh_complete(vrh, head, total_len);
+
+	return 1;
+
+err_epc_unmap_rmem:
+	epf_vnet_rc_epc_munmap(epf, rmem);
+err_abandon_descs:
+	vringh_abandon(vrh, head);
+
+	return err;
+}
+
+static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
+{
+	struct epf_vnet *vnet =
+		container_of(work, struct epf_vnet, rc.ctl_work);
+
+	while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
+		;
+}
+
+void epf_vnet_rc_notify(struct epf_vnet *vnet)
+{
+	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
+}
+
+void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
+{
+	epf_vnet_cleanup_bar(vnet);
+	destroy_workqueue(vnet->rc.tx_wq);
+	destroy_workqueue(vnet->rc.irq_wq);
+	destroy_workqueue(vnet->rc.ctl_wq);
+
+	kthread_stop(vnet->rc.device_setup_task);
+}
+
+int epf_vnet_rc_setup(struct epf_vnet *vnet)
+{
+	int err;
+	struct pci_epf *epf = vnet->epf;
+
+	err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
+				   &epf_vnet_pci_header);
+	if (err)
+		return err;
+
+	err = epf_vnet_setup_bar(vnet);
+	if (err)
+		return err;
+
+	vnet->rc.tx_wq =
+		alloc_workqueue("pci-epf-vnet/tx-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.tx_wq) {
+		pr_debug(
+			"Failed to allocate workqueue for rc -> ep transmission\n");
+		err = -ENOMEM;
+		goto err_cleanup_bar;
+	}
+
+	INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
+
+	vnet->rc.irq_wq =
+		alloc_workqueue("pci-epf-vnet/irq-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.irq_wq) {
+		pr_debug("Failed to allocate workqueue for irq\n");
+		err = -ENOMEM;
+		goto err_destroy_tx_wq;
+	}
+
+	INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
+
+	vnet->rc.ctl_wq =
+		alloc_workqueue("pci-epf-vnet/ctl-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.ctl_wq) {
+		pr_err("Failed to allocate work queue for control queue processing\n");
+		err = -ENOMEM;
+		goto err_destroy_irq_wq;
+	}
+
+	INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
+
+	err = epf_vnet_rc_spawn_device_setup_task(vnet);
+	if (err)
+		goto err_destroy_ctl_wq;
+
+	return 0;
+
+err_cleanup_bar:
+	epf_vnet_cleanup_bar(vnet);
+err_destroy_tx_wq:
+	destroy_workqueue(vnet->rc.tx_wq);
+err_destroy_irq_wq:
+	destroy_workqueue(vnet->rc.irq_wq);
+err_destroy_ctl_wq:
+	destroy_workqueue(vnet->rc.ctl_wq);
+
+	return err;
+}
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
new file mode 100644
index 000000000000..e48ad8067796
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
@@ -0,0 +1,387 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI Endpoint function driver to implement a virtio-net device.
+ */
+#include <linux/module.h>
+#include <linux/pci-epf.h>
+#include <linux/pci-epc.h>
+#include <linux/vringh.h>
+#include <linux/dmaengine.h>
+
+#include "pci-epf-vnet.h"
+
+static int virtio_queue_size = 0x100;
+module_param(virtio_queue_size, int, 0444);
+MODULE_PARM_DESC(virtio_queue_size, "Length of the virtqueues");
+
+int epf_vnet_get_vq_size(void)
+{
+	return virtio_queue_size;
+}
+
+int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
+{
+	struct kvec *kvec;
+
+	kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
+	if (!kvec)
+		return -ENOMEM;
+
+	vringh_kiov_init(kiov, kvec, vq_size);
+
+	return 0;
+}
+
+void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
+{
+	kfree(kiov->iov);
+}
+
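+/* Record which side has finished initialization. Once both the local (EP)
+ * virtio device and the remote (RC) driver are ready, announce link-up to
+ * both sides.
+ */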
+void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
+{
+	vnet->init_complete |= from;
+
+	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
+		return;
+
+	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
+		return;
+
+	epf_vnet_ep_announce_linkup(vnet);
+	epf_vnet_rc_announce_linkup(vnet);
+}
+
+struct epf_dma_filter_param {
+	struct device *dev;
+	u32 dma_mask;
+};
+
+static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
+{
+	struct epf_dma_filter_param *fparam = param;
+	struct dma_slave_caps caps;
+
+	memset(&caps, 0, sizeof(caps));
+	dma_get_slave_caps(chan, &caps);
+
+	return chan->device->dev == fparam->dev &&
+	       (fparam->dma_mask & caps.directions);
+}
+
+static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
+{
+	struct epf_dma_filter_param param;
+	dma_cap_mask_t mask;
+	int err;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	param.dev = dma_dev;
+	param.dma_mask = BIT(DMA_MEM_TO_DEV);
+	vnet->lr_dma_chan =
+		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
+	if (!vnet->lr_dma_chan)
+		return -EOPNOTSUPP;
+
+	param.dma_mask = BIT(DMA_DEV_TO_MEM);
+	vnet->rl_dma_chan =
+		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
+	if (!vnet->rl_dma_chan) {
+		err = -EOPNOTSUPP;
+		goto err_release_channel;
+	}
+
+	return 0;
+
+err_release_channel:
+	dma_release_channel(vnet->lr_dma_chan);
+
+	return err;
+}
+
+static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
+{
+	dma_release_channel(vnet->lr_dma_chan);
+	dma_release_channel(vnet->rl_dma_chan);
+}
+
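+/* Submit a single slave-DMA transfer between a local buffer (@dma) and a PCIe
+ * host address (@pci). @dir selects the channel and whether @pci is the
+ * destination or the source; @callback, when set, is invoked on completion.
+ */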
+static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
+			       dma_addr_t dma, size_t len,
+			       void (*callback)(void *), void *param,
+			       enum dma_transfer_direction dir)
+{
+	struct dma_async_tx_descriptor *desc;
+	int err;
+	struct dma_chan *chan;
+	struct dma_slave_config sconf;
+	dma_cookie_t cookie;
+	unsigned long flags = 0;
+
+	if (dir == DMA_MEM_TO_DEV) {
+		sconf.dst_addr = pci;
+		chan = vnet->lr_dma_chan;
+	} else {
+		sconf.src_addr = pci;
+		chan = vnet->rl_dma_chan;
+	}
+
+	err = dmaengine_slave_config(chan, &sconf);
+	if (unlikely(err))
+		return err;
+
+	if (callback)
+		flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
+
+	desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
+	if (unlikely(!desc))
+		return -EIO;
+
+	desc->callback = callback;
+	desc->callback_param = param;
+
+	cookie = dmaengine_submit(desc);
+	err = dma_submit_error(cookie);
+	if (unlikely(err))
+		return err;
+
+	dma_async_issue_pending(chan);
+
+	return 0;
+}
+
+struct epf_vnet_dma_callback_param {
+	struct epf_vnet *vnet;
+	struct vringh *tx_vrh, *rx_vrh;
+	struct virtqueue *vq;
+	size_t total_len;
+	u16 tx_head, rx_head;
+};
+
+static void epf_vnet_dma_callback(void *p)
+{
+	struct epf_vnet_dma_callback_param *param = p;
+	struct epf_vnet *vnet = param->vnet;
+
+	vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
+	vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
+
+	epf_vnet_rc_notify(vnet);
+	epf_vnet_ep_notify(vnet, param->vq);
+
+	kfree(param);
+}
+
+/**
+ * epf_vnet_transfer() - transfer data from a tx vring to an rx vring using eDMA
+ * @vnet: epf virtio net device to do dma
+ * @tx_vrh: vringh related to source tx vring
+ * @rx_vrh: vringh related to target rx vring
+ * @tx_iov: buffer to use tx
+ * @rx_iov: buffer to use rx
+ * @dir: direction of the DMA: local to remote or remote to local
+ *
+ * This function returns 0, 1 or a negative error number. 0 indicates there is
+ * no data to send and 1 indicates a DMA request was submitted successfully.
+ * Other values indicate an error; -ENOSPC means there is no buffer available
+ * on the target vring, so the caller should retry later.
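+ *
+ * A typical caller simply drains the source vring by retrying until no
+ * descriptor is left, as epf_vnet_rc_tx_handler() in pci-epf-vnet-rc.c does;
+ * for example (sketch):
+ *
+ *	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
+ *				 DMA_DEV_TO_MEM) > 0)
+ *		;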
+ */
+int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
+		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
+		      struct vringh_kiov *rx_iov,
+		      enum dma_transfer_direction dir)
+{
+	int err;
+	u16 tx_head, rx_head;
+	size_t total_tx_len;
+	struct epf_vnet_dma_callback_param *cb_param;
+	struct vringh_kiov *liov, *riov;
+
+	err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
+	if (err <= 0)
+		return err;
+
+	total_tx_len = vringh_kiov_length(tx_iov);
+
+	err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
+	if (err < 0) {
+		goto err_tx_complete;
+	} else if (!err) {
+		/* There is no space on the destination vring to transmit data,
+		 * so roll back the tx vringh
+		 */
+		vringh_abandon(tx_vrh, tx_head);
+		return -ENOSPC;
+	}
+
+	cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
+	if (!cb_param) {
+		err = -ENOMEM;
+		goto err_rx_complete;
+	}
+
+	cb_param->tx_vrh = tx_vrh;
+	cb_param->rx_vrh = rx_vrh;
+	cb_param->tx_head = tx_head;
+	cb_param->rx_head = rx_head;
+	cb_param->total_len = total_tx_len;
+	cb_param->vnet = vnet;
+
+	switch (dir) {
+	case DMA_MEM_TO_DEV:
+		liov = tx_iov;
+		riov = rx_iov;
+		cb_param->vq = vnet->ep.txvq;
+		break;
+	case DMA_DEV_TO_MEM:
+		liov = rx_iov;
+		riov = tx_iov;
+		cb_param->vq = vnet->ep.rxvq;
+		break;
+	default:
+		err = -EINVAL;
+		goto err_free_param;
+	}
+
+	for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
+		size_t len;
+		u64 lbase, rbase;
+		void (*callback)(void *) = NULL;
+
+		lbase = (u64)liov->iov[liov->i].iov_base;
+		rbase = (u64)riov->iov[riov->i].iov_base;
+		len = tx_iov->iov[tx_iov->i].iov_len;
+
+		if (tx_iov->i + 1 == tx_iov->used)
+			callback = epf_vnet_dma_callback;
+
+		err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
+					  cb_param, dir);
+		if (err)
+			goto err_free_param;
+	}
+
+	return 1;
+
+err_free_param:
+	kfree(cb_param);
+err_rx_complete:
+	vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
+err_tx_complete:
+	vringh_complete(tx_vrh, tx_head, total_tx_len);
+
+	return err;
+}
+
+static int epf_vnet_bind(struct pci_epf *epf)
+{
+	int err;
+	struct epf_vnet *vnet = epf_get_drvdata(epf);
+
+	err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
+	if (err)
+		return err;
+
+	err = epf_vnet_rc_setup(vnet);
+	if (err)
+		goto err_free_edma;
+
+	err = epf_vnet_ep_setup(vnet);
+	if (err)
+		goto err_cleanup_rc;
+
+	return 0;
+
+err_free_edma:
+	epf_vnet_deinit_edma(vnet);
+err_cleanup_rc:
+	epf_vnet_rc_cleanup(vnet);
+
+	return err;
+}
+
+static void epf_vnet_unbind(struct pci_epf *epf)
+{
+	struct epf_vnet *vnet = epf_get_drvdata(epf);
+
+	epf_vnet_deinit_edma(vnet);
+	epf_vnet_rc_cleanup(vnet);
+	epf_vnet_ep_cleanup(vnet);
+}
+
+static struct pci_epf_ops epf_vnet_ops = {
+	.bind = epf_vnet_bind,
+	.unbind = epf_vnet_unbind,
+};
+
+static const struct pci_epf_device_id epf_vnet_ids[] = {
+	{ .name = "pci_epf_vnet" },
+	{}
+};
+
+static void epf_vnet_virtio_init(struct epf_vnet *vnet)
+{
+	vnet->virtio_features =
+		BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
+		/* The following features let both sides skip checksum and offload
+		 * processing, like a transmission between virtual machines on the
+		 * same system. Details are in section 5.1.5 of the virtio
+		 * specification.
+		 */
+		BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
+		BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
+		BIT(VIRTIO_NET_F_GUEST_UFO) |
+		// The control queue is just used for linkup announcement.
+		BIT(VIRTIO_NET_F_CTRL_VQ);
+
+	vnet->vnet_cfg.max_virtqueue_pairs = 1;
+	vnet->vnet_cfg.status = 0;
+	vnet->vnet_cfg.mtu = PAGE_SIZE;
+}
+
+static int epf_vnet_probe(struct pci_epf *epf)
+{
+	struct epf_vnet *vnet;
+
+	vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
+	if (!vnet)
+		return -ENOMEM;
+
+	epf_set_drvdata(epf, vnet);
+	vnet->epf = epf;
+
+	epf_vnet_virtio_init(vnet);
+
+	return 0;
+}
+
+static struct pci_epf_driver epf_vnet_drv = {
+	.driver.name = "pci_epf_vnet",
+	.ops = &epf_vnet_ops,
+	.id_table = epf_vnet_ids,
+	.probe = epf_vnet_probe,
+	.owner = THIS_MODULE,
+};
+
+static int __init epf_vnet_init(void)
+{
+	int err;
+
+	err = pci_epf_register_driver(&epf_vnet_drv);
+	if (err) {
+		pr_err("Failed to register epf vnet driver\n");
+		return err;
+	}
+
+	return 0;
+}
+module_init(epf_vnet_init);
+
+static void epf_vnet_exit(void)
+{
+	pci_epf_unregister_driver(&epf_vnet_drv);
+}
+module_exit(epf_vnet_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
+MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
new file mode 100644
index 000000000000..1e0f90c95578
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PCI_EPF_VNET_H
+#define _PCI_EPF_VNET_H
+
+#include <linux/pci-epf.h>
+#include <linux/pci-epf-virtio.h>
+#include <linux/virtio_net.h>
+#include <linux/dmaengine.h>
+#include <linux/virtio.h>
+
+struct epf_vnet {
+	//TODO Should this variable be placed here?
+	struct pci_epf *epf;
+	struct virtio_net_config vnet_cfg;
+	u64 virtio_features;
+
+	// dma channels for local to remote(lr) and remote to local(rl)
+	struct dma_chan *lr_dma_chan, *rl_dma_chan;
+
+	struct {
+		void __iomem *cfg_base;
+		struct task_struct *device_setup_task;
+		struct task_struct *notify_monitor_task;
+		struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
+		struct work_struct tx_work, raise_irq_work, ctl_work;
+		struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
+		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
+	} rc;
+
+	struct {
+		struct virtqueue *rxvq, *txvq, *ctlvq;
+		struct vringh txvrh, rxvrh, ctlvrh;
+		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
+		struct virtio_device vdev;
+		u16 net_config_status;
+	} ep;
+
+#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
+#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
+	u8 init_complete;
+};
+
+int epf_vnet_rc_setup(struct epf_vnet *vnet);
+void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
+int epf_vnet_ep_setup(struct epf_vnet *vnet);
+void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
+
+int epf_vnet_get_vq_size(void);
+int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
+void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
+int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
+		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
+		      struct vringh_kiov *rx_iov,
+		      enum dma_transfer_direction dir);
+void epf_vnet_rc_notify(struct epf_vnet *vnet);
+void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
+
+void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
+void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
+void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
+
+#endif // _PCI_EPF_VNET_H
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
@ 2023-02-03 10:04   ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-03 10:04 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Manivannan Sadhasivam, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Shunsuke Mie, Jon Mason, Bjorn Helgaas

Add a new endpoint (EP) function driver that provides a virtio-net device. This
function not only exposes a virtio-net device to the PCIe host system, but also
provides a virtio-net device to the EP-side (local) system. Those network
devices are virtually connected, so they can be used to communicate over IP
like a simple NIC.

Architecture overview is following:

to Host       |	                to Endpoint
network stack |                 network stack
      |       |                       |
+-----------+ |	+-----------+   +-----------+
|virtio-net | |	|virtio-net |   |virtio-net |
|driver     | |	|EP function|---|driver     |
+-----------+ |	+-----------+   +-----------+
      |       |	      |
+-----------+ | +-----------+
|PCIeC      | | |PCIeC      |
|Rootcomplex|-|-|Endpoint   |
+-----------+ | +-----------+
  Host side   |          Endpoint side

This driver uses the PCIe EP framework to expose a virtio-net (PCI) device to
the host side, and generates a virtual virtio-net device registered on the EP
side. Communication data is transported directly between the two virtqueues
using the PCIe embedded DMA controller.

Because of limitations of the hardware and the Linux EP framework, this
function follows the virtio legacy specification.

This function driver has been tested on an R-Car S4 (r8a779fa-spider) board,
but it only uses the PCIe EP framework and depends on the PCIe eDMA.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Takanari Hayama <taki@igel.co.jp>
---
 drivers/pci/endpoint/functions/Kconfig        |  12 +
 drivers/pci/endpoint/functions/Makefile       |   1 +
 .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
 .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
 drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
 6 files changed, 1440 insertions(+)
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h

diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
index 9fd560886871..f88d8baaf689 100644
--- a/drivers/pci/endpoint/functions/Kconfig
+++ b/drivers/pci/endpoint/functions/Kconfig
@@ -37,3 +37,15 @@ config PCI_EPF_VNTB
 	  between PCI Root Port and PCIe Endpoint.
 
 	  If in doubt, say "N" to disable Endpoint NTB driver.
+
+config PCI_EPF_VNET
+	tristate "PCI Endpoint virtio-net driver"
+	depends on PCI_ENDPOINT
+	select PCI_ENDPOINT_VIRTIO
+	select VHOST_RING
+	select VHOST_IOMEM
+	help
+	  PCIe Endpoint virtio-net function implementation. This module exposes
+	  a virtio-net PCI device to the PCIe host side and another virtio-net
+	  device to the local machine. Those devices can communicate with each
+	  other.
diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
index 5c13001deaba..74cc4c330c62 100644
--- a/drivers/pci/endpoint/functions/Makefile
+++ b/drivers/pci/endpoint/functions/Makefile
@@ -6,3 +6,4 @@
 obj-$(CONFIG_PCI_EPF_TEST)		+= pci-epf-test.o
 obj-$(CONFIG_PCI_EPF_NTB)		+= pci-epf-ntb.o
 obj-$(CONFIG_PCI_EPF_VNTB) 		+= pci-epf-vntb.o
+obj-$(CONFIG_PCI_EPF_VNET)		+= pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
new file mode 100644
index 000000000000..93b7e00e8d06
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Functions for the Endpoint (local) side, using the EPF framework
+ */
+#include <linux/pci-epc.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
+
+#include "pci-epf-vnet.h"
+
+static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
+{
+	return container_of(vdev, struct epf_vnet, ep.vdev);
+}
+
+static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
+{
+	vnet->ep.net_config_status |= status;
+}
+
+static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
+{
+	vnet->ep.net_config_status &= ~status;
+}
+
+static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
+{
+	virtio_config_changed(&vnet->ep.vdev);
+}
+
+void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
+{
+	epf_vnet_ep_set_status(vnet,
+			       VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
+	epf_vnet_ep_raise_config_irq(vnet);
+}
+
+void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
+{
+	vring_interrupt(0, vq);
+}
+
+static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
+{
+	struct vringh *vrh = &vnet->ep.ctlvrh;
+	struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
+	struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
+	struct virtio_net_ctrl_hdr *hdr;
+	virtio_net_ctrl_ack *ack;
+	int err;
+	u16 head;
+	size_t len;
+
+	err = vringh_getdesc(vrh, riov, wiov, &head);
+	if (err <= 0)
+		return err;
+
+	len = vringh_kiov_length(riov);
+	if (len < sizeof(*hdr)) {
+		pr_debug("Command is too short: %ld\n", len);
+		err = -EIO;
+		goto done;
+	}
+
+	if (vringh_kiov_length(wiov) < sizeof(*ack)) {
+		pr_debug("Space for ack is not enough\n");
+		err = -EIO;
+		goto done;
+	}
+
+	hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
+	ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
+
+	switch (hdr->class) {
+	case VIRTIO_NET_CTRL_ANNOUNCE:
+		if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+			pr_debug("Invalid command: announce: %d\n", hdr->cmd);
+			goto done;
+		}
+
+		epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
+		*ack = VIRTIO_NET_OK;
+		break;
+	default:
+		pr_debug("Found not supported class: %d\n", hdr->class);
+		err = -EIO;
+	}
+
+done:
+	vringh_complete(vrh, head, len);
+	return err;
+}
+
+static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	return vnet->virtio_features;
+}
+
+static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	if (vdev->features != vnet->virtio_features)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
+					unsigned int offset, void *buf,
+					unsigned int len)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+	const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
+	const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
+	unsigned int copy_len;
+
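+	/* The cases below fall through so that a read starting at the mac field
+	 * also covers the following status field; offsets beyond the known
+	 * config space read back as zero.
+	 */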
+	switch (offset) {
+	case offsetof(struct virtio_net_config, mac):
+		/* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
+		 * clear the buffer.
+		 */
+		copy_len = len >= mac_len ? mac_len : len;
+		memset(buf, 0x00, copy_len);
+		len -= copy_len;
+		buf += copy_len;
+		fallthrough;
+	case offsetof(struct virtio_net_config, status):
+		copy_len = len >= status_len ? status_len : len;
+		memcpy(buf, &vnet->ep.net_config_status, copy_len);
+		len -= copy_len;
+		buf += copy_len;
+		fallthrough;
+	default:
+		if (offset > sizeof(vnet->vnet_cfg)) {
+			memset(buf, 0x00, len);
+			break;
+		}
+		memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
+	}
+}
+
+static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
+					unsigned int offset, const void *buf,
+					unsigned int len)
+{
+	/* Do nothing, because all of virtio net config space is readonly. */
+}
+
+static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
+{
+	return 0;
+}
+
+static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	if (status & VIRTIO_CONFIG_S_DRIVER_OK)
+		epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
+}
+
+static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
+{
+	pr_debug("doesn't support yet");
+}
+
+static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
+	struct vringh *tx_vrh = &vnet->ep.txvrh;
+	struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
+	struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
+	struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
+	int err;
+
+	/* Support only one queue pair */
+	switch (vq->index) {
+	case 0: // rx queue
+		break;
+	case 1: // tx queue
+		while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
+						rx_iov, DMA_MEM_TO_DEV)) > 0)
+			;
+		if (err < 0)
+			pr_debug("Failed to transmit: EP -> Host: %d\n", err);
+		break;
+	case 2: // control queue
+		epf_vnet_ep_process_ctrlq_entry(vnet);
+		break;
+	default:
+		return false;
+	}
+
+	return true;
+}
+
+static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
+				     unsigned int nvqs, struct virtqueue *vqs[],
+				     vq_callback_t *callback[],
+				     const char *const names[], const bool *ctx,
+				     struct irq_affinity *desc)
+{
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+	const size_t vq_size = epf_vnet_get_vq_size();
+	int i;
+	int err;
+	int qidx;
+
+	for (qidx = 0, i = 0; i < nvqs; i++) {
+		struct virtqueue *vq;
+		struct vring *vring;
+		struct vringh *vrh;
+
+		if (!names[i]) {
+			vqs[i] = NULL;
+			continue;
+		}
+
+		vq = vring_create_virtqueue(qidx++, vq_size,
+					    VIRTIO_PCI_VRING_ALIGN, vdev, true,
+					    false, ctx ? ctx[i] : false,
+					    epf_vnet_ep_vdev_vq_notify,
+					    callback[i], names[i]);
+		if (!vq) {
+			err = -ENOMEM;
+			goto err_del_vqs;
+		}
+
+		vqs[i] = vq;
+		vring = virtqueue_get_vring(vq);
+
+		switch (i) {
+		case 0: // rx
+			vrh = &vnet->ep.rxvrh;
+			vnet->ep.rxvq = vq;
+			break;
+		case 1: // tx
+			vrh = &vnet->ep.txvrh;
+			vnet->ep.txvq = vq;
+			break;
+		case 2: // control
+			vrh = &vnet->ep.ctlvrh;
+			vnet->ep.ctlvq = vq;
+			break;
+		default:
+			err = -EIO;
+			goto err_del_vqs;
+		}
+
+		err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
+				       true, GFP_KERNEL, vring->desc,
+				       vring->avail, vring->used);
+		if (err) {
+			pr_err("failed to init vringh for vring %d\n", i);
+			goto err_del_vqs;
+		}
+	}
+
+	err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
+	if (err)
+		goto err_free_kiov;
+	err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
+	if (err)
+		goto err_free_kiov;
+
+	return 0;
+
+err_free_kiov:
+	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
+
+err_del_vqs:
+	for (; i >= 0; i--) {
+		if (!names[i])
+			continue;
+
+		if (!vqs[i])
+			continue;
+
+		vring_del_virtqueue(vqs[i]);
+	}
+	return err;
+}
+
+static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
+{
+	struct virtqueue *vq, *n;
+	struct epf_vnet *vnet = vdev_to_vnet(vdev);
+
+	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
+		vring_del_virtqueue(vq);
+
+	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
+	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
+}
+
+static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
+	.get_features = epf_vnet_ep_vdev_get_features,
+	.finalize_features = epf_vnet_ep_vdev_finalize_features,
+	.get = epf_vnet_ep_vdev_get_config,
+	.set = epf_vnet_ep_vdev_set_config,
+	.get_status = epf_vnet_ep_vdev_get_status,
+	.set_status = epf_vnet_ep_vdev_set_status,
+	.reset = epf_vnet_ep_vdev_reset,
+	.find_vqs = epf_vnet_ep_vdev_find_vqs,
+	.del_vqs = epf_vnet_ep_vdev_del_vqs,
+};
+
+void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
+{
+	unregister_virtio_device(&vnet->ep.vdev);
+}
+
+int epf_vnet_ep_setup(struct epf_vnet *vnet)
+{
+	int err;
+	struct virtio_device *vdev = &vnet->ep.vdev;
+
+	vdev->dev.parent = vnet->epf->epc->dev.parent;
+	vdev->config = &epf_vnet_ep_vdev_config_ops;
+	vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
+	vdev->id.device = VIRTIO_ID_NET;
+
+	err = register_virtio_device(vdev);
+	if (err)
+		return err;
+
+	return 0;
+}
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
new file mode 100644
index 000000000000..2ca0245a9134
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
@@ -0,0 +1,635 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Functions for the PCIe host (remote) side, using the EPF framework.
+ */
+#include <linux/pci-epf.h>
+#include <linux/pci-epc.h>
+#include <linux/pci_ids.h>
+#include <linux/sched.h>
+#include <linux/virtio_pci.h>
+
+#include "pci-epf-vnet.h"
+
+#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
+
+/* Returns a value one past the largest valid queue index. */
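+/* With the default single queue pair this is rx (0), tx (1) and ctrl (2),
+ * i.e. 3.
+ */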
+static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
+{
+	/* number of queue pairs and control queue */
+	return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
+}
+
+static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
+				      void *buf, size_t len)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	memcpy_toio(base, buf, len);
+}
+
+static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
+				    u8 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite8(ioread8(base) | config, base);
+}
+
+static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
+				     u16 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite16(ioread16(base) | config, base);
+}
+
+static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
+				       u16 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite16(ioread16(base) & ~config, base);
+}
+
+static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
+				     u32 config)
+{
+	void __iomem *base = vnet->rc.cfg_base + offset;
+
+	iowrite32(ioread32(base) | config, base);
+}
+
+static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
+{
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
+	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
+}
+
+void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
+{
+	epf_vnet_rc_set_config16(vnet,
+				 VIRTIO_PCI_CONFIG_OFF(false) +
+					 offsetof(struct virtio_net_config,
+						  status),
+				 VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
+	epf_vnet_rc_raise_config_irq(vnet);
+}
+
+/*
+ * For the PCIe host, this driver shows a legacy virtio-net device, because
+ * the virtio structure PCI capabilities are mandatory for a modern virtio
+ * device, but no PCIe EP hardware can be configured with arbitrary PCI
+ * capabilities and the Linux PCIe EP framework doesn't support them.
+ */
+static struct pci_epf_header epf_vnet_pci_header = {
+	.vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
+	.deviceid = VIRTIO_TRANS_ID_NET,
+	.subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
+	.subsys_id = VIRTIO_ID_NET,
+	.revid = 0,
+	.baseclass_code = PCI_BASE_CLASS_NETWORK,
+	.interrupt_pin = PCI_INTERRUPT_PIN,
+};
+
+static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
+				      void __iomem *cfg_base)
+{
+	u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
+
+	epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
+				 vnet->virtio_features);
+
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
+	/*
+	 * Initialize the queue notify and selector registers to a value outside
+	 * the valid virtqueue index range. This is used to detect changes by
+	 * polling; there is no other way to detect the host side driver updating
+	 * those values.
+	 */
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
+	/* This PFN is likewise initialized to 0 for the polling */
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
+
+	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
+				 epf_vnet_get_vq_size());
+	epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
+	epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
+				  &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
+}
+
+static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
+{
+	struct pci_epf *epf = vnet->epf;
+
+	pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
+			  &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
+	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
+			   PRIMARY_INTERFACE);
+}
+
+static int epf_vnet_setup_bar(struct epf_vnet *vnet)
+{
+	int err;
+	size_t cfg_bar_size =
+		VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
+	struct pci_epf *epf = vnet->epf;
+	const struct pci_epc_features *features;
+	struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
+
+	features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
+	if (!features) {
+		pr_debug("Failed to get PCI EPC features\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
+		pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
+		if (cfg_bar_size >
+		    features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
+			pr_debug("PCI BAR size is not enough\n");
+			return -ENOMEM;
+		}
+	}
+
+	config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
+
+	vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
+						VIRTIO_NET_LEGACY_CFG_BAR,
+						features->align,
+						PRIMARY_INTERFACE);
+	if (!vnet->rc.cfg_base) {
+		pr_debug("Failed to allocate virtio-net config memory\n");
+		return -ENOMEM;
+	}
+
+	epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
+
+	err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
+			      config_bar);
+	if (err) {
+		pr_debug("Failed to set PCI BAR");
+		goto err_free_space;
+	}
+
+	return 0;
+
+err_free_space:
+	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
+			   PRIMARY_INTERFACE);
+	return err;
+}
+
+static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
+					 u32 *rxpfn, u32 *ctlpfn)
+{
+	const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
+	const u16 default_sel = nqueues;
+	u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
+	u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
+	u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
+	u32 pfn;
+	u16 sel;
+	struct {
+		u32 pfn;
+		u16 sel;
+	} tmp[3] = {};
+	int tmp_index = 0;
+
+	*rxpfn = *txpfn = *ctlpfn = 0;
+
+	/* To avoid missing the PFN and selector values that the host driver
+	 * writes for each virtqueue, poll as fast as possible and save the
+	 * observed pairs.
+	 *
+	 * This implementation assumes that the host driver writes each queue's
+	 * PFN only once.
+	 */
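+	/* A legacy virtio-pci driver selects a queue by writing its index to
+	 * VIRTIO_PCI_QUEUE_SEL and then publishes the vring's page frame number
+	 * through VIRTIO_PCI_QUEUE_PFN; the loop below captures those pairs.
+	 */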
+	while (tmp_index < nqueues) {
+		pfn = ioread32(queue_pfn);
+		if (pfn == 0)
+			continue;
+
+		iowrite32(0, queue_pfn);
+
+		sel = ioread16(queue_sel);
+		if (sel == default_sel)
+			continue;
+
+		tmp[tmp_index].pfn = pfn;
+		tmp[tmp_index].sel = sel;
+		tmp_index++;
+	}
+
+	while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
+		;
+
+	for (int i = 0; i < nqueues; ++i) {
+		switch (tmp[i].sel) {
+		case 0:
+			*rxpfn = tmp[i].pfn;
+			break;
+		case 1:
+			*txpfn = tmp[i].pfn;
+			break;
+		case 2:
+			*ctlpfn = tmp[i].pfn;
+			break;
+		}
+	}
+
+	if (!*rxpfn || !*txpfn || !*ctlpfn)
+		return -EIO;
+
+	return 0;
+}
+
+static int epf_vnet_rc_monitor_notify(void *data)
+{
+	struct epf_vnet *vnet = data;
+	u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
+	const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
+
+	epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
+
+	/* Poll to detect a change of the queue_notify register. This polling can
+	 * miss a change, so check every virtqueue each time.
+	 */
+	while (true) {
+		while (ioread16(queue_notify) == notify_default)
+			;
+		iowrite16(notify_default, queue_notify);
+
+		queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
+		queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
+	}
+
+	return 0;
+}
+
+static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
+{
+	vnet->rc.notify_monitor_task =
+		kthread_create(epf_vnet_rc_monitor_notify, vnet,
+			       "pci-epf-vnet/cfg_negotiator");
+	if (IS_ERR(vnet->rc.notify_monitor_task))
+		return PTR_ERR(vnet->rc.notify_monitor_task);
+
+	/* Change the thread priority to high for polling. */
+	sched_set_fifo(vnet->rc.notify_monitor_task);
+	wake_up_process(vnet->rc.notify_monitor_task);
+
+	return 0;
+}
+
+static int epf_vnet_rc_device_setup(void *data)
+{
+	struct epf_vnet *vnet = data;
+	struct pci_epf *epf = vnet->epf;
+	u32 txpfn, rxpfn, ctlpfn;
+	const size_t vq_size = epf_vnet_get_vq_size();
+	int err;
+
+	err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
+	if (err) {
+		pr_debug("Failed to negatiate configs with driver\n");
+		return err;
+	}
+
+	/* Polling phase is finished. This thread backs to normal priority. */
+	sched_set_normal(vnet->rc.device_setup_task, 19);
+
+	vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
+						     txpfn, vq_size);
+	if (IS_ERR(vnet->rc.txvrh)) {
+		pr_debug("Failed to setup virtqueue for tx\n");
+		return PTR_ERR(vnet->rc.txvrh);
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
+	if (err)
+		goto err_free_epf_tx_vringh;
+
+	vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
+						     rxpfn, vq_size);
+	if (IS_ERR(vnet->rc.rxvrh)) {
+		pr_debug("Failed to setup virtqueue for rx\n");
+		err = PTR_ERR(vnet->rc.rxvrh);
+		goto err_deinit_tx_kiov;
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
+	if (err)
+		goto err_free_epf_rx_vringh;
+
+	vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
+		epf, vnet->virtio_features, ctlpfn, vq_size);
+	if (IS_ERR(vnet->rc.ctlvrh)) {
+		pr_err("failed to setup virtqueue\n");
+		err = PTR_ERR(vnet->rc.ctlvrh);
+		goto err_deinit_rx_kiov;
+	}
+
+	err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
+	if (err)
+		goto err_free_epf_ctl_vringh;
+
+	err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
+	if (err)
+		goto err_deinit_ctl_riov;
+
+	err = epf_vnet_rc_spawn_notify_monitor(vnet);
+	if (err) {
+		pr_debug("Failed to create notify monitor thread\n");
+		goto err_deinit_ctl_wiov;
+	}
+
+	return 0;
+
+err_deinit_ctl_wiov:
+	epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
+err_deinit_ctl_riov:
+	epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
+err_free_epf_ctl_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
+err_deinit_rx_kiov:
+	epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
+err_free_epf_rx_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
+err_deinit_tx_kiov:
+	epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
+err_free_epf_tx_vringh:
+	pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
+
+	return err;
+}
+
+static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
+{
+	vnet->rc.device_setup_task = kthread_create(
+		epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
+	if (IS_ERR(vnet->rc.device_setup_task))
+		return PTR_ERR(vnet->rc.device_setup_task);
+
+	/* Change the thread priority to high for the polling. */
+	sched_set_fifo(vnet->rc.device_setup_task);
+	wake_up_process(vnet->rc.device_setup_task);
+
+	return 0;
+}
+
+static void epf_vnet_rc_tx_handler(struct work_struct *work)
+{
+	struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
+	struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
+	struct vringh *rx_vrh = &vnet->ep.rxvrh;
+	struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
+	struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
+
+	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
+				 DMA_DEV_TO_MEM) > 0)
+		;
+}
+
+static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
+{
+	struct epf_vnet *vnet =
+		container_of(work, struct epf_vnet, rc.raise_irq_work);
+	struct pci_epf *epf = vnet->epf;
+
+	pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
+			  PCI_EPC_IRQ_LEGACY, 0);
+}
+
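+/* Describes one mapping of PCIe host memory into the endpoint address space:
+ * @virt and @phys refer to the aligned EPC window, @len is the requested
+ * length, and @addr points at the requested PCI address inside the window
+ * (virt + offset).
+ */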
+struct epf_vnet_rc_meminfo {
+	void __iomem *addr, *virt;
+	phys_addr_t phys;
+	size_t len;
+};
+
+/* Utility function to access PCIe host side memory from the local CPU. */
+static struct epf_vnet_rc_meminfo *
+epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
+{
+	int err;
+	phys_addr_t aaddr, phys_addr;
+	size_t asize, offset;
+	void __iomem *virt_addr;
+	struct epf_vnet_rc_meminfo *meminfo;
+
+	err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
+	if (err) {
+		pr_debug("Failed to get EPC align: %d\n", err);
+		return NULL;
+	}
+
+	offset = pci_addr - aaddr;
+
+	virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
+	if (!virt_addr) {
+		pr_debug("Failed to allocate epc memory\n");
+		return NULL;
+	}
+
+	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
+			       aaddr, asize);
+	if (err) {
+		pr_debug("Failed to map epc memory\n");
+		goto err_epc_free_addr;
+	}
+
+	meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
+	if (!meminfo)
+		goto err_epc_unmap_addr;
+
+	meminfo->virt = virt_addr;
+	meminfo->phys = phys_addr;
+	meminfo->len = len;
+	meminfo->addr = virt_addr + offset;
+
+	return meminfo;
+
+err_epc_unmap_addr:
+	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr);
+err_epc_free_addr:
+	pci_epc_mem_free_addr(epf->epc, phys_addr, virt_addr, asize);
+
+	return NULL;
+}
+
+static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
+				   struct epf_vnet_rc_meminfo *meminfo)
+{
+	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
+			   meminfo->phys);
+	pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
+			      meminfo->len);
+	kfree(meminfo);
+}
+
+static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
+{
+	struct vringh_kiov *riov = &vnet->rc.ctl_riov;
+	struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
+	struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
+	struct pci_epf *epf = vnet->epf;
+	struct epf_vnet_rc_meminfo *rmem, *wmem;
+	struct virtio_net_ctrl_hdr *hdr;
+	int err;
+	u16 head;
+	size_t total_len;
+	u8 class, cmd;
+
+	err = vringh_getdesc(vrh, riov, wiov, &head);
+	if (err <= 0)
+		return err;
+
+	total_len = vringh_kiov_length(riov);
+
+	rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
+				    riov->iov[riov->i].iov_len);
+	if (!rmem) {
+		err = -ENOMEM;
+		goto err_abandon_descs;
+	}
+
+	wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
+				    wiov->iov[wiov->i].iov_len);
+	if (!wmem) {
+		err = -ENOMEM;
+		goto err_epc_unmap_rmem;
+	}
+
+	hdr = rmem->addr;
+	class = ioread8(&hdr->class);
+	cmd = ioread8(&hdr->cmd);
+	switch (class) {
+	case VIRTIO_NET_CTRL_ANNOUNCE:
+		if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+			pr_err("Found invalid command: announce: %d\n", cmd);
+			break;
+		}
+		epf_vnet_rc_clear_config16(
+			vnet,
+			VIRTIO_PCI_CONFIG_OFF(false) +
+				offsetof(struct virtio_net_config, status),
+			VIRTIO_NET_S_ANNOUNCE);
+		epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
+					   VIRTIO_PCI_ISR_CONFIG);
+
+		iowrite8(VIRTIO_NET_OK, wmem->addr);
+		break;
+	default:
+		pr_err("Found unsupported class in control queue: %d\n", class);
+		break;
+	}
+
+	epf_vnet_rc_epc_munmap(epf, rmem);
+	epf_vnet_rc_epc_munmap(epf, wmem);
+	vringh_complete(vrh, head, total_len);
+
+	return 1;
+
+err_epc_unmap_rmem:
+	epf_vnet_rc_epc_munmap(epf, rmem);
+err_abandon_descs:
+	vringh_abandon(vrh, head);
+
+	return err;
+}
+
+static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
+{
+	struct epf_vnet *vnet =
+		container_of(work, struct epf_vnet, rc.ctl_work);
+
+	while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
+		;
+}
+
+void epf_vnet_rc_notify(struct epf_vnet *vnet)
+{
+	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
+}
+
+void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
+{
+	epf_vnet_cleanup_bar(vnet);
+	destroy_workqueue(vnet->rc.tx_wq);
+	destroy_workqueue(vnet->rc.irq_wq);
+	destroy_workqueue(vnet->rc.ctl_wq);
+
+	kthread_stop(vnet->rc.device_setup_task);
+}
+
+int epf_vnet_rc_setup(struct epf_vnet *vnet)
+{
+	int err;
+	struct pci_epf *epf = vnet->epf;
+
+	err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
+				   &epf_vnet_pci_header);
+	if (err)
+		return err;
+
+	err = epf_vnet_setup_bar(vnet);
+	if (err)
+		return err;
+
+	vnet->rc.tx_wq =
+		alloc_workqueue("pci-epf-vnet/tx-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.tx_wq) {
+		pr_debug(
+			"Failed to allocate workqueue for rc -> ep transmission\n");
+		err = -ENOMEM;
+		goto err_cleanup_bar;
+	}
+
+	INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
+
+	vnet->rc.irq_wq =
+		alloc_workqueue("pci-epf-vnet/irq-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.irq_wq) {
+		pr_debug("Failed to allocate workqueue for irq\n");
+		err = -ENOMEM;
+		goto err_destroy_tx_wq;
+	}
+
+	INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
+
+	vnet->rc.ctl_wq =
+		alloc_workqueue("pci-epf-vnet/ctl-wq",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
+	if (!vnet->rc.ctl_wq) {
+		pr_err("Failed to allocate work queue for control queue processing\n");
+		err = -ENOMEM;
+		goto err_destroy_irq_wq;
+	}
+
+	INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
+
+	err = epf_vnet_rc_spawn_device_setup_task(vnet);
+	if (err)
+		goto err_destroy_ctl_wq;
+
+	return 0;
+
+err_cleanup_bar:
+	epf_vnet_cleanup_bar(vnet);
+err_destroy_tx_wq:
+	destroy_workqueue(vnet->rc.tx_wq);
+err_destroy_irq_wq:
+	destroy_workqueue(vnet->rc.irq_wq);
+err_destroy_ctl_wq:
+	destroy_workqueue(vnet->rc.ctl_wq);
+
+	return err;
+}
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
new file mode 100644
index 000000000000..e48ad8067796
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
@@ -0,0 +1,387 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI Endpoint function driver to implement a virtio-net device.
+ */
+#include <linux/module.h>
+#include <linux/pci-epf.h>
+#include <linux/pci-epc.h>
+#include <linux/vringh.h>
+#include <linux/dmaengine.h>
+
+#include "pci-epf-vnet.h"
+
+static int virtio_queue_size = 0x100;
+module_param(virtio_queue_size, int, 0444);
+MODULE_PARM_DESC(virtio_queue_size, "Length of the virtqueues");
+
+int epf_vnet_get_vq_size(void)
+{
+	return virtio_queue_size;
+}
+
+int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
+{
+	struct kvec *kvec;
+
+	kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
+	if (!kvec)
+		return -ENOMEM;
+
+	vringh_kiov_init(kiov, kvec, vq_size);
+
+	return 0;
+}
+
+void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
+{
+	kfree(kiov->iov);
+}
+
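+/* Record which side has finished initialization. Once both the local (EP)
+ * virtio device and the remote (RC) driver are ready, announce link-up to
+ * both sides.
+ */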
+void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
+{
+	vnet->init_complete |= from;
+
+	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
+		return;
+
+	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
+		return;
+
+	epf_vnet_ep_announce_linkup(vnet);
+	epf_vnet_rc_announce_linkup(vnet);
+}
+
+struct epf_dma_filter_param {
+	struct device *dev;
+	u32 dma_mask;
+};
+
+static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
+{
+	struct epf_dma_filter_param *fparam = param;
+	struct dma_slave_caps caps;
+
+	memset(&caps, 0, sizeof(caps));
+	dma_get_slave_caps(chan, &caps);
+
+	return chan->device->dev == fparam->dev &&
+	       (fparam->dma_mask & caps.directions);
+}
+
+static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
+{
+	struct epf_dma_filter_param param;
+	dma_cap_mask_t mask;
+	int err;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	param.dev = dma_dev;
+	param.dma_mask = BIT(DMA_MEM_TO_DEV);
+	vnet->lr_dma_chan =
+		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
+	if (!vnet->lr_dma_chan)
+		return -EOPNOTSUPP;
+
+	param.dma_mask = BIT(DMA_DEV_TO_MEM);
+	vnet->rl_dma_chan =
+		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
+	if (!vnet->rl_dma_chan) {
+		err = -EOPNOTSUPP;
+		goto err_release_channel;
+	}
+
+	return 0;
+
+err_release_channel:
+	dma_release_channel(vnet->lr_dma_chan);
+
+	return err;
+}
+
+static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
+{
+	dma_release_channel(vnet->lr_dma_chan);
+	dma_release_channel(vnet->rl_dma_chan);
+}
+
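+/* Submit a single slave-DMA transfer between a local buffer (@dma) and a PCIe
+ * host address (@pci). @dir selects the channel and whether @pci is the
+ * destination or the source; @callback, when set, is invoked on completion.
+ */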
+static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
+			       dma_addr_t dma, size_t len,
+			       void (*callback)(void *), void *param,
+			       enum dma_transfer_direction dir)
+{
+	struct dma_async_tx_descriptor *desc;
+	int err;
+	struct dma_chan *chan;
+	struct dma_slave_config sconf;
+	dma_cookie_t cookie;
+	unsigned long flags = 0;
+
+	if (dir == DMA_MEM_TO_DEV) {
+		sconf.dst_addr = pci;
+		chan = vnet->lr_dma_chan;
+	} else {
+		sconf.src_addr = pci;
+		chan = vnet->rl_dma_chan;
+	}
+
+	err = dmaengine_slave_config(chan, &sconf);
+	if (unlikely(err))
+		return err;
+
+	if (callback)
+		flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
+
+	desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
+	if (unlikely(!desc))
+		return -EIO;
+
+	desc->callback = callback;
+	desc->callback_param = param;
+
+	cookie = dmaengine_submit(desc);
+	err = dma_submit_error(cookie);
+	if (unlikely(err))
+		return err;
+
+	dma_async_issue_pending(chan);
+
+	return 0;
+}
+
+struct epf_vnet_dma_callback_param {
+	struct epf_vnet *vnet;
+	struct vringh *tx_vrh, *rx_vrh;
+	struct virtqueue *vq;
+	size_t total_len;
+	u16 tx_head, rx_head;
+};
+
+static void epf_vnet_dma_callback(void *p)
+{
+	struct epf_vnet_dma_callback_param *param = p;
+	struct epf_vnet *vnet = param->vnet;
+
+	vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
+	vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
+
+	epf_vnet_rc_notify(vnet);
+	epf_vnet_ep_notify(vnet, param->vq);
+
+	kfree(param);
+}
+
+/**
+ * epf_vnet_transfer() - transfer data from a tx vring to an rx vring using eDMA
+ * @vnet: epf virtio net device to do dma
+ * @tx_vrh: vringh related to source tx vring
+ * @rx_vrh: vringh related to target rx vring
+ * @tx_iov: buffer to use tx
+ * @rx_iov: buffer to use rx
+ * @dir: direction of the DMA: local to remote or remote to local
+ *
+ * This function returns 0, 1 or a negative error number. 0 indicates there is
+ * no data to send and 1 indicates a DMA request was submitted successfully.
+ * Other values indicate an error; -ENOSPC means there is no buffer available
+ * on the target vring, so the caller should retry later.
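+ *
+ * A typical caller simply drains the source vring by retrying until no
+ * descriptor is left, as epf_vnet_rc_tx_handler() in pci-epf-vnet-rc.c does;
+ * for example (sketch):
+ *
+ *	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
+ *				 DMA_DEV_TO_MEM) > 0)
+ *		;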
+ */
+int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
+		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
+		      struct vringh_kiov *rx_iov,
+		      enum dma_transfer_direction dir)
+{
+	int err;
+	u16 tx_head, rx_head;
+	size_t total_tx_len;
+	struct epf_vnet_dma_callback_param *cb_param;
+	struct vringh_kiov *liov, *riov;
+
+	err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
+	if (err <= 0)
+		return err;
+
+	total_tx_len = vringh_kiov_length(tx_iov);
+
+	err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
+	if (err < 0) {
+		goto err_tx_complete;
+	} else if (!err) {
+		/* There is no space on the destination vring to transmit data,
+		 * so roll back the tx vringh
+		 */
+		vringh_abandon(tx_vrh, tx_head);
+		return -ENOSPC;
+	}
+
+	cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
+	if (!cb_param) {
+		err = -ENOMEM;
+		goto err_rx_complete;
+	}
+
+	cb_param->tx_vrh = tx_vrh;
+	cb_param->rx_vrh = rx_vrh;
+	cb_param->tx_head = tx_head;
+	cb_param->rx_head = rx_head;
+	cb_param->total_len = total_tx_len;
+	cb_param->vnet = vnet;
+
+	switch (dir) {
+	case DMA_MEM_TO_DEV:
+		liov = tx_iov;
+		riov = rx_iov;
+		cb_param->vq = vnet->ep.txvq;
+		break;
+	case DMA_DEV_TO_MEM:
+		liov = rx_iov;
+		riov = tx_iov;
+		cb_param->vq = vnet->ep.rxvq;
+		break;
+	default:
+		err = -EINVAL;
+		goto err_free_param;
+	}
+
+	for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
+		size_t len;
+		u64 lbase, rbase;
+		void (*callback)(void *) = NULL;
+
+		lbase = (u64)liov->iov[liov->i].iov_base;
+		rbase = (u64)riov->iov[riov->i].iov_base;
+		len = tx_iov->iov[tx_iov->i].iov_len;
+
+		if (tx_iov->i + 1 == tx_iov->used)
+			callback = epf_vnet_dma_callback;
+
+		err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
+					  cb_param, dir);
+		if (err)
+			goto err_free_param;
+	}
+
+	return 1;
+
+err_free_param:
+	kfree(cb_param);
+err_rx_complete:
+	vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
+err_tx_complete:
+	vringh_complete(tx_vrh, tx_head, total_tx_len);
+
+	return err;
+}
+
+static int epf_vnet_bind(struct pci_epf *epf)
+{
+	int err;
+	struct epf_vnet *vnet = epf_get_drvdata(epf);
+
+	err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
+	if (err)
+		return err;
+
+	err = epf_vnet_rc_setup(vnet);
+	if (err)
+		goto err_free_edma;
+
+	err = epf_vnet_ep_setup(vnet);
+	if (err)
+		goto err_cleanup_rc;
+
+	return 0;
+
+err_free_edma:
+	epf_vnet_deinit_edma(vnet);
+err_cleanup_rc:
+	epf_vnet_rc_cleanup(vnet);
+
+	return err;
+}
+
+static void epf_vnet_unbind(struct pci_epf *epf)
+{
+	struct epf_vnet *vnet = epf_get_drvdata(epf);
+
+	epf_vnet_deinit_edma(vnet);
+	epf_vnet_rc_cleanup(vnet);
+	epf_vnet_ep_cleanup(vnet);
+}
+
+static struct pci_epf_ops epf_vnet_ops = {
+	.bind = epf_vnet_bind,
+	.unbind = epf_vnet_unbind,
+};
+
+static const struct pci_epf_device_id epf_vnet_ids[] = {
+	{ .name = "pci_epf_vnet" },
+	{}
+};
+
+static void epf_vnet_virtio_init(struct epf_vnet *vnet)
+{
+	vnet->virtio_features =
+		BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
+		/* The following features let both sides skip checksum and offload
+		 * processing, like a transmission between virtual machines on the
+		 * same system. Details are in section 5.1.5 of the virtio
+		 * specification.
+		 */
+		BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
+		BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
+		BIT(VIRTIO_NET_F_GUEST_UFO) |
+		// The control queue is just used for linkup announcement.
+		BIT(VIRTIO_NET_F_CTRL_VQ);
+
+	vnet->vnet_cfg.max_virtqueue_pairs = 1;
+	vnet->vnet_cfg.status = 0;
+	vnet->vnet_cfg.mtu = PAGE_SIZE;
+}
+
+static int epf_vnet_probe(struct pci_epf *epf)
+{
+	struct epf_vnet *vnet;
+
+	vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
+	if (!vnet)
+		return -ENOMEM;
+
+	epf_set_drvdata(epf, vnet);
+	vnet->epf = epf;
+
+	epf_vnet_virtio_init(vnet);
+
+	return 0;
+}
+
+static struct pci_epf_driver epf_vnet_drv = {
+	.driver.name = "pci_epf_vnet",
+	.ops = &epf_vnet_ops,
+	.id_table = epf_vnet_ids,
+	.probe = epf_vnet_probe,
+	.owner = THIS_MODULE,
+};
+
+static int __init epf_vnet_init(void)
+{
+	int err;
+
+	err = pci_epf_register_driver(&epf_vnet_drv);
+	if (err) {
+		pr_err("Failed to register epf vnet driver\n");
+		return err;
+	}
+
+	return 0;
+}
+module_init(epf_vnet_init);
+
+static void epf_vnet_exit(void)
+{
+	pci_epf_unregister_driver(&epf_vnet_drv);
+}
+module_exit(epf_vnet_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
+MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
new file mode 100644
index 000000000000..1e0f90c95578
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PCI_EPF_VNET_H
+#define _PCI_EPF_VNET_H
+
+#include <linux/pci-epf.h>
+#include <linux/pci-epf-virtio.h>
+#include <linux/virtio_net.h>
+#include <linux/dmaengine.h>
+#include <linux/virtio.h>
+
+struct epf_vnet {
+	//TODO Should this variable be placed here?
+	struct pci_epf *epf;
+	struct virtio_net_config vnet_cfg;
+	u64 virtio_features;
+
+	// dma channels for local to remote(lr) and remote to local(rl)
+	struct dma_chan *lr_dma_chan, *rl_dma_chan;
+
+	struct {
+		void __iomem *cfg_base;
+		struct task_struct *device_setup_task;
+		struct task_struct *notify_monitor_task;
+		struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
+		struct work_struct tx_work, raise_irq_work, ctl_work;
+		struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
+		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
+	} rc;
+
+	struct {
+		struct virtqueue *rxvq, *txvq, *ctlvq;
+		struct vringh txvrh, rxvrh, ctlvrh;
+		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
+		struct virtio_device vdev;
+		u16 net_config_status;
+	} ep;
+
+#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
+#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
+	u8 init_complete;
+};
+
+int epf_vnet_rc_setup(struct epf_vnet *vnet);
+void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
+int epf_vnet_ep_setup(struct epf_vnet *vnet);
+void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
+
+int epf_vnet_get_vq_size(void);
+int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
+void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
+int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
+		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
+		      struct vringh_kiov *rx_iov,
+		      enum dma_transfer_direction dir);
+void epf_vnet_rc_notify(struct epf_vnet *vnet);
+void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
+
+void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
+void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
+void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
+
+#endif // _PCI_EPF_VNET_H
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 1/4] virtio_pci: add a definition of queue flag in ISR
  2023-02-03 10:04   ` Shunsuke Mie
@ 2023-02-03 10:16     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-03 10:16 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Fri, Feb 03, 2023 at 07:04:15PM +0900, Shunsuke Mie wrote:
> Already it has beed defined a config changed flag of ISR, but not the queue
> flag. Add a macro for it.
> 
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> ---
>  include/uapi/linux/virtio_pci.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> index f703afc7ad31..fa82afd6171a 100644
> --- a/include/uapi/linux/virtio_pci.h
> +++ b/include/uapi/linux/virtio_pci.h
> @@ -94,6 +94,8 @@
>  
>  #endif /* VIRTIO_PCI_NO_LEGACY */
>  
> +/* Ths bit of the ISR which indicates a queue entry update */

typo

Something to add here:
	Note: only when MSI-X is disabled



> +#define VIRTIO_PCI_ISR_QUEUE		0x1
>  /* The bit of the ISR which indicates a device configuration change. */
>  #define VIRTIO_PCI_ISR_CONFIG		0x2
>  /* Vector value used to disable MSI for queue */
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 3/4] PCI: endpoint: Introduce virtio library for EP functions
  2023-02-03 10:04   ` Shunsuke Mie
@ 2023-02-03 10:20     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-03 10:20 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Fri, Feb 03, 2023 at 07:04:17PM +0900, Shunsuke Mie wrote:
> Add a new library to access a virtio ring located on PCIe host memory. The
> library generates struct pci_epf_vringh that is introduced in this patch.
> The struct has a vringh member, so vringh APIs can be used to access the
> virtio ring.
> 
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> ---
>  drivers/pci/endpoint/Kconfig          |   7 ++
>  drivers/pci/endpoint/Makefile         |   1 +
>  drivers/pci/endpoint/pci-epf-virtio.c | 113 ++++++++++++++++++++++++++
>  include/linux/pci-epf-virtio.h        |  25 ++++++
>  4 files changed, 146 insertions(+)
>  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>  create mode 100644 include/linux/pci-epf-virtio.h
> 
> diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
> index 17bbdc9bbde0..07276dcc43c8 100644
> --- a/drivers/pci/endpoint/Kconfig
> +++ b/drivers/pci/endpoint/Kconfig
> @@ -28,6 +28,13 @@ config PCI_ENDPOINT_CONFIGFS
>  	   configure the endpoint function and used to bind the
>  	   function with a endpoint controller.
>  
> +config PCI_ENDPOINT_VIRTIO
> +	tristate
> +	depends on PCI_ENDPOINT
> +	select VHOST_IOMEM
> +	help
> +	  TODO update this comment
> +
>  source "drivers/pci/endpoint/functions/Kconfig"
>  
>  endmenu
> diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
> index 95b2fe47e3b0..95712f0a13d1 100644
> --- a/drivers/pci/endpoint/Makefile
> +++ b/drivers/pci/endpoint/Makefile
> @@ -4,5 +4,6 @@
>  #
>  
>  obj-$(CONFIG_PCI_ENDPOINT_CONFIGFS)	+= pci-ep-cfs.o
> +obj-$(CONFIG_PCI_ENDPOINT_VIRTIO)	+= pci-epf-virtio.o
>  obj-$(CONFIG_PCI_ENDPOINT)		+= pci-epc-core.o pci-epf-core.o\
>  					   pci-epc-mem.o functions/
> diff --git a/drivers/pci/endpoint/pci-epf-virtio.c b/drivers/pci/endpoint/pci-epf-virtio.c
> new file mode 100644
> index 000000000000..7134ca407a03
> --- /dev/null
> +++ b/drivers/pci/endpoint/pci-epf-virtio.c
> @@ -0,0 +1,113 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtio library for PCI Endpoint function
> + */
> +#include <linux/kernel.h>
> +#include <linux/pci-epf-virtio.h>
> +#include <linux/pci-epc.h>
> +#include <linux/virtio_pci.h>
> +
> +static void __iomem *epf_virtio_map_vq(struct pci_epf *epf, u32 pfn,
> +				       size_t size, phys_addr_t *vq_phys)
> +{
> +	int err;
> +	phys_addr_t vq_addr;
> +	size_t vq_size;
> +	void __iomem *vq_virt;
> +
> +	vq_addr = (phys_addr_t)pfn << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> +
> +	vq_size = vring_size(size, VIRTIO_PCI_VRING_ALIGN) + 100;

100?

Also ugh, this uses the legacy vring_size.
Did not look closely but is all this limited to legacy virtio then?
Pls make sure your code builds with #define VIRTIO_RING_NO_LEGACY.
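
For reference, a sketch (not from this patchset) of the split-ring layout that
the legacy vring_size() accounts for, which is what the mapping size above is
based on:

#include <linux/align.h>
#include <linux/types.h>

/*
 * Legacy split-ring footprint for num entries: 16-byte descriptors, the
 * avail ring (three 16-bit fields plus one 16-bit entry per descriptor),
 * then the used ring (three 16-bit fields plus one 8-byte entry per
 * descriptor). Only the descriptor/avail part is rounded up to align.
 */
static inline size_t example_legacy_vring_bytes(unsigned int num,
						unsigned long align)
{
	size_t desc_avail = 16 * num + 6 + 2 * num;	/* descriptors + avail */
	size_t used = 6 + 8 * num;			/* used ring */

	return ALIGN(desc_avail, align) + used;
}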

> +
> +	vq_virt = pci_epc_mem_alloc_addr(epf->epc, vq_phys, vq_size);
> +	if (!vq_virt) {
> +		pr_err("Failed to allocate epc memory\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, *vq_phys,
> +			       vq_addr, vq_size);
> +	if (err) {
> +		pr_err("Failed to map virtqueue to local\n");
> +		goto err_free;
> +	}
> +
> +	return vq_virt;
> +
> +err_free:
> +	pci_epc_mem_free_addr(epf->epc, *vq_phys, vq_virt, vq_size);
> +
> +	return ERR_PTR(err);
> +}
> +
> +static void epf_virtio_unmap_vq(struct pci_epf *epf, void __iomem *vq_virt,
> +				phys_addr_t vq_phys, size_t size)
> +{
> +	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, vq_phys);
> +	pci_epc_mem_free_addr(epf->epc, vq_phys, vq_virt,
> +			      vring_size(size, VIRTIO_PCI_VRING_ALIGN));
> +}
> +
> +/**
> + * pci_epf_virtio_alloc_vringh() - allocate epf vringh from @pfn
> + * @epf: the EPF device that communicates with the host virtio driver
> + * @features: the virtio features of device
> + * @pfn: page frame number of virtqueue located on host memory. It is
> + *		passed during virtqueue negotiation.
> + * @size: a length of virtqueue
> + */
> +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> +						   u64 features, u32 pfn,
> +						   size_t size)
> +{
> +	int err;
> +	struct vring vring;
> +	struct pci_epf_vringh *evrh;
> +
> +	evrh = kmalloc(sizeof(*evrh), GFP_KERNEL);
> +	if (!evrh) {
> +		err = -ENOMEM;
> +		goto err_unmap_vq;
> +	}
> +
> +	evrh->size = size;
> +
> +	evrh->virt = epf_virtio_map_vq(epf, pfn, size, &evrh->phys);
> +	if (IS_ERR(evrh->virt))
> +		return evrh->virt;
> +
> +	vring_init(&vring, size, evrh->virt, VIRTIO_PCI_VRING_ALIGN);
> +
> +	err = vringh_init_iomem(&evrh->vrh, features, size, false, GFP_KERNEL,
> +				vring.desc, vring.avail, vring.used);
> +	if (err)
> +		goto err_free_epf_vq;
> +
> +	return evrh;
> +
> +err_free_epf_vq:
> +	kfree(evrh);
> +
> +err_unmap_vq:
> +	epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> +
> +	return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(pci_epf_virtio_alloc_vringh);
> +
> +/**
> + * pci_epf_virtio_free_vringh() - release allocated epf vring
> + * @epf: the EPF device that communicates with the host virtio driver
> + * @evrh: epf vringh to free
> + */
> +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> +				struct pci_epf_vringh *evrh)
> +{
> +	epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> +	kfree(evrh);
> +}
> +EXPORT_SYMBOL_GPL(pci_epf_virtio_free_vringh);
> +
> +MODULE_DESCRIPTION("PCI EP Virtio Library");
> +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/pci-epf-virtio.h b/include/linux/pci-epf-virtio.h
> new file mode 100644
> index 000000000000..ae09087919a9
> --- /dev/null
> +++ b/include/linux/pci-epf-virtio.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * PCI Endpoint Function (EPF) for virtio definitions
> + */
> +#ifndef __LINUX_PCI_EPF_VIRTIO_H
> +#define __LINUX_PCI_EPF_VIRTIO_H
> +
> +#include <linux/types.h>
> +#include <linux/vringh.h>
> +#include <linux/pci-epf.h>
> +
> +struct pci_epf_vringh {
> +	struct vringh vrh;
> +	void __iomem *virt;
> +	phys_addr_t phys;
> +	size_t size;
> +};
> +
> +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> +						   u64 features, u32 pfn,
> +						   size_t size);
> +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> +				struct pci_epf_vringh *evrh);
> +
> +#endif // __LINUX_PCI_EPF_VIRTIO_H
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:04   ` Shunsuke Mie
@ 2023-02-03 10:22     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-03 10:22 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> Add a new endpoint (EP) function driver that provides a virtio-net device. This
> function not only exposes a virtio-net device to the PCIe host system, but also
> provides a virtio-net device to the EP-side (local) system. Those network
> devices are virtually connected, so they can be used to communicate over IP
> like a simple NIC.
> 
> Architecture overview is following:
> 
> to Host       |	                to Endpoint
> network stack |                 network stack
>       |       |                       |
> +-----------+ |	+-----------+   +-----------+
> |virtio-net | |	|virtio-net |   |virtio-net |
> |driver     | |	|EP function|---|driver     |
> +-----------+ |	+-----------+   +-----------+
>       |       |	      |
> +-----------+ | +-----------+
> |PCIeC      | | |PCIeC      |
> |Rootcomplex|-|-|Endpoint   |
> +-----------+ | +-----------+
>   Host side   |          Endpoint side
> 
> This driver uses PCIe EP framework to show virtio-net (pci) device Host
> side, and generate virtual virtio-net device and register to EP side.
> A communication date

data?

> is diractly

directly?

> transported between virtqueue level
> with each other using PCIe embedded DMA controller.
> 
> by a limitation of the hardware and Linux EP framework, this function
> follows a virtio legacy specification.

what exactly is the limitation and why does it force legacy?
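
For context on the legacy-vs-modern point: a modern virtio PCI device has to
carry vendor-specific capabilities in its PCI configuration space that point at
the common/notify/ISR/device-config structures in its BARs. A sketch of that
capability layout, following the virtio 1.0 spec (the kernel's
struct virtio_pci_cap is the authoritative definition; later spec revisions
carve an 'id' byte out of the padding):

#include <linux/types.h>

struct example_virtio_pci_cap {
	__u8 cap_vndr;		/* PCI_CAP_ID_VNDR */
	__u8 cap_next;		/* next capability pointer */
	__u8 cap_len;		/* length of this capability */
	__u8 cfg_type;		/* VIRTIO_PCI_CAP_COMMON_CFG, _NOTIFY_CFG, ... */
	__u8 bar;		/* which BAR holds the structure */
	__u8 padding[3];
	__le32 offset;		/* offset of the structure within the BAR */
	__le32 length;		/* length of the structure */
};

Programming capabilities like this into the endpoint's configuration space is
the part that the pci-epf-vnet-rc.c comment later in this patch describes as
unsupported by current EP hardware and the Linux EPF framework.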

> This function driver has been tested on an S4 Rcar (r8a779fa-spider) board, but
> it only uses the PCIe EP framework and depends on the PCIe eDMA.
> 
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> ---
>  drivers/pci/endpoint/functions/Kconfig        |  12 +
>  drivers/pci/endpoint/functions/Makefile       |   1 +
>  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>  6 files changed, 1440 insertions(+)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> 
> diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
> index 9fd560886871..f88d8baaf689 100644
> --- a/drivers/pci/endpoint/functions/Kconfig
> +++ b/drivers/pci/endpoint/functions/Kconfig
> @@ -37,3 +37,15 @@ config PCI_EPF_VNTB
>  	  between PCI Root Port and PCIe Endpoint.
>  
>  	  If in doubt, say "N" to disable Endpoint NTB driver.
> +
> +config PCI_EPF_VNET
> +	tristate "PCI Endpoint virtio-net driver"
> +	depends on PCI_ENDPOINT
> +	select PCI_ENDPOINT_VIRTIO
> +	select VHOST_RING
> +	select VHOST_IOMEM
> +	help
> +	  PCIe Endpoint virtio-net function implementation. This module exposes a
> +	  virtio-net PCI device to the PCIe host side and another virtio-net
> +	  device to the local machine. The two devices can communicate with
> +	  each other.
> diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
> index 5c13001deaba..74cc4c330c62 100644
> --- a/drivers/pci/endpoint/functions/Makefile
> +++ b/drivers/pci/endpoint/functions/Makefile
> @@ -6,3 +6,4 @@
>  obj-$(CONFIG_PCI_EPF_TEST)		+= pci-epf-test.o
>  obj-$(CONFIG_PCI_EPF_NTB)		+= pci-epf-ntb.o
>  obj-$(CONFIG_PCI_EPF_VNTB) 		+= pci-epf-vntb.o
> +obj-$(CONFIG_PCI_EPF_VNET)		+= pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> new file mode 100644
> index 000000000000..93b7e00e8d06
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> @@ -0,0 +1,343 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Functions for the endpoint (local) side, using the EPF framework
> + */
> +#include <linux/pci-epc.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_net.h>
> +#include <linux/virtio_ring.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
> +{
> +	return container_of(vdev, struct epf_vnet, ep.vdev);
> +}
> +
> +static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
> +{
> +	vnet->ep.net_config_status |= status;
> +}
> +
> +static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
> +{
> +	vnet->ep.net_config_status &= ~status;
> +}
> +
> +static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
> +{
> +	virtio_config_changed(&vnet->ep.vdev);
> +}
> +
> +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_ep_set_status(vnet,
> +			       VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> +	epf_vnet_ep_raise_config_irq(vnet);
> +}
> +
> +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
> +{
> +	vring_interrupt(0, vq);
> +}
> +
> +static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
> +{
> +	struct vringh *vrh = &vnet->ep.ctlvrh;
> +	struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
> +	struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
> +	struct virtio_net_ctrl_hdr *hdr;
> +	virtio_net_ctrl_ack *ack;
> +	int err;
> +	u16 head;
> +	size_t len;
> +
> +	err = vringh_getdesc(vrh, riov, wiov, &head);
> +	if (err <= 0)
> +		goto done;
> +
> +	len = vringh_kiov_length(riov);
> +	if (len < sizeof(*hdr)) {
> +		pr_debug("Command is too short: %ld\n", len);
> +		err = -EIO;
> +		goto done;
> +	}
> +
> +	if (vringh_kiov_length(wiov) < sizeof(*ack)) {
> +		pr_debug("Space for ack is not enough\n");
> +		err = -EIO;
> +		goto done;
> +	}
> +
> +	hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
> +	ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
> +
> +	switch (hdr->class) {
> +	case VIRTIO_NET_CTRL_ANNOUNCE:
> +		if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> +			pr_debug("Invalid command: announce: %d\n", hdr->cmd);
> +			goto done;
> +		}
> +
> +		epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
> +		*ack = VIRTIO_NET_OK;
> +		break;
> +	default:
> +		pr_debug("Found unsupported class: %d\n", hdr->class);
> +		err = -EIO;
> +	}
> +
> +done:
> +	vringh_complete(vrh, head, len);
> +	return err;
> +}
> +
> +static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	return vnet->virtio_features;
> +}
> +
> +static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	if (vdev->features != vnet->virtio_features)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +
> +static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
> +					unsigned int offset, void *buf,
> +					unsigned int len)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +	const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
> +	const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
> +	unsigned int copy_len;
> +
> +	switch (offset) {
> +	case offsetof(struct virtio_net_config, mac):
> +		/* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
> +		 * clear the buffer.
> +		 */
> +		copy_len = len >= mac_len ? mac_len : len;
> +		memset(buf, 0x00, copy_len);
> +		len -= copy_len;
> +		buf += copy_len;
> +		fallthrough;
> +	case offsetof(struct virtio_net_config, status):
> +		copy_len = len >= status_len ? status_len : len;
> +		memcpy(buf, &vnet->ep.net_config_status, copy_len);
> +		len -= copy_len;
> +		buf += copy_len;
> +		fallthrough;
> +	default:
> +		if (offset > sizeof(vnet->vnet_cfg)) {
> +			memset(buf, 0x00, len);
> +			break;
> +		}
> +		memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
> +	}
> +}
> +
> +static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
> +					unsigned int offset, const void *buf,
> +					unsigned int len)
> +{
> +	/* Do nothing, because all of virtio net config space is readonly. */
> +}
> +
> +static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
> +{
> +	return 0;
> +}
> +
> +static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	if (status & VIRTIO_CONFIG_S_DRIVER_OK)
> +		epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
> +}
> +
> +static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
> +{
> +	pr_debug("reset is not supported yet\n");
> +}
> +
> +static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
> +	struct vringh *tx_vrh = &vnet->ep.txvrh;
> +	struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
> +	struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
> +	struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
> +	int err;
> +
> +	/* Support only one queue pair */
> +	switch (vq->index) {
> +	case 0: // rx queue
> +		break;
> +	case 1: // tx queue
> +		while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
> +						rx_iov, DMA_MEM_TO_DEV)) > 0)
> +			;
> +		if (err < 0)
> +			pr_debug("Failed to transmit: EP -> Host: %d\n", err);
> +		break;
> +	case 2: // control queue
> +		epf_vnet_ep_process_ctrlq_entry(vnet);
> +		break;
> +	default:
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
> +				     unsigned int nvqs, struct virtqueue *vqs[],
> +				     vq_callback_t *callback[],
> +				     const char *const names[], const bool *ctx,
> +				     struct irq_affinity *desc)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +	const size_t vq_size = epf_vnet_get_vq_size();
> +	int i;
> +	int err;
> +	int qidx;
> +
> +	for (qidx = 0, i = 0; i < nvqs; i++) {
> +		struct virtqueue *vq;
> +		struct vring *vring;
> +		struct vringh *vrh;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +
> +		vq = vring_create_virtqueue(qidx++, vq_size,
> +					    VIRTIO_PCI_VRING_ALIGN, vdev, true,
> +					    false, ctx ? ctx[i] : false,
> +					    epf_vnet_ep_vdev_vq_notify,
> +					    callback[i], names[i]);
> +		if (!vq) {
> +			err = -ENOMEM;
> +			goto err_del_vqs;
> +		}
> +
> +		vqs[i] = vq;
> +		vring = virtqueue_get_vring(vq);
> +
> +		switch (i) {
> +		case 0: // rx
> +			vrh = &vnet->ep.rxvrh;
> +			vnet->ep.rxvq = vq;
> +			break;
> +		case 1: // tx
> +			vrh = &vnet->ep.txvrh;
> +			vnet->ep.txvq = vq;
> +			break;
> +		case 2: // control
> +			vrh = &vnet->ep.ctlvrh;
> +			vnet->ep.ctlvq = vq;
> +			break;
> +		default:
> +			err = -EIO;
> +			goto err_del_vqs;
> +		}
> +
> +		err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
> +				       true, GFP_KERNEL, vring->desc,
> +				       vring->avail, vring->used);
> +		if (err) {
> +			pr_err("failed to init vringh for vring %d\n", i);
> +			goto err_del_vqs;
> +		}
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +
> +	return 0;
> +
> +err_free_kiov:
> +	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> +
> +err_del_vqs:
> +	for (; i >= 0; i--) {
> +		if (!names[i])
> +			continue;
> +
> +		if (!vqs[i])
> +			continue;
> +
> +		vring_del_virtqueue(vqs[i]);
> +	}
> +	return err;
> +}
> +
> +static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
> +{
> +	struct virtqueue *vq, *n;
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
> +		vring_del_virtqueue(vq);
> +
> +	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> +}
> +
> +static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
> +	.get_features = epf_vnet_ep_vdev_get_features,
> +	.finalize_features = epf_vnet_ep_vdev_finalize_features,
> +	.get = epf_vnet_ep_vdev_get_config,
> +	.set = epf_vnet_ep_vdev_set_config,
> +	.get_status = epf_vnet_ep_vdev_get_status,
> +	.set_status = epf_vnet_ep_vdev_set_status,
> +	.reset = epf_vnet_ep_vdev_reset,
> +	.find_vqs = epf_vnet_ep_vdev_find_vqs,
> +	.del_vqs = epf_vnet_ep_vdev_del_vqs,
> +};
> +
> +void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
> +{
> +	unregister_virtio_device(&vnet->ep.vdev);
> +}
> +
> +int epf_vnet_ep_setup(struct epf_vnet *vnet)
> +{
> +	int err;
> +	struct virtio_device *vdev = &vnet->ep.vdev;
> +
> +	vdev->dev.parent = vnet->epf->epc->dev.parent;
> +	vdev->config = &epf_vnet_ep_vdev_config_ops;
> +	vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
> +	vdev->id.device = VIRTIO_ID_NET;
> +
> +	err = register_virtio_device(vdev);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> new file mode 100644
> index 000000000000..2ca0245a9134
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> @@ -0,0 +1,635 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Functions for the PCIe host (remote) side, using the EPF framework.
> + */
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epc.h>
> +#include <linux/pci_ids.h>
> +#include <linux/sched.h>
> +#include <linux/virtio_pci.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
> +
> +/* Returns an index just outside the range of valid queue indices. */
> +static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
> +
> +{
> +	/* number of queue pairs and control queue */
> +	return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
> +}
> +
> +static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
> +				      void *buf, size_t len)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	memcpy_toio(base, buf, len);
> +}
> +
> +static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
> +				    u8 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite8(ioread8(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
> +				     u16 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite16(ioread16(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
> +				       u16 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite16(ioread16(base) & ~config, base);
> +}
> +
> +static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
> +				     u32 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite32(ioread32(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
> +{
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
> +	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> +}
> +
> +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_rc_set_config16(vnet,
> +				 VIRTIO_PCI_CONFIG_OFF(false) +
> +					 offsetof(struct virtio_net_config,
> +						  status),
> +				 VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> +	epf_vnet_rc_raise_config_irq(vnet);
> +}
> +
> +/*
> + * For the PCIe host, this driver exposes a legacy virtio-net device. The
> + * virtio structure PCI capabilities are mandatory for a modern virtio device,
> + * but there is no PCIe EP hardware that can be configured with arbitrary PCI
> + * capabilities, and the Linux PCIe EP framework doesn't support it either.
> + */
> +static struct pci_epf_header epf_vnet_pci_header = {
> +	.vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
> +	.deviceid = VIRTIO_TRANS_ID_NET,
> +	.subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
> +	.subsys_id = VIRTIO_ID_NET,
> +	.revid = 0,
> +	.baseclass_code = PCI_BASE_CLASS_NETWORK,
> +	.interrupt_pin = PCI_INTERRUPT_PIN,
> +};
> +
> +static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
> +				      void __iomem *cfg_base)
> +{
> +	u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
> +
> +	epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
> +				 vnet->virtio_features);
> +
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
> +	/*
> +	 * Initialize the queue notify and selector registers to a value outside
> +	 * the valid virtqueue index range. This is used to detect changes by
> +	 * polling; there is no other way to detect the host side driver updating
> +	 * those values.
> +	 */
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
> +	/* This pfn is also set to 0 for the polling as well */
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> +
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
> +				 epf_vnet_get_vq_size());
> +	epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
> +	epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
> +				  &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
> +}
> +
> +static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
> +{
> +	struct pci_epf *epf = vnet->epf;
> +
> +	pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
> +			  &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
> +	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> +			   PRIMARY_INTERFACE);
> +}
> +
> +static int epf_vnet_setup_bar(struct epf_vnet *vnet)
> +{
> +	int err;
> +	size_t cfg_bar_size =
> +		VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
> +	struct pci_epf *epf = vnet->epf;
> +	const struct pci_epc_features *features;
> +	struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
> +
> +	features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
> +	if (!features) {
> +		pr_debug("Failed to get PCI EPC features\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
> +		pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> +		if (cfg_bar_size >
> +		    features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> +			pr_debug("PCI BAR size is not enough\n");
> +			return -ENOMEM;
> +		}
> +	}
> +
> +	config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> +
> +	vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
> +						VIRTIO_NET_LEGACY_CFG_BAR,
> +						features->align,
> +						PRIMARY_INTERFACE);
> +	if (!vnet->rc.cfg_base) {
> +		pr_debug("Failed to allocate virtio-net config memory\n");
> +		return -ENOMEM;
> +	}
> +
> +	epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
> +
> +	err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
> +			      config_bar);
> +	if (err) {
> +		pr_debug("Failed to set PCI BAR");
> +		goto err_free_space;
> +	}
> +
> +	return 0;
> +
> +err_free_space:
> +	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> +			   PRIMARY_INTERFACE);
> +	return err;
> +}
> +
> +static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
> +					 u32 *rxpfn, u32 *ctlpfn)
> +{
> +	const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
> +	const u16 default_sel = nqueues;
> +	u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
> +	u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
> +	u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
> +	u32 pfn;
> +	u16 sel;
> +	struct {
> +		u32 pfn;
> +		u16 sel;
> +	} tmp[3] = {};
> +	int tmp_index = 0;
> +
> +	*rxpfn = *txpfn = *ctlpfn = 0;
> +
> +	/* To avoid missing the pfn and selector values written for a virtqueue by
> +	 * the host driver, we need fast polling that saves each observation.
> +	 *
> +	 * This implementation assumes that the host driver writes the pfn only
> +	 * once for each queue.
> +	 */
> +	while (tmp_index < nqueues) {
> +		pfn = ioread32(queue_pfn);
> +		if (pfn == 0)
> +			continue;
> +
> +		iowrite32(0, queue_pfn);
> +
> +		sel = ioread16(queue_sel);
> +		if (sel == default_sel)
> +			continue;
> +
> +		tmp[tmp_index].pfn = pfn;
> +		tmp[tmp_index].sel = sel;
> +		tmp_index++;
> +	}
> +
> +	while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
> +		;
> +
> +	for (int i = 0; i < nqueues; ++i) {
> +		switch (tmp[i].sel) {
> +		case 0:
> +			*rxpfn = tmp[i].pfn;
> +			break;
> +		case 1:
> +			*txpfn = tmp[i].pfn;
> +			break;
> +		case 2:
> +			*ctlpfn = tmp[i].pfn;
> +			break;
> +		}
> +	}
> +
> +	if (!*rxpfn || !*txpfn || !*ctlpfn)
> +		return -EIO;
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_monitor_notify(void *data)
> +{
> +	struct epf_vnet *vnet = data;
> +	u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
> +	const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
> +
> +	epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
> +
> +	/* Poll to detect a change of the queue_notify register. Sometimes this
> +	 * polling misses a change, so check all of the virtqueues
> +	 * every time.
> +	 */
> +	while (true) {
> +		while (ioread16(queue_notify) == notify_default)
> +			;
> +		iowrite16(notify_default, queue_notify);
> +
> +		queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
> +		queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
> +	}
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
> +{
> +	vnet->rc.notify_monitor_task =
> +		kthread_create(epf_vnet_rc_monitor_notify, vnet,
> +			       "pci-epf-vnet/cfg_negotiator");
> +	if (IS_ERR(vnet->rc.notify_monitor_task))
> +		return PTR_ERR(vnet->rc.notify_monitor_task);
> +
> +	/* Change the thread priority to high for polling. */
> +	sched_set_fifo(vnet->rc.notify_monitor_task);
> +	wake_up_process(vnet->rc.notify_monitor_task);
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_device_setup(void *data)
> +{
> +	struct epf_vnet *vnet = data;
> +	struct pci_epf *epf = vnet->epf;
> +	u32 txpfn, rxpfn, ctlpfn;
> +	const size_t vq_size = epf_vnet_get_vq_size();
> +	int err;
> +
> +	err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
> +	if (err) {
> +		pr_debug("Failed to negotiate configs with driver\n");
> +		return err;
> +	}
> +
> +	/* Polling phase is finished. This thread backs to normal priority. */
> +	sched_set_normal(vnet->rc.device_setup_task, 19);
> +
> +	vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> +						     txpfn, vq_size);
> +	if (IS_ERR(vnet->rc.txvrh)) {
> +		pr_debug("Failed to setup virtqueue for tx\n");
> +		return PTR_ERR(vnet->rc.txvrh);
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
> +	if (err)
> +		goto err_free_epf_tx_vringh;
> +
> +	vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> +						     rxpfn, vq_size);
> +	if (IS_ERR(vnet->rc.rxvrh)) {
> +		pr_debug("Failed to setup virtqueue for rx\n");
> +		err = PTR_ERR(vnet->rc.rxvrh);
> +		goto err_deinit_tx_kiov;
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
> +	if (err)
> +		goto err_free_epf_rx_vringh;
> +
> +	vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
> +		epf, vnet->virtio_features, ctlpfn, vq_size);
> +	if (IS_ERR(vnet->rc.ctlvrh)) {
> +		pr_err("failed to setup virtqueue\n");
> +		err = PTR_ERR(vnet->rc.ctlvrh);
> +		goto err_deinit_rx_kiov;
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
> +	if (err)
> +		goto err_free_epf_ctl_vringh;
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
> +	if (err)
> +		goto err_deinit_ctl_riov;
> +
> +	err = epf_vnet_rc_spawn_notify_monitor(vnet);
> +	if (err) {
> +		pr_debug("Failed to create notify monitor thread\n");
> +		goto err_deinit_ctl_wiov;
> +	}
> +
> +	return 0;
> +
> +err_deinit_ctl_wiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
> +err_deinit_ctl_riov:
> +	epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
> +err_free_epf_ctl_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
> +err_deinit_rx_kiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
> +err_free_epf_rx_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
> +err_deinit_tx_kiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
> +err_free_epf_tx_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
> +
> +	return err;
> +}
> +
> +static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
> +{
> +	vnet->rc.device_setup_task = kthread_create(
> +		epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
> +	if (IS_ERR(vnet->rc.device_setup_task))
> +		return PTR_ERR(vnet->rc.device_setup_task);
> +
> +	/* Change the thread priority to high for the polling. */
> +	sched_set_fifo(vnet->rc.device_setup_task);
> +	wake_up_process(vnet->rc.device_setup_task);
> +
> +	return 0;
> +}
> +
> +static void epf_vnet_rc_tx_handler(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
> +	struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
> +	struct vringh *rx_vrh = &vnet->ep.rxvrh;
> +	struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
> +	struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
> +
> +	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
> +				 DMA_DEV_TO_MEM) > 0)
> +		;
> +}
> +
> +static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet =
> +		container_of(work, struct epf_vnet, rc.raise_irq_work);
> +	struct pci_epf *epf = vnet->epf;
> +
> +	pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
> +			  PCI_EPC_IRQ_LEGACY, 0);
> +}
> +
> +struct epf_vnet_rc_meminfo {
> +	void __iomem *addr, *virt;
> +	phys_addr_t phys;
> +	size_t len;
> +};
> +
> +/* Util function to access PCIe host side memory from local CPU.  */
> +static struct epf_vnet_rc_meminfo *
> +epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
> +{
> +	int err;
> +	phys_addr_t aaddr, phys_addr;
> +	size_t asize, offset;
> +	void __iomem *virt_addr;
> +	struct epf_vnet_rc_meminfo *meminfo;
> +
> +	err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
> +	if (err) {
> +		pr_debug("Failed to get EPC align: %d\n", err);
> +		return NULL;
> +	}
> +
> +	offset = pci_addr - aaddr;
> +
> +	virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
> +	if (!virt_addr) {
> +		pr_debug("Failed to allocate epc memory\n");
> +		return NULL;
> +	}
> +
> +	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
> +			       aaddr, asize);
> +	if (err) {
> +		pr_debug("Failed to map epc memory\n");
> +		goto err_epc_free_addr;
> +	}
> +
> +	meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
> +	if (!meminfo)
> +		goto err_epc_unmap_addr;
> +
> +	meminfo->virt = virt_addr;
> +	meminfo->phys = phys_addr;
> +	meminfo->len = len;
> +	meminfo->addr = virt_addr + offset;
> +
> +	return meminfo;
> +
> +err_epc_unmap_addr:
> +	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> +			   meminfo->phys);
> +err_epc_free_addr:
> +	pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> +			      meminfo->len);
> +
> +	return NULL;
> +}
> +
> +static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
> +				   struct epf_vnet_rc_meminfo *meminfo)
> +{
> +	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> +			   meminfo->phys);
> +	pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> +			      meminfo->len);
> +	kfree(meminfo);
> +}
> +
> +static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
> +{
> +	struct vringh_kiov *riov = &vnet->rc.ctl_riov;
> +	struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
> +	struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
> +	struct pci_epf *epf = vnet->epf;
> +	struct epf_vnet_rc_meminfo *rmem, *wmem;
> +	struct virtio_net_ctrl_hdr *hdr;
> +	int err;
> +	u16 head;
> +	size_t total_len;
> +	u8 class, cmd;
> +
> +	err = vringh_getdesc(vrh, riov, wiov, &head);
> +	if (err <= 0)
> +		return err;
> +
> +	total_len = vringh_kiov_length(riov);
> +
> +	rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
> +				    riov->iov[riov->i].iov_len);
> +	if (!rmem) {
> +		err = -ENOMEM;
> +		goto err_abandon_descs;
> +	}
> +
> +	wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
> +				    wiov->iov[wiov->i].iov_len);
> +	if (!wmem) {
> +		err = -ENOMEM;
> +		goto err_epc_unmap_rmem;
> +	}
> +
> +	hdr = rmem->addr;
> +	class = ioread8(&hdr->class);
> +	cmd = ioread8(&hdr->cmd);
> +	switch (ioread8(&hdr->class)) {
> +	case VIRTIO_NET_CTRL_ANNOUNCE:
> +		if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> +			pr_err("Found invalid command: announce: %d\n", cmd);
> +			break;
> +		}
> +		epf_vnet_rc_clear_config16(
> +			vnet,
> +			VIRTIO_PCI_CONFIG_OFF(false) +
> +				offsetof(struct virtio_net_config, status),
> +			VIRTIO_NET_S_ANNOUNCE);
> +		epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
> +					   VIRTIO_PCI_ISR_CONFIG);
> +
> +		iowrite8(VIRTIO_NET_OK, wmem->addr);
> +		break;
> +	default:
> +		pr_err("Found unsupported class in control queue: %d\n", class);
> +		break;
> +	}
> +
> +	epf_vnet_rc_epc_munmap(epf, rmem);
> +	epf_vnet_rc_epc_munmap(epf, wmem);
> +	vringh_complete(vrh, head, total_len);
> +
> +	return 1;
> +
> +err_epc_unmap_rmem:
> +	epf_vnet_rc_epc_munmap(epf, rmem);
> +err_abandon_descs:
> +	vringh_abandon(vrh, head);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet =
> +		container_of(work, struct epf_vnet, rc.ctl_work);
> +
> +	while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
> +		;
> +}
> +
> +void epf_vnet_rc_notify(struct epf_vnet *vnet)
> +{
> +	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> +}
> +
> +void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_cleanup_bar(vnet);
> +	destroy_workqueue(vnet->rc.tx_wq);
> +	destroy_workqueue(vnet->rc.irq_wq);
> +	destroy_workqueue(vnet->rc.ctl_wq);
> +
> +	kthread_stop(vnet->rc.device_setup_task);
> +}
> +
> +int epf_vnet_rc_setup(struct epf_vnet *vnet)
> +{
> +	int err;
> +	struct pci_epf *epf = vnet->epf;
> +
> +	err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
> +				   &epf_vnet_pci_header);
> +	if (err)
> +		return err;
> +
> +	err = epf_vnet_setup_bar(vnet);
> +	if (err)
> +		return err;
> +
> +	vnet->rc.tx_wq =
> +		alloc_workqueue("pci-epf-vnet/tx-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.tx_wq) {
> +		pr_debug(
> +			"Failed to allocate workqueue for rc -> ep transmission\n");
> +		err = -ENOMEM;
> +		goto err_cleanup_bar;
> +	}
> +
> +	INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
> +
> +	vnet->rc.irq_wq =
> +		alloc_workqueue("pci-epf-vnet/irq-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.irq_wq) {
> +		pr_debug("Failed to allocate workqueue for irq\n");
> +		err = -ENOMEM;
> +		goto err_destory_tx_wq;
> +	}
> +
> +	INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
> +
> +	vnet->rc.ctl_wq =
> +		alloc_workqueue("pci-epf-vnet/ctl-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.ctl_wq) {
> +		pr_err("Failed to allocate work queue for control queue processing\n");
> +		err = -ENOMEM;
> +		goto err_destory_irq_wq;
> +	}
> +
> +	INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
> +
> +	err = epf_vnet_rc_spawn_device_setup_task(vnet);
> +	if (err)
> +		goto err_destory_ctl_wq;
> +
> +	return 0;
> +
> +err_cleanup_bar:
> +	epf_vnet_cleanup_bar(vnet);
> +err_destory_tx_wq:
> +	destroy_workqueue(vnet->rc.tx_wq);
> +err_destory_irq_wq:
> +	destroy_workqueue(vnet->rc.irq_wq);
> +err_destory_ctl_wq:
> +	destroy_workqueue(vnet->rc.ctl_wq);
> +
> +	return err;
> +}
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> new file mode 100644
> index 000000000000..e48ad8067796
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> @@ -0,0 +1,387 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCI Endpoint function driver to implement a virtio-net device.
> + */
> +#include <linux/module.h>
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epc.h>
> +#include <linux/vringh.h>
> +#include <linux/dmaengine.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +static int virtio_queue_size = 0x100;
> +module_param(virtio_queue_size, int, 0444);
> +MODULE_PARM_DESC(virtio_queue_size, "Length of the virtqueue");
> +
> +int epf_vnet_get_vq_size(void)
> +{
> +	return virtio_queue_size;
> +}
> +
> +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
> +{
> +	struct kvec *kvec;
> +
> +	kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
> +	if (!kvec)
> +		return -ENOMEM;
> +
> +	vringh_kiov_init(kiov, kvec, vq_size);
> +
> +	return 0;
> +}
> +
> +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
> +{
> +	kfree(kiov->iov);
> +}
> +
> +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
> +{
> +	vnet->init_complete |= from;
> +
> +	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
> +		return;
> +
> +	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
> +		return;
> +
> +	epf_vnet_ep_announce_linkup(vnet);
> +	epf_vnet_rc_announce_linkup(vnet);
> +}
> +
> +struct epf_dma_filter_param {
> +	struct device *dev;
> +	u32 dma_mask;
> +};
> +
> +static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
> +{
> +	struct epf_dma_filter_param *fparam = param;
> +	struct dma_slave_caps caps;
> +
> +	memset(&caps, 0, sizeof(caps));
> +	dma_get_slave_caps(chan, &caps);
> +
> +	return chan->device->dev == fparam->dev &&
> +	       (fparam->dma_mask & caps.directions);
> +}
> +
> +static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
> +{
> +	struct epf_dma_filter_param param;
> +	dma_cap_mask_t mask;
> +	int err;
> +
> +	dma_cap_zero(mask);
> +	dma_cap_set(DMA_SLAVE, mask);
> +
> +	param.dev = dma_dev;
> +	param.dma_mask = BIT(DMA_MEM_TO_DEV);
> +	vnet->lr_dma_chan =
> +		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> +	if (!vnet->lr_dma_chan)
> +		return -EOPNOTSUPP;
> +
> +	param.dma_mask = BIT(DMA_DEV_TO_MEM);
> +	vnet->rl_dma_chan =
> +		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> +	if (!vnet->rl_dma_chan) {
> +		err = -EOPNOTSUPP;
> +		goto err_release_channel;
> +	}
> +
> +	return 0;
> +
> +err_release_channel:
> +	dma_release_channel(vnet->lr_dma_chan);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
> +{
> +	dma_release_channel(vnet->lr_dma_chan);
> +	dma_release_channel(vnet->rl_dma_chan);
> +}
> +
> +static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
> +			       dma_addr_t dma, size_t len,
> +			       void (*callback)(void *), void *param,
> +			       enum dma_transfer_direction dir)
> +{
> +	struct dma_async_tx_descriptor *desc;
> +	int err;
> +	struct dma_chan *chan;
> +	struct dma_slave_config sconf;
> +	dma_cookie_t cookie;
> +	unsigned long flags = 0;
> +
> +	if (dir == DMA_MEM_TO_DEV) {
> +		sconf.dst_addr = pci;
> +		chan = vnet->lr_dma_chan;
> +	} else {
> +		sconf.src_addr = pci;
> +		chan = vnet->rl_dma_chan;
> +	}
> +
> +	err = dmaengine_slave_config(chan, &sconf);
> +	if (unlikely(err))
> +		return err;
> +
> +	if (callback)
> +		flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
> +
> +	desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
> +	if (unlikely(!desc))
> +		return -EIO;
> +
> +	desc->callback = callback;
> +	desc->callback_param = param;
> +
> +	cookie = dmaengine_submit(desc);
> +	err = dma_submit_error(cookie);
> +	if (unlikely(err))
> +		return err;
> +
> +	dma_async_issue_pending(chan);
> +
> +	return 0;
> +}
> +
> +struct epf_vnet_dma_callback_param {
> +	struct epf_vnet *vnet;
> +	struct vringh *tx_vrh, *rx_vrh;
> +	struct virtqueue *vq;
> +	size_t total_len;
> +	u16 tx_head, rx_head;
> +};
> +
> +static void epf_vnet_dma_callback(void *p)
> +{
> +	struct epf_vnet_dma_callback_param *param = p;
> +	struct epf_vnet *vnet = param->vnet;
> +
> +	vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
> +	vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
> +
> +	epf_vnet_rc_notify(vnet);
> +	epf_vnet_ep_notify(vnet, param->vq);
> +
> +	kfree(param);
> +}
> +
> +/**
> + * epf_vnet_transfer() - transfer data between tx vring to rx vring using edma
> + * @vnet: epf virtio net device to do dma
> + * @tx_vrh: vringh related to source tx vring
> + * @rx_vrh: vringh related to target rx vring
> + * @tx_iov: buffer to use tx
> + * @rx_iov: buffer to use rx
> + * @dir: a direction of DMA. local to remote or local from remote
> + *
> + * This function returns 0, 1, or a negative error number. 0 indicates there
> + * is no data to send. 1 indicates that a DMA request succeeded. Negative
> + * values indicate an error; -ENOSPC means there is no buffer on the target
> + * vring, so the caller should retry later.
> + */
> +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> +		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> +		      struct vringh_kiov *rx_iov,
> +		      enum dma_transfer_direction dir)
> +{
> +	int err;
> +	u16 tx_head, rx_head;
> +	size_t total_tx_len;
> +	struct epf_vnet_dma_callback_param *cb_param;
> +	struct vringh_kiov *liov, *riov;
> +
> +	err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
> +	if (err <= 0)
> +		return err;
> +
> +	total_tx_len = vringh_kiov_length(tx_iov);
> +
> +	err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
> +	if (err < 0) {
> +		goto err_tx_complete;
> +	} else if (!err) {
> +		/* There is not space on a vring of destination to transmit data, so
> +		 * rollback tx vringh
> +		 */
> +		vringh_abandon(tx_vrh, tx_head);
> +		return -ENOSPC;
> +	}
> +
> +	cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
> +	if (!cb_param) {
> +		err = -ENOMEM;
> +		goto err_rx_complete;
> +	}
> +
> +	cb_param->tx_vrh = tx_vrh;
> +	cb_param->rx_vrh = rx_vrh;
> +	cb_param->tx_head = tx_head;
> +	cb_param->rx_head = rx_head;
> +	cb_param->total_len = total_tx_len;
> +	cb_param->vnet = vnet;
> +
> +	switch (dir) {
> +	case DMA_MEM_TO_DEV:
> +		liov = tx_iov;
> +		riov = rx_iov;
> +		cb_param->vq = vnet->ep.txvq;
> +		break;
> +	case DMA_DEV_TO_MEM:
> +		liov = rx_iov;
> +		riov = tx_iov;
> +		cb_param->vq = vnet->ep.rxvq;
> +		break;
> +	default:
> +		err = -EINVAL;
> +		goto err_free_param;
> +	}
> +
> +	for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
> +		size_t len;
> +		u64 lbase, rbase;
> +		void (*callback)(void *) = NULL;
> +
> +		lbase = (u64)liov->iov[liov->i].iov_base;
> +		rbase = (u64)riov->iov[riov->i].iov_base;
> +		len = tx_iov->iov[tx_iov->i].iov_len;
> +
> +		if (tx_iov->i + 1 == tx_iov->used)
> +			callback = epf_vnet_dma_callback;
> +
> +		err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
> +					  cb_param, dir);
> +		if (err)
> +			goto err_free_param;
> +	}
> +
> +	return 1;
> +
> +err_free_param:
> +	kfree(cb_param);
> +err_rx_complete:
> +	vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
> +err_tx_complete:
> +	vringh_complete(tx_vrh, tx_head, total_tx_len);
> +
> +	return err;
> +}
> +
> +static int epf_vnet_bind(struct pci_epf *epf)
> +{
> +	int err;
> +	struct epf_vnet *vnet = epf_get_drvdata(epf);
> +
> +	err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
> +	if (err)
> +		return err;
> +
> +	err = epf_vnet_rc_setup(vnet);
> +	if (err)
> +		goto err_free_edma;
> +
> +	err = epf_vnet_ep_setup(vnet);
> +	if (err)
> +		goto err_cleanup_rc;
> +
> +	return 0;
> +
> +err_free_edma:
> +	epf_vnet_deinit_edma(vnet);
> +err_cleanup_rc:
> +	epf_vnet_rc_cleanup(vnet);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_unbind(struct pci_epf *epf)
> +{
> +	struct epf_vnet *vnet = epf_get_drvdata(epf);
> +
> +	epf_vnet_deinit_edma(vnet);
> +	epf_vnet_rc_cleanup(vnet);
> +	epf_vnet_ep_cleanup(vnet);
> +}
> +
> +static struct pci_epf_ops epf_vnet_ops = {
> +	.bind = epf_vnet_bind,
> +	.unbind = epf_vnet_unbind,
> +};
> +
> +static const struct pci_epf_device_id epf_vnet_ids[] = {
> +	{ .name = "pci_epf_vnet" },
> +	{}
> +};
> +
> +static void epf_vnet_virtio_init(struct epf_vnet *vnet)
> +{
> +	vnet->virtio_features =
> +		BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
> +		/* The following features are used to skip checksum validation and
> +		 * offloads, as for transmission between virtual machines on the same
> +		 * system. Details are in section 5.1.5 of the virtio specification.
> +		 */
> +		BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
> +		BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
> +		BIT(VIRTIO_NET_F_GUEST_UFO) |
> +		// The control queue is just used for linkup announcement.
> +		BIT(VIRTIO_NET_F_CTRL_VQ);
> +
> +	vnet->vnet_cfg.max_virtqueue_pairs = 1;
> +	vnet->vnet_cfg.status = 0;
> +	vnet->vnet_cfg.mtu = PAGE_SIZE;
> +}
> +
> +static int epf_vnet_probe(struct pci_epf *epf)
> +{
> +	struct epf_vnet *vnet;
> +
> +	vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
> +	if (!vnet)
> +		return -ENOMEM;
> +
> +	epf_set_drvdata(epf, vnet);
> +	vnet->epf = epf;
> +
> +	epf_vnet_virtio_init(vnet);
> +
> +	return 0;
> +}
> +
> +static struct pci_epf_driver epf_vnet_drv = {
> +	.driver.name = "pci_epf_vnet",
> +	.ops = &epf_vnet_ops,
> +	.id_table = epf_vnet_ids,
> +	.probe = epf_vnet_probe,
> +	.owner = THIS_MODULE,
> +};
> +
> +static int __init epf_vnet_init(void)
> +{
> +	int err;
> +
> +	err = pci_epf_register_driver(&epf_vnet_drv);
> +	if (err) {
> +		pr_err("Failed to register epf vnet driver\n");
> +		return err;
> +	}
> +
> +	return 0;
> +}
> +module_init(epf_vnet_init);
> +
> +static void epf_vnet_exit(void)
> +{
> +	pci_epf_unregister_driver(&epf_vnet_drv);
> +}
> +module_exit(epf_vnet_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> +MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> new file mode 100644
> index 000000000000..1e0f90c95578
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _PCI_EPF_VNET_H
> +#define _PCI_EPF_VNET_H
> +
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epf-virtio.h>
> +#include <linux/virtio_net.h>
> +#include <linux/dmaengine.h>
> +#include <linux/virtio.h>
> +
> +struct epf_vnet {
> +	//TODO Should this variable be placed here?
> +	struct pci_epf *epf;
> +	struct virtio_net_config vnet_cfg;
> +	u64 virtio_features;
> +
> +	// dma channels for local to remote(lr) and remote to local(rl)
> +	struct dma_chan *lr_dma_chan, *rl_dma_chan;
> +
> +	struct {
> +		void __iomem *cfg_base;
> +		struct task_struct *device_setup_task;
> +		struct task_struct *notify_monitor_task;
> +		struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
> +		struct work_struct tx_work, raise_irq_work, ctl_work;
> +		struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
> +		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> +	} rc;
> +
> +	struct {
> +		struct virtqueue *rxvq, *txvq, *ctlvq;
> +		struct vringh txvrh, rxvrh, ctlvrh;
> +		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> +		struct virtio_device vdev;
> +		u16 net_config_status;
> +	} ep;
> +
> +#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
> +#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
> +	u8 init_complete;
> +};
> +
> +int epf_vnet_rc_setup(struct epf_vnet *vnet);
> +void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
> +int epf_vnet_ep_setup(struct epf_vnet *vnet);
> +void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
> +
> +int epf_vnet_get_vq_size(void);
> +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
> +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
> +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> +		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> +		      struct vringh_kiov *rx_iov,
> +		      enum dma_transfer_direction dir);
> +void epf_vnet_rc_notify(struct epf_vnet *vnet);
> +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
> +
> +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
> +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
> +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
> +
> +#endif // _PCI_EPF_VNET_H
> -- 
> 2.25.1



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
@ 2023-02-03 10:22     ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-03 10:22 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> Add a new endpoint (EP) function driver that provides a virtio-net device. This
> function not only exposes a virtio-net device to the PCIe host system, but also
> provides a virtio-net device to the EP-side (local) system. Virtually those
> network devices are connected to each other, so they can be used to communicate
> over IP like a simple NIC.
> 
> Architecture overview is following:
> 
> to Host       |	                to Endpoint
> network stack |                 network stack
>       |       |                       |
> +-----------+ |	+-----------+   +-----------+
> |virtio-net | |	|virtio-net |   |virtio-net |
> |driver     | |	|EP function|---|driver     |
> +-----------+ |	+-----------+   +-----------+
>       |       |	      |
> +-----------+ | +-----------+
> |PCIeC      | | |PCIeC      |
> |Rootcomplex|-|-|Endpoint   |
> +-----------+ | +-----------+
>   Host side   |          Endpoint side
> 
> This driver uses PCIe EP framework to show virtio-net (pci) device Host
> side, and generate virtual virtio-net device and register to EP side.
> A communication date

data?

> is diractly

directly?

> transported between virtqueue level
> with each other using PCIe embedded DMA controller.
> 
> by a limitation of the hardware and Linux EP framework, this function
> follows a virtio legacy specification.

what exactly is the limitation and why does it force legacy?

> This function driver has been tested on an S4 Rcar (r8a779fa-spider) board but
> just uses the PCIe EP framework and depends on the PCIe EDMA.
> 
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> ---
>  drivers/pci/endpoint/functions/Kconfig        |  12 +
>  drivers/pci/endpoint/functions/Makefile       |   1 +
>  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>  6 files changed, 1440 insertions(+)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> 
> diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
> index 9fd560886871..f88d8baaf689 100644
> --- a/drivers/pci/endpoint/functions/Kconfig
> +++ b/drivers/pci/endpoint/functions/Kconfig
> @@ -37,3 +37,15 @@ config PCI_EPF_VNTB
>  	  between PCI Root Port and PCIe Endpoint.
>  
>  	  If in doubt, say "N" to disable Endpoint NTB driver.
> +
> +config PCI_EPF_VNET
> +	tristate "PCI Endpoint virtio-net driver"
> +	depends on PCI_ENDPOINT
> +	select PCI_ENDPOINT_VIRTIO
> +	select VHOST_RING
> +	select VHOST_IOMEM
> +	help
> +	  PCIe Endpoint virtio-net function implementation. This module exposes a
> +	  virtio-net PCI device to the PCIe host side and another virtio-net
> +	  device to the local machine. Those devices can communicate with each
> +	  other.
> diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
> index 5c13001deaba..74cc4c330c62 100644
> --- a/drivers/pci/endpoint/functions/Makefile
> +++ b/drivers/pci/endpoint/functions/Makefile
> @@ -6,3 +6,4 @@
>  obj-$(CONFIG_PCI_EPF_TEST)		+= pci-epf-test.o
>  obj-$(CONFIG_PCI_EPF_NTB)		+= pci-epf-ntb.o
>  obj-$(CONFIG_PCI_EPF_VNTB) 		+= pci-epf-vntb.o
> +obj-$(CONFIG_PCI_EPF_VNET)		+= pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> new file mode 100644
> index 000000000000..93b7e00e8d06
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> @@ -0,0 +1,343 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Functions that work for the Endpoint (local) side using the EPF framework
> + */
> +#include <linux/pci-epc.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_net.h>
> +#include <linux/virtio_ring.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
> +{
> +	return container_of(vdev, struct epf_vnet, ep.vdev);
> +}
> +
> +static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
> +{
> +	vnet->ep.net_config_status |= status;
> +}
> +
> +static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
> +{
> +	vnet->ep.net_config_status &= ~status;
> +}
> +
> +static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
> +{
> +	virtio_config_changed(&vnet->ep.vdev);
> +}
> +
> +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_ep_set_status(vnet,
> +			       VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> +	epf_vnet_ep_raise_config_irq(vnet);
> +}
> +
> +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
> +{
> +	vring_interrupt(0, vq);
> +}
> +
> +static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
> +{
> +	struct vringh *vrh = &vnet->ep.ctlvrh;
> +	struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
> +	struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
> +	struct virtio_net_ctrl_hdr *hdr;
> +	virtio_net_ctrl_ack *ack;
> +	int err;
> +	u16 head;
> +	size_t len;
> +
> +	err = vringh_getdesc(vrh, riov, wiov, &head);
> +	if (err <= 0)
> +		return err;
> +
> +	len = vringh_kiov_length(riov);
> +	if (len < sizeof(*hdr)) {
> +		pr_debug("Command is too short: %ld\n", len);
> +		err = -EIO;
> +		goto done;
> +	}
> +
> +	if (vringh_kiov_length(wiov) < sizeof(*ack)) {
> +		pr_debug("Space for ack is not enough\n");
> +		err = -EIO;
> +		goto done;
> +	}
> +
> +	hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
> +	ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
> +
> +	switch (hdr->class) {
> +	case VIRTIO_NET_CTRL_ANNOUNCE:
> +		if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> +			pr_debug("Invalid command: announce: %d\n", hdr->cmd);
> +			goto done;
> +		}
> +
> +		epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
> +		*ack = VIRTIO_NET_OK;
> +		break;
> +	default:
> +		pr_debug("Found not supported class: %d\n", hdr->class);
> +		err = -EIO;
> +	}
> +
> +done:
> +	vringh_complete(vrh, head, len);
> +	return err;
> +}
> +
> +static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	return vnet->virtio_features;
> +}
> +
> +static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	if (vdev->features != vnet->virtio_features)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +
> +static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
> +					unsigned int offset, void *buf,
> +					unsigned int len)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +	const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
> +	const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
> +	unsigned int copy_len;
> +
> +	switch (offset) {
> +	case offsetof(struct virtio_net_config, mac):
> +		/* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
> +		 * clear the buffer.
> +		 */
> +		copy_len = len >= mac_len ? mac_len : len;
> +		memset(buf, 0x00, copy_len);
> +		len -= copy_len;
> +		buf += copy_len;
> +		fallthrough;
> +	case offsetof(struct virtio_net_config, status):
> +		copy_len = len >= status_len ? status_len : len;
> +		memcpy(buf, &vnet->ep.net_config_status, copy_len);
> +		len -= copy_len;
> +		buf += copy_len;
> +		fallthrough;
> +	default:
> +		if (offset > sizeof(vnet->vnet_cfg)) {
> +			memset(buf, 0x00, len);
> +			break;
> +		}
> +		memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
> +	}
> +}
> +
> +static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
> +					unsigned int offset, const void *buf,
> +					unsigned int len)
> +{
> +	/* Do nothing, because all of virtio net config space is readonly. */
> +}
> +
> +static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
> +{
> +	return 0;
> +}
> +
> +static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	if (status & VIRTIO_CONFIG_S_DRIVER_OK)
> +		epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
> +}
> +
> +static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
> +{
> +	pr_debug("doesn't support yet");
> +}
> +
> +static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
> +	struct vringh *tx_vrh = &vnet->ep.txvrh;
> +	struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
> +	struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
> +	struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
> +	int err;
> +
> +	/* Support only one queue pair */
> +	switch (vq->index) {
> +	case 0: // rx queue
> +		break;
> +	case 1: // tx queue
> +		while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
> +						rx_iov, DMA_MEM_TO_DEV)) > 0)
> +			;
> +		if (err < 0)
> +			pr_debug("Failed to transmit: EP -> Host: %d\n", err);
> +		break;
> +	case 2: // control queue
> +		epf_vnet_ep_process_ctrlq_entry(vnet);
> +		break;
> +	default:
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
> +				     unsigned int nvqs, struct virtqueue *vqs[],
> +				     vq_callback_t *callback[],
> +				     const char *const names[], const bool *ctx,
> +				     struct irq_affinity *desc)
> +{
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +	const size_t vq_size = epf_vnet_get_vq_size();
> +	int i;
> +	int err;
> +	int qidx;
> +
> +	for (qidx = 0, i = 0; i < nvqs; i++) {
> +		struct virtqueue *vq;
> +		struct vring *vring;
> +		struct vringh *vrh;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +
> +		vq = vring_create_virtqueue(qidx++, vq_size,
> +					    VIRTIO_PCI_VRING_ALIGN, vdev, true,
> +					    false, ctx ? ctx[i] : false,
> +					    epf_vnet_ep_vdev_vq_notify,
> +					    callback[i], names[i]);
> +		if (!vq) {
> +			err = -ENOMEM;
> +			goto err_del_vqs;
> +		}
> +
> +		vqs[i] = vq;
> +		vring = virtqueue_get_vring(vq);
> +
> +		switch (i) {
> +		case 0: // rx
> +			vrh = &vnet->ep.rxvrh;
> +			vnet->ep.rxvq = vq;
> +			break;
> +		case 1: // tx
> +			vrh = &vnet->ep.txvrh;
> +			vnet->ep.txvq = vq;
> +			break;
> +		case 2: // control
> +			vrh = &vnet->ep.ctlvrh;
> +			vnet->ep.ctlvq = vq;
> +			break;
> +		default:
> +			err = -EIO;
> +			goto err_del_vqs;
> +		}
> +
> +		err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
> +				       true, GFP_KERNEL, vring->desc,
> +				       vring->avail, vring->used);
> +		if (err) {
> +			pr_err("failed to init vringh for vring %d\n", i);
> +			goto err_del_vqs;
> +		}
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +	err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
> +	if (err)
> +		goto err_free_kiov;
> +
> +	return 0;
> +
> +err_free_kiov:
> +	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> +
> +err_del_vqs:
> +	for (; i >= 0; i--) {
> +		if (!names[i])
> +			continue;
> +
> +		if (!vqs[i])
> +			continue;
> +
> +		vring_del_virtqueue(vqs[i]);
> +	}
> +	return err;
> +}
> +
> +static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
> +{
> +	struct virtqueue *vq, *n;
> +	struct epf_vnet *vnet = vdev_to_vnet(vdev);
> +
> +	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
> +		vring_del_virtqueue(vq);
> +
> +	epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> +	epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> +}
> +
> +static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
> +	.get_features = epf_vnet_ep_vdev_get_features,
> +	.finalize_features = epf_vnet_ep_vdev_finalize_features,
> +	.get = epf_vnet_ep_vdev_get_config,
> +	.set = epf_vnet_ep_vdev_set_config,
> +	.get_status = epf_vnet_ep_vdev_get_status,
> +	.set_status = epf_vnet_ep_vdev_set_status,
> +	.reset = epf_vnet_ep_vdev_reset,
> +	.find_vqs = epf_vnet_ep_vdev_find_vqs,
> +	.del_vqs = epf_vnet_ep_vdev_del_vqs,
> +};
> +
> +void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
> +{
> +	unregister_virtio_device(&vnet->ep.vdev);
> +}
> +
> +int epf_vnet_ep_setup(struct epf_vnet *vnet)
> +{
> +	int err;
> +	struct virtio_device *vdev = &vnet->ep.vdev;
> +
> +	vdev->dev.parent = vnet->epf->epc->dev.parent;
> +	vdev->config = &epf_vnet_ep_vdev_config_ops;
> +	vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
> +	vdev->id.device = VIRTIO_ID_NET;
> +
> +	err = register_virtio_device(vdev);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> new file mode 100644
> index 000000000000..2ca0245a9134
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> @@ -0,0 +1,635 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Functions that work for the PCIe Host (remote) side using the EPF framework.
> + */
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epc.h>
> +#include <linux/pci_ids.h>
> +#include <linux/sched.h>
> +#include <linux/virtio_pci.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
> +
> +/* Returns a value outside the range of valid queue indices. */
> +static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
> +
> +{
> +	/* number of queue pairs and control queue */
> +	return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
> +}
> +
> +static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
> +				      void *buf, size_t len)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	memcpy_toio(base, buf, len);
> +}
> +
> +static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
> +				    u8 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite8(ioread8(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
> +				     u16 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite16(ioread16(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
> +				       u16 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite16(ioread16(base) & ~config, base);
> +}
> +
> +static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
> +				     u32 config)
> +{
> +	void __iomem *base = vnet->rc.cfg_base + offset;
> +
> +	iowrite32(ioread32(base) | config, base);
> +}
> +
> +static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
> +{
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
> +	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> +}
> +
> +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_rc_set_config16(vnet,
> +				 VIRTIO_PCI_CONFIG_OFF(false) +
> +					 offsetof(struct virtio_net_config,
> +						  status),
> +				 VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> +	epf_vnet_rc_raise_config_irq(vnet);
> +}
> +
> +/*
> + * For the PCIe host, this driver exposes a legacy virtio-net device, because
> + * the virtio structure PCI capabilities are mandatory for a modern virtio
> + * device, but there is no PCIe EP hardware that can be configured with such
> + * PCI capabilities and the Linux PCIe EP framework doesn't support them.
> + */
> +static struct pci_epf_header epf_vnet_pci_header = {
> +	.vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
> +	.deviceid = VIRTIO_TRANS_ID_NET,
> +	.subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
> +	.subsys_id = VIRTIO_ID_NET,
> +	.revid = 0,
> +	.baseclass_code = PCI_BASE_CLASS_NETWORK,
> +	.interrupt_pin = PCI_INTERRUPT_PIN,
> +};
> +
> +static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
> +				      void __iomem *cfg_base)
> +{
> +	u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
> +
> +	epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
> +				 vnet->virtio_features);
> +
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
> +	/*
> +	 * Initialize the queue notify and selector to a value outside the valid
> +	 * virtqueue index range. It is used to detect changes by polling. There is
> +	 * no other way to detect the host side driver updating those values.
> +	 */
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
> +	/* This pfn is set to 0 for the polling as well */
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> +
> +	epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
> +				 epf_vnet_get_vq_size());
> +	epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
> +	epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
> +				  &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
> +}
> +
> +static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
> +{
> +	struct pci_epf *epf = vnet->epf;
> +
> +	pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
> +			  &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
> +	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> +			   PRIMARY_INTERFACE);
> +}
> +
> +static int epf_vnet_setup_bar(struct epf_vnet *vnet)
> +{
> +	int err;
> +	size_t cfg_bar_size =
> +		VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
> +	struct pci_epf *epf = vnet->epf;
> +	const struct pci_epc_features *features;
> +	struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
> +
> +	features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
> +	if (!features) {
> +		pr_debug("Failed to get PCI EPC features\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
> +		pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> +		if (cfg_bar_size >
> +		    features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> +			pr_debug("PCI BAR size is not enough\n");
> +			return -ENOMEM;
> +		}
> +	}
> +
> +	config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> +
> +	vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
> +						VIRTIO_NET_LEGACY_CFG_BAR,
> +						features->align,
> +						PRIMARY_INTERFACE);
> +	if (!vnet->rc.cfg_base) {
> +		pr_debug("Failed to allocate virtio-net config memory\n");
> +		return -ENOMEM;
> +	}
> +
> +	epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
> +
> +	err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
> +			      config_bar);
> +	if (err) {
> +		pr_debug("Failed to set PCI BAR");
> +		goto err_free_space;
> +	}
> +
> +	return 0;
> +
> +err_free_space:
> +	pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> +			   PRIMARY_INTERFACE);
> +	return err;
> +}
> +
> +static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
> +					 u32 *rxpfn, u32 *ctlpfn)
> +{
> +	const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
> +	const u16 default_sel = nqueues;
> +	u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
> +	u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
> +	u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
> +	u32 pfn;
> +	u16 sel;
> +	struct {
> +		u32 pfn;
> +		u16 sel;
> +	} tmp[3] = {};
> +	int tmp_index = 0;
> +
> +	*rxpfn = *txpfn = *ctlpfn = 0;
> +
> +	/* To avoid missing the pfn and selector for a virtqueue written by the
> +	 * host driver, we need to implement fast polling with saving.
> +	 *
> +	 * This implementation assumes that the host driver writes the pfn only
> +	 * once for each queue.
> +	 */
> +	while (tmp_index < nqueues) {
> +		pfn = ioread32(queue_pfn);
> +		if (pfn == 0)
> +			continue;
> +
> +		iowrite32(0, queue_pfn);
> +
> +		sel = ioread16(queue_sel);
> +		if (sel == default_sel)
> +			continue;
> +
> +		tmp[tmp_index].pfn = pfn;
> +		tmp[tmp_index].sel = sel;
> +		tmp_index++;
> +	}
> +
> +	while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
> +		;
> +
> +	for (int i = 0; i < nqueues; ++i) {
> +		switch (tmp[i].sel) {
> +		case 0:
> +			*rxpfn = tmp[i].pfn;
> +			break;
> +		case 1:
> +			*txpfn = tmp[i].pfn;
> +			break;
> +		case 2:
> +			*ctlpfn = tmp[i].pfn;
> +			break;
> +		}
> +	}
> +
> +	if (!*rxpfn || !*txpfn || !*ctlpfn)
> +		return -EIO;
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_monitor_notify(void *data)
> +{
> +	struct epf_vnet *vnet = data;
> +	u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
> +	const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
> +
> +	epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
> +
> +	/* Poll to detect a change of the queue_notify register. Sometimes this
> +	 * polling misses a change, so check every virtqueue each time.
> +	 */
> +	while (true) {
> +		while (ioread16(queue_notify) == notify_default)
> +			;
> +		iowrite16(notify_default, queue_notify);
> +
> +		queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
> +		queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
> +	}
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
> +{
> +	vnet->rc.notify_monitor_task =
> +		kthread_create(epf_vnet_rc_monitor_notify, vnet,
> +			       "pci-epf-vnet/cfg_negotiator");
> +	if (IS_ERR(vnet->rc.notify_monitor_task))
> +		return PTR_ERR(vnet->rc.notify_monitor_task);
> +
> +	/* Change the thread priority to high for polling. */
> +	sched_set_fifo(vnet->rc.notify_monitor_task);
> +	wake_up_process(vnet->rc.notify_monitor_task);
> +
> +	return 0;
> +}
> +
> +static int epf_vnet_rc_device_setup(void *data)
> +{
> +	struct epf_vnet *vnet = data;
> +	struct pci_epf *epf = vnet->epf;
> +	u32 txpfn, rxpfn, ctlpfn;
> +	const size_t vq_size = epf_vnet_get_vq_size();
> +	int err;
> +
> +	err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
> +	if (err) {
> +		pr_debug("Failed to negotiate configs with driver\n");
> +		return err;
> +	}
> +
> +	/* The polling phase is finished. This thread goes back to normal priority. */
> +	sched_set_normal(vnet->rc.device_setup_task, 19);
> +
> +	vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> +						     txpfn, vq_size);
> +	if (IS_ERR(vnet->rc.txvrh)) {
> +		pr_debug("Failed to setup virtqueue for tx\n");
> +		return PTR_ERR(vnet->rc.txvrh);
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
> +	if (err)
> +		goto err_free_epf_tx_vringh;
> +
> +	vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> +						     rxpfn, vq_size);
> +	if (IS_ERR(vnet->rc.rxvrh)) {
> +		pr_debug("Failed to setup virtqueue for rx\n");
> +		err = PTR_ERR(vnet->rc.rxvrh);
> +		goto err_deinit_tx_kiov;
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
> +	if (err)
> +		goto err_free_epf_rx_vringh;
> +
> +	vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
> +		epf, vnet->virtio_features, ctlpfn, vq_size);
> +	if (IS_ERR(vnet->rc.ctlvrh)) {
> +		pr_err("failed to setup virtqueue\n");
> +		err = PTR_ERR(vnet->rc.ctlvrh);
> +		goto err_deinit_rx_kiov;
> +	}
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
> +	if (err)
> +		goto err_free_epf_ctl_vringh;
> +
> +	err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
> +	if (err)
> +		goto err_deinit_ctl_riov;
> +
> +	err = epf_vnet_rc_spawn_notify_monitor(vnet);
> +	if (err) {
> +		pr_debug("Failed to create notify monitor thread\n");
> +		goto err_deinit_ctl_wiov;
> +	}
> +
> +	return 0;
> +
> +err_deinit_ctl_wiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
> +err_deinit_ctl_riov:
> +	epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
> +err_free_epf_ctl_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
> +err_deinit_rx_kiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
> +err_free_epf_rx_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
> +err_deinit_tx_kiov:
> +	epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
> +err_free_epf_tx_vringh:
> +	pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
> +
> +	return err;
> +}
> +
> +static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
> +{
> +	vnet->rc.device_setup_task = kthread_create(
> +		epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
> +	if (IS_ERR(vnet->rc.device_setup_task))
> +		return PTR_ERR(vnet->rc.device_setup_task);
> +
> +	/* Change the thread priority to high for the polling. */
> +	sched_set_fifo(vnet->rc.device_setup_task);
> +	wake_up_process(vnet->rc.device_setup_task);
> +
> +	return 0;
> +}
> +
> +static void epf_vnet_rc_tx_handler(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
> +	struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
> +	struct vringh *rx_vrh = &vnet->ep.rxvrh;
> +	struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
> +	struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
> +
> +	while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
> +				 DMA_DEV_TO_MEM) > 0)
> +		;
> +}
> +
> +static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet =
> +		container_of(work, struct epf_vnet, rc.raise_irq_work);
> +	struct pci_epf *epf = vnet->epf;
> +
> +	pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
> +			  PCI_EPC_IRQ_LEGACY, 0);
> +}
> +
> +struct epf_vnet_rc_meminfo {
> +	void __iomem *addr, *virt;
> +	phys_addr_t phys;
> +	size_t len;
> +};
> +
> +/* Util function to access PCIe host side memory from local CPU.  */
> +static struct epf_vnet_rc_meminfo *
> +epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
> +{
> +	int err;
> +	phys_addr_t aaddr, phys_addr;
> +	size_t asize, offset;
> +	void __iomem *virt_addr;
> +	struct epf_vnet_rc_meminfo *meminfo;
> +
> +	err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
> +	if (err) {
> +		pr_debug("Failed to get EPC align: %d\n", err);
> +		return NULL;
> +	}
> +
> +	offset = pci_addr - aaddr;
> +
> +	virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
> +	if (!virt_addr) {
> +		pr_debug("Failed to allocate epc memory\n");
> +		return NULL;
> +	}
> +
> +	err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
> +			       aaddr, asize);
> +	if (err) {
> +		pr_debug("Failed to map epc memory\n");
> +		goto err_epc_free_addr;
> +	}
> +
> +	meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
> +	if (!meminfo)
> +		goto err_epc_unmap_addr;
> +
> +	meminfo->virt = virt_addr;
> +	meminfo->phys = phys_addr;
> +	meminfo->len = len;
> +	meminfo->addr = virt_addr + offset;
> +
> +	return meminfo;
> +
> +err_epc_unmap_addr:
> +	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr);
> +err_epc_free_addr:
> +	pci_epc_mem_free_addr(epf->epc, phys_addr, virt_addr, asize);
> +
> +	return NULL;
> +}
> +
> +static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
> +				   struct epf_vnet_rc_meminfo *meminfo)
> +{
> +	pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> +			   meminfo->phys);
> +	pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> +			      meminfo->len);
> +	kfree(meminfo);
> +}
> +
> +static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
> +{
> +	struct vringh_kiov *riov = &vnet->rc.ctl_riov;
> +	struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
> +	struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
> +	struct pci_epf *epf = vnet->epf;
> +	struct epf_vnet_rc_meminfo *rmem, *wmem;
> +	struct virtio_net_ctrl_hdr *hdr;
> +	int err;
> +	u16 head;
> +	size_t total_len;
> +	u8 class, cmd;
> +
> +	err = vringh_getdesc(vrh, riov, wiov, &head);
> +	if (err <= 0)
> +		return err;
> +
> +	total_len = vringh_kiov_length(riov);
> +
> +	rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
> +				    riov->iov[riov->i].iov_len);
> +	if (!rmem) {
> +		err = -ENOMEM;
> +		goto err_abandon_descs;
> +	}
> +
> +	wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
> +				    wiov->iov[wiov->i].iov_len);
> +	if (!wmem) {
> +		err = -ENOMEM;
> +		goto err_epc_unmap_rmem;
> +	}
> +
> +	hdr = rmem->addr;
> +	class = ioread8(&hdr->class);
> +	cmd = ioread8(&hdr->cmd);
> +	switch (ioread8(&hdr->class)) {
> +	case VIRTIO_NET_CTRL_ANNOUNCE:
> +		if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> +			pr_err("Found invalid command: announce: %d\n", cmd);
> +			break;
> +		}
> +		epf_vnet_rc_clear_config16(
> +			vnet,
> +			VIRTIO_PCI_CONFIG_OFF(false) +
> +				offsetof(struct virtio_net_config, status),
> +			VIRTIO_NET_S_ANNOUNCE);
> +		epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
> +					   VIRTIO_PCI_ISR_CONFIG);
> +
> +		iowrite8(VIRTIO_NET_OK, wmem->addr);
> +		break;
> +	default:
> +		pr_err("Found unsupported class in control queue: %d\n", class);
> +		break;
> +	}
> +
> +	epf_vnet_rc_epc_munmap(epf, rmem);
> +	epf_vnet_rc_epc_munmap(epf, wmem);
> +	vringh_complete(vrh, head, total_len);
> +
> +	return 1;
> +
> +err_epc_unmap_rmem:
> +	epf_vnet_rc_epc_munmap(epf, rmem);
> +err_abandon_descs:
> +	vringh_abandon(vrh, head);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
> +{
> +	struct epf_vnet *vnet =
> +		container_of(work, struct epf_vnet, rc.ctl_work);
> +
> +	while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
> +		;
> +}
> +
> +void epf_vnet_rc_notify(struct epf_vnet *vnet)
> +{
> +	queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> +}
> +
> +void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
> +{
> +	epf_vnet_cleanup_bar(vnet);
> +	destroy_workqueue(vnet->rc.tx_wq);
> +	destroy_workqueue(vnet->rc.irq_wq);
> +	destroy_workqueue(vnet->rc.ctl_wq);
> +
> +	kthread_stop(vnet->rc.device_setup_task);
> +}
> +
> +int epf_vnet_rc_setup(struct epf_vnet *vnet)
> +{
> +	int err;
> +	struct pci_epf *epf = vnet->epf;
> +
> +	err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
> +				   &epf_vnet_pci_header);
> +	if (err)
> +		return err;
> +
> +	err = epf_vnet_setup_bar(vnet);
> +	if (err)
> +		return err;
> +
> +	vnet->rc.tx_wq =
> +		alloc_workqueue("pci-epf-vnet/tx-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.tx_wq) {
> +		pr_debug(
> +			"Failed to allocate workqueue for rc -> ep transmission\n");
> +		err = -ENOMEM;
> +		goto err_cleanup_bar;
> +	}
> +
> +	INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
> +
> +	vnet->rc.irq_wq =
> +		alloc_workqueue("pci-epf-vnet/irq-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.irq_wq) {
> +		pr_debug("Failed to allocate workqueue for irq\n");
> +		err = -ENOMEM;
> +		goto err_destory_tx_wq;
> +	}
> +
> +	INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
> +
> +	vnet->rc.ctl_wq =
> +		alloc_workqueue("pci-epf-vnet/ctl-wq",
> +				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> +	if (!vnet->rc.ctl_wq) {
> +		pr_err("Failed to allocate work queue for control queue processing\n");
> +		err = -ENOMEM;
> +		goto err_destory_irq_wq;
> +	}
> +
> +	INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
> +
> +	err = epf_vnet_rc_spawn_device_setup_task(vnet);
> +	if (err)
> +		goto err_destory_ctl_wq;
> +
> +	return 0;
> +
> +err_cleanup_bar:
> +	epf_vnet_cleanup_bar(vnet);
> +err_destory_tx_wq:
> +	destroy_workqueue(vnet->rc.tx_wq);
> +err_destory_irq_wq:
> +	destroy_workqueue(vnet->rc.irq_wq);
> +err_destory_ctl_wq:
> +	destroy_workqueue(vnet->rc.ctl_wq);
> +
> +	return err;
> +}
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> new file mode 100644
> index 000000000000..e48ad8067796
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> @@ -0,0 +1,387 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCI Endpoint function driver to implement a virtio-net device.
> + */
> +#include <linux/module.h>
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epc.h>
> +#include <linux/vringh.h>
> +#include <linux/dmaengine.h>
> +
> +#include "pci-epf-vnet.h"
> +
> +static int virtio_queue_size = 0x100;
> +module_param(virtio_queue_size, int, 0444);
> +MODULE_PARM_DESC(virtio_queue_size, "Length of the virtqueues");
> +
> +int epf_vnet_get_vq_size(void)
> +{
> +	return virtio_queue_size;
> +}
> +
> +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
> +{
> +	struct kvec *kvec;
> +
> +	kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
> +	if (!kvec)
> +		return -ENOMEM;
> +
> +	vringh_kiov_init(kiov, kvec, vq_size);
> +
> +	return 0;
> +}
> +
> +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
> +{
> +	kfree(kiov->iov);
> +}
> +
> +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
> +{
> +	vnet->init_complete |= from;
> +
> +	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
> +		return;
> +
> +	if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
> +		return;
> +
> +	epf_vnet_ep_announce_linkup(vnet);
> +	epf_vnet_rc_announce_linkup(vnet);
> +}
> +
> +struct epf_dma_filter_param {
> +	struct device *dev;
> +	u32 dma_mask;
> +};
> +
> +static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
> +{
> +	struct epf_dma_filter_param *fparam = param;
> +	struct dma_slave_caps caps;
> +
> +	memset(&caps, 0, sizeof(caps));
> +	dma_get_slave_caps(chan, &caps);
> +
> +	return chan->device->dev == fparam->dev &&
> +	       (fparam->dma_mask & caps.directions);
> +}
> +
> +static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
> +{
> +	struct epf_dma_filter_param param;
> +	dma_cap_mask_t mask;
> +	int err;
> +
> +	dma_cap_zero(mask);
> +	dma_cap_set(DMA_SLAVE, mask);
> +
> +	param.dev = dma_dev;
> +	param.dma_mask = BIT(DMA_MEM_TO_DEV);
> +	vnet->lr_dma_chan =
> +		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> +	if (!vnet->lr_dma_chan)
> +		return -EOPNOTSUPP;
> +
> +	param.dma_mask = BIT(DMA_DEV_TO_MEM);
> +	vnet->rl_dma_chan =
> +		dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> +	if (!vnet->rl_dma_chan) {
> +		err = -EOPNOTSUPP;
> +		goto err_release_channel;
> +	}
> +
> +	return 0;
> +
> +err_release_channel:
> +	dma_release_channel(vnet->lr_dma_chan);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
> +{
> +	dma_release_channel(vnet->lr_dma_chan);
> +	dma_release_channel(vnet->rl_dma_chan);
> +}
> +
> +static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
> +			       dma_addr_t dma, size_t len,
> +			       void (*callback)(void *), void *param,
> +			       enum dma_transfer_direction dir)
> +{
> +	struct dma_async_tx_descriptor *desc;
> +	int err;
> +	struct dma_chan *chan;
> +	struct dma_slave_config sconf;
> +	dma_cookie_t cookie;
> +	unsigned long flags = 0;
> +
> +	if (dir == DMA_MEM_TO_DEV) {
> +		sconf.dst_addr = pci;
> +		chan = vnet->lr_dma_chan;
> +	} else {
> +		sconf.src_addr = pci;
> +		chan = vnet->rl_dma_chan;
> +	}
> +
> +	err = dmaengine_slave_config(chan, &sconf);
> +	if (unlikely(err))
> +		return err;
> +
> +	if (callback)
> +		flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
> +
> +	desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
> +	if (unlikely(!desc))
> +		return -EIO;
> +
> +	desc->callback = callback;
> +	desc->callback_param = param;
> +
> +	cookie = dmaengine_submit(desc);
> +	err = dma_submit_error(cookie);
> +	if (unlikely(err))
> +		return err;
> +
> +	dma_async_issue_pending(chan);
> +
> +	return 0;
> +}
> +
> +struct epf_vnet_dma_callback_param {
> +	struct epf_vnet *vnet;
> +	struct vringh *tx_vrh, *rx_vrh;
> +	struct virtqueue *vq;
> +	size_t total_len;
> +	u16 tx_head, rx_head;
> +};
> +
> +static void epf_vnet_dma_callback(void *p)
> +{
> +	struct epf_vnet_dma_callback_param *param = p;
> +	struct epf_vnet *vnet = param->vnet;
> +
> +	vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
> +	vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
> +
> +	epf_vnet_rc_notify(vnet);
> +	epf_vnet_ep_notify(vnet, param->vq);
> +
> +	kfree(param);
> +}
> +
> +/**
> + * epf_vnet_transfer() - transfer data from a tx vring to an rx vring using eDMA
> + * @vnet: epf virtio net device to do DMA
> + * @tx_vrh: vringh related to the source tx vring
> + * @rx_vrh: vringh related to the target rx vring
> + * @tx_iov: buffer to use for tx
> + * @rx_iov: buffer to use for rx
> + * @dir: direction of the DMA, local to remote or local from remote
> + *
> + * This function returns 0, 1 or a negative error number. 0 indicates there is
> + * no data to send. 1 indicates a DMA request was issued successfully. Other
> + * values indicate an error; however, -ENOSPC means there is no buffer on the
> + * target vring, so the caller should retry later.
> + */
> +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> +		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> +		      struct vringh_kiov *rx_iov,
> +		      enum dma_transfer_direction dir)
> +{
> +	int err;
> +	u16 tx_head, rx_head;
> +	size_t total_tx_len;
> +	struct epf_vnet_dma_callback_param *cb_param;
> +	struct vringh_kiov *liov, *riov;
> +
> +	err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
> +	if (err <= 0)
> +		return err;
> +
> +	total_tx_len = vringh_kiov_length(tx_iov);
> +
> +	err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
> +	if (err < 0) {
> +		goto err_tx_complete;
> +	} else if (!err) {
> +		/* There is no space on the destination vring to transmit data, so
> +		 * roll back the tx vringh
> +		 */
> +		vringh_abandon(tx_vrh, tx_head);
> +		return -ENOSPC;
> +	}
> +
> +	cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
> +	if (!cb_param) {
> +		err = -ENOMEM;
> +		goto err_rx_complete;
> +	}
> +
> +	cb_param->tx_vrh = tx_vrh;
> +	cb_param->rx_vrh = rx_vrh;
> +	cb_param->tx_head = tx_head;
> +	cb_param->rx_head = rx_head;
> +	cb_param->total_len = total_tx_len;
> +	cb_param->vnet = vnet;
> +
> +	switch (dir) {
> +	case DMA_MEM_TO_DEV:
> +		liov = tx_iov;
> +		riov = rx_iov;
> +		cb_param->vq = vnet->ep.txvq;
> +		break;
> +	case DMA_DEV_TO_MEM:
> +		liov = rx_iov;
> +		riov = tx_iov;
> +		cb_param->vq = vnet->ep.rxvq;
> +		break;
> +	default:
> +		err = -EINVAL;
> +		goto err_free_param;
> +	}
> +
> +	for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
> +		size_t len;
> +		u64 lbase, rbase;
> +		void (*callback)(void *) = NULL;
> +
> +		lbase = (u64)liov->iov[liov->i].iov_base;
> +		rbase = (u64)riov->iov[riov->i].iov_base;
> +		len = tx_iov->iov[tx_iov->i].iov_len;
> +
> +		if (tx_iov->i + 1 == tx_iov->used)
> +			callback = epf_vnet_dma_callback;
> +
> +		err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
> +					  cb_param, dir);
> +		if (err)
> +			goto err_free_param;
> +	}
> +
> +	return 1;
> +
> +err_free_param:
> +	kfree(cb_param);
> +err_rx_complete:
> +	vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
> +err_tx_complete:
> +	vringh_complete(tx_vrh, tx_head, total_tx_len);
> +
> +	return err;
> +}
> +
> +static int epf_vnet_bind(struct pci_epf *epf)
> +{
> +	int err;
> +	struct epf_vnet *vnet = epf_get_drvdata(epf);
> +
> +	err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
> +	if (err)
> +		return err;
> +
> +	err = epf_vnet_rc_setup(vnet);
> +	if (err)
> +		goto err_free_edma;
> +
> +	err = epf_vnet_ep_setup(vnet);
> +	if (err)
> +		goto err_cleanup_rc;
> +
> +	return 0;
> +
> +err_cleanup_rc:
> +	epf_vnet_rc_cleanup(vnet);
> +err_free_edma:
> +	epf_vnet_deinit_edma(vnet);
> +
> +	return err;
> +}
> +
> +static void epf_vnet_unbind(struct pci_epf *epf)
> +{
> +	struct epf_vnet *vnet = epf_get_drvdata(epf);
> +
> +	epf_vnet_deinit_edma(vnet);
> +	epf_vnet_rc_cleanup(vnet);
> +	epf_vnet_ep_cleanup(vnet);
> +}
> +
> +static struct pci_epf_ops epf_vnet_ops = {
> +	.bind = epf_vnet_bind,
> +	.unbind = epf_vnet_unbind,
> +};
> +
> +static const struct pci_epf_device_id epf_vnet_ids[] = {
> +	{ .name = "pci_epf_vnet" },
> +	{}
> +};
> +
> +static void epf_vnet_virtio_init(struct epf_vnet *vnet)
> +{
> +	vnet->virtio_features =
> +		BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
> +		/* The following features allow skipping checksum and offload handling, as
> +		 * in a transmission between virtual machines on the same system. Details
> +		 * are in section 5.1.5 of the virtio specification.
> +		 */
> +		BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
> +		BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
> +		BIT(VIRTIO_NET_F_GUEST_UFO) |
> +		// The control queue is just used for linkup announcement.
> +		BIT(VIRTIO_NET_F_CTRL_VQ);
> +
> +	vnet->vnet_cfg.max_virtqueue_pairs = 1;
> +	vnet->vnet_cfg.status = 0;
> +	vnet->vnet_cfg.mtu = PAGE_SIZE;
> +}
> +
> +static int epf_vnet_probe(struct pci_epf *epf)
> +{
> +	struct epf_vnet *vnet;
> +
> +	vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
> +	if (!vnet)
> +		return -ENOMEM;
> +
> +	epf_set_drvdata(epf, vnet);
> +	vnet->epf = epf;
> +
> +	epf_vnet_virtio_init(vnet);
> +
> +	return 0;
> +}
> +
> +static struct pci_epf_driver epf_vnet_drv = {
> +	.driver.name = "pci_epf_vnet",
> +	.ops = &epf_vnet_ops,
> +	.id_table = epf_vnet_ids,
> +	.probe = epf_vnet_probe,
> +	.owner = THIS_MODULE,
> +};
> +
> +static int __init epf_vnet_init(void)
> +{
> +	int err;
> +
> +	err = pci_epf_register_driver(&epf_vnet_drv);
> +	if (err) {
> +		pr_err("Failed to register epf vnet driver\n");
> +		return err;
> +	}
> +
> +	return 0;
> +}
> +module_init(epf_vnet_init);
> +
> +static void epf_vnet_exit(void)
> +{
> +	pci_epf_unregister_driver(&epf_vnet_drv);
> +}
> +module_exit(epf_vnet_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> +MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> new file mode 100644
> index 000000000000..1e0f90c95578
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _PCI_EPF_VNET_H
> +#define _PCI_EPF_VNET_H
> +
> +#include <linux/pci-epf.h>
> +#include <linux/pci-epf-virtio.h>
> +#include <linux/virtio_net.h>
> +#include <linux/dmaengine.h>
> +#include <linux/virtio.h>
> +
> +struct epf_vnet {
> +	//TODO Should this variable be placed here?
> +	struct pci_epf *epf;
> +	struct virtio_net_config vnet_cfg;
> +	u64 virtio_features;
> +
> +	// dma channels for local to remote(lr) and remote to local(rl)
> +	struct dma_chan *lr_dma_chan, *rl_dma_chan;
> +
> +	struct {
> +		void __iomem *cfg_base;
> +		struct task_struct *device_setup_task;
> +		struct task_struct *notify_monitor_task;
> +		struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
> +		struct work_struct tx_work, raise_irq_work, ctl_work;
> +		struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
> +		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> +	} rc;
> +
> +	struct {
> +		struct virtqueue *rxvq, *txvq, *ctlvq;
> +		struct vringh txvrh, rxvrh, ctlvrh;
> +		struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> +		struct virtio_device vdev;
> +		u16 net_config_status;
> +	} ep;
> +
> +#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
> +#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
> +	u8 init_complete;
> +};
> +
> +int epf_vnet_rc_setup(struct epf_vnet *vnet);
> +void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
> +int epf_vnet_ep_setup(struct epf_vnet *vnet);
> +void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
> +
> +int epf_vnet_get_vq_size(void);
> +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
> +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
> +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> +		      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> +		      struct vringh_kiov *rx_iov,
> +		      enum dma_transfer_direction dir);
> +void epf_vnet_rc_notify(struct epf_vnet *vnet);
> +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
> +
> +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
> +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
> +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
> +
> +#endif // _PCI_EPF_VNET_H
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:04   ` Shunsuke Mie
  (?)
  (?)
@ 2023-02-03 11:58   ` kernel test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2023-02-03 11:58 UTC (permalink / raw)
  To: Shunsuke Mie; +Cc: oe-kbuild-all

Hi Shunsuke,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on helgaas-pci/next]
[also build test WARNING on helgaas-pci/for-linus linus/master v6.2-rc6 next-20230203]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shunsuke-Mie/virtio_pci-add-a-definition-of-queue-flag-in-ISR/20230203-180631
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
patch link:    https://lore.kernel.org/r/20230203100418.2981144-5-mie%40igel.co.jp
patch subject: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
config: sparc-allyesconfig (https://download.01.org/0day-ci/archive/20230203/202302031952.nh5R4Ocj-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/a76cd5956970d04bf5e4e72da87cfdaa4da86164
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Shunsuke-Mie/virtio_pci-add-a-definition-of-queue-flag-in-ISR/20230203-180631
        git checkout a76cd5956970d04bf5e4e72da87cfdaa4da86164
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc SHELL=/bin/bash drivers/pci/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c: In function 'epf_vnet_ep_process_ctrlq_entry':
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:55:15: error: implicit declaration of function 'vringh_getdesc'; did you mean 'vringh_getdesc_kern'? [-Werror=implicit-function-declaration]
      55 |         err = vringh_getdesc(vrh, riov, wiov, &head);
         |               ^~~~~~~~~~~~~~
         |               vringh_getdesc_kern
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:91:9: error: implicit declaration of function 'vringh_complete'; did you mean 'vringh_complete_kern'? [-Werror=implicit-function-declaration]
      91 |         vringh_complete(vrh, head, len);
         |         ^~~~~~~~~~~~~~~
         |         vringh_complete_kern
   In file included from include/linux/cpumask.h:15,
                    from include/linux/smp.h:13,
                    from include/linux/lockdep.h:14,
                    from include/linux/spinlock.h:63,
                    from include/linux/kref.h:16,
                    from include/linux/configfs.h:25,
                    from include/linux/pci-epf.h:12,
                    from include/linux/pci-epc.h:12,
                    from drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:5:
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c: In function 'epf_vnet_ep_vdev_find_vqs':
>> include/linux/gfp_types.h:333:25: warning: passing argument 5 of 'vringh_init_kern' makes pointer from integer without a cast [-Wint-conversion]
     333 | #define GFP_KERNEL      (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |                         |
         |                         unsigned int
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:255:46: note: in expansion of macro 'GFP_KERNEL'
     255 |                                        true, GFP_KERNEL, vring->desc,
         |                                              ^~~~~~~~~~
   In file included from include/linux/pci-epf-virtio.h:9,
                    from drivers/pci/endpoint/functions/pci-epf-vnet.h:6,
                    from drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:10:
   include/linux/vringh.h:174:41: note: expected 'struct vring_desc *' but argument is of type 'unsigned int'
     174 |                      struct vring_desc *desc,
         |                      ~~~~~~~~~~~~~~~~~~~^~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:255:63: error: passing argument 6 of 'vringh_init_kern' from incompatible pointer type [-Werror=incompatible-pointer-types]
     255 |                                        true, GFP_KERNEL, vring->desc,
         |                                                          ~~~~~^~~~~~
         |                                                               |
         |                                                               vring_desc_t * {aka struct vring_desc *}
   include/linux/vringh.h:175:42: note: expected 'struct vring_avail *' but argument is of type 'vring_desc_t *' {aka 'struct vring_desc *'}
     175 |                      struct vring_avail *avail,
         |                      ~~~~~~~~~~~~~~~~~~~~^~~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:256:45: error: passing argument 7 of 'vringh_init_kern' from incompatible pointer type [-Werror=incompatible-pointer-types]
     256 |                                        vring->avail, vring->used);
         |                                        ~~~~~^~~~~~~
         |                                             |
         |                                             vring_avail_t * {aka struct vring_avail *}
   include/linux/vringh.h:176:41: note: expected 'struct vring_used *' but argument is of type 'vring_avail_t *' {aka 'struct vring_avail *'}
     176 |                      struct vring_used *used);
         |                      ~~~~~~~~~~~~~~~~~~~^~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:254:23: error: too many arguments to function 'vringh_init_kern'
     254 |                 err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
         |                       ^~~~~~~~~~~~~~~~
   include/linux/vringh.h:172:5: note: declared here
     172 | int vringh_init_kern(struct vringh *vrh, u64 features,
         |     ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/vringh_init_kern +333 include/linux/gfp_types.h

cb5a065b4ea9c0 Ingo Molnar    2022-04-14  261  
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  262  /**
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  263   * DOC: Useful GFP flag combinations
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  264   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  265   * Useful GFP flag combinations
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  266   * ----------------------------
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  267   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  268   * Useful GFP flag combinations that are commonly used. It is recommended
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  269   * that subsystems start with one of these combinations and then set/clear
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  270   * %__GFP_FOO flags as necessary.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  271   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  272   * %GFP_ATOMIC users can not sleep and need the allocation to succeed. A lower
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  273   * watermark is applied to allow access to "atomic reserves".
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  274   * The current implementation doesn't support NMI and few other strict
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  275   * non-preemptive contexts (e.g. raw_spin_lock). The same applies to %GFP_NOWAIT.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  276   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  277   * %GFP_KERNEL is typical for kernel-internal allocations. The caller requires
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  278   * %ZONE_NORMAL or a lower zone for direct access but can direct reclaim.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  279   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  280   * %GFP_KERNEL_ACCOUNT is the same as GFP_KERNEL, except the allocation is
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  281   * accounted to kmemcg.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  282   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  283   * %GFP_NOWAIT is for kernel allocations that should not stall for direct
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  284   * reclaim, start physical IO or use any filesystem callback.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  285   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  286   * %GFP_NOIO will use direct reclaim to discard clean pages or slab pages
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  287   * that do not require the starting of any physical IO.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  288   * Please try to avoid using this flag directly and instead use
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  289   * memalloc_noio_{save,restore} to mark the whole scope which cannot
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  290   * perform any IO with a short explanation why. All allocation requests
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  291   * will inherit GFP_NOIO implicitly.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  292   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  293   * %GFP_NOFS will use direct reclaim but will not use any filesystem interfaces.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  294   * Please try to avoid using this flag directly and instead use
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  295   * memalloc_nofs_{save,restore} to mark the whole scope which cannot/shouldn't
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  296   * recurse into the FS layer with a short explanation why. All allocation
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  297   * requests will inherit GFP_NOFS implicitly.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  298   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  299   * %GFP_USER is for userspace allocations that also need to be directly
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  300   * accessibly by the kernel or hardware. It is typically used by hardware
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  301   * for buffers that are mapped to userspace (e.g. graphics) that hardware
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  302   * still must DMA to. cpuset limits are enforced for these allocations.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  303   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  304   * %GFP_DMA exists for historical reasons and should be avoided where possible.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  305   * The flags indicates that the caller requires that the lowest zone be
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  306   * used (%ZONE_DMA or 16M on x86-64). Ideally, this would be removed but
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  307   * it would require careful auditing as some users really require it and
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  308   * others use the flag to avoid lowmem reserves in %ZONE_DMA and treat the
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  309   * lowest zone as a type of emergency reserve.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  310   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  311   * %GFP_DMA32 is similar to %GFP_DMA except that the caller requires a 32-bit
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  312   * address. Note that kmalloc(..., GFP_DMA32) does not return DMA32 memory
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  313   * because the DMA32 kmalloc cache array is not implemented.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  314   * (Reason: there is no such user in kernel).
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  315   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  316   * %GFP_HIGHUSER is for userspace allocations that may be mapped to userspace,
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  317   * do not need to be directly accessible by the kernel but that cannot
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  318   * move once in use. An example may be a hardware allocation that maps
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  319   * data directly into userspace but has no addressing limitations.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  320   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  321   * %GFP_HIGHUSER_MOVABLE is for userspace allocations that the kernel does not
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  322   * need direct access to but can use kmap() when access is required. They
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  323   * are expected to be movable via page reclaim or page migration. Typically,
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  324   * pages on the LRU would also be allocated with %GFP_HIGHUSER_MOVABLE.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  325   *
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  326   * %GFP_TRANSHUGE and %GFP_TRANSHUGE_LIGHT are used for THP allocations. They
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  327   * are compound allocations that will generally fail quickly if memory is not
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  328   * available and will not wake kswapd/kcompactd on failure. The _LIGHT
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  329   * version does not attempt reclaim/compaction at all and is by default used
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  330   * in page fault path, while the non-light is used by khugepaged.
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  331   */
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  332  #define GFP_ATOMIC	(__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14 @333  #define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  334  #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  335  #define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  336  #define GFP_NOIO	(__GFP_RECLAIM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  337  #define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  338  #define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  339  #define GFP_DMA		__GFP_DMA
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  340  #define GFP_DMA32	__GFP_DMA32
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  341  #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  342  #define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE | \
4e23eeebb2e57f Linus Torvalds 2022-08-07  343  			 __GFP_SKIP_KASAN_POISON | __GFP_SKIP_KASAN_UNPOISON)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  344  #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  345  			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  346  #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
cb5a065b4ea9c0 Ingo Molnar    2022-04-14  347  
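As a quick illustration of the two combinations most relevant to the context above (a hypothetical sketch, not taken from the patch): GFP_KERNEL for ordinary sleepable allocations, GFP_ATOMIC where sleeping is not allowed.

    #include <linux/slab.h>
    #include <linux/gfp.h>

    /* Ordinary kernel-internal allocation in process context: may sleep. */
    static void *example_alloc_sleepable(size_t len)
    {
            return kmalloc(len, GFP_KERNEL);
    }

    /* Allocation from atomic context (e.g. an interrupt handler): must not sleep. */
    static void *example_alloc_atomic(size_t len)
    {
            return kmalloc(len, GFP_ATOMIC);
    }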

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-03 10:04 ` Shunsuke Mie
                   ` (4 preceding siblings ...)
  (?)
@ 2023-02-03 16:45 ` Frank Li
  2023-02-07 10:29     ` Shunsuke Mie
  -1 siblings, 1 reply; 50+ messages in thread
From: Frank Li @ 2023-02-03 16:45 UTC (permalink / raw)
  To: Shunsuke Mie, Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization



> -----Original Message-----
> From: Shunsuke Mie <mie@igel.co.jp>
> Sent: Friday, February 3, 2023 4:04 AM
> To: Lorenzo Pieralisi <lpieralisi@kernel.org>
> Cc: Krzysztof Wilczyński <kw@linux.com>; Manivannan Sadhasivam
> <mani@kernel.org>; Kishon Vijay Abraham I <kishon@kernel.org>; Bjorn
> Helgaas <bhelgaas@google.com>; Michael S. Tsirkin <mst@redhat.com>;
> Jason Wang <jasowang@redhat.com>; Shunsuke Mie <mie@igel.co.jp>;
> Frank Li <frank.li@nxp.com>; Jon Mason <jdmason@kudzu.us>; Ren Zhijie
> <renzhijie2@huawei.com>; Takanari Hayama <taki@igel.co.jp>; linux-
> kernel@vger.kernel.org; linux-pci@vger.kernel.org; virtualization@lists.linux-
> foundation.org
> Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> function
> 
> Caution: EXT Email
> 
> This patchset introduce a virtio-net EP device function. It provides a
> new option to communiate between PCIe host and endpoint over IP.
> Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> It is used to transport data between virtio ring directly each other. It
> can be expected to better throughput.

Thanks, that is basically what I want. I am trying to use RDMA,
but I think virtio-net would still be a good solution.

Frank Li 

> 
> To realize the function, this patchset has few changes and introduces a
> new APIs to PCI EP framework related to virtio. Furthermore, it device
> depends on the some patchtes that is discussing. Those depended patchset
> are following:
> - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> 
> About this patchset has 4 patches. The first of two patch is little changes
> to virtio. The third patch add APIs to easily access virtio data structure
> on PCIe Host side memory. The last one introduce a virtio-net EP device
> function. Details are in commit respectively.
> 
> Currently those network devices are testd using ping only. I'll add a
> result of performance evaluation using iperf and etc to the future version
> of this patchset.
> 
> Shunsuke Mie (4):
>   virtio_pci: add a definition of queue flag in ISR
>   virtio_ring: remove const from vring getter
>   PCI: endpoint: Introduce virtio library for EP functions
>   PCI: endpoint: function: Add EP function driver to provide virtio net
>     device
> 
>  drivers/pci/endpoint/Kconfig                  |   7 +
>  drivers/pci/endpoint/Makefile                 |   1 +
>  drivers/pci/endpoint/functions/Kconfig        |  12 +
>  drivers/pci/endpoint/functions/Makefile       |   1 +
>  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
>  drivers/virtio/virtio_ring.c                  |   2 +-
>  include/linux/pci-epf-virtio.h                |  25 +
>  include/linux/virtio.h                        |   2 +-
>  include/uapi/linux/virtio_pci.h               |   2 +
>  13 files changed, 1590 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
>  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>  create mode 100644 include/linux/pci-epf-virtio.h
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-03 10:04 ` Shunsuke Mie
                   ` (5 preceding siblings ...)
  (?)
@ 2023-02-03 21:48 ` Frank Li
  2023-02-07  1:43     ` Shunsuke Mie
  -1 siblings, 1 reply; 50+ messages in thread
From: Frank Li @ 2023-02-03 21:48 UTC (permalink / raw)
  To: Shunsuke Mie, Lorenzo Pieralisi
  Cc: Krzysztof Wilczyński, Manivannan Sadhasivam,
	Kishon Vijay Abraham I, Bjorn Helgaas, Michael S. Tsirkin,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization

> foundation.org
> Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> function
> 

The dependent EDMA patch can't be applied on the latest linux-next.
Can you provide a git link so I can try it directly?

Frank  

> 
> About this patchset has 4 patches. The first of two patch is little changes
> to virtio. The third patch add APIs to easily access virtio data structure
> on PCIe Host side memory. The last one introduce a virtio-net EP device
> function. Details are in commit respectively.
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:22     ` Michael S. Tsirkin
  (?)
@ 2023-02-03 22:15     ` Frank Li
  2023-02-07 10:56         ` Shunsuke Mie
  -1 siblings, 1 reply; 50+ messages in thread
From: Frank Li @ 2023-02-03 22:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Shunsuke Mie
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization

> 
> Caution: EXT Email
> 
> On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> > Add a new endpoint(EP) function driver to provide virtio-net device. This
> > function not only shows virtio-net device for PCIe host system, but also
> > provides virtio-net device to EP side(local) system. Virtualy those network
> > devices are connected, so we can use to communicate over IP like a simple
> > NIC.
> >
> > Architecture overview is following:
> >
> > to Host       |                       to Endpoint
> > network stack |                 network stack
> >       |       |                       |
> > +-----------+ |       +-----------+   +-----------+
> > |virtio-net | |       |virtio-net |   |virtio-net |
> > |driver     | |       |EP function|---|driver     |
> > +-----------+ |       +-----------+   +-----------+
> >       |       |             |
> > +-----------+ | +-----------+
> > |PCIeC      | | |PCIeC      |
> > |Rootcomplex|-|-|Endpoint   |
> > +-----------+ | +-----------+
> >   Host side   |          Endpoint side
> >
> > This driver uses PCIe EP framework to show virtio-net (pci) device Host
> > side, and generate virtual virtio-net device and register to EP side.
> > A communication date
> 
> data?
> 
> > is diractly
> 
> directly?
> 
> > transported between virtqueue level
> > with each other using PCIe embedded DMA controller.
> >
> > by a limitation of the hardware and Linux EP framework, this function
> > follows a virtio legacy specification.
> 
> what exactly is the limitation and why does it force legacy?
> 
> > This function driver has beed tested on S4 Rcar (r8a779fa-spider) board but
> > just use the PCIe EP framework and depends on the PCIe EDMA.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++

Actually, this is not related to vnet, just virtio.
I think pci-epf-virtio.c would be a better name.

> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++

This is an EPF driver, so the "rc" suffix is quite confusing.
Maybe you can combine pci-epf-vnet-ep.c and pci-epf-vnet-rc.c into one file.

> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++

This file sets up DMA transfers according to the virtio rings.
How about pci-epf-virtio-dma.c?

> > +
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR,
> VIRTIO_PCI_ISR_QUEUE);
> > +     /*
> > +      * Initialize the queue notify and selector to outside of the appropriate
> > +      * virtqueue index. It is used to detect change with polling. There is no
> > +      * other ways to detect host side driver updateing those values
> > +      */

I am trying to use gic-its or another MSI controller as a doorbell:
https://lore.kernel.org/imx/20221125192729.1722913-1-Frank.Li@nxp.com/T/#u

but it may need an update to the host-side PCI virtio driver.

> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY,
> default_qindex);
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL,
> default_qindex);
> > +     /* This pfn is also set to 0 for the polling as well */
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> > +
> --
> > 2.25.1
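
To make the polling scheme described in that comment concrete, here is a rough sketch under assumed names (epf_vnet_rc_get_config16() and example_handle_queue_kick() are hypothetical stand-ins, not the driver's actual helpers):

    #include "pci-epf-vnet.h"       /* struct epf_vnet from the patchset */

    /*
     * Hypothetical sketch: QUEUE_NOTIFY is parked at an out-of-range queue
     * index; a change away from that value means the host driver kicked a
     * queue, since there is no interrupt towards the EP side.
     */
    static void example_poll_notify(struct epf_vnet *vnet, u16 default_qindex)
    {
            u16 qindex = epf_vnet_rc_get_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY);

            if (qindex == default_qindex)
                    return;         /* no kick since the last poll */

            /* re-arm before handling so a new kick is not lost */
            epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
            example_handle_queue_kick(vnet, qindex);
    }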


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:04   ` Shunsuke Mie
                     ` (2 preceding siblings ...)
  (?)
@ 2023-02-04  1:05   ` kernel test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2023-02-04  1:05 UTC (permalink / raw)
  To: Shunsuke Mie; +Cc: oe-kbuild-all

Hi Shunsuke,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on helgaas-pci/next]
[also build test WARNING on helgaas-pci/for-linus linus/master v6.2-rc6 next-20230203]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shunsuke-Mie/virtio_pci-add-a-definition-of-queue-flag-in-ISR/20230203-180631
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
patch link:    https://lore.kernel.org/r/20230203100418.2981144-5-mie%40igel.co.jp
patch subject: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
config: arc-allyesconfig (https://download.01.org/0day-ci/archive/20230204/202302040802.0lalFaV7-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/a76cd5956970d04bf5e4e72da87cfdaa4da86164
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Shunsuke-Mie/virtio_pci-add-a-definition-of-queue-flag-in-ISR/20230203-180631
        git checkout a76cd5956970d04bf5e4e72da87cfdaa4da86164
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arc olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arc SHELL=/bin/bash drivers/pci/endpoint/functions/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/pci/endpoint/functions/pci-epf-vnet.c: In function 'epf_vnet_dma_callback':
   drivers/pci/endpoint/functions/pci-epf-vnet.c:166:9: error: implicit declaration of function 'vringh_complete'; did you mean 'vringh_complete_kern'? [-Werror=implicit-function-declaration]
     166 |         vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
         |         ^~~~~~~~~~~~~~~
         |         vringh_complete_kern
   drivers/pci/endpoint/functions/pci-epf-vnet.c: In function 'epf_vnet_transfer':
   drivers/pci/endpoint/functions/pci-epf-vnet.c:200:15: error: implicit declaration of function 'vringh_getdesc'; did you mean 'vringh_getdesc_kern'? [-Werror=implicit-function-declaration]
     200 |         err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
         |               ^~~~~~~~~~~~~~
         |               vringh_getdesc_kern
   drivers/pci/endpoint/functions/pci-epf-vnet.c:213:17: error: implicit declaration of function 'vringh_abandon'; did you mean 'vringh_abandon_kern'? [-Werror=implicit-function-declaration]
     213 |                 vringh_abandon(tx_vrh, tx_head);
         |                 ^~~~~~~~~~~~~~
         |                 vringh_abandon_kern
>> drivers/pci/endpoint/functions/pci-epf-vnet.c:251:25: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     251 |                 lbase = (u64)liov->iov[liov->i].iov_base;
         |                         ^
   drivers/pci/endpoint/functions/pci-epf-vnet.c:252:25: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     252 |                 rbase = (u64)riov->iov[riov->i].iov_base;
         |                         ^
   cc1: some warnings being treated as errors
--
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c: In function 'epf_vnet_rc_epc_mmap':
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:424:15: error: implicit declaration of function 'pci_epc_mem_align'; did you mean 'pci_epc_mem_exit'? [-Werror=implicit-function-declaration]
     424 |         err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
         |               ^~~~~~~~~~~~~~~~~
         |               pci_epc_mem_exit
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c: In function 'epf_vnet_rc_process_ctrlq_entry':
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:489:15: error: implicit declaration of function 'vringh_getdesc'; did you mean 'vringh_getdesc_kern'? [-Werror=implicit-function-declaration]
     489 |         err = vringh_getdesc(vrh, riov, wiov, &head);
         |               ^~~~~~~~~~~~~~
         |               vringh_getdesc_kern
>> drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:495:42: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     495 |         rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
         |                                          ^
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:502:42: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     502 |         wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
         |                                          ^
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:535:9: error: implicit declaration of function 'vringh_complete'; did you mean 'vringh_complete_kern'? [-Werror=implicit-function-declaration]
     535 |         vringh_complete(vrh, head, total_len);
         |         ^~~~~~~~~~~~~~~
         |         vringh_complete_kern
   drivers/pci/endpoint/functions/pci-epf-vnet-rc.c:542:9: error: implicit declaration of function 'vringh_abandon'; did you mean 'vringh_abandon_kern'? [-Werror=implicit-function-declaration]
     542 |         vringh_abandon(vrh, head);
         |         ^~~~~~~~~~~~~~
         |         vringh_abandon_kern
   cc1: some warnings being treated as errors
--
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c: In function 'epf_vnet_ep_process_ctrlq_entry':
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:55:15: error: implicit declaration of function 'vringh_getdesc'; did you mean 'vringh_getdesc_kern'? [-Werror=implicit-function-declaration]
      55 |         err = vringh_getdesc(vrh, riov, wiov, &head);
         |               ^~~~~~~~~~~~~~
         |               vringh_getdesc_kern
   In file included from include/asm-generic/bug.h:22,
                    from arch/arc/include/asm/bug.h:30,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:13,
                    from include/asm-generic/preempt.h:5,
                    from ./arch/arc/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:78,
                    from include/linux/spinlock.h:56,
                    from include/linux/kref.h:16,
                    from include/linux/configfs.h:25,
                    from include/linux/pci-epf.h:12,
                    from include/linux/pci-epc.h:12,
                    from drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:5:
>> drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:61:26: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
      61 |                 pr_debug("Command is too short: %ld\n", len);
         |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/printk.h:347:21: note: in definition of macro 'pr_fmt'
     347 | #define pr_fmt(fmt) fmt
         |                     ^~~
   include/linux/dynamic_debug.h:247:9: note: in expansion of macro '__dynamic_func_call_cls'
     247 |         __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, ##__VA_ARGS__)
         |         ^~~~~~~~~~~~~~~~~~~~~~~
   include/linux/dynamic_debug.h:249:9: note: in expansion of macro '_dynamic_func_call_cls'
     249 |         _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__)
         |         ^~~~~~~~~~~~~~~~~~~~~~
   include/linux/dynamic_debug.h:268:9: note: in expansion of macro '_dynamic_func_call'
     268 |         _dynamic_func_call(fmt, __dynamic_pr_debug,             \
         |         ^~~~~~~~~~~~~~~~~~
   include/linux/printk.h:581:9: note: in expansion of macro 'dynamic_pr_debug'
     581 |         dynamic_pr_debug(fmt, ##__VA_ARGS__)
         |         ^~~~~~~~~~~~~~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:61:17: note: in expansion of macro 'pr_debug'
      61 |                 pr_debug("Command is too short: %ld\n", len);
         |                 ^~~~~~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:61:51: note: format string is defined here
      61 |                 pr_debug("Command is too short: %ld\n", len);
         |                                                 ~~^
         |                                                   |
         |                                                   long int
         |                                                 %d
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:91:9: error: implicit declaration of function 'vringh_complete'; did you mean 'vringh_complete_kern'? [-Werror=implicit-function-declaration]
      91 |         vringh_complete(vrh, head, len);
         |         ^~~~~~~~~~~~~~~
         |         vringh_complete_kern
   In file included from include/linux/cpumask.h:15,
                    from include/linux/smp.h:13,
                    from include/linux/lockdep.h:14,
                    from include/linux/spinlock.h:63:
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c: In function 'epf_vnet_ep_vdev_find_vqs':
   include/linux/gfp_types.h:333:25: warning: passing argument 5 of 'vringh_init_kern' makes pointer from integer without a cast [-Wint-conversion]
     333 | #define GFP_KERNEL      (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |                         |
         |                         unsigned int
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:255:46: note: in expansion of macro 'GFP_KERNEL'
     255 |                                        true, GFP_KERNEL, vring->desc,
         |                                              ^~~~~~~~~~
   In file included from include/linux/pci-epf-virtio.h:9,
                    from drivers/pci/endpoint/functions/pci-epf-vnet.h:6,
                    from drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:10:
   include/linux/vringh.h:174:41: note: expected 'struct vring_desc *' but argument is of type 'unsigned int'
     174 |                      struct vring_desc *desc,
         |                      ~~~~~~~~~~~~~~~~~~~^~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:255:63: error: passing argument 6 of 'vringh_init_kern' from incompatible pointer type [-Werror=incompatible-pointer-types]
     255 |                                        true, GFP_KERNEL, vring->desc,
         |                                                          ~~~~~^~~~~~
         |                                                               |
         |                                                               vring_desc_t * {aka struct vring_desc *}
   include/linux/vringh.h:175:42: note: expected 'struct vring_avail *' but argument is of type 'vring_desc_t *' {aka 'struct vring_desc *'}
     175 |                      struct vring_avail *avail,
         |                      ~~~~~~~~~~~~~~~~~~~~^~~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:256:45: error: passing argument 7 of 'vringh_init_kern' from incompatible pointer type [-Werror=incompatible-pointer-types]
     256 |                                        vring->avail, vring->used);
         |                                        ~~~~~^~~~~~~
         |                                             |
         |                                             vring_avail_t * {aka struct vring_avail *}
   include/linux/vringh.h:176:41: note: expected 'struct vring_used *' but argument is of type 'vring_avail_t *' {aka 'struct vring_avail *'}
     176 |                      struct vring_used *used);
         |                      ~~~~~~~~~~~~~~~~~~~^~~~
   drivers/pci/endpoint/functions/pci-epf-vnet-ep.c:254:23: error: too many arguments to function 'vringh_init_kern'
     254 |                 err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
         |                       ^~~~~~~~~~~~~~~~
   include/linux/vringh.h:172:5: note: declared here
     172 | int vringh_init_kern(struct vringh *vrh, u64 features,
         |     ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +251 drivers/pci/endpoint/functions/pci-epf-vnet.c

   174	
   175	/**
   176	 * epf_vnet_transfer() - transfer data between tx vring to rx vring using edma
   177	 * @vnet: epf virtio net device to do dma
   178	 * @tx_vrh: vringh related to source tx vring
   179	 * @rx_vrh: vringh related to target rx vring
   180	 * @tx_iov: buffer to use tx
   181	 * @rx_iov: buffer to use rx
   182	 * @dir: a direction of DMA. local to remote or local from remote
   183	 *
   184	 * This function returns 0, 1 or error number. The 0 indicates there is not
   185	 * data to send. The 1 indicates a request to DMA is succeeded. Other error
   186	 * numbers shows error, however, ENOSPC means there is no buffer on target
   187	 * vring, so should retry to call later.
   188	 */
   189	int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
   190			      struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
   191			      struct vringh_kiov *rx_iov,
   192			      enum dma_transfer_direction dir)
   193	{
   194		int err;
   195		u16 tx_head, rx_head;
   196		size_t total_tx_len;
   197		struct epf_vnet_dma_callback_param *cb_param;
   198		struct vringh_kiov *liov, *riov;
   199	
   200		err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
   201		if (err <= 0)
   202			return err;
   203	
   204		total_tx_len = vringh_kiov_length(tx_iov);
   205	
   206		err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
   207		if (err < 0) {
   208			goto err_tx_complete;
   209		} else if (!err) {
   210			/* There is not space on a vring of destination to transmit data, so
   211			 * rollback tx vringh
   212			 */
   213			vringh_abandon(tx_vrh, tx_head);
   214			return -ENOSPC;
   215		}
   216	
   217		cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
   218		if (!cb_param) {
   219			err = -ENOMEM;
   220			goto err_rx_complete;
   221		}
   222	
   223		cb_param->tx_vrh = tx_vrh;
   224		cb_param->rx_vrh = rx_vrh;
   225		cb_param->tx_head = tx_head;
   226		cb_param->rx_head = rx_head;
   227		cb_param->total_len = total_tx_len;
   228		cb_param->vnet = vnet;
   229	
   230		switch (dir) {
   231		case DMA_MEM_TO_DEV:
   232			liov = tx_iov;
   233			riov = rx_iov;
   234			cb_param->vq = vnet->ep.txvq;
   235			break;
   236		case DMA_DEV_TO_MEM:
   237			liov = rx_iov;
   238			riov = tx_iov;
   239			cb_param->vq = vnet->ep.rxvq;
   240			break;
   241		default:
   242			err = -EINVAL;
   243			goto err_free_param;
   244		}
   245	
   246		for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
   247			size_t len;
   248			u64 lbase, rbase;
   249			void (*callback)(void *) = NULL;
   250	
 > 251			lbase = (u64)liov->iov[liov->i].iov_base;
   252			rbase = (u64)riov->iov[riov->i].iov_base;
   253			len = tx_iov->iov[tx_iov->i].iov_len;
   254	
   255			if (tx_iov->i + 1 == tx_iov->used)
   256				callback = epf_vnet_dma_callback;
   257	
   258			err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
   259						  cb_param, dir);
   260			if (err)
   261				goto err_free_param;
   262		}
   263	
   264		return 1;
   265	
   266	err_free_param:
   267		kfree(cb_param);
   268	err_rx_complete:
   269		vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
   270	err_tx_complete:
   271		vringh_complete(tx_vrh, tx_head, total_tx_len);
   272	
   273		return err;
   274	}
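
Regarding the -Wpointer-to-int-cast warning flagged at line 251 above: on 32-bit targets such as arc, pointers are narrower than u64, so the usual way to silence it is to go through uintptr_t first (an illustrative fix, not necessarily the one the author will pick):

    /* Width-safe conversion of the kiov base pointers flagged above. */
    lbase = (u64)(uintptr_t)liov->iov[liov->i].iov_base;
    rbase = (u64)(uintptr_t)riov->iov[riov->i].iov_base;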
   275	
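
For completeness, a hypothetical caller loop following the return convention documented in the epf_vnet_transfer() kernel-doc above (example_pump_tx() and the choice of DMA_MEM_TO_DEV are illustrative assumptions, not code from the patch):

    #include <linux/errno.h>
    #include <linux/dmaengine.h>
    #include <linux/vringh.h>
    #include "pci-epf-vnet.h"

    static void example_pump_tx(struct epf_vnet *vnet, struct vringh *tx_vrh,
                                struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
                                struct vringh_kiov *rx_iov)
    {
            for (;;) {
                    int ret = epf_vnet_transfer(vnet, tx_vrh, rx_vrh,
                                                tx_iov, rx_iov, DMA_MEM_TO_DEV);

                    if (ret == 1)
                            continue;       /* one descriptor chain queued, try for more */
                    if (ret == -ENOSPC)
                            break;          /* destination vring full: retry later */
                    break;                  /* 0 (nothing to send) or a hard error */
            }
    }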

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-03 10:04 ` Shunsuke Mie
@ 2023-02-05 10:01   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-05 10:01 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Fri, Feb 03, 2023 at 07:04:14PM +0900, Shunsuke Mie wrote:
> This patchset introduce a virtio-net EP device function. It provides a
> new option to communiate between PCIe host and endpoint over IP.
> Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> It is used to transport data between virtio ring directly each other. It
> can be expected to better throughput.
> 
> To realize the function, this patchset has few changes and introduces a
> new APIs to PCI EP framework related to virtio. Furthermore, it device
> depends on the some patchtes that is discussing. Those depended patchset
> are following:
> - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> 
> About this patchset has 4 patches. The first of two patch is little changes
> to virtio. The third patch add APIs to easily access virtio data structure
> on PCIe Host side memory. The last one introduce a virtio-net EP device
> function. Details are in commit respectively.
> 
> Currently those network devices are testd using ping only. I'll add a
> result of performance evaluation using iperf and etc to the future version
> of this patchset.


All this feels like it'd need a virtio spec extension but I'm not 100%
sure without spending much more time understanding this.
What do you say?

> Shunsuke Mie (4):
>   virtio_pci: add a definition of queue flag in ISR
>   virtio_ring: remove const from vring getter
>   PCI: endpoint: Introduce virtio library for EP functions
>   PCI: endpoint: function: Add EP function driver to provide virtio net
>     device
> 
>  drivers/pci/endpoint/Kconfig                  |   7 +
>  drivers/pci/endpoint/Makefile                 |   1 +
>  drivers/pci/endpoint/functions/Kconfig        |  12 +
>  drivers/pci/endpoint/functions/Makefile       |   1 +
>  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
>  drivers/virtio/virtio_ring.c                  |   2 +-
>  include/linux/pci-epf-virtio.h                |  25 +
>  include/linux/virtio.h                        |   2 +-
>  include/uapi/linux/virtio_pci.h               |   2 +
>  13 files changed, 1590 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
>  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>  create mode 100644 include/linux/pci-epf-virtio.h
> 
> -- 
> 2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-05 10:01   ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2023-02-05 10:01 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

On Fri, Feb 03, 2023 at 07:04:14PM +0900, Shunsuke Mie wrote:
> This patchset introduce a virtio-net EP device function. It provides a
> new option to communiate between PCIe host and endpoint over IP.
> Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> It is used to transport data between virtio ring directly each other. It
> can be expected to better throughput.
> 
> To realize the function, this patchset has few changes and introduces a
> new APIs to PCI EP framework related to virtio. Furthermore, it device
> depends on the some patchtes that is discussing. Those depended patchset
> are following:
> - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> 
> About this patchset has 4 patches. The first of two patch is little changes
> to virtio. The third patch add APIs to easily access virtio data structure
> on PCIe Host side memory. The last one introduce a virtio-net EP device
> function. Details are in commit respectively.
> 
> Currently those network devices are testd using ping only. I'll add a
> result of performance evaluation using iperf and etc to the future version
> of this patchset.


All this feels like it'd need a virtio spec extension but I'm not 100%
sure without spending much more time understanding this.
What do you say?

> Shunsuke Mie (4):
>   virtio_pci: add a definition of queue flag in ISR
>   virtio_ring: remove const from vring getter
>   PCI: endpoint: Introduce virtio library for EP functions
>   PCI: endpoint: function: Add EP function driver to provide virtio net
>     device
> 
>  drivers/pci/endpoint/Kconfig                  |   7 +
>  drivers/pci/endpoint/Makefile                 |   1 +
>  drivers/pci/endpoint/functions/Kconfig        |  12 +
>  drivers/pci/endpoint/functions/Makefile       |   1 +
>  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
>  drivers/virtio/virtio_ring.c                  |   2 +-
>  include/linux/pci-epf-virtio.h                |  25 +
>  include/linux/virtio.h                        |   2 +-
>  include/uapi/linux/virtio_pci.h               |   2 +
>  13 files changed, 1590 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
>  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>  create mode 100644 include/linux/pci-epf-virtio.h
> 
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-03 21:48 ` Frank Li
@ 2023-02-07  1:43     ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07  1:43 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas

Feb 4, 2023 (Sat) 6:48 Frank Li <frank.li@nxp.com>:
>
> > foundation.org
> > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > function
> >
>
> The dependent EDMA patch can't be applied at last linux-next.
> Can you provide a git link? So I can try directly.
Sorry, I missed it. The embedded DMA patchset is
https://lore.kernel.org/linux-pci/20230113171409.30470-1-Sergey.Semin@baikalelectronics.ru/
and it has been merged to the pci/dwc branch of kernel/git/lpieralisi/pci.git. The
link is here:
https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/commit/?h=pci/dwc

I'll add this information to the cover letter in the next submission.
> Frank
>
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
>
Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-07  1:43     ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07  1:43 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

Feb 4, 2023 (Sat) 6:48 Frank Li <frank.li@nxp.com>:
>
> > foundation.org
> > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > function
> >
>
> The dependent EDMA patch can't be applied at last linux-next.
> Can you provide a git link? So I can try directly.
Sorry, I missed it. The embedded DMA patchset is
https://lore.kernel.org/linux-pci/20230113171409.30470-1-Sergey.Semin@baikalelectronics.ru/
and it has been merged to the pci/dwc branch of kernel/git/lpieralisi/pci.git. The
link is here:
https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/commit/?h=pci/dwc

I'll add this information to the cover letter in the next submission.
> Frank
>
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
>
Best,
Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-07  1:43     ` Shunsuke Mie
@ 2023-02-07  3:27       ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07  3:27 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

Feb 7, 2023 (Tue) 10:43 Shunsuke Mie <mie@igel.co.jp>:
>
> Feb 4, 2023 (Sat) 6:48 Frank Li <frank.li@nxp.com>:
> >
> > > foundation.org
> > > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > > function
> > >
> >
> > The dependent EDMA patch can't be applied at last linux-next.
> > Can you provide a git link? So I can try directly.
> Sorry, I've missed it. The embedded DMA's patchset is
> https://lore.kernel.org/linux-pci/20230113171409.30470-1-Sergey.Semin@baikalelectronics.ru/
> and, merged to a pci/dwc branch on kernel/git/lpieralisi/pci.git . The
> link is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/commit/?h=pci/dwc
In addition, the patches have been merged into next-20230131:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tag/?h=next-20230131

> I'll add the information to a cover letter from the next submission.
> > Frank
> >
> > >
> > > About this patchset has 4 patches. The first of two patch is little changes
> > > to virtio. The third patch add APIs to easily access virtio data structure
> > > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > > function. Details are in commit respectively.
> > >
> >
> Best,
> Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-07  3:27       ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07  3:27 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas

Feb 7, 2023 (Tue) 10:43 Shunsuke Mie <mie@igel.co.jp>:
>
> Feb 4, 2023 (Sat) 6:48 Frank Li <frank.li@nxp.com>:
> >
> > > foundation.org
> > > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > > function
> > >
> >
> > The dependent EDMA patch can't be applied at last linux-next.
> > Can you provide a git link? So I can try directly.
> Sorry, I've missed it. The embedded DMA's patchset is
> https://lore.kernel.org/linux-pci/20230113171409.30470-1-Sergey.Semin@baikalelectronics.ru/
> and, merged to a pci/dwc branch on kernel/git/lpieralisi/pci.git . The
> link is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/commit/?h=pci/dwc
In addition, the patches have been merged into next-20230131:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tag/?h=next-20230131

> I'll add the information to a cover letter from the next submission.
> > Frank
> >
> > >
> > > About this patchset has 4 patches. The first of two patch is little changes
> > > to virtio. The third patch add APIs to easily access virtio data structure
> > > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > > function. Details are in commit respectively.
> > >
> >
> Best,
> Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 1/4] virtio_pci: add a definition of queue flag in ISR
  2023-02-03 10:16     ` Michael S. Tsirkin
@ 2023-02-07 10:06       ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

Feb 3, 2023 (Fri) 19:16 Michael S. Tsirkin <mst@redhat.com>:
>
> On Fri, Feb 03, 2023 at 07:04:15PM +0900, Shunsuke Mie wrote:
> > Already it has beed defined a config changed flag of ISR, but not the queue
> > flag. Add a macro for it.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  include/uapi/linux/virtio_pci.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> > index f703afc7ad31..fa82afd6171a 100644
> > --- a/include/uapi/linux/virtio_pci.h
> > +++ b/include/uapi/linux/virtio_pci.h
> > @@ -94,6 +94,8 @@
> >
> >  #endif /* VIRTIO_PCI_NO_LEGACY */
> >
> > +/* Ths bit of the ISR which indicates a queue entry update */
>
> typo
> Something to add here:
>         Note: only when MSI-X is disabled
I'll fix both that way.
>
>
> > +#define VIRTIO_PCI_ISR_QUEUE         0x1
> >  /* The bit of the ISR which indicates a device configuration change. */
> >  #define VIRTIO_PCI_ISR_CONFIG                0x2
> >  /* Vector value used to disable MSI for queue */
> > --
> > 2.25.1
>
Best,
Shunsuke
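
For context, a sketch of how a legacy/INTx interrupt path would consume the new bit alongside the existing VIRTIO_PCI_ISR_CONFIG (the example_* names and struct are placeholders, not code from the patch; as noted above, the bit is only meaningful when MSI-X is disabled):

    #include <linux/interrupt.h>
    #include <linux/io.h>
    #include <linux/virtio_pci.h>

    struct example_vp_dev {
            void __iomem *isr;      /* mapping of the legacy ISR register */
    };

    void example_config_changed(struct example_vp_dev *vp_dev);    /* hypothetical */
    void example_vring_interrupt(struct example_vp_dev *vp_dev);   /* hypothetical */

    /*
     * Illustrative sketch: reading the 8-bit ISR also acks the interrupt,
     * then the handler dispatches on the two bits discussed above.
     */
    static irqreturn_t example_vp_interrupt(int irq, void *opaque)
    {
            struct example_vp_dev *vp_dev = opaque;
            u8 isr = ioread8(vp_dev->isr);

            if (!isr)
                    return IRQ_NONE;

            if (isr & VIRTIO_PCI_ISR_CONFIG)
                    example_config_changed(vp_dev);

            if (isr & VIRTIO_PCI_ISR_QUEUE)
                    example_vring_interrupt(vp_dev);

            return IRQ_HANDLED;
    }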

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 1/4] virtio_pci: add a definition of queue flag in ISR
@ 2023-02-07 10:06       ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

Feb 3, 2023 (Fri) 19:16 Michael S. Tsirkin <mst@redhat.com>:
>
> On Fri, Feb 03, 2023 at 07:04:15PM +0900, Shunsuke Mie wrote:
> > Already it has beed defined a config changed flag of ISR, but not the queue
> > flag. Add a macro for it.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  include/uapi/linux/virtio_pci.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> > index f703afc7ad31..fa82afd6171a 100644
> > --- a/include/uapi/linux/virtio_pci.h
> > +++ b/include/uapi/linux/virtio_pci.h
> > @@ -94,6 +94,8 @@
> >
> >  #endif /* VIRTIO_PCI_NO_LEGACY */
> >
> > +/* Ths bit of the ISR which indicates a queue entry update */
>
> typo
> Something to add here:
>         Note: only when MSI-X is disabled
I'll fix both that way.
>
>
> > +#define VIRTIO_PCI_ISR_QUEUE         0x1
> >  /* The bit of the ISR which indicates a device configuration change. */
> >  #define VIRTIO_PCI_ISR_CONFIG                0x2
> >  /* Vector value used to disable MSI for queue */
> > --
> > 2.25.1
>
Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-05 10:01   ` Michael S. Tsirkin
@ 2023-02-07 10:17     ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

Feb 5, 2023 (Sun) 19:02 Michael S. Tsirkin <mst@redhat.com>:
>
> On Fri, Feb 03, 2023 at 07:04:14PM +0900, Shunsuke Mie wrote:
> > This patchset introduce a virtio-net EP device function. It provides a
> > new option to communiate between PCIe host and endpoint over IP.
> > Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> > It is used to transport data between virtio ring directly each other. It
> > can be expected to better throughput.
> >
> > To realize the function, this patchset has few changes and introduces a
> > new APIs to PCI EP framework related to virtio. Furthermore, it device
> > depends on the some patchtes that is discussing. Those depended patchset
> > are following:
> > - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> > link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> > - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> > link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> > - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> > link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
> > Currently those network devices are testd using ping only. I'll add a
> > result of performance evaluation using iperf and etc to the future version
> > of this patchset.
>
>
> All this feels like it'd need a virtio spec extension but I'm not 100%
> sure without spending much more time understanding this.
> what do you say?
This patchset exposes the virtio-net device as a PCIe device. Could you tell me
which part of the spec you are concerned about?

> > Shunsuke Mie (4):
> >   virtio_pci: add a definition of queue flag in ISR
> >   virtio_ring: remove const from vring getter
> >   PCI: endpoint: Introduce virtio library for EP functions
> >   PCI: endpoint: function: Add EP function driver to provide virtio net
> >     device
> >
> >  drivers/pci/endpoint/Kconfig                  |   7 +
> >  drivers/pci/endpoint/Makefile                 |   1 +
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
> >  drivers/virtio/virtio_ring.c                  |   2 +-
> >  include/linux/pci-epf-virtio.h                |  25 +
> >  include/linux/virtio.h                        |   2 +-
> >  include/uapi/linux/virtio_pci.h               |   2 +
> >  13 files changed, 1590 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > --
> > 2.25.1
>

Best,
Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-07 10:17     ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Sun, Feb 5, 2023 at 19:02 Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 03, 2023 at 07:04:14PM +0900, Shunsuke Mie wrote:
> > This patchset introduce a virtio-net EP device function. It provides a
> > new option to communiate between PCIe host and endpoint over IP.
> > Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> > It is used to transport data between virtio ring directly each other. It
> > can be expected to better throughput.
> >
> > To realize the function, this patchset has few changes and introduces a
> > new APIs to PCI EP framework related to virtio. Furthermore, it device
> > depends on the some patchtes that is discussing. Those depended patchset
> > are following:
> > - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> > link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> > - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> > link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> > - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> > link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
> > Currently those network devices are testd using ping only. I'll add a
> > result of performance evaluation using iperf and etc to the future version
> > of this patchset.
>
>
> All this feels like it'd need a virtio spec extension, but I'm not 100%
> sure without spending much more time understanding this.
> What do you say?
This patchset exposes the virtio-net device as a PCIe device. Could you
tell me which part of the spec you are concerned about?

> > Shunsuke Mie (4):
> >   virtio_pci: add a definition of queue flag in ISR
> >   virtio_ring: remove const from vring getter
> >   PCI: endpoint: Introduce virtio library for EP functions
> >   PCI: endpoint: function: Add EP function driver to provide virtio net
> >     device
> >
> >  drivers/pci/endpoint/Kconfig                  |   7 +
> >  drivers/pci/endpoint/Makefile                 |   1 +
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
> >  drivers/virtio/virtio_ring.c                  |   2 +-
> >  include/linux/pci-epf-virtio.h                |  25 +
> >  include/linux/virtio.h                        |   2 +-
> >  include/uapi/linux/virtio_pci.h               |   2 +
> >  13 files changed, 1590 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > --
> > 2.25.1
>

Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-03 16:45 ` [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function Frank Li
@ 2023-02-07 10:29     ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:29 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization

On Sat, Feb 4, 2023 at 1:45 Frank Li <frank.li@nxp.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Shunsuke Mie <mie@igel.co.jp>
> > Sent: Friday, February 3, 2023 4:04 AM
> > To: Lorenzo Pieralisi <lpieralisi@kernel.org>
> > Cc: Krzysztof Wilczyński <kw@linux.com>; Manivannan Sadhasivam
> > <mani@kernel.org>; Kishon Vijay Abraham I <kishon@kernel.org>; Bjorn
> > Helgaas <bhelgaas@google.com>; Michael S. Tsirkin <mst@redhat.com>;
> > Jason Wang <jasowang@redhat.com>; Shunsuke Mie <mie@igel.co.jp>;
> > Frank Li <frank.li@nxp.com>; Jon Mason <jdmason@kudzu.us>; Ren Zhijie
> > <renzhijie2@huawei.com>; Takanari Hayama <taki@igel.co.jp>; linux-
> > kernel@vger.kernel.org; linux-pci@vger.kernel.org; virtualization@lists.linux-
> > foundation.org
> > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > function
> >
> > Caution: EXT Email
> >
> > This patchset introduce a virtio-net EP device function. It provides a
> > new option to communiate between PCIe host and endpoint over IP.
> > Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> > It is used to transport data between virtio ring directly each other. It
> > can be expected to better throughput.
>
> Thanks, basically that's what I want. I am trying to use RDMA,
> but I think virtio-net is still a good solution.
We plan to extend this module to support RDMA. The plan is based on
virtio-rdma[1], which extends virtio-net, and we intend to implement the
proposed spec on top of this patchset.
[1] virtio-rdma
- proposal:
https://lore.kernel.org/all/20220511095900.343-1-xieyongji@bytedance.com/T/
- presentation on kvm forum:
https://youtu.be/Qrhv6hC_YK4

Please feel free to comment and suggest.
> Frank Li
>
> >
> > To realize the function, this patchset has few changes and introduces a
> > new APIs to PCI EP framework related to virtio. Furthermore, it device
> > depends on the some patchtes that is discussing. Those depended patchset
> > are following:
> > - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> > link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> > - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> > link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> > - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> > link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
> > Currently those network devices are testd using ping only. I'll add a
> > result of performance evaluation using iperf and etc to the future version
> > of this patchset.
> >
> > Shunsuke Mie (4):
> >   virtio_pci: add a definition of queue flag in ISR
> >   virtio_ring: remove const from vring getter
> >   PCI: endpoint: Introduce virtio library for EP functions
> >   PCI: endpoint: function: Add EP function driver to provide virtio net
> >     device
> >
> >  drivers/pci/endpoint/Kconfig                  |   7 +
> >  drivers/pci/endpoint/Makefile                 |   1 +
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
> >  drivers/virtio/virtio_ring.c                  |   2 +-
> >  include/linux/pci-epf-virtio.h                |  25 +
> >  include/linux/virtio.h                        |   2 +-
> >  include/uapi/linux/virtio_pci.h               |   2 +
> >  13 files changed, 1590 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > --
> > 2.25.1
>
Best,
Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-07 10:29     ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:29 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas

On Sat, Feb 4, 2023 at 1:45 Frank Li <frank.li@nxp.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Shunsuke Mie <mie@igel.co.jp>
> > Sent: Friday, February 3, 2023 4:04 AM
> > To: Lorenzo Pieralisi <lpieralisi@kernel.org>
> > Cc: Krzysztof Wilczyński <kw@linux.com>; Manivannan Sadhasivam
> > <mani@kernel.org>; Kishon Vijay Abraham I <kishon@kernel.org>; Bjorn
> > Helgaas <bhelgaas@google.com>; Michael S. Tsirkin <mst@redhat.com>;
> > Jason Wang <jasowang@redhat.com>; Shunsuke Mie <mie@igel.co.jp>;
> > Frank Li <frank.li@nxp.com>; Jon Mason <jdmason@kudzu.us>; Ren Zhijie
> > <renzhijie2@huawei.com>; Takanari Hayama <taki@igel.co.jp>; linux-
> > kernel@vger.kernel.org; linux-pci@vger.kernel.org; virtualization@lists.linux-
> > foundation.org
> > Subject: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP
> > function
> >
> > Caution: EXT Email
> >
> > This patchset introduce a virtio-net EP device function. It provides a
> > new option to communiate between PCIe host and endpoint over IP.
> > Advantage of this option is that the driver fully uses a PCIe embedded DMA.
> > It is used to transport data between virtio ring directly each other. It
> > can be expected to better throughput.
>
> Thanks, basically that's what I want. I am trying to use RDMA,
> but I think virtio-net is still a good solution.
We plan to extend this module to support RDMA. The plan is based on
virtio-rdma[1], which extends virtio-net, and we intend to implement the
proposed spec on top of this patchset.
[1] virtio-rdma
- proposal:
https://lore.kernel.org/all/20220511095900.343-1-xieyongji@bytedance.com/T/
- presentation on kvm forum:
https://youtu.be/Qrhv6hC_YK4

Please feel free to comment and suggest.
> Frank Li
>
> >
> > To realize the function, this patchset has few changes and introduces a
> > new APIs to PCI EP framework related to virtio. Furthermore, it device
> > depends on the some patchtes that is discussing. Those depended patchset
> > are following:
> > - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
> > link: https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
> > - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> > link: https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
> > - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> > link: https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
> >
> > About this patchset has 4 patches. The first of two patch is little changes
> > to virtio. The third patch add APIs to easily access virtio data structure
> > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > function. Details are in commit respectively.
> >
> > Currently those network devices are testd using ping only. I'll add a
> > result of performance evaluation using iperf and etc to the future version
> > of this patchset.
> >
> > Shunsuke Mie (4):
> >   virtio_pci: add a definition of queue flag in ISR
> >   virtio_ring: remove const from vring getter
> >   PCI: endpoint: Introduce virtio library for EP functions
> >   PCI: endpoint: function: Add EP function driver to provide virtio net
> >     device
> >
> >  drivers/pci/endpoint/Kconfig                  |   7 +
> >  drivers/pci/endpoint/Makefile                 |   1 +
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
> >  drivers/virtio/virtio_ring.c                  |   2 +-
> >  include/linux/pci-epf-virtio.h                |  25 +
> >  include/linux/virtio.h                        |   2 +-
> >  include/uapi/linux/virtio_pci.h               |   2 +
> >  13 files changed, 1590 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > --
> > 2.25.1
>
Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 10:22     ` Michael S. Tsirkin
@ 2023-02-07 10:47       ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

On Fri, Feb 3, 2023 at 19:22 Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> > Add a new endpoint(EP) function driver to provide virtio-net device. This
> > function not only shows virtio-net device for PCIe host system, but also
> > provides virtio-net device to EP side(local) system. Virtualy those network
> > devices are connected, so we can use to communicate over IP like a simple
> > NIC.
> >
> > Architecture overview is following:
> >
> > to Host       |                       to Endpoint
> > network stack |                 network stack
> >       |       |                       |
> > +-----------+ |       +-----------+   +-----------+
> > |virtio-net | |       |virtio-net |   |virtio-net |
> > |driver     | |       |EP function|---|driver     |
> > +-----------+ |       +-----------+   +-----------+
> >       |       |             |
> > +-----------+ | +-----------+
> > |PCIeC      | | |PCIeC      |
> > |Rootcomplex|-|-|Endpoint   |
> > +-----------+ | +-----------+
> >   Host side   |          Endpoint side
> >
> > This driver uses PCIe EP framework to show virtio-net (pci) device Host
> > side, and generate virtual virtio-net device and register to EP side.
> > A communication date
>
> data?
>
> > is diractly
>
> directly?
Sorry, I'll revise this wording.
> > transported between virtqueue level
> > with each other using PCIe embedded DMA controller.
> >
> > by a limitation of the hardware and Linux EP framework, this function
> > follows a virtio legacy specification.
>
> What exactly is the limitation, and why does it force legacy?
A modern virtio PCI device has to provide the virtio PCI capabilities in
its configuration space. The DesignWare PCIe controller fitted on several
boards has no way to expose such custom PCI capabilities, at least, and
the Linux PCI EP framework doesn't support adding them either.

These explanations should be in the cover letter; I'll add them.
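
To make the virtqueue-to-virtqueue transport described in the commit
message above more concrete, here is a minimal, illustrative sketch that
bridges one buffer between two in-kernel vrings using only the existing
vringh kernel accessors, with a bounce-buffer copy standing in for the
embedded DMA. example_bridge_one() and its error handling are made up for
this sketch; the patchset's epf_vnet_transfer() presumably does the
equivalent with the IO-memory vringh accessors and the eDMA instead.

#include <linux/slab.h>
#include <linux/vringh.h>

static int example_bridge_one(struct vringh *src_vrh, struct vringh *dst_vrh,
			      struct vringh_kiov *src_iov,
			      struct vringh_kiov *dst_iov)
{
	u16 src_head, dst_head;
	ssize_t pulled, pushed;
	size_t len;
	void *buf;
	int ret;

	/* Take one readable buffer from the source (tx) ring. */
	ret = vringh_getdesc_kern(src_vrh, src_iov, NULL, &src_head, GFP_KERNEL);
	if (ret <= 0)		/* 0: ring empty, <0: error */
		return ret;

	/* Take one writable buffer from the destination (rx) ring. */
	ret = vringh_getdesc_kern(dst_vrh, NULL, dst_iov, &dst_head, GFP_KERNEL);
	if (ret <= 0) {
		vringh_abandon_kern(src_vrh, 1);
		return ret ? ret : -ENOSPC;
	}

	len = vringh_kiov_length(src_iov);
	buf = kmalloc(len, GFP_KERNEL);
	if (!buf) {
		vringh_abandon_kern(src_vrh, 1);
		vringh_abandon_kern(dst_vrh, 1);
		return -ENOMEM;
	}

	/* Bounce copy; the real driver lets the eDMA do this part. */
	pulled = vringh_iov_pull_kern(src_iov, buf, len);
	pushed = vringh_iov_push_kern(dst_iov, buf, pulled);
	kfree(buf);

	/* Nothing is written back into the tx buffer, so its used length is 0. */
	vringh_complete_kern(src_vrh, src_head, 0);
	vringh_complete_kern(dst_vrh, dst_head, pushed);

	return 1;
}

A caller would simply loop on this until it returns 0, which matches the
"while (epf_vnet_transfer(...) > 0);" pattern visible in the tx handlers
quoted below.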
> > This function driver has beed tested on S4 Rcar (r8a779fa-spider) board but
> > just use the PCIe EP framework and depends on the PCIe EDMA.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  6 files changed, 1440 insertions(+)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >
> > diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
> > index 9fd560886871..f88d8baaf689 100644
> > --- a/drivers/pci/endpoint/functions/Kconfig
> > +++ b/drivers/pci/endpoint/functions/Kconfig
> > @@ -37,3 +37,15 @@ config PCI_EPF_VNTB
> >         between PCI Root Port and PCIe Endpoint.
> >
> >         If in doubt, say "N" to disable Endpoint NTB driver.
> > +
> > +config PCI_EPF_VNET
> > +     tristate "PCI Endpoint virtio-net driver"
> > +     depends on PCI_ENDPOINT
> > +     select PCI_ENDPOINT_VIRTIO
> > +     select VHOST_RING
> > +     select VHOST_IOMEM
> > +     help
> > +       PCIe Endpoint virtio-net function implementation. This module enables to
> > +       show the virtio-net as pci device to PCIe Host side, and, another
> > +       virtio-net device show to local machine. Those devices can communicate
> > +       each other.
> > diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
> > index 5c13001deaba..74cc4c330c62 100644
> > --- a/drivers/pci/endpoint/functions/Makefile
> > +++ b/drivers/pci/endpoint/functions/Makefile
> > @@ -6,3 +6,4 @@
> >  obj-$(CONFIG_PCI_EPF_TEST)           += pci-epf-test.o
> >  obj-$(CONFIG_PCI_EPF_NTB)            += pci-epf-ntb.o
> >  obj-$(CONFIG_PCI_EPF_VNTB)           += pci-epf-vntb.o
> > +obj-$(CONFIG_PCI_EPF_VNET)           += pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> > new file mode 100644
> > index 000000000000..93b7e00e8d06
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> > @@ -0,0 +1,343 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Functions work for Endpoint side(local) using EPF framework
> > + */
> > +#include <linux/pci-epc.h>
> > +#include <linux/virtio_pci.h>
> > +#include <linux/virtio_net.h>
> > +#include <linux/virtio_ring.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
> > +{
> > +     return container_of(vdev, struct epf_vnet, ep.vdev);
> > +}
> > +
> > +static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
> > +{
> > +     vnet->ep.net_config_status |= status;
> > +}
> > +
> > +static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
> > +{
> > +     vnet->ep.net_config_status &= ~status;
> > +}
> > +
> > +static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
> > +{
> > +     virtio_config_changed(&vnet->ep.vdev);
> > +}
> > +
> > +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_ep_set_status(vnet,
> > +                            VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> > +     epf_vnet_ep_raise_config_irq(vnet);
> > +}
> > +
> > +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
> > +{
> > +     vring_interrupt(0, vq);
> > +}
> > +
> > +static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
> > +{
> > +     struct vringh *vrh = &vnet->ep.ctlvrh;
> > +     struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
> > +     struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
> > +     struct virtio_net_ctrl_hdr *hdr;
> > +     virtio_net_ctrl_ack *ack;
> > +     int err;
> > +     u16 head;
> > +     size_t len;
> > +
> > +     err = vringh_getdesc(vrh, riov, wiov, &head);
> > +     if (err <= 0)
> > +             goto done;
> > +
> > +     len = vringh_kiov_length(riov);
> > +     if (len < sizeof(*hdr)) {
> > +             pr_debug("Command is too short: %ld\n", len);
> > +             err = -EIO;
> > +             goto done;
> > +     }
> > +
> > +     if (vringh_kiov_length(wiov) < sizeof(*ack)) {
> > +             pr_debug("Space for ack is not enough\n");
> > +             err = -EIO;
> > +             goto done;
> > +     }
> > +
> > +     hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
> > +     ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
> > +
> > +     switch (hdr->class) {
> > +     case VIRTIO_NET_CTRL_ANNOUNCE:
> > +             if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> > +                     pr_debug("Invalid command: announce: %d\n", hdr->cmd);
> > +                     goto done;
> > +             }
> > +
> > +             epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
> > +             *ack = VIRTIO_NET_OK;
> > +             break;
> > +     default:
> > +             pr_debug("Found not supported class: %d\n", hdr->class);
> > +             err = -EIO;
> > +     }
> > +
> > +done:
> > +     vringh_complete(vrh, head, len);
> > +     return err;
> > +}
> > +
> > +static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     return vnet->virtio_features;
> > +}
> > +
> > +static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     if (vdev->features != vnet->virtio_features)
> > +             return -EINVAL;
> > +
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
> > +                                     unsigned int offset, void *buf,
> > +                                     unsigned int len)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +     const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
> > +     const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
> > +     unsigned int copy_len;
> > +
> > +     switch (offset) {
> > +     case offsetof(struct virtio_net_config, mac):
> > +             /* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
> > +              * clear the buffer.
> > +              */
> > +             copy_len = len >= mac_len ? mac_len : len;
> > +             memset(buf, 0x00, copy_len);
> > +             len -= copy_len;
> > +             buf += copy_len;
> > +             fallthrough;
> > +     case offsetof(struct virtio_net_config, status):
> > +             copy_len = len >= status_len ? status_len : len;
> > +             memcpy(buf, &vnet->ep.net_config_status, copy_len);
> > +             len -= copy_len;
> > +             buf += copy_len;
> > +             fallthrough;
> > +     default:
> > +             if (offset > sizeof(vnet->vnet_cfg)) {
> > +                     memset(buf, 0x00, len);
> > +                     break;
> > +             }
> > +             memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
> > +     }
> > +}
> > +
> > +static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
> > +                                     unsigned int offset, const void *buf,
> > +                                     unsigned int len)
> > +{
> > +     /* Do nothing, because all of virtio net config space is readonly. */
> > +}
> > +
> > +static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
> > +{
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     if (status & VIRTIO_CONFIG_S_DRIVER_OK)
> > +             epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
> > +}
> > +
> > +static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
> > +{
> > +     pr_debug("doesn't support yet");
> > +}
> > +
> > +static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
> > +     struct vringh *tx_vrh = &vnet->ep.txvrh;
> > +     struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
> > +     struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
> > +     struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
> > +     int err;
> > +
> > +     /* Support only one queue pair */
> > +     switch (vq->index) {
> > +     case 0: // rx queue
> > +             break;
> > +     case 1: // tx queue
> > +             while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
> > +                                             rx_iov, DMA_MEM_TO_DEV)) > 0)
> > +                     ;
> > +             if (err < 0)
> > +                     pr_debug("Failed to transmit: EP -> Host: %d\n", err);
> > +             break;
> > +     case 2: // control queue
> > +             epf_vnet_ep_process_ctrlq_entry(vnet);
> > +             break;
> > +     default:
> > +             return false;
> > +     }
> > +
> > +     return true;
> > +}
> > +
> > +static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
> > +                                  unsigned int nvqs, struct virtqueue *vqs[],
> > +                                  vq_callback_t *callback[],
> > +                                  const char *const names[], const bool *ctx,
> > +                                  struct irq_affinity *desc)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +     const size_t vq_size = epf_vnet_get_vq_size();
> > +     int i;
> > +     int err;
> > +     int qidx;
> > +
> > +     for (qidx = 0, i = 0; i < nvqs; i++) {
> > +             struct virtqueue *vq;
> > +             struct vring *vring;
> > +             struct vringh *vrh;
> > +
> > +             if (!names[i]) {
> > +                     vqs[i] = NULL;
> > +                     continue;
> > +             }
> > +
> > +             vq = vring_create_virtqueue(qidx++, vq_size,
> > +                                         VIRTIO_PCI_VRING_ALIGN, vdev, true,
> > +                                         false, ctx ? ctx[i] : false,
> > +                                         epf_vnet_ep_vdev_vq_notify,
> > +                                         callback[i], names[i]);
> > +             if (!vq) {
> > +                     err = -ENOMEM;
> > +                     goto err_del_vqs;
> > +             }
> > +
> > +             vqs[i] = vq;
> > +             vring = virtqueue_get_vring(vq);
> > +
> > +             switch (i) {
> > +             case 0: // rx
> > +                     vrh = &vnet->ep.rxvrh;
> > +                     vnet->ep.rxvq = vq;
> > +                     break;
> > +             case 1: // tx
> > +                     vrh = &vnet->ep.txvrh;
> > +                     vnet->ep.txvq = vq;
> > +                     break;
> > +             case 2: // control
> > +                     vrh = &vnet->ep.ctlvrh;
> > +                     vnet->ep.ctlvq = vq;
> > +                     break;
> > +             default:
> > +                     err = -EIO;
> > +                     goto err_del_vqs;
> > +             }
> > +
> > +             err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
> > +                                    true, GFP_KERNEL, vring->desc,
> > +                                    vring->avail, vring->used);
> > +             if (err) {
> > +                     pr_err("failed to init vringh for vring %d\n", i);
> > +                     goto err_del_vqs;
> > +             }
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +
> > +     return 0;
> > +
> > +err_free_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> > +
> > +err_del_vqs:
> > +     for (; i >= 0; i--) {
> > +             if (!names[i])
> > +                     continue;
> > +
> > +             if (!vqs[i])
> > +                     continue;
> > +
> > +             vring_del_virtqueue(vqs[i]);
> > +     }
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
> > +{
> > +     struct virtqueue *vq, *n;
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     list_for_each_entry_safe(vq, n, &vdev->vqs, list)
> > +             vring_del_virtqueue(vq);
> > +
> > +     epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> > +}
> > +
> > +static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
> > +     .get_features = epf_vnet_ep_vdev_get_features,
> > +     .finalize_features = epf_vnet_ep_vdev_finalize_features,
> > +     .get = epf_vnet_ep_vdev_get_config,
> > +     .set = epf_vnet_ep_vdev_set_config,
> > +     .get_status = epf_vnet_ep_vdev_get_status,
> > +     .set_status = epf_vnet_ep_vdev_set_status,
> > +     .reset = epf_vnet_ep_vdev_reset,
> > +     .find_vqs = epf_vnet_ep_vdev_find_vqs,
> > +     .del_vqs = epf_vnet_ep_vdev_del_vqs,
> > +};
> > +
> > +void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
> > +{
> > +     unregister_virtio_device(&vnet->ep.vdev);
> > +}
> > +
> > +int epf_vnet_ep_setup(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     struct virtio_device *vdev = &vnet->ep.vdev;
> > +
> > +     vdev->dev.parent = vnet->epf->epc->dev.parent;
> > +     vdev->config = &epf_vnet_ep_vdev_config_ops;
> > +     vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > +     vdev->id.device = VIRTIO_ID_NET;
> > +
> > +     err = register_virtio_device(vdev);
> > +     if (err)
> > +             return err;
> > +
> > +     return 0;
> > +}
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> > new file mode 100644
> > index 000000000000..2ca0245a9134
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> > @@ -0,0 +1,635 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Functions work for PCie Host side(remote) using EPF framework.
> > + */
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/pci_ids.h>
> > +#include <linux/sched.h>
> > +#include <linux/virtio_pci.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
> > +
> > +/* Returns an out side of the valid queue index. */
> > +static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
> > +
> > +{
> > +     /* number of queue pairs and control queue */
> > +     return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
> > +}
> > +
> > +static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
> > +                                   void *buf, size_t len)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     memcpy_toio(base, buf, len);
> > +}
> > +
> > +static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
> > +                                 u8 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite8(ioread8(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
> > +                                  u16 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite16(ioread16(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
> > +                                    u16 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite16(ioread16(base) & ~config, base);
> > +}
> > +
> > +static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
> > +                                  u32 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite32(ioread32(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
> > +     queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> > +}
> > +
> > +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_rc_set_config16(vnet,
> > +                              VIRTIO_PCI_CONFIG_OFF(false) +
> > +                                      offsetof(struct virtio_net_config,
> > +                                               status),
> > +                              VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> > +     epf_vnet_rc_raise_config_irq(vnet);
> > +}
> > +
> > +/*
> > + * For the PCIe host, this driver shows legacy virtio-net device. Because,
> > + * virtio structure pci capabilities is mandatory for modern virtio device,
> > + * but there is no PCIe EP hardware that can be configured with any pci
> > + * capabilities and Linux PCIe EP framework doesn't support it.
> > + */
> > +static struct pci_epf_header epf_vnet_pci_header = {
> > +     .vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
> > +     .deviceid = VIRTIO_TRANS_ID_NET,
> > +     .subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
> > +     .subsys_id = VIRTIO_ID_NET,
> > +     .revid = 0,
> > +     .baseclass_code = PCI_BASE_CLASS_NETWORK,
> > +     .interrupt_pin = PCI_INTERRUPT_PIN,
> > +};
> > +
> > +static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
> > +                                   void __iomem *cfg_base)
> > +{
> > +     u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
> > +
> > +     epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
> > +                              vnet->virtio_features);
> > +
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
> > +     /*
> > +      * Initialize the queue notify and selector to outside of the appropriate
> > +      * virtqueue index. It is used to detect change with polling. There is no
> > +      * other ways to detect host side driver updateing those values
> > +      */
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
> > +     /* This pfn is also set to 0 for the polling as well */
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> > +
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
> > +                              epf_vnet_get_vq_size());
> > +     epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
> > +     epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
> > +                               &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
> > +}
> > +
> > +static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
> > +{
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
> > +                       &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
> > +     pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> > +                        PRIMARY_INTERFACE);
> > +}
> > +
> > +static int epf_vnet_setup_bar(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     size_t cfg_bar_size =
> > +             VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
> > +     struct pci_epf *epf = vnet->epf;
> > +     const struct pci_epc_features *features;
> > +     struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
> > +
> > +     features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
> > +     if (!features) {
> > +             pr_debug("Failed to get PCI EPC features\n");
> > +             return -EOPNOTSUPP;
> > +     }
> > +
> > +     if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
> > +             pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
> > +             return -EOPNOTSUPP;
> > +     }
> > +
> > +     if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> > +             if (cfg_bar_size >
> > +                 features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> > +                     pr_debug("PCI BAR size is not enough\n");
> > +                     return -ENOMEM;
> > +             }
> > +     }
> > +
> > +     config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> > +
> > +     vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
> > +                                             VIRTIO_NET_LEGACY_CFG_BAR,
> > +                                             features->align,
> > +                                             PRIMARY_INTERFACE);
> > +     if (!vnet->rc.cfg_base) {
> > +             pr_debug("Failed to allocate virtio-net config memory\n");
> > +             return -ENOMEM;
> > +     }
> > +
> > +     epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
> > +
> > +     err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
> > +                           config_bar);
> > +     if (err) {
> > +             pr_debug("Failed to set PCI BAR");
> > +             goto err_free_space;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_free_space:
> > +     pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> > +                        PRIMARY_INTERFACE);
> > +     return err;
> > +}
> > +
> > +static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
> > +                                      u32 *rxpfn, u32 *ctlpfn)
> > +{
> > +     const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
> > +     const u16 default_sel = nqueues;
> > +     u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
> > +     u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
> > +     u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
> > +     u32 pfn;
> > +     u16 sel;
> > +     struct {
> > +             u32 pfn;
> > +             u16 sel;
> > +     } tmp[3] = {};
> > +     int tmp_index = 0;
> > +
> > +     *rxpfn = *txpfn = *ctlpfn = 0;
> > +
> > +     /* To avoid to miss a getting the pfn and selector for virtqueue wrote by
> > +      * host driver, we need to implement fast polling with saving.
> > +      *
> > +      * This implementation suspects that the host driver writes pfn only once
> > +      * for each queues
> > +      */
> > +     while (tmp_index < nqueues) {
> > +             pfn = ioread32(queue_pfn);
> > +             if (pfn == 0)
> > +                     continue;
> > +
> > +             iowrite32(0, queue_pfn);
> > +
> > +             sel = ioread16(queue_sel);
> > +             if (sel == default_sel)
> > +                     continue;
> > +
> > +             tmp[tmp_index].pfn = pfn;
> > +             tmp[tmp_index].sel = sel;
> > +             tmp_index++;
> > +     }
> > +
> > +     while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
> > +             ;
> > +
> > +     for (int i = 0; i < nqueues; ++i) {
> > +             switch (tmp[i].sel) {
> > +             case 0:
> > +                     *rxpfn = tmp[i].pfn;
> > +                     break;
> > +             case 1:
> > +                     *txpfn = tmp[i].pfn;
> > +                     break;
> > +             case 2:
> > +                     *ctlpfn = tmp[i].pfn;
> > +                     break;
> > +             }
> > +     }
> > +
> > +     if (!*rxpfn || !*txpfn || !*ctlpfn)
> > +             return -EIO;
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_monitor_notify(void *data)
> > +{
> > +     struct epf_vnet *vnet = data;
> > +     u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
> > +     const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
> > +
> > +     epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
> > +
> > +     /* Poll to detect a change of the queue_notify register. Sometimes this
> > +      * polling misses the change, so try to check each virtqueues
> > +      * everytime.
> > +      */
> > +     while (true) {
> > +             while (ioread16(queue_notify) == notify_default)
> > +                     ;
> > +             iowrite16(notify_default, queue_notify);
> > +
> > +             queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
> > +             queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
> > +{
> > +     vnet->rc.notify_monitor_task =
> > +             kthread_create(epf_vnet_rc_monitor_notify, vnet,
> > +                            "pci-epf-vnet/cfg_negotiator");
> > +     if (IS_ERR(vnet->rc.notify_monitor_task))
> > +             return PTR_ERR(vnet->rc.notify_monitor_task);
> > +
> > +     /* Change the thread priority to high for polling. */
> > +     sched_set_fifo(vnet->rc.notify_monitor_task);
> > +     wake_up_process(vnet->rc.notify_monitor_task);
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_device_setup(void *data)
> > +{
> > +     struct epf_vnet *vnet = data;
> > +     struct pci_epf *epf = vnet->epf;
> > +     u32 txpfn, rxpfn, ctlpfn;
> > +     const size_t vq_size = epf_vnet_get_vq_size();
> > +     int err;
> > +
> > +     err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
> > +     if (err) {
> > +             pr_debug("Failed to negatiate configs with driver\n");
> > +             return err;
> > +     }
> > +
> > +     /* Polling phase is finished. This thread backs to normal priority. */
> > +     sched_set_normal(vnet->rc.device_setup_task, 19);
> > +
> > +     vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> > +                                                  txpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.txvrh)) {
> > +             pr_debug("Failed to setup virtqueue for tx\n");
> > +             return PTR_ERR(vnet->rc.txvrh);
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_tx_vringh;
> > +
> > +     vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> > +                                                  rxpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.rxvrh)) {
> > +             pr_debug("Failed to setup virtqueue for rx\n");
> > +             err = PTR_ERR(vnet->rc.rxvrh);
> > +             goto err_deinit_tx_kiov;
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_rx_vringh;
> > +
> > +     vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
> > +             epf, vnet->virtio_features, ctlpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.ctlvrh)) {
> > +             pr_err("failed to setup virtqueue\n");
> > +             err = PTR_ERR(vnet->rc.ctlvrh);
> > +             goto err_deinit_rx_kiov;
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_ctl_vringh;
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
> > +     if (err)
> > +             goto err_deinit_ctl_riov;
> > +
> > +     err = epf_vnet_rc_spawn_notify_monitor(vnet);
> > +     if (err) {
> > +             pr_debug("Failed to create notify monitor thread\n");
> > +             goto err_deinit_ctl_wiov;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_deinit_ctl_wiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
> > +err_deinit_ctl_riov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
> > +err_free_epf_ctl_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
> > +err_deinit_rx_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
> > +err_free_epf_rx_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
> > +err_deinit_tx_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
> > +err_free_epf_tx_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
> > +
> > +     return err;
> > +}
> > +
> > +static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
> > +{
> > +     vnet->rc.device_setup_task = kthread_create(
> > +             epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
> > +     if (IS_ERR(vnet->rc.device_setup_task))
> > +             return PTR_ERR(vnet->rc.device_setup_task);
> > +
> > +     /* Change the thread priority to high for the polling. */
> > +     sched_set_fifo(vnet->rc.device_setup_task);
> > +     wake_up_process(vnet->rc.device_setup_task);
> > +
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_rc_tx_handler(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
> > +     struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
> > +     struct vringh *rx_vrh = &vnet->ep.rxvrh;
> > +     struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
> > +     struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
> > +
> > +     while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
> > +                              DMA_DEV_TO_MEM) > 0)
> > +             ;
> > +}
> > +
> > +static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet =
> > +             container_of(work, struct epf_vnet, rc.raise_irq_work);
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
> > +                       PCI_EPC_IRQ_LEGACY, 0);
> > +}
> > +
> > +struct epf_vnet_rc_meminfo {
> > +     void __iomem *addr, *virt;
> > +     phys_addr_t phys;
> > +     size_t len;
> > +};
> > +
> > +/* Util function to access PCIe host side memory from local CPU.  */
> > +static struct epf_vnet_rc_meminfo *
> > +epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
> > +{
> > +     int err;
> > +     phys_addr_t aaddr, phys_addr;
> > +     size_t asize, offset;
> > +     void __iomem *virt_addr;
> > +     struct epf_vnet_rc_meminfo *meminfo;
> > +
> > +     err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
> > +     if (err) {
> > +             pr_debug("Failed to get EPC align: %d\n", err);
> > +             return NULL;
> > +     }
> > +
> > +     offset = pci_addr - aaddr;
> > +
> > +     virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
> > +     if (!virt_addr) {
> > +             pr_debug("Failed to allocate epc memory\n");
> > +             return NULL;
> > +     }
> > +
> > +     err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
> > +                            aaddr, asize);
> > +     if (err) {
> > +             pr_debug("Failed to map epc memory\n");
> > +             goto err_epc_free_addr;
> > +     }
> > +
> > +     meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
> > +     if (!meminfo)
> > +             goto err_epc_unmap_addr;
> > +
> > +     meminfo->virt = virt_addr;
> > +     meminfo->phys = phys_addr;
> > +     meminfo->len = len;
> > +     meminfo->addr = virt_addr + offset;
> > +
> > +     return meminfo;
> > +
> > +err_epc_unmap_addr:
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> > +                        meminfo->phys);
> > +err_epc_free_addr:
> > +     pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> > +                           meminfo->len);
> > +
> > +     return NULL;
> > +}
> > +
> > +static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
> > +                                struct epf_vnet_rc_meminfo *meminfo)
> > +{
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> > +                        meminfo->phys);
> > +     pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> > +                           meminfo->len);
> > +     kfree(meminfo);
> > +}
> > +
> > +static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
> > +{
> > +     struct vringh_kiov *riov = &vnet->rc.ctl_riov;
> > +     struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
> > +     struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
> > +     struct pci_epf *epf = vnet->epf;
> > +     struct epf_vnet_rc_meminfo *rmem, *wmem;
> > +     struct virtio_net_ctrl_hdr *hdr;
> > +     int err;
> > +     u16 head;
> > +     size_t total_len;
> > +     u8 class, cmd;
> > +
> > +     err = vringh_getdesc(vrh, riov, wiov, &head);
> > +     if (err <= 0)
> > +             return err;
> > +
> > +     total_len = vringh_kiov_length(riov);
> > +
> > +     rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
> > +                                 riov->iov[riov->i].iov_len);
> > +     if (!rmem) {
> > +             err = -ENOMEM;
> > +             goto err_abandon_descs;
> > +     }
> > +
> > +     wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
> > +                                 wiov->iov[wiov->i].iov_len);
> > +     if (!wmem) {
> > +             err = -ENOMEM;
> > +             goto err_epc_unmap_rmem;
> > +     }
> > +
> > +     hdr = rmem->addr;
> > +     class = ioread8(&hdr->class);
> > +     cmd = ioread8(&hdr->cmd);
> > +     switch (ioread8(&hdr->class)) {
> > +     case VIRTIO_NET_CTRL_ANNOUNCE:
> > +             if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> > +                     pr_err("Found invalid command: announce: %d\n", cmd);
> > +                     break;
> > +             }
> > +             epf_vnet_rc_clear_config16(
> > +                     vnet,
> > +                     VIRTIO_PCI_CONFIG_OFF(false) +
> > +                             offsetof(struct virtio_net_config, status),
> > +                     VIRTIO_NET_S_ANNOUNCE);
> > +             epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
> > +                                        VIRTIO_PCI_ISR_CONFIG);
> > +
> > +             iowrite8(VIRTIO_NET_OK, wmem->addr);
> > +             break;
> > +     default:
> > +             pr_err("Found unsupported class in control queue: %d\n", class);
> > +             break;
> > +     }
> > +
> > +     epf_vnet_rc_epc_munmap(epf, rmem);
> > +     epf_vnet_rc_epc_munmap(epf, wmem);
> > +     vringh_complete(vrh, head, total_len);
> > +
> > +     return 1;
> > +
> > +err_epc_unmap_rmem:
> > +     epf_vnet_rc_epc_munmap(epf, rmem);
> > +err_abandon_descs:
> > +     vringh_abandon(vrh, head);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet =
> > +             container_of(work, struct epf_vnet, rc.ctl_work);
> > +
> > +     while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
> > +             ;
> > +}
> > +
> > +void epf_vnet_rc_notify(struct epf_vnet *vnet)
> > +{
> > +     queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> > +}
> > +
> > +void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_cleanup_bar(vnet);
> > +     destroy_workqueue(vnet->rc.tx_wq);
> > +     destroy_workqueue(vnet->rc.irq_wq);
> > +     destroy_workqueue(vnet->rc.ctl_wq);
> > +
> > +     kthread_stop(vnet->rc.device_setup_task);
> > +}
> > +
> > +int epf_vnet_rc_setup(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
> > +                                &epf_vnet_pci_header);
> > +     if (err)
> > +             return err;
> > +
> > +     err = epf_vnet_setup_bar(vnet);
> > +     if (err)
> > +             return err;
> > +
> > +     vnet->rc.tx_wq =
> > +             alloc_workqueue("pci-epf-vnet/tx-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.tx_wq) {
> > +             pr_debug(
> > +                     "Failed to allocate workqueue for rc -> ep transmission\n");
> > +             err = -ENOMEM;
> > +             goto err_cleanup_bar;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
> > +
> > +     vnet->rc.irq_wq =
> > +             alloc_workqueue("pci-epf-vnet/irq-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.irq_wq) {
> > +             pr_debug("Failed to allocate workqueue for irq\n");
> > +             err = -ENOMEM;
> > +             goto err_destory_tx_wq;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
> > +
> > +     vnet->rc.ctl_wq =
> > +             alloc_workqueue("pci-epf-vnet/ctl-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.ctl_wq) {
> > +             pr_err("Failed to allocate work queue for control queue processing\n");
> > +             err = -ENOMEM;
> > +             goto err_destory_irq_wq;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
> > +
> > +     err = epf_vnet_rc_spawn_device_setup_task(vnet);
> > +     if (err)
> > +             goto err_destory_ctl_wq;
> > +
> > +     return 0;
> > +
> > +err_destory_ctl_wq:
> > +     destroy_workqueue(vnet->rc.ctl_wq);
> > +err_destory_irq_wq:
> > +     destroy_workqueue(vnet->rc.irq_wq);
> > +err_destory_tx_wq:
> > +     destroy_workqueue(vnet->rc.tx_wq);
> > +err_cleanup_bar:
> > +     epf_vnet_cleanup_bar(vnet);
> > +
> > +     return err;
> > +}
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> > new file mode 100644
> > index 000000000000..e48ad8067796
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> > @@ -0,0 +1,387 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * PCI Endpoint function driver to implement a virtio-net device.
> > + */
> > +#include <linux/module.h>
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/vringh.h>
> > +#include <linux/dmaengine.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +static int virtio_queue_size = 0x100;
> > +module_param(virtio_queue_size, int, 0444);
> > +MODULE_PARM_DESC(virtio_queue_size, "Length of each virtqueue");
> > +
> > +int epf_vnet_get_vq_size(void)
> > +{
> > +     return virtio_queue_size;
> > +}
> > +
> > +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
> > +{
> > +     struct kvec *kvec;
> > +
> > +     kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
> > +     if (!kvec)
> > +             return -ENOMEM;
> > +
> > +     vringh_kiov_init(kiov, kvec, vq_size);
> > +
> > +     return 0;
> > +}
> > +
> > +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
> > +{
> > +     kfree(kiov->iov);
> > +}
> > +
> > +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
> > +{
> > +     vnet->init_complete |= from;
> > +
> > +     if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
> > +             return;
> > +
> > +     if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
> > +             return;
> > +
> > +     epf_vnet_ep_announce_linkup(vnet);
> > +     epf_vnet_rc_announce_linkup(vnet);
> > +}
> > +
> > +struct epf_dma_filter_param {
> > +     struct device *dev;
> > +     u32 dma_mask;
> > +};
> > +
> > +static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
> > +{
> > +     struct epf_dma_filter_param *fparam = param;
> > +     struct dma_slave_caps caps;
> > +
> > +     memset(&caps, 0, sizeof(caps));
> > +     dma_get_slave_caps(chan, &caps);
> > +
> > +     return chan->device->dev == fparam->dev &&
> > +            (fparam->dma_mask & caps.directions);
> > +}
> > +
> > +static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
> > +{
> > +     struct epf_dma_filter_param param;
> > +     dma_cap_mask_t mask;
> > +     int err;
> > +
> > +     dma_cap_zero(mask);
> > +     dma_cap_set(DMA_SLAVE, mask);
> > +
> > +     param.dev = dma_dev;
> > +     param.dma_mask = BIT(DMA_MEM_TO_DEV);
> > +     vnet->lr_dma_chan =
> > +             dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> > +     if (!vnet->lr_dma_chan)
> > +             return -EOPNOTSUPP;
> > +
> > +     param.dma_mask = BIT(DMA_DEV_TO_MEM);
> > +     vnet->rl_dma_chan =
> > +             dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> > +     if (!vnet->rl_dma_chan) {
> > +             err = -EOPNOTSUPP;
> > +             goto err_release_channel;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_release_channel:
> > +     dma_release_channel(vnet->lr_dma_chan);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
> > +{
> > +     dma_release_channel(vnet->lr_dma_chan);
> > +     dma_release_channel(vnet->rl_dma_chan);
> > +}
> > +
> > +static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
> > +                            dma_addr_t dma, size_t len,
> > +                            void (*callback)(void *), void *param,
> > +                            enum dma_transfer_direction dir)
> > +{
> > +     struct dma_async_tx_descriptor *desc;
> > +     int err;
> > +     struct dma_chan *chan;
> > +     struct dma_slave_config sconf;
> > +     dma_cookie_t cookie;
> > +     unsigned long flags = 0;
> > +
> > +     if (dir == DMA_MEM_TO_DEV) {
> > +             sconf.dst_addr = pci;
> > +             chan = vnet->lr_dma_chan;
> > +     } else {
> > +             sconf.src_addr = pci;
> > +             chan = vnet->rl_dma_chan;
> > +     }
> > +
> > +     err = dmaengine_slave_config(chan, &sconf);
> > +     if (unlikely(err))
> > +             return err;
> > +
> > +     if (callback)
> > +             flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
> > +
> > +     desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
> > +     if (unlikely(!desc))
> > +             return -EIO;
> > +
> > +     desc->callback = callback;
> > +     desc->callback_param = param;
> > +
> > +     cookie = dmaengine_submit(desc);
> > +     err = dma_submit_error(cookie);
> > +     if (unlikely(err))
> > +             return err;
> > +
> > +     dma_async_issue_pending(chan);
> > +
> > +     return 0;
> > +}
> > +
> > +struct epf_vnet_dma_callback_param {
> > +     struct epf_vnet *vnet;
> > +     struct vringh *tx_vrh, *rx_vrh;
> > +     struct virtqueue *vq;
> > +     size_t total_len;
> > +     u16 tx_head, rx_head;
> > +};
> > +
> > +static void epf_vnet_dma_callback(void *p)
> > +{
> > +     struct epf_vnet_dma_callback_param *param = p;
> > +     struct epf_vnet *vnet = param->vnet;
> > +
> > +     vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
> > +     vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
> > +
> > +     epf_vnet_rc_notify(vnet);
> > +     epf_vnet_ep_notify(vnet, param->vq);
> > +
> > +     kfree(param);
> > +}
> > +
> > +/**
> > + * epf_vnet_transfer() - transfer data from the tx vring to the rx vring using eDMA
> > + * @vnet: epf virtio-net device to do DMA
> > + * @tx_vrh: vringh related to the source tx vring
> > + * @rx_vrh: vringh related to the target rx vring
> > + * @tx_iov: buffer to use for tx
> > + * @rx_iov: buffer to use for rx
> > + * @dir: direction of the DMA, local to remote or remote to local
> > + *
> > + * This function returns 0, 1 or a negative error number. 0 indicates there
> > + * is no data to send. 1 indicates a DMA request was issued successfully.
> > + * Other values indicate errors; -ENOSPC in particular means there is no
> > + * buffer on the target vring, so the caller should retry later.
> > + */
> > +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> > +                   struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> > +                   struct vringh_kiov *rx_iov,
> > +                   enum dma_transfer_direction dir)
> > +{
> > +     int err;
> > +     u16 tx_head, rx_head;
> > +     size_t total_tx_len;
> > +     struct epf_vnet_dma_callback_param *cb_param;
> > +     struct vringh_kiov *liov, *riov;
> > +
> > +     err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
> > +     if (err <= 0)
> > +             return err;
> > +
> > +     total_tx_len = vringh_kiov_length(tx_iov);
> > +
> > +     err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
> > +     if (err < 0) {
> > +             goto err_tx_complete;
> > +     } else if (!err) {
> > +             /* There is no space on the destination vring to transmit
> > +              * data, so roll back the tx vringh.
> > +              */
> > +             vringh_abandon(tx_vrh, tx_head);
> > +             return -ENOSPC;
> > +     }
> > +
> > +     cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
> > +     if (!cb_param) {
> > +             err = -ENOMEM;
> > +             goto err_rx_complete;
> > +     }
> > +
> > +     cb_param->tx_vrh = tx_vrh;
> > +     cb_param->rx_vrh = rx_vrh;
> > +     cb_param->tx_head = tx_head;
> > +     cb_param->rx_head = rx_head;
> > +     cb_param->total_len = total_tx_len;
> > +     cb_param->vnet = vnet;
> > +
> > +     switch (dir) {
> > +     case DMA_MEM_TO_DEV:
> > +             liov = tx_iov;
> > +             riov = rx_iov;
> > +             cb_param->vq = vnet->ep.txvq;
> > +             break;
> > +     case DMA_DEV_TO_MEM:
> > +             liov = rx_iov;
> > +             riov = tx_iov;
> > +             cb_param->vq = vnet->ep.rxvq;
> > +             break;
> > +     default:
> > +             err = -EINVAL;
> > +             goto err_free_param;
> > +     }
> > +
> > +     for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
> > +             size_t len;
> > +             u64 lbase, rbase;
> > +             void (*callback)(void *) = NULL;
> > +
> > +             lbase = (u64)liov->iov[liov->i].iov_base;
> > +             rbase = (u64)riov->iov[riov->i].iov_base;
> > +             len = tx_iov->iov[tx_iov->i].iov_len;
> > +
> > +             if (tx_iov->i + 1 == tx_iov->used)
> > +                     callback = epf_vnet_dma_callback;
> > +
> > +             err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
> > +                                       cb_param, dir);
> > +             if (err)
> > +                     goto err_free_param;
> > +     }
> > +
> > +     return 1;
> > +
> > +err_free_param:
> > +     kfree(cb_param);
> > +err_rx_complete:
> > +     vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
> > +err_tx_complete:
> > +     vringh_complete(tx_vrh, tx_head, total_tx_len);
> > +
> > +     return err;
> > +}
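
As a side note for reviewers: the return convention documented above is meant
to be consumed like the sketch below (the function name example_pump_tx is
only illustrative, not part of the patch). Keep calling while it returns 1,
treat 0 as an empty tx vring, and retry on the next notify when it returns
-ENOSPC, which is what the tx handlers in this series do.

    static void example_pump_tx(struct epf_vnet *vnet)
    {
            int ret;

            /* Pump descriptors from the EP tx vring to the RC rx vring until
             * the tx vring is empty or an error occurs.
             */
            do {
                    ret = epf_vnet_transfer(vnet, &vnet->ep.txvrh,
                                            &vnet->rc.rxvrh->vrh,
                                            &vnet->ep.tx_iov, &vnet->rc.rx_iov,
                                            DMA_MEM_TO_DEV);
            } while (ret > 0);

            if (ret == -ENOSPC)
                    pr_debug("no buffer on the target vring, retry on next notify\n");
            else if (ret < 0)
                    pr_debug("transfer failed: %d\n", ret);
    }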
> > +
> > +static int epf_vnet_bind(struct pci_epf *epf)
> > +{
> > +     int err;
> > +     struct epf_vnet *vnet = epf_get_drvdata(epf);
> > +
> > +     err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
> > +     if (err)
> > +             return err;
> > +
> > +     err = epf_vnet_rc_setup(vnet);
> > +     if (err)
> > +             goto err_free_edma;
> > +
> > +     err = epf_vnet_ep_setup(vnet);
> > +     if (err)
> > +             goto err_cleanup_rc;
> > +
> > +     return 0;
> > +
> > +err_cleanup_rc:
> > +     epf_vnet_rc_cleanup(vnet);
> > +err_free_edma:
> > +     epf_vnet_deinit_edma(vnet);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_unbind(struct pci_epf *epf)
> > +{
> > +     struct epf_vnet *vnet = epf_get_drvdata(epf);
> > +
> > +     epf_vnet_deinit_edma(vnet);
> > +     epf_vnet_rc_cleanup(vnet);
> > +     epf_vnet_ep_cleanup(vnet);
> > +}
> > +
> > +static struct pci_epf_ops epf_vnet_ops = {
> > +     .bind = epf_vnet_bind,
> > +     .unbind = epf_vnet_unbind,
> > +};
> > +
> > +static const struct pci_epf_device_id epf_vnet_ids[] = {
> > +     { .name = "pci_epf_vnet" },
> > +     {}
> > +};
> > +
> > +static void epf_vnet_virtio_init(struct epf_vnet *vnet)
> > +{
> > +     vnet->virtio_features =
> > +             BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
> > +             /* The following features let the peer skip checksum and
> > +              * offload processing, like a transmission between virtual
> > +              * machines on the same system. Details are in section 5.1.5
> > +              * of the virtio specification.
> > +              */
> > +             BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
> > +             BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
> > +             BIT(VIRTIO_NET_F_GUEST_UFO) |
> > +             // The control queue is just used for linkup announcement.
> > +             BIT(VIRTIO_NET_F_CTRL_VQ);
> > +
> > +     vnet->vnet_cfg.max_virtqueue_pairs = 1;
> > +     vnet->vnet_cfg.status = 0;
> > +     vnet->vnet_cfg.mtu = PAGE_SIZE;
> > +}
> > +
> > +static int epf_vnet_probe(struct pci_epf *epf)
> > +{
> > +     struct epf_vnet *vnet;
> > +
> > +     vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
> > +     if (!vnet)
> > +             return -ENOMEM;
> > +
> > +     epf_set_drvdata(epf, vnet);
> > +     vnet->epf = epf;
> > +
> > +     epf_vnet_virtio_init(vnet);
> > +
> > +     return 0;
> > +}
> > +
> > +static struct pci_epf_driver epf_vnet_drv = {
> > +     .driver.name = "pci_epf_vnet",
> > +     .ops = &epf_vnet_ops,
> > +     .id_table = epf_vnet_ids,
> > +     .probe = epf_vnet_probe,
> > +     .owner = THIS_MODULE,
> > +};
> > +
> > +static int __init epf_vnet_init(void)
> > +{
> > +     int err;
> > +
> > +     err = pci_epf_register_driver(&epf_vnet_drv);
> > +     if (err) {
> > +             pr_err("Failed to register epf vnet driver\n");
> > +             return err;
> > +     }
> > +
> > +     return 0;
> > +}
> > +module_init(epf_vnet_init);
> > +
> > +static void epf_vnet_exit(void)
> > +{
> > +     pci_epf_unregister_driver(&epf_vnet_drv);
> > +}
> > +module_exit(epf_vnet_exit);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> > +MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> > new file mode 100644
> > index 000000000000..1e0f90c95578
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> > @@ -0,0 +1,62 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _PCI_EPF_VNET_H
> > +#define _PCI_EPF_VNET_H
> > +
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epf-virtio.h>
> > +#include <linux/virtio_net.h>
> > +#include <linux/dmaengine.h>
> > +#include <linux/virtio.h>
> > +
> > +struct epf_vnet {
> > +     //TODO Should this variable be placed here?
> > +     struct pci_epf *epf;
> > +     struct virtio_net_config vnet_cfg;
> > +     u64 virtio_features;
> > +
> > +     // dma channels for local to remote(lr) and remote to local(rl)
> > +     struct dma_chan *lr_dma_chan, *rl_dma_chan;
> > +
> > +     struct {
> > +             void __iomem *cfg_base;
> > +             struct task_struct *device_setup_task;
> > +             struct task_struct *notify_monitor_task;
> > +             struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
> > +             struct work_struct tx_work, raise_irq_work, ctl_work;
> > +             struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
> > +             struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> > +     } rc;
> > +
> > +     struct {
> > +             struct virtqueue *rxvq, *txvq, *ctlvq;
> > +             struct vringh txvrh, rxvrh, ctlvrh;
> > +             struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> > +             struct virtio_device vdev;
> > +             u16 net_config_status;
> > +     } ep;
> > +
> > +#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
> > +#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
> > +     u8 init_complete;
> > +};
> > +
> > +int epf_vnet_rc_setup(struct epf_vnet *vnet);
> > +void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
> > +int epf_vnet_ep_setup(struct epf_vnet *vnet);
> > +void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
> > +
> > +int epf_vnet_get_vq_size(void);
> > +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
> > +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
> > +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> > +                   struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> > +                   struct vringh_kiov *rx_iov,
> > +                   enum dma_transfer_direction dir);
> > +void epf_vnet_rc_notify(struct epf_vnet *vnet);
> > +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
> > +
> > +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
> > +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
> > +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
> > +
> > +#endif // _PCI_EPF_VNET_H
> > --
> > 2.25.1
>
Best,
Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
@ 2023-02-07 10:47       ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

2023年2月3日(金) 19:22 Michael S. Tsirkin <mst@redhat.com>:
>
> On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> > Add a new endpoint (EP) function driver to provide a virtio-net device. This
> > function not only shows a virtio-net device to the PCIe host system, but also
> > provides a virtio-net device to the EP-side (local) system. Virtually, those
> > network devices are connected, so they can be used to communicate over IP
> > like a simple NIC.
> >
> > Architecture overview is following:
> >
> > to Host       |                       to Endpoint
> > network stack |                 network stack
> >       |       |                       |
> > +-----------+ |       +-----------+   +-----------+
> > |virtio-net | |       |virtio-net |   |virtio-net |
> > |driver     | |       |EP function|---|driver     |
> > +-----------+ |       +-----------+   +-----------+
> >       |       |             |
> > +-----------+ | +-----------+
> > |PCIeC      | | |PCIeC      |
> > |Rootcomplex|-|-|Endpoint   |
> > +-----------+ | +-----------+
> >   Host side   |          Endpoint side
> >
> > This driver uses the PCIe EP framework to show a virtio-net (pci) device to the
> > Host side, and generates a virtual virtio-net device and registers it to the EP side.
> > A communication date
>
> data?
>
> > is diractly
>
> directly?
Sorry, I have to revise this comment.
> > transported between virtqueue level
> > with each other using PCIe embedded DMA controller.
> >
> > Due to a limitation of the hardware and the Linux EP framework, this function
> > follows the legacy virtio specification.
>
> what exactly is the limitation and why does it force legacy?
A modern virtio PCI device has to provide the virtio PCI capabilities.
The DesignWare PCIe controller, which is used on several boards, has no
functionality to expose custom PCI capabilities, at least, and the Linux
PCI EP framework doesn't support adding them either.

Those explanations should be in the cover letter. I'll add them there.
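
To make that concrete, here is a rough sketch (not part of the patch) of why
the legacy layout avoids the capability problem: with legacy virtio-pci,
everything the driver touches lives at fixed offsets inside BAR0, using only
the definitions from include/uapi/linux/virtio_pci.h that this series already
relies on, so the endpoint controller never has to expose an extra PCI
capability. The helper name below is just illustrative.

    #include <linux/virtio_pci.h>
    #include <linux/virtio_net.h>

    /* Size of the legacy config BAR: the fixed legacy register block
     * (features, queue pfn/sel/num/notify, status, ISR), followed by the
     * device-specific virtio_net_config. No MSI-X, hence the "false".
     */
    static size_t legacy_cfg_bar_size(void)
    {
            return VIRTIO_PCI_CONFIG_OFF(false) +
                   sizeof(struct virtio_net_config);
    }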
> > This function driver has been tested on the S4 R-Car (r8a779fa-spider) board,
> > but it just uses the PCIe EP framework and depends on the PCIe eDMA.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> >  drivers/pci/endpoint/functions/Makefile       |   1 +
> >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> >  6 files changed, 1440 insertions(+)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> >
> > diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
> > index 9fd560886871..f88d8baaf689 100644
> > --- a/drivers/pci/endpoint/functions/Kconfig
> > +++ b/drivers/pci/endpoint/functions/Kconfig
> > @@ -37,3 +37,15 @@ config PCI_EPF_VNTB
> >         between PCI Root Port and PCIe Endpoint.
> >
> >         If in doubt, say "N" to disable Endpoint NTB driver.
> > +
> > +config PCI_EPF_VNET
> > +     tristate "PCI Endpoint virtio-net driver"
> > +     depends on PCI_ENDPOINT
> > +     select PCI_ENDPOINT_VIRTIO
> > +     select VHOST_RING
> > +     select VHOST_IOMEM
> > +     help
> > +       PCIe Endpoint virtio-net function implementation. This module shows a
> > +       virtio-net PCI device to the PCIe host side and shows another
> > +       virtio-net device to the local machine. Those devices can communicate
> > +       with each other.
> > diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
> > index 5c13001deaba..74cc4c330c62 100644
> > --- a/drivers/pci/endpoint/functions/Makefile
> > +++ b/drivers/pci/endpoint/functions/Makefile
> > @@ -6,3 +6,4 @@
> >  obj-$(CONFIG_PCI_EPF_TEST)           += pci-epf-test.o
> >  obj-$(CONFIG_PCI_EPF_NTB)            += pci-epf-ntb.o
> >  obj-$(CONFIG_PCI_EPF_VNTB)           += pci-epf-vntb.o
> > +obj-$(CONFIG_PCI_EPF_VNET)           += pci-epf-vnet.o pci-epf-vnet-rc.o pci-epf-vnet-ep.o
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> > new file mode 100644
> > index 000000000000..93b7e00e8d06
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> > @@ -0,0 +1,343 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Functions for the endpoint (local) side using the EPF framework.
> > + */
> > +#include <linux/pci-epc.h>
> > +#include <linux/virtio_pci.h>
> > +#include <linux/virtio_net.h>
> > +#include <linux/virtio_ring.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +static inline struct epf_vnet *vdev_to_vnet(struct virtio_device *vdev)
> > +{
> > +     return container_of(vdev, struct epf_vnet, ep.vdev);
> > +}
> > +
> > +static void epf_vnet_ep_set_status(struct epf_vnet *vnet, u16 status)
> > +{
> > +     vnet->ep.net_config_status |= status;
> > +}
> > +
> > +static void epf_vnet_ep_clear_status(struct epf_vnet *vnet, u16 status)
> > +{
> > +     vnet->ep.net_config_status &= ~status;
> > +}
> > +
> > +static void epf_vnet_ep_raise_config_irq(struct epf_vnet *vnet)
> > +{
> > +     virtio_config_changed(&vnet->ep.vdev);
> > +}
> > +
> > +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_ep_set_status(vnet,
> > +                            VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> > +     epf_vnet_ep_raise_config_irq(vnet);
> > +}
> > +
> > +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq)
> > +{
> > +     vring_interrupt(0, vq);
> > +}
> > +
> > +static int epf_vnet_ep_process_ctrlq_entry(struct epf_vnet *vnet)
> > +{
> > +     struct vringh *vrh = &vnet->ep.ctlvrh;
> > +     struct vringh_kiov *wiov = &vnet->ep.ctl_riov;
> > +     struct vringh_kiov *riov = &vnet->ep.ctl_wiov;
> > +     struct virtio_net_ctrl_hdr *hdr;
> > +     virtio_net_ctrl_ack *ack;
> > +     int err;
> > +     u16 head;
> > +     size_t len;
> > +
> > +     err = vringh_getdesc(vrh, riov, wiov, &head);
> > +     if (err <= 0)
> > +             goto done;
> > +
> > +     len = vringh_kiov_length(riov);
> > +     if (len < sizeof(*hdr)) {
> > +             pr_debug("Command is too short: %ld\n", len);
> > +             err = -EIO;
> > +             goto done;
> > +     }
> > +
> > +     if (vringh_kiov_length(wiov) < sizeof(*ack)) {
> > +             pr_debug("Space for ack is not enough\n");
> > +             err = -EIO;
> > +             goto done;
> > +     }
> > +
> > +     hdr = phys_to_virt((unsigned long)riov->iov[riov->i].iov_base);
> > +     ack = phys_to_virt((unsigned long)wiov->iov[wiov->i].iov_base);
> > +
> > +     switch (hdr->class) {
> > +     case VIRTIO_NET_CTRL_ANNOUNCE:
> > +             if (hdr->cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> > +                     pr_debug("Invalid command: announce: %d\n", hdr->cmd);
> > +                     goto done;
> > +             }
> > +
> > +             epf_vnet_ep_clear_status(vnet, VIRTIO_NET_S_ANNOUNCE);
> > +             *ack = VIRTIO_NET_OK;
> > +             break;
> > +     default:
> > +             pr_debug("Found not supported class: %d\n", hdr->class);
> > +             err = -EIO;
> > +     }
> > +
> > +done:
> > +     vringh_complete(vrh, head, len);
> > +     return err;
> > +}
> > +
> > +static u64 epf_vnet_ep_vdev_get_features(struct virtio_device *vdev)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     return vnet->virtio_features;
> > +}
> > +
> > +static int epf_vnet_ep_vdev_finalize_features(struct virtio_device *vdev)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     if (vdev->features != vnet->virtio_features)
> > +             return -EINVAL;
> > +
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_get_config(struct virtio_device *vdev,
> > +                                     unsigned int offset, void *buf,
> > +                                     unsigned int len)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +     const unsigned int mac_len = sizeof(vnet->vnet_cfg.mac);
> > +     const unsigned int status_len = sizeof(vnet->vnet_cfg.status);
> > +     unsigned int copy_len;
> > +
> > +     switch (offset) {
> > +     case offsetof(struct virtio_net_config, mac):
> > +             /* This PCIe EP function doesn't provide a VIRTIO_NET_F_MAC feature, so just
> > +              * clear the buffer.
> > +              */
> > +             copy_len = len >= mac_len ? mac_len : len;
> > +             memset(buf, 0x00, copy_len);
> > +             len -= copy_len;
> > +             buf += copy_len;
> > +             fallthrough;
> > +     case offsetof(struct virtio_net_config, status):
> > +             copy_len = len >= status_len ? status_len : len;
> > +             memcpy(buf, &vnet->ep.net_config_status, copy_len);
> > +             len -= copy_len;
> > +             buf += copy_len;
> > +             fallthrough;
> > +     default:
> > +             if (offset > sizeof(vnet->vnet_cfg)) {
> > +                     memset(buf, 0x00, len);
> > +                     break;
> > +             }
> > +             memcpy(buf, (void *)&vnet->vnet_cfg + offset, len);
> > +     }
> > +}
> > +
> > +static void epf_vnet_ep_vdev_set_config(struct virtio_device *vdev,
> > +                                     unsigned int offset, const void *buf,
> > +                                     unsigned int len)
> > +{
> > +     /* Do nothing, because all of virtio net config space is readonly. */
> > +}
> > +
> > +static u8 epf_vnet_ep_vdev_get_status(struct virtio_device *vdev)
> > +{
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_set_status(struct virtio_device *vdev, u8 status)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     if (status & VIRTIO_CONFIG_S_DRIVER_OK)
> > +             epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_EP);
> > +}
> > +
> > +static void epf_vnet_ep_vdev_reset(struct virtio_device *vdev)
> > +{
> > +     pr_debug("doesn't support yet");
> > +}
> > +
> > +static bool epf_vnet_ep_vdev_vq_notify(struct virtqueue *vq)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vq->vdev);
> > +     struct vringh *tx_vrh = &vnet->ep.txvrh;
> > +     struct vringh *rx_vrh = &vnet->rc.rxvrh->vrh;
> > +     struct vringh_kiov *tx_iov = &vnet->ep.tx_iov;
> > +     struct vringh_kiov *rx_iov = &vnet->rc.rx_iov;
> > +     int err;
> > +
> > +     /* Support only one queue pair */
> > +     switch (vq->index) {
> > +     case 0: // rx queue
> > +             break;
> > +     case 1: // tx queue
> > +             while ((err = epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov,
> > +                                             rx_iov, DMA_MEM_TO_DEV)) > 0)
> > +                     ;
> > +             if (err < 0)
> > +                     pr_debug("Failed to transmit: EP -> Host: %d\n", err);
> > +             break;
> > +     case 2: // control queue
> > +             epf_vnet_ep_process_ctrlq_entry(vnet);
> > +             break;
> > +     default:
> > +             return false;
> > +     }
> > +
> > +     return true;
> > +}
> > +
> > +static int epf_vnet_ep_vdev_find_vqs(struct virtio_device *vdev,
> > +                                  unsigned int nvqs, struct virtqueue *vqs[],
> > +                                  vq_callback_t *callback[],
> > +                                  const char *const names[], const bool *ctx,
> > +                                  struct irq_affinity *desc)
> > +{
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +     const size_t vq_size = epf_vnet_get_vq_size();
> > +     int i;
> > +     int err;
> > +     int qidx;
> > +
> > +     for (qidx = 0, i = 0; i < nvqs; i++) {
> > +             struct virtqueue *vq;
> > +             struct vring *vring;
> > +             struct vringh *vrh;
> > +
> > +             if (!names[i]) {
> > +                     vqs[i] = NULL;
> > +                     continue;
> > +             }
> > +
> > +             vq = vring_create_virtqueue(qidx++, vq_size,
> > +                                         VIRTIO_PCI_VRING_ALIGN, vdev, true,
> > +                                         false, ctx ? ctx[i] : false,
> > +                                         epf_vnet_ep_vdev_vq_notify,
> > +                                         callback[i], names[i]);
> > +             if (!vq) {
> > +                     err = -ENOMEM;
> > +                     goto err_del_vqs;
> > +             }
> > +
> > +             vqs[i] = vq;
> > +             vring = virtqueue_get_vring(vq);
> > +
> > +             switch (i) {
> > +             case 0: // rx
> > +                     vrh = &vnet->ep.rxvrh;
> > +                     vnet->ep.rxvq = vq;
> > +                     break;
> > +             case 1: // tx
> > +                     vrh = &vnet->ep.txvrh;
> > +                     vnet->ep.txvq = vq;
> > +                     break;
> > +             case 2: // control
> > +                     vrh = &vnet->ep.ctlvrh;
> > +                     vnet->ep.ctlvq = vq;
> > +                     break;
> > +             default:
> > +                     err = -EIO;
> > +                     goto err_del_vqs;
> > +             }
> > +
> > +             err = vringh_init_kern(vrh, vnet->virtio_features, vq_size,
> > +                                    true, GFP_KERNEL, vring->desc,
> > +                                    vring->avail, vring->used);
> > +             if (err) {
> > +                     pr_err("failed to init vringh for vring %d\n", i);
> > +                     goto err_del_vqs;
> > +             }
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->ep.tx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.rx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.ctl_riov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +     err = epf_vnet_init_kiov(&vnet->ep.ctl_wiov, vq_size);
> > +     if (err)
> > +             goto err_free_kiov;
> > +
> > +     return 0;
> > +
> > +err_free_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> > +
> > +err_del_vqs:
> > +     for (; i >= 0; i--) {
> > +             if (!names[i])
> > +                     continue;
> > +
> > +             if (!vqs[i])
> > +                     continue;
> > +
> > +             vring_del_virtqueue(vqs[i]);
> > +     }
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_ep_vdev_del_vqs(struct virtio_device *vdev)
> > +{
> > +     struct virtqueue *vq, *n;
> > +     struct epf_vnet *vnet = vdev_to_vnet(vdev);
> > +
> > +     list_for_each_entry_safe(vq, n, &vdev->vqs, list)
> > +             vring_del_virtqueue(vq);
> > +
> > +     epf_vnet_deinit_kiov(&vnet->ep.tx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.rx_iov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_riov);
> > +     epf_vnet_deinit_kiov(&vnet->ep.ctl_wiov);
> > +}
> > +
> > +static const struct virtio_config_ops epf_vnet_ep_vdev_config_ops = {
> > +     .get_features = epf_vnet_ep_vdev_get_features,
> > +     .finalize_features = epf_vnet_ep_vdev_finalize_features,
> > +     .get = epf_vnet_ep_vdev_get_config,
> > +     .set = epf_vnet_ep_vdev_set_config,
> > +     .get_status = epf_vnet_ep_vdev_get_status,
> > +     .set_status = epf_vnet_ep_vdev_set_status,
> > +     .reset = epf_vnet_ep_vdev_reset,
> > +     .find_vqs = epf_vnet_ep_vdev_find_vqs,
> > +     .del_vqs = epf_vnet_ep_vdev_del_vqs,
> > +};
> > +
> > +void epf_vnet_ep_cleanup(struct epf_vnet *vnet)
> > +{
> > +     unregister_virtio_device(&vnet->ep.vdev);
> > +}
> > +
> > +int epf_vnet_ep_setup(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     struct virtio_device *vdev = &vnet->ep.vdev;
> > +
> > +     vdev->dev.parent = vnet->epf->epc->dev.parent;
> > +     vdev->config = &epf_vnet_ep_vdev_config_ops;
> > +     vdev->id.vendor = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > +     vdev->id.device = VIRTIO_ID_NET;
> > +
> > +     err = register_virtio_device(vdev);
> > +     if (err)
> > +             return err;
> > +
> > +     return 0;
> > +}
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> > new file mode 100644
> > index 000000000000..2ca0245a9134
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> > @@ -0,0 +1,635 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Functions for the PCIe host (remote) side using the EPF framework.
> > + */
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/pci_ids.h>
> > +#include <linux/sched.h>
> > +#include <linux/virtio_pci.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +#define VIRTIO_NET_LEGACY_CFG_BAR BAR_0
> > +
> > +/* Returns the number of virtqueues, which is also one past the largest
> > + * valid queue index.
> > + */
> > +static inline u16 epf_vnet_rc_get_number_of_queues(struct epf_vnet *vnet)
> > +{
> > +     /* number of queue pairs and control queue */
> > +     return vnet->vnet_cfg.max_virtqueue_pairs * 2 + 1;
> > +}
> > +
> > +static void epf_vnet_rc_memcpy_config(struct epf_vnet *vnet, size_t offset,
> > +                                   void *buf, size_t len)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     memcpy_toio(base, buf, len);
> > +}
> > +
> > +static void epf_vnet_rc_set_config8(struct epf_vnet *vnet, size_t offset,
> > +                                 u8 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite8(ioread8(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_set_config16(struct epf_vnet *vnet, size_t offset,
> > +                                  u16 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite16(ioread16(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_clear_config16(struct epf_vnet *vnet, size_t offset,
> > +                                    u16 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite16(ioread16(base) & ~config, base);
> > +}
> > +
> > +static void epf_vnet_rc_set_config32(struct epf_vnet *vnet, size_t offset,
> > +                                  u32 config)
> > +{
> > +     void __iomem *base = vnet->rc.cfg_base + offset;
> > +
> > +     iowrite32(ioread32(base) | config, base);
> > +}
> > +
> > +static void epf_vnet_rc_raise_config_irq(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_CONFIG);
> > +     queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> > +}
> > +
> > +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_rc_set_config16(vnet,
> > +                              VIRTIO_PCI_CONFIG_OFF(false) +
> > +                                      offsetof(struct virtio_net_config,
> > +                                               status),
> > +                              VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE);
> > +     epf_vnet_rc_raise_config_irq(vnet);
> > +}
> > +
> > +/*
> > + * For the PCIe host, this driver shows a legacy virtio-net device. The
> > + * virtio structure PCI capabilities are mandatory for a modern virtio
> > + * device, but there is no PCIe EP hardware that can be configured with
> > + * arbitrary PCI capabilities, and the Linux PCIe EP framework doesn't
> > + * support them either.
> > + */
> > +static struct pci_epf_header epf_vnet_pci_header = {
> > +     .vendorid = PCI_VENDOR_ID_REDHAT_QUMRANET,
> > +     .deviceid = VIRTIO_TRANS_ID_NET,
> > +     .subsys_vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
> > +     .subsys_id = VIRTIO_ID_NET,
> > +     .revid = 0,
> > +     .baseclass_code = PCI_BASE_CLASS_NETWORK,
> > +     .interrupt_pin = PCI_INTERRUPT_PIN,
> > +};
> > +
> > +static void epf_vnet_rc_setup_configs(struct epf_vnet *vnet,
> > +                                   void __iomem *cfg_base)
> > +{
> > +     u16 default_qindex = epf_vnet_rc_get_number_of_queues(vnet);
> > +
> > +     epf_vnet_rc_set_config32(vnet, VIRTIO_PCI_HOST_FEATURES,
> > +                              vnet->virtio_features);
> > +
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR, VIRTIO_PCI_ISR_QUEUE);
> > +     /*
> > +      * Initialize the queue notify and selector registers to a value
> > +      * outside the valid virtqueue index range. This is used to detect
> > +      * changes by polling; there is no other way to detect the host side
> > +      * driver updating those values.
> > +      */
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY, default_qindex);
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL, default_qindex);
> > +     /* This pfn is also set to 0 for the polling as well */
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> > +
> > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NUM,
> > +                              epf_vnet_get_vq_size());
> > +     epf_vnet_rc_set_config8(vnet, VIRTIO_PCI_STATUS, 0);
> > +     epf_vnet_rc_memcpy_config(vnet, VIRTIO_PCI_CONFIG_OFF(false),
> > +                               &vnet->vnet_cfg, sizeof(vnet->vnet_cfg));
> > +}
> > +
> > +static void epf_vnet_cleanup_bar(struct epf_vnet *vnet)
> > +{
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     pci_epc_clear_bar(epf->epc, epf->func_no, epf->vfunc_no,
> > +                       &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR]);
> > +     pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> > +                        PRIMARY_INTERFACE);
> > +}
> > +
> > +static int epf_vnet_setup_bar(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     size_t cfg_bar_size =
> > +             VIRTIO_PCI_CONFIG_OFF(false) + sizeof(struct virtio_net_config);
> > +     struct pci_epf *epf = vnet->epf;
> > +     const struct pci_epc_features *features;
> > +     struct pci_epf_bar *config_bar = &epf->bar[VIRTIO_NET_LEGACY_CFG_BAR];
> > +
> > +     features = pci_epc_get_features(epf->epc, epf->func_no, epf->vfunc_no);
> > +     if (!features) {
> > +             pr_debug("Failed to get PCI EPC features\n");
> > +             return -EOPNOTSUPP;
> > +     }
> > +
> > +     if (features->reserved_bar & BIT(VIRTIO_NET_LEGACY_CFG_BAR)) {
> > +             pr_debug("Cannot use the PCI BAR for legacy virtio pci\n");
> > +             return -EOPNOTSUPP;
> > +     }
> > +
> > +     if (features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> > +             if (cfg_bar_size >
> > +                 features->bar_fixed_size[VIRTIO_NET_LEGACY_CFG_BAR]) {
> > +                     pr_debug("PCI BAR size is not enough\n");
> > +                     return -ENOMEM;
> > +             }
> > +     }
> > +
> > +     config_bar->flags |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> > +
> > +     vnet->rc.cfg_base = pci_epf_alloc_space(epf, cfg_bar_size,
> > +                                             VIRTIO_NET_LEGACY_CFG_BAR,
> > +                                             features->align,
> > +                                             PRIMARY_INTERFACE);
> > +     if (!vnet->rc.cfg_base) {
> > +             pr_debug("Failed to allocate virtio-net config memory\n");
> > +             return -ENOMEM;
> > +     }
> > +
> > +     epf_vnet_rc_setup_configs(vnet, vnet->rc.cfg_base);
> > +
> > +     err = pci_epc_set_bar(epf->epc, epf->func_no, epf->vfunc_no,
> > +                           config_bar);
> > +     if (err) {
> > +             pr_debug("Failed to set PCI BAR");
> > +             goto err_free_space;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_free_space:
> > +     pci_epf_free_space(epf, vnet->rc.cfg_base, VIRTIO_NET_LEGACY_CFG_BAR,
> > +                        PRIMARY_INTERFACE);
> > +     return err;
> > +}
> > +
> > +static int epf_vnet_rc_negotiate_configs(struct epf_vnet *vnet, u32 *txpfn,
> > +                                      u32 *rxpfn, u32 *ctlpfn)
> > +{
> > +     const u16 nqueues = epf_vnet_rc_get_number_of_queues(vnet);
> > +     const u16 default_sel = nqueues;
> > +     u32 __iomem *queue_pfn = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_PFN;
> > +     u16 __iomem *queue_sel = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_SEL;
> > +     u8 __iomem *pci_status = vnet->rc.cfg_base + VIRTIO_PCI_STATUS;
> > +     u32 pfn;
> > +     u16 sel;
> > +     struct {
> > +             u32 pfn;
> > +             u16 sel;
> > +     } tmp[3] = {};
> > +     int tmp_index = 0;
> > +
> > +     *rxpfn = *txpfn = *ctlpfn = 0;
> > +
> > +     /* To avoid missing the pfn and selector values the host driver writes
> > +      * for each virtqueue, we need fast polling that saves every observed
> > +      * pair.
> > +      *
> > +      * This implementation assumes that the host driver writes the pfn only
> > +      * once for each queue.
> > +      */
> > +     while (tmp_index < nqueues) {
> > +             pfn = ioread32(queue_pfn);
> > +             if (pfn == 0)
> > +                     continue;
> > +
> > +             iowrite32(0, queue_pfn);
> > +
> > +             sel = ioread16(queue_sel);
> > +             if (sel == default_sel)
> > +                     continue;
> > +
> > +             tmp[tmp_index].pfn = pfn;
> > +             tmp[tmp_index].sel = sel;
> > +             tmp_index++;
> > +     }
> > +
> > +     while (!((ioread8(pci_status) & VIRTIO_CONFIG_S_DRIVER_OK)))
> > +             ;
> > +
> > +     for (int i = 0; i < nqueues; ++i) {
> > +             switch (tmp[i].sel) {
> > +             case 0:
> > +                     *rxpfn = tmp[i].pfn;
> > +                     break;
> > +             case 1:
> > +                     *txpfn = tmp[i].pfn;
> > +                     break;
> > +             case 2:
> > +                     *ctlpfn = tmp[i].pfn;
> > +                     break;
> > +             }
> > +     }
> > +
> > +     if (!*rxpfn || !*txpfn || !*ctlpfn)
> > +             return -EIO;
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_monitor_notify(void *data)
> > +{
> > +     struct epf_vnet *vnet = data;
> > +     u16 __iomem *queue_notify = vnet->rc.cfg_base + VIRTIO_PCI_QUEUE_NOTIFY;
> > +     const u16 notify_default = epf_vnet_rc_get_number_of_queues(vnet);
> > +
> > +     epf_vnet_init_complete(vnet, EPF_VNET_INIT_COMPLETE_RC);
> > +
> > +     /* Poll to detect a change of the queue_notify register. Sometimes this
> > +      * polling misses a change, so check every virtqueue each time.
> > +      */
> > +     while (true) {
> > +             while (ioread16(queue_notify) == notify_default)
> > +                     ;
> > +             iowrite16(notify_default, queue_notify);
> > +
> > +             queue_work(vnet->rc.tx_wq, &vnet->rc.tx_work);
> > +             queue_work(vnet->rc.ctl_wq, &vnet->rc.ctl_work);
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_spawn_notify_monitor(struct epf_vnet *vnet)
> > +{
> > +     vnet->rc.notify_monitor_task =
> > +             kthread_create(epf_vnet_rc_monitor_notify, vnet,
> > +                            "pci-epf-vnet/cfg_negotiator");
> > +     if (IS_ERR(vnet->rc.notify_monitor_task))
> > +             return PTR_ERR(vnet->rc.notify_monitor_task);
> > +
> > +     /* Change the thread priority to high for polling. */
> > +     sched_set_fifo(vnet->rc.notify_monitor_task);
> > +     wake_up_process(vnet->rc.notify_monitor_task);
> > +
> > +     return 0;
> > +}
> > +
> > +static int epf_vnet_rc_device_setup(void *data)
> > +{
> > +     struct epf_vnet *vnet = data;
> > +     struct pci_epf *epf = vnet->epf;
> > +     u32 txpfn, rxpfn, ctlpfn;
> > +     const size_t vq_size = epf_vnet_get_vq_size();
> > +     int err;
> > +
> > +     err = epf_vnet_rc_negotiate_configs(vnet, &txpfn, &rxpfn, &ctlpfn);
> > +     if (err) {
> > +             pr_debug("Failed to negatiate configs with driver\n");
> > +             return err;
> > +     }
> > +
> > +     /* The polling phase is finished. This thread goes back to normal priority. */
> > +     sched_set_normal(vnet->rc.device_setup_task, 19);
> > +
> > +     vnet->rc.txvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> > +                                                  txpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.txvrh)) {
> > +             pr_debug("Failed to setup virtqueue for tx\n");
> > +             return PTR_ERR(vnet->rc.txvrh);
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.tx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_tx_vringh;
> > +
> > +     vnet->rc.rxvrh = pci_epf_virtio_alloc_vringh(epf, vnet->virtio_features,
> > +                                                  rxpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.rxvrh)) {
> > +             pr_debug("Failed to setup virtqueue for rx\n");
> > +             err = PTR_ERR(vnet->rc.rxvrh);
> > +             goto err_deinit_tx_kiov;
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.rx_iov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_rx_vringh;
> > +
> > +     vnet->rc.ctlvrh = pci_epf_virtio_alloc_vringh(
> > +             epf, vnet->virtio_features, ctlpfn, vq_size);
> > +     if (IS_ERR(vnet->rc.ctlvrh)) {
> > +             pr_err("failed to setup virtqueue\n");
> > +             err = PTR_ERR(vnet->rc.ctlvrh);
> > +             goto err_deinit_rx_kiov;
> > +     }
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.ctl_riov, vq_size);
> > +     if (err)
> > +             goto err_free_epf_ctl_vringh;
> > +
> > +     err = epf_vnet_init_kiov(&vnet->rc.ctl_wiov, vq_size);
> > +     if (err)
> > +             goto err_deinit_ctl_riov;
> > +
> > +     err = epf_vnet_rc_spawn_notify_monitor(vnet);
> > +     if (err) {
> > +             pr_debug("Failed to create notify monitor thread\n");
> > +             goto err_deinit_ctl_wiov;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_deinit_ctl_wiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.ctl_wiov);
> > +err_deinit_ctl_riov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.ctl_riov);
> > +err_free_epf_ctl_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.ctlvrh);
> > +err_deinit_rx_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.rx_iov);
> > +err_free_epf_rx_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.rxvrh);
> > +err_deinit_tx_kiov:
> > +     epf_vnet_deinit_kiov(&vnet->rc.tx_iov);
> > +err_free_epf_tx_vringh:
> > +     pci_epf_virtio_free_vringh(epf, vnet->rc.txvrh);
> > +
> > +     return err;
> > +}
> > +
> > +static int epf_vnet_rc_spawn_device_setup_task(struct epf_vnet *vnet)
> > +{
> > +     vnet->rc.device_setup_task = kthread_create(
> > +             epf_vnet_rc_device_setup, vnet, "pci-epf-vnet/cfg_negotiator");
> > +     if (IS_ERR(vnet->rc.device_setup_task))
> > +             return PTR_ERR(vnet->rc.device_setup_task);
> > +
> > +     /* Change the thread priority to high for the polling. */
> > +     sched_set_fifo(vnet->rc.device_setup_task);
> > +     wake_up_process(vnet->rc.device_setup_task);
> > +
> > +     return 0;
> > +}
> > +
> > +static void epf_vnet_rc_tx_handler(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet = container_of(work, struct epf_vnet, rc.tx_work);
> > +     struct vringh *tx_vrh = &vnet->rc.txvrh->vrh;
> > +     struct vringh *rx_vrh = &vnet->ep.rxvrh;
> > +     struct vringh_kiov *tx_iov = &vnet->rc.tx_iov;
> > +     struct vringh_kiov *rx_iov = &vnet->ep.rx_iov;
> > +
> > +     while (epf_vnet_transfer(vnet, tx_vrh, rx_vrh, tx_iov, rx_iov,
> > +                              DMA_DEV_TO_MEM) > 0)
> > +             ;
> > +}
> > +
> > +static void epf_vnet_rc_raise_irq_handler(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet =
> > +             container_of(work, struct epf_vnet, rc.raise_irq_work);
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no,
> > +                       PCI_EPC_IRQ_LEGACY, 0);
> > +}
> > +
> > +struct epf_vnet_rc_meminfo {
> > +     void __iomem *addr, *virt;
> > +     phys_addr_t phys;
> > +     size_t len;
> > +};
> > +
> > +/* Util function to access PCIe host side memory from local CPU.  */
> > +static struct epf_vnet_rc_meminfo *
> > +epf_vnet_rc_epc_mmap(struct pci_epf *epf, phys_addr_t pci_addr, size_t len)
> > +{
> > +     int err;
> > +     phys_addr_t aaddr, phys_addr;
> > +     size_t asize, offset;
> > +     void __iomem *virt_addr;
> > +     struct epf_vnet_rc_meminfo *meminfo;
> > +
> > +     err = pci_epc_mem_align(epf->epc, pci_addr, len, &aaddr, &asize);
> > +     if (err) {
> > +             pr_debug("Failed to get EPC align: %d\n", err);
> > +             return NULL;
> > +     }
> > +
> > +     offset = pci_addr - aaddr;
> > +
> > +     virt_addr = pci_epc_mem_alloc_addr(epf->epc, &phys_addr, asize);
> > +     if (!virt_addr) {
> > +             pr_debug("Failed to allocate epc memory\n");
> > +             return NULL;
> > +     }
> > +
> > +     err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr,
> > +                            aaddr, asize);
> > +     if (err) {
> > +             pr_debug("Failed to map epc memory\n");
> > +             goto err_epc_free_addr;
> > +     }
> > +
> > +     meminfo = kmalloc(sizeof(*meminfo), GFP_KERNEL);
> > +     if (!meminfo)
> > +             goto err_epc_unmap_addr;
> > +
> > +     meminfo->virt = virt_addr;
> > +     meminfo->phys = phys_addr;
> > +     meminfo->len = len;
> > +     meminfo->addr = virt_addr + offset;
> > +
> > +     return meminfo;
> > +
> > +err_epc_unmap_addr:
> > +     /* meminfo may not be allocated here; use the local variables. */
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, phys_addr);
> > +err_epc_free_addr:
> > +     pci_epc_mem_free_addr(epf->epc, phys_addr, virt_addr, asize);
> > +
> > +     return NULL;
> > +}
> > +
> > +static void epf_vnet_rc_epc_munmap(struct pci_epf *epf,
> > +                                struct epf_vnet_rc_meminfo *meminfo)
> > +{
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no,
> > +                        meminfo->phys);
> > +     pci_epc_mem_free_addr(epf->epc, meminfo->phys, meminfo->virt,
> > +                           meminfo->len);
> > +     kfree(meminfo);
> > +}
> > +
> > +static int epf_vnet_rc_process_ctrlq_entry(struct epf_vnet *vnet)
> > +{
> > +     struct vringh_kiov *riov = &vnet->rc.ctl_riov;
> > +     struct vringh_kiov *wiov = &vnet->rc.ctl_wiov;
> > +     struct vringh *vrh = &vnet->rc.ctlvrh->vrh;
> > +     struct pci_epf *epf = vnet->epf;
> > +     struct epf_vnet_rc_meminfo *rmem, *wmem;
> > +     struct virtio_net_ctrl_hdr *hdr;
> > +     int err;
> > +     u16 head;
> > +     size_t total_len;
> > +     u8 class, cmd;
> > +
> > +     err = vringh_getdesc(vrh, riov, wiov, &head);
> > +     if (err <= 0)
> > +             return err;
> > +
> > +     total_len = vringh_kiov_length(riov);
> > +
> > +     rmem = epf_vnet_rc_epc_mmap(epf, (u64)riov->iov[riov->i].iov_base,
> > +                                 riov->iov[riov->i].iov_len);
> > +     if (!rmem) {
> > +             err = -ENOMEM;
> > +             goto err_abandon_descs;
> > +     }
> > +
> > +     wmem = epf_vnet_rc_epc_mmap(epf, (u64)wiov->iov[wiov->i].iov_base,
> > +                                 wiov->iov[wiov->i].iov_len);
> > +     if (!wmem) {
> > +             err = -ENOMEM;
> > +             goto err_epc_unmap_rmem;
> > +     }
> > +
> > +     hdr = rmem->addr;
> > +     class = ioread8(&hdr->class);
> > +     cmd = ioread8(&hdr->cmd);
> > +     switch (ioread8(&hdr->class)) {
> > +     case VIRTIO_NET_CTRL_ANNOUNCE:
> > +             if (cmd != VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> > +                     pr_err("Found invalid command: announce: %d\n", cmd);
> > +                     break;
> > +             }
> > +             epf_vnet_rc_clear_config16(
> > +                     vnet,
> > +                     VIRTIO_PCI_CONFIG_OFF(false) +
> > +                             offsetof(struct virtio_net_config, status),
> > +                     VIRTIO_NET_S_ANNOUNCE);
> > +             epf_vnet_rc_clear_config16(vnet, VIRTIO_PCI_ISR,
> > +                                        VIRTIO_PCI_ISR_CONFIG);
> > +
> > +             iowrite8(VIRTIO_NET_OK, wmem->addr);
> > +             break;
> > +     default:
> > +             pr_err("Found unsupported class in control queue: %d\n", class);
> > +             break;
> > +     }
> > +
> > +     epf_vnet_rc_epc_munmap(epf, rmem);
> > +     epf_vnet_rc_epc_munmap(epf, wmem);
> > +     vringh_complete(vrh, head, total_len);
> > +
> > +     return 1;
> > +
> > +err_epc_unmap_rmem:
> > +     epf_vnet_rc_epc_munmap(epf, rmem);
> > +err_abandon_descs:
> > +     vringh_abandon(vrh, head);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_rc_process_ctrlq_entries(struct work_struct *work)
> > +{
> > +     struct epf_vnet *vnet =
> > +             container_of(work, struct epf_vnet, rc.ctl_work);
> > +
> > +     while (epf_vnet_rc_process_ctrlq_entry(vnet) > 0)
> > +             ;
> > +}
> > +
> > +void epf_vnet_rc_notify(struct epf_vnet *vnet)
> > +{
> > +     queue_work(vnet->rc.irq_wq, &vnet->rc.raise_irq_work);
> > +}
> > +
> > +void epf_vnet_rc_cleanup(struct epf_vnet *vnet)
> > +{
> > +     epf_vnet_cleanup_bar(vnet);
> > +     destroy_workqueue(vnet->rc.tx_wq);
> > +     destroy_workqueue(vnet->rc.irq_wq);
> > +     destroy_workqueue(vnet->rc.ctl_wq);
> > +
> > +     kthread_stop(vnet->rc.device_setup_task);
> > +}
> > +
> > +int epf_vnet_rc_setup(struct epf_vnet *vnet)
> > +{
> > +     int err;
> > +     struct pci_epf *epf = vnet->epf;
> > +
> > +     err = pci_epc_write_header(epf->epc, epf->func_no, epf->vfunc_no,
> > +                                &epf_vnet_pci_header);
> > +     if (err)
> > +             return err;
> > +
> > +     err = epf_vnet_setup_bar(vnet);
> > +     if (err)
> > +             return err;
> > +
> > +     vnet->rc.tx_wq =
> > +             alloc_workqueue("pci-epf-vnet/tx-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.tx_wq) {
> > +             pr_debug(
> > +                     "Failed to allocate workqueue for rc -> ep transmission\n");
> > +             err = -ENOMEM;
> > +             goto err_cleanup_bar;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.tx_work, epf_vnet_rc_tx_handler);
> > +
> > +     vnet->rc.irq_wq =
> > +             alloc_workqueue("pci-epf-vnet/irq-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.irq_wq) {
> > +             pr_debug("Failed to allocate workqueue for irq\n");
> > +             err = -ENOMEM;
> > +             goto err_destory_tx_wq;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.raise_irq_work, epf_vnet_rc_raise_irq_handler);
> > +
> > +     vnet->rc.ctl_wq =
> > +             alloc_workqueue("pci-epf-vnet/ctl-wq",
> > +                             WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
> > +     if (!vnet->rc.ctl_wq) {
> > +             pr_err("Failed to allocate work queue for control queue processing\n");
> > +             err = -ENOMEM;
> > +             goto err_destory_irq_wq;
> > +     }
> > +
> > +     INIT_WORK(&vnet->rc.ctl_work, epf_vnet_rc_process_ctrlq_entries);
> > +
> > +     err = epf_vnet_rc_spawn_device_setup_task(vnet);
> > +     if (err)
> > +             goto err_destory_ctl_wq;
> > +
> > +     return 0;
> > +
> > +err_destory_ctl_wq:
> > +     destroy_workqueue(vnet->rc.ctl_wq);
> > +err_destory_irq_wq:
> > +     destroy_workqueue(vnet->rc.irq_wq);
> > +err_destory_tx_wq:
> > +     destroy_workqueue(vnet->rc.tx_wq);
> > +err_cleanup_bar:
> > +     epf_vnet_cleanup_bar(vnet);
> > +
> > +     return err;
> > +}
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.c b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> > new file mode 100644
> > index 000000000000..e48ad8067796
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.c
> > @@ -0,0 +1,387 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > > + * PCI Endpoint function driver to implement a virtio-net device.
> > + */
> > +#include <linux/module.h>
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/vringh.h>
> > +#include <linux/dmaengine.h>
> > +
> > +#include "pci-epf-vnet.h"
> > +
> > +static int virtio_queue_size = 0x100;
> > +module_param(virtio_queue_size, int, 0444);
> > > +MODULE_PARM_DESC(virtio_queue_size, "Length of each virtqueue");
> > +
> > +int epf_vnet_get_vq_size(void)
> > +{
> > +     return virtio_queue_size;
> > +}
> > +
> > +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size)
> > +{
> > +     struct kvec *kvec;
> > +
> > +     kvec = kmalloc_array(vq_size, sizeof(*kvec), GFP_KERNEL);
> > +     if (!kvec)
> > +             return -ENOMEM;
> > +
> > +     vringh_kiov_init(kiov, kvec, vq_size);
> > +
> > +     return 0;
> > +}
> > +
> > +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov)
> > +{
> > +     kfree(kiov->iov);
> > +}
> > +
> > +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from)
> > +{
> > +     vnet->init_complete |= from;
> > +
> > +     if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_EP))
> > +             return;
> > +
> > +     if (!(vnet->init_complete & EPF_VNET_INIT_COMPLETE_RC))
> > +             return;
> > +
> > +     epf_vnet_ep_announce_linkup(vnet);
> > +     epf_vnet_rc_announce_linkup(vnet);
> > +}
> > +
> > +struct epf_dma_filter_param {
> > +     struct device *dev;
> > +     u32 dma_mask;
> > +};
> > +
> > +static bool epf_virtnet_dma_filter(struct dma_chan *chan, void *param)
> > +{
> > +     struct epf_dma_filter_param *fparam = param;
> > +     struct dma_slave_caps caps;
> > +
> > +     memset(&caps, 0, sizeof(caps));
> > +     dma_get_slave_caps(chan, &caps);
> > +
> > +     return chan->device->dev == fparam->dev &&
> > +            (fparam->dma_mask & caps.directions);
> > +}
> > +
> > +static int epf_vnet_init_edma(struct epf_vnet *vnet, struct device *dma_dev)
> > +{
> > +     struct epf_dma_filter_param param;
> > +     dma_cap_mask_t mask;
> > +     int err;
> > +
> > +     dma_cap_zero(mask);
> > +     dma_cap_set(DMA_SLAVE, mask);
> > +
> > +     param.dev = dma_dev;
> > +     param.dma_mask = BIT(DMA_MEM_TO_DEV);
> > +     vnet->lr_dma_chan =
> > +             dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> > +     if (!vnet->lr_dma_chan)
> > +             return -EOPNOTSUPP;
> > +
> > +     param.dma_mask = BIT(DMA_DEV_TO_MEM);
> > +     vnet->rl_dma_chan =
> > +             dma_request_channel(mask, epf_virtnet_dma_filter, &param);
> > +     if (!vnet->rl_dma_chan) {
> > +             err = -EOPNOTSUPP;
> > +             goto err_release_channel;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_release_channel:
> > +     dma_release_channel(vnet->lr_dma_chan);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_deinit_edma(struct epf_vnet *vnet)
> > +{
> > +     dma_release_channel(vnet->lr_dma_chan);
> > +     dma_release_channel(vnet->rl_dma_chan);
> > +}
> > +
> > +static int epf_vnet_dma_single(struct epf_vnet *vnet, phys_addr_t pci,
> > +                            dma_addr_t dma, size_t len,
> > +                            void (*callback)(void *), void *param,
> > +                            enum dma_transfer_direction dir)
> > +{
> > +     struct dma_async_tx_descriptor *desc;
> > +     int err;
> > +     struct dma_chan *chan;
> > +     struct dma_slave_config sconf;
> > +     dma_cookie_t cookie;
> > +     unsigned long flags = 0;
> > +
> > +     if (dir == DMA_MEM_TO_DEV) {
> > +             sconf.dst_addr = pci;
> > +             chan = vnet->lr_dma_chan;
> > +     } else {
> > +             sconf.src_addr = pci;
> > +             chan = vnet->rl_dma_chan;
> > +     }
> > +
> > +     err = dmaengine_slave_config(chan, &sconf);
> > +     if (unlikely(err))
> > +             return err;
> > +
> > +     if (callback)
> > +             flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
> > +
> > +     desc = dmaengine_prep_slave_single(chan, dma, len, dir, flags);
> > +     if (unlikely(!desc))
> > +             return -EIO;
> > +
> > +     desc->callback = callback;
> > +     desc->callback_param = param;
> > +
> > +     cookie = dmaengine_submit(desc);
> > +     err = dma_submit_error(cookie);
> > +     if (unlikely(err))
> > +             return err;
> > +
> > +     dma_async_issue_pending(chan);
> > +
> > +     return 0;
> > +}
> > +
> > +struct epf_vnet_dma_callback_param {
> > +     struct epf_vnet *vnet;
> > +     struct vringh *tx_vrh, *rx_vrh;
> > +     struct virtqueue *vq;
> > +     size_t total_len;
> > +     u16 tx_head, rx_head;
> > +};
> > +
> > +static void epf_vnet_dma_callback(void *p)
> > +{
> > +     struct epf_vnet_dma_callback_param *param = p;
> > +     struct epf_vnet *vnet = param->vnet;
> > +
> > +     vringh_complete(param->tx_vrh, param->tx_head, param->total_len);
> > +     vringh_complete(param->rx_vrh, param->rx_head, param->total_len);
> > +
> > +     epf_vnet_rc_notify(vnet);
> > +     epf_vnet_ep_notify(vnet, param->vq);
> > +
> > +     kfree(param);
> > +}
> > +
> > +/**
> > > + * epf_vnet_transfer() - transfer data from a tx vring to an rx vring using eDMA
> > + * @vnet: epf virtio net device to do dma
> > + * @tx_vrh: vringh related to source tx vring
> > + * @rx_vrh: vringh related to target rx vring
> > + * @tx_iov: buffer to use tx
> > + * @rx_iov: buffer to use rx
> > > + * @dir: DMA direction: local to remote, or remote to local
> > + *
> > > + * This function returns 0, 1, or a negative error number. 0 indicates there
> > > + * is no data to send. 1 indicates that a DMA request was issued successfully.
> > > + * Negative values indicate errors; -ENOSPC in particular means there is no
> > > + * buffer available on the target vring, so the caller should retry later.
> > + */
> > +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> > +                   struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> > +                   struct vringh_kiov *rx_iov,
> > +                   enum dma_transfer_direction dir)
> > +{
> > +     int err;
> > +     u16 tx_head, rx_head;
> > +     size_t total_tx_len;
> > +     struct epf_vnet_dma_callback_param *cb_param;
> > +     struct vringh_kiov *liov, *riov;
> > +
> > +     err = vringh_getdesc(tx_vrh, tx_iov, NULL, &tx_head);
> > +     if (err <= 0)
> > +             return err;
> > +
> > +     total_tx_len = vringh_kiov_length(tx_iov);
> > +
> > +     err = vringh_getdesc(rx_vrh, NULL, rx_iov, &rx_head);
> > +     if (err < 0) {
> > +             goto err_tx_complete;
> > +     } else if (!err) {
> > > +             /* There is no space on the destination vring to transmit data,
> > > +              * so roll back the tx vringh.
> > > +              */
> > +             vringh_abandon(tx_vrh, tx_head);
> > +             return -ENOSPC;
> > +     }
> > +
> > +     cb_param = kmalloc(sizeof(*cb_param), GFP_KERNEL);
> > +     if (!cb_param) {
> > +             err = -ENOMEM;
> > +             goto err_rx_complete;
> > +     }
> > +
> > +     cb_param->tx_vrh = tx_vrh;
> > +     cb_param->rx_vrh = rx_vrh;
> > +     cb_param->tx_head = tx_head;
> > +     cb_param->rx_head = rx_head;
> > +     cb_param->total_len = total_tx_len;
> > +     cb_param->vnet = vnet;
> > +
> > +     switch (dir) {
> > +     case DMA_MEM_TO_DEV:
> > +             liov = tx_iov;
> > +             riov = rx_iov;
> > +             cb_param->vq = vnet->ep.txvq;
> > +             break;
> > +     case DMA_DEV_TO_MEM:
> > +             liov = rx_iov;
> > +             riov = tx_iov;
> > +             cb_param->vq = vnet->ep.rxvq;
> > +             break;
> > +     default:
> > +             err = -EINVAL;
> > +             goto err_free_param;
> > +     }
> > +
> > +     for (; tx_iov->i < tx_iov->used; tx_iov->i++, rx_iov->i++) {
> > +             size_t len;
> > +             u64 lbase, rbase;
> > +             void (*callback)(void *) = NULL;
> > +
> > +             lbase = (u64)liov->iov[liov->i].iov_base;
> > +             rbase = (u64)riov->iov[riov->i].iov_base;
> > +             len = tx_iov->iov[tx_iov->i].iov_len;
> > +
> > +             if (tx_iov->i + 1 == tx_iov->used)
> > +                     callback = epf_vnet_dma_callback;
> > +
> > +             err = epf_vnet_dma_single(vnet, rbase, lbase, len, callback,
> > +                                       cb_param, dir);
> > +             if (err)
> > +                     goto err_free_param;
> > +     }
> > +
> > +     return 1;
> > +
> > +err_free_param:
> > +     kfree(cb_param);
> > +err_rx_complete:
> > +     vringh_complete(rx_vrh, rx_head, vringh_kiov_length(rx_iov));
> > +err_tx_complete:
> > +     vringh_complete(tx_vrh, tx_head, total_tx_len);
> > +
> > +     return err;
> > +}
> > +
> > +static int epf_vnet_bind(struct pci_epf *epf)
> > +{
> > +     int err;
> > +     struct epf_vnet *vnet = epf_get_drvdata(epf);
> > +
> > +     err = epf_vnet_init_edma(vnet, epf->epc->dev.parent);
> > +     if (err)
> > +             return err;
> > +
> > +     err = epf_vnet_rc_setup(vnet);
> > +     if (err)
> > +             goto err_free_edma;
> > +
> > +     err = epf_vnet_ep_setup(vnet);
> > +     if (err)
> > +             goto err_cleanup_rc;
> > +
> > +     return 0;
> > +
> > > +err_cleanup_rc:
> > > +     epf_vnet_rc_cleanup(vnet);
> > > +err_free_edma:
> > > +     epf_vnet_deinit_edma(vnet);
> > +
> > +     return err;
> > +}
> > +
> > +static void epf_vnet_unbind(struct pci_epf *epf)
> > +{
> > +     struct epf_vnet *vnet = epf_get_drvdata(epf);
> > +
> > +     epf_vnet_deinit_edma(vnet);
> > +     epf_vnet_rc_cleanup(vnet);
> > +     epf_vnet_ep_cleanup(vnet);
> > +}
> > +
> > +static struct pci_epf_ops epf_vnet_ops = {
> > +     .bind = epf_vnet_bind,
> > +     .unbind = epf_vnet_unbind,
> > +};
> > +
> > +static const struct pci_epf_device_id epf_vnet_ids[] = {
> > +     { .name = "pci_epf_vnet" },
> > +     {}
> > +};
> > +
> > +static void epf_vnet_virtio_init(struct epf_vnet *vnet)
> > +{
> > +     vnet->virtio_features =
> > +             BIT(VIRTIO_NET_F_MTU) | BIT(VIRTIO_NET_F_STATUS) |
> > > +             /* The following features are set to skip checksumming and offloads,
> > > +              * as for transmission between virtual machines on the same system.
> > > +              * Details are in section 5.1.5 of the virtio specification.
> > > +              */
> > +             BIT(VIRTIO_NET_F_GUEST_CSUM) | BIT(VIRTIO_NET_F_GUEST_TSO4) |
> > +             BIT(VIRTIO_NET_F_GUEST_TSO6) | BIT(VIRTIO_NET_F_GUEST_ECN) |
> > +             BIT(VIRTIO_NET_F_GUEST_UFO) |
> > +             // The control queue is just used for linkup announcement.
> > +             BIT(VIRTIO_NET_F_CTRL_VQ);
> > +
> > +     vnet->vnet_cfg.max_virtqueue_pairs = 1;
> > +     vnet->vnet_cfg.status = 0;
> > +     vnet->vnet_cfg.mtu = PAGE_SIZE;
> > +}
> > +
> > +static int epf_vnet_probe(struct pci_epf *epf)
> > +{
> > +     struct epf_vnet *vnet;
> > +
> > +     vnet = devm_kzalloc(&epf->dev, sizeof(*vnet), GFP_KERNEL);
> > +     if (!vnet)
> > +             return -ENOMEM;
> > +
> > +     epf_set_drvdata(epf, vnet);
> > +     vnet->epf = epf;
> > +
> > +     epf_vnet_virtio_init(vnet);
> > +
> > +     return 0;
> > +}
> > +
> > +static struct pci_epf_driver epf_vnet_drv = {
> > +     .driver.name = "pci_epf_vnet",
> > +     .ops = &epf_vnet_ops,
> > +     .id_table = epf_vnet_ids,
> > +     .probe = epf_vnet_probe,
> > +     .owner = THIS_MODULE,
> > +};
> > +
> > +static int __init epf_vnet_init(void)
> > +{
> > +     int err;
> > +
> > +     err = pci_epf_register_driver(&epf_vnet_drv);
> > +     if (err) {
> > +             pr_err("Failed to register epf vnet driver\n");
> > +             return err;
> > +     }
> > +
> > +     return 0;
> > +}
> > +module_init(epf_vnet_init);
> > +
> > +static void epf_vnet_exit(void)
> > +{
> > +     pci_epf_unregister_driver(&epf_vnet_drv);
> > +}
> > +module_exit(epf_vnet_exit);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> > +MODULE_DESCRIPTION("PCI endpoint function acts as virtio net device");
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vnet.h b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> > new file mode 100644
> > index 000000000000..1e0f90c95578
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vnet.h
> > @@ -0,0 +1,62 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _PCI_EPF_VNET_H
> > +#define _PCI_EPF_VNET_H
> > +
> > +#include <linux/pci-epf.h>
> > +#include <linux/pci-epf-virtio.h>
> > +#include <linux/virtio_net.h>
> > +#include <linux/dmaengine.h>
> > +#include <linux/virtio.h>
> > +
> > +struct epf_vnet {
> > +     //TODO Should this variable be placed here?
> > +     struct pci_epf *epf;
> > +     struct virtio_net_config vnet_cfg;
> > +     u64 virtio_features;
> > +
> > > +     // DMA channels for local to remote (lr) and remote to local (rl)
> > +     struct dma_chan *lr_dma_chan, *rl_dma_chan;
> > +
> > +     struct {
> > +             void __iomem *cfg_base;
> > +             struct task_struct *device_setup_task;
> > +             struct task_struct *notify_monitor_task;
> > +             struct workqueue_struct *tx_wq, *irq_wq, *ctl_wq;
> > +             struct work_struct tx_work, raise_irq_work, ctl_work;
> > +             struct pci_epf_vringh *txvrh, *rxvrh, *ctlvrh;
> > +             struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> > +     } rc;
> > +
> > +     struct {
> > +             struct virtqueue *rxvq, *txvq, *ctlvq;
> > +             struct vringh txvrh, rxvrh, ctlvrh;
> > +             struct vringh_kiov tx_iov, rx_iov, ctl_riov, ctl_wiov;
> > +             struct virtio_device vdev;
> > +             u16 net_config_status;
> > +     } ep;
> > +
> > +#define EPF_VNET_INIT_COMPLETE_EP BIT(0)
> > +#define EPF_VNET_INIT_COMPLETE_RC BIT(1)
> > +     u8 init_complete;
> > +};
> > +
> > +int epf_vnet_rc_setup(struct epf_vnet *vnet);
> > +void epf_vnet_rc_cleanup(struct epf_vnet *vnet);
> > +int epf_vnet_ep_setup(struct epf_vnet *vnet);
> > +void epf_vnet_ep_cleanup(struct epf_vnet *vnet);
> > +
> > +int epf_vnet_get_vq_size(void);
> > +int epf_vnet_init_kiov(struct vringh_kiov *kiov, const size_t vq_size);
> > +void epf_vnet_deinit_kiov(struct vringh_kiov *kiov);
> > +int epf_vnet_transfer(struct epf_vnet *vnet, struct vringh *tx_vrh,
> > +                   struct vringh *rx_vrh, struct vringh_kiov *tx_iov,
> > +                   struct vringh_kiov *rx_iov,
> > +                   enum dma_transfer_direction dir);
> > +void epf_vnet_rc_notify(struct epf_vnet *vnet);
> > +void epf_vnet_ep_notify(struct epf_vnet *vnet, struct virtqueue *vq);
> > +
> > +void epf_vnet_init_complete(struct epf_vnet *vnet, u8 from);
> > +void epf_vnet_ep_announce_linkup(struct epf_vnet *vnet);
> > +void epf_vnet_rc_announce_linkup(struct epf_vnet *vnet);
> > +
> > +#endif // _PCI_EPF_VNET_H
> > --
> > 2.25.1
>
Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-03 22:15     ` [EXT] " Frank Li
@ 2023-02-07 10:56         ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:56 UTC (permalink / raw)
  To: Frank Li
  Cc: Michael S. Tsirkin, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization

On Sat, Feb 4, 2023 at 7:15, Frank Li <frank.li@nxp.com> wrote:
>
> >
> > Caution: EXT Email
> >
> > On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> > > Add a new endpoint(EP) function driver to provide virtio-net device. This
> > > function not only shows virtio-net device for PCIe host system, but also
> > > provides virtio-net device to EP side(local) system. Virtualy those network
> > > devices are connected, so we can use to communicate over IP like a simple
> > > NIC.
> > >
> > > Architecture overview is following:
> > >
> > > to Host       |                       to Endpoint
> > > network stack |                 network stack
> > >       |       |                       |
> > > +-----------+ |       +-----------+   +-----------+
> > > |virtio-net | |       |virtio-net |   |virtio-net |
> > > |driver     | |       |EP function|---|driver     |
> > > +-----------+ |       +-----------+   +-----------+
> > >       |       |             |
> > > +-----------+ | +-----------+
> > > |PCIeC      | | |PCIeC      |
> > > |Rootcomplex|-|-|Endpoint   |
> > > +-----------+ | +-----------+
> > >   Host side   |          Endpoint side
> > >
> > > This driver uses PCIe EP framework to show virtio-net (pci) device Host
> > > side, and generate virtual virtio-net device and register to EP side.
> > > A communication date
> >
> > data?
> >
> > > is diractly
> >
> > directly?
> >
> > > transported between virtqueue level
> > > with each other using PCIe embedded DMA controller.
> > >
> > > by a limitation of the hardware and Linux EP framework, this function
> > > follows a virtio legacy specification.
> >
> > what exactly is the limitation and why does it force legacy?
> >
> > > This function driver has beed tested on S4 Rcar (r8a779fa-spider) board but
> > > just use the PCIe EP framework and depends on the PCIe EDMA.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > > ---
> > >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> > >  drivers/pci/endpoint/functions/Makefile       |   1 +
> > >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>
> It is actually not related to vnet, just virtio.
> I think pci-epf-virtio.c is better.
Yes, it has to be.
> > >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>
> It is an EPF driver, so "rc" is quite confusing.
> Maybe you can combine pci-epf-vnet-ep.c and pci-epf-vnet-rc.c into one file.
I agree. I'll try to combine them.
> > >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>
> This file sets up DMA transfers according to the virtio ring.
> How about pci-epf-virtio-dma.c?
I'll attempt to rearrange the code layout and the filenames.
> > > +
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR,
> > VIRTIO_PCI_ISR_QUEUE);
> > > +     /*
> > > +      * Initialize the queue notify and selector to outside of the appropriate
> > > +      * virtqueue index. It is used to detect change with polling. There is no
> > > +      * other ways to detect host side driver updateing those values
> > > +      */
>
> I am trying to use GIC-ITS or another MSI controller as a doorbell.
> https://lore.kernel.org/imx/20221125192729.1722913-1-Frank.Li@nxp.com/T/#u
>
> but it may need an update to the host side PCI virtio driver.
Thanks. Is it possible to use MSI-X as well? The virtio spec
indicates using legacy IRQ or MSI-X only.
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY,
> > default_qindex);
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL,
> > default_qindex);
> > > +     /* This pfn is also set to 0 for the polling as well */
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> > > +
> > --
> > > 2.25.1
>
Best,
Shunsuke.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
@ 2023-02-07 10:56         ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 10:56 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Michael S. Tsirkin, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas

On Sat, Feb 4, 2023 at 7:15, Frank Li <frank.li@nxp.com> wrote:
>
> >
> > Caution: EXT Email
> >
> > On Fri, Feb 03, 2023 at 07:04:18PM +0900, Shunsuke Mie wrote:
> > > Add a new endpoint(EP) function driver to provide virtio-net device. This
> > > function not only shows virtio-net device for PCIe host system, but also
> > > provides virtio-net device to EP side(local) system. Virtualy those network
> > > devices are connected, so we can use to communicate over IP like a simple
> > > NIC.
> > >
> > > Architecture overview is following:
> > >
> > > to Host       |                       to Endpoint
> > > network stack |                 network stack
> > >       |       |                       |
> > > +-----------+ |       +-----------+   +-----------+
> > > |virtio-net | |       |virtio-net |   |virtio-net |
> > > |driver     | |       |EP function|---|driver     |
> > > +-----------+ |       +-----------+   +-----------+
> > >       |       |             |
> > > +-----------+ | +-----------+
> > > |PCIeC      | | |PCIeC      |
> > > |Rootcomplex|-|-|Endpoint   |
> > > +-----------+ | +-----------+
> > >   Host side   |          Endpoint side
> > >
> > > This driver uses PCIe EP framework to show virtio-net (pci) device Host
> > > side, and generate virtual virtio-net device and register to EP side.
> > > A communication date
> >
> > data?
> >
> > > is diractly
> >
> > directly?
> >
> > > transported between virtqueue level
> > > with each other using PCIe embedded DMA controller.
> > >
> > > by a limitation of the hardware and Linux EP framework, this function
> > > follows a virtio legacy specification.
> >
> > what exactly is the limitation and why does it force legacy?
> >
> > > This function driver has beed tested on S4 Rcar (r8a779fa-spider) board but
> > > just use the PCIe EP framework and depends on the PCIe EDMA.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > > ---
> > >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> > >  drivers/pci/endpoint/functions/Makefile       |   1 +
> > >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>
> It is actually not related to vnet, just virtio.
> I think pci-epf-virtio.c is better.
Yes, it has to be.
> > >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635 ++++++++++++++++++
>
> It is an EPF driver, so "rc" is quite confusing.
> Maybe you can combine pci-epf-vnet-ep.c and pci-epf-vnet-rc.c into one file.
I agree. I'll try to combine them.
> > >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>
> This file sets up DMA transfers according to the virtio ring.
> How about pci-epf-virtio-dma.c?
I'll attempt to rearrange the code layout and the filenames.
> > > +
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_ISR,
> > VIRTIO_PCI_ISR_QUEUE);
> > > +     /*
> > > +      * Initialize the queue notify and selector to outside of the appropriate
> > > +      * virtqueue index. It is used to detect change with polling. There is no
> > > +      * other ways to detect host side driver updateing those values
> > > +      */
>
> I am trying to use GIC-ITS or another MSI controller as a doorbell.
> https://lore.kernel.org/imx/20221125192729.1722913-1-Frank.Li@nxp.com/T/#u
>
> but it may need an update to the host side PCI virtio driver.
Thanks. Is it possible to use MSI-X as well? The virtio spec
indicates using legacy IRQ or MSI-X only.
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_NOTIFY,
> > default_qindex);
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_SEL,
> > default_qindex);
> > > +     /* This pfn is also set to 0 for the polling as well */
> > > +     epf_vnet_rc_set_config16(vnet, VIRTIO_PCI_QUEUE_PFN, 0);
> > > +
> > --
> > > 2.25.1
>
Best,
Shunsuke.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 3/4] PCI: endpoint: Introduce virtio library for EP functions
  2023-02-03 10:20     ` Michael S. Tsirkin
@ 2023-02-07 11:05       ` Shunsuke Mie
  -1 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 11:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Frank Li, Jon Mason, Ren Zhijie, Takanari Hayama,
	linux-kernel, linux-pci, virtualization

On Fri, Feb 3, 2023 at 19:20, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 03, 2023 at 07:04:17PM +0900, Shunsuke Mie wrote:
> > Add a new library to access a virtio ring located on PCIe host memory. The
> > library generates struct pci_epf_vringh that is introduced in this patch.
> > The struct has a vringh member, so vringh APIs can be used to access the
> > virtio ring.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  drivers/pci/endpoint/Kconfig          |   7 ++
> >  drivers/pci/endpoint/Makefile         |   1 +
> >  drivers/pci/endpoint/pci-epf-virtio.c | 113 ++++++++++++++++++++++++++
> >  include/linux/pci-epf-virtio.h        |  25 ++++++
> >  4 files changed, 146 insertions(+)
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
> > index 17bbdc9bbde0..07276dcc43c8 100644
> > --- a/drivers/pci/endpoint/Kconfig
> > +++ b/drivers/pci/endpoint/Kconfig
> > @@ -28,6 +28,13 @@ config PCI_ENDPOINT_CONFIGFS
> >          configure the endpoint function and used to bind the
> >          function with a endpoint controller.
> >
> > +config PCI_ENDPOINT_VIRTIO
> > +     tristate
> > +     depends on PCI_ENDPOINT
> > +     select VHOST_IOMEM
> > +     help
> > +       TODO update this comment
> > +
> >  source "drivers/pci/endpoint/functions/Kconfig"
> >
> >  endmenu
> > diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
> > index 95b2fe47e3b0..95712f0a13d1 100644
> > --- a/drivers/pci/endpoint/Makefile
> > +++ b/drivers/pci/endpoint/Makefile
> > @@ -4,5 +4,6 @@
> >  #
> >
> >  obj-$(CONFIG_PCI_ENDPOINT_CONFIGFS)  += pci-ep-cfs.o
> > +obj-$(CONFIG_PCI_ENDPOINT_VIRTIO)    += pci-epf-virtio.o
> >  obj-$(CONFIG_PCI_ENDPOINT)           += pci-epc-core.o pci-epf-core.o\
> >                                          pci-epc-mem.o functions/
> > diff --git a/drivers/pci/endpoint/pci-epf-virtio.c b/drivers/pci/endpoint/pci-epf-virtio.c
> > new file mode 100644
> > index 000000000000..7134ca407a03
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/pci-epf-virtio.c
> > @@ -0,0 +1,113 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Virtio library for PCI Endpoint function
> > + */
> > +#include <linux/kernel.h>
> > +#include <linux/pci-epf-virtio.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/virtio_pci.h>
> > +
> > +static void __iomem *epf_virtio_map_vq(struct pci_epf *epf, u32 pfn,
> > +                                    size_t size, phys_addr_t *vq_phys)
> > +{
> > +     int err;
> > +     phys_addr_t vq_addr;
> > +     size_t vq_size;
> > +     void __iomem *vq_virt;
> > +
> > +     vq_addr = (phys_addr_t)pfn << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> > +
> > +     vq_size = vring_size(size, VIRTIO_PCI_VRING_ALIGN) + 100;
>
> 100?
It is a mistake and will be removed.
> Also ugh, this uses the legacy vring_size.
> Did not look closely but is all this limited to legacy virtio then?
> Pls make sure your code builds with #define VIRTIO_RING_NO_LEGACY.
Thanks for your suggestion, but this device works as a legacy device.
In this case, the NO_LEGACY macro is not applicable, I think. Is that right?
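For context, a rough sketch of the legacy handshake this relies on (illustrative
only; it mirrors what the host-side legacy virtio-pci driver and the EP mapping
code above do, and is not code from this series):

#include <linux/io.h>
#include <uapi/linux/virtio_pci.h>
#include <uapi/linux/virtio_ring.h>

/* Host (RC) side: the legacy interface publishes a split ring by writing
 * its page frame number to the QUEUE_PFN register.
 */
static void legacy_publish_ring(void __iomem *ioaddr, phys_addr_t ring_phys)
{
	iowrite32(ring_phys >> VIRTIO_PCI_QUEUE_ADDR_SHIFT,
		  ioaddr + VIRTIO_PCI_QUEUE_PFN);
}

/* EP side: the ring layout is rebuilt with vring_size()/vring_init(), the
 * legacy helpers that VIRTIO_RING_NO_LEGACY compiles out, which is why a
 * legacy-only device model cannot be built with that macro defined.
 */
static size_t legacy_ring_bytes(unsigned int num)
{
	return vring_size(num, VIRTIO_PCI_VRING_ALIGN);
}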
> > +
> > +     vq_virt = pci_epc_mem_alloc_addr(epf->epc, vq_phys, vq_size);
> > +     if (!vq_virt) {
> > +             pr_err("Failed to allocate epc memory\n");
> > +             return ERR_PTR(-ENOMEM);
> > +     }
> > +
> > +     err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, *vq_phys,
> > +                            vq_addr, vq_size);
> > +     if (err) {
> > +             pr_err("Failed to map virtqueue to local memory\n");
> > +             goto err_free;
> > +     }
> > +
> > +     return vq_virt;
> > +
> > +err_free:
> > +     pci_epc_mem_free_addr(epf->epc, *vq_phys, vq_virt, vq_size);
> > +
> > +     return ERR_PTR(err);
> > +}
> > +
> > +static void epf_virtio_unmap_vq(struct pci_epf *epf, void __iomem *vq_virt,
> > +                             phys_addr_t vq_phys, size_t size)
> > +{
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, vq_phys);
> > +     pci_epc_mem_free_addr(epf->epc, vq_phys, vq_virt,
> > +                           vring_size(size, VIRTIO_PCI_VRING_ALIGN));
> > +}
> > +
> > +/**
> > + * pci_epf_virtio_alloc_vringh() - allocate epf vringh from @pfn
> > + * @epf: the EPF device that communicates with the host virtio driver
> > + * @features: the virtio features of the device
> > + * @pfn: page frame number of the virtqueue located in host memory. It is
> > + *           passed during virtqueue negotiation.
> > + * @size: length of the virtqueue
> > + */
> > +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> > +                                                u64 features, u32 pfn,
> > +                                                size_t size)
> > +{
> > +     int err;
> > +     struct vring vring;
> > +     struct pci_epf_vringh *evrh;
> > +
> > +     evrh = kmalloc(sizeof(*evrh), GFP_KERNEL);
> > +     if (!evrh) {
> > +             err = -ENOMEM;
> > +             goto err_unmap_vq;
> > +     }
> > +
> > +     evrh->size = size;
> > +
> > +     evrh->virt = epf_virtio_map_vq(epf, pfn, size, &evrh->phys);
> > +     if (IS_ERR(evrh->virt))
> > +             return evrh->virt;
> > +
> > +     vring_init(&vring, size, evrh->virt, VIRTIO_PCI_VRING_ALIGN);
> > +
> > +     err = vringh_init_iomem(&evrh->vrh, features, size, false, GFP_KERNEL,
> > +                             vring.desc, vring.avail, vring.used);
> > +     if (err)
> > +             goto err_free_epf_vq;
> > +
> > +     return evrh;
> > +
> > +err_free_epf_vq:
> > +     kfree(evrh);
> > +
> > +err_unmap_vq:
> > +     epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> > +
> > +     return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_epf_virtio_alloc_vringh);
> > +
> > +/**
> > + * pci_epf_virtio_free_vringh() - release allocated epf vring
> > + * @epf: the EPF device that communicates with the host virtio driver
> > + * @evrh: epf vringh to free
> > + */
> > +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> > +                             struct pci_epf_vringh *evrh)
> > +{
> > +     epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> > +     kfree(evrh);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_epf_virtio_free_vringh);
> > +
> > +MODULE_DESCRIPTION("PCI EP Virtio Library");
> > +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> > +MODULE_LICENSE("GPL");
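
As a usage illustration of the helpers above (a sketch only; the polling loop
and the config-space accessor are assumptions based on the discussion elsewhere
in this thread, not code from this series):

#include <linux/io.h>
#include <linux/delay.h>
#include <linux/pci-epf-virtio.h>
#include <uapi/linux/virtio_pci.h>

/* Wait until the legacy host driver writes a non-zero PFN for the selected
 * queue, then map that ring and hand back &evrh->vrh for the vringh API.
 */
static struct pci_epf_vringh *example_wait_and_map_ring(struct pci_epf *epf,
							void __iomem *cfg_base,
							u64 features,
							size_t num)
{
	u32 pfn;

	while (!(pfn = ioread32(cfg_base + VIRTIO_PCI_QUEUE_PFN)))
		msleep(1);

	return pci_epf_virtio_alloc_vringh(epf, features, pfn, num);
}
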
> > diff --git a/include/linux/pci-epf-virtio.h b/include/linux/pci-epf-virtio.h
> > new file mode 100644
> > index 000000000000..ae09087919a9
> > --- /dev/null
> > +++ b/include/linux/pci-epf-virtio.h
> > @@ -0,0 +1,25 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * PCI Endpoint Function (EPF) for virtio definitions
> > + */
> > +#ifndef __LINUX_PCI_EPF_VIRTIO_H
> > +#define __LINUX_PCI_EPF_VIRTIO_H
> > +
> > +#include <linux/types.h>
> > +#include <linux/vringh.h>
> > +#include <linux/pci-epf.h>
> > +
> > +struct pci_epf_vringh {
> > +     struct vringh vrh;
> > +     void __iomem *virt;
> > +     phys_addr_t phys;
> > +     size_t size;
> > +};
> > +
> > +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> > +                                                u64 features, u32 pfn,
> > +                                                size_t size);
> > +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> > +                             struct pci_epf_vringh *evrh);
> > +
> > +#endif // __LINUX_PCI_EPF_VIRTIO_H
> > --
> > 2.25.1
>
Best,
Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 3/4] PCI: endpoint: Introduce virtio library for EP functions
@ 2023-02-07 11:05       ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-07 11:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Frank Li, linux-kernel, virtualization,
	Ren Zhijie, Jon Mason, Bjorn Helgaas

On Fri, Feb 3, 2023 at 19:20, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 03, 2023 at 07:04:17PM +0900, Shunsuke Mie wrote:
> > Add a new library to access a virtio ring located on PCIe host memory. The
> > library generates struct pci_epf_vringh that is introduced in this patch.
> > The struct has a vringh member, so vringh APIs can be used to access the
> > virtio ring.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > Signed-off-by: Takanari Hayama <taki@igel.co.jp>
> > ---
> >  drivers/pci/endpoint/Kconfig          |   7 ++
> >  drivers/pci/endpoint/Makefile         |   1 +
> >  drivers/pci/endpoint/pci-epf-virtio.c | 113 ++++++++++++++++++++++++++
> >  include/linux/pci-epf-virtio.h        |  25 ++++++
> >  4 files changed, 146 insertions(+)
> >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> >  create mode 100644 include/linux/pci-epf-virtio.h
> >
> > diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
> > index 17bbdc9bbde0..07276dcc43c8 100644
> > --- a/drivers/pci/endpoint/Kconfig
> > +++ b/drivers/pci/endpoint/Kconfig
> > @@ -28,6 +28,13 @@ config PCI_ENDPOINT_CONFIGFS
> >          configure the endpoint function and used to bind the
> >          function with a endpoint controller.
> >
> > +config PCI_ENDPOINT_VIRTIO
> > +     tristate
> > +     depends on PCI_ENDPOINT
> > +     select VHOST_IOMEM
> > +     help
> > +       TODO update this comment
> > +
> >  source "drivers/pci/endpoint/functions/Kconfig"
> >
> >  endmenu
> > diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
> > index 95b2fe47e3b0..95712f0a13d1 100644
> > --- a/drivers/pci/endpoint/Makefile
> > +++ b/drivers/pci/endpoint/Makefile
> > @@ -4,5 +4,6 @@
> >  #
> >
> >  obj-$(CONFIG_PCI_ENDPOINT_CONFIGFS)  += pci-ep-cfs.o
> > +obj-$(CONFIG_PCI_ENDPOINT_VIRTIO)    += pci-epf-virtio.o
> >  obj-$(CONFIG_PCI_ENDPOINT)           += pci-epc-core.o pci-epf-core.o\
> >                                          pci-epc-mem.o functions/
> > diff --git a/drivers/pci/endpoint/pci-epf-virtio.c b/drivers/pci/endpoint/pci-epf-virtio.c
> > new file mode 100644
> > index 000000000000..7134ca407a03
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/pci-epf-virtio.c
> > @@ -0,0 +1,113 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Virtio library for PCI Endpoint function
> > + */
> > +#include <linux/kernel.h>
> > +#include <linux/pci-epf-virtio.h>
> > +#include <linux/pci-epc.h>
> > +#include <linux/virtio_pci.h>
> > +
> > +static void __iomem *epf_virtio_map_vq(struct pci_epf *epf, u32 pfn,
> > +                                    size_t size, phys_addr_t *vq_phys)
> > +{
> > +     int err;
> > +     phys_addr_t vq_addr;
> > +     size_t vq_size;
> > +     void __iomem *vq_virt;
> > +
> > +     vq_addr = (phys_addr_t)pfn << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> > +
> > +     vq_size = vring_size(size, VIRTIO_PCI_VRING_ALIGN) + 100;
>
> 100?
It is a mistake and will be removed.
> Also ugh, this uses the legacy vring_size.
> Did not look closely but is all this limited to legacy virtio then?
> Pls make sure your code builds with #define VIRTIO_RING_NO_LEGACY.
Thanks for your suggestion, but this device works as a legacy device.
In this case, the NO_LEGACY macro is not applicable, I think. Is that right?
> > +
> > +     vq_virt = pci_epc_mem_alloc_addr(epf->epc, vq_phys, vq_size);
> > +     if (!vq_virt) {
> > +             pr_err("Failed to allocate epc memory\n");
> > +             return ERR_PTR(-ENOMEM);
> > +     }
> > +
> > +     err = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no, *vq_phys,
> > +                            vq_addr, vq_size);
> > +     if (err) {
> > +             pr_err("Failed to map virtqueue to local memory\n");
> > +             goto err_free;
> > +     }
> > +
> > +     return vq_virt;
> > +
> > +err_free:
> > +     pci_epc_mem_free_addr(epf->epc, *vq_phys, vq_virt, vq_size);
> > +
> > +     return ERR_PTR(err);
> > +}
> > +
> > +static void epf_virtio_unmap_vq(struct pci_epf *epf, void __iomem *vq_virt,
> > +                             phys_addr_t vq_phys, size_t size)
> > +{
> > +     pci_epc_unmap_addr(epf->epc, epf->func_no, epf->vfunc_no, vq_phys);
> > +     pci_epc_mem_free_addr(epf->epc, vq_phys, vq_virt,
> > +                           vring_size(size, VIRTIO_PCI_VRING_ALIGN));
> > +}
> > +
> > +/**
> > + * pci_epf_virtio_alloc_vringh() - allocate epf vringh from @pfn
> > + * @epf: the EPF device that communicates with the host virtio driver
> > + * @features: the virtio features of the device
> > + * @pfn: page frame number of the virtqueue located in host memory. It is
> > + *           passed during virtqueue negotiation.
> > + * @size: length of the virtqueue
> > + */
> > +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> > +                                                u64 features, u32 pfn,
> > +                                                size_t size)
> > +{
> > +     int err;
> > +     struct vring vring;
> > +     struct pci_epf_vringh *evrh;
> > +
> > +     evrh = kmalloc(sizeof(*evrh), GFP_KERNEL);
> > +     if (!evrh) {
> > +             err = -ENOMEM;
> > +             goto err_unmap_vq;
> > +     }
> > +
> > +     evrh->size = size;
> > +
> > +     evrh->virt = epf_virtio_map_vq(epf, pfn, size, &evrh->phys);
> > +     if (IS_ERR(evrh->virt))
> > +             return evrh->virt;
> > +
> > +     vring_init(&vring, size, evrh->virt, VIRTIO_PCI_VRING_ALIGN);
> > +
> > +     err = vringh_init_iomem(&evrh->vrh, features, size, false, GFP_KERNEL,
> > +                             vring.desc, vring.avail, vring.used);
> > +     if (err)
> > +             goto err_free_epf_vq;
> > +
> > +     return evrh;
> > +
> > +err_free_epf_vq:
> > +     kfree(evrh);
> > +
> > +err_unmap_vq:
> > +     epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> > +
> > +     return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_epf_virtio_alloc_vringh);
> > +
> > +/**
> > + * pci_epf_virtio_free_vringh() - release allocated epf vring
> > + * @epf: the EPF device that communicates with the host virtio driver
> > + * @evrh: epf vringh to free
> > + */
> > +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> > +                             struct pci_epf_vringh *evrh)
> > +{
> > +     epf_virtio_unmap_vq(epf, evrh->virt, evrh->phys, evrh->size);
> > +     kfree(evrh);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_epf_virtio_free_vringh);
> > +
> > +MODULE_DESCRIPTION("PCI EP Virtio Library");
> > +MODULE_AUTHOR("Shunsuke Mie <mie@igel.co.jp>");
> > +MODULE_LICENSE("GPL");
> > diff --git a/include/linux/pci-epf-virtio.h b/include/linux/pci-epf-virtio.h
> > new file mode 100644
> > index 000000000000..ae09087919a9
> > --- /dev/null
> > +++ b/include/linux/pci-epf-virtio.h
> > @@ -0,0 +1,25 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * PCI Endpoint Function (EPF) for virtio definitions
> > + */
> > +#ifndef __LINUX_PCI_EPF_VIRTIO_H
> > +#define __LINUX_PCI_EPF_VIRTIO_H
> > +
> > +#include <linux/types.h>
> > +#include <linux/vringh.h>
> > +#include <linux/pci-epf.h>
> > +
> > +struct pci_epf_vringh {
> > +     struct vringh vrh;
> > +     void __iomem *virt;
> > +     phys_addr_t phys;
> > +     size_t size;
> > +};
> > +
> > +struct pci_epf_vringh *pci_epf_virtio_alloc_vringh(struct pci_epf *epf,
> > +                                                u64 features, u32 pfn,
> > +                                                size_t size);
> > +void pci_epf_virtio_free_vringh(struct pci_epf *epf,
> > +                             struct pci_epf_vringh *evrh);
> > +
> > +#endif // __LINUX_PCI_EPF_VIRTIO_H
> > --
> > 2.25.1
>
Best,
Shunsuke
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-07 10:56         ` Shunsuke Mie
  (?)
@ 2023-02-07 15:37         ` Frank Li
  2023-02-08  5:46             ` Shunsuke Mie
  -1 siblings, 1 reply; 50+ messages in thread
From: Frank Li @ 2023-02-07 15:37 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Michael S. Tsirkin, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization

> > but it may need update host side pci virtio driver.
> Thanks, is it possible to use  MSI-X as well? The virtio spec
> indicates to use legacy irq or
> MSI-X only.

I suppose yes. It depends on the MSI controller type on the EP side.
But unlike standard PCI MSI-X, it is a platform MSI-X irq.

If you use GIC-ITS, it should support MSI-X.

Thomas Gleixner is working on per-device MSI irq domains.
https://lore.kernel.org/all/20221121135653.208611233@linutronix.de

I hope Thomas can finish that work soon, so I can continue my patch upstream work.
https://lore.kernel.org/imx/87wn7evql7.ffs@tglx/T/#u
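
For what it's worth, a hypothetical sketch of that doorbell idea (names are
made up; the real proposal is in the pci-ep-msi series linked above): allocate
one platform MSI on the EP side and expose its address/data pair, e.g. through
a BAR, so the host can ring it with a plain memory write.

#include <linux/msi.h>
#include <linux/interrupt.h>

static void ep_doorbell_write_msg(struct msi_desc *desc, struct msi_msg *msg)
{
	/* Stash msg->address_hi/address_lo and msg->data somewhere the host
	 * can reach, e.g. a register block exposed behind a BAR.
	 */
}

static int ep_doorbell_init(struct device *dev, irq_handler_t handler,
			    void *data)
{
	int ret;

	ret = platform_msi_domain_alloc_irqs(dev, 1, ep_doorbell_write_msg);
	if (ret)
		return ret;

	return request_irq(msi_get_virq(dev, 0), handler, 0,
			   "pci-epf-vnet-doorbell", data);
}
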

> >
> Best,
> Shunsuke.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-07 10:29     ` Shunsuke Mie
  (?)
@ 2023-02-07 16:02     ` Frank Li
  2023-02-14  3:27         ` Shunsuke Mie
  -1 siblings, 1 reply; 50+ messages in thread
From: Frank Li @ 2023-02-07 16:02 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization


> We project extending this module to support RDMA. The plan is based on
> virtio-rdma[1].
> It extends the virtio-net and we are plan to implement the proposed
> spec based on this patch.
> [1] virtio-rdma
> - proposal:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
> ernel.org%2Fall%2F20220511095900.343-1-
> xieyongji%40bytedance.com%2FT%2F&data=05%7C01%7Cfrank.li%40nxp.co
> m%7C0ef2bd62eda945c413be08db08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5
> c301635%7C0%7C0%7C638113625610341574%7CUnknown%7CTWFpbGZsb3d
> 8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
> 3D%7C3000%7C%7C%7C&sdata=HyhpRTG8MNx%2BtfmWn6x3srmdBjHcZAo
> 2qbxL9USph9o%3D&reserved=0
> - presentation on kvm forum:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fyout
> u.be%2FQrhv6hC_YK4&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62
> eda945c413be08db08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> 7C0%7C638113625610341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
> C%7C%7C&sdata=ucOsGR1letTjxf0gKN6uls5y951CXaIspZtLGnASEC8%3D&res
> erved=0
> 

Sorry, our Outlook client always changes links. This is the previous discussion:
https://lore.kernel.org/imx/d098a631-9930-26d3-48f3-8f95386c8e50@ti.com/T/#t

It looks like endpoint maintainer Kishon would like the endpoint side to work as vhost.
Previously, Haotian Wang submitted similar patches, which just don't use eDMA, only memcpy.
But the overall idea is the same.

I think your and Haotian's method is more reasonable for a PCI RC-EP connection.

Kishon has not been active recently. Maybe we need Lorenzo Pieralisi's and Bjorn
Helgaas's comments on the overall direction.

Frank Li 

> Please feel free to comment and suggest.
> > Frank Li
> >
> > >
> > > To realize the function, this patchset has few changes and introduces a
> > > new APIs to PCI EP framework related to virtio. Furthermore, it device
> > > depends on the some patchtes that is discussing. Those depended
> patchset
> > > are following:
> > > - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous
> transfer
> > > link:
> > >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
> %2F&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62eda945c413be08db
> 08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6381136256
> 10341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoi
> V2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=d
> VZMaheX3eR1xA2wQtecmT857h2%2BFtUbhDSHXwgvsEY%3D&reserved=0
> > > ernel.org%2Fdmaengine%2F20221223022608.550697-1-
> > >
> mie%40igel.co.jp%2F&data=05%7C01%7CFrank.Li%40nxp.com%7Cac57a62d4
> > > 10b458a5ba408db05ce0a4e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> > >
> 7C0%7C638110154722945380%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
> > >
> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
> > >
> C%7C%7C&sdata=tIn0MHzEvrdxaC4KKTvTRvYXBzQ6MyrFa2GXpa3ePv0%3D&
> > > reserved=0
> > > - [RFC PATCH 0/3] Deal with alignment restriction on EP side
> > > link:
> > >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
> %2F&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62eda945c413be08db
> 08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6381136256
> 10341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoi
> V2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=d
> VZMaheX3eR1xA2wQtecmT857h2%2BFtUbhDSHXwgvsEY%3D&reserved=0
> > > ernel.org%2Flinux-pci%2F20230113090350.1103494-1-
> > >
> mie%40igel.co.jp%2F&data=05%7C01%7CFrank.Li%40nxp.com%7Cac57a62d4
> > > 10b458a5ba408db05ce0a4e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> > >
> 7C0%7C638110154722945380%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
> > >
> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
> > >
> C%7C%7C&sdata=RLpnDiLwfqQd5QMXdiQyPVCkfOj8q2AyVeZOwWHvlsM%3
> > > D&reserved=0
> > > - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
> > > link:
> > >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
> %2F&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62eda945c413be08db
> 08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6381136256
> 10341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoi
> V2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=d
> VZMaheX3eR1xA2wQtecmT857h2%2BFtUbhDSHXwgvsEY%3D&reserved=0
> > > ernel.org%2Fvirtualization%2F20230202090934.549556-1-
> > >
> mie%40igel.co.jp%2F&data=05%7C01%7CFrank.Li%40nxp.com%7Cac57a62d4
> > > 10b458a5ba408db05ce0a4e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> > >
> 7C0%7C638110154722945380%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
> > >
> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
> > >
> C%7C%7C&sdata=6jgY76BMSbvamb%2Fl3Urjt4Gcizeqon%2BZE5nPssc2kDA%
> > > 3D&reserved=0
> > >
> > > About this patchset has 4 patches. The first of two patch is little changes
> > > to virtio. The third patch add APIs to easily access virtio data structure
> > > on PCIe Host side memory. The last one introduce a virtio-net EP device
> > > function. Details are in commit respectively.
> > >
> > > Currently those network devices are testd using ping only. I'll add a
> > > result of performance evaluation using iperf and etc to the future version
> > > of this patchset.
> > >
> > > Shunsuke Mie (4):
> > >   virtio_pci: add a definition of queue flag in ISR
> > >   virtio_ring: remove const from vring getter
> > >   PCI: endpoint: Introduce virtio library for EP functions
> > >   PCI: endpoint: function: Add EP function driver to provide virtio net
> > >     device
> > >
> > >  drivers/pci/endpoint/Kconfig                  |   7 +
> > >  drivers/pci/endpoint/Makefile                 |   1 +
> > >  drivers/pci/endpoint/functions/Kconfig        |  12 +
> > >  drivers/pci/endpoint/functions/Makefile       |   1 +
> > >  .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
> > >  .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635
> ++++++++++++++++++
> > >  drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
> > >  drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
> > >  drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
> > >  drivers/virtio/virtio_ring.c                  |   2 +-
> > >  include/linux/pci-epf-virtio.h                |  25 +
> > >  include/linux/virtio.h                        |   2 +-
> > >  include/uapi/linux/virtio_pci.h               |   2 +
> > >  13 files changed, 1590 insertions(+), 2 deletions(-)
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
> > >  create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
> > >  create mode 100644 include/linux/pci-epf-virtio.h
> > >
> > > --
> > > 2.25.1
> >
> Best,
> Shunsuke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
  2023-02-07 15:37         ` Frank Li
@ 2023-02-08  5:46             ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-08  5:46 UTC (permalink / raw)
  To: Frank Li
  Cc: Michael S. Tsirkin, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Jason Wang, Jon Mason, Ren Zhijie, Takanari Hayama, linux-kernel,
	linux-pci, virtualization


On 2023/02/08 0:37, Frank Li wrote:
>>> but it may need update host side pci virtio driver.
>> Thanks, is it possible to use  MSI-X as well? The virtio spec
>> indicates to use legacy irq or
>> MSI-X only.
> I supposed yes. It is depend MSI controller type in EP side.
> But not like standard PCI MSI-X, it is platform MSI-X irq.
>
> If use GIC-its, it should support MSI-X.
>
> Thomas Gleixner is working on pre-device msi irq domain.
> https://lore.kernel.org/all/20221121135653.208611233@linutronix.de
>
> I hope Thomas can finish their work soon.
> so I can continue my patch upstream work.
> https://lore.kernel.org/imx/87wn7evql7.ffs@tglx/T/#u

Thanks for sharing this information. I'll look into the details and incorporate them.

>> Best,
>> Shunsuke.

Best,

Shunsuke.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] Re: [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device
@ 2023-02-08  5:46             ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-08  5:46 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Manivannan Sadhasivam, linux-pci,
	Lorenzo Pieralisi, Michael S. Tsirkin, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas


On 2023/02/08 0:37, Frank Li wrote:
>>> but it may need update host side pci virtio driver.
>> Thanks, is it possible to use  MSI-X as well? The virtio spec
>> indicates to use legacy irq or
>> MSI-X only.
> I supposed yes. It is depend MSI controller type in EP side.
> But not like standard PCI MSI-X, it is platform MSI-X irq.
>
> If use GIC-its, it should support MSI-X.
>
> Thomas Gleixner is working on pre-device msi irq domain.
> https://lore.kernel.org/all/20221121135653.208611233@linutronix.de
>
> I hope Thomas can finish their work soon.
> so I can continue my patch upstream work.
> https://lore.kernel.org/imx/87wn7evql7.ffs@tglx/T/#u

Thanks for sharing this information. I'll look into the details and incorporate them.

>> Best,
>> Shunsuke.

Best,

Shunsuke.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-07 16:02     ` Frank Li
@ 2023-02-14  3:27         ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-14  3:27 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization


On 2023/02/08 1:02, Frank Li wrote:
>> We project extending this module to support RDMA. The plan is based on
>> virtio-rdma[1].
>> It extends the virtio-net and we are plan to implement the proposed
>> spec based on this patch.
>> [1] virtio-rdma
>> - proposal:
>> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
>> ernel.org%2Fall%2F20220511095900.343-1-
>> xieyongji%40bytedance.com%2FT%2F&data=05%7C01%7Cfrank.li%40nxp.co
>> m%7C0ef2bd62eda945c413be08db08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5
>> c301635%7C0%7C0%7C638113625610341574%7CUnknown%7CTWFpbGZsb3d
>> 8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
>> 3D%7C3000%7C%7C%7C&sdata=HyhpRTG8MNx%2BtfmWn6x3srmdBjHcZAo
>> 2qbxL9USph9o%3D&reserved=0
>> - presentation on kvm forum:
>> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fyout
>> u.be%2FQrhv6hC_YK4&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62
>> eda945c413be08db08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
>> 7C0%7C638113625610341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
>> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
>> C%7C%7C&sdata=ucOsGR1letTjxf0gKN6uls5y951CXaIspZtLGnASEC8%3D&res
>> erved=0
>>
> Sorry for our outlook client always change link.  This previous discussion.
> https://lore.kernel.org/imx/d098a631-9930-26d3-48f3-8f95386c8e50@ti.com/T/#t
>
> Look like Endpoint maintainer Kishon like endpoint side work as vhost.
> Previous  Haotian Wang submit similar patches, which just not use eDMA, just use memcpy.
> But overall idea is the same.
>
> I think your and haotian's method is more reasonable for PCI-RC EP connection.
>
> Kishon is not active recently.   Maybe need Lorenzo Pieralisi and Bjorn helgass's comments
> for overall directions.
I think so too. Thank you for the summary. I've commented on that e-mail.
> Frank Li
>
>> Please feel free to comment and suggest.
>>> Frank Li
>>>
>>>> To realize the function, this patchset has few changes and introduces a
>>>> new APIs to PCI EP framework related to virtio. Furthermore, it device
>>>> depends on the some patchtes that is discussing. Those depended
>> patchset
>>>> are following:
>>>> - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous
>> transfer
>>>> link:
>>>>
>> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.k
>> %2F&data=05%7C01%7Cfrank.li%40nxp.com%7C0ef2bd62eda945c413be08db
>> 08f62ba3%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6381136256
>> 10341574%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoi
>> V2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=d
>> VZMaheX3eR1xA2wQtecmT857h2%2BFtUbhDSHXwgvsEY%3D&reserved=0
>>>> ernel.org%2Fdmaengine%2F20221223022608.550697-1-
>>>>
>> mie%40igel.co.jp%2F&data=05%7C01%7CFrank.Li%40nxp.com%7Cac57a62d4
>>>> 10b458a5ba408db05ce0a4e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
>>>>
>> 7C0%7C638110154722945380%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
>> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7
>> C%7C%7C&sdata=tIn0MHzEvrdxaC4KKTvTRvYXBzQ6MyrFa2GXpa3ePv0%3D&
>>>> reserved=0
>>>> - [RFC PATCH 0/3] Deal with alignment restriction on EP side
>>>> link:
>>>> https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
>>>> - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
>>>> link:
>>>> https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
>>>>
>>>> About this patchset has 4 patches. The first of two patch is little changes
>>>> to virtio. The third patch add APIs to easily access virtio data structure
>>>> on PCIe Host side memory. The last one introduce a virtio-net EP device
>>>> function. Details are in commit respectively.
>>>>
>>>> Currently those network devices are testd using ping only. I'll add a
>>>> result of performance evaluation using iperf and etc to the future version
>>>> of this patchset.
>>>>
>>>> Shunsuke Mie (4):
>>>>    virtio_pci: add a definition of queue flag in ISR
>>>>    virtio_ring: remove const from vring getter
>>>>    PCI: endpoint: Introduce virtio library for EP functions
>>>>    PCI: endpoint: function: Add EP function driver to provide virtio net
>>>>      device
>>>>
>>>>   drivers/pci/endpoint/Kconfig                  |   7 +
>>>>   drivers/pci/endpoint/Makefile                 |   1 +
>>>>   drivers/pci/endpoint/functions/Kconfig        |  12 +
>>>>   drivers/pci/endpoint/functions/Makefile       |   1 +
>>>>   .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>>>>   .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635
>> ++++++++++++++++++
>>>>   drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>>>>   drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>>>>   drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
>>>>   drivers/virtio/virtio_ring.c                  |   2 +-
>>>>   include/linux/pci-epf-virtio.h                |  25 +
>>>>   include/linux/virtio.h                        |   2 +-
>>>>   include/uapi/linux/virtio_pci.h               |   2 +
>>>>   13 files changed, 1590 insertions(+), 2 deletions(-)
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
>>>>   create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>>>>   create mode 100644 include/linux/pci-epf-virtio.h
>>>>
>>>> --
>>>> 2.25.1
>> Best,
>> Shunsuke

Best,

Shunsuke.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-02-14  3:27         ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-02-14  3:27 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas


On 2023/02/08 1:02, Frank Li wrote:
>> We plan to extend this module to support RDMA. The plan is based on
>> virtio-rdma[1].
>> It extends virtio-net, and we intend to implement the proposed
>> spec based on this patch.
>> [1] virtio-rdma
>> - proposal:
>> https://lore.kernel.org/all/20220511095900.343-1-xieyongji@bytedance.com/T/
>> - presentation on kvm forum:
>> https://youtu.be/Qrhv6hC_YK4
>>
> Sorry, our Outlook client always changes links. This is the previous discussion:
> https://lore.kernel.org/imx/d098a631-9930-26d3-48f3-8f95386c8e50@ti.com/T/#t
>
> It looks like the endpoint maintainer, Kishon, prefers the endpoint side to work as vhost.
> Previously, Haotian Wang submitted similar patches, which just used memcpy instead of eDMA,
> but the overall idea is the same.
>
> I think your and Haotian's method is more reasonable for a PCI RC-EP connection.
>
> Kishon has not been active recently. We may need comments from Lorenzo Pieralisi and
> Bjorn Helgaas on the overall direction.
I think so too. Thank you for the summary. I've commented on that e-mail.
> Frank Li
>
>> Please feel free to comment and suggest.
>>> Frank Li
>>>
>>>> To realize the function, this patchset has few changes and introduces a
>>>> new APIs to PCI EP framework related to virtio. Furthermore, it device
>>>> depends on the some patchtes that is discussing. Those depended
>> patchset
>>>> are following:
>>>> - [PATCH 1/2] dmaengine: dw-edma: Fix to change for continuous transfer
>>>> link:
>>>> https://lore.kernel.org/dmaengine/20221223022608.550697-1-mie@igel.co.jp/
>>>> - [RFC PATCH 0/3] Deal with alignment restriction on EP side
>>>> link:
>>>> https://lore.kernel.org/linux-pci/20230113090350.1103494-1-mie@igel.co.jp/
>>>> - [RFC PATCH v2 0/7] Introduce a vringh accessor for IO memory
>>>> link:
>>>> https://lore.kernel.org/virtualization/20230202090934.549556-1-mie@igel.co.jp/
>>>>
>>>> About this patchset has 4 patches. The first of two patch is little changes
>>>> to virtio. The third patch add APIs to easily access virtio data structure
>>>> on PCIe Host side memory. The last one introduce a virtio-net EP device
>>>> function. Details are in commit respectively.
>>>>
>>>> Currently those network devices are testd using ping only. I'll add a
>>>> result of performance evaluation using iperf and etc to the future version
>>>> of this patchset.
>>>>
>>>> Shunsuke Mie (4):
>>>>    virtio_pci: add a definition of queue flag in ISR
>>>>    virtio_ring: remove const from vring getter
>>>>    PCI: endpoint: Introduce virtio library for EP functions
>>>>    PCI: endpoint: function: Add EP function driver to provide virtio net
>>>>      device
>>>>
>>>>   drivers/pci/endpoint/Kconfig                  |   7 +
>>>>   drivers/pci/endpoint/Makefile                 |   1 +
>>>>   drivers/pci/endpoint/functions/Kconfig        |  12 +
>>>>   drivers/pci/endpoint/functions/Makefile       |   1 +
>>>>   .../pci/endpoint/functions/pci-epf-vnet-ep.c  | 343 ++++++++++
>>>>   .../pci/endpoint/functions/pci-epf-vnet-rc.c  | 635
>> ++++++++++++++++++
>>>>   drivers/pci/endpoint/functions/pci-epf-vnet.c | 387 +++++++++++
>>>>   drivers/pci/endpoint/functions/pci-epf-vnet.h |  62 ++
>>>>   drivers/pci/endpoint/pci-epf-virtio.c         | 113 ++++
>>>>   drivers/virtio/virtio_ring.c                  |   2 +-
>>>>   include/linux/pci-epf-virtio.h                |  25 +
>>>>   include/linux/virtio.h                        |   2 +-
>>>>   include/uapi/linux/virtio_pci.h               |   2 +
>>>>   13 files changed, 1590 insertions(+), 2 deletions(-)
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-ep.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet-rc.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.c
>>>>   create mode 100644 drivers/pci/endpoint/functions/pci-epf-vnet.h
>>>>   create mode 100644 drivers/pci/endpoint/pci-epf-virtio.c
>>>>   create mode 100644 include/linux/pci-epf-virtio.h
>>>>
>>>> --
>>>> 2.25.1
>> Best,
>> Shunsuke

Best,

Shunsuke.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-02-14  3:27         ` Shunsuke Mie
  (?)
@ 2023-03-29 16:46         ` Frank Li
  2023-04-05  1:22             ` Shunsuke Mie
  2023-04-11 10:22             ` Shunsuke Mie
  -1 siblings, 2 replies; 50+ messages in thread
From: Frank Li @ 2023-03-29 16:46 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization


> 
> On 2023/02/08 1:02, Frank Li wrote:

Did you have a chance to improve this?

Best regards
Frank Li 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-03-29 16:46         ` Frank Li
@ 2023-04-05  1:22             ` Shunsuke Mie
  2023-04-11 10:22             ` Shunsuke Mie
  1 sibling, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-04-05  1:22 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization


On 2023/03/30 1:46, Frank Li wrote:
>> On 2023/02/08 1:02, Frank Li wrote:
> Did you have a chance to improve this?

Yes, I'm working on it. I'd like to submit a new version this week.

> Best regards
> Frank Li

Best,

Shunsuke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-04-05  1:22             ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-04-05  1:22 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas


On 2023/03/30 1:46, Frank Li wrote:
>> On 2023/02/08 1:02, Frank Li wrote:
> Did you have a chance to improve this?

Yes, I'm working on it. I'd like to submit a new version this week.

> Best regards
> Frank Li

Best,

Shunsuke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
  2023-03-29 16:46         ` Frank Li
@ 2023-04-11 10:22             ` Shunsuke Mie
  2023-04-11 10:22             ` Shunsuke Mie
  1 sibling, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-04-11 10:22 UTC (permalink / raw)
  To: Frank Li
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Kishon Vijay Abraham I, Bjorn Helgaas,
	Michael S. Tsirkin, Jason Wang, Jon Mason, Ren Zhijie,
	Takanari Hayama, linux-kernel, linux-pci, virtualization


On 2023/03/30 1:46, Frank Li wrote:
>> On 2023/02/08 1:02, Frank Li wrote:
> Did you have a chance to improve this?

I'm working on it. I'll submit the next version.

>
> Best regards
> Frank Li


Best regards,

Shunsuke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function
@ 2023-04-11 10:22             ` Shunsuke Mie
  0 siblings, 0 replies; 50+ messages in thread
From: Shunsuke Mie @ 2023-04-11 10:22 UTC (permalink / raw)
  To: Frank Li
  Cc: Kishon Vijay Abraham I, Krzysztof Wilczyński,
	Takanari Hayama, Michael S. Tsirkin, linux-pci,
	Lorenzo Pieralisi, Manivannan Sadhasivam, linux-kernel,
	virtualization, Ren Zhijie, Jon Mason, Bjorn Helgaas


On 2023/03/30 1:46, Frank Li wrote:
>> On 2023/02/08 1:02, Frank Li wrote:
> Did you have a chance to improve this?

I'm working on it. I'll submit the next version.

>
> Best regards
> Frank Li


Best regards,

Shunsuke


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2023-04-11 10:23 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-03 10:04 [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function Shunsuke Mie
2023-02-03 10:04 ` Shunsuke Mie
2023-02-03 10:04 ` [RFC PATCH 1/4] virtio_pci: add a definition of queue flag in ISR Shunsuke Mie
2023-02-03 10:04   ` Shunsuke Mie
2023-02-03 10:16   ` Michael S. Tsirkin
2023-02-03 10:16     ` Michael S. Tsirkin
2023-02-07 10:06     ` Shunsuke Mie
2023-02-07 10:06       ` Shunsuke Mie
2023-02-03 10:04 ` [RFC PATCH 2/4] virtio_ring: remove const from vring getter Shunsuke Mie
2023-02-03 10:04   ` Shunsuke Mie
2023-02-03 10:04 ` [RFC PATCH 3/4] PCI: endpoint: Introduce virtio library for EP functions Shunsuke Mie
2023-02-03 10:04   ` Shunsuke Mie
2023-02-03 10:20   ` Michael S. Tsirkin
2023-02-03 10:20     ` Michael S. Tsirkin
2023-02-07 11:05     ` Shunsuke Mie
2023-02-07 11:05       ` Shunsuke Mie
2023-02-03 10:04 ` [RFC PATCH 4/4] PCI: endpoint: function: Add EP function driver to provide virtio net device Shunsuke Mie
2023-02-03 10:04   ` Shunsuke Mie
2023-02-03 10:22   ` Michael S. Tsirkin
2023-02-03 10:22     ` Michael S. Tsirkin
2023-02-03 22:15     ` [EXT] " Frank Li
2023-02-07 10:56       ` Shunsuke Mie
2023-02-07 10:56         ` Shunsuke Mie
2023-02-07 15:37         ` Frank Li
2023-02-08  5:46           ` Shunsuke Mie
2023-02-08  5:46             ` Shunsuke Mie
2023-02-07 10:47     ` Shunsuke Mie
2023-02-07 10:47       ` Shunsuke Mie
2023-02-03 11:58   ` kernel test robot
2023-02-04  1:05   ` kernel test robot
2023-02-03 16:45 ` [EXT] [RFC PATCH 0/4] PCI: endpoint: Introduce a virtio-net EP function Frank Li
2023-02-07 10:29   ` Shunsuke Mie
2023-02-07 10:29     ` Shunsuke Mie
2023-02-07 16:02     ` Frank Li
2023-02-14  3:27       ` Shunsuke Mie
2023-02-14  3:27         ` Shunsuke Mie
2023-03-29 16:46         ` Frank Li
2023-04-05  1:22           ` Shunsuke Mie
2023-04-05  1:22             ` Shunsuke Mie
2023-04-11 10:22           ` Shunsuke Mie
2023-04-11 10:22             ` Shunsuke Mie
2023-02-03 21:48 ` Frank Li
2023-02-07  1:43   ` Shunsuke Mie
2023-02-07  1:43     ` Shunsuke Mie
2023-02-07  3:27     ` Shunsuke Mie
2023-02-07  3:27       ` Shunsuke Mie
2023-02-05 10:01 ` Michael S. Tsirkin
2023-02-05 10:01   ` Michael S. Tsirkin
2023-02-07 10:17   ` Shunsuke Mie
2023-02-07 10:17     ` Shunsuke Mie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.