linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/4] NTB function for PCIe RC to EP connection
@ 2022-02-22 16:23 Frank Li
  2022-02-22 16:23 ` [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address Frank Li
                   ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Frank Li @ 2022-02-22 16:23 UTC (permalink / raw)
  To: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

This implement NTB function for PCIe EP to RC connections.
The existed ntb epf need two PCI EPs and two PCI Host.

This just need EP to RC connections.

    ┌────────────┐         ┌─────────────────────────────────────┐
    │            │         │                                     │
    ├────────────┤         │                      ┌──────────────┤
    │ NTB        │         │                      │ NTB          │
    │ NetDev     │         │                      │ NetDev       │
    ├────────────┤         │                      ├──────────────┤
    │ NTB        │         │                      │ NTB          │
    │ Transfer   │         │                      │ Transfer     │
    ├────────────┤         │                      ├──────────────┤
    │            │         │                      │              │
    │  PCI NTB   │         │                      │              │
    │    EPF     │         │                      │              │
    │   Driver   │         │                      │ PCI Virtual  │
    │            │         ├───────────────┐      │ NTB Driver   │
    │            │         │ PCI EP NTB    │◄────►│              │
    │            │         │  FN Driver    │      │              │
    ├────────────┤         ├───────────────┤      ├──────────────┤
    │            │         │               │      │              │
    │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
    │            │  PCI    │               │      │     BUS      │
    └────────────┘         └───────────────┴──────┴──────────────┘
        PCI RC                        PCI EP



Frank Li (4):
  PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  NTB: epf: Allow more flexibility in the memory BAR map method
  PCI: endpoint: Support NTB transfer between RC and EP
  Documentation: PCI: Add specification for the PCI vNTB function device

 Documentation/PCI/endpoint/index.rst          |    2 +
 .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
 Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
 drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
 .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
 drivers/pci/endpoint/functions/Kconfig        |   11 +
 drivers/pci/endpoint/functions/Makefile       |    1 +
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
 8 files changed, 1775 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
 create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c

-- 
2.24.0.rc1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
@ 2022-02-22 16:23 ` Frank Li
  2023-12-14 14:31   ` Niklas Cassel
  2022-02-22 16:23 ` [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method Frank Li
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Frank Li @ 2022-02-22 16:23 UTC (permalink / raw)
  To: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

ntb_mw_set_trans() will set memory map window after endpoint function
driver bind. The inbound map address need be updated dynamically when
using NTB by PCIe Root Port and PCIe Endpoint connection.

Checking if iatu already assigned to the BAR, if yes, using assigned iatu
number to update inbound address map and skip set BAR's register.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---

Change from V1:
 - improve commit message

 drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 998b698f40858..cad5d9ea7cc6c 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
 	u32 free_win;
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 
-	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
+	if (!ep->bar_to_atu[bar])
+		free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
+	else
+		free_win = ep->bar_to_atu[bar];
+
 	if (free_win >= pci->num_ib_windows) {
 		dev_err(pci->dev, "No free inbound window\n");
 		return -EINVAL;
@@ -215,6 +219,7 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
 	clear_bit(atu_index, ep->ib_window_map);
 	ep->epf_bar[bar] = NULL;
+	ep->bar_to_atu[bar] = 0;
 }
 
 static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
@@ -244,6 +249,9 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 	if (ret)
 		return ret;
 
+	if (ep->epf_bar[bar])
+		return 0;
+
 	dw_pcie_dbi_ro_wr_en(pci);
 
 	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
-- 
2.24.0.rc1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
  2022-02-22 16:23 ` [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address Frank Li
@ 2022-02-22 16:23 ` Frank Li
  2022-03-10 22:08   ` Zhi Li
  2022-02-22 16:23 ` [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP Frank Li
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Frank Li @ 2022-02-22 16:23 UTC (permalink / raw)
  To: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

Support the below BAR configuration methods for epf NTB.

BAR 0: config and scratchpad
BAR 2: doorbell
BAR 4: memory map windows

Set difference BAR number information into struct ntb_epf_data. So difference
VID/PID can choose different BAR configurations. There are difference
BAR map method between epf NTB and epf vNTB Endpoint function.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---

Change from v1:
 - Improve commit message

 drivers/ntb/hw/epf/ntb_hw_epf.c | 48 ++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 13 deletions(-)

diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
index b019755e4e21b..3ece49cb18ffa 100644
--- a/drivers/ntb/hw/epf/ntb_hw_epf.c
+++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
@@ -45,7 +45,6 @@
 
 #define NTB_EPF_MIN_DB_COUNT	3
 #define NTB_EPF_MAX_DB_COUNT	31
-#define NTB_EPF_MW_OFFSET	2
 
 #define NTB_EPF_COMMAND_TIMEOUT	1000 /* 1 Sec */
 
@@ -67,6 +66,7 @@ struct ntb_epf_dev {
 	enum pci_barno ctrl_reg_bar;
 	enum pci_barno peer_spad_reg_bar;
 	enum pci_barno db_reg_bar;
+	enum pci_barno mw_bar;
 
 	unsigned int mw_count;
 	unsigned int spad_count;
@@ -92,6 +92,8 @@ struct ntb_epf_data {
 	enum pci_barno peer_spad_reg_bar;
 	/* BAR that contains Doorbell region and Memory window '1' */
 	enum pci_barno db_reg_bar;
+	/* BAR that contains memory windows*/
+	enum pci_barno mw_bar;
 };
 
 static int ntb_epf_send_command(struct ntb_epf_dev *ndev, u32 command,
@@ -411,7 +413,7 @@ static int ntb_epf_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
 		return -EINVAL;
 	}
 
-	bar = idx + NTB_EPF_MW_OFFSET;
+	bar = idx + ndev->mw_bar;
 
 	mw_size = pci_resource_len(ntb->pdev, bar);
 
@@ -453,7 +455,7 @@ static int ntb_epf_peer_mw_get_addr(struct ntb_dev *ntb, int idx,
 	if (idx == 0)
 		offset = readl(ndev->ctrl_reg + NTB_EPF_MW1_OFFSET);
 
-	bar = idx + NTB_EPF_MW_OFFSET;
+	bar = idx + ndev->mw_bar;
 
 	if (base)
 		*base = pci_resource_start(ndev->ntb.pdev, bar) + offset;
@@ -565,6 +567,7 @@ static int ntb_epf_init_pci(struct ntb_epf_dev *ndev,
 			    struct pci_dev *pdev)
 {
 	struct device *dev = ndev->dev;
+	size_t spad_sz, spad_off;
 	int ret;
 
 	pci_set_drvdata(pdev, ndev);
@@ -599,10 +602,16 @@ static int ntb_epf_init_pci(struct ntb_epf_dev *ndev,
 		goto err_dma_mask;
 	}
 
-	ndev->peer_spad_reg = pci_iomap(pdev, ndev->peer_spad_reg_bar, 0);
-	if (!ndev->peer_spad_reg) {
-		ret = -EIO;
-		goto err_dma_mask;
+	if (ndev->peer_spad_reg_bar) {
+		ndev->peer_spad_reg = pci_iomap(pdev, ndev->peer_spad_reg_bar, 0);
+		if (!ndev->peer_spad_reg) {
+			ret = -EIO;
+			goto err_dma_mask;
+		}
+	} else {
+		spad_sz = 4 * readl(ndev->ctrl_reg + NTB_EPF_SPAD_COUNT);
+		spad_off = readl(ndev->ctrl_reg + NTB_EPF_SPAD_OFFSET);
+		ndev->peer_spad_reg = ndev->ctrl_reg + spad_off  + spad_sz;
 	}
 
 	ndev->db_reg = pci_iomap(pdev, ndev->db_reg_bar, 0);
@@ -657,6 +666,7 @@ static int ntb_epf_pci_probe(struct pci_dev *pdev,
 	enum pci_barno peer_spad_reg_bar = BAR_1;
 	enum pci_barno ctrl_reg_bar = BAR_0;
 	enum pci_barno db_reg_bar = BAR_2;
+	enum pci_barno mw_bar = BAR_2;
 	struct device *dev = &pdev->dev;
 	struct ntb_epf_data *data;
 	struct ntb_epf_dev *ndev;
@@ -671,17 +681,16 @@ static int ntb_epf_pci_probe(struct pci_dev *pdev,
 
 	data = (struct ntb_epf_data *)id->driver_data;
 	if (data) {
-		if (data->peer_spad_reg_bar)
-			peer_spad_reg_bar = data->peer_spad_reg_bar;
-		if (data->ctrl_reg_bar)
-			ctrl_reg_bar = data->ctrl_reg_bar;
-		if (data->db_reg_bar)
-			db_reg_bar = data->db_reg_bar;
+		peer_spad_reg_bar = data->peer_spad_reg_bar;
+		ctrl_reg_bar = data->ctrl_reg_bar;
+		db_reg_bar = data->db_reg_bar;
+		mw_bar = data->mw_bar;
 	}
 
 	ndev->peer_spad_reg_bar = peer_spad_reg_bar;
 	ndev->ctrl_reg_bar = ctrl_reg_bar;
 	ndev->db_reg_bar = db_reg_bar;
+	ndev->mw_bar = mw_bar;
 	ndev->dev = dev;
 
 	ntb_epf_init_struct(ndev, pdev);
@@ -729,6 +738,14 @@ static const struct ntb_epf_data j721e_data = {
 	.ctrl_reg_bar = BAR_0,
 	.peer_spad_reg_bar = BAR_1,
 	.db_reg_bar = BAR_2,
+	.mw_bar = BAR_2,
+};
+
+static const struct ntb_epf_data mx8_data = {
+	.ctrl_reg_bar = BAR_0,
+	.peer_spad_reg_bar = BAR_0,
+	.db_reg_bar = BAR_2,
+	.mw_bar = BAR_4,
 };
 
 static const struct pci_device_id ntb_epf_pci_tbl[] = {
@@ -737,6 +754,11 @@ static const struct pci_device_id ntb_epf_pci_tbl[] = {
 		.class = PCI_CLASS_MEMORY_RAM << 8, .class_mask = 0xffff00,
 		.driver_data = (kernel_ulong_t)&j721e_data,
 	},
+	{
+		PCI_DEVICE(PCI_VENDOR_ID_FREESCALE, 0x0809),
+		.class = PCI_CLASS_MEMORY_RAM << 8, .class_mask = 0xffff00,
+		.driver_data = (kernel_ulong_t)&mx8_data,
+	},
 	{ },
 };
 
-- 
2.24.0.rc1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
  2022-02-22 16:23 ` [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address Frank Li
  2022-02-22 16:23 ` [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method Frank Li
@ 2022-02-22 16:23 ` Frank Li
  2022-03-10 22:09   ` Zhi Li
  2022-12-14  0:08   ` Bjorn Helgaas
  2022-02-22 16:23 ` [PATCH v2 4/4] Documentation: PCI: Add specification for the PCI vNTB function device Frank Li
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 30+ messages in thread
From: Frank Li @ 2022-02-22 16:23 UTC (permalink / raw)
  To: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

Add NTB function driver and virtual PCI Bus and Virtual NTB driver
to implement communication between PCIe Root Port and PCIe EP devices

┌────────────┐         ┌─────────────────────────────────────┐
│            │         │                                     │
├────────────┤         │                      ┌──────────────┤
│ NTB        │         │                      │ NTB          │
│ NetDev     │         │                      │ NetDev       │
├────────────┤         │                      ├──────────────┤
│ NTB        │         │                      │ NTB          │
│ Transfer   │         │                      │ Transfer     │
├────────────┤         │                      ├──────────────┤
│            │         │                      │              │
│  PCI NTB   │         │                      │              │
│    EPF     │         │                      │              │
│   Driver   │         │                      │ PCI Virtual  │
│            │         ├───────────────┐      │ NTB Driver   │
│            │         │ PCI EP NTB    │◄────►│              │
│            │         │  FN Driver    │      │              │
├────────────┤         ├───────────────┤      ├──────────────┤
│            │         │               │      │              │
│  PCI Bus   │ ◄─────► │  PCI EP Bus   │      │  Virtual PCI │
│            │  PCI    │               │      │     Bus      │
└────────────┘         └───────────────┴──────┴──────────────┘
PCIe Root Port                        PCI EP

This driver includes 3 parts:
 1 PCI EP NTB function driver
 2 Virtual PCI bus
 3 PCI virtual NTB driver, which is loaded only by above virtual PCI bus

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
Change from v1:
 - Fix code style problem according to Bjorn's feedback
 - Remove hardcode VPCI_BUS_NUM, let user choose free number by configfs
 - Remove hardcode vid pid for virtual pci ntb driver, change by
   configfs

 drivers/pci/endpoint/functions/Kconfig        |   11 +
 drivers/pci/endpoint/functions/Makefile       |    1 +
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
 3 files changed, 1436 insertions(+)
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c

diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
index 5f1242ca2f4e4..65217428d17b9 100644
--- a/drivers/pci/endpoint/functions/Kconfig
+++ b/drivers/pci/endpoint/functions/Kconfig
@@ -25,3 +25,14 @@ config PCI_EPF_NTB
 	  device tree.
 
 	  If in doubt, say "N" to disable Endpoint NTB driver.
+
+config PCI_EPF_VNTB
+        tristate "PCI Endpoint NTB driver"
+        depends on PCI_ENDPOINT
+        select CONFIGFS_FS
+        help
+          Select this configuration option to enable the Non-Transparent
+          Bridge (NTB) driver for PCIe Endpoint. NTB driver implements NTB
+          between PCI Root Port and PCIe Endpoint.
+
+          If in doubt, say "N" to disable Endpoint NTB driver.
diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
index 96ab932a537a2..5c13001deaba1 100644
--- a/drivers/pci/endpoint/functions/Makefile
+++ b/drivers/pci/endpoint/functions/Makefile
@@ -5,3 +5,4 @@
 
 obj-$(CONFIG_PCI_EPF_TEST)		+= pci-epf-test.o
 obj-$(CONFIG_PCI_EPF_NTB)		+= pci-epf-ntb.o
+obj-$(CONFIG_PCI_EPF_VNTB) 		+= pci-epf-vntb.o
diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
new file mode 100644
index 0000000000000..1466dd1904175
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -0,0 +1,1424 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Endpoint Function Driver to implement Non-Transparent Bridge functionality
+ * Between PCI RC and EP
+ *
+ * Copyright (C) 2020 Texas Instruments
+ * Copyright (C) 2022 NXP
+ *
+ * Based on pci-epf-ntb.c
+ * Author: Frank Li <Frank.Li@nxp.com>
+ * Author: Kishon Vijay Abraham I <kishon@ti.com>
+ */
+
+/**
+ * +------------+         +---------------------------------------+
+ * |            |         |                                       |
+ * +------------+         |                        +--------------+
+ * | NTB        |         |                        | NTB          |
+ * | NetDev     |         |                        | NetDev       |
+ * +------------+         |                        +--------------+
+ * | NTB        |         |                        | NTB          |
+ * | Transfer   |         |                        | Transfer     |
+ * +------------+         |                        +--------------+
+ * |            |         |                        |              |
+ * |  PCI NTB   |         |                        |              |
+ * |    EPF     |         |                        |              |
+ * |   Driver   |         |                        | PCI Virtual  |
+ * |            |         +---------------+        | NTB Driver   |
+ * |            |         | PCI EP NTB    |<------>|              |
+ * |            |         |  FN Driver    |        |              |
+ * +------------+         +---------------+        +--------------+
+ * |            |         |               |        |              |
+ * |  PCI Bus   | <-----> |  PCI EP Bus   |        |  Virtual PCI |
+ * |            |  PCI    |               |        |     Bus      |
+ * +------------+         +---------------+--------+--------------+
+ * PCIe Root Port                        PCI EP
+ */
+
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include <linux/pci-epc.h>
+#include <linux/pci-epf.h>
+#include <linux/ntb.h>
+
+static struct workqueue_struct *kpcintb_workqueue;
+
+#define COMMAND_CONFIGURE_DOORBELL	1
+#define COMMAND_TEARDOWN_DOORBELL	2
+#define COMMAND_CONFIGURE_MW		3
+#define COMMAND_TEARDOWN_MW		4
+#define COMMAND_LINK_UP			5
+#define COMMAND_LINK_DOWN		6
+
+#define COMMAND_STATUS_OK		1
+#define COMMAND_STATUS_ERROR		2
+
+#define LINK_STATUS_UP			BIT(0)
+
+#define SPAD_COUNT			64
+#define DB_COUNT			4
+#define NTB_MW_OFFSET			2
+#define DB_COUNT_MASK			GENMASK(15, 0)
+#define MSIX_ENABLE			BIT(16)
+#define MAX_DB_COUNT			32
+#define MAX_MW				4
+
+enum epf_ntb_bar {
+	BAR_CONFIG,
+	BAR_DB,
+	BAR_MW0,
+	BAR_MW1,
+	BAR_MW2,
+};
+
+/*
+ * +--------------------------------------------------+ Base
+ * |                                                  |
+ * |                                                  |
+ * |                                                  |
+ * |          Common Control Register                 |
+ * |                                                  |
+ * |                                                  |
+ * |                                                  |
+ * +-----------------------+--------------------------+ Base+span_offset
+ * |                       |                          |
+ * |    Peer Span Space    |    Span Space            |
+ * |                       |                          |
+ * |                       |                          |
+ * +-----------------------+--------------------------+ Base+span_offset
+ * |                       |                          |     +span_count * 4
+ * |                       |                          |
+ * |     Span Space        |   Peer Span Space        |
+ * |                       |                          |
+ * +-----------------------+--------------------------+
+ *       Virtual PCI             PCIe Endpoint
+ *       NTB Driver               NTB Driver
+ */
+struct epf_ntb_ctrl {
+	u32     command;
+	u32     argument;
+	u16     command_status;
+	u16     link_status;
+	u32     topology;
+	u64     addr;
+	u64     size;
+	u32     num_mws;
+	u32	reserved;
+	u32     spad_offset;
+	u32     spad_count;
+	u32	db_entry_size;
+	u32     db_data[MAX_DB_COUNT];
+	u32     db_offset[MAX_DB_COUNT];
+} __packed;
+
+struct epf_ntb {
+	struct ntb_dev ntb;
+	struct pci_epf *epf;
+	struct config_group group;
+
+	u32 num_mws;
+	u32 db_count;
+	u32 spad_count;
+	u64 mws_size[MAX_MW];
+	u64 db;
+	u32 vbus_number;
+	u16 vntb_pid;
+	u16 vntb_vid;
+
+	bool linkup;
+	u32 spad_size;
+
+	enum pci_barno epf_ntb_bar[6];
+
+	struct epf_ntb_ctrl *reg;
+
+	phys_addr_t epf_db_phy;
+	void __iomem *epf_db;
+
+	phys_addr_t vpci_mw_phy[MAX_MW];
+	void __iomem *vpci_mw_addr[MAX_MW];
+
+	struct delayed_work cmd_handler;
+};
+
+#define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb, group)
+#define ntb_ndev(__ntb) container_of(__ntb, struct epf_ntb, ntb)
+
+static struct pci_epf_header epf_ntb_header = {
+	.vendorid	= PCI_ANY_ID,
+	.deviceid	= PCI_ANY_ID,
+	.baseclass_code	= PCI_BASE_CLASS_MEMORY,
+	.interrupt_pin	= PCI_INTERRUPT_INTA,
+};
+
+/**
+ * epf_ntb_link_up() - Raise link_up interrupt to Virtual Host
+ * @ntb: NTB device that facilitates communication between HOST and VHOST
+ * @link_up: true or false indicating Link is UP or Down
+ *
+ * Once NTB function in HOST invoke ntb_link_enable(),
+ * this NTB function driver will trigger a link event to vhost.
+ */
+static int epf_ntb_link_up(struct epf_ntb *ntb, bool link_up)
+{
+	if (link_up)
+		ntb->reg->link_status |= LINK_STATUS_UP;
+	else
+		ntb->reg->link_status &= ~LINK_STATUS_UP;
+
+	ntb_link_event(&ntb->ntb);
+	return 0;
+}
+
+/**
+ * epf_ntb_configure_mw() - Configure the Outbound Address Space for vhost
+ *   to access the memory window of host
+ * @ntb: NTB device that facilitates communication between host and vhost
+ * @mw: Index of the memory window (either 0, 1, 2 or 3)
+ *
+ *                          EP Outbound Window
+ * +--------+              +-----------+
+ * |        |              |           |
+ * |        |              |           |
+ * |        |              |           |
+ * |        |              |           |
+ * |        |              +-----------+
+ * | Virtual|              | Memory Win|
+ * | NTB    | -----------> |           |
+ * | Driver |              |           |
+ * |        |              +-----------+
+ * |        |              |           |
+ * |        |              |           |
+ * +--------+              +-----------+
+ *  VHost                   PCI EP
+ */
+static int epf_ntb_configure_mw(struct epf_ntb *ntb, u32 mw)
+{
+	phys_addr_t phys_addr;
+	u8 func_no, vfunc_no;
+	u64 addr, size;
+	int ret = 0;
+
+	phys_addr = ntb->vpci_mw_phy[mw];
+	addr = ntb->reg->addr;
+	size = ntb->reg->size;
+
+	func_no = ntb->epf->func_no;
+	vfunc_no = ntb->epf->vfunc_no;
+
+	ret = pci_epc_map_addr(ntb->epf->epc, func_no, vfunc_no, phys_addr, addr, size);
+	if (ret)
+		dev_err(&ntb->epf->epc->dev,
+			"Failed to map memory window %d address\n", mw);
+	return ret;
+}
+
+/**
+ * epf_ntb_teardown_mw() - Teardown the configured OB ATU
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ * @mw: Index of the memory window (either 0, 1, 2 or 3)
+ *
+ * Teardown the configured OB ATU configured in epf_ntb_configure_mw() using
+ * pci_epc_unmap_addr()
+ */
+static void epf_ntb_teardown_mw(struct epf_ntb *ntb, u32 mw)
+{
+	pci_epc_unmap_addr(ntb->epf->epc,
+			   ntb->epf->func_no,
+			   ntb->epf->vfunc_no,
+			   ntb->vpci_mw_phy[mw]);
+}
+
+/**
+ * epf_ntb_cmd_handler() - Handle commands provided by the NTB Host
+ * @work: work_struct for the epf_ntb_epc
+ *
+ * Workqueue function that gets invoked for the two epf_ntb_epc
+ * periodically (once every 5ms) to see if it has received any commands
+ * from NTB host. The host can send commands to configure doorbell or
+ * configure memory window or to update link status.
+ */
+static void epf_ntb_cmd_handler(struct work_struct *work)
+{
+	struct epf_ntb_ctrl *ctrl;
+	u32 command, argument;
+	struct epf_ntb *ntb;
+	struct device *dev;
+	int ret;
+	int i;
+
+	ntb = container_of(work, struct epf_ntb, cmd_handler.work);
+
+	for (i = 1; i < ntb->db_count; i++) {
+		if (readl(ntb->epf_db + i * 4)) {
+			if (readl(ntb->epf_db + i * 4))
+				ntb->db |= 1 << (i - 1);
+
+			ntb_db_event(&ntb->ntb, i);
+			writel(0, ntb->epf_db + i * 4);
+		}
+	}
+
+	ctrl = ntb->reg;
+	command = ctrl->command;
+	if (!command)
+		goto reset_handler;
+	argument = ctrl->argument;
+
+	ctrl->command = 0;
+	ctrl->argument = 0;
+
+	ctrl = ntb->reg;
+	dev = &ntb->epf->dev;
+
+	switch (command) {
+	case COMMAND_CONFIGURE_DOORBELL:
+		ctrl->command_status = COMMAND_STATUS_OK;
+		break;
+	case COMMAND_TEARDOWN_DOORBELL:
+		ctrl->command_status = COMMAND_STATUS_OK;
+		break;
+	case COMMAND_CONFIGURE_MW:
+		ret = epf_ntb_configure_mw(ntb, argument);
+		if (ret < 0)
+			ctrl->command_status = COMMAND_STATUS_ERROR;
+		else
+			ctrl->command_status = COMMAND_STATUS_OK;
+		break;
+	case COMMAND_TEARDOWN_MW:
+		epf_ntb_teardown_mw(ntb, argument);
+		ctrl->command_status = COMMAND_STATUS_OK;
+		break;
+	case COMMAND_LINK_UP:
+		ntb->linkup = true;
+		ret = epf_ntb_link_up(ntb, true);
+		if (ret < 0)
+			ctrl->command_status = COMMAND_STATUS_ERROR;
+		else
+			ctrl->command_status = COMMAND_STATUS_OK;
+		goto reset_handler;
+	case COMMAND_LINK_DOWN:
+		ntb->linkup = false;
+		ret = epf_ntb_link_up(ntb, false);
+		if (ret < 0)
+			ctrl->command_status = COMMAND_STATUS_ERROR;
+		else
+			ctrl->command_status = COMMAND_STATUS_OK;
+		break;
+	default:
+		dev_err(dev, "UNKNOWN command: %d\n", command);
+		break;
+	}
+
+reset_handler:
+	queue_delayed_work(kpcintb_workqueue, &ntb->cmd_handler,
+			   msecs_to_jiffies(5));
+}
+
+/**
+ * epf_ntb_config_sspad_bar_clear() - Clear Config + Self scratchpad BAR
+ * @ntb_epc: EPC associated with one of the HOST which holds peer's outbound
+ *	     address.
+ *
+ * Clear BAR0 of EP CONTROLLER 1 which contains the HOST1's config and
+ * self scratchpad region (removes inbound ATU configuration). While BAR0 is
+ * the default self scratchpad BAR, an NTB could have other BARs for self
+ * scratchpad (because of reserved BARs). This function can get the exact BAR
+ * used for self scratchpad from epf_ntb_bar[BAR_CONFIG].
+ *
+ * Please note the self scratchpad region and config region is combined to
+ * a single region and mapped using the same BAR. Also note HOST2's peer
+ * scratchpad is HOST1's self scratchpad.
+ */
+static void epf_ntb_config_sspad_bar_clear(struct epf_ntb *ntb)
+{
+	struct pci_epf_bar *epf_bar;
+	enum pci_barno barno;
+
+	barno = ntb->epf_ntb_bar[BAR_CONFIG];
+	epf_bar = &ntb->epf->bar[barno];
+
+	pci_epc_clear_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
+}
+
+/**
+ * epf_ntb_config_sspad_bar_set() - Set Config + Self scratchpad BAR
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ *
+ * Map BAR0 of EP CONTROLLER 1 which contains the HOST1's config and
+ * self scratchpad region.
+ *
+ * Please note the self scratchpad region and config region is combined to
+ * a single region and mapped using the same BAR.
+ */
+static int epf_ntb_config_sspad_bar_set(struct epf_ntb *ntb)
+{
+	struct pci_epf_bar *epf_bar;
+	enum pci_barno barno;
+	u8 func_no, vfunc_no;
+	struct device *dev;
+	int ret;
+
+	dev = &ntb->epf->dev;
+	func_no = ntb->epf->func_no;
+	vfunc_no = ntb->epf->vfunc_no;
+	barno = ntb->epf_ntb_bar[BAR_CONFIG];
+	epf_bar = &ntb->epf->bar[barno];
+
+	ret = pci_epc_set_bar(ntb->epf->epc, func_no, vfunc_no, epf_bar);
+	if (ret) {
+		dev_err(dev, "inft: Config/Status/SPAD BAR set failed\n");
+		return ret;
+	}
+	return 0;
+}
+
+/**
+ * epf_ntb_config_spad_bar_free() - Free the physical memory associated with
+ *   config + scratchpad region
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ */
+static void epf_ntb_config_spad_bar_free(struct epf_ntb *ntb)
+{
+	enum pci_barno barno;
+
+	barno = ntb->epf_ntb_bar[BAR_CONFIG];
+	pci_epf_free_space(ntb->epf, ntb->reg, barno, 0);
+}
+
+/**
+ * epf_ntb_config_spad_bar_alloc() - Allocate memory for config + scratchpad
+ *   region
+ * @ntb: NTB device that facilitates communication between HOST1 and HOST2
+ *
+ * Allocate the Local Memory mentioned in the above diagram. The size of
+ * CONFIG REGION is sizeof(struct epf_ntb_ctrl) and size of SCRATCHPAD REGION
+ * is obtained from "spad-count" configfs entry.
+ */
+static int epf_ntb_config_spad_bar_alloc(struct epf_ntb *ntb)
+{
+	size_t align;
+	enum pci_barno barno;
+	struct epf_ntb_ctrl *ctrl;
+	u32 spad_size, ctrl_size;
+	u64 size;
+	struct pci_epf *epf = ntb->epf;
+	struct device *dev = &epf->dev;
+	u32 spad_count;
+	void *base;
+	int i;
+	const struct pci_epc_features *epc_features = pci_epc_get_features(epf->epc,
+								epf->func_no,
+								epf->vfunc_no);
+	barno = ntb->epf_ntb_bar[BAR_CONFIG];
+	size = epc_features->bar_fixed_size[barno];
+	align = epc_features->align;
+
+	if ((!IS_ALIGNED(size, align)))
+		return -EINVAL;
+
+	spad_count = ntb->spad_count;
+
+	ctrl_size = sizeof(struct epf_ntb_ctrl);
+	spad_size = 2 * spad_count * 4;
+
+	if (!align) {
+		ctrl_size = roundup_pow_of_two(ctrl_size);
+		spad_size = roundup_pow_of_two(spad_size);
+	} else {
+		ctrl_size = ALIGN(ctrl_size, align);
+		spad_size = ALIGN(spad_size, align);
+	}
+
+	if (!size)
+		size = ctrl_size + spad_size;
+	else if (size < ctrl_size + spad_size)
+		return -EINVAL;
+
+	base = pci_epf_alloc_space(epf, size, barno, align, 0);
+	if (!base) {
+		dev_err(dev, "Config/Status/SPAD alloc region fail\n");
+		return -ENOMEM;
+	}
+
+	ntb->reg = base;
+
+	ctrl = ntb->reg;
+	ctrl->spad_offset = ctrl_size;
+
+	ctrl->spad_count = spad_count;
+	ctrl->num_mws = ntb->num_mws;
+	ntb->spad_size = spad_size;
+
+	ctrl->db_entry_size = 4;
+
+	for (i = 0; i < ntb->db_count; i++) {
+		ntb->reg->db_data[i] = 1 + i;
+		ntb->reg->db_offset[i] = 0;
+	}
+
+	return 0;
+}
+
+/**
+ * epf_ntb_configure_interrupt() - Configure MSI/MSI-X capaiblity
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ *
+ * Configure MSI/MSI-X capability for each interface with number of
+ * interrupts equal to "db_count" configfs entry.
+ */
+static int epf_ntb_configure_interrupt(struct epf_ntb *ntb)
+{
+	const struct pci_epc_features *epc_features;
+	struct device *dev;
+	u32 db_count;
+	int ret;
+
+	dev = &ntb->epf->dev;
+
+	epc_features = pci_epc_get_features(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no);
+
+	if (!(epc_features->msix_capable || epc_features->msi_capable)) {
+		dev_err(dev, "MSI or MSI-X is required for doorbell\n");
+		return -EINVAL;
+	}
+
+	db_count = ntb->db_count;
+	if (db_count > MAX_DB_COUNT) {
+		dev_err(dev, "DB count cannot be more than %d\n", MAX_DB_COUNT);
+		return -EINVAL;
+	}
+
+	ntb->db_count = db_count;
+
+	if (epc_features->msi_capable) {
+		ret = pci_epc_set_msi(ntb->epf->epc,
+				      ntb->epf->func_no,
+				      ntb->epf->vfunc_no,
+				      16);
+		if (ret) {
+			dev_err(dev, "MSI configuration failed\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * epf_ntb_db_bar_init() - Configure Doorbell window BARs
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ */
+static int epf_ntb_db_bar_init(struct epf_ntb *ntb)
+{
+	const struct pci_epc_features *epc_features;
+	u32 align;
+	struct device *dev = &ntb->epf->dev;
+	int ret;
+	struct pci_epf_bar *epf_bar;
+	void __iomem *mw_addr;
+	enum pci_barno barno;
+	size_t size = 4 * ntb->db_count;
+
+	epc_features = pci_epc_get_features(ntb->epf->epc,
+					    ntb->epf->func_no,
+					    ntb->epf->vfunc_no);
+	align = epc_features->align;
+
+	if (size < 128)
+		size = 128;
+
+	if (align)
+		size = ALIGN(size, align);
+	else
+		size = roundup_pow_of_two(size);
+
+	barno = ntb->epf_ntb_bar[BAR_DB];
+
+	mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
+	if (!mw_addr) {
+		dev_err(dev, "Failed to allocate OB address\n");
+		return -ENOMEM;
+	}
+
+	ntb->epf_db = mw_addr;
+
+	epf_bar = &ntb->epf->bar[barno];
+
+	ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
+	if (ret) {
+		dev_err(dev, "Doorbell BAR set failed\n");
+			goto err_alloc_peer_mem;
+	}
+	return ret;
+
+err_alloc_peer_mem:
+	pci_epc_mem_free_addr(ntb->epf->epc, epf_bar->phys_addr, mw_addr, epf_bar->size);
+	return -1;
+}
+
+/**
+ * epf_ntb_db_bar_clear() - Clear doorbell BAR and free memory
+ *   allocated in peer's outbound address space
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ */
+static void epf_ntb_db_bar_clear(struct epf_ntb *ntb)
+{
+	enum pci_barno barno;
+
+	barno = ntb->epf_ntb_bar[BAR_DB];
+	pci_epf_free_space(ntb->epf, ntb->epf_db, barno, 0);
+	pci_epc_clear_bar(ntb->epf->epc,
+			  ntb->epf->func_no,
+			  ntb->epf->vfunc_no,
+			  &ntb->epf->bar[barno]);
+}
+
+/**
+ * epf_ntb_mw_bar_init() - Configure Memory window BARs
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ *
+ */
+static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
+{
+	int ret = 0;
+	int i;
+	u64 size;
+	enum pci_barno barno;
+	struct device *dev = &ntb->epf->dev;
+
+	for (i = 0; i < ntb->num_mws; i++) {
+		size = ntb->mws_size[i];
+		barno = ntb->epf_ntb_bar[BAR_MW0 + i];
+
+		ntb->epf->bar[barno].barno = barno;
+		ntb->epf->bar[barno].size = size;
+		ntb->epf->bar[barno].addr = 0;
+		ntb->epf->bar[barno].phys_addr = 0;
+		ntb->epf->bar[barno].flags |= upper_32_bits(size) ?
+				PCI_BASE_ADDRESS_MEM_TYPE_64 :
+				PCI_BASE_ADDRESS_MEM_TYPE_32;
+
+		ret = pci_epc_set_bar(ntb->epf->epc,
+				      ntb->epf->func_no,
+				      ntb->epf->vfunc_no,
+				      &ntb->epf->bar[barno]);
+		if (ret) {
+			dev_err(dev, "MW set failed\n");
+			goto err_alloc_mem;
+		}
+
+		/* Allocate EPC outbound memory windows to vpci vntb device */
+		ntb->vpci_mw_addr[i] = pci_epc_mem_alloc_addr(ntb->epf->epc,
+							      &ntb->vpci_mw_phy[i],
+							      size);
+		if (!ntb->vpci_mw_addr[i]) {
+			dev_err(dev, "Failed to allocate source address\n");
+			goto err_alloc_mem;
+		}
+	}
+
+	return ret;
+err_alloc_mem:
+	return ret;
+}
+
+/**
+ * epf_ntb_mw_bar_clear() - Clear Memory window BARs
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ */
+static void epf_ntb_mw_bar_clear(struct epf_ntb *ntb)
+{
+	enum pci_barno barno;
+	int i;
+
+	for (i = 0; i < ntb->num_mws; i++) {
+		barno = ntb->epf_ntb_bar[BAR_MW0 + i];
+		pci_epc_clear_bar(ntb->epf->epc,
+				  ntb->epf->func_no,
+				  ntb->epf->vfunc_no,
+				  &ntb->epf->bar[barno]);
+
+		pci_epc_mem_free_addr(ntb->epf->epc,
+				      ntb->vpci_mw_phy[i],
+				      ntb->vpci_mw_addr[i],
+				      ntb->mws_size[i]);
+	}
+}
+
+/**
+ * epf_ntb_epc_destroy() - Cleanup NTB EPC interface
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ *
+ * Wrapper for epf_ntb_epc_destroy_interface() to cleanup all the NTB interfaces
+ */
+static void epf_ntb_epc_destroy(struct epf_ntb *ntb)
+{
+	pci_epc_remove_epf(ntb->epf->epc, ntb->epf, 0);
+	pci_epc_put(ntb->epf->epc);
+}
+
+/**
+ * epf_ntb_init_epc_bar() - Identify BARs to be used for each of the NTB
+ * constructs (scratchpad region, doorbell, memorywindow)
+ * @ntb: NTB device that facilitates communication between HOST and vHOST
+ */
+static int epf_ntb_init_epc_bar(struct epf_ntb *ntb)
+{
+	const struct pci_epc_features *epc_features;
+	enum pci_barno barno;
+	enum epf_ntb_bar bar;
+	struct device *dev;
+	u32 num_mws;
+	int i;
+
+	barno = BAR_0;
+	num_mws = ntb->num_mws;
+	dev = &ntb->epf->dev;
+	epc_features = pci_epc_get_features(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no);
+
+	/* These are required BARs which are mandatory for NTB functionality */
+	for (bar = BAR_CONFIG; bar <= BAR_MW0; bar++, barno++) {
+		barno = pci_epc_get_next_free_bar(epc_features, barno);
+		if (barno < 0) {
+			dev_err(dev, "Fail to get NTB function BAR\n");
+			return barno;
+		}
+		ntb->epf_ntb_bar[bar] = barno;
+	}
+
+	/* These are optional BARs which don't impact NTB functionality */
+	for (bar = BAR_MW1, i = 1; i < num_mws; bar++, barno++, i++) {
+		barno = pci_epc_get_next_free_bar(epc_features, barno);
+		if (barno < 0) {
+			ntb->num_mws = i;
+			dev_dbg(dev, "BAR not available for > MW%d\n", i + 1);
+		}
+		ntb->epf_ntb_bar[bar] = barno;
+	}
+
+	return 0;
+}
+
+/**
+ * epf_ntb_epc_init() - Initialize NTB interface
+ * @ntb: NTB device that facilitates communication between HOST and vHOST2
+ *
+ * Wrapper to initialize a particular EPC interface and start the workqueue
+ * to check for commands from host. This function will write to the
+ * EP controller HW for configuring it.
+ */
+static int epf_ntb_epc_init(struct epf_ntb *ntb)
+{
+	u8 func_no, vfunc_no;
+	struct pci_epc *epc;
+	struct pci_epf *epf;
+	struct device *dev;
+	int ret;
+
+	epf = ntb->epf;
+	dev = &epf->dev;
+	epc = epf->epc;
+	func_no = ntb->epf->func_no;
+	vfunc_no = ntb->epf->vfunc_no;
+
+	ret = epf_ntb_config_sspad_bar_set(ntb);
+	if (ret) {
+		dev_err(dev, "Config/self SPAD BAR init failed");
+		return ret;
+	}
+
+	ret = epf_ntb_configure_interrupt(ntb);
+	if (ret) {
+		dev_err(dev, "Interrupt configuration failed\n");
+		goto err_config_interrupt;
+	}
+
+	ret = epf_ntb_db_bar_init(ntb);
+	if (ret) {
+		dev_err(dev, "DB BAR init failed\n");
+		goto err_db_bar_init;
+	}
+
+	ret = epf_ntb_mw_bar_init(ntb);
+	if (ret) {
+		dev_err(dev, "MW BAR init failed\n");
+		goto err_mw_bar_init;
+	}
+
+	if (vfunc_no <= 1) {
+		ret = pci_epc_write_header(epc, func_no, vfunc_no, epf->header);
+		if (ret) {
+			dev_err(dev, "Configuration header write failed\n");
+			goto err_write_header;
+		}
+	}
+
+	INIT_DELAYED_WORK(&ntb->cmd_handler, epf_ntb_cmd_handler);
+	queue_work(kpcintb_workqueue, &ntb->cmd_handler.work);
+
+	return 0;
+
+err_write_header:
+	epf_ntb_mw_bar_clear(ntb);
+err_mw_bar_init:
+	epf_ntb_db_bar_clear(ntb);
+err_db_bar_init:
+err_config_interrupt:
+	epf_ntb_config_sspad_bar_clear(ntb);
+
+	return ret;
+}
+
+
+/**
+ * epf_ntb_epc_cleanup() - Cleanup all NTB interfaces
+ * @ntb: NTB device that facilitates communication between HOST1 and HOST2
+ *
+ * Wrapper to cleanup all NTB interfaces.
+ */
+static void epf_ntb_epc_cleanup(struct epf_ntb *ntb)
+{
+	epf_ntb_db_bar_clear(ntb);
+	epf_ntb_mw_bar_clear(ntb);
+}
+
+#define EPF_NTB_R(_name)						\
+static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
+				      char *page)			\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+									\
+	return sprintf(page, "%d\n", ntb->_name);			\
+}
+
+#define EPF_NTB_W(_name)						\
+static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
+				       const char *page, size_t len)	\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+	u32 val;							\
+	int ret;							\
+									\
+	ret = kstrtou32(page, 0, &val);					\
+	if (ret)							\
+		return ret;						\
+									\
+	ntb->_name = val;						\
+									\
+	return len;							\
+}
+
+#define EPF_NTB_MW_R(_name)						\
+static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
+				      char *page)			\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+	int win_no;							\
+									\
+	sscanf(#_name, "mw%d", &win_no);				\
+									\
+	return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);	\
+}
+
+#define EPF_NTB_MW_W(_name)						\
+static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
+				       const char *page, size_t len)	\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+	struct device *dev = &ntb->epf->dev;				\
+	int win_no;							\
+	u64 val;							\
+	int ret;							\
+									\
+	ret = kstrtou64(page, 0, &val);					\
+	if (ret)							\
+		return ret;						\
+									\
+	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
+		return -EINVAL;						\
+									\
+	if (ntb->num_mws < win_no) {					\
+		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
+		return -EINVAL;						\
+	}								\
+									\
+	ntb->mws_size[win_no - 1] = val;				\
+									\
+	return len;							\
+}
+
+static ssize_t epf_ntb_num_mws_store(struct config_item *item,
+				     const char *page, size_t len)
+{
+	struct config_group *group = to_config_group(item);
+	struct epf_ntb *ntb = to_epf_ntb(group);
+	u32 val;
+	int ret;
+
+	ret = kstrtou32(page, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val > MAX_MW)
+		return -EINVAL;
+
+	ntb->num_mws = val;
+
+	return len;
+}
+
+EPF_NTB_R(spad_count)
+EPF_NTB_W(spad_count)
+EPF_NTB_R(db_count)
+EPF_NTB_W(db_count)
+EPF_NTB_R(num_mws)
+EPF_NTB_R(vbus_number)
+EPF_NTB_W(vbus_number)
+EPF_NTB_R(vntb_pid)
+EPF_NTB_W(vntb_pid)
+EPF_NTB_R(vntb_vid)
+EPF_NTB_W(vntb_vid)
+EPF_NTB_MW_R(mw1)
+EPF_NTB_MW_W(mw1)
+EPF_NTB_MW_R(mw2)
+EPF_NTB_MW_W(mw2)
+EPF_NTB_MW_R(mw3)
+EPF_NTB_MW_W(mw3)
+EPF_NTB_MW_R(mw4)
+EPF_NTB_MW_W(mw4)
+
+CONFIGFS_ATTR(epf_ntb_, spad_count);
+CONFIGFS_ATTR(epf_ntb_, db_count);
+CONFIGFS_ATTR(epf_ntb_, num_mws);
+CONFIGFS_ATTR(epf_ntb_, mw1);
+CONFIGFS_ATTR(epf_ntb_, mw2);
+CONFIGFS_ATTR(epf_ntb_, mw3);
+CONFIGFS_ATTR(epf_ntb_, mw4);
+CONFIGFS_ATTR(epf_ntb_, vbus_number);
+CONFIGFS_ATTR(epf_ntb_, vntb_pid);
+CONFIGFS_ATTR(epf_ntb_, vntb_vid);
+
+static struct configfs_attribute *epf_ntb_attrs[] = {
+	&epf_ntb_attr_spad_count,
+	&epf_ntb_attr_db_count,
+	&epf_ntb_attr_num_mws,
+	&epf_ntb_attr_mw1,
+	&epf_ntb_attr_mw2,
+	&epf_ntb_attr_mw3,
+	&epf_ntb_attr_mw4,
+	&epf_ntb_attr_vbus_number,
+	&epf_ntb_attr_vntb_pid,
+	&epf_ntb_attr_vntb_vid,
+	NULL,
+};
+
+static const struct config_item_type ntb_group_type = {
+	.ct_attrs	= epf_ntb_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+/**
+ * epf_ntb_add_cfs() - Add configfs directory specific to NTB
+ * @epf: NTB endpoint function device
+ * @group: A pointer to the config_group structure referencing a group of
+ *	   config_items of a specific type that belong to a specific sub-system.
+ *
+ * Add configfs directory specific to NTB. This directory will hold
+ * NTB specific properties like db_count, spad_count, num_mws etc.,
+ */
+static struct config_group *epf_ntb_add_cfs(struct pci_epf *epf,
+					    struct config_group *group)
+{
+	struct epf_ntb *ntb = epf_get_drvdata(epf);
+	struct config_group *ntb_group = &ntb->group;
+	struct device *dev = &epf->dev;
+
+	config_group_init_type_name(ntb_group, dev_name(dev), &ntb_group_type);
+
+	return ntb_group;
+}
+
+/*==== virtual PCI bus driver, which only load virtual NTB PCI driver ====*/
+
+static u32 pci_space[] = {
+	0xffffffff,	/*DeviceID, Vendor ID*/
+	0,		/*Status, Command*/
+	0xffffffff,	/*Class code, subclass, prog if, revision id*/
+	0x40,		/*bist, header type, latency Timer, cache line size*/
+	0,		/*BAR 0*/
+	0,		/*BAR 1*/
+	0,		/*BAR 2*/
+	0,		/*BAR 3*/
+	0,		/*BAR 4*/
+	0,		/*BAR 5*/
+	0,		/*Cardbus cis point*/
+	0,		/*Subsystem ID Subystem vendor id*/
+	0,		/*ROM Base Address*/
+	0,		/*Reserved, Cap. Point*/
+	0,		/*Reserved,*/
+	0,		/*Max Lat, Min Gnt, interrupt pin, interrupt line*/
+};
+
+int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val)
+{
+	if (devfn == 0) {
+		memcpy(val, ((u8 *)pci_space) + where, size);
+		return PCIBIOS_SUCCESSFUL;
+	}
+	return PCIBIOS_DEVICE_NOT_FOUND;
+}
+
+int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val)
+{
+	return 0;
+}
+
+struct pci_ops vpci_ops = {
+	.read = pci_read,
+	.write = pci_write,
+};
+
+static int vpci_scan_bus(void *sysdata)
+{
+	struct pci_bus *vpci_bus;
+	struct epf_ntb *ndev = sysdata;
+
+	vpci_bus = pci_scan_bus(ndev->vbus_number, &vpci_ops, sysdata);
+	if (vpci_bus)
+		pr_err("create pci bus\n");
+
+	pci_bus_add_devices(vpci_bus);
+
+	return 0;
+}
+
+/*==================== Virtual PCIe NTB driver ==========================*/
+
+static int vntb_epf_mw_count(struct ntb_dev *ntb, int pidx)
+{
+	struct epf_ntb *ndev = ntb_ndev(ntb);
+
+	return ndev->num_mws;
+}
+
+static int vntb_epf_spad_count(struct ntb_dev *ntb)
+{
+	return ntb_ndev(ntb)->spad_count;
+}
+
+static int vntb_epf_peer_mw_count(struct ntb_dev *ntb)
+{
+	return ntb_ndev(ntb)->num_mws;
+}
+
+static u64 vntb_epf_db_valid_mask(struct ntb_dev *ntb)
+{
+	return BIT_ULL(ntb_ndev(ntb)->db_count) - 1;
+}
+
+static int vntb_epf_db_set_mask(struct ntb_dev *ntb, u64 db_bits)
+{
+	return 0;
+}
+
+static int vntb_epf_mw_set_trans(struct ntb_dev *ndev, int pidx, int idx,
+		dma_addr_t addr, resource_size_t size)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	struct pci_epf_bar *epf_bar;
+	enum pci_barno barno;
+	int ret;
+	struct device *dev;
+
+	dev = &ntb->ntb.dev;
+	barno = ntb->epf_ntb_bar[BAR_MW0 + idx];
+	epf_bar = &ntb->epf->bar[barno];
+	epf_bar->phys_addr = addr;
+	epf_bar->barno = barno;
+	epf_bar->size = size;
+
+	ret = pci_epc_set_bar(ntb->epf->epc, 0, 0, epf_bar);
+	if (ret) {
+		dev_err(dev, "failure set mw trans\n");
+		return ret;
+	}
+	return 0;
+}
+
+static int vntb_epf_mw_clear_trans(struct ntb_dev *ntb, int pidx, int idx)
+{
+	return 0;
+}
+
+static int vntb_epf_peer_mw_get_addr(struct ntb_dev *ndev, int idx,
+				phys_addr_t *base, resource_size_t *size)
+{
+
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+
+	if (base)
+		*base = ntb->vpci_mw_phy[idx];
+
+	if (size)
+		*size = ntb->mws_size[idx];
+
+	return 0;
+}
+
+static int vntb_epf_link_enable(struct ntb_dev *ntb,
+			enum ntb_speed max_speed,
+			enum ntb_width max_width)
+{
+	return 0;
+}
+
+static u32 vntb_epf_spad_read(struct ntb_dev *ndev, int idx)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	int off = ntb->reg->spad_offset, ct = ntb->reg->spad_count * 4;
+	u32 val;
+	void __iomem *base = ntb->reg;
+
+	val = readl(base + off + ct + idx * 4);
+	return val;
+}
+
+static int vntb_epf_spad_write(struct ntb_dev *ndev, int idx, u32 val)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	struct epf_ntb_ctrl *ctrl = ntb->reg;
+	int off = ctrl->spad_offset, ct = ctrl->spad_count * 4;
+	void __iomem *base = ntb->reg;
+
+	writel(val, base + off + ct + idx * 4);
+	return 0;
+}
+
+static u32 vntb_epf_peer_spad_read(struct ntb_dev *ndev, int pidx, int idx)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	struct epf_ntb_ctrl *ctrl = ntb->reg;
+	int off = ctrl->spad_offset;
+	void __iomem *base = ntb->reg;
+	u32 val;
+
+	val = readl(base + off + idx * 4);
+	return val;
+}
+
+static int vntb_epf_peer_spad_write(struct ntb_dev *ndev, int pidx, int idx, u32 val)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	struct epf_ntb_ctrl *ctrl = ntb->reg;
+	int off = ctrl->spad_offset;
+	void __iomem *base = ntb->reg;
+
+	writel(val, base + off + idx * 4);
+	return 0;
+}
+
+static int vntb_epf_peer_db_set(struct ntb_dev *ndev, u64 db_bits)
+{
+	u32 interrupt_num = ffs(db_bits) + 1;
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+	u8 func_no, vfunc_no;
+	int ret;
+
+	func_no = ntb->epf->func_no;
+	vfunc_no = ntb->epf->vfunc_no;
+
+	ret = pci_epc_raise_irq(ntb->epf->epc,
+				func_no,
+				vfunc_no,
+				PCI_EPC_IRQ_MSI,
+				interrupt_num + 1);
+	if (ret)
+		dev_err(&ntb->ntb.dev, "Failed to raise IRQ\n");
+
+	return ret;
+}
+
+static u64 vntb_epf_db_read(struct ntb_dev *ndev)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+
+	return ntb->db;
+}
+
+static int vntb_epf_mw_get_align(struct ntb_dev *ndev, int pidx, int idx,
+			resource_size_t *addr_align,
+			resource_size_t *size_align,
+			resource_size_t *size_max)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+
+	if (addr_align)
+		*addr_align = SZ_4K;
+
+	if (size_align)
+		*size_align = 1;
+
+	if (size_max)
+		*size_max = ntb->mws_size[idx];
+
+	return 0;
+}
+
+static u64 vntb_epf_link_is_up(struct ntb_dev *ndev,
+			enum ntb_speed *speed,
+			enum ntb_width *width)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+
+	return ntb->reg->link_status;
+}
+
+static int vntb_epf_db_clear_mask(struct ntb_dev *ndev, u64 db_bits)
+{
+	return 0;
+}
+
+static int vntb_epf_db_clear(struct ntb_dev *ndev, u64 db_bits)
+{
+	struct epf_ntb *ntb = ntb_ndev(ndev);
+
+	ntb->db &= ~db_bits;
+	return 0;
+}
+
+static int vntb_epf_link_disable(struct ntb_dev *ntb)
+{
+	return 0;
+}
+
+static const struct ntb_dev_ops vntb_epf_ops = {
+	.mw_count		= vntb_epf_mw_count,
+	.spad_count		= vntb_epf_spad_count,
+	.peer_mw_count		= vntb_epf_peer_mw_count,
+	.db_valid_mask		= vntb_epf_db_valid_mask,
+	.db_set_mask		= vntb_epf_db_set_mask,
+	.mw_set_trans		= vntb_epf_mw_set_trans,
+	.mw_clear_trans		= vntb_epf_mw_clear_trans,
+	.peer_mw_get_addr	= vntb_epf_peer_mw_get_addr,
+	.link_enable		= vntb_epf_link_enable,
+	.spad_read		= vntb_epf_spad_read,
+	.spad_write		= vntb_epf_spad_write,
+	.peer_spad_read		= vntb_epf_peer_spad_read,
+	.peer_spad_write	= vntb_epf_peer_spad_write,
+	.peer_db_set		= vntb_epf_peer_db_set,
+	.db_read		= vntb_epf_db_read,
+	.mw_get_align		= vntb_epf_mw_get_align,
+	.link_is_up		= vntb_epf_link_is_up,
+	.db_clear_mask		= vntb_epf_db_clear_mask,
+	.db_clear		= vntb_epf_db_clear,
+	.link_disable		= vntb_epf_link_disable,
+};
+
+static int pci_vntb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	int ret;
+	struct epf_ntb *ndev = (struct epf_ntb *)pdev->sysdata;
+	struct device *dev = &pdev->dev;
+
+	ndev->ntb.pdev = pdev;
+	ndev->ntb.topo = NTB_TOPO_NONE;
+	ndev->ntb.ops =  &vntb_epf_ops;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+	if (ret) {
+		dev_err(dev, "Cannot set DMA mask\n");
+		return -EINVAL;
+	}
+
+	ret = ntb_register_device(&ndev->ntb);
+	if (ret) {
+		dev_err(dev, "Failed to register NTB device\n");
+		goto err_register_dev;
+	}
+
+	dev_dbg(dev, "PCI Virtual NTB driver loaded\n");
+	return 0;
+
+err_register_dev:
+	return -EINVAL;
+}
+
+static struct pci_device_id pci_vntb_table[] = {
+	{
+		PCI_DEVICE(0xffff, 0xffff),
+	},
+	{},
+};
+
+static struct pci_driver vntb_pci_driver = {
+	.name           = "pci-vntb",
+	.id_table       = pci_vntb_table,
+	.probe          = pci_vntb_probe,
+};
+
+/* ============ PCIe EPF Driver Bind ====================*/
+
+/**
+ * epf_ntb_bind() - Initialize endpoint controller to provide NTB functionality
+ * @epf: NTB endpoint function device
+ *
+ * Initialize both the endpoint controllers associated with NTB function device.
+ * Invoked when a primary interface or secondary interface is bound to EPC
+ * device. This function will succeed only when EPC is bound to both the
+ * interfaces.
+ */
+static int epf_ntb_bind(struct pci_epf *epf)
+{
+	struct epf_ntb *ntb = epf_get_drvdata(epf);
+	struct device *dev = &epf->dev;
+	int ret;
+
+	if (!epf->epc) {
+		dev_dbg(dev, "PRIMARY EPC interface not yet bound\n");
+		return 0;
+	}
+
+	ret = epf_ntb_init_epc_bar(ntb);
+	if (ret) {
+		dev_err(dev, "Failed to create NTB EPC\n");
+		goto err_bar_init;
+	}
+
+	ret = epf_ntb_config_spad_bar_alloc(ntb);
+	if (ret) {
+		dev_err(dev, "Failed to allocate BAR memory\n");
+		goto err_bar_alloc;
+	}
+
+	ret = epf_ntb_epc_init(ntb);
+	if (ret) {
+		dev_err(dev, "Failed to initialize EPC\n");
+		goto err_bar_alloc;
+	}
+
+	epf_set_drvdata(epf, ntb);
+
+	pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
+	pci_vntb_table[0].vendor = ntb->vntb_vid;
+	pci_vntb_table[0].device = ntb->vntb_pid;
+
+	if (pci_register_driver(&vntb_pci_driver)) {
+		dev_err(dev, "failure register vntb pci driver\n");
+		goto err_bar_alloc;
+	}
+
+	vpci_scan_bus(ntb);
+
+	return 0;
+
+err_bar_alloc:
+	epf_ntb_config_spad_bar_free(ntb);
+
+err_bar_init:
+	epf_ntb_epc_destroy(ntb);
+
+	return ret;
+}
+
+/**
+ * epf_ntb_unbind() - Cleanup the initialization from epf_ntb_bind()
+ * @epf: NTB endpoint function device
+ *
+ * Cleanup the initialization from epf_ntb_bind()
+ */
+static void epf_ntb_unbind(struct pci_epf *epf)
+{
+	struct epf_ntb *ntb = epf_get_drvdata(epf);
+
+	epf_ntb_epc_cleanup(ntb);
+	epf_ntb_config_spad_bar_free(ntb);
+	epf_ntb_epc_destroy(ntb);
+
+	pci_unregister_driver(&vntb_pci_driver);
+}
+
+// EPF driver probe
+static struct pci_epf_ops epf_ntb_ops = {
+	.bind   = epf_ntb_bind,
+	.unbind = epf_ntb_unbind,
+	.add_cfs = epf_ntb_add_cfs,
+};
+
+/**
+ * epf_ntb_probe() - Probe NTB function driver
+ * @epf: NTB endpoint function device
+ *
+ * Probe NTB function driver when endpoint function bus detects a NTB
+ * endpoint function.
+ */
+static int epf_ntb_probe(struct pci_epf *epf)
+{
+	struct epf_ntb *ntb;
+	struct device *dev;
+
+	dev = &epf->dev;
+
+	ntb = devm_kzalloc(dev, sizeof(*ntb), GFP_KERNEL);
+	if (!ntb)
+		return -ENOMEM;
+
+	epf->header = &epf_ntb_header;
+	ntb->epf = epf;
+	ntb->vbus_number = 0xff;
+	epf_set_drvdata(epf, ntb);
+
+	dev_info(dev, "pci-ep epf driver loaded\n");
+	return 0;
+}
+
+static const struct pci_epf_device_id epf_ntb_ids[] = {
+	{
+		.name = "pci_epf_vntb",
+	},
+	{},
+};
+
+static struct pci_epf_driver epf_ntb_driver = {
+	.driver.name    = "pci_epf_vntb",
+	.probe          = epf_ntb_probe,
+	.id_table       = epf_ntb_ids,
+	.ops            = &epf_ntb_ops,
+	.owner          = THIS_MODULE,
+};
+
+static int __init epf_ntb_init(void)
+{
+	int ret;
+
+	kpcintb_workqueue = alloc_workqueue("kpcintb", WQ_MEM_RECLAIM |
+					    WQ_HIGHPRI, 0);
+	ret = pci_epf_register_driver(&epf_ntb_driver);
+	if (ret) {
+		destroy_workqueue(kpcintb_workqueue);
+		pr_err("Failed to register pci epf ntb driver --> %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+module_init(epf_ntb_init);
+
+static void __exit epf_ntb_exit(void)
+{
+	pci_epf_unregister_driver(&epf_ntb_driver);
+	destroy_workqueue(kpcintb_workqueue);
+}
+module_exit(epf_ntb_exit);
+
+MODULE_DESCRIPTION("PCI EPF NTB DRIVER");
+MODULE_AUTHOR("Frank Li <Frank.li@nxp.com>");
+MODULE_LICENSE("GPL v2");
-- 
2.24.0.rc1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 4/4] Documentation: PCI: Add specification for the PCI vNTB function device
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
                   ` (2 preceding siblings ...)
  2022-02-22 16:23 ` [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP Frank Li
@ 2022-02-22 16:23 ` Frank Li
  2022-03-10 22:01 ` [PATCH V2 0/4] NTB function for PCIe RC to EP connection Zhi Li
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Frank Li @ 2022-02-22 16:23 UTC (permalink / raw)
  To: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

Add specification for the PCI vNTB function device. The endpoint function
driver and the host PCI driver should be created based on this
specification.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
Change from v1:
 - Add virtual pci ntb driver config

 Documentation/PCI/endpoint/index.rst          |   2 +
 .../PCI/endpoint/pci-vntb-function.rst        | 126 +++++++++++++
 Documentation/PCI/endpoint/pci-vntb-howto.rst | 167 ++++++++++++++++++
 3 files changed, 295 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
 create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
index 38ea1f604b6d3..4d2333e7ae067 100644
--- a/Documentation/PCI/endpoint/index.rst
+++ b/Documentation/PCI/endpoint/index.rst
@@ -13,6 +13,8 @@ PCI Endpoint Framework
    pci-test-howto
    pci-ntb-function
    pci-ntb-howto
+   pci-vntb-function
+   pci-vntb-howto
 
    function/binding/pci-test
    function/binding/pci-ntb
diff --git a/Documentation/PCI/endpoint/pci-vntb-function.rst b/Documentation/PCI/endpoint/pci-vntb-function.rst
new file mode 100644
index 0000000000000..cad8013e88390
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-vntb-function.rst
@@ -0,0 +1,126 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+PCI vNTB Function
+=================
+
+:Author: Frank Li <Frank.Li@nxp.com>
+
+The difference between PCI NTB function and PCI vNTB function is
+
+PCI NTB function need at two endpoint instances and connect HOST1
+and HOST2.
+
+PCI vNTB function only use one host and one endpoint(EP), use NTB
+connect EP and PCI host
+
+.. code-block:: text
+
+
+  +------------+         +---------------------------------------+
+  |            |         |                                       |
+  +------------+         |                        +--------------+
+  | NTB        |         |                        | NTB          |
+  | NetDev     |         |                        | NetDev       |
+  +------------+         |                        +--------------+
+  | NTB        |         |                        | NTB          |
+  | Transfer   |         |                        | Transfer     |
+  +------------+         |                        +--------------+
+  |            |         |                        |              |
+  |  PCI NTB   |         |                        |              |
+  |    EPF     |         |                        |              |
+  |   Driver   |         |                        | PCI Virtual  |
+  |            |         +---------------+        | NTB Driver   |
+  |            |         | PCI EP NTB    |<------>|              |
+  |            |         |  FN Driver    |        |              |
+  +------------+         +---------------+        +--------------+
+  |            |         |               |        |              |
+  |  PCI BUS   | <-----> |  PCI EP BUS   |        |  Virtual PCI |
+  |            |  PCI    |               |        |     BUS      |
+  +------------+         +---------------+--------+--------------+
+      PCI RC                        PCI EP
+
+Constructs used for Implementing vNTB
+=====================================
+
+	1) Config Region
+	2) Self Scratchpad Registers
+	3) Peer Scratchpad Registers
+	4) Doorbell (DB) Registers
+	5) Memory Window (MW)
+
+
+Config Region:
+--------------
+
+It is same as PCI NTB Function driver
+
+Scratchpad Registers:
+---------------------
+
+  It is appended after Config region.
+
+  +--------------------------------------------------+ Base
+  |                                                  |
+  |                                                  |
+  |                                                  |
+  |          Common Config Register                  |
+  |                                                  |
+  |                                                  |
+  |                                                  |
+  +-----------------------+--------------------------+ Base + span_offset
+  |                       |                          |
+  |    Peer Span Space    |    Span Space            |
+  |                       |                          |
+  |                       |                          |
+  +-----------------------+--------------------------+ Base + span_offset
+  |                       |                          |      + span_count * 4
+  |                       |                          |
+  |     Span Space        |   Peer Span Space        |
+  |                       |                          |
+  +-----------------------+--------------------------+
+        Virtual PCI             Pcie Endpoint
+        NTB Driver               NTB Driver
+
+
+Doorbell Registers:
+-------------------
+
+  Doorbell Registers are used by the hosts to interrupt each other.
+
+Memory Window:
+--------------
+
+  Actual transfer of data between the two hosts will happen using the
+  memory window.
+
+Modeling Constructs:
+====================
+
+32-bit BARs.
+
+======  ===============
+BAR NO  CONSTRUCTS USED
+======  ===============
+BAR0    Config Region
+BAR1    Doorbell
+BAR2    Memory Window 1
+BAR3    Memory Window 2
+BAR4    Memory Window 3
+BAR5    Memory Window 4
+======  ===============
+
+64-bit BARs.
+
+======  ===============================
+BAR NO  CONSTRUCTS USED
+======  ===============================
+BAR0    Config Region + Scratchpad
+BAR1
+BAR2    Doorbell
+BAR3
+BAR4    Memory Window 1
+BAR5
+======  ===============================
+
+
diff --git a/Documentation/PCI/endpoint/pci-vntb-howto.rst b/Documentation/PCI/endpoint/pci-vntb-howto.rst
new file mode 100644
index 0000000000000..524cd487e1840
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-vntb-howto.rst
@@ -0,0 +1,167 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+PCI Non-Transparent Bridge (NTB) Endpoint Function (EPF) User Guide
+===================================================================
+
+:Author: Frank Li <Frank.Li@nxp.com>
+
+This document is a guide to help users use pci-epf-vntb function driver
+and ntb_hw_epf host driver for NTB functionality. The list of steps to
+be followed in the host side and EP side is given below. For the hardware
+configuration and internals of NTB using configurable endpoints see
+Documentation/PCI/endpoint/pci-vntb-function.rst
+
+Endpoint Device
+===============
+
+Endpoint Controller Devices
+---------------------------
+
+To find the list of endpoint controller devices in the system::
+
+        # ls /sys/class/pci_epc/
+          5f010000.pcie_ep
+
+If PCI_ENDPOINT_CONFIGFS is enabled::
+
+        # ls /sys/kernel/config/pci_ep/controllers
+          5f010000.pcie_ep
+
+Endpoint Function Drivers
+-------------------------
+
+To find the list of endpoint function drivers in the system::
+
+	# ls /sys/bus/pci-epf/drivers
+	pci_epf_ntb  pci_epf_test  pci_epf_vntb
+
+If PCI_ENDPOINT_CONFIGFS is enabled::
+
+	# ls /sys/kernel/config/pci_ep/functions
+	pci_epf_ntb  pci_epf_test  pci_epf_vntb
+
+
+Creating pci-epf-vntb Device
+----------------------------
+
+PCI endpoint function device can be created using the configfs. To create
+pci-epf-vntb device, the following commands can be used::
+
+	# mount -t configfs none /sys/kernel/config
+	# cd /sys/kernel/config/pci_ep/
+	# mkdir functions/pci_epf_vntb/func1
+
+The "mkdir func1" above creates the pci-epf-ntb function device that will
+be probed by pci_epf_vntb driver.
+
+The PCI endpoint framework populates the directory with the following
+configurable fields::
+
+	# ls functions/pci_epf_ntb/func1
+	baseclass_code    deviceid          msi_interrupts    pci-epf-ntb.0
+	progif_code       secondary         subsys_id         vendorid
+	cache_line_size   interrupt_pin     msix_interrupts   primary
+	revid             subclass_code     subsys_vendor_id
+
+The PCI endpoint function driver populates these entries with default values
+when the device is bound to the driver. The pci-epf-vntb driver populates
+vendorid with 0xffff and interrupt_pin with 0x0001::
+
+	# cat functions/pci_epf_vntb/func1/vendorid
+	0xffff
+	# cat functions/pci_epf_vntb/func1/interrupt_pin
+	0x0001
+
+
+Configuring pci-epf-vntb Device
+-------------------------------
+
+The user can configure the pci-epf-vntb device using its configfs entry. In order
+to change the vendorid and the deviceid, the following
+commands can be used::
+
+	# echo 0x1957 > functions/pci_epf_vntb/func1/vendorid
+	# echo 0x0809 > functions/pci_epf_vntb/func1/deviceid
+
+In order to configure NTB specific attributes, a new sub-directory to func1
+should be created::
+
+	# mkdir functions/pci_epf_vntb/func1/pci_epf_vntb.0/
+
+The NTB function driver will populate this directory with various attributes
+that can be configured by the user::
+
+	# ls functions/pci_epf_vntb/func1/pci_epf_vntb.0/
+	db_count    mw1         mw2         mw3         mw4         num_mws
+	spad_count
+
+A sample configuration for NTB function is given below::
+
+	# echo 4 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_count
+	# echo 128 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/spad_count
+	# echo 1 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
+	# echo 0x100000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
+
+A sample configuration for virtual NTB driver for virutal PCI bus::
+
+	# echo 0x1957 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid
+	# echo 0x080A > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid
+	# echo 0x10 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vbus_number
+
+Binding pci-epf-ntb Device to EP Controller
+--------------------------------------------
+
+NTB function device should be attached to PCI endpoint controllers
+connected to the host.
+
+	# ln -s controllers/5f010000.pcie_ep functions/pci-epf-ntb/func1/primary
+
+Once the above step is completed, the PCI endpoint controllers are ready to
+establish a link with the host.
+
+
+Start the Link
+--------------
+
+In order for the endpoint device to establish a link with the host, the _start_
+field should be populated with '1'. For NTB, both the PCI endpoint controllers
+should establish link with the host (imx8 don't need this steps)::
+
+	# echo 1 > controllers/5f010000.pcie_ep/start
+
+RootComplex Device
+==================
+
+lspci Output at Host side
+------------------------
+
+Note that the devices listed here correspond to the values populated in
+"Creating pci-epf-ntb Device" section above::
+
+	# lspci
+        00:00.0 PCI bridge: Freescale Semiconductor Inc Device 0000 (rev 01)
+        01:00.0 RAM memory: Freescale Semiconductor Inc Device 0809
+
+Endpoint Device / Virtual PCI bus
+=================================
+
+lspci Output at EP Side / Virtual PCI bus
+-----------------------------------------
+
+Note that the devices listed here correspond to the values populated in
+"Creating pci-epf-ntb Device" section above::
+
+        # lspci
+        10:00.0 Unassigned class [ffff]: Dawicontrol Computersysteme GmbH Device 1234 (rev ff)
+
+Using ntb_hw_epf Device
+-----------------------
+
+The host side software follows the standard NTB software architecture in Linux.
+All the existing client side NTB utilities like NTB Transport Client and NTB
+Netdev, NTB Ping Pong Test Client and NTB Tool Test Client can be used with NTB
+function device.
+
+For more information on NTB see
+:doc:`Non-Transparent Bridge <../../driver-api/ntb>`
-- 
2.24.0.rc1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
                   ` (3 preceding siblings ...)
  2022-02-22 16:23 ` [PATCH v2 4/4] Documentation: PCI: Add specification for the PCI vNTB function device Frank Li
@ 2022-03-10 22:01 ` Zhi Li
  2022-03-10 22:07   ` Zhi Li
  2022-04-05 10:34 ` Kishon Vijay Abraham I
  2022-08-12 14:02 ` Jon Mason
  6 siblings, 1 reply; 30+ messages in thread
From: Zhi Li @ 2022-03-10 22:01 UTC (permalink / raw)
  To: Frank Li
  Cc: Bjorn Helgaas, kishon, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci

On Tue, Feb 22, 2022 at 10:24 AM Frank Li <Frank.Li@nxp.com> wrote:
>
> This implement NTB function for PCIe EP to RC connections.
> The existed ntb epf need two PCI EPs and two PCI Host.
>
> This just need EP to RC connections.
>
>     ┌────────────┐         ┌─────────────────────────────────────┐
>     │            │         │                                     │
>     ├────────────┤         │                      ┌──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ NetDev     │         │                      │ NetDev       │
>     ├────────────┤         │                      ├──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ Transfer   │         │                      │ Transfer     │
>     ├────────────┤         │                      ├──────────────┤
>     │            │         │                      │              │
>     │  PCI NTB   │         │                      │              │
>     │    EPF     │         │                      │              │
>     │   Driver   │         │                      │ PCI Virtual  │
>     │            │         ├───────────────┐      │ NTB Driver   │
>     │            │         │ PCI EP NTB    │◄────►│              │
>     │            │         │  FN Driver    │      │              │
>     ├────────────┤         ├───────────────┤      ├──────────────┤
>     │            │         │               │      │              │
>     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
>     │            │  PCI    │               │      │     BUS      │
>     └────────────┘         └───────────────┴──────┴──────────────┘
>         PCI RC                        PCI EP
>
>
>
> Frank Li (4):
>   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
>   NTB: epf: Allow more flexibility in the memory BAR map method
>   PCI: endpoint: Support NTB transfer between RC and EP
>   Documentation: PCI: Add specification for the PCI vNTB function device
>

Ping

>  Documentation/PCI/endpoint/index.rst          |    2 +
>  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
>  drivers/pci/endpoint/functions/Kconfig        |   11 +
>  drivers/pci/endpoint/functions/Makefile       |    1 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
>  8 files changed, 1775 insertions(+), 14 deletions(-)
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
>
> --
> 2.24.0.rc1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-03-10 22:01 ` [PATCH V2 0/4] NTB function for PCIe RC to EP connection Zhi Li
@ 2022-03-10 22:07   ` Zhi Li
  2022-04-04 20:12     ` Zhi Li
  0 siblings, 1 reply; 30+ messages in thread
From: Zhi Li @ 2022-03-10 22:07 UTC (permalink / raw)
  To: Frank Li
  Cc: Bjorn Helgaas, kishon, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci, ntb

On Thu, Mar 10, 2022 at 4:01 PM Zhi Li <lznuaa@gmail.com> wrote:
>
> On Tue, Feb 22, 2022 at 10:24 AM Frank Li <Frank.Li@nxp.com> wrote:
> >
> > This implement NTB function for PCIe EP to RC connections.
> > The existed ntb epf need two PCI EPs and two PCI Host.
> >
> > This just need EP to RC connections.
> >
> >     ┌────────────┐         ┌─────────────────────────────────────┐
> >     │            │         │                                     │
> >     ├────────────┤         │                      ┌──────────────┤
> >     │ NTB        │         │                      │ NTB          │
> >     │ NetDev     │         │                      │ NetDev       │
> >     ├────────────┤         │                      ├──────────────┤
> >     │ NTB        │         │                      │ NTB          │
> >     │ Transfer   │         │                      │ Transfer     │
> >     ├────────────┤         │                      ├──────────────┤
> >     │            │         │                      │              │
> >     │  PCI NTB   │         │                      │              │
> >     │    EPF     │         │                      │              │
> >     │   Driver   │         │                      │ PCI Virtual  │
> >     │            │         ├───────────────┐      │ NTB Driver   │
> >     │            │         │ PCI EP NTB    │◄────►│              │
> >     │            │         │  FN Driver    │      │              │
> >     ├────────────┤         ├───────────────┤      ├──────────────┤
> >     │            │         │               │      │              │
> >     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
> >     │            │  PCI    │               │      │     BUS      │
> >     └────────────┘         └───────────────┴──────┴──────────────┘
> >         PCI RC                        PCI EP
> >
> >
> >
> > Frank Li (4):
> >   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
> >   NTB: epf: Allow more flexibility in the memory BAR map method
> >   PCI: endpoint: Support NTB transfer between RC and EP
> >   Documentation: PCI: Add specification for the PCI vNTB function device
> >
>

Update ntb mail list

>
> >  Documentation/PCI/endpoint/index.rst          |    2 +
> >  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
> >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
> >  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
> >  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
> >  drivers/pci/endpoint/functions/Kconfig        |   11 +
> >  drivers/pci/endpoint/functions/Makefile       |    1 +
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
> >  8 files changed, 1775 insertions(+), 14 deletions(-)
> >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
> >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> >
> > --
> > 2.24.0.rc1
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method
  2022-02-22 16:23 ` [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method Frank Li
@ 2022-03-10 22:08   ` Zhi Li
  0 siblings, 0 replies; 30+ messages in thread
From: Zhi Li @ 2022-03-10 22:08 UTC (permalink / raw)
  To: Frank Li
  Cc: Bjorn Helgaas, kishon, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci, ntb

On Tue, Feb 22, 2022 at 10:24 AM Frank Li <Frank.Li@nxp.com> wrote:
>
> Support the below BAR configuration methods for epf NTB.
>
> BAR 0: config and scratchpad
> BAR 2: doorbell
> BAR 4: memory map windows
>
> Set difference BAR number information into struct ntb_epf_data. So difference
> VID/PID can choose different BAR configurations. There are difference
> BAR map method between epf NTB and epf vNTB Endpoint function.
>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---

Update ntb mail list.

>
> Change from v1:
>  - Improve commit message
>
>  drivers/ntb/hw/epf/ntb_hw_epf.c | 48 ++++++++++++++++++++++++---------
>  1 file changed, 35 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
> index b019755e4e21b..3ece49cb18ffa 100644
> --- a/drivers/ntb/hw/epf/ntb_hw_epf.c
> +++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
> @@ -45,7 +45,6 @@
>
>  #define NTB_EPF_MIN_DB_COUNT   3
>  #define NTB_EPF_MAX_DB_COUNT   31
> -#define NTB_EPF_MW_OFFSET      2
>
>  #define NTB_EPF_COMMAND_TIMEOUT        1000 /* 1 Sec */
>
> @@ -67,6 +66,7 @@ struct ntb_epf_dev {
>         enum pci_barno ctrl_reg_bar;
>         enum pci_barno peer_spad_reg_bar;
>         enum pci_barno db_reg_bar;
> +       enum pci_barno mw_bar;
>
>         unsigned int mw_count;
>         unsigned int spad_count;
> @@ -92,6 +92,8 @@ struct ntb_epf_data {
>         enum pci_barno peer_spad_reg_bar;
>         /* BAR that contains Doorbell region and Memory window '1' */
>         enum pci_barno db_reg_bar;
> +       /* BAR that contains memory windows*/
> +       enum pci_barno mw_bar;
>  };
>
>  static int ntb_epf_send_command(struct ntb_epf_dev *ndev, u32 command,
> @@ -411,7 +413,7 @@ static int ntb_epf_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
>                 return -EINVAL;
>         }
>
> -       bar = idx + NTB_EPF_MW_OFFSET;
> +       bar = idx + ndev->mw_bar;
>
>         mw_size = pci_resource_len(ntb->pdev, bar);
>
> @@ -453,7 +455,7 @@ static int ntb_epf_peer_mw_get_addr(struct ntb_dev *ntb, int idx,
>         if (idx == 0)
>                 offset = readl(ndev->ctrl_reg + NTB_EPF_MW1_OFFSET);
>
> -       bar = idx + NTB_EPF_MW_OFFSET;
> +       bar = idx + ndev->mw_bar;
>
>         if (base)
>                 *base = pci_resource_start(ndev->ntb.pdev, bar) + offset;
> @@ -565,6 +567,7 @@ static int ntb_epf_init_pci(struct ntb_epf_dev *ndev,
>                             struct pci_dev *pdev)
>  {
>         struct device *dev = ndev->dev;
> +       size_t spad_sz, spad_off;
>         int ret;
>
>         pci_set_drvdata(pdev, ndev);
> @@ -599,10 +602,16 @@ static int ntb_epf_init_pci(struct ntb_epf_dev *ndev,
>                 goto err_dma_mask;
>         }
>
> -       ndev->peer_spad_reg = pci_iomap(pdev, ndev->peer_spad_reg_bar, 0);
> -       if (!ndev->peer_spad_reg) {
> -               ret = -EIO;
> -               goto err_dma_mask;
> +       if (ndev->peer_spad_reg_bar) {
> +               ndev->peer_spad_reg = pci_iomap(pdev, ndev->peer_spad_reg_bar, 0);
> +               if (!ndev->peer_spad_reg) {
> +                       ret = -EIO;
> +                       goto err_dma_mask;
> +               }
> +       } else {
> +               spad_sz = 4 * readl(ndev->ctrl_reg + NTB_EPF_SPAD_COUNT);
> +               spad_off = readl(ndev->ctrl_reg + NTB_EPF_SPAD_OFFSET);
> +               ndev->peer_spad_reg = ndev->ctrl_reg + spad_off  + spad_sz;
>         }
>
>         ndev->db_reg = pci_iomap(pdev, ndev->db_reg_bar, 0);
> @@ -657,6 +666,7 @@ static int ntb_epf_pci_probe(struct pci_dev *pdev,
>         enum pci_barno peer_spad_reg_bar = BAR_1;
>         enum pci_barno ctrl_reg_bar = BAR_0;
>         enum pci_barno db_reg_bar = BAR_2;
> +       enum pci_barno mw_bar = BAR_2;
>         struct device *dev = &pdev->dev;
>         struct ntb_epf_data *data;
>         struct ntb_epf_dev *ndev;
> @@ -671,17 +681,16 @@ static int ntb_epf_pci_probe(struct pci_dev *pdev,
>
>         data = (struct ntb_epf_data *)id->driver_data;
>         if (data) {
> -               if (data->peer_spad_reg_bar)
> -                       peer_spad_reg_bar = data->peer_spad_reg_bar;
> -               if (data->ctrl_reg_bar)
> -                       ctrl_reg_bar = data->ctrl_reg_bar;
> -               if (data->db_reg_bar)
> -                       db_reg_bar = data->db_reg_bar;
> +               peer_spad_reg_bar = data->peer_spad_reg_bar;
> +               ctrl_reg_bar = data->ctrl_reg_bar;
> +               db_reg_bar = data->db_reg_bar;
> +               mw_bar = data->mw_bar;
>         }
>
>         ndev->peer_spad_reg_bar = peer_spad_reg_bar;
>         ndev->ctrl_reg_bar = ctrl_reg_bar;
>         ndev->db_reg_bar = db_reg_bar;
> +       ndev->mw_bar = mw_bar;
>         ndev->dev = dev;
>
>         ntb_epf_init_struct(ndev, pdev);
> @@ -729,6 +738,14 @@ static const struct ntb_epf_data j721e_data = {
>         .ctrl_reg_bar = BAR_0,
>         .peer_spad_reg_bar = BAR_1,
>         .db_reg_bar = BAR_2,
> +       .mw_bar = BAR_2,
> +};
> +
> +static const struct ntb_epf_data mx8_data = {
> +       .ctrl_reg_bar = BAR_0,
> +       .peer_spad_reg_bar = BAR_0,
> +       .db_reg_bar = BAR_2,
> +       .mw_bar = BAR_4,
>  };
>
>  static const struct pci_device_id ntb_epf_pci_tbl[] = {
> @@ -737,6 +754,11 @@ static const struct pci_device_id ntb_epf_pci_tbl[] = {
>                 .class = PCI_CLASS_MEMORY_RAM << 8, .class_mask = 0xffff00,
>                 .driver_data = (kernel_ulong_t)&j721e_data,
>         },
> +       {
> +               PCI_DEVICE(PCI_VENDOR_ID_FREESCALE, 0x0809),
> +               .class = PCI_CLASS_MEMORY_RAM << 8, .class_mask = 0xffff00,
> +               .driver_data = (kernel_ulong_t)&mx8_data,
> +       },
>         { },
>  };
>
> --
> 2.24.0.rc1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP
  2022-02-22 16:23 ` [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP Frank Li
@ 2022-03-10 22:09   ` Zhi Li
  2022-12-14  0:08   ` Bjorn Helgaas
  1 sibling, 0 replies; 30+ messages in thread
From: Zhi Li @ 2022-03-10 22:09 UTC (permalink / raw)
  To: Frank Li
  Cc: Bjorn Helgaas, kishon, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci, ntb

On Tue, Feb 22, 2022 at 10:24 AM Frank Li <Frank.Li@nxp.com> wrote:
>
> Add NTB function driver and virtual PCI Bus and Virtual NTB driver
> to implement communication between PCIe Root Port and PCIe EP devices
>
> ┌────────────┐         ┌─────────────────────────────────────┐
> │            │         │                                     │
> ├────────────┤         │                      ┌──────────────┤
> │ NTB        │         │                      │ NTB          │
> │ NetDev     │         │                      │ NetDev       │
> ├────────────┤         │                      ├──────────────┤
> │ NTB        │         │                      │ NTB          │
> │ Transfer   │         │                      │ Transfer     │
> ├────────────┤         │                      ├──────────────┤
> │            │         │                      │              │
> │  PCI NTB   │         │                      │              │
> │    EPF     │         │                      │              │
> │   Driver   │         │                      │ PCI Virtual  │
> │            │         ├───────────────┐      │ NTB Driver   │
> │            │         │ PCI EP NTB    │◄────►│              │
> │            │         │  FN Driver    │      │              │
> ├────────────┤         ├───────────────┤      ├──────────────┤
> │            │         │               │      │              │
> │  PCI Bus   │ ◄─────► │  PCI EP Bus   │      │  Virtual PCI │
> │            │  PCI    │               │      │     Bus      │
> └────────────┘         └───────────────┴──────┴──────────────┘
> PCIe Root Port                        PCI EP
>
> This driver includes 3 parts:
>  1 PCI EP NTB function driver
>  2 Virtual PCI bus
>  3 PCI virtual NTB driver, which is loaded only by above virtual PCI bus
>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---

Update ntb mail list.

> Change from v1:
>  - Fix code style problem according to Bjorn's feedback
>  - Remove hardcode VPCI_BUS_NUM, let user choose free number by configfs
>  - Remove hardcode vid pid for virtual pci ntb driver, change by
>    configfs
>
>  drivers/pci/endpoint/functions/Kconfig        |   11 +
>  drivers/pci/endpoint/functions/Makefile       |    1 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
>  3 files changed, 1436 insertions(+)
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
>
> diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
> index 5f1242ca2f4e4..65217428d17b9 100644
> --- a/drivers/pci/endpoint/functions/Kconfig
> +++ b/drivers/pci/endpoint/functions/Kconfig
> @@ -25,3 +25,14 @@ config PCI_EPF_NTB
>           device tree.
>
>           If in doubt, say "N" to disable Endpoint NTB driver.
> +
> +config PCI_EPF_VNTB
> +        tristate "PCI Endpoint NTB driver"
> +        depends on PCI_ENDPOINT
> +        select CONFIGFS_FS
> +        help
> +          Select this configuration option to enable the Non-Transparent
> +          Bridge (NTB) driver for PCIe Endpoint. NTB driver implements NTB
> +          between PCI Root Port and PCIe Endpoint.
> +
> +          If in doubt, say "N" to disable Endpoint NTB driver.
> diff --git a/drivers/pci/endpoint/functions/Makefile b/drivers/pci/endpoint/functions/Makefile
> index 96ab932a537a2..5c13001deaba1 100644
> --- a/drivers/pci/endpoint/functions/Makefile
> +++ b/drivers/pci/endpoint/functions/Makefile
> @@ -5,3 +5,4 @@
>
>  obj-$(CONFIG_PCI_EPF_TEST)             += pci-epf-test.o
>  obj-$(CONFIG_PCI_EPF_NTB)              += pci-epf-ntb.o
> +obj-$(CONFIG_PCI_EPF_VNTB)             += pci-epf-vntb.o
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> new file mode 100644
> index 0000000000000..1466dd1904175
> --- /dev/null
> +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> @@ -0,0 +1,1424 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Endpoint Function Driver to implement Non-Transparent Bridge functionality
> + * Between PCI RC and EP
> + *
> + * Copyright (C) 2020 Texas Instruments
> + * Copyright (C) 2022 NXP
> + *
> + * Based on pci-epf-ntb.c
> + * Author: Frank Li <Frank.Li@nxp.com>
> + * Author: Kishon Vijay Abraham I <kishon@ti.com>
> + */
> +
> +/**
> + * +------------+         +---------------------------------------+
> + * |            |         |                                       |
> + * +------------+         |                        +--------------+
> + * | NTB        |         |                        | NTB          |
> + * | NetDev     |         |                        | NetDev       |
> + * +------------+         |                        +--------------+
> + * | NTB        |         |                        | NTB          |
> + * | Transfer   |         |                        | Transfer     |
> + * +------------+         |                        +--------------+
> + * |            |         |                        |              |
> + * |  PCI NTB   |         |                        |              |
> + * |    EPF     |         |                        |              |
> + * |   Driver   |         |                        | PCI Virtual  |
> + * |            |         +---------------+        | NTB Driver   |
> + * |            |         | PCI EP NTB    |<------>|              |
> + * |            |         |  FN Driver    |        |              |
> + * +------------+         +---------------+        +--------------+
> + * |            |         |               |        |              |
> + * |  PCI Bus   | <-----> |  PCI EP Bus   |        |  Virtual PCI |
> + * |            |  PCI    |               |        |     Bus      |
> + * +------------+         +---------------+--------+--------------+
> + * PCIe Root Port                        PCI EP
> + */
> +
> +#include <linux/delay.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +
> +#include <linux/pci-epc.h>
> +#include <linux/pci-epf.h>
> +#include <linux/ntb.h>
> +
> +static struct workqueue_struct *kpcintb_workqueue;
> +
> +#define COMMAND_CONFIGURE_DOORBELL     1
> +#define COMMAND_TEARDOWN_DOORBELL      2
> +#define COMMAND_CONFIGURE_MW           3
> +#define COMMAND_TEARDOWN_MW            4
> +#define COMMAND_LINK_UP                        5
> +#define COMMAND_LINK_DOWN              6
> +
> +#define COMMAND_STATUS_OK              1
> +#define COMMAND_STATUS_ERROR           2
> +
> +#define LINK_STATUS_UP                 BIT(0)
> +
> +#define SPAD_COUNT                     64
> +#define DB_COUNT                       4
> +#define NTB_MW_OFFSET                  2
> +#define DB_COUNT_MASK                  GENMASK(15, 0)
> +#define MSIX_ENABLE                    BIT(16)
> +#define MAX_DB_COUNT                   32
> +#define MAX_MW                         4
> +
> +enum epf_ntb_bar {
> +       BAR_CONFIG,
> +       BAR_DB,
> +       BAR_MW0,
> +       BAR_MW1,
> +       BAR_MW2,
> +};
> +
> +/*
> + * +--------------------------------------------------+ Base
> + * |                                                  |
> + * |                                                  |
> + * |                                                  |
> + * |          Common Control Register                 |
> + * |                                                  |
> + * |                                                  |
> + * |                                                  |
> + * +-----------------------+--------------------------+ Base+span_offset
> + * |                       |                          |
> + * |    Peer Span Space    |    Span Space            |
> + * |                       |                          |
> + * |                       |                          |
> + * +-----------------------+--------------------------+ Base+span_offset
> + * |                       |                          |     +span_count * 4
> + * |                       |                          |
> + * |     Span Space        |   Peer Span Space        |
> + * |                       |                          |
> + * +-----------------------+--------------------------+
> + *       Virtual PCI             PCIe Endpoint
> + *       NTB Driver               NTB Driver
> + */
> +struct epf_ntb_ctrl {
> +       u32     command;
> +       u32     argument;
> +       u16     command_status;
> +       u16     link_status;
> +       u32     topology;
> +       u64     addr;
> +       u64     size;
> +       u32     num_mws;
> +       u32     reserved;
> +       u32     spad_offset;
> +       u32     spad_count;
> +       u32     db_entry_size;
> +       u32     db_data[MAX_DB_COUNT];
> +       u32     db_offset[MAX_DB_COUNT];
> +} __packed;
> +
> +struct epf_ntb {
> +       struct ntb_dev ntb;
> +       struct pci_epf *epf;
> +       struct config_group group;
> +
> +       u32 num_mws;
> +       u32 db_count;
> +       u32 spad_count;
> +       u64 mws_size[MAX_MW];
> +       u64 db;
> +       u32 vbus_number;
> +       u16 vntb_pid;
> +       u16 vntb_vid;
> +
> +       bool linkup;
> +       u32 spad_size;
> +
> +       enum pci_barno epf_ntb_bar[6];
> +
> +       struct epf_ntb_ctrl *reg;
> +
> +       phys_addr_t epf_db_phy;
> +       void __iomem *epf_db;
> +
> +       phys_addr_t vpci_mw_phy[MAX_MW];
> +       void __iomem *vpci_mw_addr[MAX_MW];
> +
> +       struct delayed_work cmd_handler;
> +};
> +
> +#define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb, group)
> +#define ntb_ndev(__ntb) container_of(__ntb, struct epf_ntb, ntb)
> +
> +static struct pci_epf_header epf_ntb_header = {
> +       .vendorid       = PCI_ANY_ID,
> +       .deviceid       = PCI_ANY_ID,
> +       .baseclass_code = PCI_BASE_CLASS_MEMORY,
> +       .interrupt_pin  = PCI_INTERRUPT_INTA,
> +};
> +
> +/**
> + * epf_ntb_link_up() - Raise link_up interrupt to Virtual Host
> + * @ntb: NTB device that facilitates communication between HOST and VHOST
> + * @link_up: true or false indicating Link is UP or Down
> + *
> + * Once NTB function in HOST invoke ntb_link_enable(),
> + * this NTB function driver will trigger a link event to vhost.
> + */
> +static int epf_ntb_link_up(struct epf_ntb *ntb, bool link_up)
> +{
> +       if (link_up)
> +               ntb->reg->link_status |= LINK_STATUS_UP;
> +       else
> +               ntb->reg->link_status &= ~LINK_STATUS_UP;
> +
> +       ntb_link_event(&ntb->ntb);
> +       return 0;
> +}
> +
> +/**
> + * epf_ntb_configure_mw() - Configure the Outbound Address Space for vhost
> + *   to access the memory window of host
> + * @ntb: NTB device that facilitates communication between host and vhost
> + * @mw: Index of the memory window (either 0, 1, 2 or 3)
> + *
> + *                          EP Outbound Window
> + * +--------+              +-----------+
> + * |        |              |           |
> + * |        |              |           |
> + * |        |              |           |
> + * |        |              |           |
> + * |        |              +-----------+
> + * | Virtual|              | Memory Win|
> + * | NTB    | -----------> |           |
> + * | Driver |              |           |
> + * |        |              +-----------+
> + * |        |              |           |
> + * |        |              |           |
> + * +--------+              +-----------+
> + *  VHost                   PCI EP
> + */
> +static int epf_ntb_configure_mw(struct epf_ntb *ntb, u32 mw)
> +{
> +       phys_addr_t phys_addr;
> +       u8 func_no, vfunc_no;
> +       u64 addr, size;
> +       int ret = 0;
> +
> +       phys_addr = ntb->vpci_mw_phy[mw];
> +       addr = ntb->reg->addr;
> +       size = ntb->reg->size;
> +
> +       func_no = ntb->epf->func_no;
> +       vfunc_no = ntb->epf->vfunc_no;
> +
> +       ret = pci_epc_map_addr(ntb->epf->epc, func_no, vfunc_no, phys_addr, addr, size);
> +       if (ret)
> +               dev_err(&ntb->epf->epc->dev,
> +                       "Failed to map memory window %d address\n", mw);
> +       return ret;
> +}
> +
> +/**
> + * epf_ntb_teardown_mw() - Teardown the configured OB ATU
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + * @mw: Index of the memory window (either 0, 1, 2 or 3)
> + *
> + * Teardown the configured OB ATU configured in epf_ntb_configure_mw() using
> + * pci_epc_unmap_addr()
> + */
> +static void epf_ntb_teardown_mw(struct epf_ntb *ntb, u32 mw)
> +{
> +       pci_epc_unmap_addr(ntb->epf->epc,
> +                          ntb->epf->func_no,
> +                          ntb->epf->vfunc_no,
> +                          ntb->vpci_mw_phy[mw]);
> +}
> +
> +/**
> + * epf_ntb_cmd_handler() - Handle commands provided by the NTB Host
> + * @work: work_struct for the epf_ntb_epc
> + *
> + * Workqueue function that gets invoked for the two epf_ntb_epc
> + * periodically (once every 5ms) to see if it has received any commands
> + * from NTB host. The host can send commands to configure doorbell or
> + * configure memory window or to update link status.
> + */
> +static void epf_ntb_cmd_handler(struct work_struct *work)
> +{
> +       struct epf_ntb_ctrl *ctrl;
> +       u32 command, argument;
> +       struct epf_ntb *ntb;
> +       struct device *dev;
> +       int ret;
> +       int i;
> +
> +       ntb = container_of(work, struct epf_ntb, cmd_handler.work);
> +
> +       for (i = 1; i < ntb->db_count; i++) {
> +               if (readl(ntb->epf_db + i * 4)) {
> +                       if (readl(ntb->epf_db + i * 4))
> +                               ntb->db |= 1 << (i - 1);
> +
> +                       ntb_db_event(&ntb->ntb, i);
> +                       writel(0, ntb->epf_db + i * 4);
> +               }
> +       }
> +
> +       ctrl = ntb->reg;
> +       command = ctrl->command;
> +       if (!command)
> +               goto reset_handler;
> +       argument = ctrl->argument;
> +
> +       ctrl->command = 0;
> +       ctrl->argument = 0;
> +
> +       ctrl = ntb->reg;
> +       dev = &ntb->epf->dev;
> +
> +       switch (command) {
> +       case COMMAND_CONFIGURE_DOORBELL:
> +               ctrl->command_status = COMMAND_STATUS_OK;
> +               break;
> +       case COMMAND_TEARDOWN_DOORBELL:
> +               ctrl->command_status = COMMAND_STATUS_OK;
> +               break;
> +       case COMMAND_CONFIGURE_MW:
> +               ret = epf_ntb_configure_mw(ntb, argument);
> +               if (ret < 0)
> +                       ctrl->command_status = COMMAND_STATUS_ERROR;
> +               else
> +                       ctrl->command_status = COMMAND_STATUS_OK;
> +               break;
> +       case COMMAND_TEARDOWN_MW:
> +               epf_ntb_teardown_mw(ntb, argument);
> +               ctrl->command_status = COMMAND_STATUS_OK;
> +               break;
> +       case COMMAND_LINK_UP:
> +               ntb->linkup = true;
> +               ret = epf_ntb_link_up(ntb, true);
> +               if (ret < 0)
> +                       ctrl->command_status = COMMAND_STATUS_ERROR;
> +               else
> +                       ctrl->command_status = COMMAND_STATUS_OK;
> +               goto reset_handler;
> +       case COMMAND_LINK_DOWN:
> +               ntb->linkup = false;
> +               ret = epf_ntb_link_up(ntb, false);
> +               if (ret < 0)
> +                       ctrl->command_status = COMMAND_STATUS_ERROR;
> +               else
> +                       ctrl->command_status = COMMAND_STATUS_OK;
> +               break;
> +       default:
> +               dev_err(dev, "UNKNOWN command: %d\n", command);
> +               break;
> +       }
> +
> +reset_handler:
> +       queue_delayed_work(kpcintb_workqueue, &ntb->cmd_handler,
> +                          msecs_to_jiffies(5));
> +}
> +
> +/**
> + * epf_ntb_config_sspad_bar_clear() - Clear Config + Self scratchpad BAR
> + * @ntb_epc: EPC associated with one of the HOST which holds peer's outbound
> + *          address.
> + *
> + * Clear BAR0 of EP CONTROLLER 1 which contains the HOST1's config and
> + * self scratchpad region (removes inbound ATU configuration). While BAR0 is
> + * the default self scratchpad BAR, an NTB could have other BARs for self
> + * scratchpad (because of reserved BARs). This function can get the exact BAR
> + * used for self scratchpad from epf_ntb_bar[BAR_CONFIG].
> + *
> + * Please note the self scratchpad region and config region is combined to
> + * a single region and mapped using the same BAR. Also note HOST2's peer
> + * scratchpad is HOST1's self scratchpad.
> + */
> +static void epf_ntb_config_sspad_bar_clear(struct epf_ntb *ntb)
> +{
> +       struct pci_epf_bar *epf_bar;
> +       enum pci_barno barno;
> +
> +       barno = ntb->epf_ntb_bar[BAR_CONFIG];
> +       epf_bar = &ntb->epf->bar[barno];
> +
> +       pci_epc_clear_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
> +}
> +
> +/**
> + * epf_ntb_config_sspad_bar_set() - Set Config + Self scratchpad BAR
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + *
> + * Map BAR0 of EP CONTROLLER 1 which contains the HOST1's config and
> + * self scratchpad region.
> + *
> + * Please note the self scratchpad region and config region is combined to
> + * a single region and mapped using the same BAR.
> + */
> +static int epf_ntb_config_sspad_bar_set(struct epf_ntb *ntb)
> +{
> +       struct pci_epf_bar *epf_bar;
> +       enum pci_barno barno;
> +       u8 func_no, vfunc_no;
> +       struct device *dev;
> +       int ret;
> +
> +       dev = &ntb->epf->dev;
> +       func_no = ntb->epf->func_no;
> +       vfunc_no = ntb->epf->vfunc_no;
> +       barno = ntb->epf_ntb_bar[BAR_CONFIG];
> +       epf_bar = &ntb->epf->bar[barno];
> +
> +       ret = pci_epc_set_bar(ntb->epf->epc, func_no, vfunc_no, epf_bar);
> +       if (ret) {
> +               dev_err(dev, "inft: Config/Status/SPAD BAR set failed\n");
> +               return ret;
> +       }
> +       return 0;
> +}
> +
> +/**
> + * epf_ntb_config_spad_bar_free() - Free the physical memory associated with
> + *   config + scratchpad region
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + */
> +static void epf_ntb_config_spad_bar_free(struct epf_ntb *ntb)
> +{
> +       enum pci_barno barno;
> +
> +       barno = ntb->epf_ntb_bar[BAR_CONFIG];
> +       pci_epf_free_space(ntb->epf, ntb->reg, barno, 0);
> +}
> +
> +/**
> + * epf_ntb_config_spad_bar_alloc() - Allocate memory for config + scratchpad
> + *   region
> + * @ntb: NTB device that facilitates communication between HOST1 and HOST2
> + *
> + * Allocate the Local Memory mentioned in the above diagram. The size of
> + * CONFIG REGION is sizeof(struct epf_ntb_ctrl) and size of SCRATCHPAD REGION
> + * is obtained from "spad-count" configfs entry.
> + */
> +static int epf_ntb_config_spad_bar_alloc(struct epf_ntb *ntb)
> +{
> +       size_t align;
> +       enum pci_barno barno;
> +       struct epf_ntb_ctrl *ctrl;
> +       u32 spad_size, ctrl_size;
> +       u64 size;
> +       struct pci_epf *epf = ntb->epf;
> +       struct device *dev = &epf->dev;
> +       u32 spad_count;
> +       void *base;
> +       int i;
> +       const struct pci_epc_features *epc_features = pci_epc_get_features(epf->epc,
> +                                                               epf->func_no,
> +                                                               epf->vfunc_no);
> +       barno = ntb->epf_ntb_bar[BAR_CONFIG];
> +       size = epc_features->bar_fixed_size[barno];
> +       align = epc_features->align;
> +
> +       if ((!IS_ALIGNED(size, align)))
> +               return -EINVAL;
> +
> +       spad_count = ntb->spad_count;
> +
> +       ctrl_size = sizeof(struct epf_ntb_ctrl);
> +       spad_size = 2 * spad_count * 4;
> +
> +       if (!align) {
> +               ctrl_size = roundup_pow_of_two(ctrl_size);
> +               spad_size = roundup_pow_of_two(spad_size);
> +       } else {
> +               ctrl_size = ALIGN(ctrl_size, align);
> +               spad_size = ALIGN(spad_size, align);
> +       }
> +
> +       if (!size)
> +               size = ctrl_size + spad_size;
> +       else if (size < ctrl_size + spad_size)
> +               return -EINVAL;
> +
> +       base = pci_epf_alloc_space(epf, size, barno, align, 0);
> +       if (!base) {
> +               dev_err(dev, "Config/Status/SPAD alloc region fail\n");
> +               return -ENOMEM;
> +       }
> +
> +       ntb->reg = base;
> +
> +       ctrl = ntb->reg;
> +       ctrl->spad_offset = ctrl_size;
> +
> +       ctrl->spad_count = spad_count;
> +       ctrl->num_mws = ntb->num_mws;
> +       ntb->spad_size = spad_size;
> +
> +       ctrl->db_entry_size = 4;
> +
> +       for (i = 0; i < ntb->db_count; i++) {
> +               ntb->reg->db_data[i] = 1 + i;
> +               ntb->reg->db_offset[i] = 0;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * epf_ntb_configure_interrupt() - Configure MSI/MSI-X capaiblity
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + *
> + * Configure MSI/MSI-X capability for each interface with number of
> + * interrupts equal to "db_count" configfs entry.
> + */
> +static int epf_ntb_configure_interrupt(struct epf_ntb *ntb)
> +{
> +       const struct pci_epc_features *epc_features;
> +       struct device *dev;
> +       u32 db_count;
> +       int ret;
> +
> +       dev = &ntb->epf->dev;
> +
> +       epc_features = pci_epc_get_features(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no);
> +
> +       if (!(epc_features->msix_capable || epc_features->msi_capable)) {
> +               dev_err(dev, "MSI or MSI-X is required for doorbell\n");
> +               return -EINVAL;
> +       }
> +
> +       db_count = ntb->db_count;
> +       if (db_count > MAX_DB_COUNT) {
> +               dev_err(dev, "DB count cannot be more than %d\n", MAX_DB_COUNT);
> +               return -EINVAL;
> +       }
> +
> +       ntb->db_count = db_count;
> +
> +       if (epc_features->msi_capable) {
> +               ret = pci_epc_set_msi(ntb->epf->epc,
> +                                     ntb->epf->func_no,
> +                                     ntb->epf->vfunc_no,
> +                                     16);
> +               if (ret) {
> +                       dev_err(dev, "MSI configuration failed\n");
> +                       return ret;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * epf_ntb_db_bar_init() - Configure Doorbell window BARs
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + */
> +static int epf_ntb_db_bar_init(struct epf_ntb *ntb)
> +{
> +       const struct pci_epc_features *epc_features;
> +       u32 align;
> +       struct device *dev = &ntb->epf->dev;
> +       int ret;
> +       struct pci_epf_bar *epf_bar;
> +       void __iomem *mw_addr;
> +       enum pci_barno barno;
> +       size_t size = 4 * ntb->db_count;
> +
> +       epc_features = pci_epc_get_features(ntb->epf->epc,
> +                                           ntb->epf->func_no,
> +                                           ntb->epf->vfunc_no);
> +       align = epc_features->align;
> +
> +       if (size < 128)
> +               size = 128;
> +
> +       if (align)
> +               size = ALIGN(size, align);
> +       else
> +               size = roundup_pow_of_two(size);
> +
> +       barno = ntb->epf_ntb_bar[BAR_DB];
> +
> +       mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> +       if (!mw_addr) {
> +               dev_err(dev, "Failed to allocate OB address\n");
> +               return -ENOMEM;
> +       }
> +
> +       ntb->epf_db = mw_addr;
> +
> +       epf_bar = &ntb->epf->bar[barno];
> +
> +       ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no, epf_bar);
> +       if (ret) {
> +               dev_err(dev, "Doorbell BAR set failed\n");
> +                       goto err_alloc_peer_mem;
> +       }
> +       return ret;
> +
> +err_alloc_peer_mem:
> +       pci_epc_mem_free_addr(ntb->epf->epc, epf_bar->phys_addr, mw_addr, epf_bar->size);
> +       return -1;
> +}
> +
> +/**
> + * epf_ntb_db_bar_clear() - Clear doorbell BAR and free memory
> + *   allocated in peer's outbound address space
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + */
> +static void epf_ntb_db_bar_clear(struct epf_ntb *ntb)
> +{
> +       enum pci_barno barno;
> +
> +       barno = ntb->epf_ntb_bar[BAR_DB];
> +       pci_epf_free_space(ntb->epf, ntb->epf_db, barno, 0);
> +       pci_epc_clear_bar(ntb->epf->epc,
> +                         ntb->epf->func_no,
> +                         ntb->epf->vfunc_no,
> +                         &ntb->epf->bar[barno]);
> +}
> +
> +/**
> + * epf_ntb_mw_bar_init() - Configure Memory window BARs
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + *
> + */
> +static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
> +{
> +       int ret = 0;
> +       int i;
> +       u64 size;
> +       enum pci_barno barno;
> +       struct device *dev = &ntb->epf->dev;
> +
> +       for (i = 0; i < ntb->num_mws; i++) {
> +               size = ntb->mws_size[i];
> +               barno = ntb->epf_ntb_bar[BAR_MW0 + i];
> +
> +               ntb->epf->bar[barno].barno = barno;
> +               ntb->epf->bar[barno].size = size;
> +               ntb->epf->bar[barno].addr = 0;
> +               ntb->epf->bar[barno].phys_addr = 0;
> +               ntb->epf->bar[barno].flags |= upper_32_bits(size) ?
> +                               PCI_BASE_ADDRESS_MEM_TYPE_64 :
> +                               PCI_BASE_ADDRESS_MEM_TYPE_32;
> +
> +               ret = pci_epc_set_bar(ntb->epf->epc,
> +                                     ntb->epf->func_no,
> +                                     ntb->epf->vfunc_no,
> +                                     &ntb->epf->bar[barno]);
> +               if (ret) {
> +                       dev_err(dev, "MW set failed\n");
> +                       goto err_alloc_mem;
> +               }
> +
> +               /* Allocate EPC outbound memory windows to vpci vntb device */
> +               ntb->vpci_mw_addr[i] = pci_epc_mem_alloc_addr(ntb->epf->epc,
> +                                                             &ntb->vpci_mw_phy[i],
> +                                                             size);
> +               if (!ntb->vpci_mw_addr[i]) {
> +                       dev_err(dev, "Failed to allocate source address\n");
> +                       goto err_alloc_mem;
> +               }
> +       }
> +
> +       return ret;
> +err_alloc_mem:
> +       return ret;
> +}
> +
> +/**
> + * epf_ntb_mw_bar_clear() - Clear Memory window BARs
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + */
> +static void epf_ntb_mw_bar_clear(struct epf_ntb *ntb)
> +{
> +       enum pci_barno barno;
> +       int i;
> +
> +       for (i = 0; i < ntb->num_mws; i++) {
> +               barno = ntb->epf_ntb_bar[BAR_MW0 + i];
> +               pci_epc_clear_bar(ntb->epf->epc,
> +                                 ntb->epf->func_no,
> +                                 ntb->epf->vfunc_no,
> +                                 &ntb->epf->bar[barno]);
> +
> +               pci_epc_mem_free_addr(ntb->epf->epc,
> +                                     ntb->vpci_mw_phy[i],
> +                                     ntb->vpci_mw_addr[i],
> +                                     ntb->mws_size[i]);
> +       }
> +}
> +
> +/**
> + * epf_ntb_epc_destroy() - Cleanup NTB EPC interface
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + *
> + * Wrapper for epf_ntb_epc_destroy_interface() to cleanup all the NTB interfaces
> + */
> +static void epf_ntb_epc_destroy(struct epf_ntb *ntb)
> +{
> +       pci_epc_remove_epf(ntb->epf->epc, ntb->epf, 0);
> +       pci_epc_put(ntb->epf->epc);
> +}
> +
> +/**
> + * epf_ntb_init_epc_bar() - Identify BARs to be used for each of the NTB
> + * constructs (scratchpad region, doorbell, memorywindow)
> + * @ntb: NTB device that facilitates communication between HOST and vHOST
> + */
> +static int epf_ntb_init_epc_bar(struct epf_ntb *ntb)
> +{
> +       const struct pci_epc_features *epc_features;
> +       enum pci_barno barno;
> +       enum epf_ntb_bar bar;
> +       struct device *dev;
> +       u32 num_mws;
> +       int i;
> +
> +       barno = BAR_0;
> +       num_mws = ntb->num_mws;
> +       dev = &ntb->epf->dev;
> +       epc_features = pci_epc_get_features(ntb->epf->epc, ntb->epf->func_no, ntb->epf->vfunc_no);
> +
> +       /* These are required BARs which are mandatory for NTB functionality */
> +       for (bar = BAR_CONFIG; bar <= BAR_MW0; bar++, barno++) {
> +               barno = pci_epc_get_next_free_bar(epc_features, barno);
> +               if (barno < 0) {
> +                       dev_err(dev, "Fail to get NTB function BAR\n");
> +                       return barno;
> +               }
> +               ntb->epf_ntb_bar[bar] = barno;
> +       }
> +
> +       /* These are optional BARs which don't impact NTB functionality */
> +       for (bar = BAR_MW1, i = 1; i < num_mws; bar++, barno++, i++) {
> +               barno = pci_epc_get_next_free_bar(epc_features, barno);
> +               if (barno < 0) {
> +                       ntb->num_mws = i;
> +                       dev_dbg(dev, "BAR not available for > MW%d\n", i + 1);
> +               }
> +               ntb->epf_ntb_bar[bar] = barno;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * epf_ntb_epc_init() - Initialize NTB interface
> + * @ntb: NTB device that facilitates communication between HOST and vHOST2
> + *
> + * Wrapper to initialize a particular EPC interface and start the workqueue
> + * to check for commands from host. This function will write to the
> + * EP controller HW for configuring it.
> + */
> +static int epf_ntb_epc_init(struct epf_ntb *ntb)
> +{
> +       u8 func_no, vfunc_no;
> +       struct pci_epc *epc;
> +       struct pci_epf *epf;
> +       struct device *dev;
> +       int ret;
> +
> +       epf = ntb->epf;
> +       dev = &epf->dev;
> +       epc = epf->epc;
> +       func_no = ntb->epf->func_no;
> +       vfunc_no = ntb->epf->vfunc_no;
> +
> +       ret = epf_ntb_config_sspad_bar_set(ntb);
> +       if (ret) {
> +               dev_err(dev, "Config/self SPAD BAR init failed");
> +               return ret;
> +       }
> +
> +       ret = epf_ntb_configure_interrupt(ntb);
> +       if (ret) {
> +               dev_err(dev, "Interrupt configuration failed\n");
> +               goto err_config_interrupt;
> +       }
> +
> +       ret = epf_ntb_db_bar_init(ntb);
> +       if (ret) {
> +               dev_err(dev, "DB BAR init failed\n");
> +               goto err_db_bar_init;
> +       }
> +
> +       ret = epf_ntb_mw_bar_init(ntb);
> +       if (ret) {
> +               dev_err(dev, "MW BAR init failed\n");
> +               goto err_mw_bar_init;
> +       }
> +
> +       if (vfunc_no <= 1) {
> +               ret = pci_epc_write_header(epc, func_no, vfunc_no, epf->header);
> +               if (ret) {
> +                       dev_err(dev, "Configuration header write failed\n");
> +                       goto err_write_header;
> +               }
> +       }
> +
> +       INIT_DELAYED_WORK(&ntb->cmd_handler, epf_ntb_cmd_handler);
> +       queue_work(kpcintb_workqueue, &ntb->cmd_handler.work);
> +
> +       return 0;
> +
> +err_write_header:
> +       epf_ntb_mw_bar_clear(ntb);
> +err_mw_bar_init:
> +       epf_ntb_db_bar_clear(ntb);
> +err_db_bar_init:
> +err_config_interrupt:
> +       epf_ntb_config_sspad_bar_clear(ntb);
> +
> +       return ret;
> +}
> +
> +
> +/**
> + * epf_ntb_epc_cleanup() - Cleanup all NTB interfaces
> + * @ntb: NTB device that facilitates communication between HOST1 and HOST2
> + *
> + * Wrapper to cleanup all NTB interfaces.
> + */
> +static void epf_ntb_epc_cleanup(struct epf_ntb *ntb)
> +{
> +       epf_ntb_db_bar_clear(ntb);
> +       epf_ntb_mw_bar_clear(ntb);
> +}
> +
> +#define EPF_NTB_R(_name)                                               \
> +static ssize_t epf_ntb_##_name##_show(struct config_item *item,                \
> +                                     char *page)                       \
> +{                                                                      \
> +       struct config_group *group = to_config_group(item);             \
> +       struct epf_ntb *ntb = to_epf_ntb(group);                        \
> +                                                                       \
> +       return sprintf(page, "%d\n", ntb->_name);                       \
> +}
> +
> +#define EPF_NTB_W(_name)                                               \
> +static ssize_t epf_ntb_##_name##_store(struct config_item *item,       \
> +                                      const char *page, size_t len)    \
> +{                                                                      \
> +       struct config_group *group = to_config_group(item);             \
> +       struct epf_ntb *ntb = to_epf_ntb(group);                        \
> +       u32 val;                                                        \
> +       int ret;                                                        \
> +                                                                       \
> +       ret = kstrtou32(page, 0, &val);                                 \
> +       if (ret)                                                        \
> +               return ret;                                             \
> +                                                                       \
> +       ntb->_name = val;                                               \
> +                                                                       \
> +       return len;                                                     \
> +}
> +
> +#define EPF_NTB_MW_R(_name)                                            \
> +static ssize_t epf_ntb_##_name##_show(struct config_item *item,                \
> +                                     char *page)                       \
> +{                                                                      \
> +       struct config_group *group = to_config_group(item);             \
> +       struct epf_ntb *ntb = to_epf_ntb(group);                        \
> +       int win_no;                                                     \
> +                                                                       \
> +       sscanf(#_name, "mw%d", &win_no);                                \
> +                                                                       \
> +       return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);      \
> +}
> +
> +#define EPF_NTB_MW_W(_name)                                            \
> +static ssize_t epf_ntb_##_name##_store(struct config_item *item,       \
> +                                      const char *page, size_t len)    \
> +{                                                                      \
> +       struct config_group *group = to_config_group(item);             \
> +       struct epf_ntb *ntb = to_epf_ntb(group);                        \
> +       struct device *dev = &ntb->epf->dev;                            \
> +       int win_no;                                                     \
> +       u64 val;                                                        \
> +       int ret;                                                        \
> +                                                                       \
> +       ret = kstrtou64(page, 0, &val);                                 \
> +       if (ret)                                                        \
> +               return ret;                                             \
> +                                                                       \
> +       if (sscanf(#_name, "mw%d", &win_no) != 1)                       \
> +               return -EINVAL;                                         \
> +                                                                       \
> +       if (ntb->num_mws < win_no) {                                    \
> +               dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> +               return -EINVAL;                                         \
> +       }                                                               \
> +                                                                       \
> +       ntb->mws_size[win_no - 1] = val;                                \
> +                                                                       \
> +       return len;                                                     \
> +}
> +
> +static ssize_t epf_ntb_num_mws_store(struct config_item *item,
> +                                    const char *page, size_t len)
> +{
> +       struct config_group *group = to_config_group(item);
> +       struct epf_ntb *ntb = to_epf_ntb(group);
> +       u32 val;
> +       int ret;
> +
> +       ret = kstrtou32(page, 0, &val);
> +       if (ret)
> +               return ret;
> +
> +       if (val > MAX_MW)
> +               return -EINVAL;
> +
> +       ntb->num_mws = val;
> +
> +       return len;
> +}
> +
> +EPF_NTB_R(spad_count)
> +EPF_NTB_W(spad_count)
> +EPF_NTB_R(db_count)
> +EPF_NTB_W(db_count)
> +EPF_NTB_R(num_mws)
> +EPF_NTB_R(vbus_number)
> +EPF_NTB_W(vbus_number)
> +EPF_NTB_R(vntb_pid)
> +EPF_NTB_W(vntb_pid)
> +EPF_NTB_R(vntb_vid)
> +EPF_NTB_W(vntb_vid)
> +EPF_NTB_MW_R(mw1)
> +EPF_NTB_MW_W(mw1)
> +EPF_NTB_MW_R(mw2)
> +EPF_NTB_MW_W(mw2)
> +EPF_NTB_MW_R(mw3)
> +EPF_NTB_MW_W(mw3)
> +EPF_NTB_MW_R(mw4)
> +EPF_NTB_MW_W(mw4)
> +
> +CONFIGFS_ATTR(epf_ntb_, spad_count);
> +CONFIGFS_ATTR(epf_ntb_, db_count);
> +CONFIGFS_ATTR(epf_ntb_, num_mws);
> +CONFIGFS_ATTR(epf_ntb_, mw1);
> +CONFIGFS_ATTR(epf_ntb_, mw2);
> +CONFIGFS_ATTR(epf_ntb_, mw3);
> +CONFIGFS_ATTR(epf_ntb_, mw4);
> +CONFIGFS_ATTR(epf_ntb_, vbus_number);
> +CONFIGFS_ATTR(epf_ntb_, vntb_pid);
> +CONFIGFS_ATTR(epf_ntb_, vntb_vid);
> +
> +static struct configfs_attribute *epf_ntb_attrs[] = {
> +       &epf_ntb_attr_spad_count,
> +       &epf_ntb_attr_db_count,
> +       &epf_ntb_attr_num_mws,
> +       &epf_ntb_attr_mw1,
> +       &epf_ntb_attr_mw2,
> +       &epf_ntb_attr_mw3,
> +       &epf_ntb_attr_mw4,
> +       &epf_ntb_attr_vbus_number,
> +       &epf_ntb_attr_vntb_pid,
> +       &epf_ntb_attr_vntb_vid,
> +       NULL,
> +};
> +
> +static const struct config_item_type ntb_group_type = {
> +       .ct_attrs       = epf_ntb_attrs,
> +       .ct_owner       = THIS_MODULE,
> +};
> +
> +/**
> + * epf_ntb_add_cfs() - Add configfs directory specific to NTB
> + * @epf: NTB endpoint function device
> + * @group: A pointer to the config_group structure referencing a group of
> + *        config_items of a specific type that belong to a specific sub-system.
> + *
> + * Add configfs directory specific to NTB. This directory will hold
> + * NTB specific properties like db_count, spad_count, num_mws etc.,
> + */
> +static struct config_group *epf_ntb_add_cfs(struct pci_epf *epf,
> +                                           struct config_group *group)
> +{
> +       struct epf_ntb *ntb = epf_get_drvdata(epf);
> +       struct config_group *ntb_group = &ntb->group;
> +       struct device *dev = &epf->dev;
> +
> +       config_group_init_type_name(ntb_group, dev_name(dev), &ntb_group_type);
> +
> +       return ntb_group;
> +}
> +
> +/*==== virtual PCI bus driver, which only load virtual NTB PCI driver ====*/
> +
> +static u32 pci_space[] = {
> +       0xffffffff,     /*DeviceID, Vendor ID*/
> +       0,              /*Status, Command*/
> +       0xffffffff,     /*Class code, subclass, prog if, revision id*/
> +       0x40,           /*bist, header type, latency Timer, cache line size*/
> +       0,              /*BAR 0*/
> +       0,              /*BAR 1*/
> +       0,              /*BAR 2*/
> +       0,              /*BAR 3*/
> +       0,              /*BAR 4*/
> +       0,              /*BAR 5*/
> +       0,              /*Cardbus cis point*/
> +       0,              /*Subsystem ID Subystem vendor id*/
> +       0,              /*ROM Base Address*/
> +       0,              /*Reserved, Cap. Point*/
> +       0,              /*Reserved,*/
> +       0,              /*Max Lat, Min Gnt, interrupt pin, interrupt line*/
> +};
> +
> +int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val)
> +{
> +       if (devfn == 0) {
> +               memcpy(val, ((u8 *)pci_space) + where, size);
> +               return PCIBIOS_SUCCESSFUL;
> +       }
> +       return PCIBIOS_DEVICE_NOT_FOUND;
> +}
> +
> +int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val)
> +{
> +       return 0;
> +}
> +
> +struct pci_ops vpci_ops = {
> +       .read = pci_read,
> +       .write = pci_write,
> +};
> +
> +static int vpci_scan_bus(void *sysdata)
> +{
> +       struct pci_bus *vpci_bus;
> +       struct epf_ntb *ndev = sysdata;
> +
> +       vpci_bus = pci_scan_bus(ndev->vbus_number, &vpci_ops, sysdata);
> +       if (vpci_bus)
> +               pr_err("create pci bus\n");
> +
> +       pci_bus_add_devices(vpci_bus);
> +
> +       return 0;
> +}
> +
> +/*==================== Virtual PCIe NTB driver ==========================*/
> +
> +static int vntb_epf_mw_count(struct ntb_dev *ntb, int pidx)
> +{
> +       struct epf_ntb *ndev = ntb_ndev(ntb);
> +
> +       return ndev->num_mws;
> +}
> +
> +static int vntb_epf_spad_count(struct ntb_dev *ntb)
> +{
> +       return ntb_ndev(ntb)->spad_count;
> +}
> +
> +static int vntb_epf_peer_mw_count(struct ntb_dev *ntb)
> +{
> +       return ntb_ndev(ntb)->num_mws;
> +}
> +
> +static u64 vntb_epf_db_valid_mask(struct ntb_dev *ntb)
> +{
> +       return BIT_ULL(ntb_ndev(ntb)->db_count) - 1;
> +}
> +
> +static int vntb_epf_db_set_mask(struct ntb_dev *ntb, u64 db_bits)
> +{
> +       return 0;
> +}
> +
> +static int vntb_epf_mw_set_trans(struct ntb_dev *ndev, int pidx, int idx,
> +               dma_addr_t addr, resource_size_t size)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       struct pci_epf_bar *epf_bar;
> +       enum pci_barno barno;
> +       int ret;
> +       struct device *dev;
> +
> +       dev = &ntb->ntb.dev;
> +       barno = ntb->epf_ntb_bar[BAR_MW0 + idx];
> +       epf_bar = &ntb->epf->bar[barno];
> +       epf_bar->phys_addr = addr;
> +       epf_bar->barno = barno;
> +       epf_bar->size = size;
> +
> +       ret = pci_epc_set_bar(ntb->epf->epc, 0, 0, epf_bar);
> +       if (ret) {
> +               dev_err(dev, "failure set mw trans\n");
> +               return ret;
> +       }
> +       return 0;
> +}
> +
> +static int vntb_epf_mw_clear_trans(struct ntb_dev *ntb, int pidx, int idx)
> +{
> +       return 0;
> +}
> +
> +static int vntb_epf_peer_mw_get_addr(struct ntb_dev *ndev, int idx,
> +                               phys_addr_t *base, resource_size_t *size)
> +{
> +
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +
> +       if (base)
> +               *base = ntb->vpci_mw_phy[idx];
> +
> +       if (size)
> +               *size = ntb->mws_size[idx];
> +
> +       return 0;
> +}
> +
> +static int vntb_epf_link_enable(struct ntb_dev *ntb,
> +                       enum ntb_speed max_speed,
> +                       enum ntb_width max_width)
> +{
> +       return 0;
> +}
> +
> +static u32 vntb_epf_spad_read(struct ntb_dev *ndev, int idx)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       int off = ntb->reg->spad_offset, ct = ntb->reg->spad_count * 4;
> +       u32 val;
> +       void __iomem *base = ntb->reg;
> +
> +       val = readl(base + off + ct + idx * 4);
> +       return val;
> +}
> +
> +static int vntb_epf_spad_write(struct ntb_dev *ndev, int idx, u32 val)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       struct epf_ntb_ctrl *ctrl = ntb->reg;
> +       int off = ctrl->spad_offset, ct = ctrl->spad_count * 4;
> +       void __iomem *base = ntb->reg;
> +
> +       writel(val, base + off + ct + idx * 4);
> +       return 0;
> +}
> +
> +static u32 vntb_epf_peer_spad_read(struct ntb_dev *ndev, int pidx, int idx)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       struct epf_ntb_ctrl *ctrl = ntb->reg;
> +       int off = ctrl->spad_offset;
> +       void __iomem *base = ntb->reg;
> +       u32 val;
> +
> +       val = readl(base + off + idx * 4);
> +       return val;
> +}
> +
> +static int vntb_epf_peer_spad_write(struct ntb_dev *ndev, int pidx, int idx, u32 val)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       struct epf_ntb_ctrl *ctrl = ntb->reg;
> +       int off = ctrl->spad_offset;
> +       void __iomem *base = ntb->reg;
> +
> +       writel(val, base + off + idx * 4);
> +       return 0;
> +}
> +
> +static int vntb_epf_peer_db_set(struct ntb_dev *ndev, u64 db_bits)
> +{
> +       u32 interrupt_num = ffs(db_bits) + 1;
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +       u8 func_no, vfunc_no;
> +       int ret;
> +
> +       func_no = ntb->epf->func_no;
> +       vfunc_no = ntb->epf->vfunc_no;
> +
> +       ret = pci_epc_raise_irq(ntb->epf->epc,
> +                               func_no,
> +                               vfunc_no,
> +                               PCI_EPC_IRQ_MSI,
> +                               interrupt_num + 1);
> +       if (ret)
> +               dev_err(&ntb->ntb.dev, "Failed to raise IRQ\n");
> +
> +       return ret;
> +}
> +
> +static u64 vntb_epf_db_read(struct ntb_dev *ndev)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +
> +       return ntb->db;
> +}
> +
> +static int vntb_epf_mw_get_align(struct ntb_dev *ndev, int pidx, int idx,
> +                       resource_size_t *addr_align,
> +                       resource_size_t *size_align,
> +                       resource_size_t *size_max)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +
> +       if (addr_align)
> +               *addr_align = SZ_4K;
> +
> +       if (size_align)
> +               *size_align = 1;
> +
> +       if (size_max)
> +               *size_max = ntb->mws_size[idx];
> +
> +       return 0;
> +}
> +
> +static u64 vntb_epf_link_is_up(struct ntb_dev *ndev,
> +                       enum ntb_speed *speed,
> +                       enum ntb_width *width)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +
> +       return ntb->reg->link_status;
> +}
> +
> +static int vntb_epf_db_clear_mask(struct ntb_dev *ndev, u64 db_bits)
> +{
> +       return 0;
> +}
> +
> +static int vntb_epf_db_clear(struct ntb_dev *ndev, u64 db_bits)
> +{
> +       struct epf_ntb *ntb = ntb_ndev(ndev);
> +
> +       ntb->db &= ~db_bits;
> +       return 0;
> +}
> +
> +static int vntb_epf_link_disable(struct ntb_dev *ntb)
> +{
> +       return 0;
> +}
> +
> +static const struct ntb_dev_ops vntb_epf_ops = {
> +       .mw_count               = vntb_epf_mw_count,
> +       .spad_count             = vntb_epf_spad_count,
> +       .peer_mw_count          = vntb_epf_peer_mw_count,
> +       .db_valid_mask          = vntb_epf_db_valid_mask,
> +       .db_set_mask            = vntb_epf_db_set_mask,
> +       .mw_set_trans           = vntb_epf_mw_set_trans,
> +       .mw_clear_trans         = vntb_epf_mw_clear_trans,
> +       .peer_mw_get_addr       = vntb_epf_peer_mw_get_addr,
> +       .link_enable            = vntb_epf_link_enable,
> +       .spad_read              = vntb_epf_spad_read,
> +       .spad_write             = vntb_epf_spad_write,
> +       .peer_spad_read         = vntb_epf_peer_spad_read,
> +       .peer_spad_write        = vntb_epf_peer_spad_write,
> +       .peer_db_set            = vntb_epf_peer_db_set,
> +       .db_read                = vntb_epf_db_read,
> +       .mw_get_align           = vntb_epf_mw_get_align,
> +       .link_is_up             = vntb_epf_link_is_up,
> +       .db_clear_mask          = vntb_epf_db_clear_mask,
> +       .db_clear               = vntb_epf_db_clear,
> +       .link_disable           = vntb_epf_link_disable,
> +};
> +
> +static int pci_vntb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> +       int ret;
> +       struct epf_ntb *ndev = (struct epf_ntb *)pdev->sysdata;
> +       struct device *dev = &pdev->dev;
> +
> +       ndev->ntb.pdev = pdev;
> +       ndev->ntb.topo = NTB_TOPO_NONE;
> +       ndev->ntb.ops =  &vntb_epf_ops;
> +
> +       ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> +       if (ret) {
> +               dev_err(dev, "Cannot set DMA mask\n");
> +               return -EINVAL;
> +       }
> +
> +       ret = ntb_register_device(&ndev->ntb);
> +       if (ret) {
> +               dev_err(dev, "Failed to register NTB device\n");
> +               goto err_register_dev;
> +       }
> +
> +       dev_dbg(dev, "PCI Virtual NTB driver loaded\n");
> +       return 0;
> +
> +err_register_dev:
> +       return -EINVAL;
> +}
> +
> +static struct pci_device_id pci_vntb_table[] = {
> +       {
> +               PCI_DEVICE(0xffff, 0xffff),
> +       },
> +       {},
> +};
> +
> +static struct pci_driver vntb_pci_driver = {
> +       .name           = "pci-vntb",
> +       .id_table       = pci_vntb_table,
> +       .probe          = pci_vntb_probe,
> +};
> +
> +/* ============ PCIe EPF Driver Bind ====================*/
> +
> +/**
> + * epf_ntb_bind() - Initialize endpoint controller to provide NTB functionality
> + * @epf: NTB endpoint function device
> + *
> + * Initialize both the endpoint controllers associated with NTB function device.
> + * Invoked when a primary interface or secondary interface is bound to EPC
> + * device. This function will succeed only when EPC is bound to both the
> + * interfaces.
> + */
> +static int epf_ntb_bind(struct pci_epf *epf)
> +{
> +       struct epf_ntb *ntb = epf_get_drvdata(epf);
> +       struct device *dev = &epf->dev;
> +       int ret;
> +
> +       if (!epf->epc) {
> +               dev_dbg(dev, "PRIMARY EPC interface not yet bound\n");
> +               return 0;
> +       }
> +
> +       ret = epf_ntb_init_epc_bar(ntb);
> +       if (ret) {
> +               dev_err(dev, "Failed to create NTB EPC\n");
> +               goto err_bar_init;
> +       }
> +
> +       ret = epf_ntb_config_spad_bar_alloc(ntb);
> +       if (ret) {
> +               dev_err(dev, "Failed to allocate BAR memory\n");
> +               goto err_bar_alloc;
> +       }
> +
> +       ret = epf_ntb_epc_init(ntb);
> +       if (ret) {
> +               dev_err(dev, "Failed to initialize EPC\n");
> +               goto err_bar_alloc;
> +       }
> +
> +       epf_set_drvdata(epf, ntb);
> +
> +       pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
> +       pci_vntb_table[0].vendor = ntb->vntb_vid;
> +       pci_vntb_table[0].device = ntb->vntb_pid;
> +
> +       if (pci_register_driver(&vntb_pci_driver)) {
> +               dev_err(dev, "failure register vntb pci driver\n");
> +               goto err_bar_alloc;
> +       }
> +
> +       vpci_scan_bus(ntb);
> +
> +       return 0;
> +
> +err_bar_alloc:
> +       epf_ntb_config_spad_bar_free(ntb);
> +
> +err_bar_init:
> +       epf_ntb_epc_destroy(ntb);
> +
> +       return ret;
> +}
> +
> +/**
> + * epf_ntb_unbind() - Cleanup the initialization from epf_ntb_bind()
> + * @epf: NTB endpoint function device
> + *
> + * Cleanup the initialization from epf_ntb_bind()
> + */
> +static void epf_ntb_unbind(struct pci_epf *epf)
> +{
> +       struct epf_ntb *ntb = epf_get_drvdata(epf);
> +
> +       epf_ntb_epc_cleanup(ntb);
> +       epf_ntb_config_spad_bar_free(ntb);
> +       epf_ntb_epc_destroy(ntb);
> +
> +       pci_unregister_driver(&vntb_pci_driver);
> +}
> +
> +// EPF driver probe
> +static struct pci_epf_ops epf_ntb_ops = {
> +       .bind   = epf_ntb_bind,
> +       .unbind = epf_ntb_unbind,
> +       .add_cfs = epf_ntb_add_cfs,
> +};
> +
> +/**
> + * epf_ntb_probe() - Probe NTB function driver
> + * @epf: NTB endpoint function device
> + *
> + * Probe NTB function driver when endpoint function bus detects a NTB
> + * endpoint function.
> + */
> +static int epf_ntb_probe(struct pci_epf *epf)
> +{
> +       struct epf_ntb *ntb;
> +       struct device *dev;
> +
> +       dev = &epf->dev;
> +
> +       ntb = devm_kzalloc(dev, sizeof(*ntb), GFP_KERNEL);
> +       if (!ntb)
> +               return -ENOMEM;
> +
> +       epf->header = &epf_ntb_header;
> +       ntb->epf = epf;
> +       ntb->vbus_number = 0xff;
> +       epf_set_drvdata(epf, ntb);
> +
> +       dev_info(dev, "pci-ep epf driver loaded\n");
> +       return 0;
> +}
> +
> +static const struct pci_epf_device_id epf_ntb_ids[] = {
> +       {
> +               .name = "pci_epf_vntb",
> +       },
> +       {},
> +};
> +
> +static struct pci_epf_driver epf_ntb_driver = {
> +       .driver.name    = "pci_epf_vntb",
> +       .probe          = epf_ntb_probe,
> +       .id_table       = epf_ntb_ids,
> +       .ops            = &epf_ntb_ops,
> +       .owner          = THIS_MODULE,
> +};
> +
> +static int __init epf_ntb_init(void)
> +{
> +       int ret;
> +
> +       kpcintb_workqueue = alloc_workqueue("kpcintb", WQ_MEM_RECLAIM |
> +                                           WQ_HIGHPRI, 0);
> +       ret = pci_epf_register_driver(&epf_ntb_driver);
> +       if (ret) {
> +               destroy_workqueue(kpcintb_workqueue);
> +               pr_err("Failed to register pci epf ntb driver --> %d\n", ret);
> +               return ret;
> +       }
> +
> +       return 0;
> +}
> +module_init(epf_ntb_init);
> +
> +static void __exit epf_ntb_exit(void)
> +{
> +       pci_epf_unregister_driver(&epf_ntb_driver);
> +       destroy_workqueue(kpcintb_workqueue);
> +}
> +module_exit(epf_ntb_exit);
> +
> +MODULE_DESCRIPTION("PCI EPF NTB DRIVER");
> +MODULE_AUTHOR("Frank Li <Frank.li@nxp.com>");
> +MODULE_LICENSE("GPL v2");
> --
> 2.24.0.rc1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-03-10 22:07   ` Zhi Li
@ 2022-04-04 20:12     ` Zhi Li
  0 siblings, 0 replies; 30+ messages in thread
From: Zhi Li @ 2022-04-04 20:12 UTC (permalink / raw)
  To: Frank Li
  Cc: Bjorn Helgaas, kishon, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci, ntb

On Thu, Mar 10, 2022 at 4:07 PM Zhi Li <lznuaa@gmail.com> wrote:
>
> On Thu, Mar 10, 2022 at 4:01 PM Zhi Li <lznuaa@gmail.com> wrote:
> >
> > On Tue, Feb 22, 2022 at 10:24 AM Frank Li <Frank.Li@nxp.com> wrote:
> > >
> > > This implement NTB function for PCIe EP to RC connections.
> > > The existed ntb epf need two PCI EPs and two PCI Host.
> > >
> > > This just need EP to RC connections.
> > >
> > >     ┌────────────┐         ┌─────────────────────────────────────┐
> > >     │            │         │                                     │
> > >     ├────────────┤         │                      ┌──────────────┤
> > >     │ NTB        │         │                      │ NTB          │
> > >     │ NetDev     │         │                      │ NetDev       │
> > >     ├────────────┤         │                      ├──────────────┤
> > >     │ NTB        │         │                      │ NTB          │
> > >     │ Transfer   │         │                      │ Transfer     │
> > >     ├────────────┤         │                      ├──────────────┤
> > >     │            │         │                      │              │
> > >     │  PCI NTB   │         │                      │              │
> > >     │    EPF     │         │                      │              │
> > >     │   Driver   │         │                      │ PCI Virtual  │
> > >     │            │         ├───────────────┐      │ NTB Driver   │
> > >     │            │         │ PCI EP NTB    │◄────►│              │
> > >     │            │         │  FN Driver    │      │              │
> > >     ├────────────┤         ├───────────────┤      ├──────────────┤
> > >     │            │         │               │      │              │
> > >     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
> > >     │            │  PCI    │               │      │     BUS      │
> > >     └────────────┘         └───────────────┴──────┴──────────────┘
> > >         PCI RC                        PCI EP
> > >
> > >
> > >
> > > Frank Li (4):
> > >   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
> > >   NTB: epf: Allow more flexibility in the memory BAR map method
> > >   PCI: endpoint: Support NTB transfer between RC and EP
> > >   Documentation: PCI: Add specification for the PCI vNTB function device
> > >
> >
>
> Update ntb mail list

Friendly ping!

Frank Li

>
> >
> > >  Documentation/PCI/endpoint/index.rst          |    2 +
> > >  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
> > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
> > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
> > >  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
> > >  drivers/pci/endpoint/functions/Kconfig        |   11 +
> > >  drivers/pci/endpoint/functions/Makefile       |    1 +
> > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
> > >  8 files changed, 1775 insertions(+), 14 deletions(-)
> > >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
> > >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> > >
> > > --
> > > 2.24.0.rc1
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
                   ` (4 preceding siblings ...)
  2022-03-10 22:01 ` [PATCH V2 0/4] NTB function for PCIe RC to EP connection Zhi Li
@ 2022-04-05 10:34 ` Kishon Vijay Abraham I
  2022-04-05 15:35   ` Zhi Li
  2022-08-12 14:02 ` Jon Mason
  6 siblings, 1 reply; 30+ messages in thread
From: Kishon Vijay Abraham I @ 2022-04-05 10:34 UTC (permalink / raw)
  To: Frank Li, helgaas, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh
  Cc: linux-ntb, linux-pci

Hi Frank Li,

On 22/02/22 9:53 pm, Frank Li wrote:
> This implement NTB function for PCIe EP to RC connections.
> The existed ntb epf need two PCI EPs and two PCI Host.

As I had earlier mentioned in [1], IMHO ideal solution would be build on virtio
layer instead of trying to build on NTB layer (which is specific to RC<->RC
communication).

Are there any specific reasons for not taking that path?

Thanks,
Kishon

[1] -> https://lore.kernel.org/r/459745d1-9fe7-e792-3532-33ee9552bc4d@ti.com
> 
> This just need EP to RC connections.
> 
>     ┌────────────┐         ┌─────────────────────────────────────┐
>     │            │         │                                     │
>     ├────────────┤         │                      ┌──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ NetDev     │         │                      │ NetDev       │
>     ├────────────┤         │                      ├──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ Transfer   │         │                      │ Transfer     │
>     ├────────────┤         │                      ├──────────────┤
>     │            │         │                      │              │
>     │  PCI NTB   │         │                      │              │
>     │    EPF     │         │                      │              │
>     │   Driver   │         │                      │ PCI Virtual  │
>     │            │         ├───────────────┐      │ NTB Driver   │
>     │            │         │ PCI EP NTB    │◄────►│              │
>     │            │         │  FN Driver    │      │              │
>     ├────────────┤         ├───────────────┤      ├──────────────┤
>     │            │         │               │      │              │
>     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
>     │            │  PCI    │               │      │     BUS      │
>     └────────────┘         └───────────────┴──────┴──────────────┘
>         PCI RC                        PCI EP
> 
> 
> 
> Frank Li (4):
>   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
>   NTB: epf: Allow more flexibility in the memory BAR map method
>   PCI: endpoint: Support NTB transfer between RC and EP
>   Documentation: PCI: Add specification for the PCI vNTB function device
> 
>  Documentation/PCI/endpoint/index.rst          |    2 +
>  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
>  drivers/pci/endpoint/functions/Kconfig        |   11 +
>  drivers/pci/endpoint/functions/Makefile       |    1 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
>  8 files changed, 1775 insertions(+), 14 deletions(-)
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-04-05 10:34 ` Kishon Vijay Abraham I
@ 2022-04-05 15:35   ` Zhi Li
  2022-04-20 20:22     ` Zhi Li
  0 siblings, 1 reply; 30+ messages in thread
From: Zhi Li @ 2022-04-05 15:35 UTC (permalink / raw)
  To: Kishon Vijay Abraham I
  Cc: Frank Li, Bjorn Helgaas, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci

On Tue, Apr 5, 2022 at 5:34 AM Kishon Vijay Abraham I <kishon@ti.com> wrote:
>
> Hi Frank Li,
>
> On 22/02/22 9:53 pm, Frank Li wrote:
> > This implement NTB function for PCIe EP to RC connections.
> > The existed ntb epf need two PCI EPs and two PCI Host.
>
> As I had earlier mentioned in [1], IMHO ideal solution would be build on virtio
> layer instead of trying to build on NTB layer (which is specific to RC<->RC
> communication).
>
> Are there any specific reasons for not taking that path?

1. EP side work as vHOST mode.  vHost suppose access all memory of virtual io.
But there are only map windows on the EP side to access RC side
memory. You have to move
map windows for each access.  It is quite low efficiency.

2. So far as I know, virtio is still not DMA yet.  CPU access PCI
can't generate longer PCI TLP,
So the speed is quite slow.  NTB already has DMA support.  If you use
system level DMA,
no change is needed at NTB level.  If we want to use a PCI controller
embedded DMA,  some small
changes need if based on my other Designware PCI eDMA patches, which
are under review.

3. All the major data transfer of NTB is using write.  Because TLP
write needn't wait for complete,  write
performance is better than reading.  On our platform,  write
performance is about 10% better than  read.

Frank

>
> Thanks,
> Kishon
>
> [1] -> https://lore.kernel.org/r/459745d1-9fe7-e792-3532-33ee9552bc4d@ti.com
> >
> > This just need EP to RC connections.
> >
> >     ┌────────────┐         ┌─────────────────────────────────────┐
> >     │            │         │                                     │
> >     ├────────────┤         │                      ┌──────────────┤
> >     │ NTB        │         │                      │ NTB          │
> >     │ NetDev     │         │                      │ NetDev       │
> >     ├────────────┤         │                      ├──────────────┤
> >     │ NTB        │         │                      │ NTB          │
> >     │ Transfer   │         │                      │ Transfer     │
> >     ├────────────┤         │                      ├──────────────┤
> >     │            │         │                      │              │
> >     │  PCI NTB   │         │                      │              │
> >     │    EPF     │         │                      │              │
> >     │   Driver   │         │                      │ PCI Virtual  │
> >     │            │         ├───────────────┐      │ NTB Driver   │
> >     │            │         │ PCI EP NTB    │◄────►│              │
> >     │            │         │  FN Driver    │      │              │
> >     ├────────────┤         ├───────────────┤      ├──────────────┤
> >     │            │         │               │      │              │
> >     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
> >     │            │  PCI    │               │      │     BUS      │
> >     └────────────┘         └───────────────┴──────┴──────────────┘
> >         PCI RC                        PCI EP
> >
> >
> >
> > Frank Li (4):
> >   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
> >   NTB: epf: Allow more flexibility in the memory BAR map method
> >   PCI: endpoint: Support NTB transfer between RC and EP
> >   Documentation: PCI: Add specification for the PCI vNTB function device
> >
> >  Documentation/PCI/endpoint/index.rst          |    2 +
> >  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
> >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
> >  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
> >  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
> >  drivers/pci/endpoint/functions/Kconfig        |   11 +
> >  drivers/pci/endpoint/functions/Makefile       |    1 +
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
> >  8 files changed, 1775 insertions(+), 14 deletions(-)
> >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
> >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-04-05 15:35   ` Zhi Li
@ 2022-04-20 20:22     ` Zhi Li
  2022-04-22 15:15       ` Kishon Vijay Abraham I
  0 siblings, 1 reply; 30+ messages in thread
From: Zhi Li @ 2022-04-20 20:22 UTC (permalink / raw)
  To: Kishon Vijay Abraham I
  Cc: Frank Li, Bjorn Helgaas, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci

On Tue, Apr 5, 2022 at 10:35 AM Zhi Li <lznuaa@gmail.com> wrote:
>
> On Tue, Apr 5, 2022 at 5:34 AM Kishon Vijay Abraham I <kishon@ti.com> wrote:
> >
> > Hi Frank Li,
> >
> > On 22/02/22 9:53 pm, Frank Li wrote:
> > > This implement NTB function for PCIe EP to RC connections.
> > > The existed ntb epf need two PCI EPs and two PCI Host.
> >
> > As I had earlier mentioned in [1], IMHO ideal solution would be build on virtio
> > layer instead of trying to build on NTB layer (which is specific to RC<->RC
> > communication).
> >
> > Are there any specific reasons for not taking that path?
>
> 1. EP side work as vHOST mode.  vHost suppose access all memory of virtual io.
> But there are only map windows on the EP side to access RC side
> memory. You have to move
> map windows for each access.  It is quite low efficiency.
>
> 2. So far as I know, virtio is still not DMA yet.  CPU access PCI
> can't generate longer PCI TLP,
> So the speed is quite slow.  NTB already has DMA support.  If you use
> system level DMA,
> no change is needed at NTB level.  If we want to use a PCI controller
> embedded DMA,  some small
> changes need if based on my other Designware PCI eDMA patches, which
> are under review.
>
> 3. All the major data transfer of NTB is using write.  Because TLP
> write needn't wait for complete,  write
> performance is better than reading.  On our platform,  write
> performance is about 10% better than  read.
>
> Frank

Any Comments or rejection? @Kishon Vijay Abraham I

best regards
Frank Li

>
> >
> > Thanks,
> > Kishon
> >
> > [1] -> https://lore.kernel.org/r/459745d1-9fe7-e792-3532-33ee9552bc4d@ti.com
> > >
> > > This just need EP to RC connections.
> > >
> > >     ┌────────────┐         ┌─────────────────────────────────────┐
> > >     │            │         │                                     │
> > >     ├────────────┤         │                      ┌──────────────┤
> > >     │ NTB        │         │                      │ NTB          │
> > >     │ NetDev     │         │                      │ NetDev       │
> > >     ├────────────┤         │                      ├──────────────┤
> > >     │ NTB        │         │                      │ NTB          │
> > >     │ Transfer   │         │                      │ Transfer     │
> > >     ├────────────┤         │                      ├──────────────┤
> > >     │            │         │                      │              │
> > >     │  PCI NTB   │         │                      │              │
> > >     │    EPF     │         │                      │              │
> > >     │   Driver   │         │                      │ PCI Virtual  │
> > >     │            │         ├───────────────┐      │ NTB Driver   │
> > >     │            │         │ PCI EP NTB    │◄────►│              │
> > >     │            │         │  FN Driver    │      │              │
> > >     ├────────────┤         ├───────────────┤      ├──────────────┤
> > >     │            │         │               │      │              │
> > >     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
> > >     │            │  PCI    │               │      │     BUS      │
> > >     └────────────┘         └───────────────┴──────┴──────────────┘
> > >         PCI RC                        PCI EP
> > >
> > >
> > >
> > > Frank Li (4):
> > >   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
> > >   NTB: epf: Allow more flexibility in the memory BAR map method
> > >   PCI: endpoint: Support NTB transfer between RC and EP
> > >   Documentation: PCI: Add specification for the PCI vNTB function device
> > >
> > >  Documentation/PCI/endpoint/index.rst          |    2 +
> > >  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
> > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
> > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
> > >  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
> > >  drivers/pci/endpoint/functions/Kconfig        |   11 +
> > >  drivers/pci/endpoint/functions/Makefile       |    1 +
> > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
> > >  8 files changed, 1775 insertions(+), 14 deletions(-)
> > >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
> > >  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
> > >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-04-20 20:22     ` Zhi Li
@ 2022-04-22 15:15       ` Kishon Vijay Abraham I
  2022-04-22 15:36         ` Zhi Li
  0 siblings, 1 reply; 30+ messages in thread
From: Kishon Vijay Abraham I @ 2022-04-22 15:15 UTC (permalink / raw)
  To: Zhi Li
  Cc: Frank Li, Bjorn Helgaas, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci

Hi Frank,

On 21/04/22 1:52 am, Zhi Li wrote:
> On Tue, Apr 5, 2022 at 10:35 AM Zhi Li <lznuaa@gmail.com> wrote:
>>
>> On Tue, Apr 5, 2022 at 5:34 AM Kishon Vijay Abraham I <kishon@ti.com> wrote:
>>>
>>> Hi Frank Li,
>>>
>>> On 22/02/22 9:53 pm, Frank Li wrote:
>>>> This implement NTB function for PCIe EP to RC connections.
>>>> The existed ntb epf need two PCI EPs and two PCI Host.
>>>
>>> As I had earlier mentioned in [1], IMHO ideal solution would be build on virtio
>>> layer instead of trying to build on NTB layer (which is specific to RC<->RC
>>> communication).
>>>
>>> Are there any specific reasons for not taking that path?
>>
>> 1. EP side work as vHOST mode.  vHost suppose access all memory of virtual io.
>> But there are only map windows on the EP side to access RC side
>> memory. You have to move
>> map windows for each access.  It is quite low efficiency.

I'm not sure I quite get this. EP HW has limited outbound memory to access RC
memory irrespective of how we implement it. This is not a SW framework
limitation AFAICS.
>>
>> 2. So far as I know, virtio is still not DMA yet.  CPU access PCI
>> can't generate longer PCI TLP,
>> So the speed is quite slow.  NTB already has DMA support.  If you use
>> system level DMA,
>> no change is needed at NTB level.  If we want to use a PCI controller
>> embedded DMA,  some small
>> changes need if based on my other Designware PCI eDMA patches, which
>> are under review.

Adding dmaengine API to do memcopy should be simple to add in vhost/virtio
interface.
>>
>> 3. All the major data transfer of NTB is using write.  Because TLP
>> write needn't wait for complete,  write
>> performance is better than reading.  On our platform,  write
>> performance is about 10% better than  read.
>>
>> Frank
> 
> Any Comments or rejection? @Kishon Vijay Abraham I

I'd strongly recommend going with virtio/vhost based approach and standardizing
it IMO.

Thanks,
Kishon

> 
> best regards
> Frank Li
> 
>>
>>>
>>> Thanks,
>>> Kishon
>>>
>>> [1] -> https://lore.kernel.org/r/459745d1-9fe7-e792-3532-33ee9552bc4d@ti.com
>>>>
>>>> This just need EP to RC connections.
>>>>
>>>>     ┌────────────┐         ┌─────────────────────────────────────┐
>>>>     │            │         │                                     │
>>>>     ├────────────┤         │                      ┌──────────────┤
>>>>     │ NTB        │         │                      │ NTB          │
>>>>     │ NetDev     │         │                      │ NetDev       │
>>>>     ├────────────┤         │                      ├──────────────┤
>>>>     │ NTB        │         │                      │ NTB          │
>>>>     │ Transfer   │         │                      │ Transfer     │
>>>>     ├────────────┤         │                      ├──────────────┤
>>>>     │            │         │                      │              │
>>>>     │  PCI NTB   │         │                      │              │
>>>>     │    EPF     │         │                      │              │
>>>>     │   Driver   │         │                      │ PCI Virtual  │
>>>>     │            │         ├───────────────┐      │ NTB Driver   │
>>>>     │            │         │ PCI EP NTB    │◄────►│              │
>>>>     │            │         │  FN Driver    │      │              │
>>>>     ├────────────┤         ├───────────────┤      ├──────────────┤
>>>>     │            │         │               │      │              │
>>>>     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
>>>>     │            │  PCI    │               │      │     BUS      │
>>>>     └────────────┘         └───────────────┴──────┴──────────────┘
>>>>         PCI RC                        PCI EP
>>>>
>>>>
>>>>
>>>> Frank Li (4):
>>>>   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
>>>>   NTB: epf: Allow more flexibility in the memory BAR map method
>>>>   PCI: endpoint: Support NTB transfer between RC and EP
>>>>   Documentation: PCI: Add specification for the PCI vNTB function device
>>>>
>>>>  Documentation/PCI/endpoint/index.rst          |    2 +
>>>>  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
>>>>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
>>>>  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
>>>>  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
>>>>  drivers/pci/endpoint/functions/Kconfig        |   11 +
>>>>  drivers/pci/endpoint/functions/Makefile       |    1 +
>>>>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
>>>>  8 files changed, 1775 insertions(+), 14 deletions(-)
>>>>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
>>>>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
>>>>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
>>>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-04-22 15:15       ` Kishon Vijay Abraham I
@ 2022-04-22 15:36         ` Zhi Li
  0 siblings, 0 replies; 30+ messages in thread
From: Zhi Li @ 2022-04-22 15:36 UTC (permalink / raw)
  To: Kishon Vijay Abraham I
  Cc: Frank Li, Bjorn Helgaas, lorenzo.pieralisi, kw, Jingoo Han,
	gustavo.pimentel, hongxing.zhu, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, linux-pci

On Fri, Apr 22, 2022 at 10:15 AM Kishon Vijay Abraham I <kishon@ti.com> wrote:
>
> Hi Frank,
>
> On 21/04/22 1:52 am, Zhi Li wrote:
> > On Tue, Apr 5, 2022 at 10:35 AM Zhi Li <lznuaa@gmail.com> wrote:
> >>
> >> On Tue, Apr 5, 2022 at 5:34 AM Kishon Vijay Abraham I <kishon@ti.com> wrote:
> >>>
> >>> Hi Frank Li,
> >>>
> >>> On 22/02/22 9:53 pm, Frank Li wrote:
> >>>> This implement NTB function for PCIe EP to RC connections.
> >>>> The existed ntb epf need two PCI EPs and two PCI Host.
> >>>
> >>> As I had earlier mentioned in [1], IMHO ideal solution would be build on virtio
> >>> layer instead of trying to build on NTB layer (which is specific to RC<->RC
> >>> communication).
> >>>
> >>> Are there any specific reasons for not taking that path?
> >>
> >> 1. EP side work as vHOST mode.  vHost suppose access all memory of virtual io.
> >> But there are only map windows on the EP side to access RC side
> >> memory. You have to move
> >> map windows for each access.  It is quite low efficiency.
>
> I'm not sure I quite get this. EP HW has limited outbound memory to access RC
> memory irrespective of how we implement it. This is not a SW framework
> limitation AFAICS.

Almost all EP HW have limited outbound memory windows to access RC.
We face transfer efficiency problems if we stick into vhost.

> >>
> >> 2. So far as I know, virtio is still not DMA yet.  CPU access PCI
> >> can't generate longer PCI TLP,
> >> So the speed is quite slow.  NTB already has DMA support.  If you use
> >> system level DMA,
> >> no change is needed at NTB level.  If we want to use a PCI controller
> >> embedded DMA,  some small
> >> changes need if based on my other Designware PCI eDMA patches, which
> >> are under review.
>
> Adding dmaengine API to do memcopy should be simple to add in vhost/virtio
> interface.
> >>
> >> 3. All the major data transfer of NTB is using write.  Because TLP
> >> write needn't wait for complete,  write
> >> performance is better than reading.  On our platform,  write
> >> performance is about 10% better than  read.
> >>
> >> Frank
> >
> > Any Comments or rejection? @Kishon Vijay Abraham I
>
> I'd strongly recommend going with virtio/vhost based approach and standardizing
> it IMO.

But No progress in recent years on this path.  At least, my patches make PCIe EP
work as enet with minimized change.
And NTB don't conflict with virtio/vhost solution.

Frank
>
> Thanks,
> Kishon
>
> >
> > best regards
> > Frank Li
> >
> >>
> >>>
> >>> Thanks,
> >>> Kishon
> >>>
> >>> [1] -> https://lore.kernel.org/r/459745d1-9fe7-e792-3532-33ee9552bc4d@ti.com
> >>>>
> >>>> This just need EP to RC connections.
> >>>>
> >>>>     ┌────────────┐         ┌─────────────────────────────────────┐
> >>>>     │            │         │                                     │
> >>>>     ├────────────┤         │                      ┌──────────────┤
> >>>>     │ NTB        │         │                      │ NTB          │
> >>>>     │ NetDev     │         │                      │ NetDev       │
> >>>>     ├────────────┤         │                      ├──────────────┤
> >>>>     │ NTB        │         │                      │ NTB          │
> >>>>     │ Transfer   │         │                      │ Transfer     │
> >>>>     ├────────────┤         │                      ├──────────────┤
> >>>>     │            │         │                      │              │
> >>>>     │  PCI NTB   │         │                      │              │
> >>>>     │    EPF     │         │                      │              │
> >>>>     │   Driver   │         │                      │ PCI Virtual  │
> >>>>     │            │         ├───────────────┐      │ NTB Driver   │
> >>>>     │            │         │ PCI EP NTB    │◄────►│              │
> >>>>     │            │         │  FN Driver    │      │              │
> >>>>     ├────────────┤         ├───────────────┤      ├──────────────┤
> >>>>     │            │         │               │      │              │
> >>>>     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
> >>>>     │            │  PCI    │               │      │     BUS      │
> >>>>     └────────────┘         └───────────────┴──────┴──────────────┘
> >>>>         PCI RC                        PCI EP
> >>>>
> >>>>
> >>>>
> >>>> Frank Li (4):
> >>>>   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
> >>>>   NTB: epf: Allow more flexibility in the memory BAR map method
> >>>>   PCI: endpoint: Support NTB transfer between RC and EP
> >>>>   Documentation: PCI: Add specification for the PCI vNTB function device
> >>>>
> >>>>  Documentation/PCI/endpoint/index.rst          |    2 +
> >>>>  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
> >>>>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
> >>>>  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
> >>>>  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
> >>>>  drivers/pci/endpoint/functions/Kconfig        |   11 +
> >>>>  drivers/pci/endpoint/functions/Makefile       |    1 +
> >>>>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
> >>>>  8 files changed, 1775 insertions(+), 14 deletions(-)
> >>>>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
> >>>>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
> >>>>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> >>>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V2 0/4] NTB function for PCIe RC to EP connection
  2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
                   ` (5 preceding siblings ...)
  2022-04-05 10:34 ` Kishon Vijay Abraham I
@ 2022-08-12 14:02 ` Jon Mason
  6 siblings, 0 replies; 30+ messages in thread
From: Jon Mason @ 2022-08-12 14:02 UTC (permalink / raw)
  To: Frank Li
  Cc: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, dave.jiang, allenbh,
	linux-ntb, linux-pci

On Tue, Feb 22, 2022 at 10:23:51AM -0600, Frank Li wrote:
> This implement NTB function for PCIe EP to RC connections.
> The existed ntb epf need two PCI EPs and two PCI Host.
> 
> This just need EP to RC connections.
> 
>     ┌────────────┐         ┌─────────────────────────────────────┐
>     │            │         │                                     │
>     ├────────────┤         │                      ┌──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ NetDev     │         │                      │ NetDev       │
>     ├────────────┤         │                      ├──────────────┤
>     │ NTB        │         │                      │ NTB          │
>     │ Transfer   │         │                      │ Transfer     │
>     ├────────────┤         │                      ├──────────────┤
>     │            │         │                      │              │
>     │  PCI NTB   │         │                      │              │
>     │    EPF     │         │                      │              │
>     │   Driver   │         │                      │ PCI Virtual  │
>     │            │         ├───────────────┐      │ NTB Driver   │
>     │            │         │ PCI EP NTB    │◄────►│              │
>     │            │         │  FN Driver    │      │              │
>     ├────────────┤         ├───────────────┤      ├──────────────┤
>     │            │         │               │      │              │
>     │  PCI BUS   │ ◄─────► │  PCI EP BUS   │      │  Virtual PCI │
>     │            │  PCI    │               │      │     BUS      │
>     └────────────┘         └───────────────┴──────┴──────────────┘
>         PCI RC                        PCI EP
> 
> 
> 
> Frank Li (4):
>   PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
>   NTB: epf: Allow more flexibility in the memory BAR map method
>   PCI: endpoint: Support NTB transfer between RC and EP
>   Documentation: PCI: Add specification for the PCI vNTB function device
> 
>  Documentation/PCI/endpoint/index.rst          |    2 +
>  .../PCI/endpoint/pci-vntb-function.rst        |  126 ++
>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  167 ++
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |   48 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   |   10 +-
>  drivers/pci/endpoint/functions/Kconfig        |   11 +
>  drivers/pci/endpoint/functions/Makefile       |    1 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 1424 +++++++++++++++++
>  8 files changed, 1775 insertions(+), 14 deletions(-)
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-function.rst
>  create mode 100644 Documentation/PCI/endpoint/pci-vntb-howto.rst
>  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vntb.c
> 
> -- 
> 2.24.0.rc1
> 

Sorry for the extremely long delay in response.  This series has been
in my ntb-next branch for some time and will be in my pull request for
v5.20 which should be going out later today.

Thanks,
Jon

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP
  2022-02-22 16:23 ` [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP Frank Li
  2022-03-10 22:09   ` Zhi Li
@ 2022-12-14  0:08   ` Bjorn Helgaas
       [not found]     ` <CAHrpEqSGySHDET3YPu3czzoMBmCRJsgGgU4s3GWWbtruFLVHaA@mail.gmail.com>
  1 sibling, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2022-12-14  0:08 UTC (permalink / raw)
  To: Frank Li
  Cc: kishon, lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel,
	lznuaa, hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb,
	linux-pci

On Tue, Feb 22, 2022 at 10:23:54AM -0600, Frank Li wrote:

> + * +--------------------------------------------------+ Base
> + * |                                                  |
> + * |                                                  |
> + * |                                                  |
> + * |          Common Control Register                 |
> + * |                                                  |
> + * |                                                  |
> + * |                                                  |
> + * +-----------------------+--------------------------+ Base+span_offset
> + * |                       |                          |
> + * |    Peer Span Space    |    Span Space            |
> + * |                       |                          |
> + * |                       |                          |
> + * +-----------------------+--------------------------+ Base+span_offset
> + * |                       |                          |     +span_count * 4
> + * |                       |                          |
> + * |     Span Space        |   Peer Span Space        |
> + * |                       |                          |
> + * +-----------------------+--------------------------+

Are these comments supposed to say *spad*, i.e., scratchpad space,
instead of "span", to correspond with spad_offset and spad_count
below?

> +struct epf_ntb_ctrl {
> +	u32     command;
> +	u32     argument;
> +	u16     command_status;
> +	u16     link_status;
> +	u32     topology;
> +	u64     addr;
> +	u64     size;
> +	u32     num_mws;
> +	u32	reserved;
> +	u32     spad_offset;
> +	u32     spad_count;
> +	u32	db_entry_size;
> +	u32     db_data[MAX_DB_COUNT];
> +	u32     db_offset[MAX_DB_COUNT];
> +} __packed;

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP
       [not found]     ` <CAHrpEqSGySHDET3YPu3czzoMBmCRJsgGgU4s3GWWbtruFLVHaA@mail.gmail.com>
@ 2022-12-14  0:28       ` Bjorn Helgaas
  2022-12-14  0:47         ` [EXT] " Frank Li
  0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2022-12-14  0:28 UTC (permalink / raw)
  To: Zhi Li
  Cc: Bjorn Helgaas, Frank Li, allenbh, dave.jiang, gustavo.pimentel,
	hongxing.zhu, jdmason, jingoohan1, kishon, kw, linux-ntb,
	linux-pci, lorenzo.pieralisi

On Tue, Dec 13, 2022 at 6:17 PM Zhi Li <lznuaa@gmail.com> wrote:
> On Tue, Dec 13, 2022 at 6:08 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> On Tue, Feb 22, 2022 at 10:23:54AM -0600, Frank Li wrote:
>>
>> > + * +--------------------------------------------------+ Base
>> > + * |                                                  |
>> > + * |                                                  |
>> > + * |                                                  |
>> > + * |          Common Control Register                 |
>> > + * |                                                  |
>> > + * |                                                  |
>> > + * |                                                  |
>> > + * +-----------------------+--------------------------+ Base+span_offset
>> > + * |                       |                          |
>> > + * |    Peer Span Space    |    Span Space            |
>> > + * |                       |                          |
>> > + * |                       |                          |
>> > + * +-----------------------+--------------------------+ Base+span_offset
>> > + * |                       |                          |     +span_count * 4
>> > + * |                       |                          |
>> > + * |     Span Space        |   Peer Span Space        |
>> > + * |                       |                          |
>> > + * +-----------------------+--------------------------+
>>
>> Are these comments supposed to say *spad*, i.e., scratchpad space,
>> instead of "span", to correspond with spad_offset and spad_count
>> below?
>
> Strange, I received some of your comments on the very old patches.

What's strange about it?  I went to the trouble to look up the patch
that introduced the thing I'm asking about.  I sent the email a few
minutes ago.  The question still applies to the current tree.

Please use plain text email on the Linux mailing lists.

Bjorn

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [EXT] Re: [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP
  2022-12-14  0:28       ` Bjorn Helgaas
@ 2022-12-14  0:47         ` Frank Li
  0 siblings, 0 replies; 30+ messages in thread
From: Frank Li @ 2022-12-14  0:47 UTC (permalink / raw)
  To: bjorn, Zhi Li
  Cc: Bjorn Helgaas, allenbh, dave.jiang, gustavo.pimentel,
	Hongxing Zhu, jdmason, jingoohan1, kishon, kw, linux-ntb,
	linux-pci, lorenzo.pieralisi



> -----Original Message-----
> From: Bjorn Helgaas <bjorn.helgaas@gmail.com>
> Sent: Tuesday, December 13, 2022 6:29 PM
> To: Zhi Li <lznuaa@gmail.com>
> Cc: Bjorn Helgaas <helgaas@kernel.org>; Frank Li <frank.li@nxp.com>;
> allenbh@gmail.com; dave.jiang@intel.com;
> gustavo.pimentel@synopsys.com; Hongxing Zhu <hongxing.zhu@nxp.com>;
> jdmason@kudzu.us; jingoohan1@gmail.com; kishon@kernel.org;
> kw@linux.com; linux-ntb@googlegroups.com; linux-pci@vger.kernel.org;
> lorenzo.pieralisi@arm.com
> Subject: [EXT] Re: [PATCH v2 3/4] PCI: endpoint: Support NTB transfer
> between RC and EP
> 
> Caution: EXT Email
> 
> On Tue, Dec 13, 2022 at 6:17 PM Zhi Li <lznuaa@gmail.com> wrote:
> > On Tue, Dec 13, 2022 at 6:08 PM Bjorn Helgaas <helgaas@kernel.org>
> wrote:
> >> On Tue, Feb 22, 2022 at 10:23:54AM -0600, Frank Li wrote:
> >>
> >> > + * +--------------------------------------------------+ Base
> >> > + * |                                                  |
> >> > + * |                                                  |
> >> > + * |                                                  |
> >> > + * |          Common Control Register                 |
> >> > + * |                                                  |
> >> > + * |                                                  |
> >> > + * |                                                  |
> >> > + * +-----------------------+--------------------------+ Base+span_offset
> >> > + * |                       |                          |
> >> > + * |    Peer Span Space    |    Span Space            |
> >> > + * |                       |                          |
> >> > + * |                       |                          |
> >> > + * +-----------------------+--------------------------+ Base+span_offset
> >> > + * |                       |                          |     +span_count * 4
> >> > + * |                       |                          |
> >> > + * |     Span Space        |   Peer Span Space        |
> >> > + * |                       |                          |
> >> > + * +-----------------------+--------------------------+
> >>
> >> Are these comments supposed to say *spad*, i.e., scratchpad space,
> >> instead of "span", to correspond with spad_offset and spad_count
> >> below?
> >
> > Strange, I received some of your comments on the very old patches.
> 
> What's strange about it?  I went to the trouble to look up the patch
> that introduced the thing I'm asking about.  I sent the email a few
> minutes ago.  The question still applies to the current tree.

Okay, I doesn't realized you are talking about current tree also. 
Let me send patch to fix this soon.

Best regards
Frank Li

> 
> Please use plain text email on the Linux mailing lists.
> 
> Bjorn

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2022-02-22 16:23 ` [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address Frank Li
@ 2023-12-14 14:31   ` Niklas Cassel
  2023-12-14 15:19     ` Frank Li
  0 siblings, 1 reply; 30+ messages in thread
From: Niklas Cassel @ 2023-12-14 14:31 UTC (permalink / raw)
  To: Frank Li, Manivannan Sadhasivam, Vidya Sagar
  Cc: helgaas, kishon, lorenzo.pieralisi, kw, jingoohan1,
	gustavo.pimentel, lznuaa, hongxing.zhu, jdmason, dave.jiang,
	allenbh, linux-ntb, linux-pci

Hello Frank,

On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> ntb_mw_set_trans() will set memory map window after endpoint function
> driver bind. The inbound map address need be updated dynamically when
> using NTB by PCIe Root Port and PCIe Endpoint connection.
> 
> Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> number to update inbound address map and skip set BAR's register.
> 
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
> 
> Change from V1:
>  - improve commit message
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 998b698f40858..cad5d9ea7cc6c 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
>  	u32 free_win;
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  
> -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);

find_first_zero_bit() can return 0, representing bit 0,
which is a perfectly valid return value.

> +	if (!ep->bar_to_atu[bar])

so this check is not correct.


For platforms that has a core_init_notifier, e.g. qcom and tegra,
what will happen is that, on first boot:

BAR0: iATU0
BAR1: iATU1
BAR2: iATU2
BAR3: iATU3
BAR4: iATU4
BAR5: iATU5

Rebooting the RC, will cause a perst assertion + deassertion,
after which:

BAR0: iATU6
BAR1: iATU1
BAR2: iATU2
BAR3: iATU3
BAR4: iATU4
BAR5: iATU5

This is obviously a bug, because you cannot use:

if (!ep->bar_to_atu[bar])

when 0 is a valid return value.

My proposed fix is:


@@ -172,16 +176,26 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
 {
        int ret;
        u32 free_win;
+       u32 saved_atu;
        struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 
-       if (!ep->bar_to_atu[bar])
+       saved_atu = ep->bar_to_atu[bar];
+       if (!saved_atu) {
                free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
-       else
-               free_win = ep->bar_to_atu[bar];
-
-       if (free_win >= pci->num_ib_windows) {
-               dev_err(pci->dev, "No free inbound window\n");
-               return -EINVAL;
+               //pr_err("%s BAR: %d, found no ATU, using first free, index: %d\n", __func__, bar, free_win);
+               if (free_win >= pci->num_ib_windows) {
+                       dev_err(pci->dev, "No free inbound window\n");
+                       return -EINVAL;
+               }
+
+               /*
+                * In order for bar_to_atu[bar] == 0 to equal no iATU, offset
+                * the saved value with 1.
+                */
+               saved_atu = free_win + 1;
+       } else {
+               free_win = saved_atu - 1;
+               //pr_err("%s BAR: %d, already has ATU, index: %d\n", __func__, bar, free_win);
        }
 
        ret = dw_pcie_prog_ep_inbound_atu(pci, func_no, free_win, type,
@@ -191,7 +205,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
                return ret;
        }
 
-       ep->bar_to_atu[bar] = free_win;
+       ep->bar_to_atu[bar] = saved_atu;
        set_bit(free_win, ep->ib_window_map);
 
        return 0;


>  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> @@ -244,6 +249,9 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  	if (ret)
>  		return ret;
>  
> +	if (ep->epf_bar[bar])
> +		return 0;
> +

However, here you ignore writing the settings if (ep->epf_bar[bar]),
again, this will not work for a platform with a core_init_notifier,
as they perform a full core reset using reset_control_assert(),
when perst is asserted + deasserted, so AFAICT, these settings will
get cleared for those platforms, so they will need to be re-written.


>  	dw_pcie_dbi_ro_wr_en(pci);
>  
>  	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 14:31   ` Niklas Cassel
@ 2023-12-14 15:19     ` Frank Li
  2023-12-14 20:23       ` Niklas Cassel
  0 siblings, 1 reply; 30+ messages in thread
From: Frank Li @ 2023-12-14 15:19 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 02:31:04PM +0000, Niklas Cassel wrote:
> Hello Frank,
> 
> On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> > ntb_mw_set_trans() will set memory map window after endpoint function
> > driver bind. The inbound map address need be updated dynamically when
> > using NTB by PCIe Root Port and PCIe Endpoint connection.
> > 
> > Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> > number to update inbound address map and skip set BAR's register.
> > 
> > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > ---
> > 
> > Change from V1:
> >  - improve commit message
> > 
> >  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > index 998b698f40858..cad5d9ea7cc6c 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
> >  	u32 free_win;
> >  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> >  
> > -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> 
> find_first_zero_bit() can return 0, representing bit 0,
> which is a perfectly valid return value.
> 
> > +	if (!ep->bar_to_atu[bar])
> 
> so this check is not correct.
> 

Please sent out your fixed patch. If want to me fix it, please tell me
reproduce steps.

Frank

> 
> For platforms that has a core_init_notifier, e.g. qcom and tegra,
> what will happen is that, on first boot:
> 
> BAR0: iATU0
> BAR1: iATU1
> BAR2: iATU2
> BAR3: iATU3
> BAR4: iATU4
> BAR5: iATU5
> 
> Rebooting the RC, will cause a perst assertion + deassertion,
> after which:
> 
> BAR0: iATU6
> BAR1: iATU1
> BAR2: iATU2
> BAR3: iATU3
> BAR4: iATU4
> BAR5: iATU5
> 
> This is obviously a bug, because you cannot use:
> 
> if (!ep->bar_to_atu[bar])
> 
> when 0 is a valid return value.
> 
> My proposed fix is:
> 
> 
> @@ -172,16 +176,26 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
>  {
>         int ret;
>         u32 free_win;
> +       u32 saved_atu;
>         struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  
> -       if (!ep->bar_to_atu[bar])
> +       saved_atu = ep->bar_to_atu[bar];
> +       if (!saved_atu) {
>                 free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> -       else
> -               free_win = ep->bar_to_atu[bar];
> -
> -       if (free_win >= pci->num_ib_windows) {
> -               dev_err(pci->dev, "No free inbound window\n");
> -               return -EINVAL;
> +               //pr_err("%s BAR: %d, found no ATU, using first free, index: %d\n", __func__, bar, free_win);
> +               if (free_win >= pci->num_ib_windows) {
> +                       dev_err(pci->dev, "No free inbound window\n");
> +                       return -EINVAL;
> +               }
> +
> +               /*
> +                * In order for bar_to_atu[bar] == 0 to equal no iATU, offset
> +                * the saved value with 1.
> +                */
> +               saved_atu = free_win + 1;
> +       } else {
> +               free_win = saved_atu - 1;
> +               //pr_err("%s BAR: %d, already has ATU, index: %d\n", __func__, bar, free_win);
>         }
>  
>         ret = dw_pcie_prog_ep_inbound_atu(pci, func_no, free_win, type,
> @@ -191,7 +205,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
>                 return ret;
>         }
>  
> -       ep->bar_to_atu[bar] = free_win;
> +       ep->bar_to_atu[bar] = saved_atu;
>         set_bit(free_win, ep->ib_window_map);
>  
>         return 0;
> 
> 
> >  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> > @@ -244,6 +249,9 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> >  	if (ret)
> >  		return ret;
> >  
> > +	if (ep->epf_bar[bar])
> > +		return 0;
> > +
> 
> However, here you ignore writing the settings if (ep->epf_bar[bar]),
> again, this will not work for a platform with a core_init_notifier,
> as they perform a full core reset using reset_control_assert(),
> when perst is asserted + deasserted, so AFAICT, these settings will
> get cleared for those platforms, so they will need to be re-written.
> 
> 
> >  	dw_pcie_dbi_ro_wr_en(pci);
> >  
> >  	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
> 
> 
> Kind regards,
> Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 15:19     ` Frank Li
@ 2023-12-14 20:23       ` Niklas Cassel
  2023-12-14 20:52         ` Frank Li
  0 siblings, 1 reply; 30+ messages in thread
From: Niklas Cassel @ 2023-12-14 20:23 UTC (permalink / raw)
  To: Frank Li
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 10:19:03AM -0500, Frank Li wrote:
> On Thu, Dec 14, 2023 at 02:31:04PM +0000, Niklas Cassel wrote:
> > Hello Frank,
> > 
> > On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> > > ntb_mw_set_trans() will set memory map window after endpoint function
> > > driver bind. The inbound map address need be updated dynamically when
> > > using NTB by PCIe Root Port and PCIe Endpoint connection.
> > > 
> > > Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> > > number to update inbound address map and skip set BAR's register.
> > > 
> > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > ---
> > > 
> > > Change from V1:
> > >  - improve commit message
> > > 
> > >  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > index 998b698f40858..cad5d9ea7cc6c 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
> > >  	u32 free_win;
> > >  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > >  
> > > -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> > 
> > find_first_zero_bit() can return 0, representing bit 0,
> > which is a perfectly valid return value.
> > 
> > > +	if (!ep->bar_to_atu[bar])
> > 
> > so this check is not correct.
> > 
> 
> Please sent out your fixed patch. If want to me fix it, please tell me
> reproduce steps.

Reproduce steps are simple:
1) Add debug messages to dw_pcie_ep_inbound_atu() to see the iATU index for
each BAR.
2) Boot an EP platform with a core_init_notifier.
3) Boot the RC.
4) Reboot the RC, which will assert + deassert PERST, and will call
   pci_epc_init_notify(), which will call .core_init (pci_epf_test_core_init())
   which will set the BARs.


In step 3) BAR0 will use iATU0.

In step 4) BAR0 will use iATU6 instead of iATU0.
There is no reason for this, as it should really reuse the same
iATU index as before, just like all the other BARs do.
(This is because of find_first_zero_bit() misusage.)


I could send out my patch, but from inspecting the code, it looks like:

> > > + if (ep->epf_bar[bar])
> > > +         return 0;

from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
settings will be re-written for platforms with core_init_notifiers.

Right now, for a platform with a core_init_notifier, if you run
pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
and then run pci_endpoint_test again, then I'm quite sure that
pci_endpoint_test will not pass the second time (because iATU settings
were not rewritten after reset).

It would be nice if Mani or Vidya could confirm this.


I guess that you added this statement for some reason, so I assume
that we can't just drop this line without breaking something else.

I guess one option would be modify dw_pcie_ep_init_notify() to call
dw_pcie_ep_clear_bar() on all non-NULL BARs stored in ep->epf_bar[],
before calling pci_epc_init_notify(). That way the second .core_init
(pci_epf_test_core_init()) call will use write the settings, because
ep->epf_bar[] will be empty, so the "write the settings only the first
time" approach will then also work for core_init_notifier platforms.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 20:23       ` Niklas Cassel
@ 2023-12-14 20:52         ` Frank Li
  2023-12-14 21:28           ` Frank Li
  2023-12-14 21:39           ` Niklas Cassel
  0 siblings, 2 replies; 30+ messages in thread
From: Frank Li @ 2023-12-14 20:52 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 08:23:14PM +0000, Niklas Cassel wrote:
> On Thu, Dec 14, 2023 at 10:19:03AM -0500, Frank Li wrote:
> > On Thu, Dec 14, 2023 at 02:31:04PM +0000, Niklas Cassel wrote:
> > > Hello Frank,
> > > 
> > > On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> > > > ntb_mw_set_trans() will set memory map window after endpoint function
> > > > driver bind. The inbound map address need be updated dynamically when
> > > > using NTB by PCIe Root Port and PCIe Endpoint connection.
> > > > 
> > > > Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> > > > number to update inbound address map and skip set BAR's register.
> > > > 
> > > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > > ---
> > > > 
> > > > Change from V1:
> > > >  - improve commit message
> > > > 
> > > >  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > index 998b698f40858..cad5d9ea7cc6c 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
> > > >  	u32 free_win;
> > > >  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > >  
> > > > -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> > > 
> > > find_first_zero_bit() can return 0, representing bit 0,
> > > which is a perfectly valid return value.
> > > 
> > > > +	if (!ep->bar_to_atu[bar])
> > > 
> > > so this check is not correct.
> > > 
> > 
> > Please sent out your fixed patch. If want to me fix it, please tell me
> > reproduce steps.
> 
> Reproduce steps are simple:
> 1) Add debug messages to dw_pcie_ep_inbound_atu() to see the iATU index for
> each BAR.
> 2) Boot an EP platform with a core_init_notifier.
> 3) Boot the RC.
> 4) Reboot the RC, which will assert + deassert PERST, and will call
>    pci_epc_init_notify(), which will call .core_init (pci_epf_test_core_init())
>    which will set the BARs.
> 
> 
> In step 3) BAR0 will use iATU0.
> 
> In step 4) BAR0 will use iATU6 instead of iATU0.
> There is no reason for this, as it should really reuse the same
> iATU index as before, just like all the other BARs do.
> (This is because of find_first_zero_bit() misusage.)
> 
> 
> I could send out my patch, but from inspecting the code, it looks like:
> 
> > > > + if (ep->epf_bar[bar])
> > > > +         return 0;

I checked current code. 

dw_pcie_ep_inbound_atu()
{
 	if (!ep->bar_to_atu[bar])                                                                   
                free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);             
        else                                                                                        
                free_win = ep->bar_to_atu[bar]; 
}

I missed conside '0' is validate windows number. I think we should init
bar_to_atu to -1. 

	if (ep->bar_to_atu[bar] < 0)


Origial change want dw_pcie_ep_inbound_atu() can be call twice to update
bar map address. vntb use "bar3" as memory map windows, so have not trigger
this problem.

Frank

> 
> from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
> settings will be re-written for platforms with core_init_notifiers.
> 
> Right now, for a platform with a core_init_notifier, if you run
> pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
> and then run pci_endpoint_test again, then I'm quite sure that
> pci_endpoint_test will not pass the second time (because iATU settings
> were not rewritten after reset).
> 
> It would be nice if Mani or Vidya could confirm this.
> 
> 
> I guess that you added this statement for some reason, so I assume
> that we can't just drop this line without breaking something else.
> 
> I guess one option would be modify dw_pcie_ep_init_notify() to call
> dw_pcie_ep_clear_bar() on all non-NULL BARs stored in ep->epf_bar[],
> before calling pci_epc_init_notify(). That way the second .core_init
> (pci_epf_test_core_init()) call will use write the settings, because
> ep->epf_bar[] will be empty, so the "write the settings only the first
> time" approach will then also work for core_init_notifier platforms.
> 
> 
> Kind regards,
> Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 20:52         ` Frank Li
@ 2023-12-14 21:28           ` Frank Li
  2023-12-16 10:11             ` Niklas Cassel
  2023-12-14 21:39           ` Niklas Cassel
  1 sibling, 1 reply; 30+ messages in thread
From: Frank Li @ 2023-12-14 21:28 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 03:52:43PM -0500, Frank Li wrote:
> On Thu, Dec 14, 2023 at 08:23:14PM +0000, Niklas Cassel wrote:
> > On Thu, Dec 14, 2023 at 10:19:03AM -0500, Frank Li wrote:
> > > On Thu, Dec 14, 2023 at 02:31:04PM +0000, Niklas Cassel wrote:
> > > > Hello Frank,
> > > > 
> > > > On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> > > > > ntb_mw_set_trans() will set memory map window after endpoint function
> > > > > driver bind. The inbound map address need be updated dynamically when
> > > > > using NTB by PCIe Root Port and PCIe Endpoint connection.
> > > > > 
> > > > > Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> > > > > number to update inbound address map and skip set BAR's register.
> > > > > 
> > > > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > > > ---
> > > > > 
> > > > > Change from V1:
> > > > >  - improve commit message
> > > > > 
> > > > >  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
> > > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > index 998b698f40858..cad5d9ea7cc6c 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
> > > > >  	u32 free_win;
> > > > >  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > > >  
> > > > > -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> > > > 
> > > > find_first_zero_bit() can return 0, representing bit 0,
> > > > which is a perfectly valid return value.
> > > > 
> > > > > +	if (!ep->bar_to_atu[bar])
> > > > 
> > > > so this check is not correct.
> > > > 
> > > 
> > > Please sent out your fixed patch. If want to me fix it, please tell me
> > > reproduce steps.
> > 
> > Reproduce steps are simple:
> > 1) Add debug messages to dw_pcie_ep_inbound_atu() to see the iATU index for
> > each BAR.
> > 2) Boot an EP platform with a core_init_notifier.
> > 3) Boot the RC.
> > 4) Reboot the RC, which will assert + deassert PERST, and will call
> >    pci_epc_init_notify(), which will call .core_init (pci_epf_test_core_init())
> >    which will set the BARs.
> > 
> > 
> > In step 3) BAR0 will use iATU0.
> > 
> > In step 4) BAR0 will use iATU6 instead of iATU0.
> > There is no reason for this, as it should really reuse the same
> > iATU index as before, just like all the other BARs do.
> > (This is because of find_first_zero_bit() misusage.)
> > 
> > 
> > I could send out my patch, but from inspecting the code, it looks like:
> > 
> > > > > + if (ep->epf_bar[bar])
> > > > > +         return 0;
> 
> I checked current code. 
> 
> dw_pcie_ep_inbound_atu()
> {
>  	if (!ep->bar_to_atu[bar])                                                                   
>                 free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);             
>         else                                                                                        
>                 free_win = ep->bar_to_atu[bar]; 
> }
> 
> I missed conside '0' is validate windows number. I think we should init
> bar_to_atu to -1. 
> 
> 	if (ep->bar_to_atu[bar] < 0)
> 
> 
> Origial change want dw_pcie_ep_inbound_atu() can be call twice to update
> bar map address. vntb use "bar3" as memory map windows, so have not trigger
> this problem.

Does below change fix your problem?

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index f6207989fc6ad..2868d44649ef7 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -177,7 +177,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
        if (!ep->bar_to_atu[bar])
                free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
        else
-               free_win = ep->bar_to_atu[bar];
+               free_win = ep->bar_to_atu[bar] - 1;
 
        if (free_win >= pci->num_ib_windows) {
                dev_err(pci->dev, "No free inbound window\n");
@@ -191,7 +191,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
                return ret;
        }
 
-       ep->bar_to_atu[bar] = free_win;
+       ep->bar_to_atu[bar] = free_win + 1;
        set_bit(free_win, ep->ib_window_map);
 
        return 0;
@@ -228,7 +228,7 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
        struct dw_pcie_ep *ep = epc_get_drvdata(epc);
        struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
        enum pci_barno bar = epf_bar->barno;
-       u32 atu_index = ep->bar_to_atu[bar];
+       u32 atu_index = ep->bar_to_atu[bar] - 1;
 
        __dw_pcie_ep_reset_bar(pci, func_no, bar, epf_bar->flags);

> 
> Frank
> 
> > 
> > from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
> > settings will be re-written for platforms with core_init_notifiers.
> > 
> > Right now, for a platform with a core_init_notifier, if you run
> > pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
> > and then run pci_endpoint_test again, then I'm quite sure that
> > pci_endpoint_test will not pass the second time (because iATU settings
> > were not rewritten after reset).
> > 
> > It would be nice if Mani or Vidya could confirm this.
> > 
> > 
> > I guess that you added this statement for some reason, so I assume
> > that we can't just drop this line without breaking something else.
> > 
> > I guess one option would be modify dw_pcie_ep_init_notify() to call
> > dw_pcie_ep_clear_bar() on all non-NULL BARs stored in ep->epf_bar[],
> > before calling pci_epc_init_notify(). That way the second .core_init
> > (pci_epf_test_core_init()) call will use write the settings, because
> > ep->epf_bar[] will be empty, so the "write the settings only the first
> > time" approach will then also work for core_init_notifier platforms.
> > 
> > 
> > Kind regards,
> > Niklas

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 20:52         ` Frank Li
  2023-12-14 21:28           ` Frank Li
@ 2023-12-14 21:39           ` Niklas Cassel
  1 sibling, 0 replies; 30+ messages in thread
From: Niklas Cassel @ 2023-12-14 21:39 UTC (permalink / raw)
  To: Frank Li
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 03:52:43PM -0500, Frank Li wrote:
> On Thu, Dec 14, 2023 at 08:23:14PM +0000, Niklas Cassel wrote:
> > On Thu, Dec 14, 2023 at 10:19:03AM -0500, Frank Li wrote:
> > > On Thu, Dec 14, 2023 at 02:31:04PM +0000, Niklas Cassel wrote:
> > > > Hello Frank,
> > > > 
> > > > On Tue, Feb 22, 2022 at 10:23:52AM -0600, Frank Li wrote:
> > > > > ntb_mw_set_trans() will set memory map window after endpoint function
> > > > > driver bind. The inbound map address need be updated dynamically when
> > > > > using NTB by PCIe Root Port and PCIe Endpoint connection.
> > > > > 
> > > > > Checking if iatu already assigned to the BAR, if yes, using assigned iatu
> > > > > number to update inbound address map and skip set BAR's register.
> > > > > 
> > > > > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > > > > ---
> > > > > 
> > > > > Change from V1:
> > > > >  - improve commit message
> > > > > 
> > > > >  drivers/pci/controller/dwc/pcie-designware-ep.c | 10 +++++++++-
> > > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > index 998b698f40858..cad5d9ea7cc6c 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > > > @@ -161,7 +161,11 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no,
> > > > >  	u32 free_win;
> > > > >  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > > >  
> > > > > -	free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> > > > 
> > > > find_first_zero_bit() can return 0, representing bit 0,
> > > > which is a perfectly valid return value.
> > > > 
> > > > > +	if (!ep->bar_to_atu[bar])
> > > > 
> > > > so this check is not correct.
> > > > 
> > > 
> > > Please sent out your fixed patch. If want to me fix it, please tell me
> > > reproduce steps.
> > 
> > Reproduce steps are simple:
> > 1) Add debug messages to dw_pcie_ep_inbound_atu() to see the iATU index for
> > each BAR.
> > 2) Boot an EP platform with a core_init_notifier.
> > 3) Boot the RC.
> > 4) Reboot the RC, which will assert + deassert PERST, and will call
> >    pci_epc_init_notify(), which will call .core_init (pci_epf_test_core_init())
> >    which will set the BARs.
> > 
> > 
> > In step 3) BAR0 will use iATU0.
> > 
> > In step 4) BAR0 will use iATU6 instead of iATU0.
> > There is no reason for this, as it should really reuse the same
> > iATU index as before, just like all the other BARs do.
> > (This is because of find_first_zero_bit() misusage.)
> > 
> > 
> > I could send out my patch, but from inspecting the code, it looks like:
> > 
> > > > > + if (ep->epf_bar[bar])
> > > > > +         return 0;
> 
> I checked current code. 
> 
> dw_pcie_ep_inbound_atu()
> {
>  	if (!ep->bar_to_atu[bar])                                                                   
>                 free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);             
>         else                                                                                        
>                 free_win = ep->bar_to_atu[bar]; 
> }
> 
> I missed conside '0' is validate windows number. I think we should init
> bar_to_atu to -1. 
> 
> 	if (ep->bar_to_atu[bar] < 0)
> 
> 
> Origial change want dw_pcie_ep_inbound_atu() can be call twice to update
> bar map address. vntb use "bar3" as memory map windows, so have not trigger
> this problem.

Feel free to send out a patch, I just wanted to inform people about the
problem.


I can understand that the point of the patch in $subject was to re-use the
exact same iATU for a specific BAR, however, does the change:

if (ep->epf_bar[bar])
	return 0;

to dw_pcie_ep_set_bar(), really provide any value?

I guess it will avoid clearing the BAR_REG, if the RC has assigned an
PCI address to that BAR, which I guess was important for vntb?

If so, I don't see any alternative but to let DWC drivers with a
core_init_notifier to call dw_pcie_ep_clear_bar() when asserting perst,
or to modify dw_pcie_ep_init_notify() to call dw_pcie_ep_clear_bar().


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-14 21:28           ` Frank Li
@ 2023-12-16 10:11             ` Niklas Cassel
  2023-12-19 17:59               ` Manivannan Sadhasivam
  0 siblings, 1 reply; 30+ messages in thread
From: Niklas Cassel @ 2023-12-16 10:11 UTC (permalink / raw)
  To: Frank Li
  Cc: Manivannan Sadhasivam, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Thu, Dec 14, 2023 at 04:28:44PM -0500, Frank Li wrote:
> 

(snip)

> Does below change fix your problem?

It is basically the same fix as I sent out earlier in this thread,
but yes, it does solve 1 out of 2 problems introduced by the patch
in $subject, so:

Tested-by: Niklas Cassel <niklas.cassel@wdc.com>

> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6ad..2868d44649ef7 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -177,7 +177,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
>         if (!ep->bar_to_atu[bar])
>                 free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
>         else
> -               free_win = ep->bar_to_atu[bar];
> +               free_win = ep->bar_to_atu[bar] - 1;
>  
>         if (free_win >= pci->num_ib_windows) {
>                 dev_err(pci->dev, "No free inbound window\n");
> @@ -191,7 +191,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
>                 return ret;
>         }
>  
> -       ep->bar_to_atu[bar] = free_win;
> +       ep->bar_to_atu[bar] = free_win + 1;
>         set_bit(free_win, ep->ib_window_map);
>  
>         return 0;
> @@ -228,7 +228,7 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>         struct dw_pcie_ep *ep = epc_get_drvdata(epc);
>         struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>         enum pci_barno bar = epf_bar->barno;
> -       u32 atu_index = ep->bar_to_atu[bar];
> +       u32 atu_index = ep->bar_to_atu[bar] - 1;

You probably want to add a:

if (!ep->bar_to_atu[bar])
	return;

here, so that dw_pcie_ep_clear_bar() will never try to use -1 as index.
(E.g. if clear_bar() is called twice on the same BAR.)

>  
>         __dw_pcie_ep_reset_bar(pci, func_no, bar, epf_bar->flags);
> 

(snip)

> > > from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
> > > settings will be re-written for platforms with core_init_notifiers.
> > > 
> > > Right now, for a platform with a core_init_notifier, if you run
> > > pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
> > > and then run pci_endpoint_test again, then I'm quite sure that
> > > pci_endpoint_test will not pass the second time (because iATU settings
> > > were not rewritten after reset).
> > > 
> > > It would be nice if Mani or Vidya could confirm this.

So problem 2 out of 2 introduced by the patch in $subject is that
DWC drivers with a .core_init_notifier, will perform a reset_control_assert()
to reset the core (which will reset both sticky and non-sticky registers),
which means that the early return in dw_pcie_ep_set_bar():
https://github.com/torvalds/linux/blob/v6.7-rc5/drivers/pci/controller/dwc/pcie-designware-ep.c#L268-L269

while returning after the iATU settings have been written,
it will return before:

	dw_pcie_writel_dbi2(pci, reg_dbi2, lower_32_bits(size - 1));
	dw_pcie_writel_dbi(pci, reg, flags);

Which means that, for drivers with a .core_init_notifier, BARx_REG and
BARx_MASK registers will not be written. This means that they will have
reset values for these registers, which means that e.g. the BAR_SIZE
(which is defined by BARx_MASK) might be incorrect for these platforms
because this function returns early.

I will not send a fix for this problem, I will leave that to you, or Mani,
or Vidya, and hope that people are happy that I simply reported this issue.


Here is my suggested solution in case anyone wants to take a stab at it:

> > > I guess one option would be modify dw_pcie_ep_init_notify() to call
> > > dw_pcie_ep_clear_bar() on all non-NULL BARs stored in ep->epf_bar[],
> > > before calling pci_epc_init_notify(). That way the second .core_init
> > > (pci_epf_test_core_init()) call will use write the settings, because
> > > ep->epf_bar[] will be empty, so the "write the settings only the first
> > > time" approach will then also work for core_init_notifier platforms.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-16 10:11             ` Niklas Cassel
@ 2023-12-19 17:59               ` Manivannan Sadhasivam
  2023-12-20  5:14                 ` Damien Le Moal
  0 siblings, 1 reply; 30+ messages in thread
From: Manivannan Sadhasivam @ 2023-12-19 17:59 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Frank Li, Vidya Sagar, helgaas, kishon, lorenzo.pieralisi, kw,
	jingoohan1, gustavo.pimentel, lznuaa, hongxing.zhu, jdmason,
	dave.jiang, allenbh, linux-ntb, linux-pci

On Sat, Dec 16, 2023 at 10:11:19AM +0000, Niklas Cassel wrote:
> On Thu, Dec 14, 2023 at 04:28:44PM -0500, Frank Li wrote:
> > 
> 
> (snip)
> 
> > Does below change fix your problem?
> 
> It is basically the same fix as I sent out earlier in this thread,
> but yes, it does solve 1 out of 2 problems introduced by the patch
> in $subject, so:
> 
> Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
> 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > index f6207989fc6ad..2868d44649ef7 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > @@ -177,7 +177,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
> >         if (!ep->bar_to_atu[bar])
> >                 free_win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
> >         else
> > -               free_win = ep->bar_to_atu[bar];
> > +               free_win = ep->bar_to_atu[bar] - 1;
> >  
> >         if (free_win >= pci->num_ib_windows) {
> >                 dev_err(pci->dev, "No free inbound window\n");
> > @@ -191,7 +191,7 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, u8 func_no, int type,
> >                 return ret;
> >         }
> >  
> > -       ep->bar_to_atu[bar] = free_win;
> > +       ep->bar_to_atu[bar] = free_win + 1;
> >         set_bit(free_win, ep->ib_window_map);
> >  
> >         return 0;
> > @@ -228,7 +228,7 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> >         struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> >         struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> >         enum pci_barno bar = epf_bar->barno;
> > -       u32 atu_index = ep->bar_to_atu[bar];
> > +       u32 atu_index = ep->bar_to_atu[bar] - 1;
> 
> You probably want to add a:
> 
> if (!ep->bar_to_atu[bar])
> 	return;
> 
> here, so that dw_pcie_ep_clear_bar() will never try to use -1 as index.
> (E.g. if clear_bar() is called twice on the same BAR.)
> 
> >  
> >         __dw_pcie_ep_reset_bar(pci, func_no, bar, epf_bar->flags);
> > 
> 
> (snip)
> 
> > > > from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
> > > > settings will be re-written for platforms with core_init_notifiers.
> > > > 
> > > > Right now, for a platform with a core_init_notifier, if you run
> > > > pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
> > > > and then run pci_endpoint_test again, then I'm quite sure that
> > > > pci_endpoint_test will not pass the second time (because iATU settings
> > > > were not rewritten after reset).
> > > > 
> > > > It would be nice if Mani or Vidya could confirm this.
> 
> So problem 2 out of 2 introduced by the patch in $subject is that
> DWC drivers with a .core_init_notifier, will perform a reset_control_assert()
> to reset the core (which will reset both sticky and non-sticky registers),
> which means that the early return in dw_pcie_ep_set_bar():
> https://github.com/torvalds/linux/blob/v6.7-rc5/drivers/pci/controller/dwc/pcie-designware-ep.c#L268-L269
> 
> while returning after the iATU settings have been written,
> it will return before:
> 
> 	dw_pcie_writel_dbi2(pci, reg_dbi2, lower_32_bits(size - 1));
> 	dw_pcie_writel_dbi(pci, reg, flags);
> 
> Which means that, for drivers with a .core_init_notifier, BARx_REG and
> BARx_MASK registers will not be written. This means that they will have
> reset values for these registers, which means that e.g. the BAR_SIZE
> (which is defined by BARx_MASK) might be incorrect for these platforms
> because this function returns early.
> 
> I will not send a fix for this problem, I will leave that to you, or Mani,
> or Vidya, and hope that people are happy that I simply reported this issue.
> 

The fix for this issue shouldn't be in the DWC driver but rather in the EPF
drivers. Because, EPF drivers are the ones calling pci_epc_set_bar() during
bind(). So they have to properly cleanup the resources once the perst assertion
happens. This issue also applies to other resources such as DMA channels.

The problem here is, there is a disconnect between DWC (EPC) and EPF drivers.
When the perst assertion happens, the event is not passed to EPF drivers
resulting in the EPF drivers having no knowledge that they need to give up the
resources.

We need to fix this in a sane way and I'm looking into it.

But I really appreciate your finding here and in other thread where we discussed
similar issue.

- Mani

> 
> Here is my suggested solution in case anyone wants to take a stab at it:
> 
> > > > I guess one option would be modify dw_pcie_ep_init_notify() to call
> > > > dw_pcie_ep_clear_bar() on all non-NULL BARs stored in ep->epf_bar[],
> > > > before calling pci_epc_init_notify(). That way the second .core_init
> > > > (pci_epf_test_core_init()) call will use write the settings, because
> > > > ep->epf_bar[] will be empty, so the "write the settings only the first
> > > > time" approach will then also work for core_init_notifier platforms.
> 
> 
> Kind regards,
> Niklas

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-19 17:59               ` Manivannan Sadhasivam
@ 2023-12-20  5:14                 ` Damien Le Moal
  2023-12-20  6:03                   ` Manivannan Sadhasivam
  0 siblings, 1 reply; 30+ messages in thread
From: Damien Le Moal @ 2023-12-20  5:14 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Niklas Cassel
  Cc: Frank Li, Vidya Sagar, helgaas, kishon, lorenzo.pieralisi, kw,
	jingoohan1, gustavo.pimentel, lznuaa, hongxing.zhu, jdmason,
	dave.jiang, allenbh, linux-ntb, linux-pci

On 12/20/23 02:59, Manivannan Sadhasivam wrote:
>>>>> from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
>>>>> settings will be re-written for platforms with core_init_notifiers.
>>>>>
>>>>> Right now, for a platform with a core_init_notifier, if you run
>>>>> pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
>>>>> and then run pci_endpoint_test again, then I'm quite sure that
>>>>> pci_endpoint_test will not pass the second time (because iATU settings
>>>>> were not rewritten after reset).
>>>>>
>>>>> It would be nice if Mani or Vidya could confirm this.
>>
>> So problem 2 out of 2 introduced by the patch in $subject is that
>> DWC drivers with a .core_init_notifier, will perform a reset_control_assert()
>> to reset the core (which will reset both sticky and non-sticky registers),
>> which means that the early return in dw_pcie_ep_set_bar():
>> https://github.com/torvalds/linux/blob/v6.7-rc5/drivers/pci/controller/dwc/pcie-designware-ep.c#L268-L269
>>
>> while returning after the iATU settings have been written,
>> it will return before:
>>
>> 	dw_pcie_writel_dbi2(pci, reg_dbi2, lower_32_bits(size - 1));
>> 	dw_pcie_writel_dbi(pci, reg, flags);
>>
>> Which means that, for drivers with a .core_init_notifier, BARx_REG and
>> BARx_MASK registers will not be written. This means that they will have
>> reset values for these registers, which means that e.g. the BAR_SIZE
>> (which is defined by BARx_MASK) might be incorrect for these platforms
>> because this function returns early.
>>
>> I will not send a fix for this problem, I will leave that to you, or Mani,
>> or Vidya, and hope that people are happy that I simply reported this issue.
>>
> 
> The fix for this issue shouldn't be in the DWC driver but rather in the EPF
> drivers. Because, EPF drivers are the ones calling pci_epc_set_bar() during
> bind(). So they have to properly cleanup the resources once the perst assertion
> happens. This issue also applies to other resources such as DMA channels.
> 
> The problem here is, there is a disconnect between DWC (EPC) and EPF drivers.
> When the perst assertion happens, the event is not passed to EPF drivers
> resulting in the EPF drivers having no knowledge that they need to give up the
> resources.
> 
> We need to fix this in a sane way and I'm looking into it.
> 
> But I really appreciate your finding here and in other thread where we discussed
> similar issue.

We have core_init and link_up notifiers for EPF drivers. Adding link_down and
core_exit notifiers would make a lot of sense :) A core_reset notifier could
also be a solution.

Adding that would also help EPF drivers to handle links going down temporarily
(which can happen). Right now, an EPF driver can only deal with such event by
tracking if it gets multiple link_up events.


-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-20  5:14                 ` Damien Le Moal
@ 2023-12-20  6:03                   ` Manivannan Sadhasivam
  2023-12-20  7:06                     ` Damien Le Moal
  0 siblings, 1 reply; 30+ messages in thread
From: Manivannan Sadhasivam @ 2023-12-20  6:03 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Niklas Cassel, Frank Li, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On Wed, Dec 20, 2023 at 02:14:53PM +0900, Damien Le Moal wrote:
> On 12/20/23 02:59, Manivannan Sadhasivam wrote:
> >>>>> from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU
> >>>>> settings will be re-written for platforms with core_init_notifiers.
> >>>>>
> >>>>> Right now, for a platform with a core_init_notifier, if you run
> >>>>> pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted),
> >>>>> and then run pci_endpoint_test again, then I'm quite sure that
> >>>>> pci_endpoint_test will not pass the second time (because iATU settings
> >>>>> were not rewritten after reset).
> >>>>>
> >>>>> It would be nice if Mani or Vidya could confirm this.
> >>
> >> So problem 2 out of 2 introduced by the patch in $subject is that
> >> DWC drivers with a .core_init_notifier, will perform a reset_control_assert()
> >> to reset the core (which will reset both sticky and non-sticky registers),
> >> which means that the early return in dw_pcie_ep_set_bar():
> >> https://github.com/torvalds/linux/blob/v6.7-rc5/drivers/pci/controller/dwc/pcie-designware-ep.c#L268-L269
> >>
> >> while returning after the iATU settings have been written,
> >> it will return before:
> >>
> >> 	dw_pcie_writel_dbi2(pci, reg_dbi2, lower_32_bits(size - 1));
> >> 	dw_pcie_writel_dbi(pci, reg, flags);
> >>
> >> Which means that, for drivers with a .core_init_notifier, BARx_REG and
> >> BARx_MASK registers will not be written. This means that they will have
> >> reset values for these registers, which means that e.g. the BAR_SIZE
> >> (which is defined by BARx_MASK) might be incorrect for these platforms
> >> because this function returns early.
> >>
> >> I will not send a fix for this problem, I will leave that to you, or Mani,
> >> or Vidya, and hope that people are happy that I simply reported this issue.
> >>
> > 
> > The fix for this issue shouldn't be in the DWC driver but rather in the EPF
> > drivers. Because, EPF drivers are the ones calling pci_epc_set_bar() during
> > bind(). So they have to properly cleanup the resources once the perst assertion
> > happens. This issue also applies to other resources such as DMA channels.
> > 
> > The problem here is, there is a disconnect between DWC (EPC) and EPF drivers.
> > When the perst assertion happens, the event is not passed to EPF drivers
> > resulting in the EPF drivers having no knowledge that they need to give up the
> > resources.
> > 
> > We need to fix this in a sane way and I'm looking into it.
> > 
> > But I really appreciate your finding here and in other thread where we discussed
> > similar issue.
> 
> We have core_init and link_up notifiers for EPF drivers. Adding link_down and
> core_exit notifiers would make a lot of sense :) A core_reset notifier could
> also be a solution.
> 

Yeah, I'm thinking along those lines.

> Adding that would also help EPF drivers to handle links going down temporarily
> (which can happen). Right now, an EPF driver can only deal with such event by
> tracking if it gets multiple link_up events.
> 

There is already link_down notifier that I introduced a while back. But it is
only used by MHI_EPF driver.

- Mani

> 
> -- 
> Damien Le Moal
> Western Digital Research
> 
> 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address
  2023-12-20  6:03                   ` Manivannan Sadhasivam
@ 2023-12-20  7:06                     ` Damien Le Moal
  0 siblings, 0 replies; 30+ messages in thread
From: Damien Le Moal @ 2023-12-20  7:06 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: Niklas Cassel, Frank Li, Vidya Sagar, helgaas, kishon,
	lorenzo.pieralisi, kw, jingoohan1, gustavo.pimentel, lznuaa,
	hongxing.zhu, jdmason, dave.jiang, allenbh, linux-ntb, linux-pci

On 12/20/23 15:03, Manivannan Sadhasivam wrote:
>> We have core_init and link_up notifiers for EPF drivers. Adding link_down and
>> core_exit notifiers would make a lot of sense :) A core_reset notifier could
>> also be a solution.
>>
> 
> Yeah, I'm thinking along those lines.
> 
>> Adding that would also help EPF drivers to handle links going down temporarily
>> (which can happen). Right now, an EPF driver can only deal with such event by
>> tracking if it gets multiple link_up events.
>>
> 
> There is already link_down notifier that I introduced a while back. But it is
> only used by MHI_EPF driver.

I missed that. I need that notifier :)

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-12-20  7:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-22 16:23 [PATCH V2 0/4] NTB function for PCIe RC to EP connection Frank Li
2022-02-22 16:23 ` [PATCH v2 1/4] PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address Frank Li
2023-12-14 14:31   ` Niklas Cassel
2023-12-14 15:19     ` Frank Li
2023-12-14 20:23       ` Niklas Cassel
2023-12-14 20:52         ` Frank Li
2023-12-14 21:28           ` Frank Li
2023-12-16 10:11             ` Niklas Cassel
2023-12-19 17:59               ` Manivannan Sadhasivam
2023-12-20  5:14                 ` Damien Le Moal
2023-12-20  6:03                   ` Manivannan Sadhasivam
2023-12-20  7:06                     ` Damien Le Moal
2023-12-14 21:39           ` Niklas Cassel
2022-02-22 16:23 ` [PATCH v2 2/4] NTB: epf: Allow more flexibility in the memory BAR map method Frank Li
2022-03-10 22:08   ` Zhi Li
2022-02-22 16:23 ` [PATCH v2 3/4] PCI: endpoint: Support NTB transfer between RC and EP Frank Li
2022-03-10 22:09   ` Zhi Li
2022-12-14  0:08   ` Bjorn Helgaas
     [not found]     ` <CAHrpEqSGySHDET3YPu3czzoMBmCRJsgGgU4s3GWWbtruFLVHaA@mail.gmail.com>
2022-12-14  0:28       ` Bjorn Helgaas
2022-12-14  0:47         ` [EXT] " Frank Li
2022-02-22 16:23 ` [PATCH v2 4/4] Documentation: PCI: Add specification for the PCI vNTB function device Frank Li
2022-03-10 22:01 ` [PATCH V2 0/4] NTB function for PCIe RC to EP connection Zhi Li
2022-03-10 22:07   ` Zhi Li
2022-04-04 20:12     ` Zhi Li
2022-04-05 10:34 ` Kishon Vijay Abraham I
2022-04-05 15:35   ` Zhi Li
2022-04-20 20:22     ` Zhi Li
2022-04-22 15:15       ` Kishon Vijay Abraham I
2022-04-22 15:36         ` Zhi Li
2022-08-12 14:02 ` Jon Mason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).