* [PATCH v12 0/4] add debugfs to migration driver
@ 2023-07-28  7:21 liulongfang
  2023-07-28  7:21 ` [PATCH v12 1/4] vfio/migration: Add debugfs to live " liulongfang
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: liulongfang @ 2023-07-28  7:21 UTC
  To: alex.williamson, jgg, shameerali.kolothum.thodi, jonathan.cameron
  Cc: cohuck, linux-kernel, linuxarm, liulongfang

Add debugfs support to the VFIO migration driver so that each step
of a live migration can be tested on its own.

When a live migration fails, the user can inspect the migration
state and data on the source and on the destination separately,
which makes it easier to analyze and locate the problem.

Changes v11 -> v12
	Update loading conditions of vfio debugfs.

Changes v10 -> v11
	Delete the device restore function in debugfs.

Changes v9 -> v10
	Update the debugfs file of the live migration driver.

Changes v8 -> v9
	Update the debugfs directory structure of vfio.

Changes v7 -> v8
	Add support for platform devices.

Changes v6 -> v7
	Fix some code style issues.

Changes v5 -> v6
	Control the creation of debugfs through the CONFIG_DEBUG_FS.

Changes v4 -> v5
	Remove the newly added vfio_migration_ops and use seq_printf
	to optimize the implementation of debugfs.

Changes v3 -> v4
	Change the migration_debug_operate interface to debug_root file.

Changes v2 -> v3
	Extend the debugfs function from hisilicon device to vfio.

Changes v1 -> v2
	Change the registration method of root_debugfs to register
	with module initialization.

Longfang Liu (4):
  vfio/migration: Add debugfs to live migration driver
  hisi_acc_vfio_pci: extract public functions for container_of
  hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
  Documentation: add debugfs description for vfio

 .../ABI/testing/debugfs-hisi-migration        |  36 ++++
 Documentation/ABI/testing/debugfs-vfio        |  25 +++
 MAINTAINERS                                   |   2 +
 drivers/vfio/Makefile                         |   1 +
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 199 +++++++++++++++++-
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h    |   3 +
 drivers/vfio/vfio.h                           |  14 ++
 drivers/vfio/vfio_debugfs.c                   |  80 +++++++
 drivers/vfio/vfio_main.c                      |   5 +-
 include/linux/vfio.h                          |   7 +
 10 files changed, 361 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
 create mode 100644 Documentation/ABI/testing/debugfs-vfio
 create mode 100644 drivers/vfio/vfio_debugfs.c

-- 
2.24.0



* [PATCH v12 1/4] vfio/migration: Add debugfs to live migration driver
  2023-07-28  7:21 [PATCH v12 0/4] add debugfs to migration driver liulongfang
@ 2023-07-28  7:21 ` liulongfang
  2023-07-28  7:21 ` [PATCH v12 2/4] hisi_acc_vfio_pci: extract public functions for container_of liulongfang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: liulongfang @ 2023-07-28  7:21 UTC
  To: alex.williamson, jgg, shameerali.kolothum.thodi, jonathan.cameron
  Cc: cohuck, linux-kernel, linuxarm, liulongfang

From: Longfang Liu <liulongfang@huawei.com>

Live migration involves multiple devices, software components and
operational steps. An error at any of them can cause the migration
to fail, and this complexity makes it difficult to locate and
analyze the cause of a failure.

To help locate the cause quickly when a live migration fails, add
a set of debugfs files to the vfio live migration driver.

    +-------------------------------------------+
    |                                           |
    |                                           |
    |                  QEMU                     |
    |                                           |
    |                                           |
    +---+----------------------------+----------+
        |      ^                     |      ^
        |      |                     |      |
        |      |                     |      |
        v      |                     v      |
     +---------+--+               +---------+--+
     |src vfio_dev|               |dst vfio_dev|
     +--+---------+               +--+---------+
        |      ^                     |      ^
        |      |                     |      |
        v      |                     |      |
   +-----------+----+           +-----------+----+
   |src dev debugfs |           |dst dev debugfs |
   +----------------+           +----------------+

The whole debugfs hierarchy depends on CONFIG_DEBUG_FS. If that
option is not enabled, the interfaces in vfio.h are empty
definitions, and the debugfs directories are neither created nor
initialized.
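
The empty-definition pattern described above can be sketched in plain
userspace C. This is only an illustration, not the kernel code:
DEMO_DEBUG_FS, struct demo_device, demo_debugfs_init() and
demo_register() are hypothetical names standing in for CONFIG_DEBUG_FS
and the vfio hooks.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for CONFIG_DEBUG_FS. Uncomment to build the
 * "real" hook instead of the empty stub.
 */
/* #define DEMO_DEBUG_FS */

struct demo_device {
	const char *name;
	int debugfs_ready;	/* 1 only when the debug hook did real work */
};

#ifdef DEMO_DEBUG_FS
static inline void demo_debugfs_init(struct demo_device *dev)
{
	dev->debugfs_ready = 1;
	printf("debugfs entries created for %s\n", dev->name);
}
#else
/* Empty stub: callers need no #ifdef and the call compiles away. */
static inline void demo_debugfs_init(struct demo_device *dev)
{
	(void)dev;
}
#endif

/* The registration path calls the hook unconditionally. */
static int demo_register(struct demo_device *dev)
{
	dev->debugfs_ready = 0;
	demo_debugfs_init(dev);
	return 0;
}
```

With the macro left undefined, demo_register() still succeeds and
debugfs_ready stays 0, which mirrors how the registration path is meant
to behave on a kernel built without debugfs.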

   vfio
    |
    +---<dev_name1>
    |    +---migration
    |        +--state
    |        +--hisi_acc
    |            +--attr
    |            +--data
    |            +--save
    |            +--io_test
    |
    +---<dev_name2>
         +---migration
             +--state
             +--hisi_acc
                 +--attr
                 +--data
                 +--save
                 +--io_test

debugfs first creates a common root directory, "vfio". It then
creates a directory named after dev_name() for each device. If the
device supports live migration, a "migration" directory is created
inside the device directory, holding the common live migration
state lookup file "state". Finally, the driver creates a directory
for its device type under "migration" and places its own debug
files there.
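
As a rough userspace sketch of how a test tool might address this
layout: the helper below builds the path of a file under a device's
"migration" directory and reads its first line. It assumes debugfs is
mounted at /sys/kernel/debug; demo_mig_path() and demo_read_line() are
hypothetical helpers, not part of this series.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/kernel/debug/vfio/<dev>/migration/<leaf>".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int demo_mig_path(char *buf, size_t len, const char *dev,
			 const char *leaf)
{
	int n = snprintf(buf, len, "/sys/kernel/debug/vfio/%s/migration/%s",
			 dev, leaf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of a file, without the trailing newline. */
static int demo_read_line(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fgets(out, (int)len, f) != NULL;
	fclose(f);
	if (!ok)
		return -1;
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

For a device directory named "0000:75:00.1" (an example name only),
demo_mig_path(buf, sizeof(buf), "0000:75:00.1", "state") yields
/sys/kernel/debug/vfio/0000:75:00.1/migration/state, and a
driver-specific file can be addressed by passing "hisi_acc/attr" as
the leaf.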

Here, the HiSilicon accelerator driver creates four debug files:
attr: used to export the attribute parameters of the
current live migration device.
data: used to export the live migration data of the current
live migration device.
save: used to read the current live migration device's data
and save it to the driver.
io_test: used to test the IO read and write for the driver.

By operating on these debug files, the live migration function of
a device can be tested step by step, checking the state of the
device and of the software at each stage without performing a
complete live migration. After a live migration has been performed,
the migration data of the device can also be read back through the
debug files.
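
The common "state" file reports the migration state as a string. One
way to keep each name next to its enumerator is a lookup table, which
makes a mismatched label harder to introduce than in a long switch.
The sketch below is a userspace illustration only: enum demo_mig_state
and demo_state_name() are hypothetical, and the enumerator values are
chosen for the example rather than taken from the uapi header.

```c
#include <stddef.h>

/* Hypothetical mirror of the migration states named in this series. */
enum demo_mig_state {
	DEMO_STATE_ERROR,
	DEMO_STATE_STOP,
	DEMO_STATE_RUNNING,
	DEMO_STATE_STOP_COPY,
	DEMO_STATE_RESUMING,
	DEMO_STATE_RUNNING_P2P,
};

/* Map a state to its printable name; unknown values give "Invalid". */
static const char *demo_state_name(enum demo_mig_state state)
{
	static const char *const names[] = {
		[DEMO_STATE_ERROR]	 = "ERROR",
		[DEMO_STATE_STOP]	 = "STOP",
		[DEMO_STATE_RUNNING]	 = "RUNNING",
		[DEMO_STATE_STOP_COPY]	 = "STOP_COPY",
		[DEMO_STATE_RESUMING]	 = "RESUMING",
		[DEMO_STATE_RUNNING_P2P] = "RUNNING_P2P",
	};
	size_t idx = (size_t)state;

	if (idx >= sizeof(names) / sizeof(names[0]) || !names[idx])
		return "Invalid";
	return names[idx];
}
```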

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
---
 drivers/vfio/Makefile       |  1 +
 drivers/vfio/vfio.h         | 14 +++++++
 drivers/vfio/vfio_debugfs.c | 80 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio_main.c    |  5 ++-
 include/linux/vfio.h        |  7 ++++
 5 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/vfio_debugfs.c

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index c82ea032d352..7934ac829989 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -8,6 +8,7 @@ vfio-$(CONFIG_VFIO_GROUP) += group.o
 vfio-$(CONFIG_IOMMUFD) += iommufd.o
 vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o
+vfio-$(CONFIG_DEBUG_FS) += vfio_debugfs.o
 
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 307e3f29b527..09b00757d0bb 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -448,4 +448,18 @@ static inline void vfio_device_put_kvm(struct vfio_device *device)
 }
 #endif
 
+#ifdef CONFIG_DEBUG_FS
+void vfio_debugfs_create_root(void);
+void vfio_debugfs_remove_root(void);
+
+void vfio_device_debugfs_init(struct vfio_device *vdev);
+void vfio_device_debugfs_exit(struct vfio_device *vdev);
+#else
+static inline void vfio_debugfs_create_root(void) { }
+static inline void vfio_debugfs_remove_root(void) { }
+
+static inline void vfio_device_debugfs_init(struct vfio_device *vdev) { }
+static inline void vfio_device_debugfs_exit(struct vfio_device *vdev) { }
+#endif /* CONFIG_DEBUG_FS */
+
 #endif
diff --git a/drivers/vfio/vfio_debugfs.c b/drivers/vfio/vfio_debugfs.c
new file mode 100644
index 000000000000..d903293ed9c7
--- /dev/null
+++ b/drivers/vfio/vfio_debugfs.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, HiSilicon Ltd.
+ */
+
+#include <linux/device.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/vfio.h>
+#include "vfio.h"
+
+static struct dentry *vfio_debugfs_root;
+
+static int vfio_device_state_read(struct seq_file *seq, void *data)
+{
+	struct device *vf_dev = seq->private;
+	struct vfio_device *vdev = container_of(vf_dev, struct vfio_device, device);
+	enum vfio_device_mig_state state;
+	int ret;
+
+	ret = vdev->mig_ops->migration_get_state(vdev, &state);
+	if (ret)
+		return -EINVAL;
+
+	switch (state) {
+	case VFIO_DEVICE_STATE_RUNNING:
+		seq_printf(seq, "%s\n", "RUNNING");
+		break;
+	case VFIO_DEVICE_STATE_STOP_COPY:
+		seq_printf(seq, "%s\n", "STOP_COPY");
+		break;
+	case VFIO_DEVICE_STATE_STOP:
+		seq_printf(seq, "%s\n", "STOP");
+		break;
+	case VFIO_DEVICE_STATE_RESUMING:
+		seq_printf(seq, "%s\n", "RESUMING");
+		break;
+	case VFIO_DEVICE_STATE_RUNNING_P2P:
+		seq_printf(seq, "%s\n", "RUNNING_P2P");
+		break;
+	case VFIO_DEVICE_STATE_ERROR:
+		seq_printf(seq, "%s\n", "ERROR");
+		break;
+	default:
+		seq_printf(seq, "%s\n", "Invalid");
+	}
+
+	return 0;
+}
+
+void vfio_device_debugfs_init(struct vfio_device *vdev)
+{
+	struct dentry *vfio_dev_migration = NULL;
+	struct device *dev = &vdev->device;
+
+	vdev->debug_root = debugfs_create_dir(dev_name(vdev->dev), vfio_debugfs_root);
+
+	if (vdev->mig_ops) {
+		vfio_dev_migration = debugfs_create_dir("migration", vdev->debug_root);
+		debugfs_create_devm_seqfile(dev, "state", vfio_dev_migration,
+					  vfio_device_state_read);
+	}
+}
+
+void vfio_device_debugfs_exit(struct vfio_device *vdev)
+{
+	debugfs_remove_recursive(vdev->debug_root);
+}
+
+void vfio_debugfs_create_root(void)
+{
+	vfio_debugfs_root = debugfs_create_dir("vfio", NULL);
+}
+
+void vfio_debugfs_remove_root(void)
+{
+	debugfs_remove_recursive(vfio_debugfs_root);
+	vfio_debugfs_root = NULL;
+}
+
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 902f06e52c48..7f88532d0476 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -309,7 +309,7 @@ static int __vfio_register_dev(struct vfio_device *device,
 
 	/* Refcounting can't start until the driver calls register */
 	refcount_set(&device->refcount, 1);
-
+	vfio_device_debugfs_init(device);
 	vfio_device_group_register(device);
 
 	return 0;
@@ -378,6 +378,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 		}
 	}
 
+	vfio_device_debugfs_exit(device);
 	/* Balances vfio_device_set_group in register path */
 	vfio_device_remove_group(device);
 }
@@ -1609,6 +1610,7 @@ static int __init vfio_init(void)
 	if (ret)
 		goto err_alloc_dev_chrdev;
 
+	vfio_debugfs_create_root();
 	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
 	return 0;
 
@@ -1631,6 +1633,7 @@ static void __exit vfio_cleanup(void)
 	vfio_virqfd_exit();
 	vfio_group_cleanup();
 	xa_destroy(&vfio_device_set_xa);
+	vfio_debugfs_remove_root();
 }
 
 module_init(vfio_init);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 5a1dee983f17..10cd84a3e31c 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -69,6 +69,13 @@ struct vfio_device {
 	u8 iommufd_attached:1;
 #endif
 	u8 cdev_opened:1;
+#ifdef CONFIG_DEBUG_FS
+	/*
+	 * debug_root is a static property of the vfio_device
+	 * which must be set prior to registering the vfio_device.
+	 */
+	struct dentry *debug_root;
+#endif
 };
 
 /**
-- 
2.24.0



* [PATCH v12 2/4] hisi_acc_vfio_pci: extract public functions for container_of
  2023-07-28  7:21 [PATCH v12 0/4] add debugfs to migration driver liulongfang
  2023-07-28  7:21 ` [PATCH v12 1/4] vfio/migration: Add debugfs to live " liulongfang
@ 2023-07-28  7:21 ` liulongfang
  2023-07-28  7:21 ` [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver liulongfang
  2023-07-28  7:21 ` [PATCH v12 4/4] Documentation: add debugfs description for vfio liulongfang
  3 siblings, 0 replies; 11+ messages in thread
From: liulongfang @ 2023-07-28  7:21 UTC
  To: alex.williamson, jgg, shameerali.kolothum.thodi, jonathan.cameron
  Cc: cohuck, linux-kernel, linuxarm, liulongfang

From: Longfang Liu <liulongfang@huawei.com>

In the current driver, struct hisi_acc_vf_core_device is obtained
from the vfio_device pointer (vdev) through container_of().
This pattern is repeated in many places in the driver. To reduce
the repetition, extract it into a common helper function and use
the helper instead.
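
For readers unfamiliar with the idiom, here is a minimal userspace
sketch of such a helper. The demo_ structures and demo_container_of()
are hypothetical stand-ins for the driver types and the kernel macro
(which additionally performs type checking).

```c
#include <stddef.h>

/* Simplified userspace variant of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the nested driver structures. */
struct demo_vfio_device {
	int open_count;
};

struct demo_pci_core_device {
	int irq;
	struct demo_vfio_device vdev;
};

struct demo_acc_device {
	int mig_state;
	struct demo_pci_core_device core_device;
};

/* One helper replaces the open-coded container_of() at every call site. */
static struct demo_acc_device *demo_get_acc_dev(struct demo_vfio_device *vdev)
{
	return demo_container_of(vdev, struct demo_acc_device,
				 core_device.vdev);
}
```

Given a pointer to the innermost vdev member, the helper walks back to
the enclosing structure, so callers only ever spell out the member
path in one place.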

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 21 ++++++++++---------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index b2f9778c8366..242ad319932a 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -630,6 +630,12 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
 	}
 }
 
+static struct hisi_acc_vf_core_device *hisi_acc_get_vf_dev(struct vfio_device *vdev)
+{
+	return container_of(vdev, struct hisi_acc_vf_core_device,
+			    core_device.vdev);
+}
+
 /*
  * This function is called in all state_mutex unlock cases to
  * handle a 'deferred_reset' if exists.
@@ -1042,8 +1048,7 @@ static struct file *
 hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
 				   enum vfio_device_mig_state new_state)
 {
-	struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
-			struct hisi_acc_vf_core_device, core_device.vdev);
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
 	enum vfio_device_mig_state next_state;
 	struct file *res = NULL;
 	int ret;
@@ -1084,8 +1089,7 @@ static int
 hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
 				   enum vfio_device_mig_state *curr_state)
 {
-	struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
-			struct hisi_acc_vf_core_device, core_device.vdev);
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
 
 	mutex_lock(&hisi_acc_vdev->state_mutex);
 	*curr_state = hisi_acc_vdev->mig_state;
@@ -1301,8 +1305,7 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
 
 static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
 {
-	struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
-			struct hisi_acc_vf_core_device, core_device.vdev);
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
 	struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
 	int ret;
 
@@ -1325,8 +1328,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
 
 static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
 {
-	struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
-			struct hisi_acc_vf_core_device, core_device.vdev);
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
 	struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
 
 	iounmap(vf_qm->io_base);
@@ -1341,8 +1343,7 @@ static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = {
 
 static int hisi_acc_vfio_pci_migrn_init_dev(struct vfio_device *core_vdev)
 {
-	struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
-			struct hisi_acc_vf_core_device, core_device.vdev);
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
 	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
 	struct hisi_qm *pf_qm = hisi_acc_get_pf_qm(pdev);
 
-- 
2.24.0



* [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
  2023-07-28  7:21 [PATCH v12 0/4] add debugfs to migration driver liulongfang
  2023-07-28  7:21 ` [PATCH v12 1/4] vfio/migration: Add debugfs to live " liulongfang
  2023-07-28  7:21 ` [PATCH v12 2/4] hisi_acc_vfio_pci: extract public functions for container_of liulongfang
@ 2023-07-28  7:21 ` liulongfang
  2023-08-07 21:43   ` Alex Williamson
  2023-07-28  7:21 ` [PATCH v12 4/4] Documentation: add debugfs description for vfio liulongfang
  3 siblings, 1 reply; 11+ messages in thread
From: liulongfang @ 2023-07-28  7:21 UTC
  To: alex.williamson, jgg, shameerali.kolothum.thodi, jonathan.cameron
  Cc: cohuck, linux-kernel, linuxarm, liulongfang

From: Longfang Liu <liulongfang@huawei.com>

On top of the VFIO debugfs framework, and only if CONFIG_DEBUG_FS is
enabled, register debug functions for the live migration driver
of the HiSilicon accelerator device.

After the HiSilicon accelerator device is registered with the vfio
live migration debugfs framework, a debugfs directory "hisi_acc"
is created, and four debug files are created in this directory:

data: used to get the migration data from the driver.
attr: used to get the device attribute parameters from the driver.
save: used to read the data of the live migration device and save
it to the driver.
io_test: used to test IO read and write for the driver.
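
In the diff below, the data shown by "data" and "attr" comes from a
debug copy that is filled from the saving or resuming migration file
just before that file is disabled. A userspace sketch of this snapshot
step follows, using hypothetical demo_ types in place of
struct hisi_acc_vf_migration_file and struct acc_vf_data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's migration file types. */
struct demo_vf_data {
	unsigned char regs[16];
};

struct demo_migf {
	bool disabled;
	size_t total_length;
	struct demo_vf_data vf_data;
};

/*
 * Copy the live migration file into the debug snapshot. A NULL dst
 * means no debug copy was allocated, so there is nothing to do.
 */
static void demo_migf_save(struct demo_migf *dst, const struct demo_migf *src)
{
	if (!dst)
		return;

	dst->disabled = false;
	dst->total_length = src->total_length;
	memcpy(&dst->vf_data, &src->vf_data, sizeof(dst->vf_data));
}
```

Because the snapshot is a separate object, it remains readable after
the original migration file has been released.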

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
---
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 178 ++++++++++++++++++
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h    |   3 +
 2 files changed, 181 insertions(+)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 242ad319932a..a811dc237a29 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -15,6 +15,7 @@
 #include <linux/anon_inodes.h>
 
 #include "hisi_acc_vfio_pci.h"
+#include "../../vfio.h"
 
 /* Return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
 static int qm_wait_dev_not_ready(struct hisi_qm *qm)
@@ -606,6 +607,18 @@ hisi_acc_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
 	}
 }
 
+static void hisi_acc_vf_migf_save(struct hisi_acc_vf_migration_file *dst_migf,
+	struct hisi_acc_vf_migration_file *src_migf)
+{
+	if (!dst_migf)
+		return;
+
+	dst_migf->disabled = false;
+	dst_migf->total_length = src_migf->total_length;
+	memcpy(&dst_migf->vf_data, &src_migf->vf_data,
+		    sizeof(struct acc_vf_data));
+}
+
 static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
 {
 	mutex_lock(&migf->lock);
@@ -618,12 +631,16 @@ static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
 static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vdev)
 {
 	if (hisi_acc_vdev->resuming_migf) {
+		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
+						hisi_acc_vdev->resuming_migf);
 		hisi_acc_vf_disable_fd(hisi_acc_vdev->resuming_migf);
 		fput(hisi_acc_vdev->resuming_migf->filp);
 		hisi_acc_vdev->resuming_migf = NULL;
 	}
 
 	if (hisi_acc_vdev->saving_migf) {
+		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
+						hisi_acc_vdev->saving_migf);
 		hisi_acc_vf_disable_fd(hisi_acc_vdev->saving_migf);
 		fput(hisi_acc_vdev->saving_migf->filp);
 		hisi_acc_vdev->saving_migf = NULL;
@@ -1303,6 +1320,162 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
 	return vfio_pci_core_ioctl(core_vdev, cmd, arg);
 }
 
+static int hisi_acc_vf_debug_check(struct seq_file *seq, struct vfio_device *vdev)
+{
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
+
+	if (!vdev->mig_ops || !migf) {
+		seq_printf(seq, "%s\n", "device does not support live migration!");
+		return -EINVAL;
+	}
+
+	/* If device not opened, the debugfs operation will trigger calltrace */
+	if (!vdev->open_count) {
+		seq_printf(seq, "%s\n", "device not opened!");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int hisi_acc_vf_debug_io(struct seq_file *seq, void *data)
+{
+	struct device *vf_dev = seq->private;
+	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+	struct vfio_device *vdev = &core_device->vdev;
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+	struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+	u64 value;
+	int ret;
+
+	ret = hisi_acc_vf_debug_check(seq, vdev);
+	if (ret)
+		return 0;
+
+	ret = qm_wait_dev_not_ready(vf_qm);
+	if (ret) {
+		seq_printf(seq, "%s\n", "VF device not ready!");
+		return 0;
+	}
+
+	value = readl(vf_qm->io_base + QM_MB_CMD_SEND_BASE);
+	seq_printf(seq, "%s:0x%llx\n", "debug mailbox val", value);
+
+	return 0;
+}
+
+static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
+{
+	struct device *vf_dev = seq->private;
+	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+	struct vfio_device *vdev = &core_device->vdev;
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
+	int ret;
+
+	ret = hisi_acc_vf_debug_check(seq, vdev);
+	if (ret)
+		return 0;
+
+	ret = vf_qm_state_save(hisi_acc_vdev, migf);
+	if (ret) {
+		seq_printf(seq, "%s\n", "failed to save device data!");
+		return 0;
+	}
+	seq_printf(seq, "%s\n", "successfully saved device data!");
+
+	return 0;
+}
+
+static int hisi_acc_vf_data_read(struct seq_file *seq, void *data)
+{
+	struct device *vf_dev = seq->private;
+	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+	struct vfio_device *vdev = &core_device->vdev;
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
+	size_t vf_data_sz = offsetofend(struct acc_vf_data, padding);
+
+	if (debug_migf && debug_migf->total_length)
+		seq_hex_dump(seq, "Mig Data:", DUMP_PREFIX_OFFSET, 16, 1,
+				(unsigned char *)&debug_migf->vf_data,
+				vf_data_sz, false);
+	else
+		seq_printf(seq, "%s\n", "device not migrated!");
+
+	return 0;
+}
+
+static int hisi_acc_vf_attr_read(struct seq_file *seq, void *data)
+{
+	struct device *vf_dev = seq->private;
+	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+	struct vfio_device *vdev = &core_device->vdev;
+	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
+
+	if (debug_migf && debug_migf->total_length) {
+		seq_printf(seq,
+			 "acc device:\n"
+			 "device  state: %d\n"
+			 "device  ready: %u\n"
+			 "data    valid: %d\n"
+			 "data     size: %zu\n",
+			 hisi_acc_vdev->mig_state,
+			 hisi_acc_vdev->vf_qm_state,
+			 debug_migf->disabled,
+			 debug_migf->total_length);
+	} else {
+		seq_printf(seq, "%s\n", "device not migrated!");
+	}
+
+	return 0;
+}
+
+static int hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+	struct vfio_device *vdev = &hisi_acc_vdev->core_device.vdev;
+	struct dentry *vfio_dev_migration = NULL;
+	struct dentry *vfio_hisi_acc = NULL;
+	struct device *dev = vdev->dev;
+	void *migf = NULL;
+
+	if (!debugfs_initialized())
+		return 0;
+
+	migf = kzalloc(sizeof(struct hisi_acc_vf_migration_file), GFP_KERNEL);
+	if (!migf)
+		return -ENOMEM;
+	hisi_acc_vdev->debug_migf = migf;
+
+	vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
+	if (!vfio_dev_migration) {
+		dev_err(dev, "failed to lookup migration debugfs file!\n");
+		return -ENODEV;
+	}
+
+	vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
+	debugfs_create_devm_seqfile(dev, "data", vfio_hisi_acc,
+				  hisi_acc_vf_data_read);
+	debugfs_create_devm_seqfile(dev, "attr", vfio_hisi_acc,
+				  hisi_acc_vf_attr_read);
+	debugfs_create_devm_seqfile(dev, "io_test", vfio_hisi_acc,
+				  hisi_acc_vf_debug_io);
+	debugfs_create_devm_seqfile(dev, "save", vfio_hisi_acc,
+				  hisi_acc_vf_debug_save);
+
+	return 0;
+}
+
+static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+	if (!debugfs_initialized())
+		return;
+
+	kfree(hisi_acc_vdev->debug_migf);
+}
+
 static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
 {
 	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
@@ -1323,6 +1496,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
 	}
 
 	vfio_pci_core_finish_enable(vdev);
+
 	return 0;
 }
 
@@ -1422,6 +1596,9 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
 	ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
 	if (ret)
 		goto out_put_vdev;
+
+	if (ops == &hisi_acc_vfio_pci_migrn_ops)
+		hisi_acc_vfio_debug_init(hisi_acc_vdev);
 	return 0;
 
 out_put_vdev:
@@ -1433,6 +1610,7 @@ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
 {
 	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);
 
+	hisi_acc_vf_debugfs_exit(hisi_acc_vdev);
 	vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
 	vfio_put_device(&hisi_acc_vdev->core_device.vdev);
 }
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
index dcabfeec6ca1..93f44bcf53ee 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
@@ -113,5 +113,8 @@ struct hisi_acc_vf_core_device {
 	spinlock_t reset_lock;
 	struct hisi_acc_vf_migration_file *resuming_migf;
 	struct hisi_acc_vf_migration_file *saving_migf;
+
+	/* For debugfs */
+	struct hisi_acc_vf_migration_file *debug_migf;
 };
 #endif /* HISI_ACC_VFIO_PCI_H */
-- 
2.24.0



* [PATCH v12 4/4] Documentation: add debugfs description for vfio
  2023-07-28  7:21 [PATCH v12 0/4] add debugfs to migration driver liulongfang
                   ` (2 preceding siblings ...)
  2023-07-28  7:21 ` [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver liulongfang
@ 2023-07-28  7:21 ` liulongfang
  2023-08-04 14:58   ` Jason Gunthorpe
  3 siblings, 1 reply; 11+ messages in thread
From: liulongfang @ 2023-07-28  7:21 UTC
  To: alex.williamson, jgg, shameerali.kolothum.thodi, jonathan.cameron
  Cc: cohuck, linux-kernel, linuxarm, liulongfang

From: Longfang Liu <liulongfang@huawei.com>

1. Add two debugfs ABI description files to help users understand
how to use the debugfs of the accelerator live migration driver.
2. Update the file paths that need to be maintained in MAINTAINERS.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
---
 .../ABI/testing/debugfs-hisi-migration        | 36 +++++++++++++++++++
 Documentation/ABI/testing/debugfs-vfio        | 25 +++++++++++++
 MAINTAINERS                                   |  2 ++
 3 files changed, 63 insertions(+)
 create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
 create mode 100644 Documentation/ABI/testing/debugfs-vfio

diff --git a/Documentation/ABI/testing/debugfs-hisi-migration b/Documentation/ABI/testing/debugfs-hisi-migration
new file mode 100644
index 000000000000..791dd8a09575
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-hisi-migration
@@ -0,0 +1,36 @@
+What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/data
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	Read the live migration data of the vfio device.
+		These data include device status data, queue configuration
+		data and some task configuration data.
+		The output format of the data is defined by the live
+		migration driver.
+
+What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/attr
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	Read the live migration attributes of the vfio device.
+		They include device status attributes and data length attributes.
+		The output format of the attributes is defined by the live
+		migration driver.
+
+What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/io_test
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	Trigger the HiSilicon accelerator device to perform
+		an IO test through a read operation. On success,
+		it returns the execution result of the mailbox. On failure,
+		it returns an error log.
+
+What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/save
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	Trigger the HiSilicon accelerator device to perform
+		the state saving operation of live migration through a read
+		operation, and output the result of the operation.
+
diff --git a/Documentation/ABI/testing/debugfs-vfio b/Documentation/ABI/testing/debugfs-vfio
new file mode 100644
index 000000000000..086a8c52df35
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-vfio
@@ -0,0 +1,25 @@
+What:		/sys/kernel/debug/vfio
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	This debugfs directory is used for debugging
+		vfio devices. It is a common directory for all vfio devices.
+		Each device should create a device subdirectory under this
+		directory by referencing the public registration interface.
+
+What:		/sys/kernel/debug/vfio/<device>/migration
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	This debugfs directory is used for debugging
+		vfio devices that support live migration.
+		The driver-specific debugfs entries of each such device
+		can be created under this directory.
+
+What:		/sys/kernel/debug/vfio/<device>/migration/state
+Date:		Aug 2023
+KernelVersion:  6.6
+Contact:	Longfang Liu <liulongfang@huawei.com>
+Description:	Read the live migration state of the vfio device.
+		The reported states include:
+		ERROR, RUNNING, STOP, STOP_COPY, RESUMING, RUNNING_P2P.
diff --git a/MAINTAINERS b/MAINTAINERS
index d516295978a4..d4fb7547b687 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22304,6 +22304,7 @@ L:	kvm@vger.kernel.org
 S:	Maintained
 T:	git https://github.com/awilliam/linux-vfio.git
 F:	Documentation/ABI/testing/sysfs-devices-vfio-dev
+F:	Documentation/ABI/testing/debugfs-vfio
 F:	Documentation/driver-api/vfio.rst
 F:	drivers/vfio/
 F:	include/linux/vfio.h
@@ -22321,6 +22322,7 @@ M:	Longfang Liu <liulongfang@huawei.com>
 M:	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
 L:	kvm@vger.kernel.org
 S:	Maintained
+F:	Documentation/ABI/testing/debugfs-hisi-migration
 F:	drivers/vfio/pci/hisilicon/
 
 VFIO MEDIATED DEVICE DRIVERS
-- 
2.24.0



* Re: [PATCH v12 4/4] Documentation: add debugfs description for vfio
  2023-07-28  7:21 ` [PATCH v12 4/4] Documentation: add debugfs description for vfio liulongfang
@ 2023-08-04 14:58   ` Jason Gunthorpe
  2023-08-07  1:33     ` liulongfang
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2023-08-04 14:58 UTC
  To: liulongfang
  Cc: alex.williamson, shameerali.kolothum.thodi, jonathan.cameron,
	cohuck, linux-kernel, linuxarm

On Fri, Jul 28, 2023 at 03:21:04PM +0800, liulongfang wrote:
> From: Longfang Liu <liulongfang@huawei.com>
> 
> 1. Add two debugfs ABI description files to help users understand
> how to use the debugfs of the accelerator live migration driver.
> 2. Update the file paths that need to be maintained in MAINTAINERS.
> 
> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
> ---
>  .../ABI/testing/debugfs-hisi-migration        | 36 +++++++++++++++++++
>  Documentation/ABI/testing/debugfs-vfio        | 25 +++++++++++++
>  MAINTAINERS                                   |  2 ++
>  3 files changed, 63 insertions(+)
>  create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
>  create mode 100644 Documentation/ABI/testing/debugfs-vfio
> 
> diff --git a/Documentation/ABI/testing/debugfs-hisi-migration b/Documentation/ABI/testing/debugfs-hisi-migration
> new file mode 100644
> index 000000000000..791dd8a09575
> --- /dev/null
> +++ b/Documentation/ABI/testing/debugfs-hisi-migration
> @@ -0,0 +1,36 @@
> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/data
> +Date:		Aug 2023
> +KernelVersion:  6.6
> +Contact:	Longfang Liu <liulongfang@huawei.com>
> +Description:	Read the live migration data of the vfio device.
> +		These data include device status data, queue configuration
> +		data and some task configuration data.
> +		The output format of the data is defined by the live
> +		migration driver.
> +
> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/attr
> +Date:		Aug 2023
> +KernelVersion:  6.6
> +Contact:	Longfang Liu <liulongfang@huawei.com>
> +Description:	Read the live migration attributes of the vfio device.
> +		it include device status attributes and data length attributes
> +		The output format of the attributes is defined by the live
> +		migration driver.
> +
> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/io_test
> +Date:		Aug 2023
> +KernelVersion:  6.6
> +Contact:	Longfang Liu <liulongfang@huawei.com>
> +Description:	Trigger the HiSilicon accelerator device to perform
> +		the io test through the read operation. If successful,
> +		it returns the execution result of mailbox. If fails,
> +		it returns error log result.
> +
> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/save
> +Date:		Aug 2023
> +KernelVersion:  6.6
> +Contact:	Longfang Liu <liulongfang@huawei.com>
> +Description:	Trigger the Hisilicon accelerator device to perform
> +		the state saving operation of live migration through the read
> +		operation, and output the operation log results.

I still very much do not like this use of debugfs.

If you want to test migration, then write a test program and use the
normal API.

Creating some parallel backdoor to work the same API is just
unneeded complexity.

Jason


* Re: [PATCH v12 4/4] Documentation: add debugfs description for vfio
  2023-08-04 14:58   ` Jason Gunthorpe
@ 2023-08-07  1:33     ` liulongfang
  2023-08-07 22:03       ` Alex Williamson
  0 siblings, 1 reply; 11+ messages in thread
From: liulongfang @ 2023-08-07  1:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: alex.williamson, shameerali.kolothum.thodi, jonathan.cameron,
	cohuck, linux-kernel, linuxarm

On 2023/8/4 22:58, Jason Gunthorpe wrote:
> On Fri, Jul 28, 2023 at 03:21:04PM +0800, liulongfang wrote:
>> From: Longfang Liu <liulongfang@huawei.com>
>>
>> 1.Add two debugfs document description file to help users understand
>> how to use the accelerator live migration driver's debugfs.
>> 2.Update the file paths that need to be maintained in MAINTAINERS
>>
>> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
>> ---
>>  .../ABI/testing/debugfs-hisi-migration        | 36 +++++++++++++++++++
>>  Documentation/ABI/testing/debugfs-vfio        | 25 +++++++++++++
>>  MAINTAINERS                                   |  2 ++
>>  3 files changed, 63 insertions(+)
>>  create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
>>  create mode 100644 Documentation/ABI/testing/debugfs-vfio
>>
>> diff --git a/Documentation/ABI/testing/debugfs-hisi-migration b/Documentation/ABI/testing/debugfs-hisi-migration
>> new file mode 100644
>> index 000000000000..791dd8a09575
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/debugfs-hisi-migration
>> @@ -0,0 +1,36 @@
>> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/data
>> +Date:		Aug 2023
>> +KernelVersion:  6.6
>> +Contact:	Longfang Liu <liulongfang@huawei.com>
>> +Description:	Read the live migration data of the vfio device.
>> +		These data include device status data, queue configuration
>> +		data and some task configuration data.
>> +		The output format of the data is defined by the live
>> +		migration driver.
>> +
>> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/attr
>> +Date:		Aug 2023
>> +KernelVersion:  6.6
>> +Contact:	Longfang Liu <liulongfang@huawei.com>
>> +Description:	Read the live migration attributes of the vfio device.
>> +		it include device status attributes and data length attributes
>> +		The output format of the attributes is defined by the live
>> +		migration driver.
>> +
>> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/io_test
>> +Date:		Aug 2023
>> +KernelVersion:  6.6
>> +Contact:	Longfang Liu <liulongfang@huawei.com>
>> +Description:	Trigger the HiSilicon accelerator device to perform
>> +		the io test through the read operation. If successful,
>> +		it returns the execution result of mailbox. If fails,
>> +		it returns error log result.
>> +
>> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/save
>> +Date:		Aug 2023
>> +KernelVersion:  6.6
>> +Contact:	Longfang Liu <liulongfang@huawei.com>
>> +Description:	Trigger the Hisilicon accelerator device to perform
>> +		the state saving operation of live migration through the read
>> +		operation, and output the operation log results.
> 
> I still very much do not like this use of debugfs.
> 
> If you want to test migration then make a test program and use the
> normal api
>
These debugfs files only expose internal state data.
The test function is no longer executed, and the store file that
provided it has been deleted.

Thanks,
Longfang.

> Creating some parallel backdoor to work the same API is just
> unneeded complexity.
> 
> Jason
> .
> 


* Re: [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
  2023-07-28  7:21 ` [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver liulongfang
@ 2023-08-07 21:43   ` Alex Williamson
  2023-08-14  9:34     ` liulongfang
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2023-08-07 21:43 UTC (permalink / raw)
  To: liulongfang
  Cc: jgg, shameerali.kolothum.thodi, jonathan.cameron, cohuck,
	linux-kernel, linuxarm

On Fri, 28 Jul 2023 15:21:03 +0800
liulongfang <liulongfang@huawei.com> wrote:

> From: Longfang Liu <liulongfang@huawei.com>
> 
> On the debugfs framework of VFIO, if the CONFIG_DEBUG_FS macro is
> enabled, the debug function is registered for the live migration driver
> of the HiSilicon accelerator device.
> 
> After registering the HiSilicon accelerator device on the debugfs
> framework of live migration of vfio, a directory file "hisi_acc"
> of debugfs is created, and then three debug function files are
> created in this directory:
> 
> data file: used to get the migration data from the driver
> attr file: used to get device attributes parameters from the driver
> save file: used to read the data of the live migration device and save
> it to the driver.
> io_test: used to test IO read and write for the driver.
> 
> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
> ---
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 178 ++++++++++++++++++
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h    |   3 +
>  2 files changed, 181 insertions(+)
> 
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index 242ad319932a..a811dc237a29 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -15,6 +15,7 @@
>  #include <linux/anon_inodes.h>
>  
>  #include "hisi_acc_vfio_pci.h"
> +#include "../../vfio.h"
>  
>  /* Return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
>  static int qm_wait_dev_not_ready(struct hisi_qm *qm)
> @@ -606,6 +607,18 @@ hisi_acc_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>  	}
>  }
>  
> +static void hisi_acc_vf_migf_save(struct hisi_acc_vf_migration_file *dst_migf,
> +	struct hisi_acc_vf_migration_file *src_migf)
> +{
> +	if (!dst_migf)
> +		return;
> +
> +	dst_migf->disabled = false;
> +	dst_migf->total_length = src_migf->total_length;
> +	memcpy(&dst_migf->vf_data, &src_migf->vf_data,
> +		    sizeof(struct acc_vf_data));
> +}
> +
>  static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
>  {
>  	mutex_lock(&migf->lock);
> @@ -618,12 +631,16 @@ static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
>  static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>  {
>  	if (hisi_acc_vdev->resuming_migf) {
> +		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
> +						hisi_acc_vdev->resuming_migf);
>  		hisi_acc_vf_disable_fd(hisi_acc_vdev->resuming_migf);
>  		fput(hisi_acc_vdev->resuming_migf->filp);
>  		hisi_acc_vdev->resuming_migf = NULL;
>  	}
>  
>  	if (hisi_acc_vdev->saving_migf) {
> +		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
> +						hisi_acc_vdev->saving_migf);
>  		hisi_acc_vf_disable_fd(hisi_acc_vdev->saving_migf);
>  		fput(hisi_acc_vdev->saving_migf->filp);
>  		hisi_acc_vdev->saving_migf = NULL;
> @@ -1303,6 +1320,162 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
>  	return vfio_pci_core_ioctl(core_vdev, cmd, arg);
>  }
>  
> +static int hisi_acc_vf_debug_check(struct seq_file *seq, struct vfio_device *vdev)
> +{
> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> +	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
> +
> +	if (!vdev->mig_ops || !migf) {
> +		seq_printf(seq, "%s\n", "device does not support live migration!");
> +		return -EINVAL;
> +	}
> +
> +	/* If device not opened, the debugfs operation will trigger calltrace */
> +	if (!vdev->open_count) {
> +		seq_printf(seq, "%s\n", "device not opened!");
> +		return -EINVAL;
> +	}

Following up on the previous reply:

https://lore.kernel.org/all/f01944a8-5668-8a3e-f384-fb9b0fc3b09f@huawei.com/

>> What prevents this from racing release of the device?
>>
> Now there are only read operations for debugfs. The open_count here only needs
> to be used to prevent read operations when the device is not opened.
> There is no need to deal with competition issues.

The explanation doesn't make sense to me.  If nothing guarantees that
open_count remains elevated for the code path alluded to in the
comment, then this test is useless.  If the calltrace can happen when
the device is not open, then it can also happen when the device is
closed immediately after this test is performed.
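
As an illustration only (a userspace model, not code from this patch), the
check-then-use window closes when the open_count test and the access it
guards share one critical section with open and close.  Every name below
(race_dev, race_open, race_close, race_debug_read) is hypothetical, and the
pthread mutex merely stands in for whatever lock the driver would use:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device: 'regs' is only valid while open. */
struct race_dev {
	pthread_mutex_t lock;		/* serializes open/close against readers */
	unsigned int open_count;
	const unsigned int *regs;	/* models the mapped io_base */
};

static const unsigned int race_regs[1] = { 0x5a };

static void race_open(struct race_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->open_count++ == 0)
		d->regs = race_regs;	/* first open maps the registers */
	pthread_mutex_unlock(&d->lock);
}

static void race_close(struct race_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->open_count > 0 && --d->open_count == 0)
		d->regs = NULL;		/* last close unmaps them */
	pthread_mutex_unlock(&d->lock);
}

/*
 * Debug read: the open_count test and the register access sit in one
 * critical section, so a concurrent close cannot land between them.
 * Returns true and stores the value on success, false if not open.
 */
static bool race_debug_read(struct race_dev *d, unsigned int *val)
{
	bool ok = false;

	pthread_mutex_lock(&d->lock);
	if (d->open_count) {
		*val = d->regs[0];
		ok = true;
	}
	pthread_mutex_unlock(&d->lock);
	return ok;
}
```

Testing open_count without such a lock, as the patch does, leaves the
register access unprotected no matter what the test returned.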

> +
> +	return 0;
> +}
> +
> +static int hisi_acc_vf_debug_io(struct seq_file *seq, void *data)
> +{
> +	struct device *vf_dev = seq->private;
> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> +	struct vfio_device *vdev = &core_device->vdev;
> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> +	struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
> +	u64 value;
> +	int ret;
> +
> +	ret = hisi_acc_vf_debug_check(seq, vdev);
> +	if (ret)
> +		return 0;
> +

For example, open_count can be zero here regardless of the test in
the previous function.

> +	ret = qm_wait_dev_not_ready(vf_qm);
> +	if (ret) {
> +		seq_printf(seq, "%s\n", "VF device not ready!");
> +		return 0;
> +	}
> +
> +	value = readl(vf_qm->io_base + QM_MB_CMD_SEND_BASE);
> +	seq_printf(seq, "%s:0x%llx\n", "debug mailbox val", value);
> +
> +	return 0;
> +}

I still don't understand why the debugfs file is called "io_test" for
reading the mailbox.

> +
> +static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
> +{
> +	struct device *vf_dev = seq->private;
> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> +	struct vfio_device *vdev = &core_device->vdev;
> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> +	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
> +	int ret;
> +
> +	ret = hisi_acc_vf_debug_check(seq, vdev);
> +	if (ret)
> +		return 0;

Nothing requires that open_count is still elevated here.

> +
> +	ret = vf_qm_state_save(hisi_acc_vdev, migf);
> +	if (ret) {
> +		seq_printf(seq, "%s\n", "failed to save device data!");
> +		return 0;
> +	}
> +	seq_printf(seq, "%s\n", "successful to save device data!");
> +
> +	return 0;
> +}
> +
> +static int hisi_acc_vf_data_read(struct seq_file *seq, void *data)
> +{
> +	struct device *vf_dev = seq->private;
> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> +	struct vfio_device *vdev = &core_device->vdev;
> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> +	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
> +	size_t vf_data_sz = offsetofend(struct acc_vf_data, padding);
> +
> +	if (debug_migf && debug_migf->total_length)
> +		seq_hex_dump(seq, "Mig Data:", DUMP_PREFIX_OFFSET, 16, 1,
> +				(unsigned char *)&debug_migf->vf_data,
> +				vf_data_sz, false);

The previous save function attempts to make sure the device is open,
but there's no attempt to drop the debug_migf data when the device is
closed, so we can read the saved data whether or not the device is
open, and whether or not it was opened in the same instance in which
the data was saved.  Is this intentional?

> +	else
> +		seq_printf(seq, "%s\n", "device not migrated!");
> +
> +	return 0;
> +}
> +
> +static int hisi_acc_vf_attr_read(struct seq_file *seq, void *data)
> +{
> +	struct device *vf_dev = seq->private;
> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> +	struct vfio_device *vdev = &core_device->vdev;
> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> +	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
> +
> +	if (debug_migf && debug_migf->total_length) {
> +		seq_printf(seq,
> +			 "acc device:\n"
> +			 "device  state: %d\n"
> +			 "device  ready: %u\n"
> +			 "data    valid: %d\n"
> +			 "data     size: %lu\n",
> +			 hisi_acc_vdev->mig_state,
> +			 hisi_acc_vdev->vf_qm_state,
> +			 debug_migf->disabled,

This is only ever false?

> +			 debug_migf->total_length);
> +	} else {
> +		seq_printf(seq, "%s\n", "device not migrated!");
> +	}
> +
> +	return 0;
> +}
> +
> +static int hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> +{
> +	struct vfio_device *vdev = &hisi_acc_vdev->core_device.vdev;
> +	struct dentry *vfio_dev_migration = NULL;
> +	struct dentry *vfio_hisi_acc = NULL;
> +	struct device *dev = vdev->dev;
> +	void *migf = NULL;
> +
> +	if (!debugfs_initialized())
> +		return 0;
> +
> +	migf = kzalloc(sizeof(struct hisi_acc_vf_migration_file), GFP_KERNEL);
> +	if (!migf)
> +		return -ENOMEM;
> +	hisi_acc_vdev->debug_migf = migf;
> +
> +	vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
> +	if (!vfio_dev_migration) {
> +		dev_err(dev, "failed to lookup migration debugfs file!\n");
> +		return -ENODEV;

The allocation of debug_migf is rather wasted if we get here.
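
A minimal userspace sketch of one way to avoid that, assuming the lookup
can simply be done before the allocation.  The names (init_dev,
init_debug, has_migration_dir) are hypothetical and only model the
ordering, they are not the driver's API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: has_migration_dir stands in for the debugfs lookup. */
struct init_dev {
	void *debug_buf;
	int has_migration_dir;
};

static int init_debug(struct init_dev *v, size_t sz)
{
	/* Do the check that can fail first, before anything is allocated. */
	if (!v->has_migration_dir)
		return -ENODEV;

	/* Only allocate once the remaining steps are expected to succeed. */
	v->debug_buf = calloc(1, sz);
	if (!v->debug_buf)
		return -ENOMEM;

	return 0;
}
```

The alternative is to keep the original order and free the buffer on the
error path; either way nothing stays allocated when init fails.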

> +	}
> +
> +	vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
> +	debugfs_create_devm_seqfile(dev, "data", vfio_hisi_acc,
> +				  hisi_acc_vf_data_read);
> +	debugfs_create_devm_seqfile(dev, "attr", vfio_hisi_acc,
> +				  hisi_acc_vf_attr_read);

Why do we want separate debugfs files for metadata vs data?  i.e. why
isn't the hex dump just another line of output along with the metadata?

> +	debugfs_create_devm_seqfile(dev, "io_test", vfio_hisi_acc,
> +				  hisi_acc_vf_debug_io);
> +	debugfs_create_devm_seqfile(dev, "save", vfio_hisi_acc,
> +				  hisi_acc_vf_debug_save);
> +
> +	return 0;
> +}
> +
> +static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> +{
> +	if (!debugfs_initialized())
> +		return;
> +
> +	kfree(hisi_acc_vdev->debug_migf);
> +}
> +
>  static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
>  {
>  	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
> @@ -1323,6 +1496,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
>  	}
>  
>  	vfio_pci_core_finish_enable(vdev);
> +
>  	return 0;
>  }
>  
> @@ -1422,6 +1596,9 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
>  	ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
>  	if (ret)
>  		goto out_put_vdev;
> +
> +	if (ops == &hisi_acc_vfio_pci_migrn_ops)
> +		hisi_acc_vfio_debug_init(hisi_acc_vdev);
>  	return 0;
>  
>  out_put_vdev:
> @@ -1433,6 +1610,7 @@ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
>  {
>  	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);
>  
> +	hisi_acc_vf_debugfs_exit(hisi_acc_vdev);

This frees debug_migf

>  	vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);

This triggers the recursive removal of the debugfs seqfiles.  There's a
use-after-free race here where we can dump the contents of the freed
buffer.  Thanks,

Alex
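
For illustration, a single-threaded userspace model of the teardown
ordering described above: unpublish the reader entry points first, free
the buffer second.  All names (teardown_dev, teardown_probe,
teardown_read, teardown_remove) are hypothetical, and the model does not
capture readers that are already in flight, which a real fix would also
have to account for:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct teardown_dev {
	bool files_registered;		/* models the debugfs files being reachable */
	unsigned char *debug_buf;	/* models debug_migf */
	size_t len;
};

static int teardown_probe(struct teardown_dev *d, size_t len)
{
	d->debug_buf = calloc(1, len);
	if (!d->debug_buf)
		return -1;
	d->len = len;
	d->files_registered = true;	/* publish only after the buffer exists */
	return 0;
}

/* Reader entry point: refuses to run once the files are unpublished. */
static int teardown_read(const struct teardown_dev *d, unsigned char *out,
			 size_t n)
{
	if (!d->files_registered)
		return -1;
	memcpy(out, d->debug_buf, n < d->len ? n : d->len);
	return 0;
}

static void teardown_remove(struct teardown_dev *d)
{
	/* 1. Unpublish first: no new reader can reach debug_buf after this. */
	d->files_registered = false;

	/* 2. Only then release the buffer the readers were using. */
	free(d->debug_buf);
	d->debug_buf = NULL;
	d->len = 0;
}
```

The patch does these two steps in the opposite order, which is what opens
the window.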

>  	vfio_put_device(&hisi_acc_vdev->core_device.vdev);
>  }
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> index dcabfeec6ca1..93f44bcf53ee 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> @@ -113,5 +113,8 @@ struct hisi_acc_vf_core_device {
>  	spinlock_t reset_lock;
>  	struct hisi_acc_vf_migration_file *resuming_migf;
>  	struct hisi_acc_vf_migration_file *saving_migf;
> +
> +	/* For debugfs */
> +	struct hisi_acc_vf_migration_file *debug_migf;
>  };
>  #endif /* HISI_ACC_VFIO_PCI_H */



* Re: [PATCH v12 4/4] Documentation: add debugfs description for vfio
  2023-08-07  1:33     ` liulongfang
@ 2023-08-07 22:03       ` Alex Williamson
  2023-08-10  6:10         ` liulongfang
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2023-08-07 22:03 UTC (permalink / raw)
  To: liulongfang
  Cc: Jason Gunthorpe, shameerali.kolothum.thodi, jonathan.cameron,
	cohuck, linux-kernel, linuxarm

On Mon, 7 Aug 2023 09:33:07 +0800
liulongfang <liulongfang@huawei.com> wrote:

> On 2023/8/4 22:58, Jason Gunthorpe wrote:
> > On Fri, Jul 28, 2023 at 03:21:04PM +0800, liulongfang wrote:  
> >> From: Longfang Liu <liulongfang@huawei.com>
> >>
> >> 1.Add two debugfs document description file to help users understand
> >> how to use the accelerator live migration driver's debugfs.
> >> 2.Update the file paths that need to be maintained in MAINTAINERS
> >>
> >> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
> >> ---
> >>  .../ABI/testing/debugfs-hisi-migration        | 36 +++++++++++++++++++
> >>  Documentation/ABI/testing/debugfs-vfio        | 25 +++++++++++++
> >>  MAINTAINERS                                   |  2 ++
> >>  3 files changed, 63 insertions(+)
> >>  create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
> >>  create mode 100644 Documentation/ABI/testing/debugfs-vfio
> >>
> >> diff --git a/Documentation/ABI/testing/debugfs-hisi-migration b/Documentation/ABI/testing/debugfs-hisi-migration
> >> new file mode 100644
> >> index 000000000000..791dd8a09575
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/debugfs-hisi-migration
> >> @@ -0,0 +1,36 @@
> >> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/data
> >> +Date:		Aug 2023
> >> +KernelVersion:  6.6
> >> +Contact:	Longfang Liu <liulongfang@huawei.com>
> >> +Description:	Read the live migration data of the vfio device.
> >> +		These data include device status data, queue configuration
> >> +		data and some task configuration data.
> >> +		The output format of the data is defined by the live
> >> +		migration driver.
> >> +
> >> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/attr
> >> +Date:		Aug 2023
> >> +KernelVersion:  6.6
> >> +Contact:	Longfang Liu <liulongfang@huawei.com>
> >> +Description:	Read the live migration attributes of the vfio device.
> >> +		it include device status attributes and data length attributes
> >> +		The output format of the attributes is defined by the live
> >> +		migration driver.
> >> +
> >> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/io_test
> >> +Date:		Aug 2023
> >> +KernelVersion:  6.6
> >> +Contact:	Longfang Liu <liulongfang@huawei.com>
> >> +Description:	Trigger the HiSilicon accelerator device to perform
> >> +		the io test through the read operation. If successful,
> >> +		it returns the execution result of mailbox. If fails,
> >> +		it returns error log result.
> >> +
> >> +What:		/sys/kernel/debug/vfio/<device>/migration/hisi_acc/save
> >> +Date:		Aug 2023
> >> +KernelVersion:  6.6
> >> +Contact:	Longfang Liu <liulongfang@huawei.com>
> >> +Description:	Trigger the Hisilicon accelerator device to perform
> >> +		the state saving operation of live migration through the read
> >> +		operation, and output the operation log results.  
> > 
> > I still very much do not like this use of debugfs.
> > 
> > If you want to test migration then make a test program and use the
> > normal api
> >  
> These debugfs are just to get internal state data.
> The test function is no longer executed.
> The store file with test function has been deleted.

The vfio/<device>/migration/state file can provide useful monitoring of
the device progress during a migration, but I think the point Jason is
trying to make is that these hisi_acc seqfiles aren't really doing
anything that couldn't be done by a simple userspace test driver.
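
As a rough sketch of that monitoring use, a userspace helper could read
the state file from the path documented in debugfs-vfio.  The helper name
read_state_line is hypothetical, and the assumption that the file holds
one state name on its first line is mine, not something the patch
specifies:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of 'path' into buf, NUL-terminated and with the
 * trailing newline stripped.  Returns 0 on success, -1 on failure.
 */
static int read_state_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass /sys/kernel/debug/vfio/<device>/migration/state with
<device> filled in, and needs permission to read debugfs.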

Based on my review of the previous patch, we're playing pretty loose
with concurrency and data buffers.  Access to the migration data of
the device outside of the process that owns the device is also a
concern.

The value-add here needs to be that there's something useful about the
kernel being able to dump this data rather than either a simple
userspace program or instrumenting a userspace driver like QEMU, where
we can avoid the complexity that's going to be required to resolve the
issues in the previous patch and ensure that sensitive data from the
device isn't available through debugfs.  Thanks,

Alex



* Re: [PATCH v12 4/4] Documentation: add debugfs description for vfio
  2023-08-07 22:03       ` Alex Williamson
@ 2023-08-10  6:10         ` liulongfang
  0 siblings, 0 replies; 11+ messages in thread
From: liulongfang @ 2023-08-10  6:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, shameerali.kolothum.thodi, jonathan.cameron,
	cohuck, linux-kernel, linuxarm

On 2023/8/8 6:03, Alex Williamson wrote:
> The vfio/<device>/migration/state file can provide useful monitoring of
> the device progress during a migration, but I think the point Jason is
> trying to make is that these hisi_acc seqfiles aren't really doing
> anything that couldn't be done by a simple userspace test driver.
>
The state file was originally added to report the migration state.
When a migration fails, it is used to locate the problem.
We have no other functional requirements for it.

> Based on my review of the previous patch, we're playing pretty loose
> with concurrency and data buffers.  Access to the migration data of
> the device outside of the process that owns the device is also a
> concern.
> 
> The value-add here needs to be that there's something useful about the
> kernel being able to dump this data rather than either a simple
> userspace program or instrumenting a userspace driver like QEMU, where
> we can avoid the complexity that's going to be required to resolve the
> issues in the previous patch and ensure that sensitive data from the
> device isn't available through debugfs.
On the question of whether the migrated data is sensitive: it is up to
the device driver to choose which data can be output.
Currently, the data that this HiSilicon device driver outputs through
debugfs does not include sensitive data.

Thanks,
Longfang.



* Re: [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
  2023-08-07 21:43   ` Alex Williamson
@ 2023-08-14  9:34     ` liulongfang
  0 siblings, 0 replies; 11+ messages in thread
From: liulongfang @ 2023-08-14  9:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, shameerali.kolothum.thodi, jonathan.cameron, cohuck,
	linux-kernel, linuxarm

On 2023/8/8 5:43, Alex Williamson wrote:
> On Fri, 28 Jul 2023 15:21:03 +0800
> liulongfang <liulongfang@huawei.com> wrote:
> 
>> From: Longfang Liu <liulongfang@huawei.com>
>>
>> On the debugfs framework of VFIO, if the CONFIG_DEBUG_FS macro is
>> enabled, the debug function is registered for the live migration driver
>> of the HiSilicon accelerator device.
>>
>> After registering the HiSilicon accelerator device on the debugfs
>> framework of live migration of vfio, a directory file "hisi_acc"
>> of debugfs is created, and then three debug function files are
>> created in this directory:
>>
>> data file: used to get the migration data from the driver
>> attr file: used to get device attributes parameters from the driver
>> save file: used to read the data of the live migration device and save
>> it to the driver.
>> io_test: used to test IO read and write for the driver.
>>
>> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
>> ---
>>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 178 ++++++++++++++++++
>>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h    |   3 +
>>  2 files changed, 181 insertions(+)
>>
>> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>> index 242ad319932a..a811dc237a29 100644
>> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/anon_inodes.h>
>>  
>>  #include "hisi_acc_vfio_pci.h"
>> +#include "../../vfio.h"
>>  
>>  /* Return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
>>  static int qm_wait_dev_not_ready(struct hisi_qm *qm)
>> @@ -606,6 +607,18 @@ hisi_acc_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>>  	}
>>  }
>>  
>> +static void hisi_acc_vf_migf_save(struct hisi_acc_vf_migration_file *dst_migf,
>> +	struct hisi_acc_vf_migration_file *src_migf)
>> +{
>> +	if (!dst_migf)
>> +		return;
>> +
>> +	dst_migf->disabled = false;
>> +	dst_migf->total_length = src_migf->total_length;
>> +	memcpy(&dst_migf->vf_data, &src_migf->vf_data,
>> +		    sizeof(struct acc_vf_data));
>> +}
>> +
>>  static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
>>  {
>>  	mutex_lock(&migf->lock);
>> @@ -618,12 +631,16 @@ static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
>>  static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>>  {
>>  	if (hisi_acc_vdev->resuming_migf) {
>> +		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
>> +						hisi_acc_vdev->resuming_migf);
>>  		hisi_acc_vf_disable_fd(hisi_acc_vdev->resuming_migf);
>>  		fput(hisi_acc_vdev->resuming_migf->filp);
>>  		hisi_acc_vdev->resuming_migf = NULL;
>>  	}
>>  
>>  	if (hisi_acc_vdev->saving_migf) {
>> +		hisi_acc_vf_migf_save(hisi_acc_vdev->debug_migf,
>> +						hisi_acc_vdev->saving_migf);
>>  		hisi_acc_vf_disable_fd(hisi_acc_vdev->saving_migf);
>>  		fput(hisi_acc_vdev->saving_migf->filp);
>>  		hisi_acc_vdev->saving_migf = NULL;
>> @@ -1303,6 +1320,162 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
>>  	return vfio_pci_core_ioctl(core_vdev, cmd, arg);
>>  }
>>  
>> +static int hisi_acc_vf_debug_check(struct seq_file *seq, struct vfio_device *vdev)
>> +{
>> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> +	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>> +
>> +	if (!vdev->mig_ops || !migf) {
>> +		seq_printf(seq, "%s\n", "device does not support live migration!");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* If device not opened, the debugfs operation will trigger calltrace */
>> +	if (!vdev->open_count) {
>> +		seq_printf(seq, "%s\n", "device not opened!");
>> +		return -EINVAL;
>> +	}
> 
> Following up on the previous reply:
> 
> https://lore.kernel.org/all/f01944a8-5668-8a3e-f384-fb9b0fc3b09f@huawei.com/
> 
>>> What prevents this from racing release of the device?
>>>
>> Now there are only read operations for debugfs. The open_count here only needs
>> to be used to prevent read operations when the device is not opened.
>> There is no need to deal with competition issues.
> 
> The explanation doesn't make sense to me, if we're not protecting that
> open_count remains elevated for the code path alluded to in the
> comment, then this test is useless.  If the calltrace can happen when
> the device is not open then it can happen when the device is closed
> immediately after this test is performed.
>

Yes, a solution is needed here to ensure that no debugfs operation is
performed after the device is closed.

Whether the device can be operated on ultimately depends on whether its
io_base has been mapped.
So my solution is to take the mutex in the vfio_device_set of the
vfio_device, which ensures that this problem cannot occur.

Thanks
Longfang.
>> +
>> +	return 0;
>> +}
>> +
>> +static int hisi_acc_vf_debug_io(struct seq_file *seq, void *data)
>> +{
>> +	struct device *vf_dev = seq->private;
>> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> +	struct vfio_device *vdev = &core_device->vdev;
>> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> +	struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
>> +	u64 value;
>> +	int ret;
>> +
>> +	ret = hisi_acc_vf_debug_check(seq, vdev);
>> +	if (ret)
>> +		return 0;
>> +
> 
> For example, open_count can be zero here regardless of the test in
> the previous function.
>
OK

>> +	ret = qm_wait_dev_not_ready(vf_qm);
>> +	if (ret) {
>> +		seq_printf(seq, "%s\n", "VF device not ready!");
>> +		return 0;
>> +	}
>> +
>> +	value = readl(vf_qm->io_base + QM_MB_CMD_SEND_BASE);
>> +	seq_printf(seq, "%s:0x%llx\n", "debug mailbox val", value);
>> +
>> +	return 0;
>> +}
> 
> I still don't understand why the debugfs file is called "io_test" for
> reading the mailbox.
>

Yes, it can be changed to io_state here.

>> +
>> +static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
>> +{
>> +	struct device *vf_dev = seq->private;
>> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> +	struct vfio_device *vdev = &core_device->vdev;
>> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> +	struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>> +	int ret;
>> +
>> +	ret = hisi_acc_vf_debug_check(seq, vdev);
>> +	if (ret)
>> +		return 0;
> 
> Nothing requires that open_count is still elevated here.
>

OK

>> +
>> +	ret = vf_qm_state_save(hisi_acc_vdev, migf);
>> +	if (ret) {
>> +		seq_printf(seq, "%s\n", "failed to save device data!");
>> +		return 0;
>> +	}
>> +	seq_printf(seq, "%s\n", "successful to save device data!");
>> +
>> +	return 0;
>> +}
>> +
>> +static int hisi_acc_vf_data_read(struct seq_file *seq, void *data)
>> +{
>> +	struct device *vf_dev = seq->private;
>> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> +	struct vfio_device *vdev = &core_device->vdev;
>> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> +	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
>> +	size_t vf_data_sz = offsetofend(struct acc_vf_data, padding);
>> +
>> +	if (debug_migf && debug_migf->total_length)
>> +		seq_hex_dump(seq, "Mig Data:", DUMP_PREFIX_OFFSET, 16, 1,
>> +				(unsigned char *)&debug_migf->vf_data,
>> +				vf_data_sz, false);
> 
> The previous save function attempts to make sure the device is open,
> but there's no attempt to drop the debug_migf data when the device is
> closed, so we can read the saved data regardless of whether the device
> is open, or was opened within the same instance where the data was
> saved.  Is this intentional?
>

Understood. The save operation needs to hold a lock to ensure that the
device is not closed while it reads the device state.
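
A minimal userspace sketch of that locking idea, not driver code. All
demo_* names are hypothetical, and a pthread mutex stands in for
whichever kernel lock protects open_count. Dropping the saved copy on
the last close is only one possible answer to the question above; it
has not been decided in this thread.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the device; not the driver's real structure. */
struct demo_dev {
	pthread_mutex_t lock;		/* serializes open/close against debug save */
	unsigned int open_count;	/* number of current users */
	char state[16];			/* pretend device state */
	char saved[16];			/* debug copy of the state */
	bool saved_valid;		/* true only while the copy may be read */
};

#define DEMO_DEV_INIT { .lock = PTHREAD_MUTEX_INITIALIZER }

static void demo_open(struct demo_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->open_count++;
	pthread_mutex_unlock(&d->lock);
}

static void demo_close(struct demo_dev *d)
{
	pthread_mutex_lock(&d->lock);
	/* Assumption: invalidate the debug copy when the last user goes away. */
	if (d->open_count && --d->open_count == 0)
		d->saved_valid = false;
	pthread_mutex_unlock(&d->lock);
}

/* Returns 0 on success, -1 if the device is not open. */
static int demo_debug_save(struct demo_dev *d)
{
	int ret = -1;

	pthread_mutex_lock(&d->lock);
	/* The check and the copy happen under the same lock as close(). */
	if (d->open_count) {
		memcpy(d->saved, d->state, sizeof(d->saved));
		d->saved_valid = true;
		ret = 0;
	}
	pthread_mutex_unlock(&d->lock);

	return ret;
}
```

Because demo_close() takes the same lock, open_count cannot drop to
zero between the check and the copy in demo_debug_save().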

>> +	else
>> +		seq_printf(seq, "%s\n", "device not migrated!");
>> +
>> +	return 0;
>> +}
>> +
>> +static int hisi_acc_vf_attr_read(struct seq_file *seq, void *data)
>> +{
>> +	struct device *vf_dev = seq->private;
>> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> +	struct vfio_device *vdev = &core_device->vdev;
>> +	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> +	struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
>> +
>> +	if (debug_migf && debug_migf->total_length) {
>> +		seq_printf(seq,
>> +			 "acc device:\n"
>> +			 "device  state: %d\n"
>> +			 "device  ready: %u\n"
>> +			 "data    valid: %d\n"
>> +			 "data     size: %lu\n",
>> +			 hisi_acc_vdev->mig_state,
>> +			 hisi_acc_vdev->vf_qm_state,
>> +			 debug_migf->disabled,
> 
> This is only ever false?
>
It should be false when there is no error.

>> +			 debug_migf->total_length);
>> +	} else {
>> +		seq_printf(seq, "%s\n", "device not migrated!");
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>> +{
>> +	struct vfio_device *vdev = &hisi_acc_vdev->core_device.vdev;
>> +	struct dentry *vfio_dev_migration = NULL;
>> +	struct dentry *vfio_hisi_acc = NULL;
>> +	struct device *dev = vdev->dev;
>> +	void *migf = NULL;
>> +
>> +	if (!debugfs_initialized())
>> +		return 0;
>> +
>> +	migf = kzalloc(sizeof(struct hisi_acc_vf_migration_file), GFP_KERNEL);
>> +	if (!migf)
>> +		return -ENOMEM;
>> +	hisi_acc_vdev->debug_migf = migf;
>> +
>> +	vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
>> +	if (!vfio_dev_migration) {
>> +		dev_err(dev, "failed to lookup migration debugfs file!\n");
>> +		return -ENODEV;
> 
> The allocation of debug_migf is rather wasted if we get here.
>

Yes, it should be freed on this error path.
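
A minimal userspace sketch of the error path, not driver code. The
demo_* names are hypothetical; demo_lookup() only stands in for the
debugfs lookup and calloc()/free() for kzalloc()/kfree().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the driver's real structures. */
struct demo_migf {
	size_t total_length;
};

struct demo_vdev {
	struct demo_migf *debug_migf;
};

/* Models the directory lookup: returns NULL when the directory is missing. */
static const char *demo_lookup(const char *name, int present)
{
	return present ? name : NULL;
}

static int demo_debug_init(struct demo_vdev *v, int migration_dir_present)
{
	struct demo_migf *migf;

	migf = calloc(1, sizeof(*migf));
	if (!migf)
		return -ENOMEM;

	if (!demo_lookup("migration", migration_dir_present)) {
		/* Do not leak the buffer when the lookup fails. */
		free(migf);
		return -ENODEV;
	}

	/* Publish the pointer only once nothing else can fail. */
	v->debug_migf = migf;

	return 0;
}
```

An alternative with the same effect would be to do the lookup before
the allocation, so that there is nothing to free on failure.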


>> +	}
>> +
>> +	vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
>> +	debugfs_create_devm_seqfile(dev, "data", vfio_hisi_acc,
>> +				  hisi_acc_vf_data_read);
>> +	debugfs_create_devm_seqfile(dev, "attr", vfio_hisi_acc,
>> +				  hisi_acc_vf_attr_read);
> 
> Why do we want separate debugfs files for meta data vs data?  ie. why
> isn't the hex dump just another line of output along with the meta data?
>

The "data" file holds the raw migration data.
The "attr" file holds the attributes that describe the migration data,
for example, the total length and the migration length.

>> +	debugfs_create_devm_seqfile(dev, "io_test", vfio_hisi_acc,
>> +				  hisi_acc_vf_debug_io);
>> +	debugfs_create_devm_seqfile(dev, "save", vfio_hisi_acc,
>> +				  hisi_acc_vf_debug_save);
>> +
>> +	return 0;
>> +}
>> +
>> +static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>> +{
>> +	if (!debugfs_initialized())
>> +		return;
>> +
>> +	kfree(hisi_acc_vdev->debug_migf);
>> +}
>> +
>>  static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
>>  {
>>  	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
>> @@ -1323,6 +1496,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
>>  	}
>>  
>>  	vfio_pci_core_finish_enable(vdev);
>> +
>>  	return 0;
>>  }
>>  
>> @@ -1422,6 +1596,9 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
>>  	ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
>>  	if (ret)
>>  		goto out_put_vdev;
>> +
>> +	if (ops == &hisi_acc_vfio_pci_migrn_ops)
>> +		hisi_acc_vfio_debug_init(hisi_acc_vdev);
>>  	return 0;
>>  
>>  out_put_vdev:
>> @@ -1433,6 +1610,7 @@ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
>>  {
>>  	struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);
>>  
>> +	hisi_acc_vf_debugfs_exit(hisi_acc_vdev);
> 
> This frees debug_migf
> 
>>  	vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
> 
> This triggers the recursive removal of the debugfs seqfiles.  There's a
> use-after-free race here where we can dump the contents of the freed
> buffer.  Thanks,
> 

Yes. This problem can be avoided by removing the debugfs files first
and only then freeing the memory of debug_migf.
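
A minimal userspace sketch of that teardown order, not driver code.
The demo_* names are hypothetical, and the files_registered flag only
models whether the debugfs files are still reachable. It shows the
ordering and does not model concurrent readers.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in; not the driver's real structure. */
struct demo_vdev {
	bool files_registered;	/* models the debugfs files being reachable */
	int *debug_migf;	/* models the debug buffer */
};

static int demo_probe(struct demo_vdev *v)
{
	v->debug_migf = calloc(1, sizeof(*v->debug_migf));
	if (!v->debug_migf)
		return -1;

	*v->debug_migf = 42;		/* pretend saved data */
	v->files_registered = true;	/* files exist only after the buffer does */

	return 0;
}

/* Models a debugfs read: touches the buffer only while the files exist. */
static int demo_read(const struct demo_vdev *v, int *out)
{
	if (!v->files_registered)
		return -1;

	*out = *v->debug_migf;

	return 0;
}

static void demo_remove(struct demo_vdev *v)
{
	/* Step 1: take the files away so that no reader can start. */
	v->files_registered = false;

	/* Step 2: only now is it safe to free the buffer. */
	free(v->debug_migf);
	v->debug_migf = NULL;
}
```

With the two steps swapped, demo_read() could still be entered after
the buffer had been freed, which is the race described above.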

> Alex
>

Thanks.
Longfang.

>>  	vfio_put_device(&hisi_acc_vdev->core_device.vdev);
>>  }
>> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>> index dcabfeec6ca1..93f44bcf53ee 100644
>> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>> @@ -113,5 +113,8 @@ struct hisi_acc_vf_core_device {
>>  	spinlock_t reset_lock;
>>  	struct hisi_acc_vf_migration_file *resuming_migf;
>>  	struct hisi_acc_vf_migration_file *saving_migf;
>> +
>> +	/* For debugfs */
>> +	struct hisi_acc_vf_migration_file *debug_migf;
>>  };
>>  #endif /* HISI_ACC_VFIO_PCI_H */
> 
> .
> 

end of thread, other threads:[~2023-08-14  9:35 UTC | newest]

Thread overview: 11+ messages
2023-07-28  7:21 [PATCH v12 0/4] add debugfs to migration driver liulongfang
2023-07-28  7:21 ` [PATCH v12 1/4] vfio/migration: Add debugfs to live " liulongfang
2023-07-28  7:21 ` [PATCH v12 2/4] hisi_acc_vfio_pci: extract public functions for container_of liulongfang
2023-07-28  7:21 ` [PATCH v12 3/4] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver liulongfang
2023-08-07 21:43   ` Alex Williamson
2023-08-14  9:34     ` liulongfang
2023-07-28  7:21 ` [PATCH v12 4/4] Documentation: add debugfs description for vfio liulongfang
2023-08-04 14:58   ` Jason Gunthorpe
2023-08-07  1:33     ` liulongfang
2023-08-07 22:03       ` Alex Williamson
2023-08-10  6:10         ` liulongfang
