linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] mlx5 ConnectX diagnostic misc driver
@ 2023-10-18  8:19 Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 1/5] mlx5: Add aux dev for ctl interface Saeed Mahameed
                   ` (5 more replies)
  0 siblings, 6 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hello Greg and Arnd,

The ConnectX HW family supported by the mlx5 drivers uses an architecture
where a FW component executes "mailbox RPCs" issued by the driver to make
changes to the device. This results in a complex debugging environment
where the FW component has information and complex low level state that
needs to be accessed to userspace for debugging purposes.

Historically a userspace program was used that accessed the PCI register
and config space directly through /sys/bus/pci/.../XXX and could operate
these debugging interfaces in parallel with the running driver.
This approach is incompatible with secure boot and kernel lockdown so this
driver provides a secure and restricted interface to that.

To solve this we add a misc driver "mlx5ctl" that would interface with
mlx5_core ConnectX driver to access the underlaying device debug
information. 

1) The first patch in the series introduces the main driver file with the
implementation of a new mlx5 auxiliary device driver to run on top
mlx5_core device instances, on probe it creates a new misc device and in
this patch we implement the open and release fops, On open the driver
would allocate a special FW UID (user context ID) restricted to debug
RPCs only, where all user debug RPCs will be executed under this UID,
and on release the UID will be freed.

2) The second patch adds an info ioctl that will show the allocated UID
and the available capability masks of the device and the current UID, and
some other useful device information such as the underlying ConnectX
device

Example:
    $ mlx5ctl mlx5_core.ctl.0
    mlx5dev: 0000:00:04.0
    UCTX UID: 1
    UCTX CAP: 0x3
    DEV UCTX CAP: 0x3
    USER CAP: 0x1d

3) Third patch adds RPC ioctl to execute debug RPCs under the
special UID.

In the mlx5 architecture the FW RPC commands are of the format of
inbox and outbox buffers. The inbox buffer contains the command
rpc layout as described in the ConnectX Programmers Reference Manual
(PRM) document and as defined in linux/include/mlx5/mlx5_ifc.h.

On success the user outbox buffer will be filled with the device's rpc
response.

For example to query device capabilities:
a user fills out an inbox buffer with the inbox layout:
    struct mlx5_ifc_query_hca_cap_in_bits
and expects an outbox buffer with the layout:
     struct mlx5_ifc_cmd_hca_cap_bits

4) The fourth patch adds the ability to register user memory into the
ConntectX device and create a umem object that points to that memory.

Command rpc outbox buffer is limited in size, which can be very
annoying when trying to pull large traces out of the device.
Many rpcs offer the ability to scatter output traces, contexts
and logs directly into user space buffers in a single shot.

The registered/pinned memory will be described by a device UMEM object
which has a unique umem_id, this umem_id can be later used in the rpc
inbox to tell the device where to populate the response output,
e.g HW traces and other debug object queries.

Example usecase, a ConnectX device coredump can be as large as 2MB.
Using inline rpcs will take thousands of rpcs to get the full
coredump which can consume multiple seconds.

With UMEM, it can be done in a single rpc, using 2MB of umem user buffer.

Other usecases with umem:
  - dynamic HW and FW trace monitoring
  - high frequency diagnostic counters sampling
  - batched objects and resource dumps

See links below for information about user-space tools that use this
interface:

[1] https://github.com/saeedtx/mlx5ctl

[2] https://github.com/Mellanox/mstflint
see:
    d) mstregdump utility
      This utility dumps hardware registers from Mellanox hardware
      for later analysis by Mellanox.

    g) mstconfig
      This tool sets or queries non-volatile configurable options

    i) mstreg
      The mlxreg utility allows users to obtain information
      regarding supported access registers, such as their fields

Saeed Mahameed (5):
  mlx5: Add aux dev for ctl interface
  misc: mlx5ctl: Add mlx5ctl misc driver
  misc: mlx5ctl: Add info ioctl
  misc: mlx5ctl: Add command rpc ioctl
  misc: mlx5ctl: Add umem reg/unreg ioctl

 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 drivers/misc/Kconfig                          |   1 +
 drivers/misc/Makefile                         |   1 +
 drivers/misc/mlx5ctl/Kconfig                  |  14 +
 drivers/misc/mlx5ctl/Makefile                 |   5 +
 drivers/misc/mlx5ctl/main.c                   | 528 ++++++++++++++++++
 drivers/misc/mlx5ctl/umem.c                   | 325 +++++++++++
 drivers/misc/mlx5ctl/umem.h                   |  17 +
 drivers/net/ethernet/mellanox/mlx5/core/dev.c |   8 +
 include/uapi/misc/mlx5ctl.h                   |  51 ++
 10 files changed, 951 insertions(+)
 create mode 100644 drivers/misc/mlx5ctl/Kconfig
 create mode 100644 drivers/misc/mlx5ctl/Makefile
 create mode 100644 drivers/misc/mlx5ctl/main.c
 create mode 100644 drivers/misc/mlx5ctl/umem.c
 create mode 100644 drivers/misc/mlx5ctl/umem.h
 create mode 100644 include/uapi/misc/mlx5ctl.h

-- 
2.41.0


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/5] mlx5: Add aux dev for ctl interface
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
@ 2023-10-18  8:19 ` Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Saeed Mahameed
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Allow ctl protocol interface auxiliary driver in mlx5.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index 7909f378dc93..f0e91793f4ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -215,8 +215,14 @@ enum {
 	MLX5_INTERFACE_PROTOCOL_MPIB,
 
 	MLX5_INTERFACE_PROTOCOL_VNET,
+	MLX5_INTERFACE_PROTOCOL_CTL,
 };
 
+static bool is_ctl_supported(struct mlx5_core_dev *dev)
+{
+	return MLX5_CAP_GEN(dev, uctx_cap);
+}
+
 static const struct mlx5_adev_device {
 	const char *suffix;
 	bool (*is_supported)(struct mlx5_core_dev *dev);
@@ -237,6 +243,8 @@ static const struct mlx5_adev_device {
 					   .is_supported = &is_ib_rep_supported },
 	[MLX5_INTERFACE_PROTOCOL_MPIB] = { .suffix = "multiport",
 					   .is_supported = &is_mp_supported },
+	[MLX5_INTERFACE_PROTOCOL_CTL] = { .suffix = "ctl",
+					  .is_supported = &is_ctl_supported },
 };
 
 int mlx5_adev_idx_alloc(void)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 1/5] mlx5: Add aux dev for ctl interface Saeed Mahameed
@ 2023-10-18  8:19 ` Saeed Mahameed
  2023-10-18  8:30   ` Greg Kroah-Hartman
  2023-10-18  8:30   ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Greg Kroah-Hartman
  2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

The ConnectX HW family supported by the mlx5 drivers uses an architecture
where a FW component executes "mailbox RPCs" issued by the driver to make
changes to the device. This results in a complex debugging environment
where the FW component has information and low level configuration that
needs to be accessed to userspace for debugging purposes.

Historically a userspace program was used that accessed the PCI register
and config space directly through /sys/bus/pci/.../XXX and could operate
these debugging interfaces in parallel with the running driver.
This approach is incompatible with secure boot and kernel lockdown so this
driver provides a secure and restricted interface to that same data.

On open the driver would allocate a special FW UID (user context ID)
restrected to debug RPCs only, later in this series all user RPCs will
be stamped with this UID.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/misc/Kconfig          |   1 +
 drivers/misc/Makefile         |   1 +
 drivers/misc/mlx5ctl/Kconfig  |  14 ++
 drivers/misc/mlx5ctl/Makefile |   4 +
 drivers/misc/mlx5ctl/main.c   | 314 ++++++++++++++++++++++++++++++++++
 5 files changed, 334 insertions(+)
 create mode 100644 drivers/misc/mlx5ctl/Kconfig
 create mode 100644 drivers/misc/mlx5ctl/Makefile
 create mode 100644 drivers/misc/mlx5ctl/main.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index cadd4a820c03..b46bd8edc348 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -579,4 +579,5 @@ source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/uacce/Kconfig"
 source "drivers/misc/pvpanic/Kconfig"
 source "drivers/misc/mchp_pci1xxxx/Kconfig"
+source "drivers/misc/mlx5ctl/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index f2a4d1ff65d4..49bc4697f498 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -67,3 +67,4 @@ obj-$(CONFIG_TMR_MANAGER)      += xilinx_tmr_manager.o
 obj-$(CONFIG_TMR_INJECT)	+= xilinx_tmr_inject.o
 obj-$(CONFIG_TPS6594_ESM)	+= tps6594-esm.o
 obj-$(CONFIG_TPS6594_PFSM)	+= tps6594-pfsm.o
+obj-$(CONFIG_MLX5CTL)		+= mlx5ctl/
diff --git a/drivers/misc/mlx5ctl/Kconfig b/drivers/misc/mlx5ctl/Kconfig
new file mode 100644
index 000000000000..faaa1dba2cc2
--- /dev/null
+++ b/drivers/misc/mlx5ctl/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+
+config MLX5CTL
+	tristate "mlx5 ConnectX control misc driver"
+	depends on MLX5_CORE
+	help
+	  MLX5CTL provides interface for the user process to access the debug and
+          configuration registers of the ConnectX hardware family
+          (NICs, PCI switches and SmartNIC SoCs).
+          This will allow configuration and debug tools to work out of the box on
+          mainstream kernel.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/misc/mlx5ctl/Makefile b/drivers/misc/mlx5ctl/Makefile
new file mode 100644
index 000000000000..b5c7f99e0ab6
--- /dev/null
+++ b/drivers/misc/mlx5ctl/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_MLX5CTL) += mlx5ctl.o
+mlx5ctl-y := main.o
diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
new file mode 100644
index 000000000000..de8d6129432c
--- /dev/null
+++ b/drivers/misc/mlx5ctl/main.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include <linux/miscdevice.h>
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/mlx5/device.h>
+#include <linux/mlx5/driver.h>
+#include <linux/atomic.h>
+#include <linux/refcount.h>
+
+MODULE_DESCRIPTION("mlx5 ConnectX control misc driver");
+MODULE_AUTHOR("Saeed Mahameed <saeedm@nvidia.com>");
+MODULE_LICENSE("Dual BSD/GPL");
+
+struct mlx5ctl_dev {
+	struct mlx5_core_dev *mdev;
+	struct miscdevice miscdev;
+	struct auxiliary_device *adev;
+	struct list_head fd_list;
+	spinlock_t fd_list_lock; /* protect list add/del */
+	struct rw_semaphore rw_lock;
+	struct kref refcount;
+};
+
+struct mlx5ctl_fd {
+	u16 uctx_uid;
+	u32 uctx_cap;
+	u32 ucap; /* user cap */
+	struct mlx5ctl_dev *mcdev;
+	struct list_head list;
+};
+
+#define mlx5ctl_err(mcdev, format, ...) \
+	dev_err(mcdev->miscdev.parent, format, ##__VA_ARGS__)
+
+#define mlx5ctl_dbg(mcdev, format, ...) \
+	dev_dbg(mcdev->miscdev.parent, "PID %d: " format, \
+		current->pid, ##__VA_ARGS__)
+
+enum {
+	MLX5_UCTX_OBJECT_CAP_RAW_TX                     = 0x1,
+	MLX5_UCTX_OBJECT_CAP_INTERNAL_DEVICE_RESOURCES  = 0x2,
+	MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES            = 0x4,
+};
+
+static int mlx5ctl_alloc_uid(struct mlx5ctl_dev *mcdev, u32 cap)
+{
+	u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(create_uctx_in)] = {};
+	void *uctx;
+	int err;
+	u16 uid;
+
+	uctx = MLX5_ADDR_OF(create_uctx_in, in, uctx);
+
+	mlx5ctl_dbg(mcdev, "MLX5_CMD_OP_CREATE_UCTX: caps 0x%x\n", cap);
+	MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX);
+	MLX5_SET(uctx, uctx, cap, cap);
+
+	err = mlx5_cmd_exec(mcdev->mdev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	uid = MLX5_GET(create_uctx_out, out, uid);
+	mlx5ctl_dbg(mcdev, "allocated uid %d with caps 0x%x\n", uid, cap);
+	return uid;
+}
+
+static void mlx5ctl_release_uid(struct mlx5ctl_dev *mcdev, u16 uid)
+{
+	u32 in[MLX5_ST_SZ_DW(destroy_uctx_in)] = {};
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	int err;
+
+	MLX5_SET(destroy_uctx_in, in, opcode, MLX5_CMD_OP_DESTROY_UCTX);
+	MLX5_SET(destroy_uctx_in, in, uid, uid);
+
+	err = mlx5_cmd_exec_in(mdev, destroy_uctx, in);
+	mlx5ctl_dbg(mcdev, "released uid %d err(%d)\n", uid, err);
+}
+
+static void mcdev_get(struct mlx5ctl_dev *mcdev);
+static void mcdev_put(struct mlx5ctl_dev *mcdev);
+
+static int mlx5ctl_open_mfd(struct mlx5ctl_fd *mfd)
+{
+	struct mlx5_core_dev *mdev = mfd->mcdev->mdev;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	u32 ucap = 0, cap = 0;
+	int uid;
+
+#define MLX5_UCTX_CAP(mdev, cap) \
+	(MLX5_CAP_GEN(mdev, uctx_cap) & MLX5_UCTX_OBJECT_CAP_##cap)
+
+	if (capable(CAP_NET_RAW) && MLX5_UCTX_CAP(mdev, RAW_TX)) {
+		ucap |= CAP_NET_RAW;
+		cap |= MLX5_UCTX_OBJECT_CAP_RAW_TX;
+	}
+
+	if (capable(CAP_SYS_RAWIO) && MLX5_UCTX_CAP(mdev, INTERNAL_DEVICE_RESOURCES)) {
+		ucap |= CAP_SYS_RAWIO;
+		cap |= MLX5_UCTX_OBJECT_CAP_INTERNAL_DEVICE_RESOURCES;
+	}
+
+	if (capable(CAP_SYS_ADMIN) && MLX5_UCTX_CAP(mdev, TOOLS_RESOURCES)) {
+		ucap |= CAP_SYS_ADMIN;
+		cap |= MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES;
+	}
+
+	uid = mlx5ctl_alloc_uid(mcdev, cap);
+	if (uid < 0)
+		return uid;
+
+	mfd->uctx_uid = uid;
+	mfd->uctx_cap = cap;
+	mfd->ucap = ucap;
+	mfd->mcdev = mcdev;
+
+	mlx5ctl_dbg(mcdev, "allocated uid %d with uctx caps 0x%x, user cap 0x%x\n",
+		    uid, cap, ucap);
+	return 0;
+}
+
+static void mlx5ctl_release_mfd(struct mlx5ctl_fd *mfd)
+{
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+
+	mlx5ctl_release_uid(mcdev,  mfd->uctx_uid);
+}
+
+static int mlx5ctl_open(struct inode *inode, struct file *file)
+{
+	struct mlx5_core_dev *mdev;
+	struct mlx5ctl_dev *mcdev;
+	struct mlx5ctl_fd *mfd;
+	int err = 0;
+
+	mcdev = container_of(file->private_data, struct mlx5ctl_dev, miscdev);
+	mcdev_get(mcdev);
+	down_read(&mcdev->rw_lock);
+	mdev = mcdev->mdev;
+	if (!mdev) {
+		err = -ENODEV;
+		goto unlock;
+	}
+
+	mfd = kzalloc(sizeof(*mfd), GFP_KERNEL_ACCOUNT);
+	if (!mfd)
+		return -ENOMEM;
+
+	mfd->mcdev = mcdev;
+	err = mlx5ctl_open_mfd(mfd);
+	if (err)
+		goto unlock;
+
+	spin_lock(&mcdev->fd_list_lock);
+	list_add_tail(&mfd->list, &mcdev->fd_list);
+	spin_unlock(&mcdev->fd_list_lock);
+
+	file->private_data = mfd;
+
+unlock:
+	up_read(&mcdev->rw_lock);
+	if (err) {
+		mcdev_put(mcdev);
+		kfree(mfd);
+	}
+	return err;
+}
+
+static int mlx5ctl_release(struct inode *inode, struct file *file)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+
+	down_read(&mcdev->rw_lock);
+	if (!mcdev->mdev) {
+		pr_debug("[%d] UID %d mlx5ctl: mdev is already released\n",
+			 current->pid, mfd->uctx_uid);
+		/* All mfds are already released, skip ... */
+		goto unlock;
+	}
+
+	spin_lock(&mcdev->fd_list_lock);
+	list_del(&mfd->list);
+	spin_unlock(&mcdev->fd_list_lock);
+
+	mlx5ctl_release_mfd(mfd);
+
+unlock:
+	kfree(mfd);
+	up_read(&mcdev->rw_lock);
+	mcdev_put(mcdev);
+	file->private_data = NULL;
+	return 0;
+}
+
+static const struct file_operations mlx5ctl_fops = {
+	.owner = THIS_MODULE,
+	.open = mlx5ctl_open,
+	.release = mlx5ctl_release,
+};
+
+static int mlx5ctl_probe(struct auxiliary_device *adev,
+			 const struct auxiliary_device_id *id)
+
+{
+	struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
+	struct mlx5_core_dev *mdev = madev->mdev;
+	struct mlx5ctl_dev *mcdev;
+	char *devname = NULL;
+	int err;
+
+	mcdev = kzalloc(sizeof(*mcdev), GFP_KERNEL_ACCOUNT);
+	if (!mcdev)
+		return -ENOMEM;
+
+	kref_init(&mcdev->refcount);
+	INIT_LIST_HEAD(&mcdev->fd_list);
+	spin_lock_init(&mcdev->fd_list_lock);
+	init_rwsem(&mcdev->rw_lock);
+	mcdev->mdev = mdev;
+	mcdev->adev = adev;
+	devname = kasprintf(GFP_KERNEL_ACCOUNT, "mlx5ctl-%s",
+			    dev_name(&adev->dev));
+	if (!devname) {
+		err = -ENOMEM;
+		goto abort;
+	}
+
+	mcdev->miscdev = (struct miscdevice) {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = devname,
+		.fops = &mlx5ctl_fops,
+		.parent = &adev->dev,
+	};
+
+	err = misc_register(&mcdev->miscdev);
+	if (err) {
+		mlx5ctl_err(mcdev, "mlx5ctl: failed to register misc device err %d\n", err);
+		goto abort;
+	}
+
+	mlx5ctl_dbg(mcdev, "probe mdev@%s %s\n", dev_driver_string(mdev->device), dev_name(mdev->device));
+
+	auxiliary_set_drvdata(adev, mcdev);
+
+	return 0;
+
+abort:
+	kfree(devname);
+	kfree(mcdev);
+	return err;
+}
+
+static void mlx5ctl_remove(struct auxiliary_device *adev)
+{
+	struct mlx5ctl_dev *mcdev = auxiliary_get_drvdata(adev);
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	struct mlx5ctl_fd *mfd, *n;
+
+	misc_deregister(&mcdev->miscdev);
+	down_write(&mcdev->rw_lock);
+
+	list_for_each_entry_safe(mfd, n, &mcdev->fd_list, list) {
+		mlx5ctl_dbg(mcdev, "UID %d still has open FDs\n", mfd->uctx_uid);
+		list_del(&mfd->list);
+		mlx5ctl_release_mfd(mfd);
+	}
+
+	mlx5ctl_dbg(mcdev, "removed mdev %s %s\n",
+		    dev_driver_string(mdev->device), dev_name(mdev->device));
+
+	mcdev->mdev = NULL; /* prevent already open fds from accessing the device */
+	up_write(&mcdev->rw_lock);
+	mcdev_put(mcdev);
+}
+
+static void mcdev_free(struct kref *ref)
+{
+	struct mlx5ctl_dev *mcdev = container_of(ref, struct mlx5ctl_dev, refcount);
+
+	kfree(mcdev->miscdev.name);
+	kfree(mcdev);
+}
+
+static void mcdev_get(struct mlx5ctl_dev *mcdev)
+{
+	kref_get(&mcdev->refcount);
+}
+
+static void mcdev_put(struct mlx5ctl_dev *mcdev)
+{
+	kref_put(&mcdev->refcount, mcdev_free);
+}
+
+static const struct auxiliary_device_id mlx5ctl_id_table[] = {
+	{ .name = MLX5_ADEV_NAME ".ctl", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, mlx5ctl_id_table);
+
+static struct auxiliary_driver mlx5ctl_driver = {
+	.name = "ctl",
+	.probe = mlx5ctl_probe,
+	.remove = mlx5ctl_remove,
+	.id_table = mlx5ctl_id_table,
+};
+
+module_auxiliary_driver(mlx5ctl_driver);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 1/5] mlx5: Add aux dev for ctl interface Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Saeed Mahameed
@ 2023-10-18  8:19 ` Saeed Mahameed
  2023-10-18  9:02   ` Arnd Bergmann
                     ` (2 more replies)
  2023-10-18  8:19 ` [PATCH 4/5] misc: mlx5ctl: Add command rpc ioctl Saeed Mahameed
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Implement INFO ioctl to return the allocated UID and the capability flags
and some other useful device information such as the underlying ConnectX
device.

Example:
$ mlx5ctl mlx5_core.ctl.0
mlx5dev: 0000:00:04.0
UCTX UID: 1
UCTX CAP: 0x3
DEV UCTX CAP: 0x3
USER CAP: 0x1d

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../userspace-api/ioctl/ioctl-number.rst      |  1 +
 drivers/misc/mlx5ctl/main.c                   | 72 +++++++++++++++++++
 include/uapi/misc/mlx5ctl.h                   | 24 +++++++
 3 files changed, 97 insertions(+)
 create mode 100644 include/uapi/misc/mlx5ctl.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 4ea5b837399a..9faf91ffefff 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -89,6 +89,7 @@ Code  Seq#    Include File                                           Comments
 0x20  all    drivers/cdrom/cm206.h
 0x22  all    scsi/sg.h
 0x3E  00-0F  linux/counter.h                                         <mailto:linux-iio@vger.kernel.org>
+0x5c  all    uapi/misc/mlx5ctl.h                                     Nvidia ConnectX control
 '!'   00-1F  uapi/linux/seccomp.h
 '#'   00-3F                                                          IEEE 1394 Subsystem
                                                                      Block for the entire subsystem
diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
index de8d6129432c..008ad3a12d97 100644
--- a/drivers/misc/mlx5ctl/main.c
+++ b/drivers/misc/mlx5ctl/main.c
@@ -8,6 +8,7 @@
 #include <linux/auxiliary_bus.h>
 #include <linux/mlx5/device.h>
 #include <linux/mlx5/driver.h>
+#include <uapi/misc/mlx5ctl.h>
 #include <linux/atomic.h>
 #include <linux/refcount.h>
 
@@ -198,10 +199,81 @@ static int mlx5ctl_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static int mlx5ctl_info_ioctl(struct file *file, void __user *arg, size_t usize)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	struct mlx5ctl_info *info;
+	void *kbuff = NULL;
+	size_t ksize = 0;
+	int err = 0;
+
+	ksize = max(sizeof(struct mlx5ctl_info), usize);
+	kbuff = kzalloc(ksize, GFP_KERNEL_ACCOUNT);
+	if (!kbuff)
+		return -ENOMEM;
+
+	info = kbuff;
+
+	info->size = sizeof(struct mlx5ctl_info);
+	info->flags = 0;
+
+	info->dev_uctx_cap = MLX5_CAP_GEN(mdev, uctx_cap);
+	info->uctx_cap = mfd->uctx_cap;
+	info->uctx_uid = mfd->uctx_uid;
+	info->ucap = mfd->ucap;
+
+	strscpy(info->devname, dev_name(&mdev->pdev->dev), sizeof(info->devname));
+
+	if (copy_to_user(arg, kbuff, usize))
+		err = -EFAULT;
+
+	kfree(kbuff);
+	return err;
+}
+
+static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	void __user *argp = (void __user *)arg;
+	size_t size = _IOC_SIZE(cmd);
+	int err = 0;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	mlx5ctl_dbg(mcdev, "ioctl 0x%x type/nr: %d/%d size: %d DIR:%d\n", cmd,
+		    _IOC_TYPE(cmd), _IOC_NR(cmd), _IOC_SIZE(cmd), _IOC_DIR(cmd));
+
+	down_read(&mcdev->rw_lock);
+	if (!mcdev->mdev) {
+		err = -ENODEV;
+		goto unlock;
+	}
+
+	switch (cmd) {
+	case MLX5CTL_IOCTL_INFO:
+		err = mlx5ctl_info_ioctl(file, argp, size);
+		break;
+
+	default:
+		mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd);
+		err = -ENOIOCTLCMD;
+		break;
+	}
+unlock:
+	up_read(&mcdev->rw_lock);
+	return err;
+}
+
 static const struct file_operations mlx5ctl_fops = {
 	.owner = THIS_MODULE,
 	.open = mlx5ctl_open,
 	.release = mlx5ctl_release,
+	.unlocked_ioctl = mlx5ctl_ioctl,
 };
 
 static int mlx5ctl_probe(struct auxiliary_device *adev,
diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
new file mode 100644
index 000000000000..81d89cd285fc
--- /dev/null
+++ b/include/uapi/misc/mlx5ctl.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB WITH Linux-syscall-note */
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#ifndef __MLX5CTL_IOCTL_H__
+#define __MLX5CTL_IOCTL_H__
+
+struct mlx5ctl_info {
+	__aligned_u64 flags;
+	__u32 size;
+	__u8 devname[64]; /* underlaying ConnectX device */
+	__u16 uctx_uid; /* current process allocated UCTX UID */
+	__u16 reserved1;
+	__u32 uctx_cap; /* current process effective UCTX cap */
+	__u32 dev_uctx_cap; /* device's UCTX capabilities */
+	__u32 ucap; /* process user capability */
+	__u32 reserved2[4];
+};
+
+#define MLX5CTL_IOCTL_MAGIC 0x5c
+
+#define MLX5CTL_IOCTL_INFO \
+	_IOR(MLX5CTL_IOCTL_MAGIC, 0x0, struct mlx5ctl_info)
+
+#endif /* __MLX5CTL_IOCTL_H__ */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/5] misc: mlx5ctl: Add command rpc ioctl
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
                   ` (2 preceding siblings ...)
  2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
@ 2023-10-18  8:19 ` Saeed Mahameed
  2023-10-18  8:19 ` [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Saeed Mahameed
  2023-10-18  8:31 ` [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Greg Kroah-Hartman
  5 siblings, 0 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Add new IOCTL to allow user space to send device debug rpcs and
attach the user's uctx UID to each rpc.

In the mlx5 architecture the FW RPC commands are of the format of
inbox and outbox buffers. The inbox buffer contains the command
rpc layout as described in the ConnectX Programmers Reference Manual
(PRM) document and as defined in include/linux/mlx5/mlx5_ifc.h.

On success the user outbox buffer will be filled with the device's rpc
response.

For example to query device capabilities:
a user fills out an inbox buffer with the inbox layout:
   struct mlx5_ifc_query_hca_cap_in_bits
and expects an outbox buffer with the layout:
   struct mlx5_ifc_cmd_hca_cap_bits

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/misc/mlx5ctl/main.c | 93 +++++++++++++++++++++++++++++++++++++
 include/uapi/misc/mlx5ctl.h | 13 ++++++
 2 files changed, 106 insertions(+)

diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
index 008ad3a12d97..5f4edcc3e112 100644
--- a/drivers/misc/mlx5ctl/main.c
+++ b/drivers/misc/mlx5ctl/main.c
@@ -233,6 +233,95 @@ static int mlx5ctl_info_ioctl(struct file *file, void __user *arg, size_t usize)
 	return err;
 }
 
+struct mlx5_ifc_mbox_in_hdr_bits {
+	u8         opcode[0x10];
+	u8         uid[0x10];
+
+	u8         reserved_at_20[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_at_40[0x40];
+};
+
+struct mlx5_ifc_mbox_out_hdr_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x40];
+};
+
+static int mlx5ctl_cmdrpc_ioctl(struct file *file, void __user *arg, size_t usize)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	struct mlx5ctl_cmdrpc *rpc = NULL;
+	void *in = NULL, *out = NULL;
+	size_t ksize = 0;
+	int err;
+
+	ksize = max(sizeof(struct mlx5ctl_cmdrpc), usize);
+	rpc = kzalloc(ksize, GFP_KERNEL_ACCOUNT);
+	if (!rpc)
+		return -ENOMEM;
+
+	err = copy_from_user(rpc, arg, usize);
+	if (err)
+		goto out;
+
+	mlx5ctl_dbg(mcdev, "[UID %d] cmdrpc: rpc->inlen %d rpc->outlen %d\n",
+		    mfd->uctx_uid, rpc->inlen, rpc->outlen);
+
+	if (rpc->inlen < MLX5_ST_SZ_BYTES(mbox_in_hdr) ||
+	    rpc->outlen < MLX5_ST_SZ_BYTES(mbox_out_hdr) ||
+	    rpc->inlen > MLX5CTL_MAX_RPC_SIZE ||
+	    rpc->outlen > MLX5CTL_MAX_RPC_SIZE) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (rpc->flags) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	in = memdup_user(u64_to_user_ptr(rpc->in), rpc->inlen);
+	if (IS_ERR(in)) {
+		err = PTR_ERR(in);
+		goto out;
+	}
+
+	out = kvzalloc(rpc->outlen, GFP_KERNEL_ACCOUNT);
+	if (!out) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	mlx5ctl_dbg(mcdev, "[UID %d] cmdif: opcode 0x%x inlen %d outlen %d\n",
+		    mfd->uctx_uid,
+		    MLX5_GET(mbox_in_hdr, in, opcode), rpc->inlen, rpc->outlen);
+
+	MLX5_SET(mbox_in_hdr, in, uid, mfd->uctx_uid);
+	err = mlx5_cmd_do(mcdev->mdev, in, rpc->inlen, out, rpc->outlen);
+	mlx5ctl_dbg(mcdev, "[UID %d] cmdif: opcode 0x%x retval %d\n",
+		    mfd->uctx_uid,
+		    MLX5_GET(mbox_in_hdr, in, opcode), err);
+
+	/* -EREMOTEIO means outbox is valid, but out.status is not */
+	if (!err || err == -EREMOTEIO) {
+		err = 0;
+		if (copy_to_user(u64_to_user_ptr(rpc->out), out, rpc->outlen))
+			err = -EFAULT;
+	}
+
+out:
+	kvfree(out);
+	kfree(in);
+	kfree(rpc);
+	return err;
+}
+
 static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
 			     unsigned long arg)
 {
@@ -259,6 +348,10 @@ static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
 		err = mlx5ctl_info_ioctl(file, argp, size);
 		break;
 
+	case MLX5CTL_IOCTL_CMDRPC:
+		err = mlx5ctl_cmdrpc_ioctl(file, argp, size);
+		break;
+
 	default:
 		mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd);
 		err = -ENOIOCTLCMD;
diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
index 81d89cd285fc..49c26ccc2d21 100644
--- a/include/uapi/misc/mlx5ctl.h
+++ b/include/uapi/misc/mlx5ctl.h
@@ -16,9 +16,22 @@ struct mlx5ctl_info {
 	__u32 reserved2[4];
 };
 
+struct mlx5ctl_cmdrpc {
+	__aligned_u64 in; /* RPC inbox buffer user address */
+	__aligned_u64 out; /* RPC outbox buffer user address */
+	__u32 inlen; /* inbox buffer length */
+	__u32 outlen; /* outbox buffer length */
+	__aligned_u64 flags;
+};
+
+#define MLX5CTL_MAX_RPC_SIZE 8192
+
 #define MLX5CTL_IOCTL_MAGIC 0x5c
 
 #define MLX5CTL_IOCTL_INFO \
 	_IOR(MLX5CTL_IOCTL_MAGIC, 0x0, struct mlx5ctl_info)
 
+#define MLX5CTL_IOCTL_CMDRPC \
+	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x1, struct mlx5ctl_cmdrpc)
+
 #endif /* __MLX5CTL_IOCTL_H__ */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
                   ` (3 preceding siblings ...)
  2023-10-18  8:19 ` [PATCH 4/5] misc: mlx5ctl: Add command rpc ioctl Saeed Mahameed
@ 2023-10-18  8:19 ` Saeed Mahameed
  2023-10-18  8:33   ` Greg Kroah-Hartman
  2023-10-18  9:30   ` Arnd Bergmann
  2023-10-18  8:31 ` [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Greg Kroah-Hartman
  5 siblings, 2 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-10-18  8:19 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Command rpc outbox buffer is limited in size, which can be very
annoying when trying to pull large traces out of the device.
Many device rpcs offer the ability to scatter output traces, contexts
and logs directly into user space buffers in a single shot.

Allow user to register user memory space, so the device may dump
information directly into user memory space.

The registered memory will be described by a device UMEM object which
has a unique umem_id, this umem_id can be later used in the rpc inbox
to tell the device where to populate the response output,
e.g HW traces and other debug object queries.

To do so this patch introduces two ioctls:

MLX5CTL_IOCTL_UMEM_REG(va_address, size):
 - calculate page fragments from the user provided virtual address
 - pin the pages, and allocate a sg list
 - dma map the sg list
 - create a UMEM device object that points to the dma addresses
 - add a driver umem object to an xarray data base for bookkeeping
 - return UMEM ID to user so it can be used in subsequent rpcs

MLX5CTL_IOCTL_UMEM_UNREG(umem_id):
 - user provides a pre allocated umem ID
 - unwinds the above

Example usecase, ConnectX device coredump can be as large as 2MB.
Using inline rpcs will take thousands of rpcs to get the full
coredump which can take multiple seconds.

With UMEM, it can be done in a single rpc, using 2MB of umem user buffer.

$ mlx5ctl mlx5_core.ctl.0 coredump --umem_size=$(( 2 ** 20 ))

00 00 00 00 01 00 20 00 00 00 00 04 00 00 48 ec
00 00 00 08 00 00 00 00 00 00 00 0c 00 00 00 03
00 00 00 10 00 00 00 00 00 00 00 14 00 00 00 00
....
00 50 0b 3c 00 00 00 00 00 50 0b 40 00 00 00 00
00 50 0b 44 00 00 00 00 00 50 0b 48 00 00 00 00
00 50 0c 00 00 00 00 00

INFO : Core dump done
INFO : Core dump size 831304
INFO : Core dump address 0x0
INFO : Core dump cookie 0x500c04
INFO : More Dump 0

Other usecases are: dynamic HW and FW trace monitoring, high frequency
diagnostic counters sampling and batched objects and resource dumps.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/misc/mlx5ctl/Makefile |   1 +
 drivers/misc/mlx5ctl/main.c   |  49 +++++
 drivers/misc/mlx5ctl/umem.c   | 325 ++++++++++++++++++++++++++++++++++
 drivers/misc/mlx5ctl/umem.h   |  17 ++
 include/uapi/misc/mlx5ctl.h   |  14 ++
 5 files changed, 406 insertions(+)
 create mode 100644 drivers/misc/mlx5ctl/umem.c
 create mode 100644 drivers/misc/mlx5ctl/umem.h

diff --git a/drivers/misc/mlx5ctl/Makefile b/drivers/misc/mlx5ctl/Makefile
index b5c7f99e0ab6..f35234e931a8 100644
--- a/drivers/misc/mlx5ctl/Makefile
+++ b/drivers/misc/mlx5ctl/Makefile
@@ -2,3 +2,4 @@
 
 obj-$(CONFIG_MLX5CTL) += mlx5ctl.o
 mlx5ctl-y := main.o
+mlx5ctl-y += umem.o
diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
index 5f4edcc3e112..d4d72689f6e9 100644
--- a/drivers/misc/mlx5ctl/main.c
+++ b/drivers/misc/mlx5ctl/main.c
@@ -12,6 +12,8 @@
 #include <linux/atomic.h>
 #include <linux/refcount.h>
 
+#include "umem.h"
+
 MODULE_DESCRIPTION("mlx5 ConnectX control misc driver");
 MODULE_AUTHOR("Saeed Mahameed <saeedm@nvidia.com>");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -30,6 +32,8 @@ struct mlx5ctl_fd {
 	u16 uctx_uid;
 	u32 uctx_cap;
 	u32 ucap; /* user cap */
+
+	struct mlx5ctl_umem_db *umem_db;
 	struct mlx5ctl_dev *mcdev;
 	struct list_head list;
 };
@@ -115,6 +119,12 @@ static int mlx5ctl_open_mfd(struct mlx5ctl_fd *mfd)
 	if (uid < 0)
 		return uid;
 
+	mfd->umem_db = mlx5ctl_umem_db_create(mdev, uid);
+	if (IS_ERR(mfd->umem_db)) {
+		mlx5ctl_release_uid(mcdev, uid);
+		return PTR_ERR(mfd->umem_db);
+	}
+
 	mfd->uctx_uid = uid;
 	mfd->uctx_cap = cap;
 	mfd->ucap = ucap;
@@ -129,6 +139,7 @@ static void mlx5ctl_release_mfd(struct mlx5ctl_fd *mfd)
 {
 	struct mlx5ctl_dev *mcdev = mfd->mcdev;
 
+	mlx5ctl_umem_db_destroy(mfd->umem_db);
 	mlx5ctl_release_uid(mcdev,  mfd->uctx_uid);
 }
 
@@ -322,6 +333,36 @@ static int mlx5ctl_cmdrpc_ioctl(struct file *file, void __user *arg, size_t usiz
 	return err;
 }
 
+static ssize_t mlx5ctl_ioctl_umem_reg(struct file *file, unsigned long arg)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_umem_reg umem_reg;
+	int umem_id;
+
+	if (copy_from_user(&umem_reg, (void __user *)arg, sizeof(umem_reg)))
+		return -EFAULT;
+
+	umem_id = mlx5ctl_umem_reg(mfd->umem_db, (unsigned long)umem_reg.addr, umem_reg.len);
+	if (umem_id < 0)
+		return umem_id;
+
+	umem_reg.umem_id = umem_id;
+
+	if (copy_to_user((void __user *)arg, &umem_reg, sizeof(umem_reg))) {
+		mlx5ctl_umem_unreg(mfd->umem_db, umem_id);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static size_t mlx5ctl_ioctl_umem_unreg(struct file *file, unsigned long arg)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+
+	return mlx5ctl_umem_unreg(mfd->umem_db, (u32)arg);
+}
+
 static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
 			     unsigned long arg)
 {
@@ -352,6 +393,14 @@ static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
 		err = mlx5ctl_cmdrpc_ioctl(file, argp, size);
 		break;
 
+	case MLX5CTL_IOCTL_UMEM_REG:
+		err = mlx5ctl_ioctl_umem_reg(file, arg);
+		break;
+
+	case MLX5CTL_IOCTL_UMEM_UNREG:
+		err = mlx5ctl_ioctl_umem_unreg(file, arg);
+		break;
+
 	default:
 		mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd);
 		err = -ENOIOCTLCMD;
diff --git a/drivers/misc/mlx5ctl/umem.c b/drivers/misc/mlx5ctl/umem.c
new file mode 100644
index 000000000000..c21b54d24762
--- /dev/null
+++ b/drivers/misc/mlx5ctl/umem.c
@@ -0,0 +1,325 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include <linux/mlx5/device.h>
+#include <linux/mlx5/driver.h>
+#include <uapi/misc/mlx5ctl.h>
+
+#include "umem.h"
+
+#define umem_dbg(__mdev, fmt, ...) \
+	dev_dbg((__mdev)->device, "mlx5ctl_umem: " fmt, ##__VA_ARGS__)
+
+#define MLX5CTL_UMEM_MAX_MB 64
+
+static size_t umem_num_pages(u64 addr, size_t len)
+{
+	return (size_t)((ALIGN(addr + len, PAGE_SIZE) -
+			 ALIGN_DOWN(addr, PAGE_SIZE))) /
+			 PAGE_SIZE;
+}
+
+struct mlx5ctl_umem {
+	struct sg_table sgt;
+	unsigned long addr;
+	size_t size;
+	size_t offset;
+	size_t npages;
+	struct task_struct *source_task;
+	struct mm_struct *source_mm;
+	struct user_struct *source_user;
+	u32 umem_id;
+	struct page **page_list;
+};
+
+struct mlx5ctl_umem_db {
+	struct xarray xarray;
+	struct mlx5_core_dev *mdev;
+	u32 uctx_uid;
+};
+
+static int inc_user_locked_vm(struct mlx5ctl_umem *umem, unsigned long npages)
+{
+	unsigned long lock_limit;
+	unsigned long cur_pages;
+	unsigned long new_pages;
+
+	lock_limit = task_rlimit(umem->source_task, RLIMIT_MEMLOCK) >>
+		     PAGE_SHIFT;
+	do {
+		cur_pages = atomic_long_read(&umem->source_user->locked_vm);
+		new_pages = cur_pages + npages;
+		if (new_pages > lock_limit)
+			return -ENOMEM;
+	} while (atomic_long_cmpxchg(&umem->source_user->locked_vm, cur_pages,
+				     new_pages) != cur_pages);
+	return 0;
+}
+
+static void dec_user_locked_vm(struct mlx5ctl_umem *umem, unsigned long npages)
+{
+	if (WARN_ON(atomic_long_read(&umem->source_user->locked_vm) < npages))
+		return;
+	atomic_long_sub(npages, &umem->source_user->locked_vm);
+}
+
+static struct mlx5ctl_umem *mlx5ctl_umem_pin(struct mlx5ctl_umem_db *umem_db,
+					     unsigned long addr, size_t size)
+{
+	size_t npages = umem_num_pages(addr, size);
+	struct mlx5_core_dev *mdev = umem_db->mdev;
+	unsigned long endaddr = addr + size;
+	struct mlx5ctl_umem *umem;
+	struct page **page_list;
+	int err = -EINVAL;
+	int pinned = 0;
+
+	umem_dbg(mdev, "%s: addr %p size %zu npages %zu\n",
+		 __func__, (void *)addr, size, npages);
+
+	/* Avoid integer overflow */
+	if (endaddr < addr || PAGE_ALIGN(endaddr) < endaddr)
+		return ERR_PTR(-EINVAL);
+
+	if (npages == 0 || pages_to_mb(npages) > MLX5CTL_UMEM_MAX_MB)
+		return ERR_PTR(-EINVAL);
+
+	page_list = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL_ACCOUNT);
+	if (!page_list)
+		return ERR_PTR(-ENOMEM);
+
+	umem = kzalloc(sizeof(*umem), GFP_KERNEL_ACCOUNT);
+	if (!umem) {
+		kvfree(page_list);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	umem->addr = addr;
+	umem->size = size;
+	umem->offset = addr & ~PAGE_MASK;
+	umem->npages = npages;
+
+	umem->page_list = page_list;
+	umem->source_mm = current->mm;
+	umem->source_task = current->group_leader;
+	get_task_struct(current->group_leader);
+	umem->source_user = get_uid(current_user());
+
+	/* mm and RLIMIT_MEMLOCK user task accounting similar to what is
+	 * being done in iopt_alloc_pages() and do_update_pinned()
+	 * for IOPT_PAGES_ACCOUNT_USER @drivers/iommu/iommufd/pages.c
+	 */
+	mmgrab(umem->source_mm);
+
+	pinned = pin_user_pages_fast(addr, npages, FOLL_WRITE, page_list);
+	if (pinned != npages) {
+		umem_dbg(mdev, "pin_user_pages_fast failed %d\n", pinned);
+		err = pinned < 0 ? pinned : -ENOMEM;
+		goto pin_failed;
+	}
+
+	err = inc_user_locked_vm(umem, npages);
+	if (err)
+		goto pin_failed;
+
+	atomic64_add(npages, &umem->source_mm->pinned_vm);
+
+	err = sg_alloc_table_from_pages(&umem->sgt, page_list, npages, 0,
+					npages << PAGE_SHIFT, GFP_KERNEL_ACCOUNT);
+	if (err) {
+		umem_dbg(mdev, "sg_alloc_table failed: %d\n", err);
+		goto sgt_failed;
+	}
+
+	umem_dbg(mdev, "\tsgt: size %zu npages %zu sgt.nents (%d)\n",
+		 size, npages, umem->sgt.nents);
+
+	err = dma_map_sgtable(mdev->device, &umem->sgt, DMA_BIDIRECTIONAL, 0);
+	if (err) {
+		umem_dbg(mdev, "dma_map_sgtable failed: %d\n", err);
+		goto dma_failed;
+	}
+
+	umem_dbg(mdev, "\tsgt: dma_nents %d\n", umem->sgt.nents);
+	return umem;
+
+dma_failed:
+sgt_failed:
+	sg_free_table(&umem->sgt);
+	atomic64_sub(npages, &umem->source_mm->pinned_vm);
+	dec_user_locked_vm(umem, npages);
+pin_failed:
+	if (pinned > 0)
+		unpin_user_pages(page_list, pinned);
+	mmdrop(umem->source_mm);
+	free_uid(umem->source_user);
+	put_task_struct(umem->source_task);
+
+	kfree(umem);
+	kvfree(page_list);
+	return ERR_PTR(err);
+}
+
+static void mlx5ctl_umem_unpin(struct mlx5ctl_umem_db *umem_db,
+			       struct mlx5ctl_umem *umem)
+{
+	struct mlx5_core_dev *mdev = umem_db->mdev;
+
+	umem_dbg(mdev, "%s: addr %p size %zu npages %zu dma_nents %d\n",
+		 __func__, (void *)umem->addr, umem->size, umem->npages,
+		 umem->sgt.nents);
+
+	dma_unmap_sgtable(mdev->device, &umem->sgt, DMA_BIDIRECTIONAL, 0);
+	sg_free_table(&umem->sgt);
+
+	atomic64_sub(umem->npages, &umem->source_mm->pinned_vm);
+	dec_user_locked_vm(umem, umem->npages);
+	unpin_user_pages(umem->page_list, umem->npages);
+	mmdrop(umem->source_mm);
+	free_uid(umem->source_user);
+	put_task_struct(umem->source_task);
+
+	kvfree(umem->page_list);
+	kfree(umem);
+}
+
+static int mlx5ctl_umem_create(struct mlx5_core_dev *mdev,
+			       struct mlx5ctl_umem *umem, u32 uid)
+{
+	u32 out[MLX5_ST_SZ_DW(create_umem_out)] = {};
+	int err, inlen, i, n = 0;
+	struct scatterlist *sg;
+	void *in, *umemptr;
+	__be64 *mtt;
+
+	inlen = MLX5_ST_SZ_BYTES(create_umem_in) +
+		umem->npages * MLX5_ST_SZ_BYTES(mtt);
+
+	in = kzalloc(inlen, GFP_KERNEL_ACCOUNT);
+	if (!in)
+		return -ENOMEM;
+
+	MLX5_SET(create_umem_in, in, opcode, MLX5_CMD_OP_CREATE_UMEM);
+	MLX5_SET(create_umem_in, in, uid, uid);
+
+	umemptr = MLX5_ADDR_OF(create_umem_in, in, umem);
+
+	MLX5_SET(umem, umemptr, log_page_size,
+		 PAGE_SHIFT - MLX5_ADAPTER_PAGE_SHIFT);
+	MLX5_SET64(umem, umemptr, num_of_mtt, umem->npages);
+	MLX5_SET(umem, umemptr, page_offset, umem->offset);
+
+	umem_dbg(mdev,
+		 "UMEM CREATE: log_page_size %d num_of_mtt %lld page_offset %d\n",
+		 MLX5_GET(umem, umemptr, log_page_size),
+		 MLX5_GET64(umem, umemptr, num_of_mtt),
+		 MLX5_GET(umem, umemptr, page_offset));
+
+	mtt = MLX5_ADDR_OF(create_umem_in, in, umem.mtt);
+	for_each_sgtable_dma_sg(&umem->sgt, sg, i) {
+		u64 dma_addr = sg_dma_address(sg);
+		ssize_t len = sg_dma_len(sg);
+
+		for (; n < umem->npages && len > 0; n++, mtt++) {
+			*mtt = cpu_to_be64(dma_addr);
+			MLX5_SET(mtt, mtt, wr_en, 1);
+			MLX5_SET(mtt, mtt, rd_en, 1);
+			dma_addr += PAGE_SIZE;
+			len -= PAGE_SIZE;
+		}
+		WARN_ON_ONCE(n == umem->npages && len > 0);
+	}
+
+	err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
+	if (err)
+		goto out;
+
+	umem->umem_id = MLX5_GET(create_umem_out, out, umem_id);
+	umem_dbg(mdev, "\tUMEM CREATED: umem_id %d\n", umem->umem_id);
+out:
+	kfree(in);
+	return err;
+}
+
+static void mlx5ctl_umem_destroy(struct mlx5_core_dev *mdev,
+				 struct mlx5ctl_umem *umem)
+{
+	u32 in[MLX5_ST_SZ_DW(destroy_umem_in)] = {};
+
+	MLX5_SET(destroy_umem_in, in, opcode, MLX5_CMD_OP_DESTROY_UMEM);
+	MLX5_SET(destroy_umem_in, in, umem_id, umem->umem_id);
+
+	umem_dbg(mdev, "UMEM DESTROY: umem_id %d\n", umem->umem_id);
+	mlx5_cmd_exec_in(mdev, destroy_umem, in);
+}
+
+int mlx5ctl_umem_reg(struct mlx5ctl_umem_db *umem_db, unsigned long addr,
+		     size_t size)
+{
+	struct mlx5ctl_umem *umem;
+	void *ret;
+	int err;
+
+	umem = mlx5ctl_umem_pin(umem_db, addr, size);
+	if (IS_ERR(umem))
+		return PTR_ERR(umem);
+
+	err = mlx5ctl_umem_create(umem_db->mdev, umem, umem_db->uctx_uid);
+	if (err)
+		goto umem_create_err;
+
+	ret = xa_store(&umem_db->xarray, umem->umem_id, umem, GFP_KERNEL_ACCOUNT);
+	if (WARN(xa_is_err(ret), "Failed to store UMEM")) {
+		err = xa_err(ret);
+		goto xa_store_err;
+	}
+
+	return umem->umem_id;
+
+xa_store_err:
+	mlx5ctl_umem_destroy(umem_db->mdev, umem);
+umem_create_err:
+	mlx5ctl_umem_unpin(umem_db, umem);
+	return err;
+}
+
+int mlx5ctl_umem_unreg(struct mlx5ctl_umem_db *umem_db, u32 umem_id)
+{
+	struct mlx5ctl_umem *umem;
+
+	umem = xa_erase(&umem_db->xarray, umem_id);
+	if (!umem)
+		return -ENOENT;
+
+	mlx5ctl_umem_destroy(umem_db->mdev, umem);
+	mlx5ctl_umem_unpin(umem_db, umem);
+	return 0;
+}
+
+struct mlx5ctl_umem_db *mlx5ctl_umem_db_create(struct mlx5_core_dev *mdev,
+					       u32 uctx_uid)
+{
+	struct mlx5ctl_umem_db *umem_db;
+
+	umem_db = kzalloc(sizeof(*umem_db), GFP_KERNEL_ACCOUNT);
+	if (!umem_db)
+		return ERR_PTR(-ENOMEM);
+
+	xa_init(&umem_db->xarray);
+	umem_db->mdev = mdev;
+	umem_db->uctx_uid = uctx_uid;
+
+	return umem_db;
+}
+
+void mlx5ctl_umem_db_destroy(struct mlx5ctl_umem_db *umem_db)
+{
+	struct mlx5ctl_umem *umem;
+	unsigned long index;
+
+	xa_for_each(&umem_db->xarray, index, umem)
+		mlx5ctl_umem_unreg(umem_db, umem->umem_id);
+
+	xa_destroy(&umem_db->xarray);
+	kfree(umem_db);
+}
diff --git a/drivers/misc/mlx5ctl/umem.h b/drivers/misc/mlx5ctl/umem.h
new file mode 100644
index 000000000000..880bf66e600d
--- /dev/null
+++ b/drivers/misc/mlx5ctl/umem.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#ifndef __MLX5CTL_UMEM_H__
+#define __MLX5CTL_UMEM_H__
+
+#include <linux/types.h>
+#include <linux/mlx5/driver.h>
+
+struct mlx5ctl_umem_db;
+
+struct mlx5ctl_umem_db *mlx5ctl_umem_db_create(struct mlx5_core_dev *mdev, u32 uctx_uid);
+void mlx5ctl_umem_db_destroy(struct mlx5ctl_umem_db *umem_db);
+int mlx5ctl_umem_reg(struct mlx5ctl_umem_db *umem_db, unsigned long addr, size_t size);
+int mlx5ctl_umem_unreg(struct mlx5ctl_umem_db *umem_db, u32 umem_id);
+
+#endif /* __MLX5CTL_UMEM_H__ */
diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
index 49c26ccc2d21..c0960ad1a8f0 100644
--- a/include/uapi/misc/mlx5ctl.h
+++ b/include/uapi/misc/mlx5ctl.h
@@ -24,6 +24,14 @@ struct mlx5ctl_cmdrpc {
 	__aligned_u64 flags;
 };
 
+struct mlx5ctl_umem_reg {
+	__aligned_u64 addr; /* user address */
+	__aligned_u64 len; /* user buffer length */
+	__aligned_u64 flags;
+	__u32 umem_id; /* returned device's umem ID */
+	__u32 reserved[7];
+};
+
 #define MLX5CTL_MAX_RPC_SIZE 8192
 
 #define MLX5CTL_IOCTL_MAGIC 0x5c
@@ -34,4 +42,10 @@ struct mlx5ctl_cmdrpc {
 #define MLX5CTL_IOCTL_CMDRPC \
 	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x1, struct mlx5ctl_cmdrpc)
 
+#define MLX5CTL_IOCTL_UMEM_REG \
+	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x2, struct mlx5ctl_umem_reg)
+
+#define MLX5CTL_IOCTL_UMEM_UNREG \
+	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x3, unsigned long)
+
 #endif /* __MLX5CTL_IOCTL_H__ */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:19 ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Saeed Mahameed
@ 2023-10-18  8:30   ` Greg Kroah-Hartman
  2023-10-18  8:49     ` Leon Romanovsky
  2023-10-18 18:01     ` Jason Gunthorpe
  2023-10-18  8:30   ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Greg Kroah-Hartman
  1 sibling, 2 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18  8:30 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Arnd Bergmann, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB

For dual-licensed code, I need a LOT of documentation as to why this
must be dual-licensed, AND a signed-off-by from your corporate lawyer
agreeing to it so they convey an understanding of just how complex and
messy this will get over time and what you are agreeing to do here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:19 ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Saeed Mahameed
  2023-10-18  8:30   ` Greg Kroah-Hartman
@ 2023-10-18  8:30   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18  8:30 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Arnd Bergmann, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB

<snip>

> +MODULE_LICENSE("Dual BSD/GPL");

See, it's a mess already :(


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] mlx5 ConnectX diagnostic misc driver
  2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
                   ` (4 preceding siblings ...)
  2023-10-18  8:19 ` [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Saeed Mahameed
@ 2023-10-18  8:31 ` Greg Kroah-Hartman
  2023-10-18 12:00   ` Jason Gunthorpe
  5 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18  8:31 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Arnd Bergmann, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 01:19:36AM -0700, Saeed Mahameed wrote:
> Historically a userspace program was used that accessed the PCI register
> and config space directly through /sys/bus/pci/.../XXX and could operate
> these debugging interfaces in parallel with the running driver.
> This approach is incompatible with secure boot and kernel lockdown so this
> driver provides a secure and restricted interface to that.

Why not just write a UIO driver for this hardware then?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  8:19 ` [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Saeed Mahameed
@ 2023-10-18  8:33   ` Greg Kroah-Hartman
  2023-11-19  9:49     ` Saeed Mahameed
  2023-10-18  9:30   ` Arnd Bergmann
  1 sibling, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18  8:33 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Arnd Bergmann, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 01:19:41AM -0700, Saeed Mahameed wrote:
> +#define umem_dbg(__mdev, fmt, ...) \
> +	dev_dbg((__mdev)->device, "mlx5ctl_umem: " fmt, ##__VA_ARGS__)

That's really really odd, and should not be needed for dev_dbg() because
you already have the driver name and file name and line information in
that message.  Why add yet-another-prefix?  Please just use normal
dev_dbg() lines.

> +#define MLX5CTL_UMEM_MAX_MB 64
> +
> +static size_t umem_num_pages(u64 addr, size_t len)
> +{
> +	return (size_t)((ALIGN(addr + len, PAGE_SIZE) -
> +			 ALIGN_DOWN(addr, PAGE_SIZE))) /
> +			 PAGE_SIZE;
> +}

We don't have a function or macro for this already?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:30   ` Greg Kroah-Hartman
@ 2023-10-18  8:49     ` Leon Romanovsky
  2023-10-18  8:55       ` Greg Kroah-Hartman
  2023-10-18 18:01     ` Jason Gunthorpe
  1 sibling, 1 reply; 36+ messages in thread
From: Leon Romanovsky @ 2023-10-18  8:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> 
> For dual-licensed code, I need a LOT of documentation as to why this
> must be dual-licensed, AND a signed-off-by from your corporate lawyer
> agreeing to it so they convey an understanding of just how complex and
> messy this will get over time and what you are agreeing to do here.

This is how we (NBU, Networking Business Unit in Nvidia, former Mellanox)
are instructed to submit all our code by default. This license is aligned
with our networking, vdpa and RDMA code.

So yes, our legal understands pros and cons of dual-license code.

Thanks

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:49     ` Leon Romanovsky
@ 2023-10-18  8:55       ` Greg Kroah-Hartman
  2023-10-18 10:00         ` Leon Romanovsky
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18  8:55 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 11:49:08AM +0300, Leon Romanovsky wrote:
> On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > 
> > For dual-licensed code, I need a LOT of documentation as to why this
> > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > agreeing to it so they convey an understanding of just how complex and
> > messy this will get over time and what you are agreeing to do here.
> 
> This is how we (NBU, Networking Business Unit in Nvidia, former Mellanox)
> are instructed to submit all our code by default. This license is aligned
> with our networking, vdpa and RDMA code.

Please don't do this without a really really good reason.  Especially
for a "misc" driver that is so linux-dependant here.  The "Linux-OpenIB"
license is old, obsolete, and problematic and should not be added to any
new files in the kernel tree, outside of the island of
drivers/infiniband/ as you all insist on that crazyness there.

Heck, it's even documented that you shouldn't be adding that license to
any new files, why ignore that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
@ 2023-10-18  9:02   ` Arnd Bergmann
  2023-10-18 10:08     ` Leon Romanovsky
  2023-10-22  1:46   ` kernel test robot
  2023-10-22 11:27   ` kernel test robot
  2 siblings, 1 reply; 36+ messages in thread
From: Arnd Bergmann @ 2023-10-18  9:02 UTC (permalink / raw)
  To: Saeed Mahameed, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

On Wed, Oct 18, 2023, at 10:19, Saeed Mahameed wrote:

> Implement INFO ioctl to return the allocated UID and the capability flags
> and some other useful device information such as the underlying ConnectX
> device.

I'm commenting on the ABI here, ignoring everything for the moment.

>  static const struct file_operations mlx5ctl_fops = {
>  	.owner = THIS_MODULE,
>  	.open = mlx5ctl_open,
>  	.release = mlx5ctl_release,
> +	.unlocked_ioctl = mlx5ctl_ioctl,
>  };

There should be a .compat_ioctl entry as well, to allow 32-bit
tasks to use the same interface.
 
>  static int mlx5ctl_probe(struct auxiliary_device *adev,
> diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
> new file mode 100644
> index 000000000000..81d89cd285fc
> --- /dev/null
> +++ b/include/uapi/misc/mlx5ctl.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB WITH Linux-syscall-note */
> +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
> +
> +#ifndef __MLX5CTL_IOCTL_H__
> +#define __MLX5CTL_IOCTL_H__
> +
> +struct mlx5ctl_info {
> +	__aligned_u64 flags;
> +	__u32 size;
> +	__u8 devname[64]; /* underlaying ConnectX device */
> +	__u16 uctx_uid; /* current process allocated UCTX UID */

I don't know what a UCTX UID is, but if this is related to
uid_t, it has to be 32 bit wide.

> +	__u16 reserved1;
> +	__u32 uctx_cap; /* current process effective UCTX cap */
> +	__u32 dev_uctx_cap; /* device's UCTX capabilities */
> +	__u32 ucap; /* process user capability */
> +	__u32 reserved2[4];
> +};

If I'm counting right, this structure has a size of
108 bytes but an alignment of 8 bytes, so you end up with
a 4-byte hole at the end, which introduces a risk of
leaking kernel data. Maybe give it an extra reserved word?

     Arnd

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  8:19 ` [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Saeed Mahameed
  2023-10-18  8:33   ` Greg Kroah-Hartman
@ 2023-10-18  9:30   ` Arnd Bergmann
  2023-10-18 11:51     ` Jason Gunthorpe
  2023-11-19  9:44     ` Saeed Mahameed
  1 sibling, 2 replies; 36+ messages in thread
From: Arnd Bergmann @ 2023-10-18  9:30 UTC (permalink / raw)
  To: Saeed Mahameed, Greg Kroah-Hartman
  Cc: linux-kernel, Leon Romanovsky, Jason Gunthorpe, Jiri Pirko,
	Saeed Mahameed

On Wed, Oct 18, 2023, at 10:19, Saeed Mahameed wrote:
> From: Saeed Mahameed <saeedm@nvidia.com>

>
> To do so this patch introduces two ioctls:
>
> MLX5CTL_IOCTL_UMEM_REG(va_address, size):
>  - calculate page fragments from the user provided virtual address
>  - pin the pages, and allocate a sg list
>  - dma map the sg list
>  - create a UMEM device object that points to the dma addresses
>  - add a driver umem object to an xarray data base for bookkeeping
>  - return UMEM ID to user so it can be used in subsequent rpcs
>
> MLX5CTL_IOCTL_UMEM_UNREG(umem_id):
>  - user provides a pre allocated umem ID
>  - unwinds the above
>

> +static ssize_t mlx5ctl_ioctl_umem_reg(struct file *file, unsigned long 
> arg)
> +{
> +	struct mlx5ctl_fd *mfd = file->private_data;
> +	struct mlx5ctl_umem_reg umem_reg;
> +	int umem_id;
> +
> +	if (copy_from_user(&umem_reg, (void __user *)arg, sizeof(umem_reg)))
> +		return -EFAULT;

Instead of an in-place cast, the usual way to do it is to
have a local pointer of the correct type
(struct mlx5ctl_umem_reg __iomem *) as the function argument.

> +
> +	umem_id = mlx5ctl_umem_reg(mfd->umem_db, (unsigned 
> long)umem_reg.addr, umem_reg.len);

umem_reg.addr seems to be a user space address, so I would
suggest consistently passing it as a 'void __user *' instead
of casting to (unsigned long) here. You can use u64_to_user_ptr()
to handle the pointer conversion correctly across all
architectures that way, and get better type checking.

> @@ -24,6 +24,14 @@ struct mlx5ctl_cmdrpc {
>  	__aligned_u64 flags;
>  };
> 
> +struct mlx5ctl_umem_reg {
> +	__aligned_u64 addr; /* user address */
> +	__aligned_u64 len; /* user buffer length */
> +	__aligned_u64 flags;
> +	__u32 umem_id; /* returned device's umem ID */
> +	__u32 reserved[7];
> +};
> +

You have a 'flags' argument that is never accessed and can
probably be removed. If the intention was to make the ioctl
extensible for the future, this doesn't work unless you
ensure that only known flags (i.e. none at this point)
are set, and it's probably a bad idea anyway, compared
to creating a new ioctl command with new semantics.

Same for the 'reserved' fields except as needed for padding.

> +#define MLX5CTL_IOCTL_UMEM_UNREG \
> +	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x3, unsigned long)

The use of 'unsigned long' here is wrong, this just makes
the command incompatible with compat mode, since
that type has different sizes. It also doesn't
match what the implementation uses, as that does
not try to read and write an 'unsigned long'
from user memory but instead takes the argument
itself as an integer. If you want to keep the use
of direct integer arguments (instead of pointer to
__u32 or __u64) here, this would have to be

   _IO(MLX5CTL_IOCTL_MAGIC, 0x3)

     Arnd

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:55       ` Greg Kroah-Hartman
@ 2023-10-18 10:00         ` Leon Romanovsky
  2023-10-18 11:52           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Leon Romanovsky @ 2023-10-18 10:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 10:55:35AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 18, 2023 at 11:49:08AM +0300, Leon Romanovsky wrote:
> > On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > > 
> > > For dual-licensed code, I need a LOT of documentation as to why this
> > > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > > agreeing to it so they convey an understanding of just how complex and
> > > messy this will get over time and what you are agreeing to do here.
> > 
> > This is how we (NBU, Networking Business Unit in Nvidia, former Mellanox)
> > are instructed to submit all our code by default. This license is aligned
> > with our networking, vdpa and RDMA code.
> 
> Please don't do this without a really really good reason.  Especially
> for a "misc" driver that is so linux-dependant here.  

Like I said, it is our default license. Saeed will change this code to be GPL-only.

> The "Linux-OpenIB" license is old, obsolete, and problematic and should not be added to any
> new files in the kernel tree, outside of the island of
> drivers/infiniband/ as you all insist on that crazyness there.
> 
> Heck, it's even documented that you shouldn't be adding that license to
> any new files, why ignore that?

I don't remember being asked about it, not in this patch
https://lore.kernel.org/all/20180425203703.568152337@linutronix.de/
and not in this later patch which changed LICENSES/other/ to be
LICENSES/deprecated/
https://lore.kernel.org/all/20190430105130.24500-4-hch@lst.de/

So it is nice that someone decided to deprecate it without notifying users
about such change. It will be better to put Linux-OpenIB to LICENSES/dual/
folder instead.

Our lawyers are perfectly fine with Linux-OpenIB license.

Thanks

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18  9:02   ` Arnd Bergmann
@ 2023-10-18 10:08     ` Leon Romanovsky
  2023-10-18 11:02       ` Arnd Bergmann
  0 siblings, 1 reply; 36+ messages in thread
From: Leon Romanovsky @ 2023-10-18 10:08 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Saeed Mahameed, Greg Kroah-Hartman, linux-kernel,
	Jason Gunthorpe, Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 11:02:00AM +0200, Arnd Bergmann wrote:
> On Wed, Oct 18, 2023, at 10:19, Saeed Mahameed wrote:
> 
> > Implement INFO ioctl to return the allocated UID and the capability flags
> > and some other useful device information such as the underlying ConnectX
> > device.
> 
> I'm commenting on the ABI here, ignoring everything for the moment.
> 
> >  static const struct file_operations mlx5ctl_fops = {
> >  	.owner = THIS_MODULE,
> >  	.open = mlx5ctl_open,
> >  	.release = mlx5ctl_release,
> > +	.unlocked_ioctl = mlx5ctl_ioctl,
> >  };
> 
> There should be a .compat_ioctl entry as well, to allow 32-bit
> tasks to use the same interface.

Even for new code as well?

Both kernel and userspace code is new, not released yet.

>  
> >  static int mlx5ctl_probe(struct auxiliary_device *adev,
> > diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
> > new file mode 100644
> > index 000000000000..81d89cd285fc
> > --- /dev/null
> > +++ b/include/uapi/misc/mlx5ctl.h
> > @@ -0,0 +1,24 @@
> > +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB WITH Linux-syscall-note */
> > +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
> > +
> > +#ifndef __MLX5CTL_IOCTL_H__
> > +#define __MLX5CTL_IOCTL_H__
> > +
> > +struct mlx5ctl_info {
> > +	__aligned_u64 flags;
> > +	__u32 size;
> > +	__u8 devname[64]; /* underlaying ConnectX device */
> > +	__u16 uctx_uid; /* current process allocated UCTX UID */
> 
> I don't know what a UCTX UID is, but if this is related to
> uid_t, it has to be 32 bit wide.

UCTX UID is mlx5 hardware index, it is not uid_t.

> 
> > +	__u16 reserved1;
> > +	__u32 uctx_cap; /* current process effective UCTX cap */
> > +	__u32 dev_uctx_cap; /* device's UCTX capabilities */
> > +	__u32 ucap; /* process user capability */
> > +	__u32 reserved2[4];
> > +};
> 
> If I'm counting right, this structure has a size of
> 108 bytes but an alignment of 8 bytes, so you end up with
> a 4-byte hole at the end, which introduces a risk of
> leaking kernel data. Maybe give it an extra reserved word?

I think that he needs to delete __u32 reserved2[4]; completely.

Thanks

> 
>      Arnd

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18 10:08     ` Leon Romanovsky
@ 2023-10-18 11:02       ` Arnd Bergmann
  0 siblings, 0 replies; 36+ messages in thread
From: Arnd Bergmann @ 2023-10-18 11:02 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Saeed Mahameed, Greg Kroah-Hartman, linux-kernel,
	Jason Gunthorpe, Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023, at 12:08, Leon Romanovsky wrote:
> On Wed, Oct 18, 2023 at 11:02:00AM +0200, Arnd Bergmann wrote:
>> On Wed, Oct 18, 2023, at 10:19, Saeed Mahameed wrote:
>> 
>> > Implement INFO ioctl to return the allocated UID and the capability flags
>> > and some other useful device information such as the underlying ConnectX
>> > device.
>> 
>> I'm commenting on the ABI here, ignoring everything for the moment.
>> 
>> >  static const struct file_operations mlx5ctl_fops = {
>> >  	.owner = THIS_MODULE,
>> >  	.open = mlx5ctl_open,
>> >  	.release = mlx5ctl_release,
>> > +	.unlocked_ioctl = mlx5ctl_ioctl,
>> >  };
>> 
>> There should be a .compat_ioctl entry as well, to allow 32-bit
>> tasks to use the same interface.
>
> Even for new code as well?
>
> Both kernel and userspace code is new, not released yet.

Yes, the main thing is that both x86 and arm support 32-bit
user space, and there are lots of users that use those in
containers and embedded systems. Even if it's unlikely to be
used in combination with your driver, there really isn't
much reason to be different from other drivers here.


>> I don't know what a UCTX UID is, but if this is related to
>> uid_t, it has to be 32 bit wide.
>
> UCTX UID is mlx5 hardware index, it is not uid_t.

Ok

>> 
>> > +	__u16 reserved1;
>> > +	__u32 uctx_cap; /* current process effective UCTX cap */
>> > +	__u32 dev_uctx_cap; /* device's UCTX capabilities */
>> > +	__u32 ucap; /* process user capability */
>> > +	__u32 reserved2[4];
>> > +};
>> 
>> If I'm counting right, this structure has a size of
>> 108 bytes but an alignment of 8 bytes, so you end up with
>> a 4-byte hole at the end, which introduces a risk of
>> leaking kernel data. Maybe give it an extra reserved word?
>
> I think that he needs to delete __u32 reserved2[4]; completely.

Removing the extra reserved words and the 'flags' is probably
best here, but you still need one 32-bit word to get to
a multiple of eight bytes to avoid the hole.

    Arnd

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  9:30   ` Arnd Bergmann
@ 2023-10-18 11:51     ` Jason Gunthorpe
  2023-11-19  9:44     ` Saeed Mahameed
  1 sibling, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-18 11:51 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Saeed Mahameed, Greg Kroah-Hartman, linux-kernel,
	Leon Romanovsky, Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 11:30:24AM +0200, Arnd Bergmann wrote:

> > +struct mlx5ctl_umem_reg {
> > +	__aligned_u64 addr; /* user address */
> > +	__aligned_u64 len; /* user buffer length */
> > +	__aligned_u64 flags;
> > +	__u32 umem_id; /* returned device's umem ID */
> > +	__u32 reserved[7];
> > +};
> > +
>
> You have a 'flags' argument that is never accessed and can
> probably be removed. If the intention was to make the ioctl
> extensible for the future, this doesn't work unless you
> ensure that only known flags (i.e. none at this point)
> are set,

Yes, all the reserved fields and flags should be checked for 0 in the
ioctl paths to allow them to be used for something else someday.

> and it's probably a bad idea anyway, compared
> to creating a new ioctl command with new semantics.

This has been done successfully quite often in the kernel.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18 10:00         ` Leon Romanovsky
@ 2023-10-18 11:52           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18 11:52 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 01:00:14PM +0300, Leon Romanovsky wrote:
> On Wed, Oct 18, 2023 at 10:55:35AM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Oct 18, 2023 at 11:49:08AM +0300, Leon Romanovsky wrote:
> > > On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > > > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > > > 
> > > > For dual-licensed code, I need a LOT of documentation as to why this
> > > > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > > > agreeing to it so they convey an understanding of just how complex and
> > > > messy this will get over time and what you are agreeing to do here.
> > > 
> > > This is how we (NBU, Networking Business Unit in Nvidia, former Mellanox)
> > > are instructed to submit all our code by default. This license is aligned
> > > with our networking, vdpa and RDMA code.
> > 
> > Please don't do this without a really really good reason.  Especially
> > for a "misc" driver that is so linux-dependant here.  
> 
> Like I said, it is our default license. Saeed will change this code to be GPL-only.
> 
> > The "Linux-OpenIB" license is old, obsolete, and problematic and should not be added to any
> > new files in the kernel tree, outside of the island of
> > drivers/infiniband/ as you all insist on that crazyness there.
> > 
> > Heck, it's even documented that you shouldn't be adding that license to
> > any new files, why ignore that?
> 
> I don't remember being asked about it, not in this patch
> https://lore.kernel.org/all/20180425203703.568152337@linutronix.de/
> and not in this later patch which changed LICENSES/other/ to be
> LICENSES/deprecated/
> https://lore.kernel.org/all/20190430105130.24500-4-hch@lst.de/
> 
> So it is nice that someone decided to deprecate it without notifying users
> about such change. It will be better to put Linux-OpenIB to LICENSES/dual/
> folder instead.
> 
> Our lawyers are perfectly fine with Linux-OpenIB license.

Almost all others are not, sorry, you will have to convince them
otherwise, so for now, let's keep this in "deprecated" as this license
is known to have issues and should not get applied to any new code
without EXPLICIT justification why it must be so.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] mlx5 ConnectX diagnostic misc driver
  2023-10-18  8:31 ` [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Greg Kroah-Hartman
@ 2023-10-18 12:00   ` Jason Gunthorpe
  2023-10-18 12:11     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-18 12:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 10:31:23AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 18, 2023 at 01:19:36AM -0700, Saeed Mahameed wrote:
> > Historically a userspace program was used that accessed the PCI register
> > and config space directly through /sys/bus/pci/.../XXX and could operate
> > these debugging interfaces in parallel with the running driver.
> > This approach is incompatible with secure boot and kernel lockdown so this
> > driver provides a secure and restricted interface to that.
> 
> Why not just write a UIO driver for this hardware then?

The old mechanism relied on direct config space and sometimes mmio
access to the PCI device. We did a security analysis and concluded
that approach could not provide the required security for what our
customers want from the secure boot and lockdown modes. We cannot
allow a lockdown userspace direct access to those device registers.

So, it was redesigned to be RPC driven instead of having direct HW
access. The RPCs allow the device to provide the required level of
security.

This new misc driver does not expose any HW registers or interrupts to
userspace, so it does not seem like a fit for UIO.

Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] mlx5 ConnectX diagnostic misc driver
  2023-10-18 12:00   ` Jason Gunthorpe
@ 2023-10-18 12:11     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18 12:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 09:00:25AM -0300, Jason Gunthorpe wrote:
> On Wed, Oct 18, 2023 at 10:31:23AM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Oct 18, 2023 at 01:19:36AM -0700, Saeed Mahameed wrote:
> > > Historically a userspace program was used that accessed the PCI register
> > > and config space directly through /sys/bus/pci/.../XXX and could operate
> > > these debugging interfaces in parallel with the running driver.
> > > This approach is incompatible with secure boot and kernel lockdown so this
> > > driver provides a secure and restricted interface to that.
> > 
> > Why not just write a UIO driver for this hardware then?
> 
> The old mechanism relied on direct config space and sometimes mmio
> access to the PCI device. We did a security analysis and concluded
> that approach could not provide the required security for what our
> customers want from the secure boot and lockdown modes. We cannot
> allow a lockdown userspace direct access to those device registers.
> 
> So, it was redesigned to be RPC driven instead of having direct HW
> access. The RPCs allow the device to provide the required level of
> security.
> 
> This new misc driver does not expose any HW registers or interrupts to
> userspace, so it does not seem like a fit for UIO.

Ok, I'll not complain, I always like "real" drivers instead of UIO ones,
nice to see this happen!

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18  8:30   ` Greg Kroah-Hartman
  2023-10-18  8:49     ` Leon Romanovsky
@ 2023-10-18 18:01     ` Jason Gunthorpe
  2023-10-18 18:22       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-18 18:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> 
> For dual-licensed code, I need a LOT of documentation as to why this
> must be dual-licensed, AND a signed-off-by from your corporate lawyer
> agreeing to it so they convey an understanding of just how complex and
> messy this will get over time and what you are agreeing to do here.

Can you provide a brief or whitepaper discussing this complexity
please? This pushback is news to me, Mellanox and the RDMA ecosystem
has been doing this for over 15 years now. I would need something
substantive to have a conversation with our legal.

However, I believe we can get an exception approval for single license
MIT or BSD-3-Clause for this code.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18 18:01     ` Jason Gunthorpe
@ 2023-10-18 18:22       ` Greg Kroah-Hartman
  2023-10-18 18:56         ` Jason Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-18 18:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 03:01:28PM -0300, Jason Gunthorpe wrote:
> On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > 
> > For dual-licensed code, I need a LOT of documentation as to why this
> > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > agreeing to it so they convey an understanding of just how complex and
> > messy this will get over time and what you are agreeing to do here.
> 
> Can you provide a brief or whitepaper discussing this complexity
> please? This pushback is news to me, Mellanox and the RDMA ecosystem
> has been doing this for over 15 years now. I would need something
> substantive to have a conversation with our legal.

Have your legal talk to the LF legal working group, they are the ones
that told me never to mess with this license again.  I'm sure that
nvidia's lawyers are part of this group, so let's let them hash it out.

> However, I believe we can get an exception approval for single license
> MIT or BSD-3-Clause for this code.

GPLv2 please, otherwise again, I'm going to demand a really really good
reason why Linux kernel code needs a non-GPL license and again, a sign
off from your lawyers explaining why it must be so.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18 18:22       ` Greg Kroah-Hartman
@ 2023-10-18 18:56         ` Jason Gunthorpe
  2023-10-19 17:21           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-18 18:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 08:22:39PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 18, 2023 at 03:01:28PM -0300, Jason Gunthorpe wrote:
> > On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > > 
> > > For dual-licensed code, I need a LOT of documentation as to why this
> > > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > > agreeing to it so they convey an understanding of just how complex and
> > > messy this will get over time and what you are agreeing to do here.
> > 
> > Can you provide a brief or whitepaper discussing this complexity
> > please? This pushback is news to me, Mellanox and the RDMA ecosystem
> > has been doing this for over 15 years now. I would need something
> > substantive to have a conversation with our legal.
> 
> Have your legal talk to the LF legal working group, they are the ones
> that told me never to mess with this license again.  I'm sure that
> nvidia's lawyers are part of this group, so let's let them hash it
> out.

Are you talking about OpenIB specifically or the concept of dual
license (eg GPL/MIT) in general?

> > However, I believe we can get an exception approval for single license
> > MIT or BSD-3-Clause for this code.
> 
> GPLv2 please, otherwise again, I'm going to demand a really really good
> reason why Linux kernel code needs a non-GPL license and again, a sign
> off from your lawyers explaining why it must be so.

All of the Mellanox driver stack (over 400 files now!) is dual
licensed because we have a large team of people working the Mellanox
driver for many operating systems with many different licenses. We
want the certainty of a permissive license for the driver code we
supply to Linux as the team routinely references and/or re-uses
Mellanox authored Linux driver code into other scenarios under the
permissive side of the dual license.

For instance I could easily see the work Saeed has done here finding
its way into FreeBSD. We significantly support FreeBSD employing
maintainers and develop a sophisticated Mellanox driver over
there. This would not be possible without the Linux driver being dual
licensed.

This has been the direction from our legal for a long time.

AFAIK this has also been a long time accepted Linux practice, there
are many examples in the driver tree. What has changed now that Saeed
tries to add 3 more files the giant driver? I need a bigger
explanation if we are going to revisit this practice with our legal.

To be clear, we can surely get the approvals to remove the offensive
OpenIB from these files. eg mlxsw is already approved using
"BSD-3-Clause OR GPL-2.0".

Further, as a maintainer myself, is this now the community consensus
expected review standard?  When was it discussed? I do not see
evidence of a ban on dual licensing in
https://docs.kernel.org/process/license-rules.html ?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-18 18:56         ` Jason Gunthorpe
@ 2023-10-19 17:21           ` Greg Kroah-Hartman
  2023-10-19 19:00             ` Jason Gunthorpe
  2023-10-19 21:50             ` Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver] Jonathan Corbet
  0 siblings, 2 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-19 17:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Wed, Oct 18, 2023 at 03:56:29PM -0300, Jason Gunthorpe wrote:
> On Wed, Oct 18, 2023 at 08:22:39PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Oct 18, 2023 at 03:01:28PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Oct 18, 2023 at 10:30:00AM +0200, Greg Kroah-Hartman wrote:
> > > > On Wed, Oct 18, 2023 at 01:19:38AM -0700, Saeed Mahameed wrote:
> > > > > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> > > > 
> > > > For dual-licensed code, I need a LOT of documentation as to why this
> > > > must be dual-licensed, AND a signed-off-by from your corporate lawyer
> > > > agreeing to it so they convey an understanding of just how complex and
> > > > messy this will get over time and what you are agreeing to do here.
> > > 
> > > Can you provide a brief or whitepaper discussing this complexity
> > > please? This pushback is news to me, Mellanox and the RDMA ecosystem
> > > has been doing this for over 15 years now. I would need something
> > > substantive to have a conversation with our legal.
> > 
> > Have your legal talk to the LF legal working group, they are the ones
> > that told me never to mess with this license again.  I'm sure that
> > nvidia's lawyers are part of this group, so let's let them hash it
> > out.
> 
> Are you talking about OpenIB specifically or the concept of dual
> license (eg GPL/MIT) in general?

I'm talking about OpenIB specifically.

> > > However, I believe we can get an exception approval for single license
> > > MIT or BSD-3-Clause for this code.
> > 
> > GPLv2 please, otherwise again, I'm going to demand a really really good
> > reason why Linux kernel code needs a non-GPL license and again, a sign
> > off from your lawyers explaining why it must be so.
> 
> All of the Mellanox driver stack (over 400 files now!) is dual
> licensed because we have a large team of people working the Mellanox
> driver for many operating systems with many different licenses. We
> want the certainty of a permissive license for the driver code we
> supply to Linux as the team routinely references and/or re-uses
> Mellanox authored Linux driver code into other scenarios under the
> permissive side of the dual license.
> 
> For instance I could easily see the work Saeed has done here finding
> its way into FreeBSD. We significantly support FreeBSD employing
> maintainers and develop a sophisticated Mellanox driver over
> there. This would not be possible without the Linux driver being dual
> licensed.

Yes it would, you can take the work that you all do and license it under
the BSD license and put it into FreeBSD just fine.  But you are saying
you require Linux developers to help you with your FreeBSD drivers,
which is not always fair or nice to take from others that way (in my
opinion.)

> This has been the direction from our legal for a long time.

I know, but the OpenIB license is known to have issues, and so I thought
they were going to stop using it for new code.

> AFAIK this has also been a long time accepted Linux practice, there
> are many examples in the driver tree. What has changed now that Saeed
> tries to add 3 more files the giant driver? I need a bigger
> explanation if we are going to revisit this practice with our legal.

"the giant driver"?  I'm confused.

> To be clear, we can surely get the approvals to remove the offensive
> OpenIB from these files. eg mlxsw is already approved using
> "BSD-3-Clause OR GPL-2.0".

For your new files, please make them single license.  If you insist on
dual licensing them, I will insist on have a lawyer sign off on them so
that they understand the issues involved with dual licenses, and just
how much I hate them in the kernel tree as they are a pain over time.

Note, this isn't special to you, I do this to all new files sent to me
with this type of non-GPL-only license, see the archives for details.

> Further, as a maintainer myself, is this now the community consensus
> expected review standard?  When was it discussed? I do not see
> evidence of a ban on dual licensing in
> https://docs.kernel.org/process/license-rules.html ?

It's based on my experience with existing dual licensed kernel code AND
discussing it with many many lawyers over the years.  It's a pain, they
hate it, it's dubious if it actually helps anyone out in any other
operating system (as again, you can take your work and send it to
FreeBSD just fine), it is messy when dealing with gpl-only exports (like
the driver model or the USB layer), and you are taking something from
the community (free labor) to help other operating systems out when the
Linux developers might not realize it.

I think the only real place this works out is the ACPI core, for the
obvious reasons that we all want a solid ACPI core that all operating
systems can use.  And Intel goes through a lot of extra effort and time
and energy to keep that going, so it is costing them real money to do
this work for this, so that makes sense.  For just a hardware driver for
a specific company, this feels very selfish in my opinion.

I would be really interested in if you all actually have taken any
not-from-your-company changes to your drivers and copied that into other
operating systems for anything "real" that wasn't just tiny bugfixes.
Have you?  If not, why go through this hassle?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-19 17:21           ` Greg Kroah-Hartman
@ 2023-10-19 19:00             ` Jason Gunthorpe
  2023-10-19 19:46               ` Greg Kroah-Hartman
  2023-10-19 21:50             ` Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver] Jonathan Corbet
  1 sibling, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-19 19:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Thu, Oct 19, 2023 at 07:21:57PM +0200, Greg Kroah-Hartman wrote:

> > Are you talking about OpenIB specifically or the concept of dual
> > license (eg GPL/MIT) in general?
> 
> I'm talking about OpenIB specifically.

Let's put that aside, I think Saeed made a C&P error since he works
mostly on the historical code that is grandfathered. He will fix it to
be another our-legal approved license, probably "BSD-3-Clause & GPLv2"

> > All of the Mellanox driver stack (over 400 files now!) is dual
> > licensed because we have a large team of people working the Mellanox
> > driver for many operating systems with many different licenses. We
> > want the certainty of a permissive license for the driver code we
> > supply to Linux as the team routinely references and/or re-uses
> > Mellanox authored Linux driver code into other scenarios under the
> > permissive side of the dual license.
> > 
> > For instance I could easily see the work Saeed has done here finding
> > its way into FreeBSD. We significantly support FreeBSD employing
> > maintainers and develop a sophisticated Mellanox driver over
> > there. This would not be possible without the Linux driver being dual
> > licensed.
> 
> Yes it would, you can take the work that you all do and license it under
> the BSD license and put it into FreeBSD just fine.

Sure, you can do that at day 0, but mlx5 is now about 10 years old and
has tens of thousands of commits. Many non-Mellanox commits. (mostly
non-significant, IMHO, IANAL)

If Mellanox today writes a new patch for mlx5 based on that history,
can that patch be re-licensed to BSD if the file it is based on is GPL
only with a complex history? Our legal has historically said no to
this question.

We are not dumping code over a wall where there is some internal
reference that has a single copyright. The mlx5 driver team is fully
integrated with the upstream community lead processes. I'm very proud
of how Mellanox is able to work like this.

> But you are saying you require Linux developers to help you with
> your FreeBSD drivers, which is not always fair or nice to take from
> others that way (in my opinion.)

AFAIK Mellanox has never done "require". If you want to do significant
work in mlx5-land and want GPL only then put it in its own file with a
GPL only license.

This has happened in drivers/infiniband which started in 2005 with a
group of like-minded people/companies that wanted to enable a full
ecosystem across a lot of operating systems. Later someone came with
significant work and wanted GPL only so it was placed in its own files
with a GPL only license.

I agree that "require" is not really fair, but I think there is room
in Linux to support people that want their open source work shared
outside Linux along side people that don't.

> > AFAIK this has also been a long time accepted Linux practice, there
> > are many examples in the driver tree. What has changed now that Saeed
> > tries to add 3 more files the giant driver? I need a bigger
> > explanation if we are going to revisit this practice with our legal.
> 
> "the giant driver"?  I'm confused.

The > 500 files of approx:

$ git ls-files | egrep -i mlx5

See the other thread debating what mlx5 HW actually is.

Remember that Leon created auxiliary bus so these complex multi-system
HWs could be split up cleanly into their respective subsystems? This
is an aux device driver for the misc subsystem as part of the giant
cross-subsystem mlx5 driver. Ie Saeed is adding 3 more files to that
existing monster.

> > To be clear, we can surely get the approvals to remove the offensive
> > OpenIB from these files. eg mlxsw is already approved using
> > "BSD-3-Clause OR GPL-2.0".
> 
> For your new files, please make them single license.  If you insist on
> dual licensing them, I will insist on have a lawyer sign off on them so
> that they understand the issues involved with dual licenses, and just
> how much I hate them in the kernel tree as they are a pain over time.

Please share with me what words you want to see and I will get them
from our legal team.

> I think the only real place this works out is the ACPI core, for the
> obvious reasons that we all want a solid ACPI core that all operating
> systems can use.  And Intel goes through a lot of extra effort and time
> and energy to keep that going, so it is costing them real money to do
> this work for this, so that makes sense.  For just a hardware driver for
> a specific company, this feels very selfish in my opinion.

It costs Mellanox real money/etc too. I'm not sure I see who it is
selfish to?

> I would be really interested in if you all actually have taken any
> not-from-your-company changes to your drivers and copied that into other
> operating systems for anything "real" that wasn't just tiny bugfixes.
> Have you?  If not, why go through this hassle?

The FreeBSD team references the current state of the driver in Linux
to guide their work. I don't think they carefully track the providence
of every single line. Not doing that is the main point of a dual
license.

Honestly, this has been done for 15 years now and has never been a
hassle at all. If you are asking why go through the hassle you are now
requesting - then I will have to get back to you based on how much
hassle it turns out to be :)

I believe strongly that Mellanox's efforts in FreeBSD are open source
noble and not selfish at all. I'm confident our legal will shut down
that project without a dual license, so I will go through some hassle
at your request to protect it.

Regards,
Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-19 19:00             ` Jason Gunthorpe
@ 2023-10-19 19:46               ` Greg Kroah-Hartman
  2023-10-19 23:49                 ` Jason Gunthorpe
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-19 19:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Thu, Oct 19, 2023 at 04:00:46PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 19, 2023 at 07:21:57PM +0200, Greg Kroah-Hartman wrote:
> > > All of the Mellanox driver stack (over 400 files now!) is dual
> > > licensed because we have a large team of people working the Mellanox
> > > driver for many operating systems with many different licenses. We
> > > want the certainty of a permissive license for the driver code we
> > > supply to Linux as the team routinely references and/or re-uses
> > > Mellanox authored Linux driver code into other scenarios under the
> > > permissive side of the dual license.
> > > 
> > > For instance I could easily see the work Saeed has done here finding
> > > its way into FreeBSD. We significantly support FreeBSD employing
> > > maintainers and develop a sophisticated Mellanox driver over
> > > there. This would not be possible without the Linux driver being dual
> > > licensed.
> > 
> > Yes it would, you can take the work that you all do and license it under
> > the BSD license and put it into FreeBSD just fine.
> 
> Sure, you can do that at day 0, but mlx5 is now about 10 years old and
> has tens of thousands of commits. Many non-Mellanox commits. (mostly
> non-significant, IMHO, IANAL)

That's not the case for this specific chunk of code, so it's not a valid
point at all, sorry.

Let's stick to just this new file, please keep it one-license, not dual,
it makes everything simpler overall.

> Remember that Leon created auxiliary bus so these complex multi-system
> HWs could be split up cleanly into their respective subsystems? This
> is an aux device driver for the misc subsystem as part of the giant
> cross-subsystem mlx5 driver. Ie Saeed is adding 3 more files to that
> existing monster.

Yes, and as the auxiliary bus code is EXPORT_SYMBOL_GPL() attempting to
license code that is a driver for that bus (i.e. this new contribution)
under anything other than just GPL is crazy.  Go talk to your lawyers
about that please, it's obviously not ok.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver]
  2023-10-19 17:21           ` Greg Kroah-Hartman
  2023-10-19 19:00             ` Jason Gunthorpe
@ 2023-10-19 21:50             ` Jonathan Corbet
  2023-10-20 19:30               ` Dave Airlie
  2023-10-20 20:07               ` Greg Kroah-Hartman
  1 sibling, 2 replies; 36+ messages in thread
From: Jonathan Corbet @ 2023-10-19 21:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:

> For your new files, please make them single license.  If you insist on
> dual licensing them, I will insist on have a lawyer sign off on them so
> that they understand the issues involved with dual licenses, and just
> how much I hate them in the kernel tree as they are a pain over time.

Out of curiosity, is there somewhere people can look for a description
of these issues and the pain they cause?  I've seen this go by enough
times to think that it would be good to set down in Documentation/
somewhere for future reference.

Thanks,

jon

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-19 19:46               ` Greg Kroah-Hartman
@ 2023-10-19 23:49                 ` Jason Gunthorpe
  2023-10-20 20:17                   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-10-19 23:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Thu, Oct 19, 2023 at 09:46:29PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Oct 19, 2023 at 04:00:46PM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 19, 2023 at 07:21:57PM +0200, Greg Kroah-Hartman wrote:
> > > > All of the Mellanox driver stack (over 400 files now!) is dual
> > > > licensed because we have a large team of people working the Mellanox
> > > > driver for many operating systems with many different licenses. We
> > > > want the certainty of a permissive license for the driver code we
> > > > supply to Linux as the team routinely references and/or re-uses
> > > > Mellanox authored Linux driver code into other scenarios under the
> > > > permissive side of the dual license.
> > > > 
> > > > For instance I could easily see the work Saeed has done here finding
> > > > its way into FreeBSD. We significantly support FreeBSD employing
> > > > maintainers and develop a sophisticated Mellanox driver over
> > > > there. This would not be possible without the Linux driver being dual
> > > > licensed.
> > > 
> > > Yes it would, you can take the work that you all do and license it under
> > > the BSD license and put it into FreeBSD just fine.
> > 
> > Sure, you can do that at day 0, but mlx5 is now about 10 years old and
> > has tens of thousands of commits. Many non-Mellanox commits. (mostly
> > non-significant, IMHO, IANAL)
> 
> That's not the case for this specific chunk of code, so it's not a valid
> point at all, sorry.

In 10 years it will be in the same situation as the rest of the
driver. You are saying we can't plan ahead now? Why?

> Let's stick to just this new file, please keep it one-license, not dual,
> it makes everything simpler overall.

Simpler for who? It seems to complicate Mellanox's situation.

More importantly it seems to represent an important philosophical
shift for Linux that touches on something we have a significant
investment in.

I would like clarity here, on a going forward basis. You do set the
tone for the whole project. I've made my case for why we are doing and
why it brings value. You are saying dual license is now effectively
banned.

Previously you said you would agree with a sign off from our legal,
please tell me what statement you want and I will go get it.

> > Remember that Leon created auxiliary bus so these complex multi-system
> > HWs could be split up cleanly into their respective subsystems? This
> > is an aux device driver for the misc subsystem as part of the giant
> > cross-subsystem mlx5 driver. Ie Saeed is adding 3 more files to that
> > existing monster.
> 
> Yes, and as the auxiliary bus code is EXPORT_SYMBOL_GPL() attempting to
> license code that is a driver for that bus (i.e. this new contribution)
> under anything other than just GPL is crazy.  Go talk to your lawyers
> about that please, it's obviously not ok.

The entire mlx5 driver makes free use of EXPORT_SYMBOL_GPL(). Our
legal has looked at this in the past and they continue to give
instruction to use a dual license.

You keep saying go talk to our lawyers like this hasn't been a legally
vetted approach at Mellanox for the last 15 years :( Mellanox has
300,000 lines of code in Linux with a dual license. We have a good
in-house legal and they take license obligations seriously. I don't
see any new information in this thread that makes me think they will
change their long standing position.

Tell me what you want and I will go get it from them.

Regards,
Jason

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver]
  2023-10-19 21:50             ` Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver] Jonathan Corbet
@ 2023-10-20 19:30               ` Dave Airlie
  2023-10-20 20:07               ` Greg Kroah-Hartman
  1 sibling, 0 replies; 36+ messages in thread
From: Dave Airlie @ 2023-10-20 19:30 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Greg Kroah-Hartman, Jason Gunthorpe, Saeed Mahameed,
	Arnd Bergmann, linux-kernel, Leon Romanovsky, Jiri Pirko,
	Saeed Mahameed

On Fri, 20 Oct 2023 at 07:51, Jonathan Corbet <corbet@lwn.net> wrote:
>
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
>
> > For your new files, please make them single license.  If you insist on
> > dual licensing them, I will insist on have a lawyer sign off on them so
> > that they understand the issues involved with dual licenses, and just
> > how much I hate them in the kernel tree as they are a pain over time.
>
> Out of curiosity, is there somewhere people can look for a description
> of these issues and the pain they cause?  I've seen this go by enough
> times to think that it would be good to set down in Documentation/
> somewhere for future reference.

I'm also very curious, I've never had any issues in 15+ years with the
dual licensed code in the DRM, and continue to prefer it.

Greg if you get new advice that you want to apply to the kernel as a
whole, you probably need to write it up for us to discuss, your
lawyers are not my lawyers etc.

Maybe something for Maintainers summit?

Dave.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver]
  2023-10-19 21:50             ` Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver] Jonathan Corbet
  2023-10-20 19:30               ` Dave Airlie
@ 2023-10-20 20:07               ` Greg Kroah-Hartman
  1 sibling, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-20 20:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Jason Gunthorpe, Saeed Mahameed, Arnd Bergmann, linux-kernel,
	Leon Romanovsky, Jiri Pirko, Saeed Mahameed

On Thu, Oct 19, 2023 at 03:50:30PM -0600, Jonathan Corbet wrote:
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> 
> > For your new files, please make them single license.  If you insist on
> > dual licensing them, I will insist on have a lawyer sign off on them so
> > that they understand the issues involved with dual licenses, and just
> > how much I hate them in the kernel tree as they are a pain over time.
> 
> Out of curiosity, is there somewhere people can look for a description
> of these issues and the pain they cause?  I've seen this go by enough
> times to think that it would be good to set down in Documentation/
> somewhere for future reference.

I don't have a description anywhere, sorry, this was just discussions
with many open source lawyers who focus on the kernel over the years.
I'll work to try to get something "solid" from them that I can put into
writing, but it will probably take quite a while...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
  2023-10-19 23:49                 ` Jason Gunthorpe
@ 2023-10-20 20:17                   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 36+ messages in thread
From: Greg Kroah-Hartman @ 2023-10-20 20:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Saeed Mahameed, Arnd Bergmann, linux-kernel, Leon Romanovsky,
	Jiri Pirko, Saeed Mahameed

On Thu, Oct 19, 2023 at 08:49:47PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 19, 2023 at 09:46:29PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Oct 19, 2023 at 04:00:46PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Oct 19, 2023 at 07:21:57PM +0200, Greg Kroah-Hartman wrote:
> > > > > All of the Mellanox driver stack (over 400 files now!) is dual
> > > > > licensed because we have a large team of people working the Mellanox
> > > > > driver for many operating systems with many different licenses. We
> > > > > want the certainty of a permissive license for the driver code we
> > > > > supply to Linux as the team routinely references and/or re-uses
> > > > > Mellanox authored Linux driver code into other scenarios under the
> > > > > permissive side of the dual license.
> > > > > 
> > > > > For instance I could easily see the work Saeed has done here finding
> > > > > its way into FreeBSD. We significantly support FreeBSD employing
> > > > > maintainers and develop a sophisticated Mellanox driver over
> > > > > there. This would not be possible without the Linux driver being dual
> > > > > licensed.
> > > > 
> > > > Yes it would, you can take the work that you all do and license it under
> > > > the BSD license and put it into FreeBSD just fine.
> > > 
> > > Sure, you can do that at day 0, but mlx5 is now about 10 years old and
> > > has tens of thousands of commits. Many non-Mellanox commits. (mostly
> > > non-significant, IMHO, IANAL)
> > 
> > That's not the case for this specific chunk of code, so it's not a valid
> > point at all, sorry.
> 
> In 10 years it will be in the same situation as the rest of the
> driver. You are saying we can't plan ahead now? Why?

I really am confused as to what exactly the scope of this driver is
then.  For some reason I thought it was just a "here's a debug driver
that we use to query the hardware and get some performance stats out of
it at times which we used to do previously by just doing raw PCI-memory
reads".  Is this really going to blow up into a giant infrastructure
that you are going to use for real functionality over the next decades?
If so, it needs a lot better description of what exactly this driver is
supposed to be doing, and how it ties into existing userspace tools.

> > Let's stick to just this new file, please keep it one-license, not dual,
> > it makes everything simpler overall.
> 
> Simpler for who? It seems to complicate Mellanox's situation.

How in the world is Mellanox going to take the code in a aux driver for
Linux and us that in any other operating systems WITHOUT it just being
an original contribution from a Mellanox developer themself?

> More importantly it seems to represent an important philosophical
> shift for Linux that touches on something we have a significant
> investment in.

Again, this is something that I ask of any new file that is sent to me
for my approval for many many years.  There are some subsystems in Linux
(drm, IB) that have lived with dual-license code for a very long time,
but that's about it.

> I would like clarity here, on a going forward basis. You do set the
> tone for the whole project. I've made my case for why we are doing and
> why it brings value. You are saying dual license is now effectively
> banned.

I'm saying that it needs to be explicitly chosen for very good reasons.

In the past, when I have pushed back on "why are you doing this?" I have
gotten "oops, we didn't mean that or understand it at all, we'll fix it"
the majority of the times.  If your company really does understand it
and knows it, great, that's what I need to confirm here.  I don't keep a
list of what companies do/do-not know this type of thing as that would
be pointless (hint, companies buy other companies and change legal
policies...)

> Previously you said you would agree with a sign off from our legal,
> please tell me what statement you want and I will go get it.

A simple note in the changelog that says something like:
	This file is dual licensed under XXX and YYY because of the
	following reasons.... and we will be handling all contributions
	to it in the following way...

and a signed-off-by in the chain by your group's lawyer so that we know
they understand the issues.

Or some other text like this, they can figure it out as they obviously
know the issues involved, and they know what they want to express to us.

> > > Remember that Leon created auxiliary bus so these complex multi-system
> > > HWs could be split up cleanly into their respective subsystems? This
> > > is an aux device driver for the misc subsystem as part of the giant
> > > cross-subsystem mlx5 driver. Ie Saeed is adding 3 more files to that
> > > existing monster.
> > 
> > Yes, and as the auxiliary bus code is EXPORT_SYMBOL_GPL() attempting to
> > license code that is a driver for that bus (i.e. this new contribution)
> > under anything other than just GPL is crazy.  Go talk to your lawyers
> > about that please, it's obviously not ok.
> 
> The entire mlx5 driver makes free use of EXPORT_SYMBOL_GPL(). Our
> legal has looked at this in the past and they continue to give
> instruction to use a dual license.

That's odd, I wonder how you take those contributions into other
operating systems...

> You keep saying go talk to our lawyers like this hasn't been a legally
> vetted approach at Mellanox for the last 15 years :(

I keep saying this as I have no idea what company is behind this, nor
what their lawyers want, nor what has been vetted by them at all.

And this email is going to a nvidia.com address, not Mellanox, so you
can understand my "you all better get the license issues right!" wish
for me to do here, based on the past experience with "issues" from that
domain.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
  2023-10-18  9:02   ` Arnd Bergmann
@ 2023-10-22  1:46   ` kernel test robot
  2023-10-22 11:27   ` kernel test robot
  2 siblings, 0 replies; 36+ messages in thread
From: kernel test robot @ 2023-10-22  1:46 UTC (permalink / raw)
  To: Saeed Mahameed, Arnd Bergmann, Greg Kroah-Hartman
  Cc: oe-kbuild-all, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

Hi Saeed,

kernel test robot noticed the following build errors:

[auto build test ERROR on char-misc/char-misc-testing]
[also build test ERROR on char-misc/char-misc-next char-misc/char-misc-linus soc/for-next rdma/for-next linus/master v6.6-rc6]
[cannot apply to next-20231020]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Saeed-Mahameed/mlx5-Add-aux-dev-for-ctl-interface/20231018-162610
base:   char-misc/char-misc-testing
patch link:    https://lore.kernel.org/r/20231018081941.475277-4-saeed%40kernel.org
patch subject: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
config: i386-randconfig-011-20231022 (https://download.01.org/0day-ci/archive/20231022/202310220923.Lga6m0Af-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231022/202310220923.Lga6m0Af-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310220923.Lga6m0Af-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from <command-line>:
>> ./usr/include/misc/mlx5ctl.h:8:9: error: unknown type name '__aligned_u64'
       8 |         __aligned_u64 flags;
         |         ^~~~~~~~~~~~~
>> ./usr/include/misc/mlx5ctl.h:9:9: error: unknown type name '__u32'
       9 |         __u32 size;
         |         ^~~~~
>> ./usr/include/misc/mlx5ctl.h:10:9: error: unknown type name '__u8'
      10 |         __u8 devname[64]; /* underlaying ConnectX device */
         |         ^~~~
>> ./usr/include/misc/mlx5ctl.h:11:9: error: unknown type name '__u16'
      11 |         __u16 uctx_uid; /* current process allocated UCTX UID */
         |         ^~~~~
   ./usr/include/misc/mlx5ctl.h:12:9: error: unknown type name '__u16'
      12 |         __u16 reserved1;
         |         ^~~~~
   ./usr/include/misc/mlx5ctl.h:13:9: error: unknown type name '__u32'
      13 |         __u32 uctx_cap; /* current process effective UCTX cap */
         |         ^~~~~
   ./usr/include/misc/mlx5ctl.h:14:9: error: unknown type name '__u32'
      14 |         __u32 dev_uctx_cap; /* device's UCTX capabilities */
         |         ^~~~~
   ./usr/include/misc/mlx5ctl.h:15:9: error: unknown type name '__u32'
      15 |         __u32 ucap; /* process user capability */
         |         ^~~~~
   ./usr/include/misc/mlx5ctl.h:16:9: error: unknown type name '__u32'
      16 |         __u32 reserved2[4];
         |         ^~~~~

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
  2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
  2023-10-18  9:02   ` Arnd Bergmann
  2023-10-22  1:46   ` kernel test robot
@ 2023-10-22 11:27   ` kernel test robot
  2 siblings, 0 replies; 36+ messages in thread
From: kernel test robot @ 2023-10-22 11:27 UTC (permalink / raw)
  To: Saeed Mahameed, Arnd Bergmann, Greg Kroah-Hartman
  Cc: oe-kbuild-all, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

Hi Saeed,

kernel test robot noticed the following build errors:

[auto build test ERROR on char-misc/char-misc-testing]
[also build test ERROR on char-misc/char-misc-next char-misc/char-misc-linus soc/for-next rdma/for-next linus/master v6.6-rc6]
[cannot apply to next-20231020]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Saeed-Mahameed/mlx5-Add-aux-dev-for-ctl-interface/20231018-162610
base:   char-misc/char-misc-testing
patch link:    https://lore.kernel.org/r/20231018081941.475277-4-saeed%40kernel.org
patch subject: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
config: arc-allmodconfig (https://download.01.org/0day-ci/archive/20231022/202310221931.r1vsjtys-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231022/202310221931.r1vsjtys-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310221931.r1vsjtys-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/misc/mlx5ctl/main.c:276:27: error: initialization of 'long int (*)(struct file *, unsigned int,  long unsigned int)' from incompatible pointer type 'ssize_t (*)(struct file *, unsigned int,  long unsigned int)' {aka 'int (*)(struct file *, unsigned int,  long unsigned int)'} [-Werror=incompatible-pointer-types]
     276 |         .unlocked_ioctl = mlx5ctl_ioctl,
         |                           ^~~~~~~~~~~~~
   drivers/misc/mlx5ctl/main.c:276:27: note: (near initialization for 'mlx5ctl_fops.unlocked_ioctl')
   cc1: some warnings being treated as errors


vim +276 drivers/misc/mlx5ctl/main.c

   271	
   272	static const struct file_operations mlx5ctl_fops = {
   273		.owner = THIS_MODULE,
   274		.open = mlx5ctl_open,
   275		.release = mlx5ctl_release,
 > 276		.unlocked_ioctl = mlx5ctl_ioctl,
   277	};
   278	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  9:30   ` Arnd Bergmann
  2023-10-18 11:51     ` Jason Gunthorpe
@ 2023-11-19  9:44     ` Saeed Mahameed
  1 sibling, 0 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-11-19  9:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Greg Kroah-Hartman, linux-kernel, Leon Romanovsky,
	Jason Gunthorpe, Jiri Pirko, Saeed Mahameed

On 18 Oct 11:30, Arnd Bergmann wrote:
>On Wed, Oct 18, 2023, at 10:19, Saeed Mahameed wrote:
>> From: Saeed Mahameed <saeedm@nvidia.com>
>
>>
>> To do so this patch introduces two ioctls:
>>
>> MLX5CTL_IOCTL_UMEM_REG(va_address, size):
>>  - calculate page fragments from the user provided virtual address
>>  - pin the pages, and allocate a sg list
>>  - dma map the sg list
>>  - create a UMEM device object that points to the dma addresses
>>  - add a driver umem object to an xarray data base for bookkeeping
>>  - return UMEM ID to user so it can be used in subsequent rpcs
>>
>> MLX5CTL_IOCTL_UMEM_UNREG(umem_id):
>>  - user provides a pre allocated umem ID
>>  - unwinds the above
>>
>

[...]

>> +
>> +	umem_id = mlx5ctl_umem_reg(mfd->umem_db, (unsigned
>> long)umem_reg.addr, umem_reg.len);
>
>umem_reg.addr seems to be a user space address, so I would
>suggest consistently passing it as a 'void __user *' instead
>of casting to (unsigned long) here. You can use u64_to_user_ptr()
>to handle the pointer conversion correctly across all
>architectures that way, and get better type checking.
>

Hi Arnd,
I handled all of your comments in V2 except this one, since here we use
the user address for arithmetic calculations only, so it is easier to
convert it to (unsigned long) early on ..


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
  2023-10-18  8:33   ` Greg Kroah-Hartman
@ 2023-11-19  9:49     ` Saeed Mahameed
  0 siblings, 0 replies; 36+ messages in thread
From: Saeed Mahameed @ 2023-11-19  9:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, linux-kernel, Leon Romanovsky, Jason Gunthorpe,
	Jiri Pirko, Saeed Mahameed

On 18 Oct 10:33, Greg Kroah-Hartman wrote:
>On Wed, Oct 18, 2023 at 01:19:41AM -0700, Saeed Mahameed wrote:
> +#define MLX5CTL_UMEM_MAX_MB 64
>> +
>> +static size_t umem_num_pages(u64 addr, size_t len)
>> +{
>> +	return (size_t)((ALIGN(addr + len, PAGE_SIZE) -
>> +			 ALIGN_DOWN(addr, PAGE_SIZE))) /
>> +			 PAGE_SIZE;
>> +}
>
>We don't have a function or macro for this already?
>

I looked around and saw similar implementations, but nothing generic,
each has different assumptions, for example PAGE_SIZE can be different
or base address is assumed to be PAGE aligned, or len has a specific
alignment, so each use is a bit different.

I managed to reduce this to a one liner in V2.

Thanks,
Saeed.


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2023-11-19  9:49 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-18  8:19 [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Saeed Mahameed
2023-10-18  8:19 ` [PATCH 1/5] mlx5: Add aux dev for ctl interface Saeed Mahameed
2023-10-18  8:19 ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Saeed Mahameed
2023-10-18  8:30   ` Greg Kroah-Hartman
2023-10-18  8:49     ` Leon Romanovsky
2023-10-18  8:55       ` Greg Kroah-Hartman
2023-10-18 10:00         ` Leon Romanovsky
2023-10-18 11:52           ` Greg Kroah-Hartman
2023-10-18 18:01     ` Jason Gunthorpe
2023-10-18 18:22       ` Greg Kroah-Hartman
2023-10-18 18:56         ` Jason Gunthorpe
2023-10-19 17:21           ` Greg Kroah-Hartman
2023-10-19 19:00             ` Jason Gunthorpe
2023-10-19 19:46               ` Greg Kroah-Hartman
2023-10-19 23:49                 ` Jason Gunthorpe
2023-10-20 20:17                   ` Greg Kroah-Hartman
2023-10-19 21:50             ` Dual licensing [was: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver] Jonathan Corbet
2023-10-20 19:30               ` Dave Airlie
2023-10-20 20:07               ` Greg Kroah-Hartman
2023-10-18  8:30   ` [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver Greg Kroah-Hartman
2023-10-18  8:19 ` [PATCH 3/5] misc: mlx5ctl: Add info ioctl Saeed Mahameed
2023-10-18  9:02   ` Arnd Bergmann
2023-10-18 10:08     ` Leon Romanovsky
2023-10-18 11:02       ` Arnd Bergmann
2023-10-22  1:46   ` kernel test robot
2023-10-22 11:27   ` kernel test robot
2023-10-18  8:19 ` [PATCH 4/5] misc: mlx5ctl: Add command rpc ioctl Saeed Mahameed
2023-10-18  8:19 ` [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Saeed Mahameed
2023-10-18  8:33   ` Greg Kroah-Hartman
2023-11-19  9:49     ` Saeed Mahameed
2023-10-18  9:30   ` Arnd Bergmann
2023-10-18 11:51     ` Jason Gunthorpe
2023-11-19  9:44     ` Saeed Mahameed
2023-10-18  8:31 ` [PATCH 0/5] mlx5 ConnectX diagnostic misc driver Greg Kroah-Hartman
2023-10-18 12:00   ` Jason Gunthorpe
2023-10-18 12:11     ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).