All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading
@ 2024-03-06 16:29 ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

This series removes the use of the heavy-weight PTP hardware semaphore
in the gettimex64 path. Instead, serialization of access to the time
register is done using a host-side spinlock. The timer hardware is
shared between PFs on the PCI adapter, so the spinlock must be shared
between ice_pf instances too.

Replacing the PTP hardware semaphore entirely with a mutex is also
possible and you can see it done in my git branch[1], but I am not
posting those patches yet to keep the scope of this series limited.

[1] https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8

v2:
 - Patch 1: Rely on xarray's own lock. (Suggested by Jiri Pirko)
 - Patch 2: Do not use *_irqsave with ptp_gltsyn_time_lock, as it's used
   only in process contexts.

Michal Schmidt (3):
  ice: add ice_adapter for shared data across PFs on the same NIC
  ice: avoid the PTP hardware semaphore in gettimex64 path
  ice: fold ice_ptp_read_time into ice_ptp_gettimex64

 drivers/net/ethernet/intel/ice/Makefile      |  3 +-
 drivers/net/ethernet/intel/ice/ice.h         |  2 +
 drivers/net/ethernet/intel/ice/ice_adapter.c | 87 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 28 +++++++
 drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
 drivers/net/ethernet/intel/ice/ice_ptp.c     | 33 +-------
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  |  3 +
 7 files changed, 134 insertions(+), 30 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h

-- 
2.43.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading
@ 2024-03-06 16:29 ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
	Jacob Keller, Jakub Kicinski

This series removes the use of the heavy-weight PTP hardware semaphore
in the gettimex64 path. Instead, serialization of access to the time
register is done using a host-side spinlock. The timer hardware is
shared between PFs on the PCI adapter, so the spinlock must be shared
between ice_pf instances too.

Replacing the PTP hardware semaphore entirely with a mutex is also
possible and you can see it done in my git branch[1], but I am not
posting those patches yet to keep the scope of this series limited.

[1] https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8

v2:
 - Patch 1: Rely on xarray's own lock. (Suggested by Jiri Pirko)
 - Patch 2: Do not use *_irqsave with ptp_gltsyn_time_lock, as it's used
   only in process contexts.

Michal Schmidt (3):
  ice: add ice_adapter for shared data across PFs on the same NIC
  ice: avoid the PTP hardware semaphore in gettimex64 path
  ice: fold ice_ptp_read_time into ice_ptp_gettimex64

 drivers/net/ethernet/intel/ice/Makefile      |  3 +-
 drivers/net/ethernet/intel/ice/ice.h         |  2 +
 drivers/net/ethernet/intel/ice/ice_adapter.c | 87 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 28 +++++++
 drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
 drivers/net/ethernet/intel/ice/ice_ptp.c     | 33 +-------
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  |  3 +
 7 files changed, 134 insertions(+), 30 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h

-- 
2.43.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
  2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29   ` Michal Schmidt
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

There is a need for synchronization between ice PFs on the same physical
adapter.

Add a "struct ice_adapter" for holding data shared between PFs of the
same multifunction PCI device. The struct is refcounted - each ice_pf
holds a reference to it.

Its first use will be for PTP. I expect it will be useful also to
improve the ugliness that is ice_prot_id_tbl.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/Makefile      |  3 +-
 drivers/net/ethernet/intel/ice/ice.h         |  2 +
 drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
 drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
 5 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h

diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index cddd82d4ca0f..4fa09c321440 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -36,7 +36,8 @@ ice-y := ice_main.o	\
 	 ice_repr.o	\
 	 ice_tc_lib.o	\
 	 ice_fwlog.o	\
-	 ice_debugfs.o
+	 ice_debugfs.o  \
+	 ice_adapter.o
 ice-$(CONFIG_PCI_IOV) +=	\
 	ice_sriov.o		\
 	ice_virtchnl.o		\
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 365c03d1c462..1ffecbdd361a 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -77,6 +77,7 @@
 #include "ice_gnss.h"
 #include "ice_irq.h"
 #include "ice_dpll.h"
+#include "ice_adapter.h"
 
 #define ICE_BAR0		0
 #define ICE_REQ_DESC_MULTIPLE	32
@@ -544,6 +545,7 @@ struct ice_agg_node {
 
 struct ice_pf {
 	struct pci_dev *pdev;
+	struct ice_adapter *adapter;
 
 	struct devlink_region *nvm_region;
 	struct devlink_region *sram_region;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
new file mode 100644
index 000000000000..b93b4db4c04c
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// SPDX-FileCopyrightText: Copyright Red Hat
+
+#include <linux/cleanup.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+#include "ice_adapter.h"
+
+static DEFINE_XARRAY(ice_adapters);
+
+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
+{
+	unsigned int domain = pci_domain_nr(pdev->bus);
+
+	WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
+	return ((unsigned long)domain << 13) |
+	       ((unsigned long)pdev->bus->number << 5) |
+	       PCI_SLOT(pdev->devfn);
+}
+
+static struct ice_adapter *ice_adapter_new(void)
+{
+	struct ice_adapter *a;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return NULL;
+
+	refcount_set(&a->refcount, 1);
+
+	return a;
+}
+
+static void ice_adapter_free(struct ice_adapter *a)
+{
+	kfree(a);
+}
+
+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
+{
+	struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
+	unsigned long index = ice_adapter_index(pdev);
+
+	a = ice_adapter_new();
+	if (!a)
+		return NULL;
+
+	xa_lock(&ice_adapters);
+	ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
+	if (xa_is_err(ret)) {
+		ret = NULL;
+		goto unlock;
+	}
+	if (ret) {
+		refcount_inc(&ret->refcount);
+		goto unlock;
+	}
+	ret = no_free_ptr(a);
+unlock:
+	xa_unlock(&ice_adapters);
+	return ret;
+}
+
+void ice_adapter_put(const struct pci_dev *pdev)
+{
+	unsigned long index = ice_adapter_index(pdev);
+	struct ice_adapter *a;
+
+	xa_lock(&ice_adapters);
+	a = xa_load(&ice_adapters, index);
+	if (WARN_ON(!a))
+		goto unlock;
+
+	if (!refcount_dec_and_test(&a->refcount))
+		goto unlock;
+
+	WARN_ON(__xa_erase(&ice_adapters, index) != a);
+	ice_adapter_free(a);
+unlock:
+	xa_unlock(&ice_adapters);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
new file mode 100644
index 000000000000..cb5a02eb24c1
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* SPDX-FileCopyrightText: Copyright Red Hat */
+
+#ifndef _ICE_ADAPTER_H_
+#define _ICE_ADAPTER_H_
+
+#include <linux/refcount_types.h>
+
+struct pci_dev;
+
+/**
+ * struct ice_adapter - PCI adapter resources shared across PFs
+ * @refcount: Reference count. struct ice_pf objects hold the references.
+ */
+struct ice_adapter {
+	refcount_t refcount;
+};
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
+void ice_adapter_put(const struct pci_dev *pdev);
+
+#endif /* _ICE_ADAPTER_H */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8f73ba77e835..413219d81a12 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5093,6 +5093,7 @@ static int
 ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 {
 	struct device *dev = &pdev->dev;
+	struct ice_adapter *adapter;
 	struct ice_pf *pf;
 	struct ice_hw *hw;
 	int err;
@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 
 	pci_set_master(pdev);
 
+	adapter = ice_adapter_get(pdev);
+	if (!adapter)
+		return -ENOMEM;
+
 	pf->pdev = pdev;
+	pf->adapter = adapter;
 	pci_set_drvdata(pdev, pf);
 	set_bit(ICE_DOWN, pf->state);
 	/* Disable service task until DOWN bit is cleared */
@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 err_load:
 	ice_deinit(pf);
 err_init:
+	ice_adapter_put(pdev);
 	pci_disable_device(pdev);
 	return err;
 }
@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
 	ice_setup_mc_magic_wake(pf);
 	ice_set_wake(pf);
 
+	ice_adapter_put(pdev);
 	pci_disable_device(pdev);
 }
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 16:29   ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
	Jacob Keller, Jakub Kicinski

There is a need for synchronization between ice PFs on the same physical
adapter.

Add a "struct ice_adapter" for holding data shared between PFs of the
same multifunction PCI device. The struct is refcounted - each ice_pf
holds a reference to it.

Its first use will be for PTP. I expect it will be useful also to
improve the ugliness that is ice_prot_id_tbl.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/Makefile      |  3 +-
 drivers/net/ethernet/intel/ice/ice.h         |  2 +
 drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
 drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
 5 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h

diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index cddd82d4ca0f..4fa09c321440 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -36,7 +36,8 @@ ice-y := ice_main.o	\
 	 ice_repr.o	\
 	 ice_tc_lib.o	\
 	 ice_fwlog.o	\
-	 ice_debugfs.o
+	 ice_debugfs.o  \
+	 ice_adapter.o
 ice-$(CONFIG_PCI_IOV) +=	\
 	ice_sriov.o		\
 	ice_virtchnl.o		\
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 365c03d1c462..1ffecbdd361a 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -77,6 +77,7 @@
 #include "ice_gnss.h"
 #include "ice_irq.h"
 #include "ice_dpll.h"
+#include "ice_adapter.h"
 
 #define ICE_BAR0		0
 #define ICE_REQ_DESC_MULTIPLE	32
@@ -544,6 +545,7 @@ struct ice_agg_node {
 
 struct ice_pf {
 	struct pci_dev *pdev;
+	struct ice_adapter *adapter;
 
 	struct devlink_region *nvm_region;
 	struct devlink_region *sram_region;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
new file mode 100644
index 000000000000..b93b4db4c04c
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// SPDX-FileCopyrightText: Copyright Red Hat
+
+#include <linux/cleanup.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+#include "ice_adapter.h"
+
+static DEFINE_XARRAY(ice_adapters);
+
+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
+{
+	unsigned int domain = pci_domain_nr(pdev->bus);
+
+	WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
+	return ((unsigned long)domain << 13) |
+	       ((unsigned long)pdev->bus->number << 5) |
+	       PCI_SLOT(pdev->devfn);
+}
+
+static struct ice_adapter *ice_adapter_new(void)
+{
+	struct ice_adapter *a;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return NULL;
+
+	refcount_set(&a->refcount, 1);
+
+	return a;
+}
+
+static void ice_adapter_free(struct ice_adapter *a)
+{
+	kfree(a);
+}
+
+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
+{
+	struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
+	unsigned long index = ice_adapter_index(pdev);
+
+	a = ice_adapter_new();
+	if (!a)
+		return NULL;
+
+	xa_lock(&ice_adapters);
+	ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
+	if (xa_is_err(ret)) {
+		ret = NULL;
+		goto unlock;
+	}
+	if (ret) {
+		refcount_inc(&ret->refcount);
+		goto unlock;
+	}
+	ret = no_free_ptr(a);
+unlock:
+	xa_unlock(&ice_adapters);
+	return ret;
+}
+
+void ice_adapter_put(const struct pci_dev *pdev)
+{
+	unsigned long index = ice_adapter_index(pdev);
+	struct ice_adapter *a;
+
+	xa_lock(&ice_adapters);
+	a = xa_load(&ice_adapters, index);
+	if (WARN_ON(!a))
+		goto unlock;
+
+	if (!refcount_dec_and_test(&a->refcount))
+		goto unlock;
+
+	WARN_ON(__xa_erase(&ice_adapters, index) != a);
+	ice_adapter_free(a);
+unlock:
+	xa_unlock(&ice_adapters);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
new file mode 100644
index 000000000000..cb5a02eb24c1
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* SPDX-FileCopyrightText: Copyright Red Hat */
+
+#ifndef _ICE_ADAPTER_H_
+#define _ICE_ADAPTER_H_
+
+#include <linux/refcount_types.h>
+
+struct pci_dev;
+
+/**
+ * struct ice_adapter - PCI adapter resources shared across PFs
+ * @refcount: Reference count. struct ice_pf objects hold the references.
+ */
+struct ice_adapter {
+	refcount_t refcount;
+};
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
+void ice_adapter_put(const struct pci_dev *pdev);
+
+#endif /* _ICE_ADAPTER_H */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8f73ba77e835..413219d81a12 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5093,6 +5093,7 @@ static int
 ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 {
 	struct device *dev = &pdev->dev;
+	struct ice_adapter *adapter;
 	struct ice_pf *pf;
 	struct ice_hw *hw;
 	int err;
@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 
 	pci_set_master(pdev);
 
+	adapter = ice_adapter_get(pdev);
+	if (!adapter)
+		return -ENOMEM;
+
 	pf->pdev = pdev;
+	pf->adapter = adapter;
 	pci_set_drvdata(pdev, pf);
 	set_bit(ICE_DOWN, pf->state);
 	/* Disable service task until DOWN bit is cleared */
@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 err_load:
 	ice_deinit(pf);
 err_init:
+	ice_adapter_put(pdev);
 	pci_disable_device(pdev);
 	return err;
 }
@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
 	ice_setup_mc_magic_wake(pf);
 	ice_set_wake(pf);
 
+	ice_adapter_put(pdev);
 	pci_disable_device(pdev);
 }
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path
  2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29   ` Michal Schmidt
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
operations that program the PTP timers. The operations involve issuing
commands to the sideband queue. The E810 does not have a hardware
sideband queue, so the admin queue is used. The admin queue is slow.
I have observed delays in hundreds of milliseconds waiting for
ice_sq_done.

When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
held by a task performing one of the slow operations, ice_ptp_lock can
easily time out. phc2sys gets -EBUSY and the kernel prints:
  ice 0000:XX:YY.0: PTP failed to get time
These messages appear once every few seconds, causing log spam.

The E810 datasheet recommends an algorithm for reading the upper 64 bits
of the GLTSYN_TIME register. It matches what's implemented in
ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
necessarily against the concurrent setting of the register (with
GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
ice_ptp_gettimex64 also takes PFTSYN_SEM.

The race with time setters can be prevented without relying on the PTP
hardware semaphore. Using the "ice_adapter" from the previous patch,
we can have a common spinlock for the PFs that share the clock hardware.
It will protect the reading and writing to the GLTSYN_TIME register.
The writing is performed indirectly, by the hardware, as a result of
the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
updated, but it works well in my testing.

My test code can be seen here:
https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
It consists of:
 - kernel threads reading the time in a busy loop and looking at the
   deltas between consecutive values, reporting new maxima.
 - a shell script that sets the time repeatedly;
 - a bpftrace probe to produce a histogram of the measured deltas.
Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
Deltas in the [2G, 4G) range appear in the histograms.
With the spinlock added, there is no tearing and the biggest delta I saw
was in the range [1M, 2M), that is under 2 ms.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
index b93b4db4c04c..85fa2e1326af 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -5,6 +5,7 @@
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>
 #include "ice_adapter.h"
 
@@ -28,6 +29,7 @@ static struct ice_adapter *ice_adapter_new(void)
 	if (!a)
 		return NULL;
 
+	spin_lock_init(&a->ptp_gltsyn_time_lock);
 	refcount_set(&a->refcount, 1);
 
 	return a;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
index cb5a02eb24c1..9d11014ec02f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.h
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -4,15 +4,21 @@
 #ifndef _ICE_ADAPTER_H_
 #define _ICE_ADAPTER_H_
 
+#include <linux/spinlock_types.h>
 #include <linux/refcount_types.h>
 
 struct pci_dev;
 
 /**
  * struct ice_adapter - PCI adapter resources shared across PFs
+ * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
+ *                        register of the PTP clock.
  * @refcount: Reference count. struct ice_pf objects hold the references.
  */
 struct ice_adapter {
+	/* For access to the GLTSYN_TIME register */
+	spinlock_t ptp_gltsyn_time_lock;
+
 	refcount_t refcount;
 };
 
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index c11eba07283c..0875f37add78 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
 	u8 tmr_idx;
 
 	tmr_idx = ice_get_ptp_src_clock_index(hw);
+	guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
 	/* Read the system timestamp pre PHC read */
 	ptp_read_system_prets(sts);
 
@@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
 		   struct ptp_system_timestamp *sts)
 {
 	struct ice_pf *pf = ptp_info_to_pf(info);
-	struct ice_hw *hw = &pf->hw;
-
-	if (!ice_ptp_lock(hw)) {
-		dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
-		return -EBUSY;
-	}
 
 	ice_ptp_read_time(pf, ts, sts);
-	ice_ptp_unlock(hw);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 187ce9b54e1a..2b9423a173bb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
  */
 static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
 {
+	struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+
+	guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
 	wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
 	ice_flush(hw);
 }
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path
@ 2024-03-06 16:29   ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
	Jacob Keller, Jakub Kicinski

The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
operations that program the PTP timers. The operations involve issuing
commands to the sideband queue. The E810 does not have a hardware
sideband queue, so the admin queue is used. The admin queue is slow.
I have observed delays in hundreds of milliseconds waiting for
ice_sq_done.

When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
held by a task performing one of the slow operations, ice_ptp_lock can
easily time out. phc2sys gets -EBUSY and the kernel prints:
  ice 0000:XX:YY.0: PTP failed to get time
These messages appear once every few seconds, causing log spam.

The E810 datasheet recommends an algorithm for reading the upper 64 bits
of the GLTSYN_TIME register. It matches what's implemented in
ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
necessarily against the concurrent setting of the register (with
GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
ice_ptp_gettimex64 also takes PFTSYN_SEM.

The race with time setters can be prevented without relying on the PTP
hardware semaphore. Using the "ice_adapter" from the previous patch,
we can have a common spinlock for the PFs that share the clock hardware.
It will protect the reading and writing to the GLTSYN_TIME register.
The writing is performed indirectly, by the hardware, as a result of
the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
updated, but it works well in my testing.

My test code can be seen here:
https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
It consists of:
 - kernel threads reading the time in a busy loop and looking at the
   deltas between consecutive values, reporting new maxima.
 - a shell script that sets the time repeatedly;
 - a bpftrace probe to produce a histogram of the measured deltas.
Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
Deltas in the [2G, 4G) range appear in the histograms.
With the spinlock added, there is no tearing and the biggest delta I saw
was in the range [1M, 2M), that is under 2 ms.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
index b93b4db4c04c..85fa2e1326af 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -5,6 +5,7 @@
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>
 #include "ice_adapter.h"
 
@@ -28,6 +29,7 @@ static struct ice_adapter *ice_adapter_new(void)
 	if (!a)
 		return NULL;
 
+	spin_lock_init(&a->ptp_gltsyn_time_lock);
 	refcount_set(&a->refcount, 1);
 
 	return a;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
index cb5a02eb24c1..9d11014ec02f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.h
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -4,15 +4,21 @@
 #ifndef _ICE_ADAPTER_H_
 #define _ICE_ADAPTER_H_
 
+#include <linux/spinlock_types.h>
 #include <linux/refcount_types.h>
 
 struct pci_dev;
 
 /**
  * struct ice_adapter - PCI adapter resources shared across PFs
+ * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
+ *                        register of the PTP clock.
  * @refcount: Reference count. struct ice_pf objects hold the references.
  */
 struct ice_adapter {
+	/* For access to the GLTSYN_TIME register */
+	spinlock_t ptp_gltsyn_time_lock;
+
 	refcount_t refcount;
 };
 
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index c11eba07283c..0875f37add78 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
 	u8 tmr_idx;
 
 	tmr_idx = ice_get_ptp_src_clock_index(hw);
+	guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
 	/* Read the system timestamp pre PHC read */
 	ptp_read_system_prets(sts);
 
@@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
 		   struct ptp_system_timestamp *sts)
 {
 	struct ice_pf *pf = ptp_info_to_pf(info);
-	struct ice_hw *hw = &pf->hw;
-
-	if (!ice_ptp_lock(hw)) {
-		dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
-		return -EBUSY;
-	}
 
 	ice_ptp_read_time(pf, ts, sts);
-	ice_ptp_unlock(hw);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 187ce9b54e1a..2b9423a173bb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
  */
 static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
 {
+	struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+
+	guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
 	wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
 	ice_flush(hw);
 }
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64
  2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29   ` Michal Schmidt
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

This is a cleanup. It is unnecessary to have this function just to call
another function.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c | 25 +++---------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 0875f37add78..0f17fc1181d2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1166,26 +1166,6 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
 	ice_ptp_mark_tx_tracker_stale(&pf->ptp.port.tx);
 }
 
-/**
- * ice_ptp_read_time - Read the time from the device
- * @pf: Board private structure
- * @ts: timespec structure to hold the current time value
- * @sts: Optional parameter for holding a pair of system timestamps from
- *       the system clock. Will be ignored if NULL is given.
- *
- * This function reads the source clock registers and stores them in a timespec.
- * However, since the registers are 64 bits of nanoseconds, we must convert the
- * result to a timespec before we can return.
- */
-static void
-ice_ptp_read_time(struct ice_pf *pf, struct timespec64 *ts,
-		  struct ptp_system_timestamp *sts)
-{
-	u64 time_ns = ice_ptp_read_src_clk_reg(pf, sts);
-
-	*ts = ns_to_timespec64(time_ns);
-}
-
 /**
  * ice_ptp_write_init - Set PHC time to provided value
  * @pf: Board private structure
@@ -1926,9 +1906,10 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
 		   struct ptp_system_timestamp *sts)
 {
 	struct ice_pf *pf = ptp_info_to_pf(info);
+	u64 time_ns;
 
-	ice_ptp_read_time(pf, ts, sts);
-
+	time_ns = ice_ptp_read_src_clk_reg(pf, sts);
+	*ts = ns_to_timespec64(time_ns);
 	return 0;
 }
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64
@ 2024-03-06 16:29   ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
	Jacob Keller, Jakub Kicinski

This is a cleanup. It is unnecessary to have this function just to call
another function.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c | 25 +++---------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 0875f37add78..0f17fc1181d2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1166,26 +1166,6 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
 	ice_ptp_mark_tx_tracker_stale(&pf->ptp.port.tx);
 }
 
-/**
- * ice_ptp_read_time - Read the time from the device
- * @pf: Board private structure
- * @ts: timespec structure to hold the current time value
- * @sts: Optional parameter for holding a pair of system timestamps from
- *       the system clock. Will be ignored if NULL is given.
- *
- * This function reads the source clock registers and stores them in a timespec.
- * However, since the registers are 64 bits of nanoseconds, we must convert the
- * result to a timespec before we can return.
- */
-static void
-ice_ptp_read_time(struct ice_pf *pf, struct timespec64 *ts,
-		  struct ptp_system_timestamp *sts)
-{
-	u64 time_ns = ice_ptp_read_src_clk_reg(pf, sts);
-
-	*ts = ns_to_timespec64(time_ns);
-}
-
 /**
  * ice_ptp_write_init - Set PHC time to provided value
  * @pf: Board private structure
@@ -1926,9 +1906,10 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
 		   struct ptp_system_timestamp *sts)
 {
 	struct ice_pf *pf = ptp_info_to_pf(info);
+	u64 time_ns;
 
-	ice_ptp_read_time(pf, ts, sts);
-
+	time_ns = ice_ptp_read_src_clk_reg(pf, sts);
+	*ts = ns_to_timespec64(time_ns);
 	return 0;
 }
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
  2024-03-06 16:29   ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 17:00     ` Jiri Pirko
  -1 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-06 17:00 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>There is a need for synchronization between ice PFs on the same physical
>adapter.
>
>Add a "struct ice_adapter" for holding data shared between PFs of the
>same multifunction PCI device. The struct is refcounted - each ice_pf
>holds a reference to it.
>
>Its first use will be for PTP. I expect it will be useful also to
>improve the ugliness that is ice_prot_id_tbl.
>
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/Makefile      |  3 +-
> drivers/net/ethernet/intel/ice/ice.h         |  2 +
> drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
> 5 files changed, 119 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>
>diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>index cddd82d4ca0f..4fa09c321440 100644
>--- a/drivers/net/ethernet/intel/ice/Makefile
>+++ b/drivers/net/ethernet/intel/ice/Makefile
>@@ -36,7 +36,8 @@ ice-y := ice_main.o	\
> 	 ice_repr.o	\
> 	 ice_tc_lib.o	\
> 	 ice_fwlog.o	\
>-	 ice_debugfs.o
>+	 ice_debugfs.o  \
>+	 ice_adapter.o
> ice-$(CONFIG_PCI_IOV) +=	\
> 	ice_sriov.o		\
> 	ice_virtchnl.o		\
>diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>index 365c03d1c462..1ffecbdd361a 100644
>--- a/drivers/net/ethernet/intel/ice/ice.h
>+++ b/drivers/net/ethernet/intel/ice/ice.h
>@@ -77,6 +77,7 @@
> #include "ice_gnss.h"
> #include "ice_irq.h"
> #include "ice_dpll.h"
>+#include "ice_adapter.h"
> 
> #define ICE_BAR0		0
> #define ICE_REQ_DESC_MULTIPLE	32
>@@ -544,6 +545,7 @@ struct ice_agg_node {
> 
> struct ice_pf {
> 	struct pci_dev *pdev;
>+	struct ice_adapter *adapter;
> 
> 	struct devlink_region *nvm_region;
> 	struct devlink_region *sram_region;
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>new file mode 100644
>index 000000000000..b93b4db4c04c
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>@@ -0,0 +1,85 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+// SPDX-FileCopyrightText: Copyright Red Hat
>+
>+#include <linux/cleanup.h>
>+#include <linux/mutex.h>
>+#include <linux/pci.h>
>+#include <linux/slab.h>
>+#include <linux/xarray.h>
>+#include "ice_adapter.h"
>+
>+static DEFINE_XARRAY(ice_adapters);
>+
>+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>+{
>+	unsigned int domain = pci_domain_nr(pdev->bus);
>+
>+	WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>+	return ((unsigned long)domain << 13) |
>+	       ((unsigned long)pdev->bus->number << 5) |
>+	       PCI_SLOT(pdev->devfn);
>+}
>+
>+static struct ice_adapter *ice_adapter_new(void)
>+{
>+	struct ice_adapter *a;
>+
>+	a = kzalloc(sizeof(*a), GFP_KERNEL);
>+	if (!a)
>+		return NULL;
>+
>+	refcount_set(&a->refcount, 1);
>+
>+	return a;
>+}
>+
>+static void ice_adapter_free(struct ice_adapter *a)
>+{
>+	kfree(a);
>+}
>+
>+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>+{
>+	struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>+	unsigned long index = ice_adapter_index(pdev);
>+
>+	a = ice_adapter_new();

Please consider some non-single-letter variable name.


>+	if (!a)
>+		return NULL;
>+
>+	xa_lock(&ice_adapters);
>+	ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);

This is atomic section, can't sleep.


>+	if (xa_is_err(ret)) {
>+		ret = NULL;

Why don't you propagate err through ERR_PTR() ?


>+		goto unlock;
>+	}
>+	if (ret) {
>+		refcount_inc(&ret->refcount);
>+		goto unlock;
>+	}
>+	ret = no_free_ptr(a);
>+unlock:
>+	xa_unlock(&ice_adapters);
>+	return ret;
>+}
>+
>+void ice_adapter_put(const struct pci_dev *pdev)
>+{
>+	unsigned long index = ice_adapter_index(pdev);
>+	struct ice_adapter *a;
>+
>+	xa_lock(&ice_adapters);
>+	a = xa_load(&ice_adapters, index);
>+	if (WARN_ON(!a))
>+		goto unlock;
>+
>+	if (!refcount_dec_and_test(&a->refcount))
>+		goto unlock;
>+
>+	WARN_ON(__xa_erase(&ice_adapters, index) != a);

Nice paranoia level :)


>+	ice_adapter_free(a);
>+unlock:
>+	xa_unlock(&ice_adapters);
>+}
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>new file mode 100644
>index 000000000000..cb5a02eb24c1
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>@@ -0,0 +1,22 @@
>+/* SPDX-License-Identifier: GPL-2.0-only */
>+/* SPDX-FileCopyrightText: Copyright Red Hat */
>+
>+#ifndef _ICE_ADAPTER_H_
>+#define _ICE_ADAPTER_H_
>+
>+#include <linux/refcount_types.h>
>+
>+struct pci_dev;
>+
>+/**
>+ * struct ice_adapter - PCI adapter resources shared across PFs
>+ * @refcount: Reference count. struct ice_pf objects hold the references.
>+ */
>+struct ice_adapter {
>+	refcount_t refcount;
>+};
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>+void ice_adapter_put(const struct pci_dev *pdev);
>+
>+#endif /* _ICE_ADAPTER_H */
>diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>index 8f73ba77e835..413219d81a12 100644
>--- a/drivers/net/ethernet/intel/ice/ice_main.c
>+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>@@ -5093,6 +5093,7 @@ static int
> ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> {
> 	struct device *dev = &pdev->dev;
>+	struct ice_adapter *adapter;
> 	struct ice_pf *pf;
> 	struct ice_hw *hw;
> 	int err;
>@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> 
> 	pci_set_master(pdev);
> 
>+	adapter = ice_adapter_get(pdev);
>+	if (!adapter)
>+		return -ENOMEM;
>+
> 	pf->pdev = pdev;
>+	pf->adapter = adapter;
> 	pci_set_drvdata(pdev, pf);
> 	set_bit(ICE_DOWN, pf->state);
> 	/* Disable service task until DOWN bit is cleared */
>@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> err_load:
> 	ice_deinit(pf);
> err_init:
>+	ice_adapter_put(pdev);
> 	pci_disable_device(pdev);
> 	return err;
> }
>@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> 	ice_setup_mc_magic_wake(pf);
> 	ice_set_wake(pf);
> 
>+	ice_adapter_put(pdev);
> 	pci_disable_device(pdev);
> }
> 
>-- 
>2.43.2
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 17:00     ` Jiri Pirko
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-06 17:00 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
	Jacob Keller, Jakub Kicinski

Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>There is a need for synchronization between ice PFs on the same physical
>adapter.
>
>Add a "struct ice_adapter" for holding data shared between PFs of the
>same multifunction PCI device. The struct is refcounted - each ice_pf
>holds a reference to it.
>
>Its first use will be for PTP. I expect it will be useful also to
>improve the ugliness that is ice_prot_id_tbl.
>
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/Makefile      |  3 +-
> drivers/net/ethernet/intel/ice/ice.h         |  2 +
> drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
> 5 files changed, 119 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>
>diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>index cddd82d4ca0f..4fa09c321440 100644
>--- a/drivers/net/ethernet/intel/ice/Makefile
>+++ b/drivers/net/ethernet/intel/ice/Makefile
>@@ -36,7 +36,8 @@ ice-y := ice_main.o	\
> 	 ice_repr.o	\
> 	 ice_tc_lib.o	\
> 	 ice_fwlog.o	\
>-	 ice_debugfs.o
>+	 ice_debugfs.o  \
>+	 ice_adapter.o
> ice-$(CONFIG_PCI_IOV) +=	\
> 	ice_sriov.o		\
> 	ice_virtchnl.o		\
>diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>index 365c03d1c462..1ffecbdd361a 100644
>--- a/drivers/net/ethernet/intel/ice/ice.h
>+++ b/drivers/net/ethernet/intel/ice/ice.h
>@@ -77,6 +77,7 @@
> #include "ice_gnss.h"
> #include "ice_irq.h"
> #include "ice_dpll.h"
>+#include "ice_adapter.h"
> 
> #define ICE_BAR0		0
> #define ICE_REQ_DESC_MULTIPLE	32
>@@ -544,6 +545,7 @@ struct ice_agg_node {
> 
> struct ice_pf {
> 	struct pci_dev *pdev;
>+	struct ice_adapter *adapter;
> 
> 	struct devlink_region *nvm_region;
> 	struct devlink_region *sram_region;
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>new file mode 100644
>index 000000000000..b93b4db4c04c
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>@@ -0,0 +1,85 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+// SPDX-FileCopyrightText: Copyright Red Hat
>+
>+#include <linux/cleanup.h>
>+#include <linux/mutex.h>
>+#include <linux/pci.h>
>+#include <linux/slab.h>
>+#include <linux/xarray.h>
>+#include "ice_adapter.h"
>+
>+static DEFINE_XARRAY(ice_adapters);
>+
>+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>+{
>+	unsigned int domain = pci_domain_nr(pdev->bus);
>+
>+	WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>+	return ((unsigned long)domain << 13) |
>+	       ((unsigned long)pdev->bus->number << 5) |
>+	       PCI_SLOT(pdev->devfn);
>+}
>+
>+static struct ice_adapter *ice_adapter_new(void)
>+{
>+	struct ice_adapter *a;
>+
>+	a = kzalloc(sizeof(*a), GFP_KERNEL);
>+	if (!a)
>+		return NULL;
>+
>+	refcount_set(&a->refcount, 1);
>+
>+	return a;
>+}
>+
>+static void ice_adapter_free(struct ice_adapter *a)
>+{
>+	kfree(a);
>+}
>+
>+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>+{
>+	struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>+	unsigned long index = ice_adapter_index(pdev);
>+
>+	a = ice_adapter_new();

Please consider some non-single-letter variable name.


>+	if (!a)
>+		return NULL;
>+
>+	xa_lock(&ice_adapters);
>+	ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);

This is atomic section, can't sleep.


>+	if (xa_is_err(ret)) {
>+		ret = NULL;

Why don't you propagate err through ERR_PTR() ?


>+		goto unlock;
>+	}
>+	if (ret) {
>+		refcount_inc(&ret->refcount);
>+		goto unlock;
>+	}
>+	ret = no_free_ptr(a);
>+unlock:
>+	xa_unlock(&ice_adapters);
>+	return ret;
>+}
>+
>+void ice_adapter_put(const struct pci_dev *pdev)
>+{
>+	unsigned long index = ice_adapter_index(pdev);
>+	struct ice_adapter *a;
>+
>+	xa_lock(&ice_adapters);
>+	a = xa_load(&ice_adapters, index);
>+	if (WARN_ON(!a))
>+		goto unlock;
>+
>+	if (!refcount_dec_and_test(&a->refcount))
>+		goto unlock;
>+
>+	WARN_ON(__xa_erase(&ice_adapters, index) != a);

Nice paranoia level :)


>+	ice_adapter_free(a);
>+unlock:
>+	xa_unlock(&ice_adapters);
>+}
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>new file mode 100644
>index 000000000000..cb5a02eb24c1
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>@@ -0,0 +1,22 @@
>+/* SPDX-License-Identifier: GPL-2.0-only */
>+/* SPDX-FileCopyrightText: Copyright Red Hat */
>+
>+#ifndef _ICE_ADAPTER_H_
>+#define _ICE_ADAPTER_H_
>+
>+#include <linux/refcount_types.h>
>+
>+struct pci_dev;
>+
>+/**
>+ * struct ice_adapter - PCI adapter resources shared across PFs
>+ * @refcount: Reference count. struct ice_pf objects hold the references.
>+ */
>+struct ice_adapter {
>+	refcount_t refcount;
>+};
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>+void ice_adapter_put(const struct pci_dev *pdev);
>+
>+#endif /* _ICE_ADAPTER_H */
>diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>index 8f73ba77e835..413219d81a12 100644
>--- a/drivers/net/ethernet/intel/ice/ice_main.c
>+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>@@ -5093,6 +5093,7 @@ static int
> ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> {
> 	struct device *dev = &pdev->dev;
>+	struct ice_adapter *adapter;
> 	struct ice_pf *pf;
> 	struct ice_hw *hw;
> 	int err;
>@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> 
> 	pci_set_master(pdev);
> 
>+	adapter = ice_adapter_get(pdev);
>+	if (!adapter)
>+		return -ENOMEM;
>+
> 	pf->pdev = pdev;
>+	pf->adapter = adapter;
> 	pci_set_drvdata(pdev, pf);
> 	set_bit(ICE_DOWN, pf->state);
> 	/* Disable service task until DOWN bit is cleared */
>@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> err_load:
> 	ice_deinit(pf);
> err_init:
>+	ice_adapter_put(pdev);
> 	pci_disable_device(pdev);
> 	return err;
> }
>@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> 	ice_setup_mc_magic_wake(pf);
> 	ice_set_wake(pf);
> 
>+	ice_adapter_put(pdev);
> 	pci_disable_device(pdev);
> }
> 
>-- 
>2.43.2
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
  2024-03-06 17:00     ` [Intel-wired-lan] " Jiri Pirko
@ 2024-03-06 19:20       ` Michal Schmidt
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 19:20 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
> >There is a need for synchronization between ice PFs on the same physical
> >adapter.
> >
> >Add a "struct ice_adapter" for holding data shared between PFs of the
> >same multifunction PCI device. The struct is refcounted - each ice_pf
> >holds a reference to it.
> >
> >Its first use will be for PTP. I expect it will be useful also to
> >improve the ugliness that is ice_prot_id_tbl.
> >
> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> >---
> > drivers/net/ethernet/intel/ice/Makefile      |  3 +-
> > drivers/net/ethernet/intel/ice/ice.h         |  2 +
> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> > drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
> > 5 files changed, 119 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
> >
> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
> >index cddd82d4ca0f..4fa09c321440 100644
> >--- a/drivers/net/ethernet/intel/ice/Makefile
> >+++ b/drivers/net/ethernet/intel/ice/Makefile
> >@@ -36,7 +36,8 @@ ice-y := ice_main.o  \
> >        ice_repr.o     \
> >        ice_tc_lib.o   \
> >        ice_fwlog.o    \
> >-       ice_debugfs.o
> >+       ice_debugfs.o  \
> >+       ice_adapter.o
> > ice-$(CONFIG_PCI_IOV) +=      \
> >       ice_sriov.o             \
> >       ice_virtchnl.o          \
> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> >index 365c03d1c462..1ffecbdd361a 100644
> >--- a/drivers/net/ethernet/intel/ice/ice.h
> >+++ b/drivers/net/ethernet/intel/ice/ice.h
> >@@ -77,6 +77,7 @@
> > #include "ice_gnss.h"
> > #include "ice_irq.h"
> > #include "ice_dpll.h"
> >+#include "ice_adapter.h"
> >
> > #define ICE_BAR0              0
> > #define ICE_REQ_DESC_MULTIPLE 32
> >@@ -544,6 +545,7 @@ struct ice_agg_node {
> >
> > struct ice_pf {
> >       struct pci_dev *pdev;
> >+      struct ice_adapter *adapter;
> >
> >       struct devlink_region *nvm_region;
> >       struct devlink_region *sram_region;
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >new file mode 100644
> >index 000000000000..b93b4db4c04c
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >@@ -0,0 +1,85 @@
> >+// SPDX-License-Identifier: GPL-2.0-only
> >+// SPDX-FileCopyrightText: Copyright Red Hat
> >+
> >+#include <linux/cleanup.h>
> >+#include <linux/mutex.h>
> >+#include <linux/pci.h>
> >+#include <linux/slab.h>
> >+#include <linux/xarray.h>
> >+#include "ice_adapter.h"
> >+
> >+static DEFINE_XARRAY(ice_adapters);
> >+
> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
> >+{
> >+      unsigned int domain = pci_domain_nr(pdev->bus);
> >+
> >+      WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
> >+      return ((unsigned long)domain << 13) |
> >+             ((unsigned long)pdev->bus->number << 5) |
> >+             PCI_SLOT(pdev->devfn);
> >+}
> >+
> >+static struct ice_adapter *ice_adapter_new(void)
> >+{
> >+      struct ice_adapter *a;
> >+
> >+      a = kzalloc(sizeof(*a), GFP_KERNEL);
> >+      if (!a)
> >+              return NULL;
> >+
> >+      refcount_set(&a->refcount, 1);
> >+
> >+      return a;
> >+}
> >+
> >+static void ice_adapter_free(struct ice_adapter *a)
> >+{
> >+      kfree(a);
> >+}
> >+
> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
> >+{
> >+      struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
> >+      unsigned long index = ice_adapter_index(pdev);
> >+
> >+      a = ice_adapter_new();
>
> Please consider some non-single-letter variable name.

Alright, I can change the name.

> >+      if (!a)
> >+              return NULL;
> >+
> >+      xa_lock(&ice_adapters);
> >+      ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>
> This is atomic section, can't sleep.

It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
memory, then reacquires it.

> >+      if (xa_is_err(ret)) {
> >+              ret = NULL;
>
> Why don't you propagate err through ERR_PTR() ?

It seemed unnecessary. ENOMEM is the only failure that can possibly
happen. EINVAL could be returned only if attempting to store an
unaligned pointer, which won't happen here.

>
> >+              goto unlock;
> >+      }
> >+      if (ret) {
> >+              refcount_inc(&ret->refcount);
> >+              goto unlock;
> >+      }
> >+      ret = no_free_ptr(a);
> >+unlock:
> >+      xa_unlock(&ice_adapters);
> >+      return ret;
> >+}
> >+
> >+void ice_adapter_put(const struct pci_dev *pdev)
> >+{
> >+      unsigned long index = ice_adapter_index(pdev);
> >+      struct ice_adapter *a;
> >+
> >+      xa_lock(&ice_adapters);
> >+      a = xa_load(&ice_adapters, index);
> >+      if (WARN_ON(!a))
> >+              goto unlock;
> >+
> >+      if (!refcount_dec_and_test(&a->refcount))
> >+              goto unlock;
> >+
> >+      WARN_ON(__xa_erase(&ice_adapters, index) != a);
>
> Nice paranoia level :)
>
>
> >+      ice_adapter_free(a);
> >+unlock:
> >+      xa_unlock(&ice_adapters);
> >+}
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >new file mode 100644
> >index 000000000000..cb5a02eb24c1
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >@@ -0,0 +1,22 @@
> >+/* SPDX-License-Identifier: GPL-2.0-only */
> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
> >+
> >+#ifndef _ICE_ADAPTER_H_
> >+#define _ICE_ADAPTER_H_
> >+
> >+#include <linux/refcount_types.h>
> >+
> >+struct pci_dev;
> >+
> >+/**
> >+ * struct ice_adapter - PCI adapter resources shared across PFs
> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
> >+ */
> >+struct ice_adapter {
> >+      refcount_t refcount;
> >+};
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
> >+void ice_adapter_put(const struct pci_dev *pdev);
> >+
> >+#endif /* _ICE_ADAPTER_H */
> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> >index 8f73ba77e835..413219d81a12 100644
> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
> >@@ -5093,6 +5093,7 @@ static int
> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > {
> >       struct device *dev = &pdev->dev;
> >+      struct ice_adapter *adapter;
> >       struct ice_pf *pf;
> >       struct ice_hw *hw;
> >       int err;
> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> >
> >       pci_set_master(pdev);
> >
> >+      adapter = ice_adapter_get(pdev);
> >+      if (!adapter)
> >+              return -ENOMEM;
> >+
> >       pf->pdev = pdev;
> >+      pf->adapter = adapter;
> >       pci_set_drvdata(pdev, pf);
> >       set_bit(ICE_DOWN, pf->state);
> >       /* Disable service task until DOWN bit is cleared */
> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > err_load:
> >       ice_deinit(pf);
> > err_init:
> >+      ice_adapter_put(pdev);
> >       pci_disable_device(pdev);
> >       return err;
> > }
> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> >       ice_setup_mc_magic_wake(pf);
> >       ice_set_wake(pf);
> >
> >+      ice_adapter_put(pdev);
> >       pci_disable_device(pdev);
> > }
> >
> >--
> >2.43.2
> >
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 19:20       ` Michal Schmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 19:20 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
	Jacob Keller, Jakub Kicinski

On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
> >There is a need for synchronization between ice PFs on the same physical
> >adapter.
> >
> >Add a "struct ice_adapter" for holding data shared between PFs of the
> >same multifunction PCI device. The struct is refcounted - each ice_pf
> >holds a reference to it.
> >
> >Its first use will be for PTP. I expect it will be useful also to
> >improve the ugliness that is ice_prot_id_tbl.
> >
> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> >---
> > drivers/net/ethernet/intel/ice/Makefile      |  3 +-
> > drivers/net/ethernet/intel/ice/ice.h         |  2 +
> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> > drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
> > 5 files changed, 119 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
> >
> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
> >index cddd82d4ca0f..4fa09c321440 100644
> >--- a/drivers/net/ethernet/intel/ice/Makefile
> >+++ b/drivers/net/ethernet/intel/ice/Makefile
> >@@ -36,7 +36,8 @@ ice-y := ice_main.o  \
> >        ice_repr.o     \
> >        ice_tc_lib.o   \
> >        ice_fwlog.o    \
> >-       ice_debugfs.o
> >+       ice_debugfs.o  \
> >+       ice_adapter.o
> > ice-$(CONFIG_PCI_IOV) +=      \
> >       ice_sriov.o             \
> >       ice_virtchnl.o          \
> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> >index 365c03d1c462..1ffecbdd361a 100644
> >--- a/drivers/net/ethernet/intel/ice/ice.h
> >+++ b/drivers/net/ethernet/intel/ice/ice.h
> >@@ -77,6 +77,7 @@
> > #include "ice_gnss.h"
> > #include "ice_irq.h"
> > #include "ice_dpll.h"
> >+#include "ice_adapter.h"
> >
> > #define ICE_BAR0              0
> > #define ICE_REQ_DESC_MULTIPLE 32
> >@@ -544,6 +545,7 @@ struct ice_agg_node {
> >
> > struct ice_pf {
> >       struct pci_dev *pdev;
> >+      struct ice_adapter *adapter;
> >
> >       struct devlink_region *nvm_region;
> >       struct devlink_region *sram_region;
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >new file mode 100644
> >index 000000000000..b93b4db4c04c
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >@@ -0,0 +1,85 @@
> >+// SPDX-License-Identifier: GPL-2.0-only
> >+// SPDX-FileCopyrightText: Copyright Red Hat
> >+
> >+#include <linux/cleanup.h>
> >+#include <linux/mutex.h>
> >+#include <linux/pci.h>
> >+#include <linux/slab.h>
> >+#include <linux/xarray.h>
> >+#include "ice_adapter.h"
> >+
> >+static DEFINE_XARRAY(ice_adapters);
> >+
> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
> >+{
> >+      unsigned int domain = pci_domain_nr(pdev->bus);
> >+
> >+      WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
> >+      return ((unsigned long)domain << 13) |
> >+             ((unsigned long)pdev->bus->number << 5) |
> >+             PCI_SLOT(pdev->devfn);
> >+}
> >+
> >+static struct ice_adapter *ice_adapter_new(void)
> >+{
> >+      struct ice_adapter *a;
> >+
> >+      a = kzalloc(sizeof(*a), GFP_KERNEL);
> >+      if (!a)
> >+              return NULL;
> >+
> >+      refcount_set(&a->refcount, 1);
> >+
> >+      return a;
> >+}
> >+
> >+static void ice_adapter_free(struct ice_adapter *a)
> >+{
> >+      kfree(a);
> >+}
> >+
> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
> >+{
> >+      struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
> >+      unsigned long index = ice_adapter_index(pdev);
> >+
> >+      a = ice_adapter_new();
>
> Please consider some non-single-letter variable name.

Alright, I can change the name.

> >+      if (!a)
> >+              return NULL;
> >+
> >+      xa_lock(&ice_adapters);
> >+      ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>
> This is atomic section, can't sleep.

It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
memory, then reacquires it.

> >+      if (xa_is_err(ret)) {
> >+              ret = NULL;
>
> Why don't you propagate err through ERR_PTR() ?

It seemed unnecessary. ENOMEM is the only failure that can possibly
happen. EINVAL could be returned only if attempting to store an
unaligned pointer, which won't happen here.

>
> >+              goto unlock;
> >+      }
> >+      if (ret) {
> >+              refcount_inc(&ret->refcount);
> >+              goto unlock;
> >+      }
> >+      ret = no_free_ptr(a);
> >+unlock:
> >+      xa_unlock(&ice_adapters);
> >+      return ret;
> >+}
> >+
> >+void ice_adapter_put(const struct pci_dev *pdev)
> >+{
> >+      unsigned long index = ice_adapter_index(pdev);
> >+      struct ice_adapter *a;
> >+
> >+      xa_lock(&ice_adapters);
> >+      a = xa_load(&ice_adapters, index);
> >+      if (WARN_ON(!a))
> >+              goto unlock;
> >+
> >+      if (!refcount_dec_and_test(&a->refcount))
> >+              goto unlock;
> >+
> >+      WARN_ON(__xa_erase(&ice_adapters, index) != a);
>
> Nice paranoia level :)
>
>
> >+      ice_adapter_free(a);
> >+unlock:
> >+      xa_unlock(&ice_adapters);
> >+}
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >new file mode 100644
> >index 000000000000..cb5a02eb24c1
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >@@ -0,0 +1,22 @@
> >+/* SPDX-License-Identifier: GPL-2.0-only */
> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
> >+
> >+#ifndef _ICE_ADAPTER_H_
> >+#define _ICE_ADAPTER_H_
> >+
> >+#include <linux/refcount_types.h>
> >+
> >+struct pci_dev;
> >+
> >+/**
> >+ * struct ice_adapter - PCI adapter resources shared across PFs
> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
> >+ */
> >+struct ice_adapter {
> >+      refcount_t refcount;
> >+};
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
> >+void ice_adapter_put(const struct pci_dev *pdev);
> >+
> >+#endif /* _ICE_ADAPTER_H */
> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> >index 8f73ba77e835..413219d81a12 100644
> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
> >@@ -5093,6 +5093,7 @@ static int
> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > {
> >       struct device *dev = &pdev->dev;
> >+      struct ice_adapter *adapter;
> >       struct ice_pf *pf;
> >       struct ice_hw *hw;
> >       int err;
> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> >
> >       pci_set_master(pdev);
> >
> >+      adapter = ice_adapter_get(pdev);
> >+      if (!adapter)
> >+              return -ENOMEM;
> >+
> >       pf->pdev = pdev;
> >+      pf->adapter = adapter;
> >       pci_set_drvdata(pdev, pf);
> >       set_bit(ICE_DOWN, pf->state);
> >       /* Disable service task until DOWN bit is cleared */
> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > err_load:
> >       ice_deinit(pf);
> > err_init:
> >+      ice_adapter_put(pdev);
> >       pci_disable_device(pdev);
> >       return err;
> > }
> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> >       ice_setup_mc_magic_wake(pf);
> >       ice_set_wake(pf);
> >
> >+      ice_adapter_put(pdev);
> >       pci_disable_device(pdev);
> > }
> >
> >--
> >2.43.2
> >
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
  2024-03-06 19:20       ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-07  8:27         ` Jiri Pirko
  -1 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-07  8:27 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
	Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski

Wed, Mar 06, 2024 at 08:20:33PM CET, mschmidt@redhat.com wrote:
>On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>> >There is a need for synchronization between ice PFs on the same physical
>> >adapter.
>> >
>> >Add a "struct ice_adapter" for holding data shared between PFs of the
>> >same multifunction PCI device. The struct is refcounted - each ice_pf
>> >holds a reference to it.
>> >
>> >Its first use will be for PTP. I expect it will be useful also to
>> >improve the ugliness that is ice_prot_id_tbl.
>> >
>> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> >---
>> > drivers/net/ethernet/intel/ice/Makefile      |  3 +-
>> > drivers/net/ethernet/intel/ice/ice.h         |  2 +
>> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
>> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
>> > drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
>> > 5 files changed, 119 insertions(+), 1 deletion(-)
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>> >
>> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>> >index cddd82d4ca0f..4fa09c321440 100644
>> >--- a/drivers/net/ethernet/intel/ice/Makefile
>> >+++ b/drivers/net/ethernet/intel/ice/Makefile
>> >@@ -36,7 +36,8 @@ ice-y := ice_main.o  \
>> >        ice_repr.o     \
>> >        ice_tc_lib.o   \
>> >        ice_fwlog.o    \
>> >-       ice_debugfs.o
>> >+       ice_debugfs.o  \
>> >+       ice_adapter.o
>> > ice-$(CONFIG_PCI_IOV) +=      \
>> >       ice_sriov.o             \
>> >       ice_virtchnl.o          \
>> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>> >index 365c03d1c462..1ffecbdd361a 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice.h
>> >+++ b/drivers/net/ethernet/intel/ice/ice.h
>> >@@ -77,6 +77,7 @@
>> > #include "ice_gnss.h"
>> > #include "ice_irq.h"
>> > #include "ice_dpll.h"
>> >+#include "ice_adapter.h"
>> >
>> > #define ICE_BAR0              0
>> > #define ICE_REQ_DESC_MULTIPLE 32
>> >@@ -544,6 +545,7 @@ struct ice_agg_node {
>> >
>> > struct ice_pf {
>> >       struct pci_dev *pdev;
>> >+      struct ice_adapter *adapter;
>> >
>> >       struct devlink_region *nvm_region;
>> >       struct devlink_region *sram_region;
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >new file mode 100644
>> >index 000000000000..b93b4db4c04c
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >@@ -0,0 +1,85 @@
>> >+// SPDX-License-Identifier: GPL-2.0-only
>> >+// SPDX-FileCopyrightText: Copyright Red Hat
>> >+
>> >+#include <linux/cleanup.h>
>> >+#include <linux/mutex.h>
>> >+#include <linux/pci.h>
>> >+#include <linux/slab.h>
>> >+#include <linux/xarray.h>
>> >+#include "ice_adapter.h"
>> >+
>> >+static DEFINE_XARRAY(ice_adapters);
>> >+
>> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>> >+{
>> >+      unsigned int domain = pci_domain_nr(pdev->bus);
>> >+
>> >+      WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>> >+      return ((unsigned long)domain << 13) |
>> >+             ((unsigned long)pdev->bus->number << 5) |
>> >+             PCI_SLOT(pdev->devfn);
>> >+}
>> >+
>> >+static struct ice_adapter *ice_adapter_new(void)
>> >+{
>> >+      struct ice_adapter *a;
>> >+
>> >+      a = kzalloc(sizeof(*a), GFP_KERNEL);
>> >+      if (!a)
>> >+              return NULL;
>> >+
>> >+      refcount_set(&a->refcount, 1);
>> >+
>> >+      return a;
>> >+}
>> >+
>> >+static void ice_adapter_free(struct ice_adapter *a)
>> >+{
>> >+      kfree(a);
>> >+}
>> >+
>> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>> >+{
>> >+      struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>> >+      unsigned long index = ice_adapter_index(pdev);
>> >+
>> >+      a = ice_adapter_new();
>>
>> Please consider some non-single-letter variable name.
>
>Alright, I can change the name.
>
>> >+      if (!a)
>> >+              return NULL;
>> >+
>> >+      xa_lock(&ice_adapters);
>> >+      ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>>
>> This is atomic section, can't sleep.
>
>It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
>memory, then reacquires it.

Ah, cool.


>
>> >+      if (xa_is_err(ret)) {
>> >+              ret = NULL;
>>
>> Why don't you propagate err through ERR_PTR() ?
>
>It seemed unnecessary. ENOMEM is the only failure that can possibly
>happen. EINVAL could be returned only if attempting to store an
>unaligned pointer, which won't happen here.

Yeah, the point is that you have valid err, you toss it out, the caller
then does:
	adapter = ice_adapter_get(pdev);
	if (!adapter)
		return -ENOMEM;
And reinvents err. So my point was to propagate it through.



>
>>
>> >+              goto unlock;
>> >+      }
>> >+      if (ret) {
>> >+              refcount_inc(&ret->refcount);
>> >+              goto unlock;
>> >+      }
>> >+      ret = no_free_ptr(a);
>> >+unlock:
>> >+      xa_unlock(&ice_adapters);
>> >+      return ret;
>> >+}
>> >+
>> >+void ice_adapter_put(const struct pci_dev *pdev)
>> >+{
>> >+      unsigned long index = ice_adapter_index(pdev);
>> >+      struct ice_adapter *a;
>> >+
>> >+      xa_lock(&ice_adapters);
>> >+      a = xa_load(&ice_adapters, index);
>> >+      if (WARN_ON(!a))
>> >+              goto unlock;
>> >+
>> >+      if (!refcount_dec_and_test(&a->refcount))
>> >+              goto unlock;
>> >+
>> >+      WARN_ON(__xa_erase(&ice_adapters, index) != a);
>>
>> Nice paranoia level :)
>>
>>
>> >+      ice_adapter_free(a);
>> >+unlock:
>> >+      xa_unlock(&ice_adapters);
>> >+}
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >new file mode 100644
>> >index 000000000000..cb5a02eb24c1
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >@@ -0,0 +1,22 @@
>> >+/* SPDX-License-Identifier: GPL-2.0-only */
>> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
>> >+
>> >+#ifndef _ICE_ADAPTER_H_
>> >+#define _ICE_ADAPTER_H_
>> >+
>> >+#include <linux/refcount_types.h>
>> >+
>> >+struct pci_dev;
>> >+
>> >+/**
>> >+ * struct ice_adapter - PCI adapter resources shared across PFs
>> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
>> >+ */
>> >+struct ice_adapter {
>> >+      refcount_t refcount;
>> >+};
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>> >+void ice_adapter_put(const struct pci_dev *pdev);
>> >+
>> >+#endif /* _ICE_ADAPTER_H */
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> >index 8f73ba77e835..413219d81a12 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
>> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> >@@ -5093,6 +5093,7 @@ static int
>> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > {
>> >       struct device *dev = &pdev->dev;
>> >+      struct ice_adapter *adapter;
>> >       struct ice_pf *pf;
>> >       struct ice_hw *hw;
>> >       int err;
>> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> >
>> >       pci_set_master(pdev);
>> >
>> >+      adapter = ice_adapter_get(pdev);
>> >+      if (!adapter)
>> >+              return -ENOMEM;
>> >+
>> >       pf->pdev = pdev;
>> >+      pf->adapter = adapter;
>> >       pci_set_drvdata(pdev, pf);
>> >       set_bit(ICE_DOWN, pf->state);
>> >       /* Disable service task until DOWN bit is cleared */
>> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > err_load:
>> >       ice_deinit(pf);
>> > err_init:
>> >+      ice_adapter_put(pdev);
>> >       pci_disable_device(pdev);
>> >       return err;
>> > }
>> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
>> >       ice_setup_mc_magic_wake(pf);
>> >       ice_set_wake(pf);
>> >
>> >+      ice_adapter_put(pdev);
>> >       pci_disable_device(pdev);
>> > }
>> >
>> >--
>> >2.43.2
>> >
>>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-07  8:27         ` Jiri Pirko
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-07  8:27 UTC (permalink / raw)
  To: Michal Schmidt
  Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
	Jacob Keller, Jakub Kicinski

Wed, Mar 06, 2024 at 08:20:33PM CET, mschmidt@redhat.com wrote:
>On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>> >There is a need for synchronization between ice PFs on the same physical
>> >adapter.
>> >
>> >Add a "struct ice_adapter" for holding data shared between PFs of the
>> >same multifunction PCI device. The struct is refcounted - each ice_pf
>> >holds a reference to it.
>> >
>> >Its first use will be for PTP. I expect it will be useful also to
>> >improve the ugliness that is ice_prot_id_tbl.
>> >
>> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> >---
>> > drivers/net/ethernet/intel/ice/Makefile      |  3 +-
>> > drivers/net/ethernet/intel/ice/ice.h         |  2 +
>> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
>> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
>> > drivers/net/ethernet/intel/ice/ice_main.c    |  8 ++
>> > 5 files changed, 119 insertions(+), 1 deletion(-)
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>> >
>> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>> >index cddd82d4ca0f..4fa09c321440 100644
>> >--- a/drivers/net/ethernet/intel/ice/Makefile
>> >+++ b/drivers/net/ethernet/intel/ice/Makefile
>> >@@ -36,7 +36,8 @@ ice-y := ice_main.o  \
>> >        ice_repr.o     \
>> >        ice_tc_lib.o   \
>> >        ice_fwlog.o    \
>> >-       ice_debugfs.o
>> >+       ice_debugfs.o  \
>> >+       ice_adapter.o
>> > ice-$(CONFIG_PCI_IOV) +=      \
>> >       ice_sriov.o             \
>> >       ice_virtchnl.o          \
>> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>> >index 365c03d1c462..1ffecbdd361a 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice.h
>> >+++ b/drivers/net/ethernet/intel/ice/ice.h
>> >@@ -77,6 +77,7 @@
>> > #include "ice_gnss.h"
>> > #include "ice_irq.h"
>> > #include "ice_dpll.h"
>> >+#include "ice_adapter.h"
>> >
>> > #define ICE_BAR0              0
>> > #define ICE_REQ_DESC_MULTIPLE 32
>> >@@ -544,6 +545,7 @@ struct ice_agg_node {
>> >
>> > struct ice_pf {
>> >       struct pci_dev *pdev;
>> >+      struct ice_adapter *adapter;
>> >
>> >       struct devlink_region *nvm_region;
>> >       struct devlink_region *sram_region;
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >new file mode 100644
>> >index 000000000000..b93b4db4c04c
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >@@ -0,0 +1,85 @@
>> >+// SPDX-License-Identifier: GPL-2.0-only
>> >+// SPDX-FileCopyrightText: Copyright Red Hat
>> >+
>> >+#include <linux/cleanup.h>
>> >+#include <linux/mutex.h>
>> >+#include <linux/pci.h>
>> >+#include <linux/slab.h>
>> >+#include <linux/xarray.h>
>> >+#include "ice_adapter.h"
>> >+
>> >+static DEFINE_XARRAY(ice_adapters);
>> >+
>> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>> >+{
>> >+      unsigned int domain = pci_domain_nr(pdev->bus);
>> >+
>> >+      WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>> >+      return ((unsigned long)domain << 13) |
>> >+             ((unsigned long)pdev->bus->number << 5) |
>> >+             PCI_SLOT(pdev->devfn);
>> >+}
>> >+
>> >+static struct ice_adapter *ice_adapter_new(void)
>> >+{
>> >+      struct ice_adapter *a;
>> >+
>> >+      a = kzalloc(sizeof(*a), GFP_KERNEL);
>> >+      if (!a)
>> >+              return NULL;
>> >+
>> >+      refcount_set(&a->refcount, 1);
>> >+
>> >+      return a;
>> >+}
>> >+
>> >+static void ice_adapter_free(struct ice_adapter *a)
>> >+{
>> >+      kfree(a);
>> >+}
>> >+
>> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>> >+{
>> >+      struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>> >+      unsigned long index = ice_adapter_index(pdev);
>> >+
>> >+      a = ice_adapter_new();
>>
>> Please consider some non-single-letter variable name.
>
>Alright, I can change the name.
>
>> >+      if (!a)
>> >+              return NULL;
>> >+
>> >+      xa_lock(&ice_adapters);
>> >+      ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>>
>> This is atomic section, can't sleep.
>
>It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
>memory, then reacquires it.

Ah, cool.


>
>> >+      if (xa_is_err(ret)) {
>> >+              ret = NULL;
>>
>> Why don't you propagate err through ERR_PTR() ?
>
>It seemed unnecessary. ENOMEM is the only failure that can possibly
>happen. EINVAL could be returned only if attempting to store an
>unaligned pointer, which won't happen here.

Yeah, the point is that you have valid err, you toss it out, the caller
then does:
	adapter = ice_adapter_get(pdev);
	if (!adapter)
		return -ENOMEM;
And reinvents err. So my point was to propagate it through.



>
>>
>> >+              goto unlock;
>> >+      }
>> >+      if (ret) {
>> >+              refcount_inc(&ret->refcount);
>> >+              goto unlock;
>> >+      }
>> >+      ret = no_free_ptr(a);
>> >+unlock:
>> >+      xa_unlock(&ice_adapters);
>> >+      return ret;
>> >+}
>> >+
>> >+void ice_adapter_put(const struct pci_dev *pdev)
>> >+{
>> >+      unsigned long index = ice_adapter_index(pdev);
>> >+      struct ice_adapter *a;
>> >+
>> >+      xa_lock(&ice_adapters);
>> >+      a = xa_load(&ice_adapters, index);
>> >+      if (WARN_ON(!a))
>> >+              goto unlock;
>> >+
>> >+      if (!refcount_dec_and_test(&a->refcount))
>> >+              goto unlock;
>> >+
>> >+      WARN_ON(__xa_erase(&ice_adapters, index) != a);
>>
>> Nice paranoia level :)
>>
>>
>> >+      ice_adapter_free(a);
>> >+unlock:
>> >+      xa_unlock(&ice_adapters);
>> >+}
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >new file mode 100644
>> >index 000000000000..cb5a02eb24c1
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >@@ -0,0 +1,22 @@
>> >+/* SPDX-License-Identifier: GPL-2.0-only */
>> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
>> >+
>> >+#ifndef _ICE_ADAPTER_H_
>> >+#define _ICE_ADAPTER_H_
>> >+
>> >+#include <linux/refcount_types.h>
>> >+
>> >+struct pci_dev;
>> >+
>> >+/**
>> >+ * struct ice_adapter - PCI adapter resources shared across PFs
>> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
>> >+ */
>> >+struct ice_adapter {
>> >+      refcount_t refcount;
>> >+};
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>> >+void ice_adapter_put(const struct pci_dev *pdev);
>> >+
>> >+#endif /* _ICE_ADAPTER_H */
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> >index 8f73ba77e835..413219d81a12 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
>> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> >@@ -5093,6 +5093,7 @@ static int
>> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > {
>> >       struct device *dev = &pdev->dev;
>> >+      struct ice_adapter *adapter;
>> >       struct ice_pf *pf;
>> >       struct ice_hw *hw;
>> >       int err;
>> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> >
>> >       pci_set_master(pdev);
>> >
>> >+      adapter = ice_adapter_get(pdev);
>> >+      if (!adapter)
>> >+              return -ENOMEM;
>> >+
>> >       pf->pdev = pdev;
>> >+      pf->adapter = adapter;
>> >       pci_set_drvdata(pdev, pf);
>> >       set_bit(ICE_DOWN, pf->state);
>> >       /* Disable service task until DOWN bit is cleared */
>> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > err_load:
>> >       ice_deinit(pf);
>> > err_init:
>> >+      ice_adapter_put(pdev);
>> >       pci_disable_device(pdev);
>> >       return err;
>> > }
>> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
>> >       ice_setup_mc_magic_wake(pf);
>> >       ice_set_wake(pf);
>> >
>> >+      ice_adapter_put(pdev);
>> >       pci_disable_device(pdev);
>> > }
>> >
>> >--
>> >2.43.2
>> >
>>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-03-07  8:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-06 16:29 [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading Michal Schmidt
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 16:29 ` [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC Michal Schmidt
2024-03-06 16:29   ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 17:00   ` Jiri Pirko
2024-03-06 17:00     ` [Intel-wired-lan] " Jiri Pirko
2024-03-06 19:20     ` Michal Schmidt
2024-03-06 19:20       ` [Intel-wired-lan] " Michal Schmidt
2024-03-07  8:27       ` Jiri Pirko
2024-03-07  8:27         ` [Intel-wired-lan] " Jiri Pirko
2024-03-06 16:29 ` [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path Michal Schmidt
2024-03-06 16:29   ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 16:29 ` [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64 Michal Schmidt
2024-03-06 16:29   ` [Intel-wired-lan] " Michal Schmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.