* [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading
@ 2024-03-06 16:29 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
This series removes the use of the heavy-weight PTP hardware semaphore
in the gettimex64 path. Instead, serialization of access to the time
register is done using a host-side spinlock. The timer hardware is
shared between PFs on the PCI adapter, so the spinlock must be shared
between ice_pf instances too.
Replacing the PTP hardware semaphore entirely with a mutex is also
possible and you can see it done in my git branch[1], but I am not
posting those patches yet to keep the scope of this series limited.
[1] https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
v2:
- Patch 1: Rely on xarray's own lock. (Suggested by Jiri Pirko)
- Patch 2: Do not use *_irqsave with ptp_gltsyn_time_lock, as it's used
only in process contexts.
Michal Schmidt (3):
ice: add ice_adapter for shared data across PFs on the same NIC
ice: avoid the PTP hardware semaphore in gettimex64 path
ice: fold ice_ptp_read_time into ice_ptp_gettimex64
drivers/net/ethernet/intel/ice/Makefile | 3 +-
drivers/net/ethernet/intel/ice/ice.h | 2 +
drivers/net/ethernet/intel/ice/ice_adapter.c | 87 ++++++++++++++++++++
drivers/net/ethernet/intel/ice/ice_adapter.h | 28 +++++++
drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
drivers/net/ethernet/intel/ice/ice_ptp.c | 33 +-------
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 3 +
7 files changed, 134 insertions(+), 30 deletions(-)
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
--
2.43.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Intel-wired-lan] [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading
@ 2024-03-06 16:29 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
Jacob Keller, Jakub Kicinski
This series removes the use of the heavy-weight PTP hardware semaphore
in the gettimex64 path. Instead, serialization of access to the time
register is done using a host-side spinlock. The timer hardware is
shared between PFs on the PCI adapter, so the spinlock must be shared
between ice_pf instances too.
Replacing the PTP hardware semaphore entirely with a mutex is also
possible and you can see it done in my git branch[1], but I am not
posting those patches yet to keep the scope of this series limited.
[1] https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
v2:
- Patch 1: Rely on xarray's own lock. (Suggested by Jiri Pirko)
- Patch 2: Do not use *_irqsave with ptp_gltsyn_time_lock, as it's used
only in process contexts.
Michal Schmidt (3):
ice: add ice_adapter for shared data across PFs on the same NIC
ice: avoid the PTP hardware semaphore in gettimex64 path
ice: fold ice_ptp_read_time into ice_ptp_gettimex64
drivers/net/ethernet/intel/ice/Makefile | 3 +-
drivers/net/ethernet/intel/ice/ice.h | 2 +
drivers/net/ethernet/intel/ice/ice_adapter.c | 87 ++++++++++++++++++++
drivers/net/ethernet/intel/ice/ice_adapter.h | 28 +++++++
drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
drivers/net/ethernet/intel/ice/ice_ptp.c | 33 +-------
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 3 +
7 files changed, 134 insertions(+), 30 deletions(-)
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
--
2.43.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29 ` Michal Schmidt
-1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
There is a need for synchronization between ice PFs on the same physical
adapter.
Add a "struct ice_adapter" for holding data shared between PFs of the
same multifunction PCI device. The struct is refcounted - each ice_pf
holds a reference to it.
Its first use will be for PTP. I expect it will be useful also to
improve the ugliness that is ice_prot_id_tbl.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/Makefile | 3 +-
drivers/net/ethernet/intel/ice/ice.h | 2 +
drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
5 files changed, 119 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index cddd82d4ca0f..4fa09c321440 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -36,7 +36,8 @@ ice-y := ice_main.o \
ice_repr.o \
ice_tc_lib.o \
ice_fwlog.o \
- ice_debugfs.o
+ ice_debugfs.o \
+ ice_adapter.o
ice-$(CONFIG_PCI_IOV) += \
ice_sriov.o \
ice_virtchnl.o \
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 365c03d1c462..1ffecbdd361a 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -77,6 +77,7 @@
#include "ice_gnss.h"
#include "ice_irq.h"
#include "ice_dpll.h"
+#include "ice_adapter.h"
#define ICE_BAR0 0
#define ICE_REQ_DESC_MULTIPLE 32
@@ -544,6 +545,7 @@ struct ice_agg_node {
struct ice_pf {
struct pci_dev *pdev;
+ struct ice_adapter *adapter;
struct devlink_region *nvm_region;
struct devlink_region *sram_region;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
new file mode 100644
index 000000000000..b93b4db4c04c
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// SPDX-FileCopyrightText: Copyright Red Hat
+
+#include <linux/cleanup.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+#include "ice_adapter.h"
+
+static DEFINE_XARRAY(ice_adapters);
+
+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
+{
+ unsigned int domain = pci_domain_nr(pdev->bus);
+
+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
+ return ((unsigned long)domain << 13) |
+ ((unsigned long)pdev->bus->number << 5) |
+ PCI_SLOT(pdev->devfn);
+}
+
+static struct ice_adapter *ice_adapter_new(void)
+{
+ struct ice_adapter *a;
+
+ a = kzalloc(sizeof(*a), GFP_KERNEL);
+ if (!a)
+ return NULL;
+
+ refcount_set(&a->refcount, 1);
+
+ return a;
+}
+
+static void ice_adapter_free(struct ice_adapter *a)
+{
+ kfree(a);
+}
+
+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
+{
+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
+ unsigned long index = ice_adapter_index(pdev);
+
+ a = ice_adapter_new();
+ if (!a)
+ return NULL;
+
+ xa_lock(&ice_adapters);
+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
+ if (xa_is_err(ret)) {
+ ret = NULL;
+ goto unlock;
+ }
+ if (ret) {
+ refcount_inc(&ret->refcount);
+ goto unlock;
+ }
+ ret = no_free_ptr(a);
+unlock:
+ xa_unlock(&ice_adapters);
+ return ret;
+}
+
+void ice_adapter_put(const struct pci_dev *pdev)
+{
+ unsigned long index = ice_adapter_index(pdev);
+ struct ice_adapter *a;
+
+ xa_lock(&ice_adapters);
+ a = xa_load(&ice_adapters, index);
+ if (WARN_ON(!a))
+ goto unlock;
+
+ if (!refcount_dec_and_test(&a->refcount))
+ goto unlock;
+
+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
+ ice_adapter_free(a);
+unlock:
+ xa_unlock(&ice_adapters);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
new file mode 100644
index 000000000000..cb5a02eb24c1
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* SPDX-FileCopyrightText: Copyright Red Hat */
+
+#ifndef _ICE_ADAPTER_H_
+#define _ICE_ADAPTER_H_
+
+#include <linux/refcount_types.h>
+
+struct pci_dev;
+
+/**
+ * struct ice_adapter - PCI adapter resources shared across PFs
+ * @refcount: Reference count. struct ice_pf objects hold the references.
+ */
+struct ice_adapter {
+ refcount_t refcount;
+};
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
+void ice_adapter_put(const struct pci_dev *pdev);
+
+#endif /* _ICE_ADAPTER_H */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8f73ba77e835..413219d81a12 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5093,6 +5093,7 @@ static int
ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
{
struct device *dev = &pdev->dev;
+ struct ice_adapter *adapter;
struct ice_pf *pf;
struct ice_hw *hw;
int err;
@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
pci_set_master(pdev);
+ adapter = ice_adapter_get(pdev);
+ if (!adapter)
+ return -ENOMEM;
+
pf->pdev = pdev;
+ pf->adapter = adapter;
pci_set_drvdata(pdev, pf);
set_bit(ICE_DOWN, pf->state);
/* Disable service task until DOWN bit is cleared */
@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
err_load:
ice_deinit(pf);
err_init:
+ ice_adapter_put(pdev);
pci_disable_device(pdev);
return err;
}
@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
ice_setup_mc_magic_wake(pf);
ice_set_wake(pf);
+ ice_adapter_put(pdev);
pci_disable_device(pdev);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 16:29 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
Jacob Keller, Jakub Kicinski
There is a need for synchronization between ice PFs on the same physical
adapter.
Add a "struct ice_adapter" for holding data shared between PFs of the
same multifunction PCI device. The struct is refcounted - each ice_pf
holds a reference to it.
Its first use will be for PTP. I expect it will be useful also to
improve the ugliness that is ice_prot_id_tbl.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/Makefile | 3 +-
drivers/net/ethernet/intel/ice/ice.h | 2 +
drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
5 files changed, 119 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index cddd82d4ca0f..4fa09c321440 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -36,7 +36,8 @@ ice-y := ice_main.o \
ice_repr.o \
ice_tc_lib.o \
ice_fwlog.o \
- ice_debugfs.o
+ ice_debugfs.o \
+ ice_adapter.o
ice-$(CONFIG_PCI_IOV) += \
ice_sriov.o \
ice_virtchnl.o \
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 365c03d1c462..1ffecbdd361a 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -77,6 +77,7 @@
#include "ice_gnss.h"
#include "ice_irq.h"
#include "ice_dpll.h"
+#include "ice_adapter.h"
#define ICE_BAR0 0
#define ICE_REQ_DESC_MULTIPLE 32
@@ -544,6 +545,7 @@ struct ice_agg_node {
struct ice_pf {
struct pci_dev *pdev;
+ struct ice_adapter *adapter;
struct devlink_region *nvm_region;
struct devlink_region *sram_region;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
new file mode 100644
index 000000000000..b93b4db4c04c
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// SPDX-FileCopyrightText: Copyright Red Hat
+
+#include <linux/cleanup.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+#include "ice_adapter.h"
+
+static DEFINE_XARRAY(ice_adapters);
+
+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
+{
+ unsigned int domain = pci_domain_nr(pdev->bus);
+
+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
+ return ((unsigned long)domain << 13) |
+ ((unsigned long)pdev->bus->number << 5) |
+ PCI_SLOT(pdev->devfn);
+}
+
+static struct ice_adapter *ice_adapter_new(void)
+{
+ struct ice_adapter *a;
+
+ a = kzalloc(sizeof(*a), GFP_KERNEL);
+ if (!a)
+ return NULL;
+
+ refcount_set(&a->refcount, 1);
+
+ return a;
+}
+
+static void ice_adapter_free(struct ice_adapter *a)
+{
+ kfree(a);
+}
+
+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
+{
+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
+ unsigned long index = ice_adapter_index(pdev);
+
+ a = ice_adapter_new();
+ if (!a)
+ return NULL;
+
+ xa_lock(&ice_adapters);
+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
+ if (xa_is_err(ret)) {
+ ret = NULL;
+ goto unlock;
+ }
+ if (ret) {
+ refcount_inc(&ret->refcount);
+ goto unlock;
+ }
+ ret = no_free_ptr(a);
+unlock:
+ xa_unlock(&ice_adapters);
+ return ret;
+}
+
+void ice_adapter_put(const struct pci_dev *pdev)
+{
+ unsigned long index = ice_adapter_index(pdev);
+ struct ice_adapter *a;
+
+ xa_lock(&ice_adapters);
+ a = xa_load(&ice_adapters, index);
+ if (WARN_ON(!a))
+ goto unlock;
+
+ if (!refcount_dec_and_test(&a->refcount))
+ goto unlock;
+
+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
+ ice_adapter_free(a);
+unlock:
+ xa_unlock(&ice_adapters);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
new file mode 100644
index 000000000000..cb5a02eb24c1
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* SPDX-FileCopyrightText: Copyright Red Hat */
+
+#ifndef _ICE_ADAPTER_H_
+#define _ICE_ADAPTER_H_
+
+#include <linux/refcount_types.h>
+
+struct pci_dev;
+
+/**
+ * struct ice_adapter - PCI adapter resources shared across PFs
+ * @refcount: Reference count. struct ice_pf objects hold the references.
+ */
+struct ice_adapter {
+ refcount_t refcount;
+};
+
+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
+void ice_adapter_put(const struct pci_dev *pdev);
+
+#endif /* _ICE_ADAPTER_H */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8f73ba77e835..413219d81a12 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5093,6 +5093,7 @@ static int
ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
{
struct device *dev = &pdev->dev;
+ struct ice_adapter *adapter;
struct ice_pf *pf;
struct ice_hw *hw;
int err;
@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
pci_set_master(pdev);
+ adapter = ice_adapter_get(pdev);
+ if (!adapter)
+ return -ENOMEM;
+
pf->pdev = pdev;
+ pf->adapter = adapter;
pci_set_drvdata(pdev, pf);
set_bit(ICE_DOWN, pf->state);
/* Disable service task until DOWN bit is cleared */
@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
err_load:
ice_deinit(pf);
err_init:
+ ice_adapter_put(pdev);
pci_disable_device(pdev);
return err;
}
@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
ice_setup_mc_magic_wake(pf);
ice_set_wake(pf);
+ ice_adapter_put(pdev);
pci_disable_device(pdev);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29 ` Michal Schmidt
-1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
operations that program the PTP timers. The operations involve issuing
commands to the sideband queue. The E810 does not have a hardware
sideband queue, so the admin queue is used. The admin queue is slow.
I have observed delays in hundreds of milliseconds waiting for
ice_sq_done.
When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
held by a task performing one of the slow operations, ice_ptp_lock can
easily time out. phc2sys gets -EBUSY and the kernel prints:
ice 0000:XX:YY.0: PTP failed to get time
These messages appear once every few seconds, causing log spam.
The E810 datasheet recommends an algorithm for reading the upper 64 bits
of the GLTSYN_TIME register. It matches what's implemented in
ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
necessarily against the concurrent setting of the register (with
GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
ice_ptp_gettimex64 also takes PFTSYN_SEM.
The race with time setters can be prevented without relying on the PTP
hardware semaphore. Using the "ice_adapter" from the previous patch,
we can have a common spinlock for the PFs that share the clock hardware.
It will protect the reading and writing to the GLTSYN_TIME register.
The writing is performed indirectly, by the hardware, as a result of
the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
updated, but it works well in my testing.
My test code can be seen here:
https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
It consists of:
- kernel threads reading the time in a busy loop and looking at the
deltas between consecutive values, reporting new maxima.
- a shell script that sets the time repeatedly;
- a bpftrace probe to produce a histogram of the measured deltas.
Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
Deltas in the [2G, 4G) range appear in the histograms.
With the spinlock added, there is no tearing and the biggest delta I saw
was in the range [1M, 2M), that is under 2 ms.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
drivers/net/ethernet/intel/ice/ice_ptp.c | 8 +-------
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 3 +++
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
index b93b4db4c04c..85fa2e1326af 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -5,6 +5,7 @@
#include <linux/mutex.h>
#include <linux/pci.h>
#include <linux/slab.h>
+#include <linux/spinlock.h>
#include <linux/xarray.h>
#include "ice_adapter.h"
@@ -28,6 +29,7 @@ static struct ice_adapter *ice_adapter_new(void)
if (!a)
return NULL;
+ spin_lock_init(&a->ptp_gltsyn_time_lock);
refcount_set(&a->refcount, 1);
return a;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
index cb5a02eb24c1..9d11014ec02f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.h
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -4,15 +4,21 @@
#ifndef _ICE_ADAPTER_H_
#define _ICE_ADAPTER_H_
+#include <linux/spinlock_types.h>
#include <linux/refcount_types.h>
struct pci_dev;
/**
* struct ice_adapter - PCI adapter resources shared across PFs
+ * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
+ * register of the PTP clock.
* @refcount: Reference count. struct ice_pf objects hold the references.
*/
struct ice_adapter {
+ /* For access to the GLTSYN_TIME register */
+ spinlock_t ptp_gltsyn_time_lock;
+
refcount_t refcount;
};
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index c11eba07283c..0875f37add78 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
u8 tmr_idx;
tmr_idx = ice_get_ptp_src_clock_index(hw);
+ guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
/* Read the system timestamp pre PHC read */
ptp_read_system_prets(sts);
@@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
struct ptp_system_timestamp *sts)
{
struct ice_pf *pf = ptp_info_to_pf(info);
- struct ice_hw *hw = &pf->hw;
-
- if (!ice_ptp_lock(hw)) {
- dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
- return -EBUSY;
- }
ice_ptp_read_time(pf, ts, sts);
- ice_ptp_unlock(hw);
return 0;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 187ce9b54e1a..2b9423a173bb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
*/
static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
{
+ struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+
+ guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
ice_flush(hw);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Intel-wired-lan] [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path
@ 2024-03-06 16:29 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
Jacob Keller, Jakub Kicinski
The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
operations that program the PTP timers. The operations involve issuing
commands to the sideband queue. The E810 does not have a hardware
sideband queue, so the admin queue is used. The admin queue is slow.
I have observed delays in hundreds of milliseconds waiting for
ice_sq_done.
When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
held by a task performing one of the slow operations, ice_ptp_lock can
easily time out. phc2sys gets -EBUSY and the kernel prints:
ice 0000:XX:YY.0: PTP failed to get time
These messages appear once every few seconds, causing log spam.
The E810 datasheet recommends an algorithm for reading the upper 64 bits
of the GLTSYN_TIME register. It matches what's implemented in
ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
necessarily against the concurrent setting of the register (with
GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
ice_ptp_gettimex64 also takes PFTSYN_SEM.
The race with time setters can be prevented without relying on the PTP
hardware semaphore. Using the "ice_adapter" from the previous patch,
we can have a common spinlock for the PFs that share the clock hardware.
It will protect the reading and writing to the GLTSYN_TIME register.
The writing is performed indirectly, by the hardware, as a result of
the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
updated, but it works well in my testing.
My test code can be seen here:
https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-8
It consists of:
- kernel threads reading the time in a busy loop and looking at the
deltas between consecutive values, reporting new maxima.
- a shell script that sets the time repeatedly;
- a bpftrace probe to produce a histogram of the measured deltas.
Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
Deltas in the [2G, 4G) range appear in the histograms.
With the spinlock added, there is no tearing and the biggest delta I saw
was in the range [1M, 2M), that is under 2 ms.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
drivers/net/ethernet/intel/ice/ice_ptp.c | 8 +-------
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 3 +++
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
index b93b4db4c04c..85fa2e1326af 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -5,6 +5,7 @@
#include <linux/mutex.h>
#include <linux/pci.h>
#include <linux/slab.h>
+#include <linux/spinlock.h>
#include <linux/xarray.h>
#include "ice_adapter.h"
@@ -28,6 +29,7 @@ static struct ice_adapter *ice_adapter_new(void)
if (!a)
return NULL;
+ spin_lock_init(&a->ptp_gltsyn_time_lock);
refcount_set(&a->refcount, 1);
return a;
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
index cb5a02eb24c1..9d11014ec02f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.h
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -4,15 +4,21 @@
#ifndef _ICE_ADAPTER_H_
#define _ICE_ADAPTER_H_
+#include <linux/spinlock_types.h>
#include <linux/refcount_types.h>
struct pci_dev;
/**
* struct ice_adapter - PCI adapter resources shared across PFs
+ * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
+ * register of the PTP clock.
* @refcount: Reference count. struct ice_pf objects hold the references.
*/
struct ice_adapter {
+ /* For access to the GLTSYN_TIME register */
+ spinlock_t ptp_gltsyn_time_lock;
+
refcount_t refcount;
};
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index c11eba07283c..0875f37add78 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
u8 tmr_idx;
tmr_idx = ice_get_ptp_src_clock_index(hw);
+ guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
/* Read the system timestamp pre PHC read */
ptp_read_system_prets(sts);
@@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
struct ptp_system_timestamp *sts)
{
struct ice_pf *pf = ptp_info_to_pf(info);
- struct ice_hw *hw = &pf->hw;
-
- if (!ice_ptp_lock(hw)) {
- dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
- return -EBUSY;
- }
ice_ptp_read_time(pf, ts, sts);
- ice_ptp_unlock(hw);
return 0;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 187ce9b54e1a..2b9423a173bb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
*/
static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
{
+ struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+
+ guard(spinlock)(&pf->adapter->ptp_gltsyn_time_lock);
wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
ice_flush(hw);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 16:29 ` Michal Schmidt
-1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, Jacob Keller, Jakub Kicinski, Jiri Pirko,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
This is a cleanup. It is unnecessary to have this function just to call
another function.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/ice_ptp.c | 25 +++---------------------
1 file changed, 3 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 0875f37add78..0f17fc1181d2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1166,26 +1166,6 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
ice_ptp_mark_tx_tracker_stale(&pf->ptp.port.tx);
}
-/**
- * ice_ptp_read_time - Read the time from the device
- * @pf: Board private structure
- * @ts: timespec structure to hold the current time value
- * @sts: Optional parameter for holding a pair of system timestamps from
- * the system clock. Will be ignored if NULL is given.
- *
- * This function reads the source clock registers and stores them in a timespec.
- * However, since the registers are 64 bits of nanoseconds, we must convert the
- * result to a timespec before we can return.
- */
-static void
-ice_ptp_read_time(struct ice_pf *pf, struct timespec64 *ts,
- struct ptp_system_timestamp *sts)
-{
- u64 time_ns = ice_ptp_read_src_clk_reg(pf, sts);
-
- *ts = ns_to_timespec64(time_ns);
-}
-
/**
* ice_ptp_write_init - Set PHC time to provided value
* @pf: Board private structure
@@ -1926,9 +1906,10 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
struct ptp_system_timestamp *sts)
{
struct ice_pf *pf = ptp_info_to_pf(info);
+ u64 time_ns;
- ice_ptp_read_time(pf, ts, sts);
-
+ time_ns = ice_ptp_read_src_clk_reg(pf, sts);
+ *ts = ns_to_timespec64(time_ns);
return 0;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Intel-wired-lan] [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64
@ 2024-03-06 16:29 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 16:29 UTC (permalink / raw)
To: intel-wired-lan
Cc: Jiri Pirko, netdev, Arkadiusz Kubalewski, Karol Kolacinski,
Jacob Keller, Jakub Kicinski
This is a cleanup. It is unnecessary to have this function just to call
another function.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/intel/ice/ice_ptp.c | 25 +++---------------------
1 file changed, 3 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 0875f37add78..0f17fc1181d2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1166,26 +1166,6 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
ice_ptp_mark_tx_tracker_stale(&pf->ptp.port.tx);
}
-/**
- * ice_ptp_read_time - Read the time from the device
- * @pf: Board private structure
- * @ts: timespec structure to hold the current time value
- * @sts: Optional parameter for holding a pair of system timestamps from
- * the system clock. Will be ignored if NULL is given.
- *
- * This function reads the source clock registers and stores them in a timespec.
- * However, since the registers are 64 bits of nanoseconds, we must convert the
- * result to a timespec before we can return.
- */
-static void
-ice_ptp_read_time(struct ice_pf *pf, struct timespec64 *ts,
- struct ptp_system_timestamp *sts)
-{
- u64 time_ns = ice_ptp_read_src_clk_reg(pf, sts);
-
- *ts = ns_to_timespec64(time_ns);
-}
-
/**
* ice_ptp_write_init - Set PHC time to provided value
* @pf: Board private structure
@@ -1926,9 +1906,10 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
struct ptp_system_timestamp *sts)
{
struct ice_pf *pf = ptp_info_to_pf(info);
+ u64 time_ns;
- ice_ptp_read_time(pf, ts, sts);
-
+ time_ns = ice_ptp_read_src_clk_reg(pf, sts);
+ *ts = ns_to_timespec64(time_ns);
return 0;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-06 17:00 ` Jiri Pirko
-1 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-06 17:00 UTC (permalink / raw)
To: Michal Schmidt
Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>There is a need for synchronization between ice PFs on the same physical
>adapter.
>
>Add a "struct ice_adapter" for holding data shared between PFs of the
>same multifunction PCI device. The struct is refcounted - each ice_pf
>holds a reference to it.
>
>Its first use will be for PTP. I expect it will be useful also to
>improve the ugliness that is ice_prot_id_tbl.
>
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/Makefile | 3 +-
> drivers/net/ethernet/intel/ice/ice.h | 2 +
> drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
> 5 files changed, 119 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>
>diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>index cddd82d4ca0f..4fa09c321440 100644
>--- a/drivers/net/ethernet/intel/ice/Makefile
>+++ b/drivers/net/ethernet/intel/ice/Makefile
>@@ -36,7 +36,8 @@ ice-y := ice_main.o \
> ice_repr.o \
> ice_tc_lib.o \
> ice_fwlog.o \
>- ice_debugfs.o
>+ ice_debugfs.o \
>+ ice_adapter.o
> ice-$(CONFIG_PCI_IOV) += \
> ice_sriov.o \
> ice_virtchnl.o \
>diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>index 365c03d1c462..1ffecbdd361a 100644
>--- a/drivers/net/ethernet/intel/ice/ice.h
>+++ b/drivers/net/ethernet/intel/ice/ice.h
>@@ -77,6 +77,7 @@
> #include "ice_gnss.h"
> #include "ice_irq.h"
> #include "ice_dpll.h"
>+#include "ice_adapter.h"
>
> #define ICE_BAR0 0
> #define ICE_REQ_DESC_MULTIPLE 32
>@@ -544,6 +545,7 @@ struct ice_agg_node {
>
> struct ice_pf {
> struct pci_dev *pdev;
>+ struct ice_adapter *adapter;
>
> struct devlink_region *nvm_region;
> struct devlink_region *sram_region;
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>new file mode 100644
>index 000000000000..b93b4db4c04c
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>@@ -0,0 +1,85 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+// SPDX-FileCopyrightText: Copyright Red Hat
>+
>+#include <linux/cleanup.h>
>+#include <linux/mutex.h>
>+#include <linux/pci.h>
>+#include <linux/slab.h>
>+#include <linux/xarray.h>
>+#include "ice_adapter.h"
>+
>+static DEFINE_XARRAY(ice_adapters);
>+
>+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>+{
>+ unsigned int domain = pci_domain_nr(pdev->bus);
>+
>+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>+ return ((unsigned long)domain << 13) |
>+ ((unsigned long)pdev->bus->number << 5) |
>+ PCI_SLOT(pdev->devfn);
>+}
>+
>+static struct ice_adapter *ice_adapter_new(void)
>+{
>+ struct ice_adapter *a;
>+
>+ a = kzalloc(sizeof(*a), GFP_KERNEL);
>+ if (!a)
>+ return NULL;
>+
>+ refcount_set(&a->refcount, 1);
>+
>+ return a;
>+}
>+
>+static void ice_adapter_free(struct ice_adapter *a)
>+{
>+ kfree(a);
>+}
>+
>+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>+{
>+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>+ unsigned long index = ice_adapter_index(pdev);
>+
>+ a = ice_adapter_new();
Please consider some non-single-letter variable name.
>+ if (!a)
>+ return NULL;
>+
>+ xa_lock(&ice_adapters);
>+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
This is atomic section, can't sleep.
>+ if (xa_is_err(ret)) {
>+ ret = NULL;
Why don't you propagate err through ERR_PTR() ?
>+ goto unlock;
>+ }
>+ if (ret) {
>+ refcount_inc(&ret->refcount);
>+ goto unlock;
>+ }
>+ ret = no_free_ptr(a);
>+unlock:
>+ xa_unlock(&ice_adapters);
>+ return ret;
>+}
>+
>+void ice_adapter_put(const struct pci_dev *pdev)
>+{
>+ unsigned long index = ice_adapter_index(pdev);
>+ struct ice_adapter *a;
>+
>+ xa_lock(&ice_adapters);
>+ a = xa_load(&ice_adapters, index);
>+ if (WARN_ON(!a))
>+ goto unlock;
>+
>+ if (!refcount_dec_and_test(&a->refcount))
>+ goto unlock;
>+
>+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
Nice paranoia level :)
>+ ice_adapter_free(a);
>+unlock:
>+ xa_unlock(&ice_adapters);
>+}
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>new file mode 100644
>index 000000000000..cb5a02eb24c1
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>@@ -0,0 +1,22 @@
>+/* SPDX-License-Identifier: GPL-2.0-only */
>+/* SPDX-FileCopyrightText: Copyright Red Hat */
>+
>+#ifndef _ICE_ADAPTER_H_
>+#define _ICE_ADAPTER_H_
>+
>+#include <linux/refcount_types.h>
>+
>+struct pci_dev;
>+
>+/**
>+ * struct ice_adapter - PCI adapter resources shared across PFs
>+ * @refcount: Reference count. struct ice_pf objects hold the references.
>+ */
>+struct ice_adapter {
>+ refcount_t refcount;
>+};
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>+void ice_adapter_put(const struct pci_dev *pdev);
>+
>+#endif /* _ICE_ADAPTER_H */
>diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>index 8f73ba77e835..413219d81a12 100644
>--- a/drivers/net/ethernet/intel/ice/ice_main.c
>+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>@@ -5093,6 +5093,7 @@ static int
> ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> {
> struct device *dev = &pdev->dev;
>+ struct ice_adapter *adapter;
> struct ice_pf *pf;
> struct ice_hw *hw;
> int err;
>@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>
> pci_set_master(pdev);
>
>+ adapter = ice_adapter_get(pdev);
>+ if (!adapter)
>+ return -ENOMEM;
>+
> pf->pdev = pdev;
>+ pf->adapter = adapter;
> pci_set_drvdata(pdev, pf);
> set_bit(ICE_DOWN, pf->state);
> /* Disable service task until DOWN bit is cleared */
>@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> err_load:
> ice_deinit(pf);
> err_init:
>+ ice_adapter_put(pdev);
> pci_disable_device(pdev);
> return err;
> }
>@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> ice_setup_mc_magic_wake(pf);
> ice_set_wake(pf);
>
>+ ice_adapter_put(pdev);
> pci_disable_device(pdev);
> }
>
>--
>2.43.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 17:00 ` Jiri Pirko
0 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-06 17:00 UTC (permalink / raw)
To: Michal Schmidt
Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
Jacob Keller, Jakub Kicinski
Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>There is a need for synchronization between ice PFs on the same physical
>adapter.
>
>Add a "struct ice_adapter" for holding data shared between PFs of the
>same multifunction PCI device. The struct is refcounted - each ice_pf
>holds a reference to it.
>
>Its first use will be for PTP. I expect it will be useful also to
>improve the ugliness that is ice_prot_id_tbl.
>
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/Makefile | 3 +-
> drivers/net/ethernet/intel/ice/ice.h | 2 +
> drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
> 5 files changed, 119 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>
>diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>index cddd82d4ca0f..4fa09c321440 100644
>--- a/drivers/net/ethernet/intel/ice/Makefile
>+++ b/drivers/net/ethernet/intel/ice/Makefile
>@@ -36,7 +36,8 @@ ice-y := ice_main.o \
> ice_repr.o \
> ice_tc_lib.o \
> ice_fwlog.o \
>- ice_debugfs.o
>+ ice_debugfs.o \
>+ ice_adapter.o
> ice-$(CONFIG_PCI_IOV) += \
> ice_sriov.o \
> ice_virtchnl.o \
>diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>index 365c03d1c462..1ffecbdd361a 100644
>--- a/drivers/net/ethernet/intel/ice/ice.h
>+++ b/drivers/net/ethernet/intel/ice/ice.h
>@@ -77,6 +77,7 @@
> #include "ice_gnss.h"
> #include "ice_irq.h"
> #include "ice_dpll.h"
>+#include "ice_adapter.h"
>
> #define ICE_BAR0 0
> #define ICE_REQ_DESC_MULTIPLE 32
>@@ -544,6 +545,7 @@ struct ice_agg_node {
>
> struct ice_pf {
> struct pci_dev *pdev;
>+ struct ice_adapter *adapter;
>
> struct devlink_region *nvm_region;
> struct devlink_region *sram_region;
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>new file mode 100644
>index 000000000000..b93b4db4c04c
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>@@ -0,0 +1,85 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+// SPDX-FileCopyrightText: Copyright Red Hat
>+
>+#include <linux/cleanup.h>
>+#include <linux/mutex.h>
>+#include <linux/pci.h>
>+#include <linux/slab.h>
>+#include <linux/xarray.h>
>+#include "ice_adapter.h"
>+
>+static DEFINE_XARRAY(ice_adapters);
>+
>+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>+{
>+ unsigned int domain = pci_domain_nr(pdev->bus);
>+
>+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>+ return ((unsigned long)domain << 13) |
>+ ((unsigned long)pdev->bus->number << 5) |
>+ PCI_SLOT(pdev->devfn);
>+}
>+
>+static struct ice_adapter *ice_adapter_new(void)
>+{
>+ struct ice_adapter *a;
>+
>+ a = kzalloc(sizeof(*a), GFP_KERNEL);
>+ if (!a)
>+ return NULL;
>+
>+ refcount_set(&a->refcount, 1);
>+
>+ return a;
>+}
>+
>+static void ice_adapter_free(struct ice_adapter *a)
>+{
>+ kfree(a);
>+}
>+
>+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>+{
>+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>+ unsigned long index = ice_adapter_index(pdev);
>+
>+ a = ice_adapter_new();
Please consider some non-single-letter variable name.
>+ if (!a)
>+ return NULL;
>+
>+ xa_lock(&ice_adapters);
>+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
This is atomic section, can't sleep.
>+ if (xa_is_err(ret)) {
>+ ret = NULL;
Why don't you propagate err through ERR_PTR() ?
>+ goto unlock;
>+ }
>+ if (ret) {
>+ refcount_inc(&ret->refcount);
>+ goto unlock;
>+ }
>+ ret = no_free_ptr(a);
>+unlock:
>+ xa_unlock(&ice_adapters);
>+ return ret;
>+}
>+
>+void ice_adapter_put(const struct pci_dev *pdev)
>+{
>+ unsigned long index = ice_adapter_index(pdev);
>+ struct ice_adapter *a;
>+
>+ xa_lock(&ice_adapters);
>+ a = xa_load(&ice_adapters, index);
>+ if (WARN_ON(!a))
>+ goto unlock;
>+
>+ if (!refcount_dec_and_test(&a->refcount))
>+ goto unlock;
>+
>+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
Nice paranoia level :)
>+ ice_adapter_free(a);
>+unlock:
>+ xa_unlock(&ice_adapters);
>+}
>diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>new file mode 100644
>index 000000000000..cb5a02eb24c1
>--- /dev/null
>+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>@@ -0,0 +1,22 @@
>+/* SPDX-License-Identifier: GPL-2.0-only */
>+/* SPDX-FileCopyrightText: Copyright Red Hat */
>+
>+#ifndef _ICE_ADAPTER_H_
>+#define _ICE_ADAPTER_H_
>+
>+#include <linux/refcount_types.h>
>+
>+struct pci_dev;
>+
>+/**
>+ * struct ice_adapter - PCI adapter resources shared across PFs
>+ * @refcount: Reference count. struct ice_pf objects hold the references.
>+ */
>+struct ice_adapter {
>+ refcount_t refcount;
>+};
>+
>+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>+void ice_adapter_put(const struct pci_dev *pdev);
>+
>+#endif /* _ICE_ADAPTER_H */
>diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>index 8f73ba77e835..413219d81a12 100644
>--- a/drivers/net/ethernet/intel/ice/ice_main.c
>+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>@@ -5093,6 +5093,7 @@ static int
> ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> {
> struct device *dev = &pdev->dev;
>+ struct ice_adapter *adapter;
> struct ice_pf *pf;
> struct ice_hw *hw;
> int err;
>@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>
> pci_set_master(pdev);
>
>+ adapter = ice_adapter_get(pdev);
>+ if (!adapter)
>+ return -ENOMEM;
>+
> pf->pdev = pdev;
>+ pf->adapter = adapter;
> pci_set_drvdata(pdev, pf);
> set_bit(ICE_DOWN, pf->state);
> /* Disable service task until DOWN bit is cleared */
>@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> err_load:
> ice_deinit(pf);
> err_init:
>+ ice_adapter_put(pdev);
> pci_disable_device(pdev);
> return err;
> }
>@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> ice_setup_mc_magic_wake(pf);
> ice_set_wake(pf);
>
>+ ice_adapter_put(pdev);
> pci_disable_device(pdev);
> }
>
>--
>2.43.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
2024-03-06 17:00 ` [Intel-wired-lan] " Jiri Pirko
@ 2024-03-06 19:20 ` Michal Schmidt
-1 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 19:20 UTC (permalink / raw)
To: Jiri Pirko
Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
> >There is a need for synchronization between ice PFs on the same physical
> >adapter.
> >
> >Add a "struct ice_adapter" for holding data shared between PFs of the
> >same multifunction PCI device. The struct is refcounted - each ice_pf
> >holds a reference to it.
> >
> >Its first use will be for PTP. I expect it will be useful also to
> >improve the ugliness that is ice_prot_id_tbl.
> >
> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> >---
> > drivers/net/ethernet/intel/ice/Makefile | 3 +-
> > drivers/net/ethernet/intel/ice/ice.h | 2 +
> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> > drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
> > 5 files changed, 119 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
> >
> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
> >index cddd82d4ca0f..4fa09c321440 100644
> >--- a/drivers/net/ethernet/intel/ice/Makefile
> >+++ b/drivers/net/ethernet/intel/ice/Makefile
> >@@ -36,7 +36,8 @@ ice-y := ice_main.o \
> > ice_repr.o \
> > ice_tc_lib.o \
> > ice_fwlog.o \
> >- ice_debugfs.o
> >+ ice_debugfs.o \
> >+ ice_adapter.o
> > ice-$(CONFIG_PCI_IOV) += \
> > ice_sriov.o \
> > ice_virtchnl.o \
> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> >index 365c03d1c462..1ffecbdd361a 100644
> >--- a/drivers/net/ethernet/intel/ice/ice.h
> >+++ b/drivers/net/ethernet/intel/ice/ice.h
> >@@ -77,6 +77,7 @@
> > #include "ice_gnss.h"
> > #include "ice_irq.h"
> > #include "ice_dpll.h"
> >+#include "ice_adapter.h"
> >
> > #define ICE_BAR0 0
> > #define ICE_REQ_DESC_MULTIPLE 32
> >@@ -544,6 +545,7 @@ struct ice_agg_node {
> >
> > struct ice_pf {
> > struct pci_dev *pdev;
> >+ struct ice_adapter *adapter;
> >
> > struct devlink_region *nvm_region;
> > struct devlink_region *sram_region;
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >new file mode 100644
> >index 000000000000..b93b4db4c04c
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >@@ -0,0 +1,85 @@
> >+// SPDX-License-Identifier: GPL-2.0-only
> >+// SPDX-FileCopyrightText: Copyright Red Hat
> >+
> >+#include <linux/cleanup.h>
> >+#include <linux/mutex.h>
> >+#include <linux/pci.h>
> >+#include <linux/slab.h>
> >+#include <linux/xarray.h>
> >+#include "ice_adapter.h"
> >+
> >+static DEFINE_XARRAY(ice_adapters);
> >+
> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
> >+{
> >+ unsigned int domain = pci_domain_nr(pdev->bus);
> >+
> >+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
> >+ return ((unsigned long)domain << 13) |
> >+ ((unsigned long)pdev->bus->number << 5) |
> >+ PCI_SLOT(pdev->devfn);
> >+}
> >+
> >+static struct ice_adapter *ice_adapter_new(void)
> >+{
> >+ struct ice_adapter *a;
> >+
> >+ a = kzalloc(sizeof(*a), GFP_KERNEL);
> >+ if (!a)
> >+ return NULL;
> >+
> >+ refcount_set(&a->refcount, 1);
> >+
> >+ return a;
> >+}
> >+
> >+static void ice_adapter_free(struct ice_adapter *a)
> >+{
> >+ kfree(a);
> >+}
> >+
> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
> >+{
> >+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
> >+ unsigned long index = ice_adapter_index(pdev);
> >+
> >+ a = ice_adapter_new();
>
> Please consider some non-single-letter variable name.
Alright, I can change the name.
> >+ if (!a)
> >+ return NULL;
> >+
> >+ xa_lock(&ice_adapters);
> >+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>
> This is atomic section, can't sleep.
It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
memory, then reacquires it.
> >+ if (xa_is_err(ret)) {
> >+ ret = NULL;
>
> Why don't you propagate err through ERR_PTR() ?
It seemed unnecessary. ENOMEM is the only failure that can possibly
happen. EINVAL could be returned only if attempting to store an
unaligned pointer, which won't happen here.
>
> >+ goto unlock;
> >+ }
> >+ if (ret) {
> >+ refcount_inc(&ret->refcount);
> >+ goto unlock;
> >+ }
> >+ ret = no_free_ptr(a);
> >+unlock:
> >+ xa_unlock(&ice_adapters);
> >+ return ret;
> >+}
> >+
> >+void ice_adapter_put(const struct pci_dev *pdev)
> >+{
> >+ unsigned long index = ice_adapter_index(pdev);
> >+ struct ice_adapter *a;
> >+
> >+ xa_lock(&ice_adapters);
> >+ a = xa_load(&ice_adapters, index);
> >+ if (WARN_ON(!a))
> >+ goto unlock;
> >+
> >+ if (!refcount_dec_and_test(&a->refcount))
> >+ goto unlock;
> >+
> >+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
>
> Nice paranoia level :)
>
>
> >+ ice_adapter_free(a);
> >+unlock:
> >+ xa_unlock(&ice_adapters);
> >+}
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >new file mode 100644
> >index 000000000000..cb5a02eb24c1
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >@@ -0,0 +1,22 @@
> >+/* SPDX-License-Identifier: GPL-2.0-only */
> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
> >+
> >+#ifndef _ICE_ADAPTER_H_
> >+#define _ICE_ADAPTER_H_
> >+
> >+#include <linux/refcount_types.h>
> >+
> >+struct pci_dev;
> >+
> >+/**
> >+ * struct ice_adapter - PCI adapter resources shared across PFs
> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
> >+ */
> >+struct ice_adapter {
> >+ refcount_t refcount;
> >+};
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
> >+void ice_adapter_put(const struct pci_dev *pdev);
> >+
> >+#endif /* _ICE_ADAPTER_H */
> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> >index 8f73ba77e835..413219d81a12 100644
> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
> >@@ -5093,6 +5093,7 @@ static int
> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > {
> > struct device *dev = &pdev->dev;
> >+ struct ice_adapter *adapter;
> > struct ice_pf *pf;
> > struct ice_hw *hw;
> > int err;
> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> >
> > pci_set_master(pdev);
> >
> >+ adapter = ice_adapter_get(pdev);
> >+ if (!adapter)
> >+ return -ENOMEM;
> >+
> > pf->pdev = pdev;
> >+ pf->adapter = adapter;
> > pci_set_drvdata(pdev, pf);
> > set_bit(ICE_DOWN, pf->state);
> > /* Disable service task until DOWN bit is cleared */
> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > err_load:
> > ice_deinit(pf);
> > err_init:
> >+ ice_adapter_put(pdev);
> > pci_disable_device(pdev);
> > return err;
> > }
> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> > ice_setup_mc_magic_wake(pf);
> > ice_set_wake(pf);
> >
> >+ ice_adapter_put(pdev);
> > pci_disable_device(pdev);
> > }
> >
> >--
> >2.43.2
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-06 19:20 ` Michal Schmidt
0 siblings, 0 replies; 14+ messages in thread
From: Michal Schmidt @ 2024-03-06 19:20 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
Jacob Keller, Jakub Kicinski
On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
> >There is a need for synchronization between ice PFs on the same physical
> >adapter.
> >
> >Add a "struct ice_adapter" for holding data shared between PFs of the
> >same multifunction PCI device. The struct is refcounted - each ice_pf
> >holds a reference to it.
> >
> >Its first use will be for PTP. I expect it will be useful also to
> >improve the ugliness that is ice_prot_id_tbl.
> >
> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> >---
> > drivers/net/ethernet/intel/ice/Makefile | 3 +-
> > drivers/net/ethernet/intel/ice/ice.h | 2 +
> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
> > drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
> > 5 files changed, 119 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
> >
> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
> >index cddd82d4ca0f..4fa09c321440 100644
> >--- a/drivers/net/ethernet/intel/ice/Makefile
> >+++ b/drivers/net/ethernet/intel/ice/Makefile
> >@@ -36,7 +36,8 @@ ice-y := ice_main.o \
> > ice_repr.o \
> > ice_tc_lib.o \
> > ice_fwlog.o \
> >- ice_debugfs.o
> >+ ice_debugfs.o \
> >+ ice_adapter.o
> > ice-$(CONFIG_PCI_IOV) += \
> > ice_sriov.o \
> > ice_virtchnl.o \
> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> >index 365c03d1c462..1ffecbdd361a 100644
> >--- a/drivers/net/ethernet/intel/ice/ice.h
> >+++ b/drivers/net/ethernet/intel/ice/ice.h
> >@@ -77,6 +77,7 @@
> > #include "ice_gnss.h"
> > #include "ice_irq.h"
> > #include "ice_dpll.h"
> >+#include "ice_adapter.h"
> >
> > #define ICE_BAR0 0
> > #define ICE_REQ_DESC_MULTIPLE 32
> >@@ -544,6 +545,7 @@ struct ice_agg_node {
> >
> > struct ice_pf {
> > struct pci_dev *pdev;
> >+ struct ice_adapter *adapter;
> >
> > struct devlink_region *nvm_region;
> > struct devlink_region *sram_region;
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >new file mode 100644
> >index 000000000000..b93b4db4c04c
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> >@@ -0,0 +1,85 @@
> >+// SPDX-License-Identifier: GPL-2.0-only
> >+// SPDX-FileCopyrightText: Copyright Red Hat
> >+
> >+#include <linux/cleanup.h>
> >+#include <linux/mutex.h>
> >+#include <linux/pci.h>
> >+#include <linux/slab.h>
> >+#include <linux/xarray.h>
> >+#include "ice_adapter.h"
> >+
> >+static DEFINE_XARRAY(ice_adapters);
> >+
> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
> >+{
> >+ unsigned int domain = pci_domain_nr(pdev->bus);
> >+
> >+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
> >+ return ((unsigned long)domain << 13) |
> >+ ((unsigned long)pdev->bus->number << 5) |
> >+ PCI_SLOT(pdev->devfn);
> >+}
> >+
> >+static struct ice_adapter *ice_adapter_new(void)
> >+{
> >+ struct ice_adapter *a;
> >+
> >+ a = kzalloc(sizeof(*a), GFP_KERNEL);
> >+ if (!a)
> >+ return NULL;
> >+
> >+ refcount_set(&a->refcount, 1);
> >+
> >+ return a;
> >+}
> >+
> >+static void ice_adapter_free(struct ice_adapter *a)
> >+{
> >+ kfree(a);
> >+}
> >+
> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
> >+{
> >+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
> >+ unsigned long index = ice_adapter_index(pdev);
> >+
> >+ a = ice_adapter_new();
>
> Please consider some non-single-letter variable name.
Alright, I can change the name.
> >+ if (!a)
> >+ return NULL;
> >+
> >+ xa_lock(&ice_adapters);
> >+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>
> This is atomic section, can't sleep.
It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
memory, then reacquires it.
> >+ if (xa_is_err(ret)) {
> >+ ret = NULL;
>
> Why don't you propagate err through ERR_PTR() ?
It seemed unnecessary. ENOMEM is the only failure that can possibly
happen. EINVAL could be returned only if attempting to store an
unaligned pointer, which won't happen here.
>
> >+ goto unlock;
> >+ }
> >+ if (ret) {
> >+ refcount_inc(&ret->refcount);
> >+ goto unlock;
> >+ }
> >+ ret = no_free_ptr(a);
> >+unlock:
> >+ xa_unlock(&ice_adapters);
> >+ return ret;
> >+}
> >+
> >+void ice_adapter_put(const struct pci_dev *pdev)
> >+{
> >+ unsigned long index = ice_adapter_index(pdev);
> >+ struct ice_adapter *a;
> >+
> >+ xa_lock(&ice_adapters);
> >+ a = xa_load(&ice_adapters, index);
> >+ if (WARN_ON(!a))
> >+ goto unlock;
> >+
> >+ if (!refcount_dec_and_test(&a->refcount))
> >+ goto unlock;
> >+
> >+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
>
> Nice paranoia level :)
>
>
> >+ ice_adapter_free(a);
> >+unlock:
> >+ xa_unlock(&ice_adapters);
> >+}
> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >new file mode 100644
> >index 000000000000..cb5a02eb24c1
> >--- /dev/null
> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> >@@ -0,0 +1,22 @@
> >+/* SPDX-License-Identifier: GPL-2.0-only */
> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
> >+
> >+#ifndef _ICE_ADAPTER_H_
> >+#define _ICE_ADAPTER_H_
> >+
> >+#include <linux/refcount_types.h>
> >+
> >+struct pci_dev;
> >+
> >+/**
> >+ * struct ice_adapter - PCI adapter resources shared across PFs
> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
> >+ */
> >+struct ice_adapter {
> >+ refcount_t refcount;
> >+};
> >+
> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
> >+void ice_adapter_put(const struct pci_dev *pdev);
> >+
> >+#endif /* _ICE_ADAPTER_H */
> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> >index 8f73ba77e835..413219d81a12 100644
> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
> >@@ -5093,6 +5093,7 @@ static int
> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > {
> > struct device *dev = &pdev->dev;
> >+ struct ice_adapter *adapter;
> > struct ice_pf *pf;
> > struct ice_hw *hw;
> > int err;
> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> >
> > pci_set_master(pdev);
> >
> >+ adapter = ice_adapter_get(pdev);
> >+ if (!adapter)
> >+ return -ENOMEM;
> >+
> > pf->pdev = pdev;
> >+ pf->adapter = adapter;
> > pci_set_drvdata(pdev, pf);
> > set_bit(ICE_DOWN, pf->state);
> > /* Disable service task until DOWN bit is cleared */
> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> > err_load:
> > ice_deinit(pf);
> > err_init:
> >+ ice_adapter_put(pdev);
> > pci_disable_device(pdev);
> > return err;
> > }
> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
> > ice_setup_mc_magic_wake(pf);
> > ice_set_wake(pf);
> >
> >+ ice_adapter_put(pdev);
> > pci_disable_device(pdev);
> > }
> >
> >--
> >2.43.2
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
2024-03-06 19:20 ` [Intel-wired-lan] " Michal Schmidt
@ 2024-03-07 8:27 ` Jiri Pirko
-1 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-07 8:27 UTC (permalink / raw)
To: Michal Schmidt
Cc: intel-wired-lan, netdev, Jacob Keller, Jakub Kicinski,
Jesse Brandeburg, Arkadiusz Kubalewski, Karol Kolacinski
Wed, Mar 06, 2024 at 08:20:33PM CET, mschmidt@redhat.com wrote:
>On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>> >There is a need for synchronization between ice PFs on the same physical
>> >adapter.
>> >
>> >Add a "struct ice_adapter" for holding data shared between PFs of the
>> >same multifunction PCI device. The struct is refcounted - each ice_pf
>> >holds a reference to it.
>> >
>> >Its first use will be for PTP. I expect it will be useful also to
>> >improve the ugliness that is ice_prot_id_tbl.
>> >
>> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> >---
>> > drivers/net/ethernet/intel/ice/Makefile | 3 +-
>> > drivers/net/ethernet/intel/ice/ice.h | 2 +
>> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
>> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
>> > drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
>> > 5 files changed, 119 insertions(+), 1 deletion(-)
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>> >
>> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>> >index cddd82d4ca0f..4fa09c321440 100644
>> >--- a/drivers/net/ethernet/intel/ice/Makefile
>> >+++ b/drivers/net/ethernet/intel/ice/Makefile
>> >@@ -36,7 +36,8 @@ ice-y := ice_main.o \
>> > ice_repr.o \
>> > ice_tc_lib.o \
>> > ice_fwlog.o \
>> >- ice_debugfs.o
>> >+ ice_debugfs.o \
>> >+ ice_adapter.o
>> > ice-$(CONFIG_PCI_IOV) += \
>> > ice_sriov.o \
>> > ice_virtchnl.o \
>> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>> >index 365c03d1c462..1ffecbdd361a 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice.h
>> >+++ b/drivers/net/ethernet/intel/ice/ice.h
>> >@@ -77,6 +77,7 @@
>> > #include "ice_gnss.h"
>> > #include "ice_irq.h"
>> > #include "ice_dpll.h"
>> >+#include "ice_adapter.h"
>> >
>> > #define ICE_BAR0 0
>> > #define ICE_REQ_DESC_MULTIPLE 32
>> >@@ -544,6 +545,7 @@ struct ice_agg_node {
>> >
>> > struct ice_pf {
>> > struct pci_dev *pdev;
>> >+ struct ice_adapter *adapter;
>> >
>> > struct devlink_region *nvm_region;
>> > struct devlink_region *sram_region;
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >new file mode 100644
>> >index 000000000000..b93b4db4c04c
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >@@ -0,0 +1,85 @@
>> >+// SPDX-License-Identifier: GPL-2.0-only
>> >+// SPDX-FileCopyrightText: Copyright Red Hat
>> >+
>> >+#include <linux/cleanup.h>
>> >+#include <linux/mutex.h>
>> >+#include <linux/pci.h>
>> >+#include <linux/slab.h>
>> >+#include <linux/xarray.h>
>> >+#include "ice_adapter.h"
>> >+
>> >+static DEFINE_XARRAY(ice_adapters);
>> >+
>> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>> >+{
>> >+ unsigned int domain = pci_domain_nr(pdev->bus);
>> >+
>> >+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>> >+ return ((unsigned long)domain << 13) |
>> >+ ((unsigned long)pdev->bus->number << 5) |
>> >+ PCI_SLOT(pdev->devfn);
>> >+}
>> >+
>> >+static struct ice_adapter *ice_adapter_new(void)
>> >+{
>> >+ struct ice_adapter *a;
>> >+
>> >+ a = kzalloc(sizeof(*a), GFP_KERNEL);
>> >+ if (!a)
>> >+ return NULL;
>> >+
>> >+ refcount_set(&a->refcount, 1);
>> >+
>> >+ return a;
>> >+}
>> >+
>> >+static void ice_adapter_free(struct ice_adapter *a)
>> >+{
>> >+ kfree(a);
>> >+}
>> >+
>> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>> >+{
>> >+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>> >+ unsigned long index = ice_adapter_index(pdev);
>> >+
>> >+ a = ice_adapter_new();
>>
>> Please consider some non-single-letter variable name.
>
>Alright, I can change the name.
>
>> >+ if (!a)
>> >+ return NULL;
>> >+
>> >+ xa_lock(&ice_adapters);
>> >+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>>
>> This is atomic section, can't sleep.
>
>It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
>memory, then reacquires it.
Ah, cool.
>
>> >+ if (xa_is_err(ret)) {
>> >+ ret = NULL;
>>
>> Why don't you propagate err through ERR_PTR() ?
>
>It seemed unnecessary. ENOMEM is the only failure that can possibly
>happen. EINVAL could be returned only if attempting to store an
>unaligned pointer, which won't happen here.
Yeah, the point is that you have valid err, you toss it out, the caller
then does:
adapter = ice_adapter_get(pdev);
if (!adapter)
return -ENOMEM;
And reinvents err. So my point was to propagate it through.
>
>>
>> >+ goto unlock;
>> >+ }
>> >+ if (ret) {
>> >+ refcount_inc(&ret->refcount);
>> >+ goto unlock;
>> >+ }
>> >+ ret = no_free_ptr(a);
>> >+unlock:
>> >+ xa_unlock(&ice_adapters);
>> >+ return ret;
>> >+}
>> >+
>> >+void ice_adapter_put(const struct pci_dev *pdev)
>> >+{
>> >+ unsigned long index = ice_adapter_index(pdev);
>> >+ struct ice_adapter *a;
>> >+
>> >+ xa_lock(&ice_adapters);
>> >+ a = xa_load(&ice_adapters, index);
>> >+ if (WARN_ON(!a))
>> >+ goto unlock;
>> >+
>> >+ if (!refcount_dec_and_test(&a->refcount))
>> >+ goto unlock;
>> >+
>> >+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
>>
>> Nice paranoia level :)
>>
>>
>> >+ ice_adapter_free(a);
>> >+unlock:
>> >+ xa_unlock(&ice_adapters);
>> >+}
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >new file mode 100644
>> >index 000000000000..cb5a02eb24c1
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >@@ -0,0 +1,22 @@
>> >+/* SPDX-License-Identifier: GPL-2.0-only */
>> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
>> >+
>> >+#ifndef _ICE_ADAPTER_H_
>> >+#define _ICE_ADAPTER_H_
>> >+
>> >+#include <linux/refcount_types.h>
>> >+
>> >+struct pci_dev;
>> >+
>> >+/**
>> >+ * struct ice_adapter - PCI adapter resources shared across PFs
>> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
>> >+ */
>> >+struct ice_adapter {
>> >+ refcount_t refcount;
>> >+};
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>> >+void ice_adapter_put(const struct pci_dev *pdev);
>> >+
>> >+#endif /* _ICE_ADAPTER_H */
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> >index 8f73ba77e835..413219d81a12 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
>> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> >@@ -5093,6 +5093,7 @@ static int
>> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > {
>> > struct device *dev = &pdev->dev;
>> >+ struct ice_adapter *adapter;
>> > struct ice_pf *pf;
>> > struct ice_hw *hw;
>> > int err;
>> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> >
>> > pci_set_master(pdev);
>> >
>> >+ adapter = ice_adapter_get(pdev);
>> >+ if (!adapter)
>> >+ return -ENOMEM;
>> >+
>> > pf->pdev = pdev;
>> >+ pf->adapter = adapter;
>> > pci_set_drvdata(pdev, pf);
>> > set_bit(ICE_DOWN, pf->state);
>> > /* Disable service task until DOWN bit is cleared */
>> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > err_load:
>> > ice_deinit(pf);
>> > err_init:
>> >+ ice_adapter_put(pdev);
>> > pci_disable_device(pdev);
>> > return err;
>> > }
>> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
>> > ice_setup_mc_magic_wake(pf);
>> > ice_set_wake(pf);
>> >
>> >+ ice_adapter_put(pdev);
>> > pci_disable_device(pdev);
>> > }
>> >
>> >--
>> >2.43.2
>> >
>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC
@ 2024-03-07 8:27 ` Jiri Pirko
0 siblings, 0 replies; 14+ messages in thread
From: Jiri Pirko @ 2024-03-07 8:27 UTC (permalink / raw)
To: Michal Schmidt
Cc: netdev, Arkadiusz Kubalewski, Karol Kolacinski, intel-wired-lan,
Jacob Keller, Jakub Kicinski
Wed, Mar 06, 2024 at 08:20:33PM CET, mschmidt@redhat.com wrote:
>On Wed, Mar 6, 2024 at 6:00 PM Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Mar 06, 2024 at 05:29:05PM CET, mschmidt@redhat.com wrote:
>> >There is a need for synchronization between ice PFs on the same physical
>> >adapter.
>> >
>> >Add a "struct ice_adapter" for holding data shared between PFs of the
>> >same multifunction PCI device. The struct is refcounted - each ice_pf
>> >holds a reference to it.
>> >
>> >Its first use will be for PTP. I expect it will be useful also to
>> >improve the ugliness that is ice_prot_id_tbl.
>> >
>> >Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> >---
>> > drivers/net/ethernet/intel/ice/Makefile | 3 +-
>> > drivers/net/ethernet/intel/ice/ice.h | 2 +
>> > drivers/net/ethernet/intel/ice/ice_adapter.c | 85 ++++++++++++++++++++
>> > drivers/net/ethernet/intel/ice/ice_adapter.h | 22 +++++
>> > drivers/net/ethernet/intel/ice/ice_main.c | 8 ++
>> > 5 files changed, 119 insertions(+), 1 deletion(-)
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.c
>> > create mode 100644 drivers/net/ethernet/intel/ice/ice_adapter.h
>> >
>> >diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
>> >index cddd82d4ca0f..4fa09c321440 100644
>> >--- a/drivers/net/ethernet/intel/ice/Makefile
>> >+++ b/drivers/net/ethernet/intel/ice/Makefile
>> >@@ -36,7 +36,8 @@ ice-y := ice_main.o \
>> > ice_repr.o \
>> > ice_tc_lib.o \
>> > ice_fwlog.o \
>> >- ice_debugfs.o
>> >+ ice_debugfs.o \
>> >+ ice_adapter.o
>> > ice-$(CONFIG_PCI_IOV) += \
>> > ice_sriov.o \
>> > ice_virtchnl.o \
>> >diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
>> >index 365c03d1c462..1ffecbdd361a 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice.h
>> >+++ b/drivers/net/ethernet/intel/ice/ice.h
>> >@@ -77,6 +77,7 @@
>> > #include "ice_gnss.h"
>> > #include "ice_irq.h"
>> > #include "ice_dpll.h"
>> >+#include "ice_adapter.h"
>> >
>> > #define ICE_BAR0 0
>> > #define ICE_REQ_DESC_MULTIPLE 32
>> >@@ -544,6 +545,7 @@ struct ice_agg_node {
>> >
>> > struct ice_pf {
>> > struct pci_dev *pdev;
>> >+ struct ice_adapter *adapter;
>> >
>> > struct devlink_region *nvm_region;
>> > struct devlink_region *sram_region;
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >new file mode 100644
>> >index 000000000000..b93b4db4c04c
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>> >@@ -0,0 +1,85 @@
>> >+// SPDX-License-Identifier: GPL-2.0-only
>> >+// SPDX-FileCopyrightText: Copyright Red Hat
>> >+
>> >+#include <linux/cleanup.h>
>> >+#include <linux/mutex.h>
>> >+#include <linux/pci.h>
>> >+#include <linux/slab.h>
>> >+#include <linux/xarray.h>
>> >+#include "ice_adapter.h"
>> >+
>> >+static DEFINE_XARRAY(ice_adapters);
>> >+
>> >+static unsigned long ice_adapter_index(const struct pci_dev *pdev)
>> >+{
>> >+ unsigned int domain = pci_domain_nr(pdev->bus);
>> >+
>> >+ WARN_ON((unsigned long)domain >> (BITS_PER_LONG - 13));
>> >+ return ((unsigned long)domain << 13) |
>> >+ ((unsigned long)pdev->bus->number << 5) |
>> >+ PCI_SLOT(pdev->devfn);
>> >+}
>> >+
>> >+static struct ice_adapter *ice_adapter_new(void)
>> >+{
>> >+ struct ice_adapter *a;
>> >+
>> >+ a = kzalloc(sizeof(*a), GFP_KERNEL);
>> >+ if (!a)
>> >+ return NULL;
>> >+
>> >+ refcount_set(&a->refcount, 1);
>> >+
>> >+ return a;
>> >+}
>> >+
>> >+static void ice_adapter_free(struct ice_adapter *a)
>> >+{
>> >+ kfree(a);
>> >+}
>> >+
>> >+DEFINE_FREE(ice_adapter_free, struct ice_adapter*, if (_T) ice_adapter_free(_T))
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>> >+{
>> >+ struct ice_adapter *ret, __free(ice_adapter_free) *a = NULL;
>> >+ unsigned long index = ice_adapter_index(pdev);
>> >+
>> >+ a = ice_adapter_new();
>>
>> Please consider some non-single-letter variable name.
>
>Alright, I can change the name.
>
>> >+ if (!a)
>> >+ return NULL;
>> >+
>> >+ xa_lock(&ice_adapters);
>> >+ ret = __xa_cmpxchg(&ice_adapters, index, NULL, a, GFP_KERNEL);
>>
>> This is atomic section, can't sleep.
>
>It is not atomic. __xa_cmpxchg releases xa_lock before it allocates
>memory, then reacquires it.
Ah, cool.
>
>> >+ if (xa_is_err(ret)) {
>> >+ ret = NULL;
>>
>> Why don't you propagate err through ERR_PTR() ?
>
>It seemed unnecessary. ENOMEM is the only failure that can possibly
>happen. EINVAL could be returned only if attempting to store an
>unaligned pointer, which won't happen here.
Yeah, the point is that you have valid err, you toss it out, the caller
then does:
adapter = ice_adapter_get(pdev);
if (!adapter)
return -ENOMEM;
And reinvents err. So my point was to propagate it through.
>
>>
>> >+ goto unlock;
>> >+ }
>> >+ if (ret) {
>> >+ refcount_inc(&ret->refcount);
>> >+ goto unlock;
>> >+ }
>> >+ ret = no_free_ptr(a);
>> >+unlock:
>> >+ xa_unlock(&ice_adapters);
>> >+ return ret;
>> >+}
>> >+
>> >+void ice_adapter_put(const struct pci_dev *pdev)
>> >+{
>> >+ unsigned long index = ice_adapter_index(pdev);
>> >+ struct ice_adapter *a;
>> >+
>> >+ xa_lock(&ice_adapters);
>> >+ a = xa_load(&ice_adapters, index);
>> >+ if (WARN_ON(!a))
>> >+ goto unlock;
>> >+
>> >+ if (!refcount_dec_and_test(&a->refcount))
>> >+ goto unlock;
>> >+
>> >+ WARN_ON(__xa_erase(&ice_adapters, index) != a);
>>
>> Nice paranoia level :)
>>
>>
>> >+ ice_adapter_free(a);
>> >+unlock:
>> >+ xa_unlock(&ice_adapters);
>> >+}
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >new file mode 100644
>> >index 000000000000..cb5a02eb24c1
>> >--- /dev/null
>> >+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>> >@@ -0,0 +1,22 @@
>> >+/* SPDX-License-Identifier: GPL-2.0-only */
>> >+/* SPDX-FileCopyrightText: Copyright Red Hat */
>> >+
>> >+#ifndef _ICE_ADAPTER_H_
>> >+#define _ICE_ADAPTER_H_
>> >+
>> >+#include <linux/refcount_types.h>
>> >+
>> >+struct pci_dev;
>> >+
>> >+/**
>> >+ * struct ice_adapter - PCI adapter resources shared across PFs
>> >+ * @refcount: Reference count. struct ice_pf objects hold the references.
>> >+ */
>> >+struct ice_adapter {
>> >+ refcount_t refcount;
>> >+};
>> >+
>> >+struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev);
>> >+void ice_adapter_put(const struct pci_dev *pdev);
>> >+
>> >+#endif /* _ICE_ADAPTER_H */
>> >diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> >index 8f73ba77e835..413219d81a12 100644
>> >--- a/drivers/net/ethernet/intel/ice/ice_main.c
>> >+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> >@@ -5093,6 +5093,7 @@ static int
>> > ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > {
>> > struct device *dev = &pdev->dev;
>> >+ struct ice_adapter *adapter;
>> > struct ice_pf *pf;
>> > struct ice_hw *hw;
>> > int err;
>> >@@ -5145,7 +5146,12 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> >
>> > pci_set_master(pdev);
>> >
>> >+ adapter = ice_adapter_get(pdev);
>> >+ if (!adapter)
>> >+ return -ENOMEM;
>> >+
>> > pf->pdev = pdev;
>> >+ pf->adapter = adapter;
>> > pci_set_drvdata(pdev, pf);
>> > set_bit(ICE_DOWN, pf->state);
>> > /* Disable service task until DOWN bit is cleared */
>> >@@ -5196,6 +5202,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>> > err_load:
>> > ice_deinit(pf);
>> > err_init:
>> >+ ice_adapter_put(pdev);
>> > pci_disable_device(pdev);
>> > return err;
>> > }
>> >@@ -5302,6 +5309,7 @@ static void ice_remove(struct pci_dev *pdev)
>> > ice_setup_mc_magic_wake(pf);
>> > ice_set_wake(pf);
>> >
>> >+ ice_adapter_put(pdev);
>> > pci_disable_device(pdev);
>> > }
>> >
>> >--
>> >2.43.2
>> >
>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-03-07 8:27 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-06 16:29 [PATCH net-next v2 0/3] ice: lighter locking for PTP time reading Michal Schmidt
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 16:29 ` [PATCH net-next v2 1/3] ice: add ice_adapter for shared data across PFs on the same NIC Michal Schmidt
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 17:00 ` Jiri Pirko
2024-03-06 17:00 ` [Intel-wired-lan] " Jiri Pirko
2024-03-06 19:20 ` Michal Schmidt
2024-03-06 19:20 ` [Intel-wired-lan] " Michal Schmidt
2024-03-07 8:27 ` Jiri Pirko
2024-03-07 8:27 ` [Intel-wired-lan] " Jiri Pirko
2024-03-06 16:29 ` [PATCH net-next v2 2/3] ice: avoid the PTP hardware semaphore in gettimex64 path Michal Schmidt
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
2024-03-06 16:29 ` [PATCH net-next v2 3/3] ice: fold ice_ptp_read_time into ice_ptp_gettimex64 Michal Schmidt
2024-03-06 16:29 ` [Intel-wired-lan] " Michal Schmidt
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.