linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -next 0/3] scsi/fcoe: Play nicer with PREEMPT_RT
@ 2021-11-17  2:59 Davidlohr Bueso
  2021-11-17  2:59 ` [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc() Davidlohr Bueso
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Davidlohr Bueso @ 2021-11-17  2:59 UTC (permalink / raw)
  To: martin.petersen, jejb
  Cc: hare, bigeasy, tglx, linux-scsi, linux-rt-users, linux-kernel, dave

Hi,

The following are the result of trying to get rid of the out-of-tree
equivalent[1].

Patches 1 and 2 remove the actual scheduling while atomic scenarios.

Patch 3 could be considered optional because afaict the stats do not
actually have blocking calls while having preemption disabled. But
from an RT perspective it's still beneficial as the region remains
preemptible.

Compile tested only.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/scsi_fcoe__Make_RT_aware..patch?h=linux-5.15.y-rt-patches

Thanks!

Davidlohr Bueso (3):
  scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc()
  scsi/fcoe: Add a local_lock to fcoe_percpu
  scsi/fcoe: Add a local_lock to percpu localport statistics

 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 16 ++++++++------
 drivers/scsi/bnx2fc/bnx2fc_io.c   |  5 +++--
 drivers/scsi/fcoe/fcoe.c          | 35 +++++++++++++++++++------------
 drivers/scsi/fcoe/fcoe_ctlr.c     | 24 +++++++++++++--------
 drivers/scsi/libfc/fc_exch.c      |  3 +--
 drivers/scsi/libfc/fc_fcp.c       | 31 +++++++++++++++++++--------
 drivers/scsi/qedf/qedf_main.c     |  7 ++++---
 include/scsi/libfc.h              |  7 +++++++
 include/scsi/libfcoe.h            |  2 ++
 9 files changed, 86 insertions(+), 44 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc()
  2021-11-17  2:59 [PATCH -next 0/3] scsi/fcoe: Play nicer with PREEMPT_RT Davidlohr Bueso
@ 2021-11-17  2:59 ` Davidlohr Bueso
  2021-11-17  7:25   ` Sebastian Andrzej Siewior
  2021-11-17  2:59 ` [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu Davidlohr Bueso
  2021-11-17  2:59 ` [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics Davidlohr Bueso
  2 siblings, 1 reply; 7+ messages in thread
From: Davidlohr Bueso @ 2021-11-17  2:59 UTC (permalink / raw)
  To: martin.petersen, jejb
  Cc: hare, bigeasy, tglx, linux-scsi, linux-rt-users, linux-kernel,
	dave, Davidlohr Bueso

The get_cpu() in fc_exch_em_alloc() was introduced in:

    f018b73af6db ([SCSI] libfc, libfcoe, fcoe: use smp_processor_id() only when preempt disabled)

for no other reason than to simply use smp_processor_id()
without getting a warning, because everything is done with
the pool->lock held anyway. However, get_cpu(), by disabling
preemption, does not play well with PREEMPT_RT, particularly
when acquiring a regular (and thus sleepable) spinlock.

Therefore remove the get_cpu() and just use the unstable value
as we will have CPU locality guarantees next by taking the lock.
The window of migration, as noted by Sebastian, is small and
even if it happens the result is correct.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/scsi/libfc/fc_exch.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
index 841000445b9a..be37bfb2814d 100644
--- a/drivers/scsi/libfc/fc_exch.c
+++ b/drivers/scsi/libfc/fc_exch.c
@@ -825,10 +825,9 @@ static struct fc_exch *fc_exch_em_alloc(struct fc_lport *lport,
 	}
 	memset(ep, 0, sizeof(*ep));
 
-	cpu = get_cpu();
+	cpu = raw_smp_processor_id();
 	pool = per_cpu_ptr(mp->pool, cpu);
 	spin_lock_bh(&pool->lock);
-	put_cpu();
 
 	/* peek cache of free slot */
 	if (pool->left != FC_XID_UNKNOWN) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu
  2021-11-17  2:59 [PATCH -next 0/3] scsi/fcoe: Play nicer with PREEMPT_RT Davidlohr Bueso
  2021-11-17  2:59 ` [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc() Davidlohr Bueso
@ 2021-11-17  2:59 ` Davidlohr Bueso
  2021-11-17  7:50   ` Sebastian Andrzej Siewior
  2021-11-17  2:59 ` [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics Davidlohr Bueso
  2 siblings, 1 reply; 7+ messages in thread
From: Davidlohr Bueso @ 2021-11-17  2:59 UTC (permalink / raw)
  To: martin.petersen, jejb
  Cc: hare, bigeasy, tglx, linux-scsi, linux-rt-users, linux-kernel,
	dave, Davidlohr Bueso

fcoe_get_paged_crc_eof() relies on the caller having preemption
disabled to ensure the per-CPU fcoe_percpu context remains valid
throughout the call. This is done by either holding spinlocks
(such as bnx2fc_global_lock or qedf_global_lock) or the get_cpu()
from fcoe_alloc_paged_crc_eof(). This last one breaks PREEMPT_RT
semantics as there can be memory allocation and end up sleeping
in atomic contexts.

Introduce a local_lock_t to struct fcoe_percpu that will keep the
non-RT case the same, mapping to preempt_disable/enable, while
RT will use a per-CPU spinlock allowing the region to be preemptible
but still maintain CPU locality. The other users of fcoe_percpu
are already safe in this regard and do not require local_lock()ing.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/scsi/fcoe/fcoe.c | 6 ++++--
 include/scsi/libfcoe.h   | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 6415f88738ad..a0064a1b4a32 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1452,9 +1452,10 @@ static int fcoe_alloc_paged_crc_eof(struct sk_buff *skb, int tlen)
 	struct fcoe_percpu_s *fps;
 	int rc;
 
-	fps = &get_cpu_var(fcoe_percpu);
+	local_lock(&fcoe_percpu.lock);
+	fps = this_cpu_ptr(&fcoe_percpu);
 	rc = fcoe_get_paged_crc_eof(skb, tlen, fps);
-	put_cpu_var(fcoe_percpu);
+	local_unlock(&fcoe_percpu.lock);
 
 	return rc;
 }
@@ -2487,6 +2488,7 @@ static int __init fcoe_init(void)
 		p = per_cpu_ptr(&fcoe_percpu, cpu);
 		INIT_WORK(&p->work, fcoe_receive_work);
 		skb_queue_head_init(&p->fcoe_rx_list);
+		local_lock_init(&p->lock);
 	}
 
 	/* Setup link change notification */
diff --git a/include/scsi/libfcoe.h b/include/scsi/libfcoe.h
index fac8e89aed81..6e79fb87fea2 100644
--- a/include/scsi/libfcoe.h
+++ b/include/scsi/libfcoe.h
@@ -14,6 +14,7 @@
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/workqueue.h>
+#include <linux/local_lock.h>
 #include <linux/random.h>
 #include <scsi/fc/fc_fcoe.h>
 #include <scsi/libfc.h>
@@ -326,6 +327,7 @@ struct fcoe_percpu_s {
 	struct sk_buff_head fcoe_rx_list;
 	struct page *crc_eof_page;
 	int crc_eof_offset;
+	local_lock_t lock;
 };
 
 /**
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics
  2021-11-17  2:59 [PATCH -next 0/3] scsi/fcoe: Play nicer with PREEMPT_RT Davidlohr Bueso
  2021-11-17  2:59 ` [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc() Davidlohr Bueso
  2021-11-17  2:59 ` [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu Davidlohr Bueso
@ 2021-11-17  2:59 ` Davidlohr Bueso
  2021-11-17  8:34   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 7+ messages in thread
From: Davidlohr Bueso @ 2021-11-17  2:59 UTC (permalink / raw)
  To: martin.petersen, jejb
  Cc: hare, bigeasy, tglx, linux-scsi, linux-rt-users, linux-kernel,
	dave, Davidlohr Bueso

Updating the per-CPU lport->stats relies on disabling preemption
with get/put_cpu() semantics. However, this is a bit harsh for
PREEMPT_RT and by using a local_lock, it can allow the region
to be preemptible, guaranteeing CPU locality, without making any
difference to the non-RT common case as it will continue to
disable preemption.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 16 ++++++++++------
 drivers/scsi/bnx2fc/bnx2fc_io.c   |  5 +++--
 drivers/scsi/fcoe/fcoe.c          | 29 ++++++++++++++++++-----------
 drivers/scsi/fcoe/fcoe_ctlr.c     | 24 +++++++++++++++---------
 drivers/scsi/libfc/fc_fcp.c       | 31 ++++++++++++++++++++++---------
 drivers/scsi/qedf/qedf_main.c     |  7 ++++---
 include/scsi/libfc.h              |  7 +++++++
 7 files changed, 79 insertions(+), 40 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 71fa62bd3083..25bb08514824 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -398,11 +398,12 @@ static int bnx2fc_xmit(struct fc_lport *lport, struct fc_frame *fp)
 		skb_shinfo(skb)->gso_size = 0;
 	}
 
-	/*update tx stats */
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	/* update tx stats */
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	stats->TxFrames++;
 	stats->TxWords += wlen;
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	/* send down to lld */
 	fr_dev(fp) = lport;
@@ -870,6 +871,7 @@ static void bnx2fc_indicate_netevent(void *context, unsigned long event,
 	struct fc_lport *vport;
 	struct bnx2fc_interface *interface, *tmp;
 	struct fcoe_ctlr *ctlr;
+	struct fc_stats *stats;
 	int wait_for_upload = 0;
 	u32 link_possible = 1;
 
@@ -962,9 +964,11 @@ static void bnx2fc_indicate_netevent(void *context, unsigned long event,
 				mutex_unlock(&lport->lp_mutex);
 				fc_host_port_type(lport->host) =
 					FC_PORTTYPE_UNKNOWN;
-				per_cpu_ptr(lport->stats,
-					    get_cpu())->LinkFailureCount++;
-				put_cpu();
+
+				local_lock(&lport->stats->lock);
+				stats = this_cpu_ptr(lport->stats);
+				stats->LinkFailureCount++;
+				local_unlock(&lport->stats->lock);
 				fcoe_clean_pending_queue(lport);
 				wait_for_upload = 1;
 			}
diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
index b9114113ee73..219ce4bc4008 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_io.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
@@ -2046,7 +2046,8 @@ int bnx2fc_post_io_req(struct bnx2fc_rport *tgt,
 	io_req->data_xfer_len = scsi_bufflen(sc_cmd);
 	sc_cmd->SCp.ptr = (char *)io_req;
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	if (sc_cmd->sc_data_direction == DMA_FROM_DEVICE) {
 		io_req->io_req_flags = BNX2FC_READ;
 		stats->InputRequests++;
@@ -2059,7 +2060,7 @@ int bnx2fc_post_io_req(struct bnx2fc_rport *tgt,
 		io_req->io_req_flags = 0;
 		stats->ControlRequests++;
 	}
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	xid = io_req->xid;
 
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index a0064a1b4a32..80e6d5427d51 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1339,6 +1339,7 @@ static int fcoe_rcv(struct sk_buff *skb, struct net_device *netdev,
 	struct fcoe_interface *fcoe;
 	struct fc_frame_header *fh;
 	struct fcoe_percpu_s *fps;
+	struct fc_stats *stats;
 	struct ethhdr *eh;
 	unsigned int cpu;
 
@@ -1433,8 +1434,10 @@ static int fcoe_rcv(struct sk_buff *skb, struct net_device *netdev,
 
 	return NET_RX_SUCCESS;
 err:
-	per_cpu_ptr(lport->stats, get_cpu())->ErrorFrames++;
-	put_cpu();
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
+	stats->ErrorFrames++;
+	local_unlock(&lport->stats->lock);
 err2:
 	kfree_skb(skb);
 	return NET_RX_DROP;
@@ -1585,10 +1588,11 @@ static int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
 		skb_shinfo(skb)->gso_size = 0;
 	}
 	/* update tx stats: regardless if LLD fails */
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	stats->TxFrames++;
 	stats->TxWords += wlen;
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	/* send down to lld */
 	fr_dev(fp) = lport;
@@ -1640,11 +1644,12 @@ static inline int fcoe_filter_frames(struct fc_lport *lport,
 		return 0;
 	}
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	stats->InvalidCRCCount++;
 	if (stats->InvalidCRCCount < 5)
 		printk(KERN_WARNING "fcoe: dropping frame with CRC error\n");
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 	return -EINVAL;
 }
 
@@ -1685,7 +1690,8 @@ static void fcoe_recv_frame(struct sk_buff *skb)
 	 */
 	hp = (struct fcoe_hdr *) skb_network_header(skb);
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) {
 		if (stats->ErrorFrames < 5)
 			printk(KERN_WARNING "fcoe: FCoE version "
@@ -1717,13 +1723,13 @@ static void fcoe_recv_frame(struct sk_buff *skb)
 		goto drop;
 
 	if (!fcoe_filter_frames(lport, fp)) {
-		put_cpu();
+		local_unlock(&lport->stats->lock);
 		fc_exch_recv(lport, fp);
 		return;
 	}
 drop:
 	stats->ErrorFrames++;
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 	kfree_skb(skb);
 }
 
@@ -1921,9 +1927,10 @@ static int fcoe_device_notification(struct notifier_block *notifier,
 			break;
 		case FCOE_CTLR_ENABLED:
 		case FCOE_CTLR_UNUSED:
-			stats = per_cpu_ptr(lport->stats, get_cpu());
+			local_lock(&lport->stats->lock);
+			stats = this_cpu_ptr(lport->stats);
 			stats->LinkFailureCount++;
-			put_cpu();
+			local_unlock(&lport->stats->lock);
 			fcoe_clean_pending_queue(lport);
 		}
 	}
diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c
index 1756a0ac6f08..d033aacafc5e 100644
--- a/drivers/scsi/fcoe/fcoe_ctlr.c
+++ b/drivers/scsi/fcoe/fcoe_ctlr.c
@@ -828,7 +828,8 @@ static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip)
 
 	INIT_LIST_HEAD(&del_list);
 
-	stats = per_cpu_ptr(fip->lp->stats, get_cpu());
+	local_lock(&fip->lp->stats->lock);
+	stats = this_cpu_ptr(fip->lp->stats);
 
 	list_for_each_entry_safe(fcf, next, &fip->fcfs, list) {
 		deadline = fcf->time + fcf->fka_period + fcf->fka_period / 2;
@@ -864,7 +865,7 @@ static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip)
 				sel_time = fcf->time;
 		}
 	}
-	put_cpu();
+	local_unlock(&fip->lp->stats->lock);
 
 	list_for_each_entry_safe(fcf, next, &del_list, list) {
 		/* Removes fcf from current list */
@@ -1286,10 +1287,11 @@ static void fcoe_ctlr_recv_els(struct fcoe_ctlr *fip, struct sk_buff *skb)
 	fr_dev(fp) = lport;
 	fr_encaps(fp) = els_dtype;
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	stats->RxFrames++;
 	stats->RxWords += skb->len / FIP_BPW;
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	fc_exch_recv(lport, fp);
 	return;
@@ -1321,6 +1323,7 @@ static void fcoe_ctlr_recv_clr_vlink(struct fcoe_ctlr *fip,
 	struct fcoe_fcf *fcf = fip->sel_fcf;
 	struct fc_lport *lport = fip->lp;
 	struct fc_lport *vn_port = NULL;
+	struct fc_stats *stats;
 	u32 desc_mask;
 	int num_vlink_desc;
 	int reset_phys_port = 0;
@@ -1427,9 +1430,10 @@ static void fcoe_ctlr_recv_clr_vlink(struct fcoe_ctlr *fip,
 						      ntoh24(vp->fd_fc_id));
 			if (vn_port && (vn_port == lport)) {
 				mutex_lock(&fip->ctlr_mutex);
-				per_cpu_ptr(lport->stats,
-					    get_cpu())->VLinkFailureCount++;
-				put_cpu();
+				local_lock(&lport->stats->lock);
+				stats = this_cpu_ptr(lport->stats);
+				stats->VLinkFailureCount++;
+				local_unlock(&lport->stats->lock);
 				fcoe_ctlr_reset(fip);
 				mutex_unlock(&fip->ctlr_mutex);
 			}
@@ -1457,8 +1461,10 @@ static void fcoe_ctlr_recv_clr_vlink(struct fcoe_ctlr *fip,
 		 * followed by physical port
 		 */
 		mutex_lock(&fip->ctlr_mutex);
-		per_cpu_ptr(lport->stats, get_cpu())->VLinkFailureCount++;
-		put_cpu();
+		local_lock(&lport->stats->lock);
+		stats = this_cpu_ptr(lport->stats);
+		stats->VLinkFailureCount++;
+		local_unlock(&lport->stats->lock);
 		fcoe_ctlr_reset(fip);
 		mutex_unlock(&fip->ctlr_mutex);
 
diff --git a/drivers/scsi/libfc/fc_fcp.c b/drivers/scsi/libfc/fc_fcp.c
index 871b11edb586..bee26f4f2ea6 100644
--- a/drivers/scsi/libfc/fc_fcp.c
+++ b/drivers/scsi/libfc/fc_fcp.c
@@ -147,8 +147,12 @@ static struct fc_fcp_pkt *fc_fcp_pkt_alloc(struct fc_lport *lport, gfp_t gfp)
 		INIT_LIST_HEAD(&fsp->list);
 		spin_lock_init(&fsp->scsi_pkt_lock);
 	} else {
-		per_cpu_ptr(lport->stats, get_cpu())->FcpPktAllocFails++;
-		put_cpu();
+		struct fc_stats *stats;
+
+		local_lock(&lport->stats->lock);
+		stats = this_cpu_ptr(lport->stats);
+		stats->FcpPktAllocFails++;
+		local_unlock(&lport->stats->lock);
 	}
 	return fsp;
 }
@@ -266,12 +270,15 @@ static void fc_fcp_abort_done(struct fc_fcp_pkt *fsp)
 static int fc_fcp_send_abort(struct fc_fcp_pkt *fsp)
 {
 	int rc;
+	struct fc_stats *stats;
 
 	if (!fsp->seq_ptr)
 		return -EINVAL;
 
-	per_cpu_ptr(fsp->lp->stats, get_cpu())->FcpPktAborts++;
-	put_cpu();
+	local_lock(&fsp->lp->stats->lock);
+	stats = this_cpu_ptr(fsp->lp->stats);
+	stats->FcpPktAborts++;
+	local_unlock(&fsp->lp->stats->lock);
 
 	fsp->state |= FC_SRB_ABORT_PENDING;
 	rc = fc_seq_exch_abort(fsp->seq_ptr, 0);
@@ -435,13 +442,16 @@ static inline struct fc_frame *fc_fcp_frame_alloc(struct fc_lport *lport,
 						  size_t len)
 {
 	struct fc_frame *fp;
+	struct fc_stats *stats;
 
 	fp = fc_frame_alloc(lport, len);
 	if (likely(fp))
 		return fp;
 
-	per_cpu_ptr(lport->stats, get_cpu())->FcpFrameAllocFails++;
-	put_cpu();
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
+	stats->FcpFrameAllocFails++;
+	local_unlock(&lport->stats->lock);
 	/* error case */
 	fc_fcp_can_queue_ramp_down(lport);
 	shost_printk(KERN_ERR, lport->host,
@@ -537,8 +547,10 @@ static void fc_fcp_recv_data(struct fc_fcp_pkt *fsp, struct fc_frame *fp)
 
 		if (~crc != le32_to_cpu(fr_crc(fp))) {
 crc_err:
-			stats = per_cpu_ptr(lport->stats, get_cpu());
+			local_lock(&lport->stats->lock);
+			stats = this_cpu_ptr(lport->stats);
 			stats->ErrorFrames++;
+			local_unlock(&lport->stats->lock);
 			/* per cpu count, not total count, but OK for limit */
 			if (stats->InvalidCRCCount++ < FC_MAX_ERROR_CNT)
 				printk(KERN_WARNING "libfc: CRC error on data "
@@ -1917,7 +1929,8 @@ int fc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *sc_cmd)
 	/*
 	 * setup the data direction
 	 */
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	if (sc_cmd->sc_data_direction == DMA_FROM_DEVICE) {
 		fsp->req_flags = FC_SRB_READ;
 		stats->InputRequests++;
@@ -1930,7 +1943,7 @@ int fc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *sc_cmd)
 		fsp->req_flags = 0;
 		stats->ControlRequests++;
 	}
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	/*
 	 * send it to the lower layer
diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c
index 1bf7a22d4948..1488f7431db5 100644
--- a/drivers/scsi/qedf/qedf_main.c
+++ b/drivers/scsi/qedf/qedf_main.c
@@ -1213,11 +1213,12 @@ static int qedf_xmit(struct fc_lport *lport, struct fc_frame *fp)
 		FC_FCOE_ENCAPS_VER(hp, FC_FCOE_VER);
 	hp->fcoe_sof = sof;
 
-	/*update tx stats */
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	/* update tx stats */
+	local_lock(&lport->stats->lock);
+	stats = this_cpu_ptr(lport->stats);
 	stats->TxFrames++;
 	stats->TxWords += wlen;
-	put_cpu();
+	local_unlock(&lport->stats->lock);
 
 	/* Get VLAN ID from skb for printing purposes */
 	__vlan_hwaccel_get_tag(skb, &vlan_tci);
diff --git a/include/scsi/libfc.h b/include/scsi/libfc.h
index eeb8d689ff6b..d8d717bdbb74 100644
--- a/include/scsi/libfc.h
+++ b/include/scsi/libfc.h
@@ -240,6 +240,7 @@ struct fc_rport_priv {
  * @OutputBytes:           Number of transmitted bytes
  * @VLinkFailureCount:     Number of virtual link failures
  * @MissDiscAdvCount:      Number of missing FIP discovery advertisement
+ * @lock:                  Protects percpu stats management
  */
 struct fc_stats {
 	u64		SecondsSinceLastReset;
@@ -263,6 +264,7 @@ struct fc_stats {
 	u64		OutputBytes;
 	u64		VLinkFailureCount;
 	u64		MissDiscAdvCount;
+	local_lock_t    lock;
 };
 
 /**
@@ -824,9 +826,14 @@ static inline void fc_lport_state_enter(struct fc_lport *lport,
  */
 static inline int fc_lport_init_stats(struct fc_lport *lport)
 {
+	int cpu;
+
 	lport->stats = alloc_percpu(struct fc_stats);
 	if (!lport->stats)
 		return -ENOMEM;
+
+	for_each_possible_cpu(cpu)
+		local_lock_init(&per_cpu_ptr(lport->stats, cpu)->lock);
 	return 0;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc()
  2021-11-17  2:59 ` [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc() Davidlohr Bueso
@ 2021-11-17  7:25   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-11-17  7:25 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: martin.petersen, jejb, hare, tglx, linux-scsi, linux-rt-users,
	linux-kernel, Davidlohr Bueso

On 2021-11-16 18:59:54 [-0800], Davidlohr Bueso wrote:
> The get_cpu() in fc_exch_em_alloc() was introduced in:
> 
>     f018b73af6db ([SCSI] libfc, libfcoe, fcoe: use smp_processor_id() only when preempt disabled)
> 
> for no other reason than to simply use smp_processor_id()
> without getting a warning, because everything is done with
> the pool->lock held anyway. However, get_cpu(), by disabling
> preemption, does not play well with PREEMPT_RT, particularly
> when acquiring a regular (and thus sleepable) spinlock.
> 
> Therefore remove the get_cpu() and just use the unstable value
> as we will have CPU locality guarantees next by taking the lock.
> The window of migration, as noted by Sebastian, is small and
> even if it happens the result is correct.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> index 841000445b9a..be37bfb2814d 100644
> --- a/drivers/scsi/libfc/fc_exch.c
> +++ b/drivers/scsi/libfc/fc_exch.c
> @@ -825,10 +825,9 @@ static struct fc_exch *fc_exch_em_alloc(struct fc_lport *lport,
>  	}
>  	memset(ep, 0, sizeof(*ep));
>  
> -	cpu = get_cpu();
> +	cpu = raw_smp_processor_id();
>  	pool = per_cpu_ptr(mp->pool, cpu);
>  	spin_lock_bh(&pool->lock);
> -	put_cpu();

The `cpu' variable is later ORed into ep->oxid. I haven't figured out
why this is important/ required. The ep variable/ `fc_exch' is allocated
from a per-CPU memory pool and is later released to the same pool but
the pool's CPU number is not obtained from fc_exch::oxid but there is a
pointer to its pool stored in fc_exch::em. 
So it remains a mystery to why the CPU number is stored here.

>  	/* peek cache of free slot */
>  	if (pool->left != FC_XID_UNKNOWN) {

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu
  2021-11-17  2:59 ` [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu Davidlohr Bueso
@ 2021-11-17  7:50   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-11-17  7:50 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: martin.petersen, jejb, hare, tglx, linux-scsi, linux-rt-users,
	linux-kernel, Davidlohr Bueso

On 2021-11-16 18:59:55 [-0800], Davidlohr Bueso wrote:
> fcoe_get_paged_crc_eof() relies on the caller having preemption
> disabled to ensure the per-CPU fcoe_percpu context remains valid
> throughout the call. This is done by either holding spinlocks
> (such as bnx2fc_global_lock or qedf_global_lock) or the get_cpu()
> from fcoe_alloc_paged_crc_eof(). This last one breaks PREEMPT_RT
> semantics as there can be memory allocation and end up sleeping
> in atomic contexts.
> 
> Introduce a local_lock_t to struct fcoe_percpu that will keep the
> non-RT case the same, mapping to preempt_disable/enable, while
> RT will use a per-CPU spinlock allowing the region to be preemptible
> but still maintain CPU locality. The other users of fcoe_percpu
> are already safe in this regard and do not require local_lock()ing.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics
  2021-11-17  2:59 ` [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics Davidlohr Bueso
@ 2021-11-17  8:34   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-11-17  8:34 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: martin.petersen, jejb, hare, tglx, linux-scsi, linux-rt-users,
	linux-kernel, Davidlohr Bueso

On 2021-11-16 18:59:56 [-0800], Davidlohr Bueso wrote:
> Updating the per-CPU lport->stats relies on disabling preemption
> with get/put_cpu() semantics. However, this is a bit harsh for
> PREEMPT_RT and by using a local_lock, it can allow the region
> to be preemptible, guaranteeing CPU locality, without making any
> difference to the non-RT common case as it will continue to
> disable preemption.

What about adding a struct u64_stats_sync where the stats are updated.
You have to use u64_stats_update_begin() at the beginning and the
matching end function instead the get_cpu()/ local_lock().
You need just ensure that there is only one writer at a time. Network
wise it is easy as there are often per-queue stats so one writer at a
time :)

Looking closer, the current approach appears broken in that regard:
stats members, such as TxFrames/ TxWords which are incremented in
bnx2fc_xmit(), are 64bit. Reading/ writing them requires two operations
on 32bit architectures. But we probably don't care because stats and it
happens hardly and performance of course. This is the cute part about
u64_stats_sync, it magically vanishes on 64bit architectures.

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-17  8:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-17  2:59 [PATCH -next 0/3] scsi/fcoe: Play nicer with PREEMPT_RT Davidlohr Bueso
2021-11-17  2:59 ` [PATCH 1/3] scsi/libfc: Remove get_cpu() semantics in fc_exch_em_alloc() Davidlohr Bueso
2021-11-17  7:25   ` Sebastian Andrzej Siewior
2021-11-17  2:59 ` [PATCH 2/3] scsi/fcoe: Add a local_lock to fcoe_percpu Davidlohr Bueso
2021-11-17  7:50   ` Sebastian Andrzej Siewior
2021-11-17  2:59 ` [PATCH 3/3] scsi/fcoe: Add a local_lock to percpu localport statistics Davidlohr Bueso
2021-11-17  8:34   ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).