* [PATCH net 0/2] sfc: couple of ARFS fixes
@ 2018-04-12 14:00 Edward Cree
  2018-04-12 14:02 ` [PATCH net 1/2] sfc: insert ARFS filters with replace_equal=true Edward Cree
  2018-04-12 14:02 ` [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel Edward Cree
  0 siblings, 2 replies; 13+ messages in thread
From: Edward Cree @ 2018-04-12 14:00 UTC (permalink / raw)
  To: linux-net-drivers, David Miller; +Cc: netdev

Two issues introduced by my recent asynchronous filter handling changes:
1. The old filter_rfs_insert would replace a matching filter of equal
   priority; we need to pass the appropriate argument to filter_insert to
   make it do the same.
2. It's possible to cause the kernel to hammer ndo_rx_flow_steer very
   hard, so make sure we don't build up too huge a backlog of workitems.

Possibly it would be better to fix #2 on the kernel side; I think the way
 to do that would be to maintain a forward (as well as reverse)
 queue-to-cpu map and replace the set_rps_cpu() check
    if (rxq_index == skb_get_rx_queue(skb))
 with something like (pseudocode; sketched more concretely below)
    if (irqaffinity of queue[skb_get_rx_queue(skb)] includes next_cpu)
 but I'm not sure whether it's right or even necessary, and in any case
 it's not a regression in 4.17 so isn't 'net' material.
(There's also the issue that we come up in the bad configuration by
 default, but that too is a problem for another time.)
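
For concreteness, a rough sketch of that check (untested; the helper and
 the forward map 'rx_queue_cpu_map' below are hypothetical, nothing like
 them exists in net/core/dev.c today):

    /* Hypothetical: consult the IRQ affinity of the interrupt servicing
     * RX queue rxq_index, via a new forward queue-to-cpu map that
     * drivers would keep up to date.
     */
    static bool rxq_affine_to_cpu(struct net_device *dev, u16 rxq_index,
                                  int next_cpu)
    {
        const struct cpumask *mask = dev->rx_queue_cpu_map[rxq_index];

        return cpumask_test_cpu(next_cpu, mask);
    }

    /* in set_rps_cpu(), in place of the existing check: */
    if (rxq_affine_to_cpu(dev, skb_get_rx_queue(skb), next_cpu))
        goto out;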

Edward Cree (2):
  sfc: insert ARFS filters with replace_equal=true
  sfc: limit ARFS workitems in flight per channel

 drivers/net/ethernet/sfc/net_driver.h | 25 +++++++++++++++
 drivers/net/ethernet/sfc/rx.c         | 60 ++++++++++++++++++-----------------
 2 files changed, 56 insertions(+), 29 deletions(-)

* [PATCH net 1/2] sfc: insert ARFS filters with replace_equal=true
  2018-04-12 14:00 [PATCH net 0/2] sfc: couple of ARFS fixes Edward Cree
@ 2018-04-12 14:02 ` Edward Cree
  2018-04-12 14:02 ` [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel Edward Cree
  1 sibling, 0 replies; 13+ messages in thread
From: Edward Cree @ 2018-04-12 14:02 UTC (permalink / raw)
  To: linux-net-drivers, David Miller; +Cc: netdev

Necessary to allow redirecting a flow when the application moves to
 another CPU.

Fixes: 3af0f34290f6 ("sfc: replace asynchronous filter operations")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 95682831484e..13b0eb71dbf3 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -851,7 +851,7 @@ static void efx_filter_rfs_work(struct work_struct *data)
 	struct efx_channel *channel = efx_get_channel(efx, req->rxq_index);
 	int rc;
 
-	rc = efx->type->filter_insert(efx, &req->spec, false);
+	rc = efx->type->filter_insert(efx, &req->spec, true);
 	if (rc >= 0) {
 		/* Remember this so we can check whether to expire the filter
 		 * later.

* [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-12 14:00 [PATCH net 0/2] sfc: couple of ARFS fixes Edward Cree
  2018-04-12 14:02 ` [PATCH net 1/2] sfc: insert ARFS filters with replace_equal=true Edward Cree
@ 2018-04-12 14:02 ` Edward Cree
  2018-04-12 15:11   ` David Miller
  1 sibling, 1 reply; 13+ messages in thread
From: Edward Cree @ 2018-04-12 14:02 UTC (permalink / raw)
  To: linux-net-drivers, David Miller; +Cc: netdev

A misconfigured system (e.g. with all interrupts affinitised to all CPUs)
 may produce a storm of ARFS steering events.  With the existing sfc ARFS
 implementation, that could create a backlog of workitems that grinds the
 system to a halt.  To prevent this, limit the number of workitems that
 may be in flight for a given SFC device to 8 (EFX_RPS_MAX_IN_FLIGHT), and
 return -EBUSY from our ndo_rx_flow_steer method if the limit is reached.
Given this limit, also store the workitems in an array of slots within the
 struct efx_nic, rather than dynamically allocating for each request.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/net_driver.h | 25 +++++++++++++++
 drivers/net/ethernet/sfc/rx.c         | 58 ++++++++++++++++++-----------------
 2 files changed, 55 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 5e379a83c729..eea3808b3f25 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -733,6 +733,27 @@ struct efx_rss_context {
 	u32 rx_indir_table[128];
 };
 
+#ifdef CONFIG_RFS_ACCEL
+/**
+ * struct efx_async_filter_insertion - Request to asynchronously insert a filter
+ * @net_dev: Reference to the netdevice
+ * @spec: The filter to insert
+ * @work: Workitem for this request
+ * @rxq_index: Identifies the channel for which this request was made
+ * @flow_id: Identifies the kernel-side flow for which this request was made
+ */
+struct efx_async_filter_insertion {
+	struct net_device *net_dev;
+	struct efx_filter_spec spec;
+	struct work_struct work;
+	u16 rxq_index;
+	u32 flow_id;
+};
+
+/* Maximum number of ARFS workitems that may be in flight on an efx_nic */
+#define EFX_RPS_MAX_IN_FLIGHT	8
+#endif /* CONFIG_RFS_ACCEL */
+
 /**
  * struct efx_nic - an Efx NIC
  * @name: Device name (net device name or bus id before net device registered)
@@ -850,6 +871,8 @@ struct efx_rss_context {
  * @rps_expire_channel: Next channel to check for expiry
  * @rps_expire_index: Next index to check for expiry in
  *	@rps_expire_channel's @rps_flow_id
+ * @rps_slot_map: bitmap of in-flight entries in @rps_slot
+ * @rps_slot: array of ARFS insertion requests for efx_filter_rfs_work()
  * @active_queues: Count of RX and TX queues that haven't been flushed and drained.
  * @rxq_flush_pending: Count of number of receive queues that need to be flushed.
  *	Decremented when the efx_flush_rx_queue() is called.
@@ -1004,6 +1027,8 @@ struct efx_nic {
 	struct mutex rps_mutex;
 	unsigned int rps_expire_channel;
 	unsigned int rps_expire_index;
+	unsigned long rps_slot_map;
+	struct efx_async_filter_insertion rps_slot[EFX_RPS_MAX_IN_FLIGHT];
 #endif
 
 	atomic_t active_queues;
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 13b0eb71dbf3..9c593c661cbf 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -827,28 +827,13 @@ MODULE_PARM_DESC(rx_refill_threshold,
 
 #ifdef CONFIG_RFS_ACCEL
 
-/**
- * struct efx_async_filter_insertion - Request to asynchronously insert a filter
- * @net_dev: Reference to the netdevice
- * @spec: The filter to insert
- * @work: Workitem for this request
- * @rxq_index: Identifies the channel for which this request was made
- * @flow_id: Identifies the kernel-side flow for which this request was made
- */
-struct efx_async_filter_insertion {
-	struct net_device *net_dev;
-	struct efx_filter_spec spec;
-	struct work_struct work;
-	u16 rxq_index;
-	u32 flow_id;
-};
-
 static void efx_filter_rfs_work(struct work_struct *data)
 {
 	struct efx_async_filter_insertion *req = container_of(data, struct efx_async_filter_insertion,
 							      work);
 	struct efx_nic *efx = netdev_priv(req->net_dev);
 	struct efx_channel *channel = efx_get_channel(efx, req->rxq_index);
+	int slot_idx = req - efx->rps_slot;
 	int rc;
 
 	rc = efx->type->filter_insert(efx, &req->spec, true);
@@ -878,8 +863,8 @@ static void efx_filter_rfs_work(struct work_struct *data)
 	}
 
 	/* Release references */
+	clear_bit(slot_idx, &efx->rps_slot_map);
 	dev_put(req->net_dev);
-	kfree(req);
 }
 
 int efx_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
@@ -888,22 +873,36 @@ int efx_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
 	struct efx_nic *efx = netdev_priv(net_dev);
 	struct efx_async_filter_insertion *req;
 	struct flow_keys fk;
+	int slot_idx;
+	int rc;
 
-	if (flow_id == RPS_FLOW_ID_INVALID)
-		return -EINVAL;
+	/* find a free slot */
+	for (slot_idx = 0; slot_idx < EFX_RPS_MAX_IN_FLIGHT; slot_idx++)
+		if (!test_and_set_bit(slot_idx, &efx->rps_slot_map))
+			break;
+	if (slot_idx >= EFX_RPS_MAX_IN_FLIGHT)
+		return -EBUSY;
 
-	if (!skb_flow_dissect_flow_keys(skb, &fk, 0))
-		return -EPROTONOSUPPORT;
+	if (flow_id == RPS_FLOW_ID_INVALID) {
+		rc = -EINVAL;
+		goto out_clear;
+	}
 
-	if (fk.basic.n_proto != htons(ETH_P_IP) && fk.basic.n_proto != htons(ETH_P_IPV6))
-		return -EPROTONOSUPPORT;
-	if (fk.control.flags & FLOW_DIS_IS_FRAGMENT)
-		return -EPROTONOSUPPORT;
+	if (!skb_flow_dissect_flow_keys(skb, &fk, 0)) {
+		rc = -EPROTONOSUPPORT;
+		goto out_clear;
+	}
 
-	req = kmalloc(sizeof(*req), GFP_ATOMIC);
-	if (!req)
-		return -ENOMEM;
+	if (fk.basic.n_proto != htons(ETH_P_IP) && fk.basic.n_proto != htons(ETH_P_IPV6)) {
+		rc = -EPROTONOSUPPORT;
+		goto out_clear;
+	}
+	if (fk.control.flags & FLOW_DIS_IS_FRAGMENT) {
+		rc = -EPROTONOSUPPORT;
+		goto out_clear;
+	}
 
+	req = efx->rps_slot + slot_idx;
 	efx_filter_init_rx(&req->spec, EFX_FILTER_PRI_HINT,
 			   efx->rx_scatter ? EFX_FILTER_FLAG_RX_SCATTER : 0,
 			   rxq_index);
@@ -933,6 +932,9 @@ int efx_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
 	req->flow_id = flow_id;
 	schedule_work(&req->work);
 	return 0;
+out_clear:
+	clear_bit(slot_idx, &efx->rps_slot_map);
+	return rc;
 }
 
 bool __efx_filter_rfs_expire(struct efx_nic *efx, unsigned int quota)

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-12 14:02 ` [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel Edward Cree
@ 2018-04-12 15:11   ` David Miller
  2018-04-12 15:24     ` Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2018-04-12 15:11 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Thu, 12 Apr 2018 15:02:50 +0100

> A misconfigured system (e.g. with all interrupts affinitised to all CPUs)
>  may produce a storm of ARFS steering events.  With the existing sfc ARFS
>  implementation, that could create a backlog of workitems that grinds the
>  system to a halt.  To prevent this, limit the number of workitems that
>  may be in flight for a given SFC device to 8 (EFX_RPS_MAX_IN_FLIGHT), and
>  return -EBUSY from our ndo_rx_flow_steer method if the limit is reached.
> Given this limit, also store the workitems in an array of slots within the
>  struct efx_nic, rather than dynamically allocating for each request.
> 
> Signed-off-by: Edward Cree <ecree@solarflare.com>

I don't think this behavior is all that great.

If you really have to queue up these operations because they take a long
time, I think it is better to enter a synchronous mode and sleep once
you hit this in-flight limit of 8.

Either that or make the expiration work smarter when it has lots of events
to process.

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-12 15:11   ` David Miller
@ 2018-04-12 15:24     ` Edward Cree
  2018-04-12 15:33       ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Edward Cree @ 2018-04-12 15:24 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

On 12/04/18 16:11, David Miller wrote:
> From: Edward Cree <ecree@solarflare.com>
> Date: Thu, 12 Apr 2018 15:02:50 +0100
>
>> A misconfigured system (e.g. with all interrupts affinitised to all CPUs)
>>  may produce a storm of ARFS steering events.  With the existing sfc ARFS
>>  implementation, that could create a backlog of workitems that grinds the
>>  system to a halt.  To prevent this, limit the number of workitems that
>>  may be in flight for a given SFC device to 8 (EFX_RPS_MAX_IN_FLIGHT), and
>>  return -EBUSY from our ndo_rx_flow_steer method if the limit is reached.
>> Given this limit, also store the workitems in an array of slots within the
>>  struct efx_nic, rather than dynamically allocating for each request.
>>
>> Signed-off-by: Edward Cree <ecree@solarflare.com>
> I don't think this behavior is all that great.
>
> If you really have to queue up these operations because they take a long
> time, I think it is better to enter a synchronous mode and sleep once
> you hit this in-flight limit of 8.
I don't think we can sleep at this point; ndo_rx_flow_steer is called from
 the RX path, under rcu_read_lock() (netif_receive_skb_internal() ->
 get_rps_cpu() -> set_rps_cpu()).
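
For reference, the call site is inside rcu_read_lock(); condensed and
 lightly paraphrased from net/core/dev.c:

    static int netif_receive_skb_internal(struct sk_buff *skb)
    {
        int ret;
        ...
        rcu_read_lock();            /* atomic context from here on */
    #ifdef CONFIG_RPS
        if (static_key_false(&rps_needed)) {
            struct rps_dev_flow voidflow, *rflow = &voidflow;
            /* get_rps_cpu() -> set_rps_cpu() -> ndo_rx_flow_steer() */
            int cpu = get_rps_cpu(skb->dev, skb, &rflow);

            if (cpu >= 0) {
                ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
                rcu_read_unlock();
                return ret;
            }
        }
    #endif
        ret = __netif_receive_skb(skb);
        rcu_read_unlock();
        return ret;
    }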

> Either that or make the expiration work smarter when it has lots of events
> to process.
I'm afraid I don't understand what you mean here.
This code is not handling expiration of old ARFS filters, it's inserting
 new ones.

-Ed

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-12 15:24     ` Edward Cree
@ 2018-04-12 15:33       ` David Miller
  2018-04-13 12:36         ` Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2018-04-12 15:33 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Thu, 12 Apr 2018 16:24:46 +0100

> This code is not handling expiration of old ARFS filters, it's inserting
>  new ones.

Then simply make the work process a queue, and add entries to the queue
here if the work is already scheduled.

Is there a reason why that wouldn't work?

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-12 15:33       ` David Miller
@ 2018-04-13 12:36         ` Edward Cree
  2018-04-13 14:45           ` David Miller
  2018-04-13 14:52           ` Edward Cree
  0 siblings, 2 replies; 13+ messages in thread
From: Edward Cree @ 2018-04-13 12:36 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

It turns out this may all be moot anyway: I figured out why I was seeing
 ARFS storms and it wasn't the configuration issue I originally blamed.
My current ndo_rx_flow_steer() implementation, efx_filter_rfs(), returns
 0 for success, but the caller expects a filter ID to be returned (which
 we can't give it because we don't know what the filter ID will be until
 we start mucking around in the software state that's now protected by a
 sleepable lock).
As a result, when we call rps_may_expire_flow() and pass it the _actual_
 filter ID, it doesn't match the one set_rps_cpu() recorded, so the
 function returns true and we immediately expire the filter.  Then the
 next packet to come along isn't steered, so ARFS asks us to insert a
 steering filter again.
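
The relevant core logic, condensed from net/core/dev.c: set_rps_cpu()
 records our ndo return value as the filter ID, and rps_may_expire_flow()
 then insists on an exact match:

    /* set_rps_cpu(), on a successful steer (error paths elided): */
    rc = dev->netdev_ops->ndo_rx_flow_steer(dev, skb, rxq_index, flow_id);
    if (rc < 0)
        goto out;
    rflow = &flow_table->flows[flow_id];
    rflow->filter = rc;         /* records 0, all we ever return */

    /* rps_may_expire_flow(), similarly condensed; flow_still_active
     * stands in for the input_queue_head/last_qtail check: */
    rflow = &flow_table->flows[flow_id];
    if (rflow->filter == filter_id && flow_still_active)
        expire = false;         /* never taken: our real ID != 0 */
    return expire;
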
As a quick fix I've simply tried making the rps_may_expire_flow() calls
 also pass a filter ID of 0, which prevents the ARFS storms.  This is
 safe; it may cause us to delay expiring a filter when flow_ids collide,
 but that can happen anyway with other drivers' implementations (e.g.
 mlx4 and mlx5 can potentially reuse filter IDs) so I presume it is OK.
I'll post a v2 with that fix in place of this Patch #2 shortly, then try
 to follow up with a counter-generated ID (similar to what mlx have).

-Ed

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 12:36         ` Edward Cree
@ 2018-04-13 14:45           ` David Miller
  2018-04-13 14:52           ` Edward Cree
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2018-04-13 14:45 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Fri, 13 Apr 2018 13:36:28 +0100

> It turns out this may all be moot anyway: I figured out why I was seeing
>  ARFS storms and it wasn't the configuration issue I originally blamed.
> My current ndo_rx_flow_steer() implementation, efx_filter_rfs(), returns
>  0 for success, but the caller expects a filter ID to be returned (which
>  we can't give it because we don't know what the filter ID will be until
>  we start mucking around in the software state that's now protected by a
>  sleepable lock).
> As a result, when we call rps_may_expire_flow(), and pass it the _actual_
>  filter ID, this doesn't match the one set_rps_cpu() recorded, so the
>  function returns true and we immediately expire the filter.  Then the
>  next packet to come along isn't steered, so ARFS asks us to insert a
>  steering filter again.
> As a quick fix I've simply tried making the rps_may_expire_flow() calls
>  also pass a filter ID of 0, which prevents the ARFS storms.  This is
>  safe; it may cause us to delay expiring a filter when flow_ids collide,
>  but that can happen anyway with other drivers' implementations (e.g.
>  mlx4 and mlx5 can potentially reuse filter IDs) so I presume it is OK.
> I'll post a v2 with that fix in place of this Patch #2 shortly, then try
>  to follow up with a counter-generated ID (similar to what mlx have).

I understand the constraints you are working under, but do realize
that the real root of the problems is that you are implementing what
is defined clearly as a synchronous operation as asynchronous.

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 12:36         ` Edward Cree
  2018-04-13 14:45           ` David Miller
@ 2018-04-13 14:52           ` Edward Cree
  2018-04-13 15:03             ` David Miller
  1 sibling, 1 reply; 13+ messages in thread
From: Edward Cree @ 2018-04-13 14:52 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

On 13/04/18 13:36, Edward Cree wrote:
> It turns out this may all be moot anyway: I figured out why I was seeing
>  ARFS storms and it wasn't the configuration issue I originally blamed.
Hmm, correction, while the fix I mentioned in my previous email is needed,
 it doesn't prevent the ARFS storms (although it seems to lessen their
 severity, given that the machine didn't actually fall over this time),
 so we do also need some kind of limiting.

On 12/04/18 16:33, David Miller wrote:
> Then simply make the work process a queue, and add entries to the queue
> here if the work is already scheduled.
>
> Is there a reason why that wouldn't work?
That has the same problem as the existing code: the length of the queue
 can grow without bound, potentially causing a very long lag between the
 request and its execution.  This can quickly become exponential as,
 while waiting for the filter to be inserted, further packets from the same
 flow are received (still unsteered) and trigger duplicate ARFS requests
 (though I suppose it would be possible to scan the queue for matching flow
 IDs; but the concurrency / locking problems with that are 'interesting').

I'm not sure why you object to the dropping of requests - it seems reasonable
 to me to treat ARFS as a 'best-effort' thing.  The packets will still be
 received (just not necessarily on the core nearest the application), and if
 the flow continues it will generate more steering requests after the ones
 currently in flight have been processed.
And in practice we only get into this situation in the first place when we
 have interrupt affinities configured in such a way as to make ARFS
 practically useless anyway, so our failure to insert the filters is not of
 great significance.

On 13/04/18 15:45, David Miller wrote:
> I understand the constraints you are working under, but do realize
> that the real root of the problems is that you are implementing what
> is defined clearly as a synchronous operation as asynchronous.
Yes, it is unfortunate that we are unable to perform synchronous filter
 insertions, but you go to war with the hardware you have :(

-Ed

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 14:52           ` Edward Cree
@ 2018-04-13 15:03             ` David Miller
  2018-04-13 15:59               ` Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2018-04-13 15:03 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Fri, 13 Apr 2018 15:52:25 +0100

> On 13/04/18 15:45, David Miller wrote:
>> I understand the constraints you are working under, but do realize
>> that the real root of the problems is that you are implementing what
>> is defined clearly as a synchronous operation as asynchronous.
> Yes, it is unfortunate that we are unable to perform synchronous filter
>  insertions, but you go to war with the hardware you have :(

Whilst you may not be able to program the filter into the hardware
synchronously, you should be able to allocate the ID and get all of
the software state set up.

At least then you can return the proper return value from the netdev
op.

People really aren't going to be happy if their performance goes into
the tank because their filter update rate, for whatever reason, hits
this magic backlog limit.

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 15:03             ` David Miller
@ 2018-04-13 15:59               ` Edward Cree
  2018-04-13 16:14                 ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Edward Cree @ 2018-04-13 15:59 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

On 13/04/18 16:03, David Miller wrote:
> Whilst you may not be able to program the filter into the hardware
> synchronously, you should be able to allocate the ID and get all of
> the software state set up.
That's what we were doing before commit 3af0f34290f6 ("sfc: replace
 asynchronous filter operations"), and (as mentioned in that commit)
 that leads to (or at least the scheme we used had) race conditions
 which I could not see a way to fix.  If the hardware resets (and
 thus forgets all its filters) after we've done the software state
 setup but before we reach the point of finalising the software state
 after the hardware operation, we don't know what operations we need
 to do to re-apply the software state to the hardware, because we
 don't know whether the reset happened before or after the hardware
 operation.
Actually, it's not the insertion itself that this can happen to - if
 the reset happens first then the insert will fail because other
 hardware state we reference isn't set up yet.  However, inserting
 one filter can necessitate removing some others (lower-priority
 multicast filters for the same flow); the code before 3af0f34290f6
 was actually capable of getting so confused that it could
 double-free a pointer.
This all gets sufficiently complicated that even if I can find a
 locking scheme that in theory gets it right, there's pretty much a
 100% chance that the actual implementation will contain bugs,
 probably very subtle ones that can't reliably be reproduced etc.
 All my instincts say to get away from that if at all possible.
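
Schematically, the hazard with the old scheme was (illustrative only;
 the function names here are made up, not actual sfc code):

    update_sw_state(spec);          /* filter now in software table */
    rc = do_hw_filter_op(spec);     /* sleeps waiting on the MC */
    /*
     * If a hardware/MC reset lands around here, we can't tell whether
     * the op above took effect before or after the reset, so we don't
     * know what is needed to re-apply the software state to the
     * hardware; removal of displaced lower-priority filters gets
     * similarly confused.
     */
    finalise_sw_state(spec, rc);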

> People really aren't going to be happy if their performance goes into
> the tank because their filter update rate, for whatever reason, hits
> this magic backlog limit.
Well, the alternative, even if the software state setup part _could_
 be made synchronous, is to allow a potentially unbounded queue for
 the hardware update part (I think there are even still cases in
 which the exponential growth pathology is possible), causing the
 filter insertions to be delayed an arbitrarily long time.  Either
 the flow is still going by that time (in which case the backlog
 limit approach will get a new ndo_rx_flow_steer request and insert
 the filter too) or it isn't, in which case getting round to it
 eventually is no better than dropping it immediately.  In fact it's
 worse because now you waste time inserting a useless filter which
 delays new requests even more.
Besides, I'm fairly confident that the only cases in which you'll
 even come close to hitting the limit are ones where ARFS wouldn't
 do you much good anyway, such as:
* Misconfigured interrupt affinities where ARFS is entirely pointless
* Many short-lived flows (which greatly diminish the utility of ARFS)

So for multiple reasons, hitting the limit won't actually make
 performance worse, although it will often be a sign that performance
 will be bad for other reasons.

-Ed

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 15:59               ` Edward Cree
@ 2018-04-13 16:14                 ` David Miller
  2018-04-13 16:24                   ` Edward Cree
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2018-04-13 16:14 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Fri, 13 Apr 2018 16:59:07 +0100

> On 13/04/18 16:03, David Miller wrote:
>> Whilst you may not be able to program the filter into the hardware
>> synchronously, you should be able to allocate the ID and get all of
>> the software state set up.
> That's what we were doing before commit 3af0f34290f6 ("sfc: replace
>  asynchronous filter operations"), and (as mentioned in that commit)
>  that leads to (or at least the scheme we used had) race conditions
>  which I could not see a way to fix.  If the hardware resets (and
>  thus forgets all its filters) after we've done the software state
>  setup but before we reach the point of finalising the software state
>  after the hardware operation, we don't know what operations we need
>  to do to re-apply the software state to the hardware, because we
>  don't know whether the reset happened before or after the hardware
>  operation.

When an entry is successfully programmed into the chip, you update
the software state.

When the chip resets, you clear all of those state booleans to false.

Indeed, you would have to synchronize these things somehow.

Is the issue that you learn about the hardware reset asynchronously,
and therefore cannot determine if filter insertion programming
happened afterwards and thus is still in the chip?

You must have a table of all the entries, so that you can reprogram
the hardware should it reset.  Or do you not handle things that way
and it's a lossy system?

> Well, the alternative, even if the software state setup part _could_
>  be made synchronous, is to allow a potentially unbounded queue for
>  the hardware update part (I think there are even still cases in
>  which the exponential growth pathology is possible), causing the
>  filter insertions to be delayed an arbitrarily long time.  Either
>  the flow is still going by that time (in which case the backlog
>  limit approach will get a new ndo_rx_flow_steer request and insert
>  the filter too) or it isn't, in which case getting round to it
>  eventually is no better than dropping it immediately.  In fact it's
>  worse because now you waste time inserting a useless filter which
>  delays new requests even more.
> Besides, I'm fairly confident that the only cases in which you'll
>  even come close to hitting the limit are ones where ARFS wouldn't
>  do you much good anyway, such as:
> * Misconfigured interrupt affinities where ARFS is entirely pointless
> * Many short-lived flows (which greatly diminish the utility of ARFS)
> 
> So for multiple reasons, hitting the limit won't actually make
>  performance worse, although it will often be a sign that performance
>  will be bad for other reasons.

Understood, thanks for explaining.

Please respin your series with the updates you talked about and I'll
apply it.

But generally we do have this issue with various kinds of
configuration programming and async vs. sync.

* Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
  2018-04-13 16:14                 ` David Miller
@ 2018-04-13 16:24                   ` Edward Cree
  0 siblings, 0 replies; 13+ messages in thread
From: Edward Cree @ 2018-04-13 16:24 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

On 13/04/18 17:14, David Miller wrote:
> Is the issue that you learn about the hardware reset asynchronously,
> and therefore cannot determine if filter insertion programming
> happened afterwards and thus is still in the chip?
Yes, pretty much.

> You must have a table of all the entries, so that you can reprogram
> the hardware should it reset.
Yes, we do have such a table; 'reprogram the hardware' happens in
 efx_ef10_filter_table_restore().

> Understood, thanks for explaining.
>
> Please respin your series with the updates you talked about and I'll
> apply it.
Will do, thanks.

-Ed
