All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/4] sfc: more EF100 fixes
@ 2020-08-18 12:41 Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:41 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

Fix up some bugs in the initial EF100 submission, and re-fix
 the hash_valid fix which was incomplete.

The reset bugs are currently hard to trigger; they were found
 with an in-progress patch adding ethtool support, whereby
 ethtool --reset reliably reproduces them.

Edward Cree (4):
  sfc: really check hash is valid before using it
  sfc: take correct lock in ef100_reset()
  sfc: null out channel->rps_flow_id after freeing it
  sfc: don't free_irq()s if they were never requested

 drivers/net/ethernet/sfc/ef100_nic.c  | 10 ++++++----
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/nic.c        |  4 ++++
 drivers/net/ethernet/sfc/rx_common.c  |  1 +
 4 files changed, 13 insertions(+), 4 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net 1/4] sfc: really check hash is valid before using it
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
  2020-08-18 18:54   ` Jesse Brandeburg
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.

Fixes: 068885434ccb ("sfc: check hash is valid before using it")
Reported-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef100_nic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index 206d70f9d95b..b8a7e9ed7913 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -739,6 +739,7 @@ const struct efx_nic_type ef100_pf_nic_type = {
 	.rx_remove = efx_mcdi_rx_remove,
 	.rx_write = ef100_rx_write,
 	.rx_packet = __ef100_rx_packet,
+	.rx_buf_hash_valid = ef100_rx_buf_hash_valid,
 	.fini_dmaq = efx_fini_dmaq,
 	.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
 	.filter_table_probe = ef100_filter_table_up,
@@ -820,6 +821,7 @@ const struct efx_nic_type ef100_vf_nic_type = {
 	.rx_remove = efx_mcdi_rx_remove,
 	.rx_write = ef100_rx_write,
 	.rx_packet = __ef100_rx_packet,
+	.rx_buf_hash_valid = ef100_rx_buf_hash_valid,
 	.fini_dmaq = efx_fini_dmaq,
 	.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
 	.filter_table_probe = ef100_filter_table_up,


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 2/4] sfc: take correct lock in ef100_reset()
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
  2020-08-18 18:55   ` Jesse Brandeburg
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

When downing and upping the ef100 filter table, we need to take a write
 lock on efx->filter_sem, not just a read lock, because we may kfree()
 the table pointers.
Without this, resets cause a WARN_ON from efx_rwsem_assert_write_locked().

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef100_nic.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index b8a7e9ed7913..19fe86b3b316 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -431,18 +431,18 @@ static int ef100_reset(struct efx_nic *efx, enum reset_type reset_type)
 		/* A RESET_TYPE_ALL will cause filters to be removed, so we remove filters
 		 * and reprobe after reset to avoid removing filters twice
 		 */
-		down_read(&efx->filter_sem);
+		down_write(&efx->filter_sem);
 		ef100_filter_table_down(efx);
-		up_read(&efx->filter_sem);
+		up_write(&efx->filter_sem);
 		rc = efx_mcdi_reset(efx, reset_type);
 		if (rc)
 			return rc;
 
 		netif_device_attach(efx->net_dev);
 
-		down_read(&efx->filter_sem);
+		down_write(&efx->filter_sem);
 		rc = ef100_filter_table_up(efx);
-		up_read(&efx->filter_sem);
+		up_write(&efx->filter_sem);
 		if (rc)
 			return rc;
 


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
  2020-08-18 18:58   ` Jesse Brandeburg
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
  2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

If an ef100_net_open() fails, ef100_net_stop() may be called without
 channel->rps_flow_id having been written; thus it may hold the address
 freed by a previous ef100_net_stop()'s call to efx_remove_filters().
 This then causes a double-free when efx_remove_filters() is called
 again, leading to a panic.
To prevent this, after freeing it, overwrite it with NULL.

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/rx_common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
index ef9bca92b0b7..5e29284c89c9 100644
--- a/drivers/net/ethernet/sfc/rx_common.c
+++ b/drivers/net/ethernet/sfc/rx_common.c
@@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
 	efx_for_each_channel(channel, efx) {
 		cancel_delayed_work_sync(&channel->filter_work);
 		kfree(channel->rps_flow_id);
+		channel->rps_flow_id = NULL;
 	}
 #endif
 	down_write(&efx->filter_sem);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
                   ` (2 preceding siblings ...)
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
  2020-08-18 19:03   ` Jesse Brandeburg
  2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
 failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
 needed and will cause error messages and stack traces.
So instead, only do this if efx_nic_init_interrupt successfully completed,
 as indicated by the new efx->irqs_hooked flag.

Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/net_driver.h | 2 ++
 drivers/net/ethernet/sfc/nic.c        | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index dcb741d8bd11..062462a13847 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -846,6 +846,7 @@ struct efx_async_filter_insertion {
  * @timer_quantum_ns: Interrupt timer quantum, in nanoseconds
  * @timer_max_ns: Interrupt timer maximum value, in nanoseconds
  * @irq_rx_adaptive: Adaptive IRQ moderation enabled for RX event queues
+ * @irqs_hooked: Channel interrupts are hooked
  * @irq_rx_mod_step_us: Step size for IRQ moderation for RX event queues
  * @irq_rx_moderation_us: IRQ moderation time for RX event queues
  * @msg_enable: Log message enable flags
@@ -1004,6 +1005,7 @@ struct efx_nic {
 	unsigned int timer_quantum_ns;
 	unsigned int timer_max_ns;
 	bool irq_rx_adaptive;
+	bool irqs_hooked;
 	unsigned int irq_mod_step_us;
 	unsigned int irq_rx_moderation_us;
 	u32 msg_enable;
diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c
index d994d136bb03..d1e908846f5d 100644
--- a/drivers/net/ethernet/sfc/nic.c
+++ b/drivers/net/ethernet/sfc/nic.c
@@ -129,6 +129,7 @@ int efx_nic_init_interrupt(struct efx_nic *efx)
 #endif
 	}
 
+	efx->irqs_hooked = true;
 	return 0;
 
  fail2:
@@ -154,6 +155,8 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
 	efx->net_dev->rx_cpu_rmap = NULL;
 #endif
 
+	if (!efx->irqs_hooked)
+		return;
 	if (EFX_INT_MODE_USE_MSI(efx)) {
 		/* Disable MSI/MSI-X interrupts */
 		efx_for_each_channel(channel, efx)
@@ -163,6 +166,7 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
 		/* Disable legacy interrupt */
 		free_irq(efx->legacy_irq, efx);
 	}
+	efx->irqs_hooked = false;
 }
 
 /* Register dump */

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net 1/4] sfc: really check hash is valid before using it
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 18:54   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:54 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

On Tue, 18 Aug 2020 13:43:30 +0100
Edward Cree <ecree@solarflare.com> wrote:

> Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.
> 
> Fixes: 068885434ccb ("sfc: check hash is valid before using it")
> Reported-by: Martin Habets <mhabets@solarflare.com>
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 2/4] sfc: take correct lock in ef100_reset()
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 18:55   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:55 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

On Tue, 18 Aug 2020 13:43:57 +0100
Edward Cree <ecree@solarflare.com> wrote:

> When downing and upping the ef100 filter table, we need to take a
> write lock on efx->filter_sem, not just a read lock, because we may
> kfree() the table pointers.
> Without this, resets cause a WARN_ON from
> efx_rwsem_assert_write_locked().
> 
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and
> related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Fix makes sense
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 18:58   ` Jesse Brandeburg
  2020-08-18 19:01     ` Jesse Brandeburg
  0 siblings, 1 reply; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:58 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Edward Cree wrote:

> If an ef100_net_open() fails, ef100_net_stop() may be called without
>  channel->rps_flow_id having been written; thus it may hold the address
>  freed by a previous ef100_net_stop()'s call to efx_remove_filters().
>  This then causes a double-free when efx_remove_filters() is called
>  again, leading to a panic.
> To prevent this, after freeing it, overwrite it with NULL.
> 
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>
> ---
>  drivers/net/ethernet/sfc/rx_common.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
> index ef9bca92b0b7..5e29284c89c9 100644
> --- a/drivers/net/ethernet/sfc/rx_common.c
> +++ b/drivers/net/ethernet/sfc/rx_common.c
> @@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
>  	efx_for_each_channel(channel, efx) {
>  		cancel_delayed_work_sync(&channel->filter_work);
>  		kfree(channel->rps_flow_id);
> +		channel->rps_flow_id = NULL;
>  	}
>  #endif
>  	down_write(&efx->filter_sem);
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 18:58   ` Jesse Brandeburg
@ 2020-08-18 19:01     ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:01 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Jesse Brandeburg wrote:

> Edward Cree wrote:
> 
> > If an ef100_net_open() fails, ef100_net_stop() may be called without
> >  channel->rps_flow_id having been written; thus it may hold the address
> >  freed by a previous ef100_net_stop()'s call to efx_remove_filters().
> >  This then causes a double-free when efx_remove_filters() is called
> >  again, leading to a panic.
> > To prevent this, after freeing it, overwrite it with NULL.
> > 
> > Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> > Signed-off-by: Edward Cree <ecree@solarflare.com>

My mailer messed up my previous reply, but this is what I meant to say:
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:03   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:03 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Edward Cree wrote:

> If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
>  failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
>  needed and will cause error messages and stack traces.
> So instead, only do this if efx_nic_init_interrupt successfully completed,
>  as indicated by the new efx->irqs_hooked flag.
> 
> Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Makes sense.
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 0/4] sfc: more EF100 fixes
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
                   ` (3 preceding siblings ...)
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:49 ` David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2020-08-18 19:49 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Tue, 18 Aug 2020 13:41:49 +0100

> Fix up some bugs in the initial EF100 submission, and re-fix
>  the hash_valid fix which was incomplete.
> 
> The reset bugs are currently hard to trigger; they were found
>  with an in-progress patch adding ethtool support, whereby
>  ethtool --reset reliably reproduces them.

Series applied.

Thanks for the review Jesse.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-08-18 19:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
2020-08-18 18:54   ` Jesse Brandeburg
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
2020-08-18 18:55   ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
2020-08-18 18:58   ` Jesse Brandeburg
2020-08-18 19:01     ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
2020-08-18 19:03   ` Jesse Brandeburg
2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.