netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/4] sfc: more EF100 fixes
@ 2020-08-18 12:41 Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:41 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

Fix up some bugs in the initial EF100 submission, and re-fix
 the hash_valid fix which was incomplete.

The reset bugs are currently hard to trigger; they were found
 with an in-progress patch adding ethtool support, whereby
 ethtool --reset reliably reproduces them.

Edward Cree (4):
  sfc: really check hash is valid before using it
  sfc: take correct lock in ef100_reset()
  sfc: null out channel->rps_flow_id after freeing it
  sfc: don't free_irq()s if they were never requested

 drivers/net/ethernet/sfc/ef100_nic.c  | 10 ++++++----
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/nic.c        |  4 ++++
 drivers/net/ethernet/sfc/rx_common.c  |  1 +
 4 files changed, 13 insertions(+), 4 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net 1/4] sfc: really check hash is valid before using it
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
  2020-08-18 18:54   ` Jesse Brandeburg
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.

Fixes: 068885434ccb ("sfc: check hash is valid before using it")
Reported-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef100_nic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index 206d70f9d95b..b8a7e9ed7913 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -739,6 +739,7 @@ const struct efx_nic_type ef100_pf_nic_type = {
 	.rx_remove = efx_mcdi_rx_remove,
 	.rx_write = ef100_rx_write,
 	.rx_packet = __ef100_rx_packet,
+	.rx_buf_hash_valid = ef100_rx_buf_hash_valid,
 	.fini_dmaq = efx_fini_dmaq,
 	.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
 	.filter_table_probe = ef100_filter_table_up,
@@ -820,6 +821,7 @@ const struct efx_nic_type ef100_vf_nic_type = {
 	.rx_remove = efx_mcdi_rx_remove,
 	.rx_write = ef100_rx_write,
 	.rx_packet = __ef100_rx_packet,
+	.rx_buf_hash_valid = ef100_rx_buf_hash_valid,
 	.fini_dmaq = efx_fini_dmaq,
 	.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
 	.filter_table_probe = ef100_filter_table_up,


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 2/4] sfc: take correct lock in ef100_reset()
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
  2020-08-18 18:55   ` Jesse Brandeburg
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

When downing and upping the ef100 filter table, we need to take a write
 lock on efx->filter_sem, not just a read lock, because we may kfree()
 the table pointers.
Without this, resets cause a WARN_ON from efx_rwsem_assert_write_locked().

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef100_nic.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index b8a7e9ed7913..19fe86b3b316 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -431,18 +431,18 @@ static int ef100_reset(struct efx_nic *efx, enum reset_type reset_type)
 		/* A RESET_TYPE_ALL will cause filters to be removed, so we remove filters
 		 * and reprobe after reset to avoid removing filters twice
 		 */
-		down_read(&efx->filter_sem);
+		down_write(&efx->filter_sem);
 		ef100_filter_table_down(efx);
-		up_read(&efx->filter_sem);
+		up_write(&efx->filter_sem);
 		rc = efx_mcdi_reset(efx, reset_type);
 		if (rc)
 			return rc;
 
 		netif_device_attach(efx->net_dev);
 
-		down_read(&efx->filter_sem);
+		down_write(&efx->filter_sem);
 		rc = ef100_filter_table_up(efx);
-		up_read(&efx->filter_sem);
+		up_write(&efx->filter_sem);
 		if (rc)
 			return rc;
 


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
  2020-08-18 18:58   ` Jesse Brandeburg
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
  2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

If an ef100_net_open() fails, ef100_net_stop() may be called without
 channel->rps_flow_id having been written; thus it may hold the address
 freed by a previous ef100_net_stop()'s call to efx_remove_filters().
 This then causes a double-free when efx_remove_filters() is called
 again, leading to a panic.
To prevent this, after freeing it, overwrite it with NULL.

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/rx_common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
index ef9bca92b0b7..5e29284c89c9 100644
--- a/drivers/net/ethernet/sfc/rx_common.c
+++ b/drivers/net/ethernet/sfc/rx_common.c
@@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
 	efx_for_each_channel(channel, efx) {
 		cancel_delayed_work_sync(&channel->filter_work);
 		kfree(channel->rps_flow_id);
+		channel->rps_flow_id = NULL;
 	}
 #endif
 	down_write(&efx->filter_sem);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
                   ` (2 preceding siblings ...)
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
  2020-08-18 19:03   ` Jesse Brandeburg
  2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
 failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
 needed and will cause error messages and stack traces.
So instead, only do this if efx_nic_init_interrupt successfully completed,
 as indicated by the new efx->irqs_hooked flag.

Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/net_driver.h | 2 ++
 drivers/net/ethernet/sfc/nic.c        | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index dcb741d8bd11..062462a13847 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -846,6 +846,7 @@ struct efx_async_filter_insertion {
  * @timer_quantum_ns: Interrupt timer quantum, in nanoseconds
  * @timer_max_ns: Interrupt timer maximum value, in nanoseconds
  * @irq_rx_adaptive: Adaptive IRQ moderation enabled for RX event queues
+ * @irqs_hooked: Channel interrupts are hooked
  * @irq_rx_mod_step_us: Step size for IRQ moderation for RX event queues
  * @irq_rx_moderation_us: IRQ moderation time for RX event queues
  * @msg_enable: Log message enable flags
@@ -1004,6 +1005,7 @@ struct efx_nic {
 	unsigned int timer_quantum_ns;
 	unsigned int timer_max_ns;
 	bool irq_rx_adaptive;
+	bool irqs_hooked;
 	unsigned int irq_mod_step_us;
 	unsigned int irq_rx_moderation_us;
 	u32 msg_enable;
diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c
index d994d136bb03..d1e908846f5d 100644
--- a/drivers/net/ethernet/sfc/nic.c
+++ b/drivers/net/ethernet/sfc/nic.c
@@ -129,6 +129,7 @@ int efx_nic_init_interrupt(struct efx_nic *efx)
 #endif
 	}
 
+	efx->irqs_hooked = true;
 	return 0;
 
  fail2:
@@ -154,6 +155,8 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
 	efx->net_dev->rx_cpu_rmap = NULL;
 #endif
 
+	if (!efx->irqs_hooked)
+		return;
 	if (EFX_INT_MODE_USE_MSI(efx)) {
 		/* Disable MSI/MSI-X interrupts */
 		efx_for_each_channel(channel, efx)
@@ -163,6 +166,7 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
 		/* Disable legacy interrupt */
 		free_irq(efx->legacy_irq, efx);
 	}
+	efx->irqs_hooked = false;
 }
 
 /* Register dump */

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net 1/4] sfc: really check hash is valid before using it
  2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 18:54   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:54 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

On Tue, 18 Aug 2020 13:43:30 +0100
Edward Cree <ecree@solarflare.com> wrote:

> Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.
> 
> Fixes: 068885434ccb ("sfc: check hash is valid before using it")
> Reported-by: Martin Habets <mhabets@solarflare.com>
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 2/4] sfc: take correct lock in ef100_reset()
  2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 18:55   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:55 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

On Tue, 18 Aug 2020 13:43:57 +0100
Edward Cree <ecree@solarflare.com> wrote:

> When downing and upping the ef100 filter table, we need to take a
> write lock on efx->filter_sem, not just a read lock, because we may
> kfree() the table pointers.
> Without this, resets cause a WARN_ON from
> efx_rwsem_assert_write_locked().
> 
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and
> related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Fix makes sense
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 18:58   ` Jesse Brandeburg
  2020-08-18 19:01     ` Jesse Brandeburg
  0 siblings, 1 reply; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:58 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Edward Cree wrote:

> If an ef100_net_open() fails, ef100_net_stop() may be called without
>  channel->rps_flow_id having been written; thus it may hold the address
>  freed by a previous ef100_net_stop()'s call to efx_remove_filters().
>  This then causes a double-free when efx_remove_filters() is called
>  again, leading to a panic.
> To prevent this, after freeing it, overwrite it with NULL.
> 
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>
> ---
>  drivers/net/ethernet/sfc/rx_common.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
> index ef9bca92b0b7..5e29284c89c9 100644
> --- a/drivers/net/ethernet/sfc/rx_common.c
> +++ b/drivers/net/ethernet/sfc/rx_common.c
> @@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
>  	efx_for_each_channel(channel, efx) {
>  		cancel_delayed_work_sync(&channel->filter_work);
>  		kfree(channel->rps_flow_id);
> +		channel->rps_flow_id = NULL;
>  	}
>  #endif
>  	down_write(&efx->filter_sem);
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
  2020-08-18 18:58   ` Jesse Brandeburg
@ 2020-08-18 19:01     ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:01 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Jesse Brandeburg wrote:

> Edward Cree wrote:
> 
> > If an ef100_net_open() fails, ef100_net_stop() may be called without
> >  channel->rps_flow_id having been written; thus it may hold the address
> >  freed by a previous ef100_net_stop()'s call to efx_remove_filters().
> >  This then causes a double-free when efx_remove_filters() is called
> >  again, leading to a panic.
> > To prevent this, after freeing it, overwrite it with NULL.
> > 
> > Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> > Signed-off-by: Edward Cree <ecree@solarflare.com>

My mailer messed up my previous reply, but this is what I meant to say:
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:03   ` Jesse Brandeburg
  0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:03 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev

Edward Cree wrote:

> If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
>  failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
>  needed and will cause error messages and stack traces.
> So instead, only do this if efx_nic_init_interrupt successfully completed,
>  as indicated by the new efx->irqs_hooked flag.
> 
> Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Makes sense.
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 0/4] sfc: more EF100 fixes
  2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
                   ` (3 preceding siblings ...)
  2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:49 ` David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2020-08-18 19:49 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, netdev

From: Edward Cree <ecree@solarflare.com>
Date: Tue, 18 Aug 2020 13:41:49 +0100

> Fix up some bugs in the initial EF100 submission, and re-fix
>  the hash_valid fix which was incomplete.
> 
> The reset bugs are currently hard to trigger; they were found
>  with an in-progress patch adding ethtool support, whereby
>  ethtool --reset reliably reproduces them.

Series applied.

Thanks for the review Jesse.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-08-18 19:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
2020-08-18 18:54   ` Jesse Brandeburg
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
2020-08-18 18:55   ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
2020-08-18 18:58   ` Jesse Brandeburg
2020-08-18 19:01     ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
2020-08-18 19:03   ` Jesse Brandeburg
2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).