All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id
@ 2021-12-13 14:11 Dennis Dalessandro
  2022-01-04 14:30 ` Jason Gunthorpe
  2022-01-05 17:45 ` Jason Gunthorpe
  0 siblings, 2 replies; 4+ messages in thread
From: Dennis Dalessandro @ 2021-12-13 14:11 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma, Mike Marciniszyn, stable

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

Despite the patch noted below, the warning still happens with
certain kernel configs.

It appears that either check_preemption_disabled() is inconsistent with
with debug_rcu_read_lock() or the patch incorrectly assumes that an RCU
critical section will prevent the current cpu from changing.

A clarification has been solicited via:
https://lore.kernel.org/linux-rdma/CH0PR01MB71536FB1BD5ECF16E65CB3BFF26F9@CH0PR01MB7153.prod.exchangelabs.com/T/#u

This patch will silence the warning for now by using get_cpu()/put_cpu().

Fixes: b6d57e24ce6c ("IB/hfi1: Insure use of smp_processor_id() is preempt disabled")
Cc: stable@vger.kernel.org
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/sdma.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index f07d328..809096d 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -839,15 +839,15 @@ struct sdma_engine *sdma_select_user_engine(struct hfi1_devdata *dd,
 		goto out;
 
 	rcu_read_lock();
-	cpu_id = smp_processor_id();
+	cpu_id = get_cpu();
 	rht_node = rhashtable_lookup(dd->sdma_rht, &cpu_id,
 				     sdma_rht_params);
-
 	if (rht_node && rht_node->map[vl]) {
 		struct sdma_rht_map_elem *map = rht_node->map[vl];
 
 		sde = map->sde[selector & map->mask];
 	}
+	put_cpu();
 	rcu_read_unlock();
 
 	if (sde)


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id
  2021-12-13 14:11 [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id Dennis Dalessandro
@ 2022-01-04 14:30 ` Jason Gunthorpe
  2022-01-04 14:31   ` Marciniszyn, Mike
  2022-01-05 17:45 ` Jason Gunthorpe
  1 sibling, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2022-01-04 14:30 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: linux-rdma, Mike Marciniszyn, stable

On Mon, Dec 13, 2021 at 09:11:19AM -0500, Dennis Dalessandro wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> 
> Despite the patch noted below, the warning still happens with
> certain kernel configs.
> 
> It appears that either check_preemption_disabled() is inconsistent with
> with debug_rcu_read_lock() or the patch incorrectly assumes that an RCU
> critical section will prevent the current cpu from changing.
> 
> A clarification has been solicited via:
> https://lore.kernel.org/linux-rdma/CH0PR01MB71536FB1BD5ECF16E65CB3BFF26F9@CH0PR01MB7153.prod.exchangelabs.com/T/#u

Did nobody answer your thread asking about this?

Jason

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id
  2022-01-04 14:30 ` Jason Gunthorpe
@ 2022-01-04 14:31   ` Marciniszyn, Mike
  0 siblings, 0 replies; 4+ messages in thread
From: Marciniszyn, Mike @ 2022-01-04 14:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Dalessandro, Dennis; +Cc: linux-rdma, stable

> 
> On Mon, Dec 13, 2021 at 09:11:19AM -0500, Dennis Dalessandro wrote:
> > From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> >
> > Despite the patch noted below, the warning still happens with certain
> > kernel configs.
> >
> > It appears that either check_preemption_disabled() is inconsistent
> > with with debug_rcu_read_lock() or the patch incorrectly assumes that
> > an RCU critical section will prevent the current cpu from changing.
> >
> > A clarification has been solicited via:
> > https://lore.kernel.org/linux-
> rdma/CH0PR01MB71536FB1BD5ECF16E65CB3BFF2
> > 6F9@CH0PR01MB7153.prod.exchangelabs.com/T/#u
> 
> Did nobody answer your thread asking about this?
> 
> Jason

No.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id
  2021-12-13 14:11 [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id Dennis Dalessandro
  2022-01-04 14:30 ` Jason Gunthorpe
@ 2022-01-05 17:45 ` Jason Gunthorpe
  1 sibling, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2022-01-05 17:45 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: linux-rdma, Mike Marciniszyn, stable

On Mon, Dec 13, 2021 at 09:11:19AM -0500, Dennis Dalessandro wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> 
> Despite the patch noted below, the warning still happens with
> certain kernel configs.
> 
> It appears that either check_preemption_disabled() is inconsistent with
> with debug_rcu_read_lock() or the patch incorrectly assumes that an RCU
> critical section will prevent the current cpu from changing.
> 
> A clarification has been solicited via:
> https://lore.kernel.org/linux-rdma/CH0PR01MB71536FB1BD5ECF16E65CB3BFF26F9@CH0PR01MB7153.prod.exchangelabs.com/T/#u
> 
> This patch will silence the warning for now by using get_cpu()/put_cpu().
> 
> Fixes: b6d57e24ce6c ("IB/hfi1: Insure use of smp_processor_id() is preempt disabled")
> Cc: stable@vger.kernel.org
> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> ---
>  drivers/infiniband/hw/hfi1/sdma.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
> index f07d328..809096d 100644
> --- a/drivers/infiniband/hw/hfi1/sdma.c
> +++ b/drivers/infiniband/hw/hfi1/sdma.c
> @@ -839,15 +839,15 @@ struct sdma_engine *sdma_select_user_engine(struct hfi1_devdata *dd,
>  		goto out;
>  
>  	rcu_read_lock();
> -	cpu_id = smp_processor_id();
> +	cpu_id = get_cpu();
>  	rht_node = rhashtable_lookup(dd->sdma_rht, &cpu_id,
>  				     sdma_rht_params);
> -
>  	if (rht_node && rht_node->map[vl]) {
>  		struct sdma_rht_map_elem *map = rht_node->map[vl];
>  
>  		sde = map->sde[selector & map->mask];
>  	}
> +	put_cpu();
>  	rcu_read_unlock();

None of this makes any sense to me.. We have RCU locking but what is
the RCU dereference protecting map and sde? How can sde be taken
outside the lock without protection?

I see stuff like this:

				ret = rhashtable_remove_fast(dd->sdma_rht,
							     &rht_node->node,
							     sdma_rht_params);
				kfree(rht_node);

Which tells me the RCU is not being used properly.

It looks like this all relies on the cpu_id being exclusive at lookup
when the remove is being done, which I think, is not really true under
preempt rt anymore - so at least that is a functional bug fix, not
just a warning suppression.

It seems to me this code is really trying to get the CPU the task is
scheduled on:

	if (current->nr_cpus_allowed != 1)
		goto out;

Why not just get it directly from current? And how is all of that not
racy anyhow with task cpuset changes?

What happens if the wrong sde is selected anyhow?

This all seems so very sketchy and wrongish.

Jason

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-01-05 17:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-13 14:11 [PATCH for-rc] IB/hfi1: Insure preempt is held across smp_processor_id Dennis Dalessandro
2022-01-04 14:30 ` Jason Gunthorpe
2022-01-04 14:31   ` Marciniszyn, Mike
2022-01-05 17:45 ` Jason Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.