netdev.vger.kernel.org archive mirror
* rtnl_lock() question
@ 2019-09-03 21:55 Jonathan Lemon
  2019-09-04  7:39 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Lemon @ 2019-09-03 21:55 UTC (permalink / raw)
  To: Netdev

How appropriate is it to hold the rtnl_lock() across a sleepable
memory allocation?  On one hand it's just a mutex, but it would
seem like it could block quite a few things.
-- 
Jonathan


* Re: rtnl_lock() question
  2019-09-03 21:55 rtnl_lock() question Jonathan Lemon
@ 2019-09-04  7:39 ` Eric Dumazet
  2019-09-04 16:38   ` Jonathan Lemon
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2019-09-04  7:39 UTC (permalink / raw)
  To: Jonathan Lemon, Netdev



On 9/3/19 11:55 PM, Jonathan Lemon wrote:
> How appropriate is it to hold the rtnl_lock() across a sleepable
> memory allocation?  On one hand it's just a mutex, but it would
> seem like it could block quite a few things.
> 

Sure, all GFP_KERNEL allocations can sleep for quite a while.

On the other hand, we may want to delay stuff if memory is under pressure,
or complex operations like NEWLINK would fail.

RTNL is mostly taken for control path operations. We prefer them to be
mostly reliable, otherwise an admin's job would be a nightmare.

In some cases, it is relatively easy to pre-allocate memory before rtnl is taken,
but that will only take care of some selected paths.
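
A minimal sketch of the pre-allocation pattern mentioned above, written as a userspace analogy rather than kernel code. Everything here is an assumption made for illustration: `fake_rtnl` is a pthread mutex standing in for RTNL, `calloc()` stands in for a sleeping GFP_KERNEL allocation, and `struct fake_dev` and `fake_dev_resize()` are hypothetical names. The point is only that the slow allocation happens before the lock is taken, so the lock is held just long enough to swap pointers.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-in for RTNL: one global mutex (assumption). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical device state; both fields are protected by fake_rtnl. */
struct fake_dev {
	void *ring;        /* currently installed buffer */
	size_t ring_size;  /* size of that buffer in bytes */
};

/*
 * Allocate the replacement buffer first (the potentially slow step),
 * then take the lock only to publish it. The old buffer is freed
 * after the lock has been released.
 */
int fake_dev_resize(struct fake_dev *dev, size_t new_size)
{
	void *new_ring;
	void *old_ring;

	assert(dev != NULL && new_size > 0);

	new_ring = calloc(1, new_size);  /* no lock held here */
	if (!new_ring)
		return -ENOMEM;

	pthread_mutex_lock(&fake_rtnl);
	old_ring = dev->ring;
	dev->ring = new_ring;
	dev->ring_size = new_size;
	pthread_mutex_unlock(&fake_rtnl);

	free(old_ring);                  /* also outside the lock */
	return 0;
}
```

This only works when the allocation size is known before the lock is taken, which is why it covers some selected paths and not others.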


* Re: rtnl_lock() question
  2019-09-04  7:39 ` Eric Dumazet
@ 2019-09-04 16:38   ` Jonathan Lemon
  2019-09-04 23:23     ` Saeed Mahameed
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Lemon @ 2019-09-04 16:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Netdev, Saeed Mahameed

On 4 Sep 2019, at 0:39, Eric Dumazet wrote:

> On 9/3/19 11:55 PM, Jonathan Lemon wrote:
>> How appropriate is it to hold the rtnl_lock() across a sleepable
>> memory allocation?  On one hand it's just a mutex, but it would
>> seem like it could block quite a few things.
>>
>
> Sure, all GFP_KERNEL allocations can sleep for quite a while.
>
> On the other hand, we may want to delay stuff if memory is under 
> pressure,
> or complex operations like NEWLINK would fail.
>
> RTNL is mostly taken for control path operations, we prefer them to be
> mostly reliable, otherwise admins job would be a nightmare.
>
> In some cases, it is relatively easy to pre-allocate memory before 
> rtnl is taken,
> but that will only take care of some selected paths.

The particular code path that I'm looking at is mlx5e_tx_timeout_work().

This is called on TX timeout, and mlx5 wants to move an entire channel
and all the supporting structures elsewhere.  Under the rtnl_lock(), it
calls kvzalloc() in order to grab a large chunk of contiguous memory,
which ends up stalling the system.

I suspect these large allocations should really be done outside the lock.
-- 
Jonathan


* Re: rtnl_lock() question
  2019-09-04 16:38   ` Jonathan Lemon
@ 2019-09-04 23:23     ` Saeed Mahameed
  2019-09-05 18:07       ` Rustad, Mark D
  0 siblings, 1 reply; 5+ messages in thread
From: Saeed Mahameed @ 2019-09-04 23:23 UTC (permalink / raw)
  To: jonathan.lemon, eric.dumazet; +Cc: netdev

On Wed, 2019-09-04 at 09:38 -0700, Jonathan Lemon wrote:
> On 4 Sep 2019, at 0:39, Eric Dumazet wrote:
> 
> > On 9/3/19 11:55 PM, Jonathan Lemon wrote:
> > > How appropriate is it to hold the rtnl_lock() across a sleepable
> > > memory allocation?  On one hand it's just a mutex, but it would
> > > seem like it could block quite a few things.
> > > 
> > 
> > Sure, all GFP_KERNEL allocations can sleep for quite a while.
> > 
> > On the other hand, we may want to delay stuff if memory is under 
> > pressure,
> > or complex operations like NEWLINK would fail.
> > 
> > RTNL is mostly taken for control path operations, we prefer them to
> > be
> > mostly reliable, otherwise admins job would be a nightmare.
> > 
> > In some cases, it is relatively easy to pre-allocate memory before 
> > rtnl is taken,
> > but that will only take care of some selected paths.
> 
> The particular code path that I'm looking at is
> mlx5e_tx_timeout_work().
> 
> This is called on TX timeout, and mlx5 wants to move an entire
> channel
> and all the supporting structures elsewhere.  Under the rtnl_lock(),
> it
> calls kvzalloc() in order to grab a large chunk of contig memory,
> which
> ends up stalling the system.
> 
> I suspect these large allocations should really be done outside the
> lock.

I am afraid that is impossible, at least not for all allocations.

Some allocations require parameters that should remain valid and
constant across the whole reconfiguration procedure, such as
params.num_channels, so they must be done inside the lock.

Other allocations are buried so deep inside mlx5 that pre-allocating
them would require a lot of refactoring.

One idea is to use some sort of mem cache specifically for mlx5
reconfiguration that is cheaper to call than raw kvzalloc. But
different objects in the mlx5 reconfiguration path require different
memory types, NUMA affinity, etc., which might make it harder for the
cache to satisfy all requirements.
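
To make the cache idea concrete, here is a rough userspace sketch of a small pre-filled reserve. It is not mlx5 code and not a kernel API: `struct fake_reserve`, `RESERVE_SLOTS` and the `fake_reserve_*()` helpers are all hypothetical names, and `calloc()` stands in for kvzalloc. The reserve is topped up while no heavy lock is held, so the code that later runs under the lock can take an object without entering the allocator. Note that it holds objects of a single size only, which is exactly the limitation described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define RESERVE_SLOTS 4  /* arbitrary capacity chosen for the sketch */

/* Hypothetical reserve of same-sized, zeroed objects. */
struct fake_reserve {
	pthread_mutex_t lock;       /* protects nr and slot[] */
	size_t obj_size;
	unsigned int nr;            /* number of filled slots */
	void *slot[RESERVE_SLOTS];
};

void fake_reserve_init(struct fake_reserve *r, size_t obj_size)
{
	assert(r != NULL && obj_size > 0);
	pthread_mutex_init(&r->lock, NULL);
	r->obj_size = obj_size;
	r->nr = 0;
}

/*
 * Top up the reserve. Meant to be called while the "big" lock is not
 * held: the allocation itself happens without any lock taken.
 * Returns the fill level last observed by this call.
 */
unsigned int fake_reserve_refill(struct fake_reserve *r)
{
	for (;;) {
		unsigned int nr;
		void *obj;

		pthread_mutex_lock(&r->lock);
		nr = r->nr;
		pthread_mutex_unlock(&r->lock);
		if (nr == RESERVE_SLOTS)
			return nr;

		obj = calloc(1, r->obj_size);
		if (!obj)
			return nr;

		pthread_mutex_lock(&r->lock);
		if (r->nr < RESERVE_SLOTS) {
			r->slot[r->nr++] = obj;
			obj = NULL;         /* ownership moved to the reserve */
		}
		pthread_mutex_unlock(&r->lock);
		free(obj);                  /* no-op unless the reserve was full */
	}
}

/* Take one object without calling the allocator; NULL if empty. */
void *fake_reserve_get(struct fake_reserve *r)
{
	void *obj = NULL;

	pthread_mutex_lock(&r->lock);
	if (r->nr > 0)
		obj = r->slot[--r->nr];
	pthread_mutex_unlock(&r->lock);
	return obj;
}
```

The in-kernel mempool API (linux/mempool.h) is built around a similar reserve concept, though whether it would fit the mlx5 reconfiguration path is not something this thread establishes.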


* Re: rtnl_lock() question
  2019-09-04 23:23     ` Saeed Mahameed
@ 2019-09-05 18:07       ` Rustad, Mark D
  0 siblings, 0 replies; 5+ messages in thread
From: Rustad, Mark D @ 2019-09-05 18:07 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: jonathan.lemon, eric.dumazet, netdev


On Sep 4, 2019, at 4:23 PM, Saeed Mahameed <saeedm@mellanox.com> wrote:

> some allocations require parameters that should remain valid and
> constant across the whole reconfiguration procedure, such as
> params.num_channels, so they must be done inside the lock.

You could always check if those parameters have changed once under the lock  
and, if they did, drop the lock, reallocate and try again. Since such  
changes should be very infrequent, this is something that really should not  
loop multiple times.
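
The check-and-retry approach described above could look roughly like the following userspace sketch. All names are assumptions for illustration: `cfg_lock` is a pthread mutex standing in for RTNL, `struct fake_params` mimics a structure with a `num_channels` field, and `fake_rebuild_channels()` is a hypothetical helper. The parameter is snapshotted under the lock, the allocation is done with the lock dropped, and the result is installed only if the parameter is still unchanged when the lock is retaken.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-in for RTNL (assumption). */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;

struct fake_params {
	unsigned int num_channels;  /* protected by cfg_lock */
};

struct fake_channels {
	unsigned int count;
	int *chan;                  /* one entry per channel */
};

int fake_rebuild_channels(struct fake_params *params,
			  struct fake_channels *out)
{
	assert(params != NULL && out != NULL);

	for (;;) {
		unsigned int snap;
		int *buf, *old;

		/* 1. Snapshot the parameter, then drop the lock. */
		pthread_mutex_lock(&cfg_lock);
		snap = params->num_channels;
		pthread_mutex_unlock(&cfg_lock);

		/* 2. Slow allocation with no lock held. */
		buf = calloc(snap ? snap : 1, sizeof(*buf));
		if (!buf)
			return -ENOMEM;

		/* 3. Retake the lock and check the snapshot is still valid. */
		pthread_mutex_lock(&cfg_lock);
		if (params->num_channels == snap) {
			old = out->chan;
			out->chan = buf;
			out->count = snap;
			pthread_mutex_unlock(&cfg_lock);
			free(old);
			return 0;
		}
		pthread_mutex_unlock(&cfg_lock);

		/* 4. Parameter changed in the meantime: discard and retry. */
		free(buf);
	}
}
```

Because the parameter rarely changes, the loop body is expected to run once in the common case; the retry path exists only to keep the result consistent with the value seen under the lock.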

--
Mark Rustad, Networking Division, Intel Corporation
