netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
@ 2024-02-24  9:06 Eric Dumazet
  2024-02-26  1:33 ` shaozhengchao
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Eric Dumazet @ 2024-02-24  9:06 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet, Zhengchao Shao

This is a followup of commit 234ec0b6034b ("netlink: fix potential
sleeping issue in mqueue_flush_file"), because vfree_atomic()
overhead is unfortunate for medium sized allocations.

1) If the allocation is smaller than PAGE_SIZE, do not bother
   with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
   while NLMSG_GOODSIZE is smaller than 8KB.

2) Use kvmalloc(), which might allocate one high order page
   instead of vmalloc if memory is not too fragmented.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Zhengchao Shao <shaozhengchao@huawei.com>
---
 net/netlink/af_netlink.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1206,23 +1206,21 @@ struct sock *netlink_getsockbyfilp(struct file *filp)
 
 struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
 {
+	size_t head_size = SKB_HEAD_ALIGN(size);
 	struct sk_buff *skb;
 	void *data;
 
-	if (size <= NLMSG_GOODSIZE || broadcast)
+	if (head_size <= PAGE_SIZE || broadcast)
 		return alloc_skb(size, GFP_KERNEL);
 
-	size = SKB_DATA_ALIGN(size) +
-	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
-	data = vmalloc(size);
-	if (data == NULL)
+	data = kvmalloc(head_size, GFP_KERNEL);
+	if (!data)
 		return NULL;
 
-	skb = __build_skb(data, size);
-	if (skb == NULL)
-		vfree(data);
-	else
+	skb = __build_skb(data, head_size);
+	if (!skb)
+		kvfree(data);
+	else if (is_vmalloc_addr(data))
 		skb->destructor = netlink_skb_destructor;
 
 	return skb;
-- 
2.44.0.rc1.240.g4c46232300-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
  2024-02-24  9:06 [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() Eric Dumazet
@ 2024-02-26  1:33 ` shaozhengchao
  2024-02-27 17:52 ` Jakub Kicinski
  2024-02-27 19:20 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: shaozhengchao @ 2024-02-26  1:33 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet



On 2024/2/24 17:06, Eric Dumazet wrote:
> This is a followup of commit 234ec0b6034b ("netlink: fix potential
> sleeping issue in mqueue_flush_file"), because vfree_atomic()
> overhead is unfortunate for medium sized allocations.
> 
> 1) If the allocation is smaller than PAGE_SIZE, do not bother
>     with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
>     while NLMSG_GOODSIZE is smaller than 8KB.
> 
> 2) Use kvmalloc(), which might allocate one high order page
>     instead of vmalloc if memory is not too fragmented.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
>   net/netlink/af_netlink.c | 18 ++++++++----------
>   1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -1206,23 +1206,21 @@ struct sock *netlink_getsockbyfilp(struct file *filp)
>   
>   struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
>   {
> +	size_t head_size = SKB_HEAD_ALIGN(size);
>   	struct sk_buff *skb;
>   	void *data;
>   
> -	if (size <= NLMSG_GOODSIZE || broadcast)
> +	if (head_size <= PAGE_SIZE || broadcast)
>   		return alloc_skb(size, GFP_KERNEL);
>   
> -	size = SKB_DATA_ALIGN(size) +
> -	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> -
> -	data = vmalloc(size);
> -	if (data == NULL)
> +	data = kvmalloc(head_size, GFP_KERNEL);
> +	if (!data)
>   		return NULL;
>   
> -	skb = __build_skb(data, size);
> -	if (skb == NULL)
> -		vfree(data);
> -	else
> +	skb = __build_skb(data, head_size);
> +	if (!skb)
> +		kvfree(data);
> +	else if (is_vmalloc_addr(data))
>   		skb->destructor = netlink_skb_destructor;
>   
>   	return skb;
LGTM, thanks.

Reviewed-by: Zhengchao Shao <shaozhengchao@huawei.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
  2024-02-24  9:06 [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() Eric Dumazet
  2024-02-26  1:33 ` shaozhengchao
@ 2024-02-27 17:52 ` Jakub Kicinski
  2024-02-27 18:15   ` Eric Dumazet
  2024-02-27 19:20 ` patchwork-bot+netdevbpf
  2 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2024-02-27 17:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, eric.dumazet, Zhengchao Shao

On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote:
>  struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
>  {
> +	size_t head_size = SKB_HEAD_ALIGN(size);
>  	struct sk_buff *skb;
>  	void *data;
>  
> -	if (size <= NLMSG_GOODSIZE || broadcast)
> +	if (head_size <= PAGE_SIZE || broadcast)
>  		return alloc_skb(size, GFP_KERNEL);
>  
> -	size = SKB_DATA_ALIGN(size) +
> -	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> -
> -	data = vmalloc(size);
> -	if (data == NULL)
> +	data = kvmalloc(head_size, GFP_KERNEL);
> +	if (!data)
>  		return NULL;
>  
> -	skb = __build_skb(data, size);
> -	if (skb == NULL)
> -		vfree(data);
> -	else
> +	skb = __build_skb(data, head_size);

Is this going to work with KFENCE? Don't we need similar size
adjustment logic as we have in __slab_build_skb() ?

> +	if (!skb)
> +		kvfree(data);
> +	else if (is_vmalloc_addr(data))
>  		skb->destructor = netlink_skb_destructor;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
  2024-02-27 17:52 ` Jakub Kicinski
@ 2024-02-27 18:15   ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2024-02-27 18:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, eric.dumazet, Zhengchao Shao

On Tue, Feb 27, 2024 at 6:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote:
> >  struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
> >  {
> > +     size_t head_size = SKB_HEAD_ALIGN(size);
> >       struct sk_buff *skb;
> >       void *data;
> >
> > -     if (size <= NLMSG_GOODSIZE || broadcast)
> > +     if (head_size <= PAGE_SIZE || broadcast)
> >               return alloc_skb(size, GFP_KERNEL);
> >
> > -     size = SKB_DATA_ALIGN(size) +
> > -            SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > -
> > -     data = vmalloc(size);
> > -     if (data == NULL)
> > +     data = kvmalloc(head_size, GFP_KERNEL);
> > +     if (!data)
> >               return NULL;
> >
> > -     skb = __build_skb(data, size);
> > -     if (skb == NULL)
> > -             vfree(data);
> > -     else
> > +     skb = __build_skb(data, head_size);
>
> Is this going to work with KFENCE? Don't we need similar size
> adjustment logic as we have in __slab_build_skb() ?

Note that the 2nd argument of  __build_skb() has not been changed by my patch.

 SKB_HEAD_ALIGN(size) == SKB_DATA_ALIGN(size) +

SKB_DATA_ALIGN(sizeof(struct skb_shared_info));

I do not expect kfence being a problem here ?

Either data is vmalloc, and the patch is a no-op,
either it is kmalloc(), and __build_skb() does nothing special,
kfence magic already happened.

>
> > +     if (!skb)
> > +             kvfree(data);

Note that skb->head at this point must be equal to @data

> > +     else if (is_vmalloc_addr(data))
> >               skb->destructor = netlink_skb_destructor;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
  2024-02-24  9:06 [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() Eric Dumazet
  2024-02-26  1:33 ` shaozhengchao
  2024-02-27 17:52 ` Jakub Kicinski
@ 2024-02-27 19:20 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-02-27 19:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, pabeni, netdev, eric.dumazet, shaozhengchao

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 24 Feb 2024 09:06:30 +0000 you wrote:
> This is a followup of commit 234ec0b6034b ("netlink: fix potential
> sleeping issue in mqueue_flush_file"), because vfree_atomic()
> overhead is unfortunate for medium sized allocations.
> 
> 1) If the allocation is smaller than PAGE_SIZE, do not bother
>    with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
>    while NLMSG_GOODSIZE is smaller than 8KB.
> 
> [...]

Here is the summary with links:
  - [net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
    https://git.kernel.org/netdev/net-next/c/f8cbf6bde4c8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-02-27 19:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-24  9:06 [PATCH net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() Eric Dumazet
2024-02-26  1:33 ` shaozhengchao
2024-02-27 17:52 ` Jakub Kicinski
2024-02-27 18:15   ` Eric Dumazet
2024-02-27 19:20 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).