linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] net: memcg: late association of sock to memcg
@ 2020-03-05 20:55 Shakeel Butt
  2020-03-05 21:17 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Shakeel Butt @ 2020-03-05 20:55 UTC (permalink / raw)
  To: Eric Dumazet, Roman Gushchin
  Cc: Johannes Weiner, Michal Hocko, Andrew Morton, David S . Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev, linux-mm, cgroups,
	linux-kernel, Shakeel Butt

If a TCP socket is allocated in IRQ context, or cloned in IRQ context
from a socket that is not associated with a memcg, it remains
unassociated for its whole life. Almost half of the TCP sockets created
on the system are created in IRQ context, so the memory used by such
sockets is not accounted to any memcg.

This issue is more widespread in cgroup v1 where network memory
accounting is opt-in but it can happen in cgroup v2 if the source socket
for the cloning was created in root memcg.

To fix the issue, associate such sockets late, at accept() time in
process context, and then force-charge the memory already reserved by
the socket.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Changes since v2:
- Additional check for charging.
- Release the sock after charging.

Changes since v1:
- added sk->sk_rmem_alloc to initial charging.
- added synchronization to get memory usage and set sk_memcg race-free.

 net/ipv4/inet_connection_sock.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a4db79b1b643..5face55cf818 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -482,6 +482,26 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
 		}
 		spin_unlock_bh(&queue->fastopenq.lock);
 	}
+
+	if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
+		int amt;
+
+		/* atomically get the memory usage, set and charge the
+		 * newsk->sk_memcg.
+		 */
+		lock_sock(newsk);
+
+		/* The socket has not been accepted yet, no need to look at
+		 * newsk->sk_wmem_queued.
+		 */
+		amt = sk_mem_pages(newsk->sk_forward_alloc +
+				   atomic_read(&newsk->sk_rmem_alloc));
+		mem_cgroup_sk_alloc(newsk);
+		if (newsk->sk_memcg && amt)
+			mem_cgroup_charge_skmem(newsk->sk_memcg, amt);
+
+		release_sock(newsk);
+	}
 out:
 	release_sock(sk);
 	if (req)
-- 
2.25.0.265.gbab2e86ba0-goog
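
The charging step in the hunk above can be followed with a small userspace sketch. It is only a toy model of the arithmetic, not kernel code: the `model_*` names, the 4096-byte page size, and the plain counter standing in for a memcg are all assumptions made for illustration. It shows the bytes a not-yet-accepted socket already holds (forward allocation plus receive-queue memory) being rounded up to whole pages and charged once, at the moment the socket is associated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page size for this model; the kernel uses its own quantum. */
#define MODEL_PAGE_SIZE 4096

/* Toy stand-in for a memcg: just a page counter. */
struct model_memcg {
	long charged_pages;
};

/* Toy stand-in for struct sock with only the fields the hunk reads. */
struct model_sock {
	int forward_alloc;          /* bytes reserved but not yet used */
	int rmem_alloc;             /* bytes sitting in the receive queue */
	struct model_memcg *memcg;  /* NULL while unassociated */
};

/* Round a byte count up to whole pages. */
static int model_mem_pages(int bytes)
{
	return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Associate an unassociated socket and charge what it already holds.
 * Returns the number of pages charged (0 if nothing was charged).
 */
static int model_late_associate(struct model_sock *newsk,
				struct model_memcg *current_memcg)
{
	int amt;

	assert(newsk != NULL);
	if (newsk->memcg)
		return 0;	/* already associated, nothing to do */

	/* Snapshot the usage before setting the association. */
	amt = model_mem_pages(newsk->forward_alloc + newsk->rmem_alloc);
	newsk->memcg = current_memcg;
	if (!newsk->memcg || !amt)
		return 0;

	newsk->memcg->charged_pages += amt;
	return amt;
}
```

In the patch itself the snapshot and the association happen under lock_sock(newsk), so that usage cannot change between the two steps; the model is single-threaded and leaves that out.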



* Re: [PATCH v3] net: memcg: late association of sock to memcg
  2020-03-05 20:55 [PATCH v3] net: memcg: late association of sock to memcg Shakeel Butt
@ 2020-03-05 21:17 ` Eric Dumazet
  2020-03-05 21:59   ` Shakeel Butt
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2020-03-05 21:17 UTC (permalink / raw)
  To: Shakeel Butt, Eric Dumazet, Roman Gushchin
  Cc: Johannes Weiner, Michal Hocko, Andrew Morton, David S . Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev, linux-mm, cgroups,
	linux-kernel



On 3/5/20 12:55 PM, Shakeel Butt wrote:
> [...]

This patch looks fine, but why keep the mem_cgroup_sk_alloc(newsk)
call in sk_clone_lock()?

Note that all TCP sk_clone_lock() calls happen in softirq context.
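
The point about softirq context can be illustrated with a toy model. The sketch assumes only what the patch description states, namely that a socket set up outside process context stays unassociated. The enum, structs, and function below are hypothetical names for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical execution contexts for this model. */
enum model_ctx { MODEL_CTX_PROCESS, MODEL_CTX_SOFTIRQ };

struct model_memcg { int id; };

struct model_sock { struct model_memcg *memcg; };

/* Try to associate sk with the memcg of the running task.
 * Outside process context the "current task" is whatever happened to be
 * interrupted, so the model refuses and leaves sk->memcg untouched.
 * Returns true if the socket ended up associated.
 */
static bool model_sk_alloc(struct model_sock *sk, enum model_ctx ctx,
			   struct model_memcg *task_memcg)
{
	assert(sk != NULL);
	if (ctx != MODEL_CTX_PROCESS)
		return false;
	sk->memcg = task_memcg;
	return sk->memcg != NULL;
}
```

Under this model a call made at clone time in softirq context never associates the child socket, which is why a later call from accept() in process context is the one that takes effect.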


* Re: [PATCH v3] net: memcg: late association of sock to memcg
  2020-03-05 21:17 ` Eric Dumazet
@ 2020-03-05 21:59   ` Shakeel Butt
  2020-03-05 22:45     ` Roman Gushchin
  0 siblings, 1 reply; 4+ messages in thread
From: Shakeel Butt @ 2020-03-05 21:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, Roman Gushchin, Johannes Weiner, Michal Hocko,
	Andrew Morton, David S . Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, Linux MM, Cgroups, LKML

On Thu, Mar 5, 2020 at 1:17 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 3/5/20 12:55 PM, Shakeel Butt wrote:
> > [...]
>
> This patch looks fine, but why keep the mem_cgroup_sk_alloc(newsk)
> call in sk_clone_lock()?
>
> Note that all TCP sk_clone_lock() calls happen in softirq context.

So, basically re-doing something like 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()") in this patch. I am fine with
that.

Roman, any concerns?


* Re: [PATCH v3] net: memcg: late association of sock to memcg
  2020-03-05 21:59   ` Shakeel Butt
@ 2020-03-05 22:45     ` Roman Gushchin
  0 siblings, 0 replies; 4+ messages in thread
From: Roman Gushchin @ 2020-03-05 22:45 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Eric Dumazet, Eric Dumazet, Johannes Weiner, Michal Hocko,
	Andrew Morton, David S . Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, Linux MM, Cgroups, LKML

On Thu, Mar 05, 2020 at 01:59:37PM -0800, Shakeel Butt wrote:
> On Thu, Mar 5, 2020 at 1:17 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >
> >
> > On 3/5/20 12:55 PM, Shakeel Butt wrote:
> > > [...]
> >
> > This patch looks fine, but why keep the mem_cgroup_sk_alloc(newsk)
> > call in sk_clone_lock()?
> >
> > Note that all TCP sk_clone_lock() calls happen in softirq context.
> 
> So, basically re-doing something like 9f1c2674b328 ("net: memcontrol:
> defer call to mem_cgroup_sk_alloc()") in this patch. I am fine with
> that.
> 
> Roman, any concerns?

Nothing at the moment. I'll try to test the final patch.
I hope it works better this time...

Thank you!
