From: Shakeel Butt <shakeelb@google.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Yutian Yang <nglaive@gmail.com>, Michal Hocko <mhocko@kernel.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	 Cgroups <cgroups@vger.kernel.org>, Linux MM <linux-mm@kvack.org>,
	shenwenbo@zju.edu.cn
Subject: Re: [PATCH] memcg: charge semaphores and sem_undo objects
Date: Thu, 15 Jul 2021 11:22:05 -0700
Message-ID: <CALvZod7sGcOASaFi6st40DSsXh1a0mv7HQ7Vc1pXxnsDgmDPkg@mail.gmail.com>
In-Reply-To: <YPB1EPaunr5587h5@casper.infradead.org>

On Thu, Jul 15, 2021 at 10:50 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Jul 15, 2021 at 03:14:44AM -0400, Yutian Yang wrote:
> > This patch adds accounting flags to semaphores and sem_undo allocation
> > sites so that kernel could correctly charge these objects.
> >
> > A malicious user could take up more than 63GB unaccounted memory under
> > default sysctl settings by exploiting the unaccounted objects. She could
> > allocate up to 32,000 unaccounted semaphore sets with up to 32,000
> > unaccounted semaphore objects in each set. She could further allocate one
> > sem_undo unaccounted object for each semaphore set.
>
> Do we really have to account every object that's allocated on behalf of
> userspace?  ie how seriously do we take this kind of thing?  Are memcgs
> supposed to be a hard limit, or are they just a rough accounting thing?

Memcgs are used to provide isolation between different workloads
running on the system, not just rough accounting estimates. So, if
userspace can trigger an unbounded allocation, then that allocation
should be accounted.

>
> There could be a very large stream of patches turning GFP_KERNEL into
> GFP_KERNEL_ACCOUNT.  For example, file locks (fs/locks.c) are only
> allocated with GFP_KERNEL and you can allocate one lock per byte of a
> file.  I'm sure there are hundreds more places where we do similar things.

We used to do opt-out kmem memcg accounting but switched to opt-in
with a9bb7e620efdf ("memcg: only account kmem allocations marked as
__GFP_ACCOUNT"), on the grounds that the allocations which should not
be charged outnumber the allocations which should be charged.

Personally, I would prefer that we go back to opt-out accounting,
especially now that we have switched to reparenting the kmem charges
and to shared kmem caches.

