From: Joonsoo Kim <js1304@gmail.com>
To: Yang Shi <shy828301@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Hugh Dickins <hughd@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	kernel-team@lge.com, Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: [PATCH v5 05/10] mm/swap: charge the page when adding to the swap cache
Date: Tue, 7 Apr 2020 10:27:44 +0900
Message-ID: <CAAmzW4MNqyYqH+GbOE8Ardz2BNi5whHxP0FmwgjX1zPHNCXw_g@mail.gmail.com>
In-Reply-To: <CAHbLzkoL7zKOFtRghEfsfeKOERZmTkjfi8MynuHf4oKXD9mcvQ@mail.gmail.com>

On Tue, Apr 7, 2020 at 9:22 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Sun, Apr 5, 2020 at 6:03 PM Joonsoo Kim <js1304@gmail.com> wrote:
> >
> > On Sat, Apr 4, 2020 at 3:29 AM Yang Shi <shy828301@gmail.com> wrote:
> > >
> > > On Thu, Apr 2, 2020 at 10:41 PM <js1304@gmail.com> wrote:
> > > >
> > > > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > >
> > > > Currently, some swapped-in pages are not charged to the memcg until
> > > > actual access to the page happens. I checked the code and found that
> > > > it could cause a problem. In this implementation, even if the memcg
> > > > is enabled, one can consume a lot of memory in the system by exploiting
> > > > this hole. For example, one can make all the pages swapped out and
> > > > then call madvise_willneed() to load all the swapped-out pages without
> > > > pressuring the memcg. Although actual access requires charging, it is a
> > > > big benefit to load the swapped-out pages into memory without pressuring
> > > > the memcg.
> > > >
> > > > And, for workingset detection, which is implemented in the following patch,
> > > > a memcg should be committed before the workingset detection is executed.
> > > > For this purpose, the best solution, I think, is charging the page when
> > > > adding it to the swap cache. Charging there is not that hard. The caller that
> > > > adds the page to the swap cache has enough information about the charged
> > > > memcg. So, what we need to do is just pass this information to
> > > > the right place.
> > > >
> > > > With this patch, specific memcg could be pressured more since readahead
> > > > pages are also charged to it now. This would result in performance
> > > > degradation to that user but it would be fair since that readahead is for
> > > > that user.
> > >
> > > If I read the code correctly, the readahead pages may *not* be charged
> > > to it at all but to other memcgs, since mem_cgroup_try_charge() would
> > > retrieve the target memcg id from the swap entry and then charge to it
> > > (generally it is the memcg from which the page was swapped out). So, it
> > > may open a backdoor to let one memcg stress other memcgs?
> >
> > It looks like you are talking about the call path with CONFIG_MEMCG_SWAP.
> >
> > The owner (task) of an anonymous page cannot be changed. It means that
> > the previous owner recorded in the swap entry will be the next user. So,
> > I think that using the target memcg id from the swap entry for readahead pages
> > is a valid approach.
> >
> > As you pointed out, if someone could control swap readahead to read ahead
> > another memcg's swap entries, one memcg could stress another by using the fact above.
> > However, as far as I know, there is no explicit way to read ahead another's swap
> > entries, so there is no problem.
>
> Swap cluster readahead would read in pages on consecutive swap
> entries which may belong to different memcgs. However, I just figured
> out that patch #8 ("mm/swap: do not readahead if the previous owner of the
> swap entry isn't me") would prevent reading ahead pages belonging
> to other memcgs. This would kill the potential problem.

Yes, that patch kills the potential problem. However, I think that swap cluster
readahead would not open the backdoor even without patch #8 in the
CONFIG_MEMCG_SWAP case, because:

1. Consecutive swap space is usually filled by the same task.
2. Swap cluster readahead imposes a large I/O cost on the offender, while the
effect on the target isn't serious.
3. Those pages would be charged to their previous owner, which is valid.
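
To make points 1 and 3 concrete, here is a rough user-space sketch. It is not
kernel code: struct swap_slot, cluster_readahead(), charged[], NR_MEMCG and
CLUSTER are all hypothetical names made up for illustration. It assumes each
swap slot remembers the memcg id of its owner at swap-out time, and that
readahead charges every page it brings in to that recorded owner. The
only_mine flag loosely models the idea of patch #8 (skip slots whose previous
owner is not the faulting memcg).

```c
#include <stddef.h>

#define NR_MEMCG 4   /* hypothetical number of memcg ids (0 = no owner) */
#define CLUSTER  8   /* hypothetical readahead cluster size, in slots */

/* Hypothetical model of a swap slot: only the recorded owner matters here. */
struct swap_slot {
	unsigned short owner_id;	/* memcg id saved at swap-out, 0 = unused */
};

/* Pages charged to each memcg id by readahead. */
static unsigned long charged[NR_MEMCG];

/*
 * Read ahead the aligned cluster that contains 'target'.
 * Every page read is charged to the owner recorded in its slot, not to
 * the faulting memcg. If 'only_mine' is nonzero, slots whose recorded
 * owner differs from 'faulting_id' are skipped.
 * Returns the number of pages read.
 */
static size_t cluster_readahead(const struct swap_slot *slots, size_t nr_slots,
				size_t target, unsigned short faulting_id,
				int only_mine)
{
	size_t start = (target / CLUSTER) * CLUSTER;
	size_t end = start + CLUSTER;
	size_t nr_read = 0;

	if (end > nr_slots)
		end = nr_slots;

	for (size_t i = start; i < end; i++) {
		unsigned short owner = slots[i].owner_id;

		if (owner == 0 || owner >= NR_MEMCG)
			continue;	/* unused or invalid slot */
		if (only_mine && owner != faulting_id)
			continue;	/* previous owner isn't me */

		charged[owner]++;	/* charge the previous owner */
		nr_read++;
	}
	return nr_read;
}
```

In this model, a cluster that is mostly filled by one task charges mostly that
task's memcg, and the rare foreign slot is charged to its own previous owner
rather than to the memcg that triggered the readahead.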

Thanks.
