From: Michal Hocko <mhocko@suse.cz>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov@parallels.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [patch 13/13] mm: memcontrol: rewrite uncharge API
Date: Wed, 23 Jul 2014 16:38:47 +0200
Message-ID: <20140723143847.GB16721@dhcp22.suse.cz>
In-Reply-To: <CAJfpegscT-ptQzq__uUV2TOn7Uvs6x4FdWGTQb9Fe9MEJr2KjA@mail.gmail.com>

On Tue 22-07-14 17:44:43, Miklos Szeredi wrote:
> On Tue, Jul 22, 2014 at 5:08 PM, Michal Hocko <mhocko@suse.cz> wrote:
> > On Sat 19-07-14 13:39:11, Johannes Weiner wrote:
> >> On Fri, Jul 18, 2014 at 05:12:54PM +0200, Miklos Szeredi wrote:
> >> > On Fri, Jul 18, 2014 at 4:45 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> >> >
> >> > > I assumed the source page would always be new, according to this part
> >> > > in fuse_try_move_page():
> >> > >
> >> > >         /*
> >> > >          * This is a new and locked page, it shouldn't be mapped or
> >> > >          * have any special flags on it
> >> > >          */
> >> > >         if (WARN_ON(page_mapped(oldpage)))
> >> > >                 goto out_fallback_unlock;
> >> > >         if (WARN_ON(page_has_private(oldpage)))
> >> > >                 goto out_fallback_unlock;
> >> > >         if (WARN_ON(PageDirty(oldpage) || PageWriteback(oldpage)))
> >> > >                 goto out_fallback_unlock;
> >> > >         if (WARN_ON(PageMlocked(oldpage)))
> >> > >                 goto out_fallback_unlock;
> >> > >
> >> > > However, it's in the page cache and I can't really convince myself
> >> > > that it's not also on the LRU.  Miklos, I have trouble pinpointing
> >> > > where oldpage is instantiated exactly and what state it might be in -
> >> > > can it already be on the LRU?
> >> >
> >> > oldpage comes from ->readpages() (*NOT* ->readpage()), i.e. readahead.
> >> >
> >> > AFAICS it is added to the LRU in read_cache_pages(), so it looks like
> >> > it is definitely on the LRU at that point.
> >
> > OK, so my understanding of the code was wrong :/ and staring at it for
> > quite a while didn't help much. The fuse code is so full of indirection
> > it makes my head spin.
> 
> Definitely needs a rewrite.  But forget the complexities for the
> moment and just consider this single case:
> 
>  ->readpages() is called to do some readahead, pages are locked, added
> to the page cache and, AFAICS, charged to a memcg (in
> add_to_page_cache_lru()).
> 
>  - fuse sends a READ request to userspace and it gets a reply with
> splice(... SPLICE_F_MOVE).  What this means is that a bunch of pages of
> indefinite origin are to replace (if possible) the pages already in
> the page cache.  If not possible, for some reason, it falls back to
> copying the contents.  So, AFAICS, the oldpage and the newpage can be
> charged to a different memcg.

OK, thanks for the clarification. I had this feeling but couldn't wrap
my head around the indirection of the code.

It seems that checking PageCgroupUsed(new) and bailing out early in
mem_cgroup_migrate should just work, no?
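
To make the idea concrete, here is a rough user-space model of the check,
not kernel code. struct page_model, struct memcg, model_migrate() and the
result enum are all made-up stand-ins for illustration; the `used` field
only models what PageCgroupUsed() reports. The point is just that a
replacement page which already carries its own charge is left alone, so
the old page's charge is never written over it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; only its identity matters. */
struct memcg {
	const char *name;
};

/* Hypothetical stand-in for per-page charge state (not the kernel struct). */
struct page_model {
	struct memcg *memcg;	/* owner of the charge, NULL when uncharged */
	bool used;		/* models what PageCgroupUsed() would report */
};

enum migrate_result {
	MIGRATE_SKIPPED_NEW_CHARGED,	/* new page already had a charge */
	MIGRATE_SKIPPED_OLD_UNCHARGED,	/* nothing to transfer */
	MIGRATE_TRANSFERRED,		/* charge moved from old to new */
};

/* Model of a charge transfer with an early bail-out on a charged new page. */
static enum migrate_result model_migrate(struct page_model *oldpage,
					 struct page_model *newpage)
{
	/* Page cache replacement: the new page may already be charged. */
	if (newpage->used)
		return MIGRATE_SKIPPED_NEW_CHARGED;

	if (!oldpage->used)
		return MIGRATE_SKIPPED_OLD_UNCHARGED;

	/* A page marked as used must have an owner in this model. */
	assert(oldpage->memcg != NULL);

	newpage->memcg = oldpage->memcg;
	newpage->used = true;
	oldpage->memcg = NULL;
	oldpage->used = false;
	return MIGRATE_TRANSFERRED;
}
```

In this model, two pages charged to different groups both keep their
charges after the call, while an uncharged new page inherits the old
page's charge. What should then happen to the old page's charge in the
skipped case is outside the scope of the sketch.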

> > How should we test this code path, Miklos?
> 
>   fusexmp_fh -osplice_write,splice_move /mnt/fuse
> 
> This will mirror / under /mnt/fuse and will use splice to move data
> from the underlying filesystem to the fuse filesystem, hopefully.
> 
> It would be useful if it had some instrumentation telling us the
> actual number of pages successfully moved, but it doesn't have that
> yet.

Thanks, I will try to play with this tomorrow when I have more time.
-- 
Michal Hocko
SUSE Labs
