From: Hugh Dickins <hughd@google.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Michal Hocko <mhocko@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hillf Danton <dhillf@gmail.com>,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace
Date: Mon, 9 Jul 2012 16:57:14 -0700 (PDT)
Message-ID: <alpine.LSU.2.00.1207091622470.2261@eggly.anvils>
In-Reply-To: <20120709141324.GK7315@mudshark.cambridge.arm.com>

On Mon, 9 Jul 2012, Will Deacon wrote:
> On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote:
> > On Wed 04-07-12 15:32:56, Will Deacon wrote:
> > > When allocating and returning clear huge pages to userspace as a
> > > response to a fault, we may zero and return a mapping to a previously
> > > dirtied physical region (for example, it may have been written by
> > > a private mapping which was freed as a result of an ftruncate on the
> > > backing file). On architectures with Harvard caches, this can lead to
> > > I/D inconsistency since the zeroed view may not be visible to the
> > > instruction stream.
> > > 
> > > This patch solves the problem by flushing the region after allocating
> > > and clearing a new huge page. Note that PowerPC avoids this issue by
> > > performing the flushing in their clear_user_page implementation to keep
> > > the loader happy, however this is closely tied to the semantics of the
> > > PG_arch_1 page flag which is architecture-specific.
> > > 
> > > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > > ---
> > >  mm/hugetlb.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index e198831..b83d026 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2646,6 +2646,7 @@ retry:
> > >  			goto out;
> > >  		}
> > >  		clear_huge_page(page, address, pages_per_huge_page(h));
> > > +		flush_dcache_page(page);
> > >  		__SetPageUptodate(page);
> > 
> > Does this have to be explicit in the arch independent code?
> > It seems that ia64 uses flush_dcache_page already in the clear_user_page
> 
> It would match what is done in similar situations by cow_user_page (mm/memory.c)
> and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page
> flushing (DMA bounce, ksm) so I think this is the right place for it.

I am not at all sure if you are right or not:
please let's consult linux-arch about this - now Cc'ed.

If this hugetlb_no_page() were solely mapping the hugepage into that
userspace, I would say you are wrong.  It's the job of clear_huge_page()
to take the mapped address into account, and pass it down to the
architecture-specific implementation, to do whatever flushing is
needed - you should be providing that in your architecture.

In particular, notice how clear_huge_page() goes round a loop of
clear_user_highpage()s: in your patch, you're expecting the implementation
of flush_dcache_page() to notice whether or not this is a hugepage, and
flush the appropriate size.
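
To make the size concern concrete, here is a minimal userspace sketch, not
kernel code. Every model_* name and the 4096-byte subpage size are
assumptions made up for illustration; the real clear_user_highpage() and
flush_dcache_page() are architecture-specific and may behave differently.
It only shows that a clear done as a loop over subpages is not covered by
a flush that knows about a single base page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SUBPAGE_SIZE 4096u	/* assumed base page size */

/* Hypothetical model of one base page: its data plus a "D-cache dirty" marker. */
struct model_subpage {
	unsigned char data[MODEL_SUBPAGE_SIZE];
	bool dcache_dirty;
};

/* Stand-in for clear_user_highpage(): zero one subpage, dirtying the modeled D-cache. */
static void model_clear_subpage(struct model_subpage *p)
{
	assert(p != NULL);
	memset(p->data, 0, sizeof(p->data));
	p->dcache_dirty = true;
}

/* Stand-in for clear_huge_page(): goes round a loop of subpage clears. */
static void model_clear_huge_page(struct model_subpage *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		model_clear_subpage(&pages[i]);
}

/* Stand-in for a flush that only knows about one base page. */
static void model_flush_base_page(struct model_subpage *p)
{
	assert(p != NULL);
	p->dcache_dirty = false;
}

/* The alternative: a loop of base-page flushes covering the whole huge page. */
static void model_flush_huge_page(struct model_subpage *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		model_flush_base_page(&pages[i]);
}

/* Count subpages whose modeled D-cache contents were never flushed. */
static size_t model_count_dirty(const struct model_subpage *pages, size_t nr)
{
	size_t dirty = 0;

	for (size_t i = 0; i < nr; i++)
		if (pages[i].dcache_dirty)
			dirty++;
	return dirty;
}
```

In this model, calling model_flush_base_page() on the first subpage after
model_clear_huge_page() leaves every other subpage dirty, whereas
model_flush_huge_page() leaves none.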

Perhaps yours is the only architecture to need this on huge, and your
flush_dcache_page() implements it correctly; but it does seem surprising.

If I start to grep the architectures for non-empty flush_dcache_page(),
I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing
if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where
the naming suggests that I'm right, it's the architecture's responsibility
to arrange whatever flushing is needed in its copy and clear page functions.
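
The test-and-set pattern quoted above can be sketched in userspace with C11
atomics. This is only an illustration of the idiom: the lazy_* names, the
bit position and the flush counter are hypothetical, and the real
PG_dcache_clean handling in arch/arm may differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define LAZY_PG_DCACHE_CLEAN (1ul << 0)	/* hypothetical bit position */

/* Hypothetical page model: a flags word plus a counter of flushes performed. */
struct lazy_page {
	atomic_ulong flags;
	unsigned int flush_count;
};

static void lazy_page_init(struct lazy_page *p)
{
	assert(p != NULL);
	atomic_init(&p->flags, 0ul);
	p->flush_count = 0;
}

/* Atomically set the bit and return its previous value, in the style of test_and_set_bit(). */
static bool lazy_test_and_set(atomic_ulong *flags, unsigned long mask)
{
	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/* Flush only if the page was not already marked clean; mark it clean either way. */
static void lazy_flush_if_needed(struct lazy_page *p)
{
	assert(p != NULL);
	if (!lazy_test_and_set(&p->flags, LAZY_PG_DCACHE_CLEAN))
		p->flush_count++;	/* stand-in for the actual cache flush */
}

/* A later write to the page would clear the clean bit again. */
static void lazy_mark_dirty(struct lazy_page *p)
{
	assert(p != NULL);
	atomic_fetch_and(&p->flags, ~LAZY_PG_DCACHE_CLEAN);
}
```

The point of the idiom is that repeated callers pay for at most one flush
until the page is dirtied again.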

But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts
the new page into page cache, making it accessible by other processes:
that may indeed be reason for flush_dcache_page() there - or a loop of
flush_dcache_page()s.  But I worry then that in the !VM_MAYSHARE case
you would be duplicating expensive flushes: perhaps they should be
restricted to the VM_MAYSHARE block.
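
A rough userspace model of that worry follows. It assumes, as the argument
above supposes but does not establish, that the architecture's clear
routine already flushes for the faulting mapping. The sketch_* names and
the flag value are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_VM_MAYSHARE 0x1ul	/* hypothetical value, not the kernel's */

struct sketch_vma {
	unsigned long vm_flags;
};

/* Counts flushes so the duplication is visible. */
struct sketch_flush_stats {
	unsigned int arch_clear_flushes;	/* done inside the arch clear routine (assumed) */
	unsigned int generic_flushes;		/* done by arch-independent fault code */
};

/*
 * Model of the no-page fault path. With only_if_shared == false the generic
 * flush is unconditional, as in the patch under discussion; with
 * only_if_shared == true it is restricted to the VM_MAYSHARE case.
 */
static void sketch_no_page(const struct sketch_vma *vma,
			   struct sketch_flush_stats *st,
			   bool only_if_shared)
{
	assert(vma != NULL && st != NULL);

	st->arch_clear_flushes++;	/* assumption: clearing already flushed */

	if (!only_if_shared || (vma->vm_flags & SKETCH_VM_MAYSHARE))
		st->generic_flushes++;
}
```

Under that assumption, a private mapping sees two flushes per fault with
the unconditional variant and one with the restricted variant, while a
VM_MAYSHARE mapping sees two in both.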

Hugh
