linux-kernel.vger.kernel.org archive mirror
From: Andrea Arcangeli <aarcange@redhat.com>
To: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Vineet Gupta <vgupta@synopsys.com>,
	Russell King <linux@armlinux.org.uk>,
	Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Ralf Baechle <ralf@linux-mips.org>,
	"David S. Miller" <davem@davemloft.net>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	linux-arch@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm, thp: Do not loose dirty bit in __split_huge_pmd_locked()
Date: Wed, 14 Jun 2017 17:31:31 +0200
Message-ID: <20170614153131.GC5847@redhat.com>
In-Reply-To: <20170614161857.69d54338@mschwideX1>

Hello,

On Wed, Jun 14, 2017 at 04:18:57PM +0200, Martin Schwidefsky wrote:
> Could we change pmdp_invalidate to make it return the old pmd entry?

That to me seems the simplest fix to avoid losing the dirty bit.

I earlier suggested replacing pmdp_invalidate with something like
old_pmd = pmdp_establish(pmd_mknotpresent(pmd)) (the tlb flush could
then be conditional on the old pmd being present). Making
pmdp_invalidate return the old pmd entry would be mostly equivalent to
that.
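
As a rough illustration of that idea (a userspace model, not kernel
code): the sketch below treats a pmd as a 64-bit word and uses a C11
atomic exchange to stand in for the xchg. The bit positions, the
model_* names and the flush counter are all hypothetical and exist
only for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy pmd; not a real architecture layout. */
#define MODEL_PMD_PRESENT (UINT64_C(1) << 0)
#define MODEL_PMD_DIRTY   (UINT64_C(1) << 6)

typedef _Atomic uint64_t model_pmd_t;

/* Counts how often the toy "TLB flush" ran. */
static unsigned int model_flush_count;

static void model_flush_tlb(void)
{
	model_flush_count++;
}

static uint64_t model_pmd_mknotpresent(uint64_t pmd)
{
	return pmd & ~MODEL_PMD_PRESENT;
}

/* Atomically install new_pmd and return whatever was there before. */
static uint64_t model_pmdp_establish(model_pmd_t *pmdp, uint64_t new_pmd)
{
	return atomic_exchange(pmdp, new_pmd);
}

/*
 * Invalidate the entry and flush only if the old entry was present.
 * The returned old value still carries the dirty bit for the caller.
 */
static uint64_t model_invalidate_and_flush(model_pmd_t *pmdp)
{
	uint64_t cur = atomic_load(pmdp);
	uint64_t old = model_pmdp_establish(pmdp, model_pmd_mknotpresent(cur));

	if (old & MODEL_PMD_PRESENT)
		model_flush_tlb();
	return old;
}
```

The caller would take the dirty bit from the returned value rather
than from an earlier read of the entry.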

The advantage of not changing pmdp_invalidate is that we could skip an
xchg, which is more costly, in __split_huge_pmd_locked and
madvise_free_huge_pmd, so perhaps there's a point in keeping a variant of
pmdp_invalidate that doesn't use xchg internally (and in turn can't
return the old pmd value atomically).

If we don't want new messy names like pmdp_establish we could have a
__pmdp_invalidate that returns void, and pmdp_invalidate that returns
the old pmd and uses xchg (and it'd also be backwards compatible as
far as the callers are concerned). So those places that don't need the
old value returned and can skip the xchg, could simply
s/pmdp_invalidate/__pmdp_invalidate/ to optimize.
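
The two-variant shape could look roughly like this in a userspace
model. This is only a sketch under assumptions: the pmd is a plain
64-bit word, the bit positions are invented, and the names
model_pmdp_invalidate_noxchg (standing in for __pmdp_invalidate) and
model_pmdp_invalidate are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy pmd; not a real architecture layout. */
#define MODEL_PMD_PRESENT (UINT64_C(1) << 0)
#define MODEL_PMD_DIRTY   (UINT64_C(1) << 6)

typedef _Atomic uint64_t model_pmd_t;

/*
 * Stand-in for __pmdp_invalidate: separate load and store, no xchg,
 * and nothing returned. Bits set by someone else between the load and
 * the store are not reported to the caller.
 */
static void model_pmdp_invalidate_noxchg(model_pmd_t *pmdp)
{
	uint64_t pmd = atomic_load_explicit(pmdp, memory_order_relaxed);

	atomic_store_explicit(pmdp, pmd & ~MODEL_PMD_PRESENT,
			      memory_order_relaxed);
}

/*
 * Stand-in for pmdp_invalidate: same effect on the entry, but the
 * write is an exchange, so the exact old value comes back.
 */
static uint64_t model_pmdp_invalidate(model_pmd_t *pmdp)
{
	uint64_t pmd = atomic_load_explicit(pmdp, memory_order_relaxed);

	return atomic_exchange(pmdp, pmd & ~MODEL_PMD_PRESENT);
}
```

Callers that ignore the return value behave the same with either
variant, which is what makes the rename a pure optimization for them.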

One way or another, for change_huge_pmd I think we need an xchg like in
native_pmdp_get_and_clear, but one that sets the pmd to
pmd_mknotpresent(pmd) instead of zero. And this whole issue
originates because change_huge_pmd(prot_numa = 1) and
madvise_free_huge_pmd both run concurrently with the mmap_sem held for
reading.
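
To make the lost-update window concrete, here is a deterministic,
single-threaded model (not kernel code, and not a real concurrency
test). A callback plays the role of whatever sets the dirty bit
concurrently; the bit positions and every model_* name are assumptions
made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy pmd; not a real architecture layout. */
#define MODEL_PMD_PRESENT (UINT64_C(1) << 0)
#define MODEL_PMD_DIRTY   (UINT64_C(1) << 6)

/* Stands in for a concurrent updater running inside the race window. */
typedef void (*model_racer_fn)(uint64_t *pmdp);

static void model_set_dirty(uint64_t *pmdp)
{
	*pmdp |= MODEL_PMD_DIRTY;
}

/*
 * Read, then write back: the racer runs between the two steps, so the
 * dirty bit it sets is overwritten and never seen by the caller.
 */
static uint64_t model_read_then_write(uint64_t *pmdp, model_racer_fn racer)
{
	uint64_t old = *pmdp;

	if (racer != NULL)
		racer(pmdp);
	*pmdp = old & ~MODEL_PMD_PRESENT;
	return old;
}

/*
 * Exchange-style: the last two statements model one indivisible xchg,
 * so the racer can only run before it and its update is returned.
 */
static uint64_t model_exchange(uint64_t *pmdp, model_racer_fn racer)
{
	uint64_t new_pmd = *pmdp & ~MODEL_PMD_PRESENT;
	uint64_t old;

	if (racer != NULL)
		racer(pmdp);
	old = *pmdp;
	*pmdp = new_pmd;
	return old;
}
```

Starting from a present, clean entry, the first helper returns a value
without the dirty bit and leaves none in the entry, while the second
returns a value with the dirty bit set.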

In the earlier email on this topic, I also mentioned the concern of
the _notify mmu notifier invalidate that got dropped silently with the
s/pmdp_huge_get_and_clear_notify/pmdp_invalidate/ conversion but I
later noticed the mmu notifier invalidate is already covered by the
caller. So change_huge_pmd should have called pmdp_huge_get_and_clear
in the first place and the _notify suffix in the old code was a
mistake as far as I can tell. So we can focus only on the dirty bit
retention issue.

Thanks,
Andrea
