Re: PMD update corruption (sync question)

From: Catalin Marinas <catalin.marinas@arm.com>
To: Jon Masters <jcm@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,
	linux@arm.linux.org.uk, Steve Capper <steve.capper@linaro.org>,
	linux-mm@kvack.org, mark.rutland@arm.com,
	anders.roxell@linaro.org, peterz@infradead.org,
	gary.robertson@linaro.org, hughd@google.com, will.deacon@arm.com,
	mgorman@suse.de, dann.frazier@canonical.com,
	akpm@linux-foundation.org, christoffer.dall@linaro.org
Subject: Re: PMD update corruption (sync question)
Date: Mon, 2 Mar 2015 10:50:12 +0000
Message-ID: <20150302105011.GD22541@e104818-lin.cambridge.arm.com>
In-Reply-To: <938476184.27970130.1425275915893.JavaMail.zimbra@zmail15.collab.prod.int.phx2.redhat.com>

On Mon, Mar 02, 2015 at 12:58:36AM -0500, Jon Masters wrote:
> I've pulled a couple of all nighters reproducing this hard to trigger
> issue and got some data. It looks like the high half of the (note always
> userspace) PMD is all zeros or all ones, which makes me wonder if the
> logic in update_mmu_cache might be missing something on AArch64.

That's worrying but I can tell you offline why ;).

Anyway, naturally aligned 64-bit writes are single-copy atomic on ARMv8,
so you shouldn't see half updates. To make sure the compiler does not
generate something weird (such as splitting the store in two), change
set_(pte|pmd|pud) to use inline assembly with a single 64-bit STR.
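
A minimal sketch of what such a setter could look like, in case it helps.
The names (pteval_t, pte_t, set_pte_single_store) are stand-ins made up
for illustration, not the kernel's definitions, and the non-AArch64
branch exists only so the fragment builds elsewhere:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

_Static_assert(sizeof(pte_t) == 8, "entry must be exactly 64 bits");

/* Write a page table entry with exactly one 64-bit store. */
static inline void set_pte_single_store(pte_t *ptep, pte_t pte)
{
#if defined(__aarch64__)
	/* A single 64-bit STR, so the compiler cannot split the write. */
	__asm__ volatile("str %x1, %0"
			 : "=Q" (ptep->pte)
			 : "r" (pte.pte)
			 : "memory");
#else
	/* Assumption: on other targets a volatile store is good enough here. */
	*(volatile pteval_t *)&ptep->pte = pte.pte;
#endif
}
```

If the corruption still shows up with a setter like this, compiler
store-splitting can be ruled out as the cause.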

One question - is the PMD a table or a block? You mentioned set_pte_at
at some point, which leads me to think it's a (transparent) huge page,
hence block mapping.

> When a kernel is built with 64K pages and 2 levels the PMD is
> effectively updated using set_pte_at, which explicitly won't perform a
> DSB if the address is userspace (it expects this to happen later, in
> update_mmu_cache as an example).
> 
> Can anyone think of an obvious reason why we might not be properly
> flushing the changes prior to them being consumed by a hardware walker?

Even if you don't have that barrier, the worst that can happen is that
you get another trap back in the kernel (from user; translation fault)
but the page table read by the kernel is valid and normally the
instruction is simply restarted.
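
For reference, the policy being discussed (store the entry, but only
issue a barrier for valid kernel mappings) can be sketched roughly as
below. This is a simplification under stated assumptions, not the actual
kernel code: the bit positions, the choice of "dsb ishst; isb", and the
boolean return value are all assumptions made to keep the example
self-contained and observable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define PTE_VALID ((uint64_t)1 << 0)
#define PTE_USER  ((uint64_t)1 << 6)

/* True for a valid entry that is not a user mapping. */
static inline bool pte_valid_not_user(uint64_t pte)
{
	return (pte & (PTE_VALID | PTE_USER)) == PTE_VALID;
}

/* Make the preceding store visible to the page table walker. */
static inline void publish_barrier(void)
{
#if defined(__aarch64__)
	__asm__ volatile("dsb ishst\n\tisb" : : : "memory");
#else
	/* Assumption: a full fence stands in for DSB on other targets. */
	__sync_synchronize();
#endif
}

/*
 * Store the entry; barrier only for valid kernel mappings. User entries
 * rely on a later barrier, and at worst cost one extra translation fault.
 * Returns true when a barrier was issued (hypothetical, for testing).
 */
static inline bool set_pte_sketch(uint64_t *ptep, uint64_t pte)
{
	*(volatile uint64_t *)ptep = pte;
	if (pte_valid_not_user(pte)) {
		publish_barrier();
		return true;
	}
	return false;
}
```

Note that skipping the barrier in the user case only delays when the
walker sees the new entry; it does not change the value that is stored.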

> Test kernels running with an explicit DSB in all PTE update cases now
> running overnight. Just in case.

It could be hiding some other problems.

-- 
Catalin


