From: Mark Rutland <mark.rutland@arm.com>
To: Leo Yan <leo.yan@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, ard.biesheuvel@linaro.org
Subject: Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
Date: Tue, 17 Oct 2017 10:29:14 +0100
Message-ID: <20171017092914.op4hlzaaqfrpoizm@lakrids.cambridge.arm.com>
In-Reply-To: <20171017003054.GB19504@leoy-ThinkPad-T440>

On Tue, Oct 17, 2017 at 08:30:54AM +0800, Leo Yan wrote:
> On Mon, Oct 16, 2017 at 03:35:46PM +0100, Robin Murphy wrote:
> > On 16/10/17 15:26, Mark Rutland wrote:
> > > On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote:
> > >> On 16/10/17 14:48, Mark Rutland wrote:
> > >>> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote:
> > >>>> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote:
> > >>>>> On 10/10/17 16:45, Mark Rutland wrote:
> > >>>>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
> > >>>>>>> I run the mainline kernel on a Hikey620 board and find it easy to
> > >>>>>>> trigger a panic; the log is reported below. I bisected the kernel
> > >>>>>>> and narrowed it down to commit e3067861ba66 ("arm64: add basic
> > >>>>>>> VMAP_STACK support"), which introduces this issue.
> > >>>>>>>
> > >>>>>>> When I remove 'select HAVE_ARCH_VMAP_STACK' from
> > >>>>>>> arch/arm64/Kconfig, the panic goes away. Could you check this
> > >>>>>>> and share any insight into the issue?

> > >>>> I enabled these debugging configs but could not get a clue from them;
> > >>>> however, I found by chance that this issue is quite likely related to
> > >>>> the CA53 errata, especially ERRATA_A53_855873. So I switched to ARM-TF
> > >>>> mainline code with the errata fixes, and the issue goes away.

> > >>> Just to confirm, with the updated firmware you no longer see the issue?
> > >>>
> > >>> I can't immediately see how that would be related.

> > I guess the vmap addresses might tickle the "same L2 set" condition
> > differently to when both stack and DMA buffer are linear map addresses.
> 
> A bit more info for this.
> 
> I can reproduce this memory abort panic, and the panic locations are not
> consistent; usually they involve a kmalloc address. Do you think
> "VMAP_STACK" introduces many more cache clean operations? If so, it
> might land in the same *set* as some other memory access (like
> kmalloc operations) and then trigger a data abort.

VMAP_STACK doesn't introduce any explicit cache maintenance, but it's
possible that it causes more natural evictions. 

That might explain why it triggers the issue.

> Hikey's CA53 CPUs are an r3 version, so luckily the ERRATA 855873
> workaround in ARM-TF can be applied directly.
>
> BTW, in case I misled you, note that another two ERRATAs are
> applied in ARM-TFv1.4 for Hikey:
> 
> ERRATA_A53_836870               :=      1
> ERRATA_A53_843419               :=      1

Thanks for the extra info!

AFAICT, erratum 836870 results in livelock rather than memory
corruption, so I think we can ignore that.

I'm a little worried by erratum 843419. The VMAP_STACK patches changed
{adr,ldr}_this_cpu (and some users thereof), and it's possible we're
managing to tickle that issue.

If you still have an affected kernel, could you dump the output of:

$ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'

... that would show us if there are any affected sequences.

From a quick scan of my own vmlinux build from commit e3067861ba66, I
didn't see any, but it's possible this depends on the config used.
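
For anyone who would rather post-process a saved disassembly than use the
grep above, the same check can be sketched in Python. This is only an
illustration under assumptions: the function name find_adrp_candidates and
its context parameter are made up here, and the input is assumed to be the
text of `objdump -d vmlinux`. Like the grep, it only flags ADRP instructions
in the last two words of a 4K page (offset 0xff8 or 0xffc); it does not check
the rest of the erratum 843419 conditions, so every hit still needs to be
inspected by hand.

```python
import re

# One instruction line of `objdump -d` output: address, 32-bit encoding, mnemonic.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(\S+)")


def find_adrp_candidates(disassembly, context=3):
    """Return groups of lines: each ADRP at page offset 0xff8/0xffc plus the
    next `context` instruction lines (hypothetical helper, not a kernel tool).
    """
    insns = []
    for line in disassembly.splitlines():
        m = INSN_RE.match(line)
        if m:
            # Keep (address, mnemonic, original text); labels and blank
            # lines do not match the pattern and are skipped.
            insns.append((int(m.group(1), 16), m.group(3), line.strip()))

    hits = []
    for i, (addr, mnemonic, _) in enumerate(insns):
        # Same filter as the grep: address ends in ff8 or ffc, mnemonic is adrp.
        if mnemonic == "adrp" and (addr & 0xFFF) in (0xFF8, 0xFFC):
            # The following lines are simply the next parsed instructions;
            # they may belong to a different function.
            hits.append([text for _, _, text in insns[i:i + context + 1]])
    return hits
```

One way to use it would be to save the disassembly to a file, read the file
into a string, and print each returned group. An empty result only means that
no ADRP sits at those two offsets in that particular build.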

Thanks,
Mark.
