linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Stabellini <sstabellini@kernel.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>,
	xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
	"Alex Bennée" <alex.bennee@linaro.org>,
	takahiro.akashi@linaro.org, jgross@suse.com,
	boris.ostrovsky@oracle.com
Subject: Re: [PATCH] arm/arm64: xen: Fix to convert percpu address to gfn correctly
Date: Thu, 8 Oct 2020 10:30:40 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.21.2010081030180.23978@sstabellini-ThinkPad-T480s> (raw)
In-Reply-To: <20201008172806.1591ebb538946c5ee93d372a@kernel.org>

On Thu, 8 Oct 2020, Masami Hiramatsu wrote:
> On Tue, 6 Oct 2020 10:56:52 -0700 (PDT)
> Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> > On Tue, 6 Oct 2020, Masami Hiramatsu wrote:
> > > On Mon, 5 Oct 2020 18:13:22 -0700 (PDT)
> > > Stefano Stabellini <sstabellini@kernel.org> wrote:
> > > 
> > > > On Mon, 5 Oct 2020, Julien Grall wrote:
> > > > > Hi Masami,
> > > > > 
> > > > > On 05/10/2020 14:39, Masami Hiramatsu wrote:
> > > > > > Use per_cpu_ptr_to_phys() instead of virt_to_phys() for per-cpu
> > > > > > address conversion.
> > > > > > 
> > > > > > In xen_starting_cpu(), per-cpu xen_vcpu_info address is converted
> > > > > > to gfn by virt_to_gfn() macro. However, since the virt_to_gfn(v)
> > > > > > assumes the given virtual address is in contiguous kernel memory
> > > > > > area, it can not convert the per-cpu memory if it is allocated on
> > > > > > vmalloc area (depends on CONFIG_SMP).
> > > > > 
> > > > > Are you sure about this? I have a .config with CONFIG_SMP=y where the per-cpu
> > > > > region for CPU0 is allocated outside of vmalloc area.
> > > > > 
> > > > > However, I was able to trigger the bug as soon as CONFIG_NUMA_BALANCING was
> > > > > enabled.
> > > > 
> > > > I cannot reproduce the issue with defconfig, but I can with Masami's
> > > > kconfig.
> > > > 
> > > > If I disable just CONFIG_NUMA_BALANCING from Masami's kconfig, the
> > > > problem still appears.
> > > > 
> > > > If I disable CONFIG_NUMA from Masami's kconfig, it works, which is
> > > > strange because CONFIG_NUMA is enabled in defconfig, and defconfig
> > > > works.
> > > 
> > > Hmm, strange, because when I disabled CONFIG_NUMA_BALANCING, the issue
> > > disappeared.
> > > 
> > > --- config-5.9.0-rc4+   2020-10-06 11:36:20.620107129 +0900
> > > +++ config-5.9.0-rc4+.buggy     2020-10-05 21:04:40.369936461 +0900
> > > @@ -131,7 +131,8 @@
> > >  CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> > >  CONFIG_CC_HAS_INT128=y
> > >  CONFIG_ARCH_SUPPORTS_INT128=y
> > > -# CONFIG_NUMA_BALANCING is not set
> > > +CONFIG_NUMA_BALANCING=y
> > > +CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
> > >  CONFIG_CGROUPS=y
> > >  CONFIG_PAGE_COUNTER=y
> > >  CONFIG_MEMCG=y
> > > 
> > > So buggy config just enabled NUMA_BALANCING (and default enabled)
> > 
> > Yeah but both NUMA and NUMA_BALANCING are enabled in defconfig which
> > works fine...
> 
> Hmm, I found that the xen_vcpu_info was allocated on km if the Dom0 has
> enough memory. On my environment, if Xen passed 2GB of RAM to Dom0, it
> was allocated on kernel linear mapped address, but with 1GB of RAM, it
> was on vmalloc area.
> As far as I can see, it seems that the percpu allocates memory from
> 2nd chunk if the default memory size is small.
> 
> I've built a kernel with patch [1] and boot the same kernel up with
> different dom0_mem option with "trace_event=percpu:*" kernel cmdline.
> Then got following logs.
> 
> Boot with 4GB:
>           <idle>-0     [000] ....     0.543208: percpu_create_chunk: base_addr=000000005d5ad71c
>  [...]
>          systemd-1     [000] ....     0.568931: percpu_alloc_percpu: reserved=0 is_atomic=0 size=48 align=8 base_addr=00000000fa92a086 off=32672 ptr=000000008da0b73d
>          systemd-1     [000] ....     0.568938: xen_guest_init: Xen: alloc xen_vcpu_info ffff800011003fa0 id=000000008da0b73d
>          systemd-1     [000] ....     0.586635: xen_starting_cpu: Xen: xen_vcpu_info ffff800011003fa0, vcpup ffff00092f4ebfa0 per_cpu_offset[0] ffff80091e4e8000
> 
> (NOTE: base_addr and ptr are encoded to the ids, not actual address
>  because of "%p" printk format)
> 
> In this log, we can see the xen_vcpu_info is allocated NOT on the
> new chunk (this is the 2nd chunk). As you can see, the vcpup is in
> the kernel linear address in this case, because it came from the
> 1st kernel embedded chunk.
> 
> 
> Boot with 1GB
>           <idle>-0     [000] ....     0.516221: percpu_create_chunk: base_addr=000000008456b989
> [...]
>          systemd-1     [000] ....     0.541982: percpu_alloc_percpu: reserved=0 is_atomic=0 size=48 align=8 base_addr=000000008456b989 off=17920 ptr=00000000c247612d
>          systemd-1     [000] ....     0.541989: xen_guest_init: Xen: alloc xen_vcpu_info 7dff951f0600 id=00000000c247612d
>          systemd-1     [000] ....     0.559690: xen_starting_cpu: Xen: xen_vcpu_info 7dff951f0600, vcpup fffffdffbfcdc600 per_cpu_offset[0] ffff80002aaec000
> 
> On the other hand, when we boot the dom0 with 1GB memory, the xen_vcpu_info
> is allocated on the new chunk (the id of base_addr is same).
> Since the data of new chunk is allocated on vmalloc area, vcpup points
> the vmalloc address.
> 
> So, the bug seems not to depend on the kconfig, but depends on where
> the percpu memory is allocated from.
> 
> > [...]
> > 
> > > > The fix is fine for me. I tested it and it works. We need to remove the
> > > > "Fixes:" line from the commit message. Ideally, replacing it with a
> > > > reference to what is the source of the problem.
> > > 
> > > OK, as I said, it seems commit 9a9ab3cc00dc ("xen/arm: SMP support") has
> > > introduced the per-cpu code. So note it instead of Fixes tag.
> > 
> > ...and commit 9a9ab3cc00dc was already present in 5.8 which also works
> > fine with your kconfig. Something else changed in 5.9 causing this
> > breakage as a side effect. Commit 9a9ab3cc00dc is there since 2013, I
> > think it is OK -- this patch is fixing something else.
> 
> I think the commit 9a9ab3cc00dc theoletically wrong because it uses
> __pa() on percpu address. But that is not guaranteed according to the
> comment on per_cpu_ptr_to_phys() as below.
> 
> ----
>  * percpu allocator has special setup for the first chunk, which currently
>  * supports either embedding in linear address space or vmalloc mapping,
>  * and, from the second one, the backing allocator (currently either vm or
>  * km) provides translation.
>  *
>  * The addr can be translated simply without checking if it falls into the
>  * first chunk. But the current code reflects better how percpu allocator
>  * actually works, and the verification can discover both bugs in percpu
>  * allocator itself and per_cpu_ptr_to_phys() callers. So we keep current
>  * code.
> ----
> 
> So we must use per_cpu_ptr_to_phys() instead of __pa() macro for percpu
> address. That's why I pointed this will fix the commit 9a9ab3cc00dc.

Thank you for the analysis. We are going to try to get the patch
upstream as soon as we can.

      reply	other threads:[~2020-10-08 17:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-05 13:39 [PATCH] arm/arm64: xen: Fix to convert percpu address to gfn correctly Masami Hiramatsu
2020-10-05 18:18 ` Julien Grall
2020-10-06  0:35   ` Masami Hiramatsu
2020-10-06  1:13   ` Stefano Stabellini
2020-10-06  2:40     ` Masami Hiramatsu
2020-10-06  4:11       ` Masami Hiramatsu
2020-10-06 17:56       ` Stefano Stabellini
2020-10-07  5:50         ` Masami Hiramatsu
2020-10-08  8:28         ` Masami Hiramatsu
2020-10-08 17:30           ` Stefano Stabellini [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2010081030180.23978@sstabellini-ThinkPad-T480s \
    --to=sstabellini@kernel.org \
    --cc=alex.bennee@linaro.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=jgross@suse.com \
    --cc=julien@xen.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=takahiro.akashi@linaro.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).