From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "npiggin@gmail.com" <npiggin@gmail.com>,
	"songyuanzheng@huawei.com" <songyuanzheng@huawei.com>
Subject: Re: [PATCH v4 1/2] Revert "powerpc: Set max_mapnr correctly"
Date: Fri, 01 Apr 2022 22:23:04 +1100	[thread overview]
Message-ID: <87ee2hf3on.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <34355f36-1122-9c22-e717-73a957a31a3e@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 28/03/2022 at 12:37, Michael Ellerman wrote:
>> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>>> Hi maintainers,
>>>
>>> I saw the patches have been reviewed[1]; could they be merged?
>>
>> Maybe I'm just misreading the change log, but it seems wrong that we
>> need to add extra checks. pfn_valid() shouldn't return true for vmalloc
>> addresses in the first place, shouldn't we fix that instead? Who knows
>> what else might be broken because of that.
>
> pfn_valid() doesn't take an address but a PFN

Yeah sorry that was unclear wording on my part.

What I mean is that pfn_valid(virt_to_pfn(some_vmalloc_addr)) should be
false, because virt_to_pfn(vmalloc_addr) should fail.

The right way to convert a vmalloc address to a PFN is with
vmalloc_to_pfn(), which walks the page tables to find the actual PFN
backing that vmalloc addr.
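
i.e. something like this (illustrative only):

  /* Correct: walk the page tables backing the vmalloc mapping. */
  unsigned long pfn = vmalloc_to_pfn(vmalloc_addr);

  /* Wrong for vmalloc addresses: virt_to_pfn() just does arithmetic
   * on the address, assuming it's in the linear map. */
  unsigned long bad_pfn = virt_to_pfn(vmalloc_addr);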

> If you have 1Gbyte of memory you have 256k PFNs.
>
> In a generic config the kernel will map 768 Mbytes of memory (from
> 0xc0000000 to 0xf0000000) and will use 0xf0000000-0xffffffff for
> everything else including vmalloc.
>
> If you take a page above that 768 Mbytes limit and try to linearly
> convert its PFN to a VA, you'll hit vmalloc space. Yet that PFN is
> valid.

That's true, but it's just some random page in vmalloc space, there's no
guarantee that it's the same page as the PFN you started with.

Note it's not true on 64-bit Book3S. There if you take a valid PFN (ie.
backed by RAM) and convert it to a virtual address (with __va()), you
will never get a vmalloc address.
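
That's because __va() on 64-bit is (roughly, ignoring the
DEBUG_VIRTUAL checks) just an OR with PAGE_OFFSET:

  #define __va(x)	((void *)((unsigned long)(x) | PAGE_OFFSET))

and any RAM-backed physical address ORed with 0xc000000000000000 lands
in the linear map, below where the vmalloc region starts.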

> So the check really needs to be done in virt_addr_valid().

I don't think it has to, but with the way our virt_to_pfn()/__pa() works
I guess for now it's the easiest solution.
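
ie. bounds-check the address against the linear map before doing the
linear PFN conversion, along these lines (a sketch, the exact form in
the v4 patch may differ):

  #define virt_addr_valid(vaddr)	({				\
	unsigned long _addr = (unsigned long)(vaddr);			\
	_addr >= PAGE_OFFSET &&						\
	_addr < (unsigned long)high_memory &&				\
	pfn_valid(virt_to_pfn(_addr));					\
  })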

> There is another thing however that would be worth fixing (in another
> patch): that's the virt_to_pfn() in PPC64:
>
> #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
>
> #define __pa(x)								\
> ({									\
> 	VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET);		\
> 	(unsigned long)(x) & 0x0fffffffffffffffUL;			\
> })
>
>
> So 0xc000000000000000 and 0xd000000000000000 have the same PFN. That's
> wrong.

Yes it was wrong. But we don't use 0xd000000000000000 anymore, so it
shouldn't be an issue in practice.
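
Concretely, with the masking __pa() above, both collapse to physical 0:

  0xc000000000000000 & 0x0fffffffffffffffUL == 0x0000000000000000
  0xd000000000000000 & 0x0fffffffffffffffUL == 0x0000000000000000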

See 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range").

I guess it is still a problem for 64-bit Book3E, because they use 0xc
and 0x8.

I looked at fixing __pa()/__va() to use addition/subtraction a few
years back, but from
memory it still didn't work and/or generated bad code. There's a comment
about it working around a GCC miscompile.

The other thing that makes me reluctant to change it is that I worry we
have places that inadvertently use __pa() on addresses that are already
physical addresses. If we changed __pa() to use subtraction that would
break, ie. we'd end up with a negative address.
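
For example, hypothetically:

  /* If __pa() were subtraction: */
  #define __pa(x)	((unsigned long)(x) - PAGE_OFFSET)

  /* then a caller passing an already-physical address, say 0x1000,
   * gets 0x1000 - 0xc000000000000000 = 0x4000000000001000 (underflow),
   * whereas the masking version returns 0x1000 unchanged, which is
   * why such callers appear to work today. */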

See eg. a6e2c226c3d5 ("powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL")

So to fix it we'd have to 1) verify that the compiler bug is no longer
an issue and 2) audit uses of __pa()/__va().

cheers

Thread overview: 13 messages

2022-02-16 12:11 [PATCH v4 1/2] Revert "powerpc: Set max_mapnr correctly" Kefeng Wang
2022-02-16 12:11 ` [PATCH v4 2/2] powerpc: Fix virt_addr_valid() check Kefeng Wang
2022-03-09 16:01   ` [v4,2/2] " Christophe Leroy
2022-03-09 16:00 ` [v4,1/2] Revert "powerpc: Set max_mapnr correctly" Christophe Leroy
2022-03-26  7:55 ` [PATCH v4 1/2] " Kefeng Wang
2022-03-28 10:37   ` Michael Ellerman
2022-03-28 10:59     ` Christophe Leroy
2022-04-01 11:23       ` Michael Ellerman [this message]
2022-04-01 12:07         ` Christophe Leroy
2022-03-28 14:12   ` Christophe Leroy
2022-03-29 11:32     ` Kefeng Wang
2022-04-04 12:31       ` Michael Ellerman
2022-04-06  2:21         ` Kefeng Wang
