From: Palmer Dabbelt <palmer@dabbelt.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: alex@ghiti.fr, aou@eecs.berkeley.edu, benh@kernel.crashing.org,
	linux-mm@kvack.org, mpe@ellerman.id.au,
	Anup Patel <Anup.Patel@wdc.com>,
	linux-kernel@vger.kernel.org, Atish Patra <Atish.Patra@wdc.com>,
	paulus@samba.org, zong.li@sifive.com,
	Paul Walmsley <paul.walmsley@sifive.com>,
	linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone
Date: Wed, 22 Jul 2020 12:52:39 -0700 (PDT)
Message-ID: <mhng-820ebe55-b4a3-4ab3-b848-6d3551b43091@palmerdabbelt-glaptop1>
In-Reply-To: <CAK8P3a34sT2bQbkZUjaxaShzCkn+s35pXxS0UNhqGFu+t2hZYw@mail.gmail.com>

On Wed, 22 Jul 2020 02:43:50 PDT (-0700), Arnd Bergmann wrote:
> On Tue, Jul 21, 2020 at 9:06 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>
>> On Tue, 21 Jul 2020 11:36:10 PDT (-0700), alex@ghiti.fr wrote:
>> > Let's try to make progress here: I add linux-mm in CC to get feedback on
>> > this patch as it blocks sv48 support too.
>>
>> Sorry for being slow here.  I haven't replied because I hadn't really fleshed
>> out the design yet, but just so everyone's on the same page my problems with
>> this are:
>>
>> * We waste vmalloc space on 32-bit systems, where there isn't a lot of it.
>
> There is actually ongoing work to make 32-bit Arm kernels move
> vmlinux into the vmalloc space, as part of the move to avoid highmem.
>
> Overall, a 32-bit system would waste about 0.1% of its virtual address space
> by having the kernel be located in both the linear map and the vmalloc area.
> It's not zero, but not that bad either. With the typical split of 3072MB user,
> 768MB linear and 256MB vmalloc, it's also around 1.5% of the available
> vmalloc area (assuming a 4MB vmlinux in a typical 32-bit kernel), but the
> boundaries can be changed arbitrarily if needed.

OK, I guess maybe it's not so bad.  Our 32-bit defconfig is 10MiB, but I
wouldn't really put much weight behind that number as it's just a 64-bit
defconfig built for 32-bit.  We don't have any 32-bit hardware anyway, so if
this becomes an issue later I guess we can just deal with it then.
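
For reference, the percentages quoted above can be reproduced with a quick
calculation. This is only a sketch: the 3072/768/256 split and the 4MB image
size come from the quoted mail, the 10MiB figure is the defconfig size
mentioned above, and the helper name share_percent is made up for
illustration.

```python
MIB = 1024 * 1024

def share_percent(image_bytes, region_bytes):
    """Return the image size as a percentage of a region's size."""
    return 100.0 * image_bytes / region_bytes

VA_SPACE = 4096 * MIB  # full 32-bit virtual address space
VMALLOC = 256 * MIB    # vmalloc area in the 3072/768/256 split quoted above

for image_mib in (4, 10):
    image = image_mib * MIB
    print(f"{image_mib} MiB image: "
          f"{share_percent(image, VA_SPACE):.2f}% of the VA space, "
          f"{share_percent(image, VMALLOC):.2f}% of vmalloc")
```

With these assumptions, a 10MiB image takes roughly 3.9% of a 256MB vmalloc
area, against about 1.5% for the 4MB case.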

> The eventual goal is to have a split of 3840MB for either user or linear map
> plus 256MB for vmalloc, including the kernel. Switching between linear
> and user has a noticeable runtime overhead, but it relaxes both the limits
> for user memory and lowmem, and it provides a somewhat stronger
> address space isolation.

Ya, I think we decided not to do that, at least for now.  I guess the right
answer there will depend on what 32-bit systems look like, and since we don't
have any I'm inclined to just stick to the fast option.

> Another potential idea would be to completely randomize the physical
> addresses underneath the kernel by using a random permutation of the
> pages in the kernel image. This adds even more overhead (virt_to_phys
> may need to call vmalloc_to_page or similar) and may cause problems
> with DMA into kernel .data across page boundaries.
>
>> * Sort out how to maintain a linear map as the canonical hole moves around
>>   between the VA widths without adding a bunch of overhead to the virt2phys and
>>   friends.  This is probably going to be the trickiest part, but I think if we
>>   just change the page table code to essentially lie about VAs when an sv39
>>   system runs an sv48+sv39 kernel we could make it work -- there'd be some
>>   logical complexity involved, but it would remain fast.
>
> I assume you can't use the trick that x86 has where all kernel addresses
> are at the top of the 64-bit address space and user addresses are at the
> bottom, regardless of the size of the page tables?

They have the load in their mapping functions; as far as I can tell, that's
required to do this sort of thing.  We do as well to handle some of the
implicit boot stuff for now, but I was assuming that we'd want to get rid of
that for performance reasons.  That said, maybe it just doesn't matter?
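
To make the two points above concrete, here is a rough sketch of how the
start of the kernel half moves with the VA width, and of a linear map whose
offset is a value read at run time rather than a compile-time constant. It is
not kernel code: the widths 39 and 48 stand for sv39 and sv48, while the
names kernel_half_start, is_canonical and LinearMap, and the 0x80000000
physical base, are hypothetical and chosen only for illustration.

```python
MASK64 = (1 << 64) - 1

def kernel_half_start(va_bits):
    """Lowest address of the upper canonical half for a given VA width."""
    # Upper-half addresses have bits 63..(va_bits - 1) all set.
    return MASK64 & ~((1 << (va_bits - 1)) - 1)

def is_canonical(addr, va_bits):
    """True if bits 63..(va_bits - 1) of addr are all zero or all one."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1

class LinearMap:
    """Linear map whose VA/PA offset is stored and read on every call."""

    def __init__(self, page_offset, phys_base):
        # The attribute read below stands in for the "load" in the
        # mapping functions; a constant offset would avoid it.
        self.va_pa_offset = page_offset - phys_base

    def virt_to_phys(self, va):
        return (va - self.va_pa_offset) & MASK64

    def phys_to_virt(self, pa):
        return (pa + self.va_pa_offset) & MASK64

sv39_start = kernel_half_start(39)
sv48_start = kernel_half_start(48)
print(hex(sv39_start), hex(sv48_start))

# The sv48 kernel-half base falls inside the sv39 canonical hole.
print(is_canonical(sv48_start, 39))

# Hypothetical physical base, picked at "boot" for the detected width.
lm = LinearMap(page_offset=sv39_start, phys_base=0x80000000)
print(hex(lm.virt_to_phys(sv39_start + 0x1000)))
```

The base of the sv48 kernel half is not a valid sv39 address, which is why a
single kernel covering both widths has to either pick its linear-map base at
boot, and pay for reading the offset, or hide the difference in the page
table code.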
