All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yangyu Chen <cyy@cyyself.name>
To: Charlie Jenkins <charlie@rivosinc.com>
Cc: alexghiti@rivosinc.com, anup@brainfault.org,
	aou@eecs.berkeley.edu, conor@kernel.org, jrtc27@jrtc27.com,
	konstantin@linuxfoundation.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-mm@kvack.org, linux-riscv@lists.infradead.org,
	mick@ics.forth.gr, palmer@rivosinc.com, paul.walmsley@sifive.com,
	rdunlap@infradead.org
Subject: Re: [PATCH v10 0/4] RISC-V: mm: Make SV48 the default address space
Date: Sat, 20 Jan 2024 14:13:14 +0800	[thread overview]
Message-ID: <tencent_FE461EBE274178ED6047005CCF98D710B807@qq.com> (raw)
In-Reply-To: <ZasjJ3HPUVuxr2oG@ghost>

Thanks for your reply.

On 1/20/24 09:34, Charlie Jenkins wrote:
> On Sun, Jan 14, 2024 at 01:26:57AM +0800, Yangyu Chen wrote:
>> Hi, Charlie
>>
>> Although this patchset has been merged I still have some questions about
>> this patchset. Because it breaks regular mmap if address >= 38 bits on
>> sv48 / sv57 capable systems like qemu. For example, If a userspace program
>> wants to mmap an anonymous page to addr=(1<<45) on an sv48 capable system,
>> it will fail and kernel will mmaped to another sv39 address since it does
> 
> Thank you for raising this concern. To make sure I am understanding
> correctly, you are passing a hint address of (1<<45) and expecting mmap
> to return 1<<45 and if it returns a different address you are describing
> mmap as failing? If you want an address that is in the sv48 space you
> can pass in an address that is greater than 1<<47.
> 
>> not meet the requirement to use sv48 as you wrote:
>>
>>> 	else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
>>> 		mmap_end = VA_USER_SV48;			\
>>> 	else							\
>>> 		mmap_end = VA_USER_SV39;			\
>>
>> Then, How can a userspace program create a mmap with a hint if the address
>>> = (1<<38) after your patch without MAP_FIXED? The only way to do this is
>> to pass a hint >= (1<<47) on mmap syscall then kernel will return a random
>> address in sv48 address space but the hint address gets lost. I think this
> 
> In order to force mmap to return the address provided you must use
> MAP_FIXED. Otherwise, the address is a "hint" and has no guarantees. The
> hint address on riscv is used to mean "don't give me an address that
> uses more bits than this". This behavior is not unique to riscv, arm64
> and powerpc use a similar scheme. In arch/arm64/include/asm/processor.h
> there is the following code:
> 
> #define arch_get_mmap_base(addr, base) ((addr > DEFAULT_MAP_WINDOW) ? \
> 					base + TASK_SIZE - DEFAULT_MAP_WINDOW :\
> 					base)
> 
> arm64/powerpc are only concerned with a single boundary so the code is simpler.
> 

As you say, this code in arm64/powerpc will not meet the issue I 
address. For example, If the addr here is (1<<50) on arm64, the 
arch_get_mmap_base will return base+TASK_SIZE-DEFAULT_MAP_WINDOW which 
is (1<<vabits_actual). And this behavior on arm64/powerpc/x86 does not 
break anything since we will use a larger address space if the hint 
address is specified on the address > DEFAULT_MAP_WINDOW. The 
corresponding behavior on RISC-V should be if the hint address > BIT(47) 
then use Sv57 address space and use Sv48 when the hint address > BIT(38) 
if we want Sv39 by default.

However, your patch needs the address >= BIT(47) rather than BIT(38) to 
use Sv48 and address >= BIT(56) to use Sv57, thus breaking existing 
userspace software to create mapping on the hint address without 
MAP_FIXED set.

>> violate the principle of mmap syscall as kernel should take the hint and
>> attempt to create the mapping there.
> 
> Although the man page for mmap does say "on Linux, the kernel will pick
> a nearby page boundary" it is still a hint address so there is no strict
> requirement (and the precedent has already been set by arm64/powerpc).
> 

Yeah. There is no strict requirement. But currently x86/arm64/powerpc 
works in this situation well. The hint address on these ISAs is not used 
as the upper bound to allocating the address. However, on RISC-V, you 
treat this as the upper bound.

>>
>> I don't think patching in this way is right. However, if we only revert
>> this patch, some programs relying on mmap to return address with effective
>> bits <= 48 will still be an issue and it might expand to other ISAs if
>> they implement larger virtual address space like RISC-V sv57. A better way
>> to solve this might be adding a MAP_48BIT flag to mmap like MAP_32BIT has
>> been introduced for decades.
>>
>> Thanks,
>> Yangyu Chen
>>
> 
> - Charlie
> 


WARNING: multiple messages have this Message-ID (diff)
From: Yangyu Chen <cyy@cyyself.name>
To: Charlie Jenkins <charlie@rivosinc.com>
Cc: alexghiti@rivosinc.com, anup@brainfault.org,
	aou@eecs.berkeley.edu, conor@kernel.org, jrtc27@jrtc27.com,
	konstantin@linuxfoundation.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-mm@kvack.org, linux-riscv@lists.infradead.org,
	mick@ics.forth.gr, palmer@rivosinc.com, paul.walmsley@sifive.com,
	rdunlap@infradead.org
Subject: Re: [PATCH v10 0/4] RISC-V: mm: Make SV48 the default address space
Date: Sat, 20 Jan 2024 14:13:14 +0800	[thread overview]
Message-ID: <tencent_FE461EBE274178ED6047005CCF98D710B807@qq.com> (raw)
In-Reply-To: <ZasjJ3HPUVuxr2oG@ghost>

Thanks for your reply.

On 1/20/24 09:34, Charlie Jenkins wrote:
> On Sun, Jan 14, 2024 at 01:26:57AM +0800, Yangyu Chen wrote:
>> Hi, Charlie
>>
>> Although this patchset has been merged I still have some questions about
>> this patchset. Because it breaks regular mmap if address >= 38 bits on
>> sv48 / sv57 capable systems like qemu. For example, If a userspace program
>> wants to mmap an anonymous page to addr=(1<<45) on an sv48 capable system,
>> it will fail and kernel will mmaped to another sv39 address since it does
> 
> Thank you for raising this concern. To make sure I am understanding
> correctly, you are passing a hint address of (1<<45) and expecting mmap
> to return 1<<45 and if it returns a different address you are describing
> mmap as failing? If you want an address that is in the sv48 space you
> can pass in an address that is greater than 1<<47.
> 
>> not meet the requirement to use sv48 as you wrote:
>>
>>> 	else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
>>> 		mmap_end = VA_USER_SV48;			\
>>> 	else							\
>>> 		mmap_end = VA_USER_SV39;			\
>>
>> Then, How can a userspace program create a mmap with a hint if the address
>>> = (1<<38) after your patch without MAP_FIXED? The only way to do this is
>> to pass a hint >= (1<<47) on mmap syscall then kernel will return a random
>> address in sv48 address space but the hint address gets lost. I think this
> 
> In order to force mmap to return the address provided you must use
> MAP_FIXED. Otherwise, the address is a "hint" and has no guarantees. The
> hint address on riscv is used to mean "don't give me an address that
> uses more bits than this". This behavior is not unique to riscv, arm64
> and powerpc use a similar scheme. In arch/arm64/include/asm/processor.h
> there is the following code:
> 
> #define arch_get_mmap_base(addr, base) ((addr > DEFAULT_MAP_WINDOW) ? \
> 					base + TASK_SIZE - DEFAULT_MAP_WINDOW :\
> 					base)
> 
> arm64/powerpc are only concerned with a single boundary so the code is simpler.
> 

As you say, this code in arm64/powerpc will not meet the issue I 
address. For example, If the addr here is (1<<50) on arm64, the 
arch_get_mmap_base will return base+TASK_SIZE-DEFAULT_MAP_WINDOW which 
is (1<<vabits_actual). And this behavior on arm64/powerpc/x86 does not 
break anything since we will use a larger address space if the hint 
address is specified on the address > DEFAULT_MAP_WINDOW. The 
corresponding behavior on RISC-V should be if the hint address > BIT(47) 
then use Sv57 address space and use Sv48 when the hint address > BIT(38) 
if we want Sv39 by default.

However, your patch needs the address >= BIT(47) rather than BIT(38) to 
use Sv48 and address >= BIT(56) to use Sv57, thus breaking existing 
userspace software to create mapping on the hint address without 
MAP_FIXED set.

>> violate the principle of mmap syscall as kernel should take the hint and
>> attempt to create the mapping there.
> 
> Although the man page for mmap does say "on Linux, the kernel will pick
> a nearby page boundary" it is still a hint address so there is no strict
> requirement (and the precedent has already been set by arm64/powerpc).
> 

Yeah. There is no strict requirement. But currently x86/arm64/powerpc 
works in this situation well. The hint address on these ISAs is not used 
as the upper bound to allocating the address. However, on RISC-V, you 
treat this as the upper bound.

>>
>> I don't think patching in this way is right. However, if we only revert
>> this patch, some programs relying on mmap to return address with effective
>> bits <= 48 will still be an issue and it might expand to other ISAs if
>> they implement larger virtual address space like RISC-V sv57. A better way
>> to solve this might be adding a MAP_48BIT flag to mmap like MAP_32BIT has
>> been introduced for decades.
>>
>> Thanks,
>> Yangyu Chen
>>
> 
> - Charlie
> 


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2024-01-20  6:19 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-09 23:22 [PATCH v10 0/4] RISC-V: mm: Make SV48 the default address space Charlie Jenkins
2023-08-09 23:22 ` Charlie Jenkins
2023-08-09 23:22 ` [PATCH v10 1/4] RISC-V: mm: Restrict address space for sv39,sv48,sv57 Charlie Jenkins
2023-08-09 23:22   ` Charlie Jenkins
2023-08-09 23:22 ` [PATCH v10 2/4] RISC-V: mm: Add tests for RISC-V mm Charlie Jenkins
2023-08-09 23:22   ` Charlie Jenkins
2023-08-09 23:22 ` [PATCH v10 3/4] RISC-V: mm: Update pgtable comment documentation Charlie Jenkins
2023-08-09 23:22   ` Charlie Jenkins
2023-08-09 23:22 ` [PATCH v10 4/4] RISC-V: mm: Document mmap changes Charlie Jenkins
2023-08-09 23:22   ` Charlie Jenkins
2023-08-30 13:20 ` [PATCH v10 0/4] RISC-V: mm: Make SV48 the default address space patchwork-bot+linux-riscv
2023-08-30 13:20   ` patchwork-bot+linux-riscv
2024-01-13 17:26 ` Yangyu Chen
2024-01-13 17:26   ` Yangyu Chen
2024-01-20  1:34   ` Charlie Jenkins
2024-01-20  1:34     ` Charlie Jenkins
2024-01-20  6:13     ` Yangyu Chen [this message]
2024-01-20  6:13       ` Yangyu Chen
2024-01-20  6:49       ` Charlie Jenkins
2024-01-20  6:49         ` Charlie Jenkins
2024-01-20  7:09         ` Yangyu Chen
2024-01-20  7:09           ` Yangyu Chen
2024-01-22 19:56           ` Charlie Jenkins
2024-01-22 19:56             ` Charlie Jenkins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=tencent_FE461EBE274178ED6047005CCF98D710B807@qq.com \
    --to=cyy@cyyself.name \
    --cc=alexghiti@rivosinc.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=charlie@rivosinc.com \
    --cc=conor@kernel.org \
    --cc=jrtc27@jrtc27.com \
    --cc=konstantin@linuxfoundation.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=mick@ics.forth.gr \
    --cc=palmer@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    --cc=rdunlap@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.