linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] riscv: Allow PROT_WRITE-only mmap()
@ 2022-09-08 17:01 Andrew Bresticker
  2022-09-08 17:21 ` SS JieJi
  2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Bresticker @ 2022-09-08 17:01 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Paul Walmsley, Atish Patra, Celeste Liu, dram, Ruizhe Pan,
	linux-riscv, linux-kernel, Andrew Bresticker

Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.

Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
---
 arch/riscv/kernel/sys_riscv.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..5d3f2fbeb33c 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
 		return -EINVAL;
 
-	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
-		return -EINVAL;
-
 	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
 			       offset >> (PAGE_SHIFT - page_shift_offset));
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] riscv: Allow PROT_WRITE-only mmap()
  2022-09-08 17:01 [PATCH] riscv: Allow PROT_WRITE-only mmap() Andrew Bresticker
@ 2022-09-08 17:21 ` SS JieJi
  2022-09-08 17:28   ` SS JieJi
  2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
  1 sibling, 1 reply; 12+ messages in thread
From: SS JieJi @ 2022-09-08 17:21 UTC (permalink / raw)
  To: Andrew Bresticker
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, Celeste Liu, dram,
	linux-riscv, linux-kernel

> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> inconsistent with other architectures that don't support write-only PTEs,
> creating a potential software portability issue.

I don't believe that the check is unnecessary. The missing check is
discovered in realworld scenario, while we are fixing libaio's test
failure on RISC-V [1]. A minimum reproducible example is uploaded to
https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
before/after a write attempt performed by the kernel.

[1]: https://pagure.io/libaio/blob/1b18bfafc6a2f7b9fa2c6be77a95afed8b7be448/f/harness/cases/5.t

> -       if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -               return -EINVAL;
> -

Just to mention, this revert patch is removing the check of exec
without read (--x), too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] riscv: Allow PROT_WRITE-only mmap()
  2022-09-08 17:21 ` SS JieJi
@ 2022-09-08 17:28   ` SS JieJi
  2022-09-08 18:14     ` Andrew Bresticker
  0 siblings, 1 reply; 12+ messages in thread
From: SS JieJi @ 2022-09-08 17:28 UTC (permalink / raw)
  To: Andrew Bresticker
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, Celeste Liu, dram,
	linux-riscv, linux-kernel

> https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> before/after a write attempt performed by the kernel.

That said, maybe prohibit mmap-ing -w- pages is not the best fix for
this issue. If -w- pages are irreplaceable for some use cases (and
hence need to be allowed), I'd suggest at least we need to re-fix the
read result inconsistency issue somewhere else despite simply
reverting the patch.

Yours, Pan Ruizhe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] riscv: Allow PROT_WRITE-only mmap()
  2022-09-08 17:28   ` SS JieJi
@ 2022-09-08 18:14     ` Andrew Bresticker
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Bresticker @ 2022-09-08 18:14 UTC (permalink / raw)
  To: SS JieJi
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, Celeste Liu, dram,
	linux-riscv, linux-kernel

On Thu, Sep 8, 2022 at 1:28 PM SS JieJi <c141028@gmail.com> wrote:
>
> > https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> > before/after a write attempt performed by the kernel.
>
> That said, maybe prohibit mmap-ing -w- pages is not the best fix for
> this issue. If -w- pages are irreplaceable for some use cases (and
> hence need to be allowed), I'd suggest at least we need to re-fix the
> read result inconsistency issue somewhere else despite simply
> reverting the patch.

Ah, this is because do_page_fault() also needs to be made aware of
write-implying-read. Will send a v2 shortly.

-Andrew

>
> Yours, Pan Ruizhe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-08 17:01 [PATCH] riscv: Allow PROT_WRITE-only mmap() Andrew Bresticker
  2022-09-08 17:21 ` SS JieJi
@ 2022-09-08 18:50 ` Andrew Bresticker
  2022-09-08 18:56   ` SS JieJi
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Andrew Bresticker @ 2022-09-08 18:50 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Paul Walmsley, Atish Patra, Celeste Liu, dram, Ruizhe Pan,
	linux-riscv, linux-kernel, Andrew Bresticker

Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.

Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
disallowed prior to the aforementioned commit; PROT_READ is implied in
such mappings as well.

Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
---
v1 -> v2: Update access_error() to account for write-implies-read
---
 arch/riscv/kernel/sys_riscv.c | 3 ---
 arch/riscv/mm/fault.c         | 3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..5d3f2fbeb33c 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
 		return -EINVAL;
 
-	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
-		return -EINVAL;
-
 	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
 			       offset >> (PAGE_SHIFT - page_shift_offset));
 }
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index f2fbd1400b7c..d86f7cebd4a7 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -184,7 +184,8 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
 		}
 		break;
 	case EXC_LOAD_PAGE_FAULT:
-		if (!(vma->vm_flags & VM_READ)) {
+		/* Write implies read */
+		if (!(vma->vm_flags & (VM_READ | VM_WRITE))) {
 			return true;
 		}
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
@ 2022-09-08 18:56   ` SS JieJi
  2022-09-08 19:18     ` Andrew Bresticker
  2022-09-09  3:01   ` Celeste Liu
  2022-09-09 18:52   ` Atish Patra
  2 siblings, 1 reply; 12+ messages in thread
From: SS JieJi @ 2022-09-08 18:56 UTC (permalink / raw)
  To: Andrew Bresticker
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, Celeste Liu, dram,
	linux-riscv, linux-kernel

The v2 patch looks great,
> -       if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -               return -EINVAL;
> -
This also removes the check for --x pages, which used to be present in
previous versions (before the submission of the to-be-reverted patch).
Is this intended? Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-08 18:56   ` SS JieJi
@ 2022-09-08 19:18     ` Andrew Bresticker
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Bresticker @ 2022-09-08 19:18 UTC (permalink / raw)
  To: SS JieJi
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, Celeste Liu, dram,
	linux-riscv, linux-kernel

On Thu, Sep 8, 2022 at 2:57 PM SS JieJi <c141028@gmail.com> wrote:
>
> The v2 patch looks great,
> > -       if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> > -               return -EINVAL;
> > -
> This also removes the check for --x pages, which used to be present in
> previous versions (before the submission of the to-be-reverted patch).
> Is this intended? Thanks!

There's no change in behavior for --X mappings; those have always been
allowed as it's a valid set of PTE permissions.

This patch does allow -WX mappings, which were originally disallowed
in commit e0d17c842c0f ("RISC-V: Don't allow write+exec only page
mapping request in mmap"), by implying read permissions for such
mappings as well. I have a note in the commit message about this.

-Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
  2022-09-08 18:56   ` SS JieJi
@ 2022-09-09  3:01   ` Celeste Liu
  2022-09-09 11:42     ` Coelacanthus
  2022-09-09 18:52   ` Atish Patra
  2 siblings, 1 reply; 12+ messages in thread
From: Celeste Liu @ 2022-09-09  3:01 UTC (permalink / raw)
  To: Andrew Bresticker, Palmer Dabbelt
  Cc: Paul Walmsley, Atish Patra, dram, Ruizhe Pan, linux-riscv, linux-kernel

On 2022/9/9 02:50, Andrew Bresticker wrote:
> Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
> invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
> PROT_READ with the justification that a write-only PTE is considered a
> reserved PTE permission bit pattern in the privileged spec. This check
> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> inconsistent with other architectures that don't support write-only PTEs,
> creating a potential software portability issue. Just remove the check
> altogether and let PROT_WRITE imply PROT_READ as is the case on other
> architectures.
> 
> Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
> disallowed prior to the aforementioned commit; PROT_READ is implied in
> such mappings as well.
> 
> Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> ---
> v1 -> v2: Update access_error() to account for write-implies-read
> ---
>  arch/riscv/kernel/sys_riscv.c | 3 ---
>  arch/riscv/mm/fault.c         | 3 ++-
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
> index 571556bb9261..5d3f2fbeb33c 100644
> --- a/arch/riscv/kernel/sys_riscv.c
> +++ b/arch/riscv/kernel/sys_riscv.c
> @@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>  	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
>  		return -EINVAL;
>  
> -	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -		return -EINVAL;
> -
>  	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
>  			       offset >> (PAGE_SHIFT - page_shift_offset));
>  }
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index f2fbd1400b7c..d86f7cebd4a7 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -184,7 +184,8 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
>  		}
>  		break;
>  	case EXC_LOAD_PAGE_FAULT:
> -		if (!(vma->vm_flags & VM_READ)) {
> +		/* Write implies read */
> +		if (!(vma->vm_flags & (VM_READ | VM_WRITE))) {
>  			return true;
>  		}
>  		break;

Hi, this did solve the problem and achieved consistency between
architectures, but I have a question.

Such a change specifies behavior for a state that should not exist,
and if, in the future, RISC-V spec specifies a different behavior
for that state (I mean, RVI itself has a history of not caring about
downstream, like Zicsr and Zifencei), it will create inconsistencies,
which is bad.

If we reject the "write but not read" state, the user gets the most direct
response: the state is not allowed so that they do not and cannot rely
on the behavior of the state. This will bring better time consistency
to the application if the spec specifies the behavior in the future.
But it lost architecture consistency.

How do you think this situation should be handled properly?

Yours,
Celeste Liu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-09  3:01   ` Celeste Liu
@ 2022-09-09 11:42     ` Coelacanthus
  2022-09-09 15:16       ` Andrew Bresticker
  0 siblings, 1 reply; 12+ messages in thread
From: Coelacanthus @ 2022-09-09 11:42 UTC (permalink / raw)
  To: Andrew Bresticker, Palmer Dabbelt
  Cc: Paul Walmsley, Atish Patra, dram, Ruizhe Pan, linux-riscv, linux-kernel


[-- Attachment #1.1.1: Type: text/plain, Size: 4073 bytes --]

On 2022/9/9 11:01, Celeste Liu wrote:
> On 2022/9/9 02:50, Andrew Bresticker wrote:
>> Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
>> invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
>> PROT_READ with the justification that a write-only PTE is considered a
>> reserved PTE permission bit pattern in the privileged spec. This check
>> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
>> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
>> inconsistent with other architectures that don't support write-only PTEs,
>> creating a potential software portability issue. Just remove the check
>> altogether and let PROT_WRITE imply PROT_READ as is the case on other
>> architectures.
>>
>> Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
>> disallowed prior to the aforementioned commit; PROT_READ is implied in
>> such mappings as well.
>>
>> Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
>> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
>> ---
>> v1 -> v2: Update access_error() to account for write-implies-read
>> ---
>>  arch/riscv/kernel/sys_riscv.c | 3 ---
>>  arch/riscv/mm/fault.c         | 3 ++-
>>  2 files changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
>> index 571556bb9261..5d3f2fbeb33c 100644
>> --- a/arch/riscv/kernel/sys_riscv.c
>> +++ b/arch/riscv/kernel/sys_riscv.c
>> @@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>>  	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
>>  		return -EINVAL;
>>  
>> -	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
>> -		return -EINVAL;
>> -
>>  	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
>>  			       offset >> (PAGE_SHIFT - page_shift_offset));
>>  }
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index f2fbd1400b7c..d86f7cebd4a7 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -184,7 +184,8 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
>>  		}
>>  		break;
>>  	case EXC_LOAD_PAGE_FAULT:
>> -		if (!(vma->vm_flags & VM_READ)) {
>> +		/* Write implies read */
>> +		if (!(vma->vm_flags & (VM_READ | VM_WRITE))) {
>>  			return true;
>>  		}
>>  		break;
> 
> Hi, this did solve the problem and achieved consistency between
> architectures, but I have a question.
> 
> Such a change specifies behavior for a state that should not exist,
> and if, in the future, RISC-V spec specifies a different behavior
> for that state (I mean, RVI itself has a history of not caring about
> downstream, like Zicsr and Zifencei), it will create inconsistencies,
> which is bad.
> 
> If we reject the "write but not read" state, the user gets the most direct
> response: the state is not allowed so that they do not and cannot rely
> on the behavior of the state. This will bring better time consistency
> to the application if the spec specifies the behavior in the future.
> But it lost architecture consistency.
> 
> How do you think this situation should be handled properly?
> 
> Yours,
> Celeste Liu

Oops!

I found a mistake in my previous understanding: PTE permission!=vma permission.
So your modification makes sense, no matter how we handle the mapping of input
permissions to PTEs, as long as we don't use the reserved permission combinations,
the behavior is reasonable and also independent of the architecture's definition
of PTEs.

But I think this mapping relationship should be well documented. If we have
such a mapping behavior in all architectures, then we should change this line in
the mmap documentation
    On some hardware architectures (e.g., i386), PROT_WRITE implies PROT_READ.
to apply all architectures. According to my read about code, all the vm_get_page_prot
will do the protection_map mapping to have this feature.

Yours,
Celeste Liu

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 8045 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-09 11:42     ` Coelacanthus
@ 2022-09-09 15:16       ` Andrew Bresticker
  2022-09-09 15:45         ` Celeste Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Bresticker @ 2022-09-09 15:16 UTC (permalink / raw)
  To: Coelacanthus
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, dram, Ruizhe Pan,
	linux-riscv, linux-kernel

On Fri, Sep 9, 2022 at 7:42 AM Coelacanthus <coelacanthushex@gmail.com> wrote:
>
> On 2022/9/9 11:01, Celeste Liu wrote:
> > On 2022/9/9 02:50, Andrew Bresticker wrote:
> >> Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
> >> invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
> >> PROT_READ with the justification that a write-only PTE is considered a
> >> reserved PTE permission bit pattern in the privileged spec. This check
> >> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> >> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> >> inconsistent with other architectures that don't support write-only PTEs,
> >> creating a potential software portability issue. Just remove the check
> >> altogether and let PROT_WRITE imply PROT_READ as is the case on other
> >> architectures.
> >>
> >> Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
> >> disallowed prior to the aforementioned commit; PROT_READ is implied in
> >> such mappings as well.
> >>
> >> Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
> >> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> >> ---
> >> v1 -> v2: Update access_error() to account for write-implies-read
> >> ---
> >>  arch/riscv/kernel/sys_riscv.c | 3 ---
> >>  arch/riscv/mm/fault.c         | 3 ++-
> >>  2 files changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
> >> index 571556bb9261..5d3f2fbeb33c 100644
> >> --- a/arch/riscv/kernel/sys_riscv.c
> >> +++ b/arch/riscv/kernel/sys_riscv.c
> >> @@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
> >>      if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
> >>              return -EINVAL;
> >>
> >> -    if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> >> -            return -EINVAL;
> >> -
> >>      return ksys_mmap_pgoff(addr, len, prot, flags, fd,
> >>                             offset >> (PAGE_SHIFT - page_shift_offset));
> >>  }
> >> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> >> index f2fbd1400b7c..d86f7cebd4a7 100644
> >> --- a/arch/riscv/mm/fault.c
> >> +++ b/arch/riscv/mm/fault.c
> >> @@ -184,7 +184,8 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
> >>              }
> >>              break;
> >>      case EXC_LOAD_PAGE_FAULT:
> >> -            if (!(vma->vm_flags & VM_READ)) {
> >> +            /* Write implies read */
> >> +            if (!(vma->vm_flags & (VM_READ | VM_WRITE))) {
> >>                      return true;
> >>              }
> >>              break;
> >
> > Hi, this did solve the problem and achieved consistency between
> > architectures, but I have a question.
> >
> > Such a change specifies behavior for a state that should not exist,
> > and if, in the future, RISC-V spec specifies a different behavior
> > for that state (I mean, RVI itself has a history of not caring about
> > downstream, like Zicsr and Zifencei), it will create inconsistencies,
> > which is bad.
> >
> > If we reject the "write but not read" state, the user gets the most direct
> > response: the state is not allowed so that they do not and cannot rely
> > on the behavior of the state. This will bring better time consistency
> > to the application if the spec specifies the behavior in the future.
> > But it lost architecture consistency.
> >
> > How do you think this situation should be handled properly?
> >
> > Yours,
> > Celeste Liu
>
> Oops!
>
> I found a mistake in my previous understanding: PTE permission!=vma permission.
> So your modification makes sense, no matter how we handle the mapping of input
> permissions to PTEs, as long as we don't use the reserved permission combinations,
> the behavior is reasonable and also independent of the architecture's definition
> of PTEs.
>
> But I think this mapping relationship should be well documented. If we have
> such a mapping behavior in all architectures, then we should change this line in
> the mmap documentation
>     On some hardware architectures (e.g., i386), PROT_WRITE implies PROT_READ.
> to apply all architectures. According to my read about code, all the vm_get_page_prot
> will do the protection_map mapping to have this feature.

I think leaving the PROT_WRITE-implies-PROT_READ as being specified as
architecture-dependent is reasonable, but of course portable programs
shouldn't rely on this behavior. There are CPUs out there that support
write-only mappings -- MIPS with RI/XI comes to mind and indeed
mmap(PROT_WRITE) on such CPUs results in write-only mappings.

-Andrew

>
> Yours,
> Celeste Liu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-09 15:16       ` Andrew Bresticker
@ 2022-09-09 15:45         ` Celeste Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Celeste Liu @ 2022-09-09 15:45 UTC (permalink / raw)
  To: Andrew Bresticker
  Cc: Palmer Dabbelt, Paul Walmsley, Atish Patra, dram, Ruizhe Pan,
	linux-riscv, linux-kernel

On 2022/9/9 23:16, Andrew Bresticker wrote> 
> I think leaving the PROT_WRITE-implies-PROT_READ as being specified as
> architecture-dependent is reasonable, but of course portable programs
> shouldn't rely on this behavior. There are CPUs out there that support
> write-only mappings -- MIPS with RI/XI comes to mind and indeed
> mmap(PROT_WRITE) on such CPUs results in write-only mappings.
> 
> -Andrew
> 

Ok, I have no question now. This patch looks good to me.

This feature shouldn't be relied upon indeed, as it depends on the specific
hardware implementation.

Thanks for your explanation!

Yours,
Celeste Liu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ
  2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
  2022-09-08 18:56   ` SS JieJi
  2022-09-09  3:01   ` Celeste Liu
@ 2022-09-09 18:52   ` Atish Patra
  2 siblings, 0 replies; 12+ messages in thread
From: Atish Patra @ 2022-09-09 18:52 UTC (permalink / raw)
  To: Andrew Bresticker
  Cc: Palmer Dabbelt, Paul Walmsley, Celeste Liu, dram, Ruizhe Pan,
	linux-riscv, linux-kernel

On Thu, Sep 8, 2022 at 11:50 AM Andrew Bresticker <abrestic@rivosinc.com> wrote:
>
> Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
> invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
> PROT_READ with the justification that a write-only PTE is considered a
> reserved PTE permission bit pattern in the privileged spec. This check
> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> inconsistent with other architectures that don't support write-only PTEs,
> creating a potential software portability issue. Just remove the check
> altogether and let PROT_WRITE imply PROT_READ as is the case on other
> architectures.
>
> Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
> disallowed prior to the aforementioned commit; PROT_READ is implied in
> such mappings as well.
>
> Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
> ---
> v1 -> v2: Update access_error() to account for write-implies-read
> ---
>  arch/riscv/kernel/sys_riscv.c | 3 ---
>  arch/riscv/mm/fault.c         | 3 ++-
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
> index 571556bb9261..5d3f2fbeb33c 100644
> --- a/arch/riscv/kernel/sys_riscv.c
> +++ b/arch/riscv/kernel/sys_riscv.c
> @@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>         if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
>                 return -EINVAL;
>
> -       if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -               return -EINVAL;
> -
>         return ksys_mmap_pgoff(addr, len, prot, flags, fd,
>                                offset >> (PAGE_SHIFT - page_shift_offset));
>  }
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index f2fbd1400b7c..d86f7cebd4a7 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -184,7 +184,8 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
>                 }
>                 break;
>         case EXC_LOAD_PAGE_FAULT:
> -               if (!(vma->vm_flags & VM_READ)) {
> +               /* Write implies read */
> +               if (!(vma->vm_flags & (VM_READ | VM_WRITE))) {
>                         return true;
>                 }
>                 break;

This should be a separate patch with commit text about VMA permissions.

> --
> 2.25.1
>

Otherwise, lgtm.

Reviewed-by: Atish Patra <atishp@rivosinc.com>

-- 
Regards,
Atish

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-09-09 18:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08 17:01 [PATCH] riscv: Allow PROT_WRITE-only mmap() Andrew Bresticker
2022-09-08 17:21 ` SS JieJi
2022-09-08 17:28   ` SS JieJi
2022-09-08 18:14     ` Andrew Bresticker
2022-09-08 18:50 ` [PATCH v2] riscv: Make mmap() with PROT_WRITE imply PROT_READ Andrew Bresticker
2022-09-08 18:56   ` SS JieJi
2022-09-08 19:18     ` Andrew Bresticker
2022-09-09  3:01   ` Celeste Liu
2022-09-09 11:42     ` Coelacanthus
2022-09-09 15:16       ` Andrew Bresticker
2022-09-09 15:45         ` Celeste Liu
2022-09-09 18:52   ` Atish Patra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).