linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] riscv: mem= support, ioremap exec fix
@ 2020-02-15 11:49 Jan Kiszka
  2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jan Kiszka @ 2020-02-15 11:49 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv; +Cc: linux-kernel

Patch 1 was already sent separately. This improves it by calling
memblock_enforce_memory_limit in order to call set_max_mapnr with the
correct memory size.

There are two more patches in this series. One is a micro-optimization,
the other enables ioremap of executable memory. The latter will be
needed to enable the Jailhouse hypervisor on RISC-V.

Jan

Jan Kiszka (3):
  riscv: Add support for mem=
  riscv: End kernel region search in setup_bootmem earlier
  riscv: Fix crash when flushing executable ioremap regions

 arch/riscv/mm/cacheflush.c |  3 ++-
 arch/riscv/mm/init.c       | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

--
2.16.4


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/3] riscv: Add support for mem=
  2020-02-15 11:49 [PATCH v2 0/3] riscv: mem= support, ioremap exec fix Jan Kiszka
@ 2020-02-15 11:49 ` Jan Kiszka
  2020-02-15 13:40   ` Anup Patel
  2020-02-15 13:44   ` Nikolay Borisov
  2020-02-15 11:49 ` [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier Jan Kiszka
  2020-02-15 11:49 ` [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions Jan Kiszka
  2 siblings, 2 replies; 15+ messages in thread
From: Jan Kiszka @ 2020-02-15 11:49 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv; +Cc: linux-kernel

From: Jan Kiszka <jan.kiszka@siemens.com>

This sets a memory limit provided via mem= on the command line,
analogously to many other architectures.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/riscv/mm/init.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 965a8cf4829c..aec39a56d6cf 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -118,6 +118,23 @@ static void __init setup_initrd(void)
 }
 #endif /* CONFIG_BLK_DEV_INITRD */

+static phys_addr_t memory_limit = PHYS_ADDR_MAX;
+
+/*
+ * Limit the memory size that was specified via FDT.
+ */
+static int __init early_mem(char *p)
+{
+	if (!p)
+		return 1;
+
+	memory_limit = memparse(p, &p) & PAGE_MASK;
+	pr_notice("Memory limited to %lldMB\n", memory_limit >> 20);
+
+	return 0;
+}
+early_param("mem", early_mem);
+
 static phys_addr_t dtb_early_pa __initdata;

 void __init setup_bootmem(void)
@@ -127,6 +144,8 @@ void __init setup_bootmem(void)
 	phys_addr_t vmlinux_end = __pa_symbol(&_end);
 	phys_addr_t vmlinux_start = __pa_symbol(&_start);

+	memblock_enforce_memory_limit(memory_limit);
+
 	/* Find the memory region containing the kernel */
 	for_each_memblock(memory, reg) {
 		phys_addr_t end = reg->base + reg->size;
--
2.16.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier
  2020-02-15 11:49 [PATCH v2 0/3] riscv: mem= support, ioremap exec fix Jan Kiszka
  2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
@ 2020-02-15 11:49 ` Jan Kiszka
  2020-02-16 14:42   ` Alex Ghiti
  2020-02-15 11:49 ` [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions Jan Kiszka
  2 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2020-02-15 11:49 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv; +Cc: linux-kernel

From: Jan Kiszka <jan.kiszka@siemens.com>

No need to look further when that single region is found.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/riscv/mm/init.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index aec39a56d6cf..a774547e9021 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -160,6 +160,8 @@ void __init setup_bootmem(void)
 			if (reg->base + mem_size < end)
 				memblock_remove(reg->base + mem_size,
 						end - reg->base - mem_size);
+
+			break;
 		}
 	}
 	BUG_ON(mem_size == 0);
--
2.16.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-15 11:49 [PATCH v2 0/3] riscv: mem= support, ioremap exec fix Jan Kiszka
  2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
  2020-02-15 11:49 ` [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier Jan Kiszka
@ 2020-02-15 11:49 ` Jan Kiszka
  2020-02-16 14:41   ` Alex Ghiti
  2 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2020-02-15 11:49 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv; +Cc: linux-kernel

From: Jan Kiszka <jan.kiszka@siemens.com>

Those are not backed by page structs, and pte_page is returning an
invalid pointer.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/riscv/mm/cacheflush.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 8930ab7278e6..9ee2c1a387cc 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
 {
 	struct page *page = pte_page(pte);

-	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+	if (!pfn_valid(pte_pfn(pte)) ||
+	    !test_and_set_bit(PG_dcache_clean, &page->flags))
 		flush_icache_all();
 }
 #endif /* CONFIG_MMU */
--
2.16.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] riscv: Add support for mem=
  2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
@ 2020-02-15 13:40   ` Anup Patel
  2020-02-15 13:44   ` Nikolay Borisov
  1 sibling, 0 replies; 15+ messages in thread
From: Anup Patel @ 2020-02-15 13:40 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
	linux-kernel@vger.kernel.org List

On Sat, Feb 15, 2020 at 5:20 PM Jan Kiszka <jan.kiszka@web.de> wrote:
>
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> This sets a memory limit provided via mem= on the command line,
> analogously to many other architectures.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/riscv/mm/init.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 965a8cf4829c..aec39a56d6cf 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -118,6 +118,23 @@ static void __init setup_initrd(void)
>  }
>  #endif /* CONFIG_BLK_DEV_INITRD */
>
> +static phys_addr_t memory_limit = PHYS_ADDR_MAX;
> +
> +/*
> + * Limit the memory size that was specified via FDT.
> + */
> +static int __init early_mem(char *p)
> +{
> +       if (!p)
> +               return 1;
> +
> +       memory_limit = memparse(p, &p) & PAGE_MASK;
> +       pr_notice("Memory limited to %lldMB\n", memory_limit >> 20);
> +
> +       return 0;
> +}
> +early_param("mem", early_mem);
> +
>  static phys_addr_t dtb_early_pa __initdata;
>
>  void __init setup_bootmem(void)
> @@ -127,6 +144,8 @@ void __init setup_bootmem(void)
>         phys_addr_t vmlinux_end = __pa_symbol(&_end);
>         phys_addr_t vmlinux_start = __pa_symbol(&_start);
>
> +       memblock_enforce_memory_limit(memory_limit);
> +
>         /* Find the memory region containing the kernel */
>         for_each_memblock(memory, reg) {
>                 phys_addr_t end = reg->base + reg->size;
> --
> 2.16.4
>
>

This is a good addition for Linux RISC-V.

Looks good to me.

Reviewed-by: Anup Patel <anup@brainfault.org>

Regards,
Anup

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] riscv: Add support for mem=
  2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
  2020-02-15 13:40   ` Anup Patel
@ 2020-02-15 13:44   ` Nikolay Borisov
  2020-02-15 14:23     ` Jan Kiszka
  1 sibling, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2020-02-15 13:44 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel



On 15.02.20 г. 13:49 ч., Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This sets a memory limit provided via mem=3D on the command line,
> analogously to many other architectures.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> =2D--
>  arch/riscv/mm/init.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 965a8cf4829c..aec39a56d6cf 100644
> =2D-- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -118,6 +118,23 @@ static void __init setup_initrd(void)
>  }
>  #endif /* CONFIG_BLK_DEV_INITRD */
> 
> +static phys_addr_t memory_limit =3D PHYS_ADDR_MAX;

3d is the ascii code for =, meaning your client is somehow br0ken?
> +
> +/*
> + * Limit the memory size that was specified via FDT.
> + */
> +static int __init early_mem(char *p)
> +{
> +	if (!p)
> +		return 1;
> +
> +	memory_limit =3D memparse(p, &p) & PAGE_MASK;

ditto

> +	pr_notice("Memory limited to %lldMB\n", memory_limit >> 20);
> +
> +	return 0;
> +}
> +early_param("mem", early_mem);
> +
>  static phys_addr_t dtb_early_pa __initdata;
> 
>  void __init setup_bootmem(void)
> @@ -127,6 +144,8 @@ void __init setup_bootmem(void)
>  	phys_addr_t vmlinux_end =3D __pa_symbol(&_end);
>  	phys_addr_t vmlinux_start =3D __pa_symbol(&_start);
> 
> +	memblock_enforce_memory_limit(memory_limit);
> +
>  	/* Find the memory region containing the kernel */
>  	for_each_memblock(memory, reg) {
>  		phys_addr_t end =3D reg->base + reg->size;
> =2D-
> 2.16.4
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] riscv: Add support for mem=
  2020-02-15 13:44   ` Nikolay Borisov
@ 2020-02-15 14:23     ` Jan Kiszka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2020-02-15 14:23 UTC (permalink / raw)
  To: Nikolay Borisov, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

On 15.02.20 14:44, Nikolay Borisov wrote:
>
>
> On 15.02.20 г. 13:49 ч., Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> This sets a memory limit provided via mem=3D on the command line,
>> analogously to many other architectures.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> =2D--
>>   arch/riscv/mm/init.c | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 965a8cf4829c..aec39a56d6cf 100644
>> =2D-- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -118,6 +118,23 @@ static void __init setup_initrd(void)
>>   }
>>   #endif /* CONFIG_BLK_DEV_INITRD */
>>
>> +static phys_addr_t memory_limit =3D PHYS_ADDR_MAX;
>
> 3d is the ascii code for =, meaning your client is somehow br0ken?

The client is called git send-email, and I just checked what was passed
to it - all fine. It must be my beloved freemail provider that enables
quoted-printable encoding for this series (interestingly not for another
one I sent to a different community today). I've resent the same patch
files to both corporate and private providers, and only the latter shows
this behavior. Sigh.

IIRC, git am processes this correctly, but I can resent via the
corporate server as well, whatever is preferred.

Thanks,
Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-15 11:49 ` [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions Jan Kiszka
@ 2020-02-16 14:41   ` Alex Ghiti
  2020-02-16 16:05     ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Ghiti @ 2020-02-16 14:41 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

Hi Jan,

On 2/15/20 6:49 AM, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Those are not backed by page structs, and pte_page is returning an
> invalid pointer.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> =2D--
>   arch/riscv/mm/cacheflush.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 8930ab7278e6..9ee2c1a387cc 100644
> =2D-- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
>   {
>   	struct page *page =3D pte_page(pte);
> 
> -	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
> +	if (!pfn_valid(pte_pfn(pte)) ||
> +	    !test_and_set_bit(PG_dcache_clean, &page->flags))
>   		flush_icache_all();
>   }
>   #endif /* CONFIG_MMU */
> =2D-
> 2.16.4
> 
> 

When did you encounter such a situation ? i.e. executable code that is 
not backed by struct page ?

Riscv uses the generic implementation of ioremap and the way 
_PAGE_IOREMAP is defined does not allow to map executable memory region 
using ioremap, so I'm interested to understand how we end up in 
flush_icache_pte for an executable region not backed by any struct page.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier
  2020-02-15 11:49 ` [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier Jan Kiszka
@ 2020-02-16 14:42   ` Alex Ghiti
  2020-02-16 16:06     ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Ghiti @ 2020-02-16 14:42 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

Hi Jan,

On 2/15/20 6:49 AM, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> No need to look further when that single region is found.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> =2D--
>   arch/riscv/mm/init.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index aec39a56d6cf..a774547e9021 100644
> =2D-- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -160,6 +160,8 @@ void __init setup_bootmem(void)
>   			if (reg->base + mem_size < end)
>   				memblock_remove(reg->base + mem_size,
>   						end - reg->base - mem_size);
> +
> +			break;
>   		}
>   	}
>   	BUG_ON(mem_size =3D=3D 0);
> =2D-
> 2.16.4
> 
> 

I was looking at the test above that determines if the current memblock 
contains the kernel:

if (reg->base <= vmlinux_end && vmlinux_end <= end)

Shouldn't it be:

if (reg->base <= vmlinux_start && vmlinux_end <= end)

?

Otherwise, we can indeed stop as soon as we found the region containing 
the kernel, so feel free to add:

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>

Thanks,

Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-16 14:41   ` Alex Ghiti
@ 2020-02-16 16:05     ` Jan Kiszka
  2020-02-16 19:56       ` Alex Ghiti
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2020-02-16 16:05 UTC (permalink / raw)
  To: Alex Ghiti, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

On 16.02.20 15:41, Alex Ghiti wrote:
> Hi Jan,
>
> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> Those are not backed by page structs, and pte_page is returning an
>> invalid pointer.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> =2D--
>>   arch/riscv/mm/cacheflush.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>> index 8930ab7278e6..9ee2c1a387cc 100644
>> =2D-- a/arch/riscv/mm/cacheflush.c
>> +++ b/arch/riscv/mm/cacheflush.c
>> @@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
>>   {
>>       struct page *page =3D pte_page(pte);
>>
>> -    if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>> +    if (!pfn_valid(pte_pfn(pte)) ||
>> +        !test_and_set_bit(PG_dcache_clean, &page->flags))
>>           flush_icache_all();
>>   }
>>   #endif /* CONFIG_MMU */
>> =2D-
>> 2.16.4
>>
>>
>
> When did you encounter such a situation ? i.e. executable code that is
> not backed by struct page ?
>
> Riscv uses the generic implementation of ioremap and the way
> _PAGE_IOREMAP is defined does not allow to map executable memory region
> using ioremap, so I'm interested to understand how we end up in
> flush_icache_pte for an executable region not backed by any struct page.

You can create executable mappings of memory that Linux does not
initially consider as RAM via ioremap_prot or ioremap_page_range. We are
using that in Jailhouse to load the hypervisor code into reserved memory
that is ioremapped for the purpose. Works fine on x86, arm and arm64.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier
  2020-02-16 14:42   ` Alex Ghiti
@ 2020-02-16 16:06     ` Jan Kiszka
  2020-02-16 19:57       ` Alex Ghiti
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2020-02-16 16:06 UTC (permalink / raw)
  To: Alex Ghiti, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

On 16.02.20 15:42, Alex Ghiti wrote:
> Hi Jan,
>
> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> No need to look further when that single region is found.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> =2D--
>>   arch/riscv/mm/init.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index aec39a56d6cf..a774547e9021 100644
>> =2D-- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -160,6 +160,8 @@ void __init setup_bootmem(void)
>>               if (reg->base + mem_size < end)
>>                   memblock_remove(reg->base + mem_size,
>>                           end - reg->base - mem_size);
>> +
>> +            break;
>>           }
>>       }
>>       BUG_ON(mem_size =3D=3D 0);
>> =2D-
>> 2.16.4
>>
>>
>
> I was looking at the test above that determines if the current memblock
> contains the kernel:
>
> if (reg->base <= vmlinux_end && vmlinux_end <= end)
>
> Shouldn't it be:
>
> if (reg->base <= vmlinux_start && vmlinux_end <= end)
>
> ?

Yes, I think you are right. Would you like to send a patch that fixes this?

>
> Otherwise, we can indeed stop as soon as we found the region containing
> the kernel, so feel free to add:
>
> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
>

Thanks,
Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-16 16:05     ` Jan Kiszka
@ 2020-02-16 19:56       ` Alex Ghiti
  2020-02-20  5:49         ` Alex Ghiti
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Ghiti @ 2020-02-16 19:56 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

On 2/16/20 11:05 AM, Jan Kiszka wrote:
> On 16.02.20 15:41, Alex Ghiti wrote:
>> Hi Jan,
>>
>> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> Those are not backed by page structs, and pte_page is returning an
>>> invalid pointer.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> =2D--
>>>   arch/riscv/mm/cacheflush.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>>> index 8930ab7278e6..9ee2c1a387cc 100644
>>> =2D-- a/arch/riscv/mm/cacheflush.c
>>> +++ b/arch/riscv/mm/cacheflush.c
>>> @@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
>>>   {
>>>       struct page *page =3D pte_page(pte);
>>>
>>> -    if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>>> +    if (!pfn_valid(pte_pfn(pte)) ||
>>> +        !test_and_set_bit(PG_dcache_clean, &page->flags))
>>>           flush_icache_all();
>>>   }
>>>   #endif /* CONFIG_MMU */
>>> =2D-
>>> 2.16.4
>>>
>>>
>>
>> When did you encounter such a situation ? i.e. executable code that is
>> not backed by struct page ?
>>
>> Riscv uses the generic implementation of ioremap and the way
>> _PAGE_IOREMAP is defined does not allow to map executable memory region
>> using ioremap, so I'm interested to understand how we end up in
>> flush_icache_pte for an executable region not backed by any struct page.
> 
> You can create executable mappings of memory that Linux does not
> initially consider as RAM via ioremap_prot or ioremap_page_range. We are
> using that in Jailhouse to load the hypervisor code into reserved memory
> that is ioremapped for the purpose. Works fine on x86, arm and arm64.
> 
> Jan

Ok thanks, I had missed this API.

Regarding your patch, I find it weird to do anything if the pfn is 
invalid, we could have garbage in pte pointing to an invalid region for 
example (I admit that the effect of flushing the icache would not be 
catastrophic in that situation).

I'm not saying I will come with a better solution but I'll take a deeper 
look tomorrow.

Alex


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier
  2020-02-16 16:06     ` Jan Kiszka
@ 2020-02-16 19:57       ` Alex Ghiti
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Ghiti @ 2020-02-16 19:57 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel


On 2/16/20 11:06 AM, Jan Kiszka wrote:
> On 16.02.20 15:42, Alex Ghiti wrote:
>> Hi Jan,
>>
>> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> No need to look further when that single region is found.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> =2D--
>>>   arch/riscv/mm/init.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>>> index aec39a56d6cf..a774547e9021 100644
>>> =2D-- a/arch/riscv/mm/init.c
>>> +++ b/arch/riscv/mm/init.c
>>> @@ -160,6 +160,8 @@ void __init setup_bootmem(void)
>>>               if (reg->base + mem_size < end)
>>>                   memblock_remove(reg->base + mem_size,
>>>                           end - reg->base - mem_size);
>>> +
>>> +            break;
>>>           }
>>>       }
>>>       BUG_ON(mem_size =3D=3D 0);
>>> =2D-
>>> 2.16.4
>>>
>>>
>>
>> I was looking at the test above that determines if the current memblock
>> contains the kernel:
>>
>> if (reg->base <= vmlinux_end && vmlinux_end <= end)
>>
>> Shouldn't it be:
>>
>> if (reg->base <= vmlinux_start && vmlinux_end <= end)
>>
>> ?
> 
> Yes, I think you are right. Would you like to send a patch that fixes this?

Thanks for confirming, I'll send a patch tomorrow and cc stable too.

Alex

> 
>>
>> Otherwise, we can indeed stop as soon as we found the region containing
>> the kernel, so feel free to add:
>>
>> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
>>
> 
> Thanks,
> Jan
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-16 19:56       ` Alex Ghiti
@ 2020-02-20  5:49         ` Alex Ghiti
  2020-02-20  6:38           ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Ghiti @ 2020-02-20  5:49 UTC (permalink / raw)
  To: Jan Kiszka, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

Hi Jan,

On 2/16/20 2:56 PM, Alex Ghiti wrote:
> On 2/16/20 11:05 AM, Jan Kiszka wrote:
>> On 16.02.20 15:41, Alex Ghiti wrote:
>>> Hi Jan,
>>>
>>> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>
>>>> Those are not backed by page structs, and pte_page is returning an
>>>> invalid pointer.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> =2D--
>>>>   arch/riscv/mm/cacheflush.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>>>> index 8930ab7278e6..9ee2c1a387cc 100644
>>>> =2D-- a/arch/riscv/mm/cacheflush.c
>>>> +++ b/arch/riscv/mm/cacheflush.c
>>>> @@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
>>>>   {
>>>>       struct page *page =3D pte_page(pte);
>>>>
>>>> -    if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>>>> +    if (!pfn_valid(pte_pfn(pte)) ||
>>>> +        !test_and_set_bit(PG_dcache_clean, &page->flags))
>>>>           flush_icache_all();
>>>>   }
>>>>   #endif /* CONFIG_MMU */
>>>> =2D-
>>>> 2.16.4
>>>>
>>>>
>>>
>>> When did you encounter such a situation ? i.e. executable code that is
>>> not backed by struct page ?
>>>
>>> Riscv uses the generic implementation of ioremap and the way
>>> _PAGE_IOREMAP is defined does not allow to map executable memory region
>>> using ioremap, so I'm interested to understand how we end up in
>>> flush_icache_pte for an executable region not backed by any struct page.
>>
>> You can create executable mappings of memory that Linux does not
>> initially consider as RAM via ioremap_prot or ioremap_page_range. We are
>> using that in Jailhouse to load the hypervisor code into reserved memory
>> that is ioremapped for the purpose. Works fine on x86, arm and arm64.
>>
>> Jan
> 
> Ok thanks, I had missed this API.
> 
> Regarding your patch, I find it weird to do anything if the pfn is 
> invalid, we could have garbage in pte pointing to an invalid region for 
> example (I admit that the effect of flushing the icache would not be 
> catastrophic in that situation).
> 
> I'm not saying I will come with a better solution but I'll take a deeper 
> look tomorrow.
> 
> Alex
> 

I took a look at the Jailhouse driver. After loading the hypervisor into 
the ioremapped region, it explicitly ensures icache/dcache consistency 
by calling flush_icache_range here:

https://github.com/siemens/jailhouse/blob/master/driver/main.c#L505

There seems to be an implicit (?) rule that states that in-kernel code 
modification must handle icache/dcache consistency:

In arm64 set_pte_at definition, they do not sync icache/dcache when the 
pte is kernel:

https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/pgtable.h#L271

In mips, they do the same:

https://elixir.bootlin.com/linux/latest/source/arch/mips/mm/cache.c#L137

So funnily, I'd do the contrary of what you have done, the mips way:

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 8930ab7278e6..c90c8bb49109 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -84,6 +84,9 @@ void flush_icache_pte(pte_t pte)
  {
         struct page *page = pte_page(pte);

+       if (unlikely(!pfn_valid(pte_pfn(pte))))
+               return;
+
         if (!test_and_set_bit(PG_dcache_clean, &page->flags))
                 flush_icache_all();
  }

What do you think ?

Alex

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions
  2020-02-20  5:49         ` Alex Ghiti
@ 2020-02-20  6:38           ` Jan Kiszka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2020-02-20  6:38 UTC (permalink / raw)
  To: Alex Ghiti, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv
  Cc: linux-kernel

On 20.02.20 06:49, Alex Ghiti wrote:
> Hi Jan,
>
> On 2/16/20 2:56 PM, Alex Ghiti wrote:
>> On 2/16/20 11:05 AM, Jan Kiszka wrote:
>>> On 16.02.20 15:41, Alex Ghiti wrote:
>>>> Hi Jan,
>>>>
>>>> On 2/15/20 6:49 AM, Jan Kiszka wrote:
>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>
>>>>> Those are not backed by page structs, and pte_page is returning an
>>>>> invalid pointer.
>>>>>
>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> =2D--
>>>>>   arch/riscv/mm/cacheflush.c | 3 ++-
>>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>>>>> index 8930ab7278e6..9ee2c1a387cc 100644
>>>>> =2D-- a/arch/riscv/mm/cacheflush.c
>>>>> +++ b/arch/riscv/mm/cacheflush.c
>>>>> @@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
>>>>>   {
>>>>>       struct page *page =3D pte_page(pte);
>>>>>
>>>>> -    if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>>>>> +    if (!pfn_valid(pte_pfn(pte)) ||
>>>>> +        !test_and_set_bit(PG_dcache_clean, &page->flags))
>>>>>           flush_icache_all();
>>>>>   }
>>>>>   #endif /* CONFIG_MMU */
>>>>> =2D-
>>>>> 2.16.4
>>>>>
>>>>>
>>>>
>>>> When did you encounter such a situation ? i.e. executable code that is
>>>> not backed by struct page ?
>>>>
>>>> Riscv uses the generic implementation of ioremap and the way
>>>> _PAGE_IOREMAP is defined does not allow to map executable memory region
>>>> using ioremap, so I'm interested to understand how we end up in
>>>> flush_icache_pte for an executable region not backed by any struct
>>>> page.
>>>
>>> You can create executable mappings of memory that Linux does not
>>> initially consider as RAM via ioremap_prot or ioremap_page_range. We are
>>> using that in Jailhouse to load the hypervisor code into reserved memory
>>> that is ioremapped for the purpose. Works fine on x86, arm and arm64.
>>>
>>> Jan
>>
>> Ok thanks, I had missed this API.
>>
>> Regarding your patch, I find it weird to do anything if the pfn is
>> invalid, we could have garbage in pte pointing to an invalid region
>> for example (I admit that the effect of flushing the icache would not
>> be catastrophic in that situation).
>>
>> I'm not saying I will come with a better solution but I'll take a
>> deeper look tomorrow.
>>
>> Alex
>>
>
> I took a look at the Jailhouse driver. After loading the hypervisor into
> the ioremapped region, it explicitly ensures icache/dcache consistency
> by calling flush_icache_range here:
>
> https://github.com/siemens/jailhouse/blob/master/driver/main.c#L505
>

Yeah, the arm64 port needed this.

> There seems to be an implicit (?) rule that states that in-kernel code
> modification must handle icache/dcache consistency:
>
> In arm64 set_pte_at definition, they do not sync icache/dcache when the
> pte is kernel:
>
> https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/pgtable.h#L271
>
>
> In mips, they do the same:
>
> https://elixir.bootlin.com/linux/latest/source/arch/mips/mm/cache.c#L137
>
> So funnily, I'd do the contrary of what you have done, the mips way:
>
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 8930ab7278e6..c90c8bb49109 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -84,6 +84,9 @@ void flush_icache_pte(pte_t pte)
>   {
>          struct page *page = pte_page(pte);
>
> +       if (unlikely(!pfn_valid(pte_pfn(pte))))
> +               return;
> +
>          if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>                  flush_icache_all();
>   }
>
> What do you think ?
>

I wouldn't mind doing it like above. I suspect that became the common
simple pattern because no one expected a use case like with Jailhouse.
But I'm by far not an expert in mm topics in the kernel.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-02-20  6:38 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-15 11:49 [PATCH v2 0/3] riscv: mem= support, ioremap exec fix Jan Kiszka
2020-02-15 11:49 ` [PATCH v2 1/3] riscv: Add support for mem= Jan Kiszka
2020-02-15 13:40   ` Anup Patel
2020-02-15 13:44   ` Nikolay Borisov
2020-02-15 14:23     ` Jan Kiszka
2020-02-15 11:49 ` [PATCH v2 2/3] riscv: End kernel region search in setup_bootmem earlier Jan Kiszka
2020-02-16 14:42   ` Alex Ghiti
2020-02-16 16:06     ` Jan Kiszka
2020-02-16 19:57       ` Alex Ghiti
2020-02-15 11:49 ` [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions Jan Kiszka
2020-02-16 14:41   ` Alex Ghiti
2020-02-16 16:05     ` Jan Kiszka
2020-02-16 19:56       ` Alex Ghiti
2020-02-20  5:49         ` Alex Ghiti
2020-02-20  6:38           ` Jan Kiszka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).