linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2 v7] add reserved e820 ranges to the kdump kernel e820 table
@ 2018-11-15  9:55 Lianbo Jiang
  2018-11-15  9:55 ` [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
  2018-11-15  9:55 ` [PATCH 2/2 v7] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
  0 siblings, 2 replies; 5+ messages in thread
From: Lianbo Jiang @ 2018-11-15  9:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kexec, x86, tglx, mingo, bp, akpm, dyoung, bhe

These patches add the new I/O resource descriptor 'IORES_DESC_RESERVED'
for the iomem resource search interfaces, and pass the e820 reserved
ranges to the kdump kernel.

At present, when the kexec_file_load syscall is used to load the kernel
image and initramfs (for example: kexec -s -p xxx), the upstream kernel
does not pass the e820 reserved ranges to the second kernel, which might
cause two problems:

The first one is the MMCONFIG issue. Although it does not make the
system crash or hang, it is still a potential risk, and it might prevent
hot-plug devices from being recognized in the kdump kernel, because PCI
MMCONFIG (extended mode) requires the reserved region; otherwise it
falls back to legacy mode. For example, the kdump kernel outputs the
following log.

Example:
......
[   19.798354] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[   19.800653] [Firmware Info]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
[   19.800995] PCI: not using MMCONFIG
......

The correct kernel log is like this:
......
[    0.082649] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[    0.083610] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
......
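
To illustrate the condition that the "reserved in E820" log line reports,
here is a minimal userspace sketch, not kernel code. It models a check of
whether a whole address window is covered by table entries of one type.
The names (model_e820_entry, model_mapped_all, first_kernel_tbl,
kdump_tbl) and the RAM entries are assumptions made for illustration;
only the MMCONFIG window is taken from the logs above. The table is
assumed to be sorted by address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of an e820 table; not the kernel's types. */
enum model_type { MODEL_RAM = 1, MODEL_RESERVED = 2 };

struct model_e820_entry {
	uint64_t addr;	/* start of the range */
	uint64_t size;	/* length of the range in bytes */
	enum model_type type;
};

/* Assumed first-kernel view: the MMCONFIG window is marked reserved. */
static const struct model_e820_entry first_kernel_tbl[] = {
	{ 0x00100000ULL, 0x7ff00000ULL, MODEL_RAM },
	{ 0x80000000ULL, 0x10000000ULL, MODEL_RESERVED },
};

/* Assumed kdump view without the fix: the reserved entry is missing. */
static const struct model_e820_entry kdump_tbl[] = {
	{ 0x00100000ULL, 0x7ff00000ULL, MODEL_RAM },
};

/*
 * Return true if every byte of [start, end) is covered by entries of
 * the given type. The table must be sorted by ascending address.
 */
static bool model_mapped_all(const struct model_e820_entry *tbl, size_t n,
			     uint64_t start, uint64_t end,
			     enum model_type type)
{
	assert(start < end);

	for (size_t i = 0; i < n; i++) {
		const struct model_e820_entry *e = &tbl[i];
		uint64_t e_end = e->addr + e->size;

		if (e->type != type)
			continue;
		/* Skip entries that do not overlap the remaining window. */
		if (e->addr >= end || e_end <= start)
			continue;
		/* An entry that reaches back to start covers a prefix. */
		if (e->addr <= start)
			start = e_end;
		if (start >= end)
			return true;
	}
	return false;
}
```

Calling model_mapped_all() for the window [0x80000000, 0x90000000) with
MODEL_RESERVED succeeds on first_kernel_tbl and fails on kdump_tbl, which
mirrors the difference between the two logs.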

The second issue is that the e820 reserved ranges are not set up in the
kdump kernel, which causes some functions that depend on the e820
reserved ranges to misbehave. For example:

early_memremap()->
early_memremap_pgprot_adjust()->
memremap_should_map_decrypted()->
e820__get_entry_type()

Please focus on these two functions: early_memremap_pgprot_adjust() and
memremap_should_map_decrypted().

In the first kernel, these ranges sit in the e820 reserved ranges, so
memremap_should_map_decrypted() returns true, that is to say, the
reserved memory is decrypted; early_memremap_pgprot_adjust() then
calls pgprot_decrypted() to clear the memory encryption mask.

In the second kernel, because the e820 reserved ranges are not passed
to it, these ranges do not sit in the e820 reserved ranges, so
memremap_should_map_decrypted() returns false, that is to say, the
reserved memory is treated as encrypted, and
early_memremap_pgprot_adjust() calls pgprot_encrypted() to set the
memory encryption mask.

In fact, in the second kernel, the e820 reserved memory is still
decrypted, so the resulting mapping is wrong. This issue must be fixed,
otherwise kdump won't work in this case.
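
The decision described above can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the kernel
implementation: it treats only the reserved type as "map decrypted",
and every name in it (model_e820_entry, model_get_entry_type,
model_should_map_decrypted, the two sample tables) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified e820 model; not the kernel's definitions. */
enum model_e820_type {
	MODEL_E820_NONE = 0,	/* address not described by the table */
	MODEL_E820_RAM,
	MODEL_E820_RESERVED,
};

struct model_e820_entry {
	uint64_t addr;
	uint64_t size;
	enum model_e820_type type;
};

/* Assumed first-kernel table: the range at 0x80000000 is reserved. */
static const struct model_e820_entry first_kernel_tbl[] = {
	{ 0x00100000ULL, 0x7ff00000ULL, MODEL_E820_RAM },
	{ 0x80000000ULL, 0x10000000ULL, MODEL_E820_RESERVED },
};

/* Assumed kdump table without the fix: the reserved entry was not passed. */
static const struct model_e820_entry kdump_tbl[] = {
	{ 0x00100000ULL, 0x7ff00000ULL, MODEL_E820_RAM },
};

/* Return the type of the entry containing addr, or MODEL_E820_NONE. */
static enum model_e820_type
model_get_entry_type(const struct model_e820_entry *tbl, size_t n,
		     uint64_t addr)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (addr >= tbl[i].addr && addr - tbl[i].addr < tbl[i].size)
			return tbl[i].type;
	}
	return MODEL_E820_NONE;
}

/* Simplified decision: only reserved ranges are mapped decrypted. */
static bool
model_should_map_decrypted(const struct model_e820_entry *tbl, size_t n,
			   uint64_t addr)
{
	return model_get_entry_type(tbl, n, addr) == MODEL_E820_RESERVED;
}
```

With these tables, an address inside the reserved range is mapped
decrypted when looked up in first_kernel_tbl but not when looked up in
kdump_tbl, although the memory itself has not changed.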

The e820 reserved ranges are useful in the kdump kernel, so it is
necessary to pass them to the kdump kernel.

Changes since v1:
1. Modified the value of flags to "0", when walking through the whole
tree for e820 reserved ranges.

Changes since v2:
1. Modified the value of flags to "0", when walking through the whole
tree for e820 reserved ranges.
2. Modified the invalid SOB chain issue.

Changes since v3:
1. Dropped [PATCH 1/3 v3] resource: fix an error which walks through iomem
   resources. Please refer to this commit <010a93bf97c7> "resource: Fix
   find_next_iomem_res() iteration issue"

Changes since v4:
1. Improve the patch log, and add kernel log.

Changes since v5:
1. Rewrite the patch logs.

Changes since v6:
1. Modify the [PATCH 1/2], and add the new I/O resource descriptor
   'IORES_DESC_RESERVED' for the iomem resources search interfaces.
2. Modify the [PATCH 2/2], and walk through io resource based on the
   new descriptor 'IORES_DESC_RESERVED'.

Lianbo Jiang (2):
  resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table

 arch/x86/kernel/crash.c | 6 ++++++
 arch/x86/kernel/e820.c  | 2 +-
 include/linux/ioport.h  | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

-- 
2.17.1



* [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2018-11-15  9:55 [PATCH 0/2 v7] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
@ 2018-11-15  9:55 ` Lianbo Jiang
  2018-11-22  7:42   ` Dave Young
  2018-11-15  9:55 ` [PATCH 2/2 v7] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
  1 sibling, 1 reply; 5+ messages in thread
From: Lianbo Jiang @ 2018-11-15  9:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kexec, x86, tglx, mingo, bp, akpm, dyoung, bhe

The upstream kernel cannot accurately add the e820 reserved type to
the kdump kernel e820 table.

Kdump uses walk_iomem_res_desc() to iterate over I/O resources, then
adds the matched resource ranges to the e820 table for the kdump kernel.
But when the e820 type is converted into the iores descriptor, several
e820 types are converted to 'IORES_DESC_NONE' in
e820_type_to_iores_desc(). So walk_iomem_res_desc() will get unnecessary
types (such as E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820_TYPE_RESERVED_KERN)
when walking through I/O resources by the descriptor 'IORES_DESC_NONE'.

This patch adds the new I/O resource descriptor 'IORES_DESC_RESERVED'
for the iomem resource search interfaces. It makes it possible to match
exactly the reserved resource ranges when walking through iomem
resources.
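
The ambiguity can be shown with a small userspace sketch of the mapping
before and after this change. It is only a model: the enum names and
values (model_e820_type, model_iores_desc) and the two helper functions
are hypothetical, and only a few of the kernel's types are represented.

```c
#include <assert.h>

/* Hypothetical subset of e820 types; values are not the kernel's. */
enum model_e820_type {
	MODEL_E820_RAM,
	MODEL_E820_RESERVED,
	MODEL_E820_RESERVED_KERN,
	MODEL_E820_UNUSABLE,
	MODEL_E820_NVS,
};

/* Hypothetical subset of iores descriptors; values are not the kernel's. */
enum model_iores_desc {
	MODEL_DESC_NONE = 0,
	MODEL_DESC_ACPI_NV_STORAGE,
	MODEL_DESC_RESERVED,
};

/* The dedicated descriptor must be distinguishable from the catch-all. */
static_assert(MODEL_DESC_RESERVED != MODEL_DESC_NONE,
	      "reserved needs its own descriptor");

/* Before: reserved falls through to the catch-all descriptor. */
static enum model_iores_desc model_desc_before(enum model_e820_type t)
{
	switch (t) {
	case MODEL_E820_NVS:	return MODEL_DESC_ACPI_NV_STORAGE;
	default:		return MODEL_DESC_NONE;
	}
}

/* After: reserved gets a dedicated descriptor; the rest is unchanged. */
static enum model_iores_desc model_desc_after(enum model_e820_type t)
{
	switch (t) {
	case MODEL_E820_NVS:		return MODEL_DESC_ACPI_NV_STORAGE;
	case MODEL_E820_RESERVED:	return MODEL_DESC_RESERVED;
	default:			return MODEL_DESC_NONE;
	}
}
```

In the "before" mapping a search by descriptor cannot tell reserved
ranges from RAM or unusable ranges; in the "after" mapping it can.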

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
Changes since v5:
1. Improve the patch log

Changes since v6:
1. Modify this patch, and add the new I/O resource descriptor
   'IORES_DESC_RESERVED' for the iomem resources search interfaces.
2. Improve patch log.

 arch/x86/kernel/e820.c | 2 +-
 include/linux/ioport.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 50895c2f937d..57fafdafb860 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1048,10 +1048,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
 	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
 	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
 	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
+	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
 	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
 	case E820_TYPE_RAM:		/* Fall-through: */
 	case E820_TYPE_UNUSABLE:	/* Fall-through: */
-	case E820_TYPE_RESERVED:	/* Fall-through: */
 	default:			return IORES_DESC_NONE;
 	}
 }
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..6ed59de48bd5 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -133,6 +133,7 @@ enum {
 	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
 	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
 	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
+	IORES_DESC_RESERVED			= 8,
 };
 
 /* helpers to define resources */
-- 
2.17.1



* [PATCH 2/2 v7] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table
  2018-11-15  9:55 [PATCH 0/2 v7] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
  2018-11-15  9:55 ` [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
@ 2018-11-15  9:55 ` Lianbo Jiang
  1 sibling, 0 replies; 5+ messages in thread
From: Lianbo Jiang @ 2018-11-15  9:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: kexec, x86, tglx, mingo, bp, akpm, dyoung, bhe

At present, when the kexec_file_load syscall is used to load the kernel
image and initramfs (for example: kexec -s -p xxx), the upstream kernel
does not pass the e820 reserved ranges to the second kernel, which might
cause two problems:

The first one is the MMCONFIG issue. Although it does not make the
system crash or hang, it is still a potential risk, and it might prevent
hot-plug devices from being recognized in the kdump kernel, because PCI
MMCONFIG (extended mode) requires the reserved region; otherwise it
falls back to legacy mode. For example, the kdump kernel outputs the
following log.

Example:
......
[   19.798354] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[   19.800653] [Firmware Info]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
[   19.800995] PCI: not using MMCONFIG
......

The correct kernel log is like this:
......
[    0.082649] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[    0.083610] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
......

The second issue is that the e820 reserved ranges are not set up in the
kdump kernel, which causes some functions that depend on the e820
reserved ranges to misbehave. For example:

early_memremap()->
early_memremap_pgprot_adjust()->
memremap_should_map_decrypted()->
e820__get_entry_type()

Please focus on these two functions: early_memremap_pgprot_adjust() and
memremap_should_map_decrypted().

In the first kernel, these ranges sit in the e820 reserved ranges, so
memremap_should_map_decrypted() returns true, that is to say, the
reserved memory is decrypted; early_memremap_pgprot_adjust() then
calls pgprot_decrypted() to clear the memory encryption mask.

In the second kernel, because the e820 reserved ranges are not passed
to it, these ranges do not sit in the e820 reserved ranges, so
memremap_should_map_decrypted() returns false, that is to say, the
reserved memory is treated as encrypted, and
early_memremap_pgprot_adjust() calls pgprot_encrypted() to set the
memory encryption mask.

In fact, in the second kernel, the e820 reserved memory is still
decrypted, so the resulting mapping is wrong. This issue must be fixed,
otherwise kdump won't work in this case.

The e820 reserved ranges are useful in the kdump kernel, so it is
necessary to pass them to the kdump kernel.
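
As a rough illustration of the approach, the following userspace sketch
walks a list of resources, selects those with a given descriptor, and
lets a callback append each match to a memory map with the type chosen
by the caller. It is a model under stated assumptions rather than the
kernel code: all names (model_res, model_cmd, model_walk_desc,
model_memmap_entry_cb, model_collect_reserved) are hypothetical, the
resource list is flat instead of a tree, and the second reserved range
in sample_res is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ENTRIES 8

enum model_desc { MODEL_DESC_NONE = 0, MODEL_DESC_RESERVED = 1 };
enum model_type { MODEL_TYPE_RAM = 1, MODEL_TYPE_RESERVED = 2 };

struct model_res {
	uint64_t start;
	uint64_t end;		/* inclusive end address */
	enum model_desc desc;
};

struct model_entry {
	uint64_t addr;
	uint64_t size;
	enum model_type type;
};

struct model_cmd {
	struct model_entry entries[MODEL_MAX_ENTRIES];
	size_t nr;
	enum model_type type;	/* type recorded for matched ranges */
};

typedef int (*model_walk_fn)(const struct model_res *res, void *arg);

/* Assumed resource list; the last range is made up for illustration. */
static const struct model_res sample_res[] = {
	{ 0x00100000ULL, 0x7fffffffULL, MODEL_DESC_NONE },
	{ 0x80000000ULL, 0x8fffffffULL, MODEL_DESC_RESERVED },
	{ 0xfed1c000ULL, 0xfed1ffffULL, MODEL_DESC_RESERVED },
};

static struct model_cmd sample_cmd;

/* Append one matched resource to the memory map held in cmd. */
static int model_memmap_entry_cb(const struct model_res *res, void *arg)
{
	struct model_cmd *cmd = arg;

	if (cmd->nr >= MODEL_MAX_ENTRIES)
		return -1;	/* table full: stop the walk */
	cmd->entries[cmd->nr].addr = res->start;
	cmd->entries[cmd->nr].size = res->end - res->start + 1;
	cmd->entries[cmd->nr].type = cmd->type;
	cmd->nr++;
	return 0;
}

/* Call fn for every resource whose descriptor equals desc. */
static int model_walk_desc(const struct model_res *res, size_t n,
			   enum model_desc desc, void *arg, model_walk_fn fn)
{
	assert(fn != NULL);

	for (size_t i = 0; i < n; i++) {
		int ret;

		if (res[i].desc != desc)
			continue;
		ret = fn(&res[i], arg);
		if (ret)
			return ret;	/* a nonzero result ends the walk */
	}
	return 0;
}

/* Collect the reserved ranges of sample_res; returns how many were added. */
static size_t model_collect_reserved(void)
{
	sample_cmd.nr = 0;
	sample_cmd.type = MODEL_TYPE_RESERVED;
	model_walk_desc(sample_res, sizeof(sample_res) / sizeof(sample_res[0]),
			MODEL_DESC_RESERVED, &sample_cmd,
			model_memmap_entry_cb);
	return sample_cmd.nr;
}
```

Because the walk filters on a dedicated descriptor, the RAM range in
sample_res is never handed to the callback, so only the two reserved
ranges end up in the resulting map.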

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
Changes since v5:
1. Improve the patch log

Changes since v6:
1. Modify this patch, and walk through io resource based on the
   new descriptor 'IORES_DESC_RESERVED'.
2. Add comment in the code.

 arch/x86/kernel/crash.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f631a3f15587..5354a84f1684 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -380,6 +380,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	walk_iomem_res_desc(IORES_DESC_ACPI_NV_STORAGE, flags, 0, -1, &cmd,
 			memmap_entry_callback);
 
+	/* Add e820 reserved ranges */
+	cmd.type = E820_TYPE_RESERVED;
+	flags = IORESOURCE_MEM;
+	walk_iomem_res_desc(IORES_DESC_RESERVED, flags, 0, -1, &cmd,
+			   memmap_entry_callback);
+
 	/* Add crashk_low_res region */
 	if (crashk_low_res.end) {
 		ei.addr = crashk_low_res.start;
-- 
2.17.1



* Re: [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2018-11-15  9:55 ` [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
@ 2018-11-22  7:42   ` Dave Young
  2018-11-22  8:21     ` lijiang
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Young @ 2018-11-22  7:42 UTC (permalink / raw)
  To: Lianbo Jiang; +Cc: linux-kernel, bhe, x86, kexec, mingo, bp, tglx, akpm

On 11/15/18 at 05:55pm, Lianbo Jiang wrote:
> The upstream kernel cannot accurately add the e820 reserved type to
> the kdump kernel e820 table.
> 
> Kdump uses walk_iomem_res_desc() to iterate over I/O resources, then
> adds the matched resource ranges to the e820 table for the kdump kernel.
> But when the e820 type is converted into the iores descriptor, several
> e820 types are converted to 'IORES_DESC_NONE' in
> e820_type_to_iores_desc(). So walk_iomem_res_desc() will get unnecessary
> types (such as E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820_TYPE_RESERVED_KERN)
> when walking through I/O resources by the descriptor 'IORES_DESC_NONE'.
> 
> This patch adds the new I/O resource descriptor 'IORES_DESC_RESERVED'
> for the iomem resource search interfaces. It makes it possible to match
> exactly the reserved resource ranges when walking through iomem
> resources.
> 
> Suggested-by: Dave Young <dyoung@redhat.com>
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
> Changes since v5:
> 1. Improve the patch log
> 
> Changes since v6:
> 1. Modify this patch, and add the new I/O resource descriptor
>    'IORES_DESC_RESERVED' for the iomem resources search interfaces.
> 2. Improve patch log.
> 
>  arch/x86/kernel/e820.c | 2 +-
>  include/linux/ioport.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 50895c2f937d..57fafdafb860 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -1048,10 +1048,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
>  	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
>  	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
>  	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
> +	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
>  	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
>  	case E820_TYPE_RAM:		/* Fall-through: */
>  	case E820_TYPE_UNUSABLE:	/* Fall-through: */
> -	case E820_TYPE_RESERVED:	/* Fall-through: */
>  	default:			return IORES_DESC_NONE;
>  	}
>  }
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index da0ebaec25f0..6ed59de48bd5 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -133,6 +133,7 @@ enum {
>  	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
>  	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
>  	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
> +	IORES_DESC_RESERVED			= 8,
>  };

There is more work needed for a new iores desc. Originally,
IORES_DESC_NONE was assumed to include the reserved e820 type, so all
code paths related to IORES_DESC_NONE should be carefully checked to
ensure they still work after your changes.

>  
>  /* helpers to define resources */
> -- 
> 2.17.1
> 
> 

Thanks
Dave


* Re: [PATCH 1/2 v7] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2018-11-22  7:42   ` Dave Young
@ 2018-11-22  8:21     ` lijiang
  0 siblings, 0 replies; 5+ messages in thread
From: lijiang @ 2018-11-22  8:21 UTC (permalink / raw)
  To: Dave Young; +Cc: linux-kernel, bhe, x86, kexec, mingo, bp, tglx, akpm

On 2018-11-22 15:42, Dave Young wrote:
> On 11/15/18 at 05:55pm, Lianbo Jiang wrote:
>> The upstream kernel cannot accurately add the e820 reserved type to
>> the kdump kernel e820 table.
>>
>> Kdump uses walk_iomem_res_desc() to iterate over I/O resources, then
>> adds the matched resource ranges to the e820 table for the kdump kernel.
>> But when the e820 type is converted into the iores descriptor, several
>> e820 types are converted to 'IORES_DESC_NONE' in
>> e820_type_to_iores_desc(). So walk_iomem_res_desc() will get unnecessary
>> types (such as E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820_TYPE_RESERVED_KERN)
>> when walking through I/O resources by the descriptor 'IORES_DESC_NONE'.
>>
>> This patch adds the new I/O resource descriptor 'IORES_DESC_RESERVED'
>> for the iomem resource search interfaces. It makes it possible to match
>> exactly the reserved resource ranges when walking through iomem
>> resources.
>>
>> Suggested-by: Dave Young <dyoung@redhat.com>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>> Changes since v5:
>> 1. Improve the patch log
>>
>> Changes since v6:
>> 1. Modify this patch, and add the new I/O resource descriptor
>>    'IORES_DESC_RESERVED' for the iomem resources search interfaces.
>> 2. Improve patch log.
>>
>>  arch/x86/kernel/e820.c | 2 +-
>>  include/linux/ioport.h | 1 +
>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
>> index 50895c2f937d..57fafdafb860 100644
>> --- a/arch/x86/kernel/e820.c
>> +++ b/arch/x86/kernel/e820.c
>> @@ -1048,10 +1048,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
>>  	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
>>  	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
>>  	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
>> +	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
>>  	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
>>  	case E820_TYPE_RAM:		/* Fall-through: */
>>  	case E820_TYPE_UNUSABLE:	/* Fall-through: */
>> -	case E820_TYPE_RESERVED:	/* Fall-through: */
>>  	default:			return IORES_DESC_NONE;
>>  	}
>>  }
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index da0ebaec25f0..6ed59de48bd5 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -133,6 +133,7 @@ enum {
>>  	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
>>  	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
>>  	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
>> +	IORES_DESC_RESERVED			= 8,
>>  };
> 
> There is more work needed for a new iores desc. Originally,
> IORES_DESC_NONE was assumed to include the reserved e820 type, so all
> code paths related to IORES_DESC_NONE should be carefully checked to
> ensure they still work after your changes.
> 

Thanks for the reminder.

I'm checking it and will resend v7 later. Please ignore this patch.

Regards,
Lianbo

>>  
>>  /* helpers to define resources */
>> -- 
>> 2.17.1
>>
>>
> 
> Thanks
> Dave
> 
