linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
@ 2013-07-19 17:47 Toshi Kani
  2013-07-19 19:30 ` KOSAKI Motohiro
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-19 17:47 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis, Toshi Kani

CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
interface, which allows a given memory address to be hot-added as
follows. (See Documentation/memory-hotplug.txt for more detail.)

# echo start_address_of_new_memory > /sys/devices/system/memory/probe

This probe interface is required on powerpc. On x86, however, ACPI
notifies a memory hotplug event to the kernel, which performs its
hotplug operation as the result. Therefore, regular users do not need
this interface on x86. This probe interface is also error-prone and
misleading that the kernel blindly adds a given memory address without
checking if the memory is present on the system; no probing is done
despite of its name. The kernel crashes when a user requests to online
a memory block that is not present on the system. This interface is
currently used for testing as it can fake a hotplug event.

This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
its Kconfig menu entry on x86, and clarifies its use in Documentation/
memory-hotplug.txt.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 Documentation/memory-hotplug.txt |   16 +++++++++-------
 arch/x86/Kconfig                 |    7 ++++++-
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 8e5eacb..8fd254c 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -210,13 +210,15 @@ If memory device is found, memory hotplug code will be called.
 
 4.2 Notify memory hot-add event by hand
 ------------
-In some environments, especially virtualized environment, firmware will not
-notify memory hotplug event to the kernel. For such environment, "probe"
-interface is supported. This interface depends on CONFIG_ARCH_MEMORY_PROBE.
-
-Now, CONFIG_ARCH_MEMORY_PROBE is supported only by powerpc but it does not
-contain highly architecture codes. Please add config if you need "probe"
-interface.
+On powerpc, the firmware does not notify a memory hotplug event to the kernel.
+Therefore, "probe" interface is supported to notify the event to the kernel.
+This interface depends on CONFIG_ARCH_MEMORY_PROBE.
+
+CONFIG_ARCH_MEMORY_PROBE is supported on powerpc only. On x86, this config
+option is disabled by default since ACPI notifies a memory hotplug event to
+the kernel, which performs its hotplug operation as the result. Please
+enable this option if you need the "probe" interface for testing purposes
+on x86.
 
 Probe interface is located at
 /sys/devices/system/memory/probe
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b32ebf9..408ef68 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1344,8 +1344,13 @@ config ARCH_SELECT_MEMORY_MODEL
 	depends on ARCH_SPARSEMEM_ENABLE
 
 config ARCH_MEMORY_PROBE
-	def_bool y
+	bool "Enable sysfs memory/probe interface"
+	default n
 	depends on X86_64 && MEMORY_HOTPLUG
+	help
+	  This option enables a sysfs memory/probe interface for testing.
+	  See Documentation/memory-hotplug.txt for more information.
+	  If you are unsure how to answer this question, answer N.
 
 config ARCH_PROC_KCORE_TEXT
 	def_bool y

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
@ 2013-07-19 19:30 ` KOSAKI Motohiro
  2013-07-19 19:35   ` Toshi Kani
  2013-07-22  8:37 ` Ingo Molnar
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2013-07-19 19:30 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b32ebf9..408ef68 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1344,8 +1344,13 @@ config ARCH_SELECT_MEMORY_MODEL
>   	depends on ARCH_SPARSEMEM_ENABLE
>   
>   config ARCH_MEMORY_PROBE
> -	def_bool y
> +	bool "Enable sysfs memory/probe interface"
> +	default n
>   	depends on X86_64 && MEMORY_HOTPLUG
> +	help
> +	  This option enables a sysfs memory/probe interface for testing.
> +	  See Documentation/memory-hotplug.txt for more information.
> +	  If you are unsure how to answer this question, answer N.

makes sense.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-19 19:30 ` KOSAKI Motohiro
@ 2013-07-19 19:35   ` Toshi Kani
  0 siblings, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-19 19:35 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: akpm, linux-mm, linux-kernel, x86, dave, isimatu.yasuaki,
	tangchen, vasilis.liaskovitis

On Fri, 2013-07-19 at 15:30 -0400, KOSAKI Motohiro wrote:
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b32ebf9..408ef68 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1344,8 +1344,13 @@ config ARCH_SELECT_MEMORY_MODEL
> >   	depends on ARCH_SPARSEMEM_ENABLE
> >   
> >   config ARCH_MEMORY_PROBE
> > -	def_bool y
> > +	bool "Enable sysfs memory/probe interface"
> > +	default n
> >   	depends on X86_64 && MEMORY_HOTPLUG
> > +	help
> > +	  This option enables a sysfs memory/probe interface for testing.
> > +	  See Documentation/memory-hotplug.txt for more information.
> > +	  If you are unsure how to answer this question, answer N.
> 
> makes sense.
> 
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Great!  Thanks Motohiro!
-Toshi



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
  2013-07-19 19:30 ` KOSAKI Motohiro
@ 2013-07-22  8:37 ` Ingo Molnar
  2013-07-22 17:12   ` Toshi Kani
  2013-07-23  0:24 ` Yasuaki Ishimatsu
  2013-07-23  7:46 ` [tip:x86/mm] " tip-bot for Toshi Kani
  3 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2013-07-22  8:37 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis


* Toshi Kani <toshi.kani@hp.com> wrote:

> CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
> interface, which allows a given memory address to be hot-added as
> follows. (See Documentation/memory-hotplug.txt for more detail.)
> 
> # echo start_address_of_new_memory > /sys/devices/system/memory/probe
> 
> This probe interface is required on powerpc. On x86, however, ACPI
> notifies a memory hotplug event to the kernel, which performs its
> hotplug operation as the result. Therefore, regular users do not need
> this interface on x86. This probe interface is also error-prone and
> misleading that the kernel blindly adds a given memory address without
> checking if the memory is present on the system; no probing is done
> despite of its name. The kernel crashes when a user requests to online
> a memory block that is not present on the system. This interface is
> currently used for testing as it can fake a hotplug event.
> 
> This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
> its Kconfig menu entry on x86, and clarifies its use in Documentation/
> memory-hotplug.txt.

Could we please also fix it to never crash the kernel, even if stupid 
ranges are provided?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-22  8:37 ` Ingo Molnar
@ 2013-07-22 17:12   ` Toshi Kani
  2013-07-22 20:57     ` KOSAKI Motohiro
  2013-07-23  8:01     ` Ingo Molnar
  0 siblings, 2 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-22 17:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Mon, 2013-07-22 at 10:37 +0200, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
> > interface, which allows a given memory address to be hot-added as
> > follows. (See Documentation/memory-hotplug.txt for more detail.)
> > 
> > # echo start_address_of_new_memory > /sys/devices/system/memory/probe
> > 
> > This probe interface is required on powerpc. On x86, however, ACPI
> > notifies a memory hotplug event to the kernel, which performs its
> > hotplug operation as the result. Therefore, regular users do not need
> > this interface on x86. This probe interface is also error-prone and
> > misleading that the kernel blindly adds a given memory address without
> > checking if the memory is present on the system; no probing is done
> > despite of its name. The kernel crashes when a user requests to online
> > a memory block that is not present on the system. This interface is
> > currently used for testing as it can fake a hotplug event.
> > 
> > This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
> > its Kconfig menu entry on x86, and clarifies its use in Documentation/
> > memory-hotplug.txt.
> 
> Could we please also fix it to never crash the kernel, even if stupid 
> ranges are provided?

Yes, this probe interface can be enhanced to verify the firmware
information before adding a given memory address.  However, such change
would interfere its test use of "fake" hotplug, which is only the known
use-case of this interface on x86.

In order to verify if a given memory address is enabled at run-time (as
opposed to boot-time), we need to check with ACPI memory device objects
on x86.  However, system vendors tend to not implement memory device
objects unless their systems support memory hotplug.  Dave Hansen is
using this interface for his testing as a way to fake a hotplug event on
a system that does not support memory hotplug.

Thanks,
-Toshi



  



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-22 17:12   ` Toshi Kani
@ 2013-07-22 20:57     ` KOSAKI Motohiro
  2013-07-22 21:04       ` Dave Hansen
  2013-07-23  0:34       ` Toshi Kani
  2013-07-23  8:01     ` Ingo Molnar
  1 sibling, 2 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2013-07-22 20:57 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

(7/22/13 1:12 PM), Toshi Kani wrote:
> On Mon, 2013-07-22 at 10:37 +0200, Ingo Molnar wrote:
>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>
>>> CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
>>> interface, which allows a given memory address to be hot-added as
>>> follows. (See Documentation/memory-hotplug.txt for more detail.)
>>>
>>> # echo start_address_of_new_memory > /sys/devices/system/memory/probe
>>>
>>> This probe interface is required on powerpc. On x86, however, ACPI
>>> notifies a memory hotplug event to the kernel, which performs its
>>> hotplug operation as the result. Therefore, regular users do not need
>>> this interface on x86. This probe interface is also error-prone and
>>> misleading that the kernel blindly adds a given memory address without
>>> checking if the memory is present on the system; no probing is done
>>> despite of its name. The kernel crashes when a user requests to online
>>> a memory block that is not present on the system. This interface is
>>> currently used for testing as it can fake a hotplug event.
>>>
>>> This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
>>> its Kconfig menu entry on x86, and clarifies its use in Documentation/
>>> memory-hotplug.txt.
>>
>> Could we please also fix it to never crash the kernel, even if stupid
>> ranges are provided?
>
> Yes, this probe interface can be enhanced to verify the firmware
> information before adding a given memory address.  However, such change
> would interfere its test use of "fake" hotplug, which is only the known
> use-case of this interface on x86.
>
> In order to verify if a given memory address is enabled at run-time (as
> opposed to boot-time), we need to check with ACPI memory device objects
> on x86.  However, system vendors tend to not implement memory device
> objects unless their systems support memory hotplug.  Dave Hansen is
> using this interface for his testing as a way to fake a hotplug event on
> a system that does not support memory hotplug.

One of possible option is to return EINVAL when system has real hotplug device.
I mean this interface is only useful when system don't have proper hardware
feature and doesn't work correctly hardware property and this interface command
are not consistent.

Dave, What do you think?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-22 20:57     ` KOSAKI Motohiro
@ 2013-07-22 21:04       ` Dave Hansen
  2013-07-23  0:34       ` Toshi Kani
  1 sibling, 0 replies; 28+ messages in thread
From: Dave Hansen @ 2013-07-22 21:04 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Toshi Kani, Ingo Molnar, akpm, linux-mm, linux-kernel, x86,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/22/2013 01:57 PM, KOSAKI Motohiro wrote:
> 
> One of possible option is to return EINVAL when system has real hotplug
> device.
> I mean this interface is only useful when system don't have proper hardware
> feature and doesn't work correctly hardware property and this interface
> command
> are not consistent.
> 
> Dave, What do you think?

Sounds reasonable to me.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
  2013-07-19 19:30 ` KOSAKI Motohiro
  2013-07-22  8:37 ` Ingo Molnar
@ 2013-07-23  0:24 ` Yasuaki Ishimatsu
  2013-07-23  0:45   ` Toshi Kani
  2013-07-23  7:46 ` [tip:x86/mm] " tip-bot for Toshi Kani
  3 siblings, 1 reply; 28+ messages in thread
From: Yasuaki Ishimatsu @ 2013-07-23  0:24 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	tangchen, vasilis.liaskovitis

(2013/07/20 2:47), Toshi Kani wrote:
> CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
> interface, which allows a given memory address to be hot-added as
> follows. (See Documentation/memory-hotplug.txt for more detail.)
> 
> # echo start_address_of_new_memory > /sys/devices/system/memory/probe
> 
> This probe interface is required on powerpc. On x86, however, ACPI
> notifies a memory hotplug event to the kernel, which performs its
> hotplug operation as the result. Therefore, regular users do not need
> this interface on x86. This probe interface is also error-prone and
> misleading that the kernel blindly adds a given memory address without
> checking if the memory is present on the system; no probing is done
> despite of its name. The kernel crashes when a user requests to online
> a memory block that is not present on the system. This interface is
> currently used for testing as it can fake a hotplug event.
> 
> This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
> its Kconfig menu entry on x86, and clarifies its use in Documentation/
> memory-hotplug.txt.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>

Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Thanks,
Yasuaki Ishimatsu

> ---
>   Documentation/memory-hotplug.txt |   16 +++++++++-------
>   arch/x86/Kconfig                 |    7 ++++++-
>   2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> index 8e5eacb..8fd254c 100644
> --- a/Documentation/memory-hotplug.txt
> +++ b/Documentation/memory-hotplug.txt
> @@ -210,13 +210,15 @@ If memory device is found, memory hotplug code will be called.
>   
>   4.2 Notify memory hot-add event by hand
>   ------------
> -In some environments, especially virtualized environment, firmware will not
> -notify memory hotplug event to the kernel. For such environment, "probe"
> -interface is supported. This interface depends on CONFIG_ARCH_MEMORY_PROBE.
> -
> -Now, CONFIG_ARCH_MEMORY_PROBE is supported only by powerpc but it does not
> -contain highly architecture codes. Please add config if you need "probe"
> -interface.
> +On powerpc, the firmware does not notify a memory hotplug event to the kernel.
> +Therefore, "probe" interface is supported to notify the event to the kernel.
> +This interface depends on CONFIG_ARCH_MEMORY_PROBE.
> +
> +CONFIG_ARCH_MEMORY_PROBE is supported on powerpc only. On x86, this config
> +option is disabled by default since ACPI notifies a memory hotplug event to
> +the kernel, which performs its hotplug operation as the result. Please
> +enable this option if you need the "probe" interface for testing purposes
> +on x86.
>   
>   Probe interface is located at
>   /sys/devices/system/memory/probe
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b32ebf9..408ef68 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1344,8 +1344,13 @@ config ARCH_SELECT_MEMORY_MODEL
>   	depends on ARCH_SPARSEMEM_ENABLE
>   
>   config ARCH_MEMORY_PROBE
> -	def_bool y
> +	bool "Enable sysfs memory/probe interface"
> +	default n
>   	depends on X86_64 && MEMORY_HOTPLUG
> +	help
> +	  This option enables a sysfs memory/probe interface for testing.
> +	  See Documentation/memory-hotplug.txt for more information.
> +	  If you are unsure how to answer this question, answer N.
>   
>   config ARCH_PROC_KCORE_TEXT
>   	def_bool y
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-22 20:57     ` KOSAKI Motohiro
  2013-07-22 21:04       ` Dave Hansen
@ 2013-07-23  0:34       ` Toshi Kani
  1 sibling, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-23  0:34 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Mon, 2013-07-22 at 16:57 -0400, KOSAKI Motohiro wrote:
> (7/22/13 1:12 PM), Toshi Kani wrote:
> > On Mon, 2013-07-22 at 10:37 +0200, Ingo Molnar wrote:
> >> * Toshi Kani <toshi.kani@hp.com> wrote:
> >>
> >>> CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
> >>> interface, which allows a given memory address to be hot-added as
> >>> follows. (See Documentation/memory-hotplug.txt for more detail.)
> >>>
> >>> # echo start_address_of_new_memory > /sys/devices/system/memory/probe
> >>>
> >>> This probe interface is required on powerpc. On x86, however, ACPI
> >>> notifies a memory hotplug event to the kernel, which performs its
> >>> hotplug operation as the result. Therefore, regular users do not need
> >>> this interface on x86. This probe interface is also error-prone and
> >>> misleading that the kernel blindly adds a given memory address without
> >>> checking if the memory is present on the system; no probing is done
> >>> despite of its name. The kernel crashes when a user requests to online
> >>> a memory block that is not present on the system. This interface is
> >>> currently used for testing as it can fake a hotplug event.
> >>>
> >>> This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
> >>> its Kconfig menu entry on x86, and clarifies its use in Documentation/
> >>> memory-hotplug.txt.
> >>
> >> Could we please also fix it to never crash the kernel, even if stupid
> >> ranges are provided?
> >
> > Yes, this probe interface can be enhanced to verify the firmware
> > information before adding a given memory address.  However, such change
> > would interfere its test use of "fake" hotplug, which is only the known
> > use-case of this interface on x86.
> >
> > In order to verify if a given memory address is enabled at run-time (as
> > opposed to boot-time), we need to check with ACPI memory device objects
> > on x86.  However, system vendors tend to not implement memory device
> > objects unless their systems support memory hotplug.  Dave Hansen is
> > using this interface for his testing as a way to fake a hotplug event on
> > a system that does not support memory hotplug.
> 
> One of possible option is to return EINVAL when system has real hotplug device.
> I mean this interface is only useful when system don't have proper hardware
> feature and doesn't work correctly hardware property and this interface command
> are not consistent.

Thanks for the suggestion.  Unfortunately, such check cannot be added
without some ugliness.  There is no capability bit in ACPI that
indicates if the platform supports memory hotplug.  Let's assume we can
consider that if ACPI has no memory object, then it has no memory
hotplug support.  We still need to distinguish this case from a case
that ACPI has a memory device object but it is disabled (hence, it can
be added later).  However, the two cases are basically handled in the
same way that neither cases call the ACPI memory handlers in
drivers/acpi/acpi_memhotplug.c, where memory-specific code can be added.
Therefore, it will require adding memory device-specific code into the
ACPI scan code itself, which is supposed to be device type-neutral.  So,
it will be hard to sell without a very compelling reason...

I think a long term solution should be to remove this config option from
x86 Kconfig as you suggested before.  For now, we can disable it by
default and document as "test-use" only, which this patch did.
-Toshi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23  0:24 ` Yasuaki Ishimatsu
@ 2013-07-23  0:45   ` Toshi Kani
  0 siblings, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-23  0:45 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	tangchen, vasilis.liaskovitis

On Tue, 2013-07-23 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> (2013/07/20 2:47), Toshi Kani wrote:
> > CONFIG_ARCH_MEMORY_PROBE enables /sys/devices/system/memory/probe
> > interface, which allows a given memory address to be hot-added as
> > follows. (See Documentation/memory-hotplug.txt for more detail.)
> > 
> > # echo start_address_of_new_memory > /sys/devices/system/memory/probe
> > 
> > This probe interface is required on powerpc. On x86, however, ACPI
> > notifies a memory hotplug event to the kernel, which performs its
> > hotplug operation as the result. Therefore, regular users do not need
> > this interface on x86. This probe interface is also error-prone and
> > misleading that the kernel blindly adds a given memory address without
> > checking if the memory is present on the system; no probing is done
> > despite of its name. The kernel crashes when a user requests to online
> > a memory block that is not present on the system. This interface is
> > currently used for testing as it can fake a hotplug event.
> > 
> > This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds
> > its Kconfig menu entry on x86, and clarifies its use in Documentation/
> > memory-hotplug.txt.
> > 
> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> 
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Great!  Thanks Yasuaki!
-Toshi


> 
> Thanks,
> Yasuaki Ishimatsu
> 
> > ---
> >   Documentation/memory-hotplug.txt |   16 +++++++++-------
> >   arch/x86/Kconfig                 |    7 ++++++-
> >   2 files changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> > index 8e5eacb..8fd254c 100644
> > --- a/Documentation/memory-hotplug.txt
> > +++ b/Documentation/memory-hotplug.txt
> > @@ -210,13 +210,15 @@ If memory device is found, memory hotplug code will be called.
> >   
> >   4.2 Notify memory hot-add event by hand
> >   ------------
> > -In some environments, especially virtualized environment, firmware will not
> > -notify memory hotplug event to the kernel. For such environment, "probe"
> > -interface is supported. This interface depends on CONFIG_ARCH_MEMORY_PROBE.
> > -
> > -Now, CONFIG_ARCH_MEMORY_PROBE is supported only by powerpc but it does not
> > -contain highly architecture codes. Please add config if you need "probe"
> > -interface.
> > +On powerpc, the firmware does not notify a memory hotplug event to the kernel.
> > +Therefore, "probe" interface is supported to notify the event to the kernel.
> > +This interface depends on CONFIG_ARCH_MEMORY_PROBE.
> > +
> > +CONFIG_ARCH_MEMORY_PROBE is supported on powerpc only. On x86, this config
> > +option is disabled by default since ACPI notifies a memory hotplug event to
> > +the kernel, which performs its hotplug operation as the result. Please
> > +enable this option if you need the "probe" interface for testing purposes
> > +on x86.
> >   
> >   Probe interface is located at
> >   /sys/devices/system/memory/probe
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b32ebf9..408ef68 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1344,8 +1344,13 @@ config ARCH_SELECT_MEMORY_MODEL
> >   	depends on ARCH_SPARSEMEM_ENABLE
> >   
> >   config ARCH_MEMORY_PROBE
> > -	def_bool y
> > +	bool "Enable sysfs memory/probe interface"
> > +	default n
> >   	depends on X86_64 && MEMORY_HOTPLUG
> > +	help
> > +	  This option enables a sysfs memory/probe interface for testing.
> > +	  See Documentation/memory-hotplug.txt for more information.
> > +	  If you are unsure how to answer this question, answer N.
> >   
> >   config ARCH_PROC_KCORE_TEXT
> >   	def_bool y
> > 
> 
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [tip:x86/mm] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
                   ` (2 preceding siblings ...)
  2013-07-23  0:24 ` Yasuaki Ishimatsu
@ 2013-07-23  7:46 ` tip-bot for Toshi Kani
  3 siblings, 0 replies; 28+ messages in thread
From: tip-bot for Toshi Kani @ 2013-07-23  7:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, tglx, kosaki.motohiro, toshi.kani

Commit-ID:  a0842b704b0f7c2d925473676f2012ea5dc39976
Gitweb:     http://git.kernel.org/tip/a0842b704b0f7c2d925473676f2012ea5dc39976
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Fri, 19 Jul 2013 11:47:48 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 22 Jul 2013 11:13:34 +0200

mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default

CONFIG_ARCH_MEMORY_PROBE enables the
/sys/devices/system/memory/probe interface, which allows a given
memory address to be hot-added as follows:

 # echo start_address_of_new_memory > /sys/devices/system/memory/probe

(See Documentation/memory-hotplug.txt for more details.)

This probe interface is required on powerpc. On x86, however,
ACPI notifies a memory hotplug event to the kernel, which
performs its hotplug operation as the result.

Therefore, regular users do not need this interface on x86. This probe
interface is also error-prone and misleading that the kernel blindly
adds a given memory address without checking if the memory is present
on the system; no probing is done despite of its name.

The kernel crashes when a user requests to online a memory block
that is not present on the system. This interface is currently
used for testing as it can fake a hotplug event.

This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86,
adds its Kconfig menu entry on x86, and clarifies its use in
Documentation/ memory-hotplug.txt.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-mm@kvack.org
Cc: dave@sr71.net
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: tangchen@cn.fujitsu.com
Cc: vasilis.liaskovitis@profitbricks.com
Link: http://lkml.kernel.org/r/1374256068-26016-1-git-send-email-toshi.kani@hp.com
[ Edited it slightly. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/memory-hotplug.txt | 16 +++++++++-------
 arch/x86/Kconfig                 |  6 +++++-
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 8e5eacb..8fd254c 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -210,13 +210,15 @@ If memory device is found, memory hotplug code will be called.
 
 4.2 Notify memory hot-add event by hand
 ------------
-In some environments, especially virtualized environment, firmware will not
-notify memory hotplug event to the kernel. For such environment, "probe"
-interface is supported. This interface depends on CONFIG_ARCH_MEMORY_PROBE.
-
-Now, CONFIG_ARCH_MEMORY_PROBE is supported only by powerpc but it does not
-contain highly architecture codes. Please add config if you need "probe"
-interface.
+On powerpc, the firmware does not notify a memory hotplug event to the kernel.
+Therefore, "probe" interface is supported to notify the event to the kernel.
+This interface depends on CONFIG_ARCH_MEMORY_PROBE.
+
+CONFIG_ARCH_MEMORY_PROBE is supported on powerpc only. On x86, this config
+option is disabled by default since ACPI notifies a memory hotplug event to
+the kernel, which performs its hotplug operation as the result. Please
+enable this option if you need the "probe" interface for testing purposes
+on x86.
 
 Probe interface is located at
 /sys/devices/system/memory/probe
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b32ebf9..2f9d957 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1344,8 +1344,12 @@ config ARCH_SELECT_MEMORY_MODEL
 	depends on ARCH_SPARSEMEM_ENABLE
 
 config ARCH_MEMORY_PROBE
-	def_bool y
+	bool "Enable sysfs memory/probe interface"
 	depends on X86_64 && MEMORY_HOTPLUG
+	help
+	  This option enables a sysfs memory/probe interface for testing.
+	  See Documentation/memory-hotplug.txt for more information.
+	  If you are unsure how to answer this question, answer N.
 
 config ARCH_PROC_KCORE_TEXT
 	def_bool y

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-22 17:12   ` Toshi Kani
  2013-07-22 20:57     ` KOSAKI Motohiro
@ 2013-07-23  8:01     ` Ingo Molnar
  2013-07-23 20:45       ` Toshi Kani
  1 sibling, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2013-07-23  8:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis


* Toshi Kani <toshi.kani@hp.com> wrote:

> > Could we please also fix it to never crash the kernel, even if stupid 
> > ranges are provided?
> 
> Yes, this probe interface can be enhanced to verify the firmware 
> information before adding a given memory address.  However, such change 
> would interfere its test use of "fake" hotplug, which is only the known 
> use-case of this interface on x86.

Not crashing the kernel is not a novel concept even for test interfaces...

Where does the possible crash come from - from using invalid RAM ranges, 
right? I.e. on x86 to fix the crash we need to check the RAM is present in 
the e820 maps, is marked RAM there, and is not already registered with the 
kernel, or so?

> In order to verify if a given memory address is enabled at run-time (as 
> opposed to boot-time), we need to check with ACPI memory device objects 
> on x86.  However, system vendors tend to not implement memory device 
> objects unless their systems support memory hotplug.  Dave Hansen is 
> using this interface for his testing as a way to fake a hotplug event on 
> a system that does not support memory hotplug.

All vendors implement e820 maps for the memory present at boot time.

How is the testing done by Dave Hansen? If it's done by booting with less 
RAM than available (via say the mem=1g boot parameter), and then 
hot-adding some of the missing RAM, then this could be made safe via the 
e820 maps and by consultig the physical memory maps (to avoid double 
registry), right?

How does the hotplug event based approach solve double adds? Relies on the 
hardware not sending a hot-add event twice for the same memory area or for 
an invalid memory area, or does it include fail-safes and double checks as 
well to avoid double adds and adding invalid memory? If yes then that 
could be utilized here as well.

I.e. fragility of an interface is our choice, not some natural given 
property.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23  8:01     ` Ingo Molnar
@ 2013-07-23 20:45       ` Toshi Kani
  2013-07-23 20:59         ` Dave Hansen
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-23 20:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > > Could we please also fix it to never crash the kernel, even if stupid 
> > > ranges are provided?
> > 
> > Yes, this probe interface can be enhanced to verify the firmware 
> > information before adding a given memory address.  However, such change 
> > would interfere its test use of "fake" hotplug, which is only the known 
> > use-case of this interface on x86.
> 
> Not crashing the kernel is not a novel concept even for test interfaces...

Agreed.

> Where does the possible crash come from - from using invalid RAM ranges, 
> right? I.e. on x86 to fix the crash we need to check the RAM is present in 
> the e820 maps, is marked RAM there, and is not already registered with the 
> kernel, or so?

Yes, the crash comes from using invalid RAM ranges.  How to check if the
RAM is present is different if the system supports hotplug or not.

> > In order to verify if a given memory address is enabled at run-time (as 
> > opposed to boot-time), we need to check with ACPI memory device objects 
> > on x86.  However, system vendors tend to not implement memory device 
> > objects unless their systems support memory hotplug.  Dave Hansen is 
> > using this interface for his testing as a way to fake a hotplug event on 
> > a system that does not support memory hotplug.
> 
> All vendors implement e820 maps for the memory present at boot time.

Yes for boot time.  At run-time, e820 is not guaranteed to represent a
new memory added.  Here is a quote from ACPI spec.

===
15.1 INT 15H, E820H - Query System Address Map
 :
The memory map conveyed by this interface is not required to reflect any
changes in available physical memory that have occurred after the BIOS
has initially passed control to the operating system. For example, if
memory is added dynamically, this interface is not required to reflect
the new system memory configuration.
===

By definition, the "probe" interface is used for the kernel to recognize
a new memory added at run-time.  So, it should check ACPI memory device
objects (which represents run-time state) for the verification.  On x86,
however, ACPI also sends a hotplug event to the kernel, which triggers
the kernel to recognize the new physical memory properly.  Hence, users
do not need this "probe" interface.

> How is the testing done by Dave Hansen? If it's done by booting with less 
> RAM than available (via say the mem=1g boot parameter), and then 
> hot-adding some of the missing RAM, then this could be made safe via the 
> e820 maps and by consultig the physical memory maps (to avoid double 
> registry), right?

If we focus on this test scenario on a system that does not support
hotplug, yes, I agree that we can check with e820 since it is safe to
assume that the system has no change after boot.  IOW, it is unsafe to
check with e820 if the system supports hotplug, but there is no use in
this interface for testing if the system supports hotplug.  So, this may
be a good idea.

Dave, is this how you are testing?  Do you always specify a valid memory
address for your testing?

> How does the hotplug event based approach solve double adds? Relies on the 
> hardware not sending a hot-add event twice for the same memory area or for 
> an invalid memory area, or does it include fail-safes and double checks as 
> well to avoid double adds and adding invalid memory? If yes then that 
> could be utilized here as well.

In high-level, here is how ACPI memory hotplug works:

1. ACPI sends a hotplug event to a new ACPI memory device object that is
hot-added.
2. The kernel is notified, and verifies if the new memory device object
has not been attached by any handler yet.
3. The memory handler is called, and obtains a new memory range from the
ACPI memory device object. 
4. The memory handler calls add_memory() with the new address range.

The above step 1-4 proceeds automatically within the kernel.  No user
input (nor sysfs interface) is necessary.  Step 2 prevents double adds
and step 3 gets a valid address range from the firmware directly.  Step
4 is basically the same as the "probe" interface, but with all the
verification up front, this step is safe.

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23 20:45       ` Toshi Kani
@ 2013-07-23 20:59         ` Dave Hansen
  2013-07-23 21:34           ` Toshi Kani
  2013-07-24  0:18         ` Hush Bensen
  2013-07-24  4:20         ` Ingo Molnar
  2 siblings, 1 reply; 28+ messages in thread
From: Dave Hansen @ 2013-07-23 20:59 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/23/2013 01:45 PM, Toshi Kani wrote:
> Dave, is this how you are testing?  Do you always specify a valid memory
> address for your testing?

For the moment, yes.

I'm actually working on some other patches that add the kernel metadata
for memory ranges even if they're not backed by physical memory.  But
_that_ is just for testing and I'll just have to modify whatever you do
here in those patches anyway.

It sounds like you're pretty confident that it has no users, so why
don't you just go ahead and axe it on x86 and config it out completely?
 Folks that need it can just hack it back in.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23 20:59         ` Dave Hansen
@ 2013-07-23 21:34           ` Toshi Kani
  0 siblings, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-23 21:34 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Tue, 2013-07-23 at 13:59 -0700, Dave Hansen wrote:
> On 07/23/2013 01:45 PM, Toshi Kani wrote:
> > Dave, is this how you are testing?  Do you always specify a valid memory
> > address for your testing?
> 
> For the moment, yes.
> 
> I'm actually working on some other patches that add the kernel metadata
> for memory ranges even if they're not backed by physical memory.  But
> _that_ is just for testing and I'll just have to modify whatever you do
> here in those patches anyway.
> 
> It sounds like you're pretty confident that it has no users, so why
> don't you just go ahead and axe it on x86 and config it out completely?
>  Folks that need it can just hack it back in.

Well, I am only confident that this interface is not necessary for ACPI
hotplug.  As we found you as a user of this interface for testing on a
system without hotplug support, it is prudent to assume that there may
be other users as well.  So, I am willing to keep the interface
configurable (with default disabled) for now.

The question is what to do in the next step.  There are two options:

1) Make the interface safe to use
2) Remove the config option from x86 Kconfig

Both options will achieve the same goal -- prevent the crash.  Once this
first patch gets in, we will see if there are more users on the
interface.  Then, we can decide if we go with 1) for keeping it in the
long term, or deprecate with 2).

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23 20:45       ` Toshi Kani
  2013-07-23 20:59         ` Dave Hansen
@ 2013-07-24  0:18         ` Hush Bensen
  2013-07-24 16:02           ` Toshi Kani
  2013-07-24  4:20         ` Ingo Molnar
  2 siblings, 1 reply; 28+ messages in thread
From: Hush Bensen @ 2013-07-24  0:18 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/24/2013 04:45 AM, Toshi Kani wrote:
> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>
>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>> ranges are provided?
>>> Yes, this probe interface can be enhanced to verify the firmware
>>> information before adding a given memory address.  However, such change
>>> would interfere its test use of "fake" hotplug, which is only the known
>>> use-case of this interface on x86.
>> Not crashing the kernel is not a novel concept even for test interfaces...
> Agreed.
>
>> Where does the possible crash come from - from using invalid RAM ranges,
>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>> the e820 maps, is marked RAM there, and is not already registered with the
>> kernel, or so?
> Yes, the crash comes from using invalid RAM ranges.  How to check if the
> RAM is present is different if the system supports hotplug or not.

Could you explain different methods to check the RAM is present if the 
system supports hotplkug or not?

>
>>> In order to verify if a given memory address is enabled at run-time (as
>>> opposed to boot-time), we need to check with ACPI memory device objects
>>> on x86.  However, system vendors tend to not implement memory device
>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>> using this interface for his testing as a way to fake a hotplug event on
>>> a system that does not support memory hotplug.
>> All vendors implement e820 maps for the memory present at boot time.
> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
> new memory added.  Here is a quote from ACPI spec.
>
> ===
> 15.1 INT 15H, E820H - Query System Address Map
>   :
> The memory map conveyed by this interface is not required to reflect any
> changes in available physical memory that have occurred after the BIOS
> has initially passed control to the operating system. For example, if
> memory is added dynamically, this interface is not required to reflect
> the new system memory configuration.
> ===
>
> By definition, the "probe" interface is used for the kernel to recognize
> a new memory added at run-time.  So, it should check ACPI memory device
> objects (which represents run-time state) for the verification.  On x86,
> however, ACPI also sends a hotplug event to the kernel, which triggers
> the kernel to recognize the new physical memory properly.  Hence, users
> do not need this "probe" interface.
>
>> How is the testing done by Dave Hansen? If it's done by booting with less
>> RAM than available (via say the mem=1g boot parameter), and then
>> hot-adding some of the missing RAM, then this could be made safe via the
>> e820 maps and by consultig the physical memory maps (to avoid double
>> registry), right?
> If we focus on this test scenario on a system that does not support
> hotplug, yes, I agree that we can check with e820 since it is safe to
> assume that the system has no change after boot.  IOW, it is unsafe to
> check with e820 if the system supports hotplug, but there is no use in
> this interface for testing if the system supports hotplug.  So, this may
> be a good idea.
>
> Dave, is this how you are testing?  Do you always specify a valid memory
> address for your testing?
>
>> How does the hotplug event based approach solve double adds? Relies on the
>> hardware not sending a hot-add event twice for the same memory area or for
>> an invalid memory area, or does it include fail-safes and double checks as
>> well to avoid double adds and adding invalid memory? If yes then that
>> could be utilized here as well.
> In high-level, here is how ACPI memory hotplug works:
>
> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
> hot-added.
> 2. The kernel is notified, and verifies if the new memory device object
> has not been attached by any handler yet.
> 3. The memory handler is called, and obtains a new memory range from the
> ACPI memory device object.
> 4. The memory handler calls add_memory() with the new address range.
>
> The above step 1-4 proceeds automatically within the kernel.  No user
> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
> and step 3 gets a valid address range from the firmware directly.  Step
> 4 is basically the same as the "probe" interface, but with all the
> verification up front, this step is safe.

This is hot-added part, could you also explain how ACPI memory hotplug 
works for hot-remove?

>
> Thanks,
> -Toshi
>
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-23 20:45       ` Toshi Kani
  2013-07-23 20:59         ` Dave Hansen
  2013-07-24  0:18         ` Hush Bensen
@ 2013-07-24  4:20         ` Ingo Molnar
  2013-07-24 16:58           ` Toshi Kani
  2 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2013-07-24  4:20 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis


* Toshi Kani <toshi.kani@hp.com> wrote:

> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
> > * Toshi Kani <toshi.kani@hp.com> wrote:
> > 
> > > > Could we please also fix it to never crash the kernel, even if stupid 
> > > > ranges are provided?
> > > 
> > > Yes, this probe interface can be enhanced to verify the firmware 
> > > information before adding a given memory address.  However, such change 
> > > would interfere its test use of "fake" hotplug, which is only the known 
> > > use-case of this interface on x86.
> > 
> > Not crashing the kernel is not a novel concept even for test interfaces...
> 
> Agreed.
> 
> > Where does the possible crash come from - from using invalid RAM ranges, 
> > right? I.e. on x86 to fix the crash we need to check the RAM is present in 
> > the e820 maps, is marked RAM there, and is not already registered with the 
> > kernel, or so?
> 
> Yes, the crash comes from using invalid RAM ranges.  How to check if the
> RAM is present is different if the system supports hotplug or not.
> 
> > > In order to verify if a given memory address is enabled at run-time (as 
> > > opposed to boot-time), we need to check with ACPI memory device objects 
> > > on x86.  However, system vendors tend to not implement memory device 
> > > objects unless their systems support memory hotplug.  Dave Hansen is 
> > > using this interface for his testing as a way to fake a hotplug event on 
> > > a system that does not support memory hotplug.
> > 
> > All vendors implement e820 maps for the memory present at boot time.
> 
> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
> new memory added. [...]

Yes I know that, the e820 map is boot only.

You claimed that the only purpose of this on x86 was that testing was done 
on non-hotplug systems, using this interface. Non-hotplug systems have 
e820 maps.

> > How does the hotplug event based approach solve double adds? Relies on 
> > the hardware not sending a hot-add event twice for the same memory 
> > area or for an invalid memory area, or does it include fail-safes and 
> > double checks as well to avoid double adds and adding invalid memory? 
> > If yes then that could be utilized here as well.
> 
> In high-level, here is how ACPI memory hotplug works:
> 
> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
> hot-added.
> 2. The kernel is notified, and verifies if the new memory device object
> has not been attached by any handler yet.
> 3. The memory handler is called, and obtains a new memory range from the
> ACPI memory device object. 
> 4. The memory handler calls add_memory() with the new address range.
> 
> The above step 1-4 proceeds automatically within the kernel.  No user 
> input (nor sysfs interface) is necessary.  Step 2 prevents double adds 
> [...]

If this 'new memory device object' is some ACPI detail then I don't see 
how it protects the kernel from a buggy ACPI implementation double adding 
the same physical memory range.

> and step 3 gets a valid address range from the firmware directly.  Step 
> 4 is basically the same as the "probe" interface, but with all the 
> verification up front, this step is safe.

So what verification does the kernel do to ensure that a buggy ACPI 
implementation does not pass us a crappy memory range, such a double 
physical range (represented via separate 'memory device objects'), or a 
range overlapping with an existing physical memory range already known to 
the kernel, or a totally nonsensical range the CPU cannot even access 
physically, etc.?

Also, is there any verification done to make sure that the new memory 
range is actually RAM - i.e. we could write the first and last word of it 
and see whether it gets modified correctly [to keep the sanity check 
fast]?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24  0:18         ` Hush Bensen
@ 2013-07-24 16:02           ` Toshi Kani
  2013-07-25  0:17             ` Hush Bensen
                               ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-24 16:02 UTC (permalink / raw)
  To: Hush Bensen
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
> On 07/24/2013 04:45 AM, Toshi Kani wrote:
> > On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
> >> * Toshi Kani <toshi.kani@hp.com> wrote:
> >>
> >>>> Could we please also fix it to never crash the kernel, even if stupid
> >>>> ranges are provided?
> >>> Yes, this probe interface can be enhanced to verify the firmware
> >>> information before adding a given memory address.  However, such change
> >>> would interfere its test use of "fake" hotplug, which is only the known
> >>> use-case of this interface on x86.
> >> Not crashing the kernel is not a novel concept even for test interfaces...
> > Agreed.
> >
> >> Where does the possible crash come from - from using invalid RAM ranges,
> >> right? I.e. on x86 to fix the crash we need to check the RAM is present in
> >> the e820 maps, is marked RAM there, and is not already registered with the
> >> kernel, or so?
> > Yes, the crash comes from using invalid RAM ranges.  How to check if the
> > RAM is present is different if the system supports hotplug or not.
> 
> Could you explain different methods to check the RAM is present if the 
> system supports hotplkug or not?

e820 and UEFI memory descriptor tables are the boot-time interfaces.
These interfaces are not required to reflect any run-time changes.

ACPI memory device objects can be used at both boot-time and run-time,
which reflect any run-time changes.  But they are optional to implement.
They typically are not implemented unless the system supports hotplug.

> >>> In order to verify if a given memory address is enabled at run-time (as
> >>> opposed to boot-time), we need to check with ACPI memory device objects
> >>> on x86.  However, system vendors tend to not implement memory device
> >>> objects unless their systems support memory hotplug.  Dave Hansen is
> >>> using this interface for his testing as a way to fake a hotplug event on
> >>> a system that does not support memory hotplug.
> >> All vendors implement e820 maps for the memory present at boot time.
> > Yes for boot time.  At run-time, e820 is not guaranteed to represent a
> > new memory added.  Here is a quote from ACPI spec.
> >
> > ===
> > 15.1 INT 15H, E820H - Query System Address Map
> >   :
> > The memory map conveyed by this interface is not required to reflect any
> > changes in available physical memory that have occurred after the BIOS
> > has initially passed control to the operating system. For example, if
> > memory is added dynamically, this interface is not required to reflect
> > the new system memory configuration.
> > ===
> >
> > By definition, the "probe" interface is used for the kernel to recognize
> > a new memory added at run-time.  So, it should check ACPI memory device
> > objects (which represents run-time state) for the verification.  On x86,
> > however, ACPI also sends a hotplug event to the kernel, which triggers
> > the kernel to recognize the new physical memory properly.  Hence, users
> > do not need this "probe" interface.
> >
> >> How is the testing done by Dave Hansen? If it's done by booting with less
> >> RAM than available (via say the mem=1g boot parameter), and then
> >> hot-adding some of the missing RAM, then this could be made safe via the
> >> e820 maps and by consultig the physical memory maps (to avoid double
> >> registry), right?
> > If we focus on this test scenario on a system that does not support
> > hotplug, yes, I agree that we can check with e820 since it is safe to
> > assume that the system has no change after boot.  IOW, it is unsafe to
> > check with e820 if the system supports hotplug, but there is no use in
> > this interface for testing if the system supports hotplug.  So, this may
> > be a good idea.
> >
> > Dave, is this how you are testing?  Do you always specify a valid memory
> > address for your testing?
> >
> >> How does the hotplug event based approach solve double adds? Relies on the
> >> hardware not sending a hot-add event twice for the same memory area or for
> >> an invalid memory area, or does it include fail-safes and double checks as
> >> well to avoid double adds and adding invalid memory? If yes then that
> >> could be utilized here as well.
> > In high-level, here is how ACPI memory hotplug works:
> >
> > 1. ACPI sends a hotplug event to a new ACPI memory device object that is
> > hot-added.
> > 2. The kernel is notified, and verifies if the new memory device object
> > has not been attached by any handler yet.
> > 3. The memory handler is called, and obtains a new memory range from the
> > ACPI memory device object.
> > 4. The memory handler calls add_memory() with the new address range.
> >
> > The above step 1-4 proceeds automatically within the kernel.  No user
> > input (nor sysfs interface) is necessary.  Step 2 prevents double adds
> > and step 3 gets a valid address range from the firmware directly.  Step
> > 4 is basically the same as the "probe" interface, but with all the
> > verification up front, this step is safe.
> 
> This is hot-added part, could you also explain how ACPI memory hotplug 
> works for hot-remove?

Sure.  Here is high-level.

1. ACPI sends a hotplug event to an ACPI memory device object that is
requested to hot-remove.
2. The kernel is notified, and verifies if the memory device object is
attached by a handler.
3. The memory handler is called (which is being attached), and obtains
its memory range.
4. The memory handler calls remove_memory() with the address range.
5. The kernel calls eject method of the ACPI memory device object.

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24  4:20         ` Ingo Molnar
@ 2013-07-24 16:58           ` Toshi Kani
  2013-07-25 21:38             ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Toshi Kani @ 2013-07-24 16:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Wed, 2013-07-24 at 06:20 +0200, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
> > > * Toshi Kani <toshi.kani@hp.com> wrote:
> > > 
> > > > > Could we please also fix it to never crash the kernel, even if stupid 
> > > > > ranges are provided?
> > > > 
> > > > Yes, this probe interface can be enhanced to verify the firmware 
> > > > information before adding a given memory address.  However, such change 
> > > > would interfere its test use of "fake" hotplug, which is only the known 
> > > > use-case of this interface on x86.
> > > 
> > > Not crashing the kernel is not a novel concept even for test interfaces...
> > 
> > Agreed.
> > 
> > > Where does the possible crash come from - from using invalid RAM ranges, 
> > > right? I.e. on x86 to fix the crash we need to check the RAM is present in 
> > > the e820 maps, is marked RAM there, and is not already registered with the 
> > > kernel, or so?
> > 
> > Yes, the crash comes from using invalid RAM ranges.  How to check if the
> > RAM is present is different if the system supports hotplug or not.
> > 
> > > > In order to verify if a given memory address is enabled at run-time (as 
> > > > opposed to boot-time), we need to check with ACPI memory device objects 
> > > > on x86.  However, system vendors tend to not implement memory device 
> > > > objects unless their systems support memory hotplug.  Dave Hansen is 
> > > > using this interface for his testing as a way to fake a hotplug event on 
> > > > a system that does not support memory hotplug.
> > > 
> > > All vendors implement e820 maps for the memory present at boot time.
> > 
> > Yes for boot time.  At run-time, e820 is not guaranteed to represent a
> > new memory added. [...]
> 
> Yes I know that, the e820 map is boot only.
> 
> You claimed that the only purpose of this on x86 was that testing was done 
> on non-hotplug systems, using this interface. Non-hotplug systems have 
> e820 maps.

Right.  Sorry, I first thought that the interface needed to work as
defined, i.e. detect a new memory.  But for the test purpose on
non-hotplug systems, that is not necessary.  So, I agree that we can
check e820.

I summarized two options in the email below.
https://lkml.org/lkml/2013/7/23/602

Option 1) adds a check with e820.  Option 2) deprecates the interface by
removing the config option from x86 Kconfig.  I was thinking that we
could evaluate two options after this patch gets in.  Does it make
sense?   

> > > How does the hotplug event based approach solve double adds? Relies on 
> > > the hardware not sending a hot-add event twice for the same memory 
> > > area or for an invalid memory area, or does it include fail-safes and 
> > > double checks as well to avoid double adds and adding invalid memory? 
> > > If yes then that could be utilized here as well.
> > 
> > In high-level, here is how ACPI memory hotplug works:
> > 
> > 1. ACPI sends a hotplug event to a new ACPI memory device object that is
> > hot-added.
> > 2. The kernel is notified, and verifies if the new memory device object
> > has not been attached by any handler yet.
> > 3. The memory handler is called, and obtains a new memory range from the
> > ACPI memory device object. 
> > 4. The memory handler calls add_memory() with the new address range.
> > 
> > The above step 1-4 proceeds automatically within the kernel.  No user 
> > input (nor sysfs interface) is necessary.  Step 2 prevents double adds 
> > [...]
> 
> If this 'new memory device object' is some ACPI detail then I don't see 
> how it protects the kernel from a buggy ACPI implementation double adding 
> the same physical memory range.

You are right that the kernel is not fully protected from buggy ACPI.
In case of double adding, though, such hot-add operation fails
gracefully since add_memory() returns with -EEXIST.  But if buggy ACPI
returns an invalid RAM range, then it can crash the system, just like an
invalid address in e820 can crash the system as well.

> > and step 3 gets a valid address range from the firmware directly.  Step 
> > 4 is basically the same as the "probe" interface, but with all the 
> > verification up front, this step is safe.
> 
> So what verification does the kernel do to ensure that a buggy ACPI 
> implementation does not pass us a crappy memory range, such a double 
> physical range (represented via separate 'memory device objects'), or a 
> range overlapping with an existing physical memory range already known to 
> the kernel, or a totally nonsensical range the CPU cannot even access 
> physically, etc.?

The kernel checks if the status of an ACPI memory device object is
marked as enabled.  But it does not protect from buggy ACPI because
anything can be wrong... 

Overlapping and double add cases are verified in add_memory(), i.e.
register_memory_resource() fails.

If an address range is unique & wrong, we have no protection from it.

> Also, is there any verification done to make sure that the new memory 
> range is actually RAM - i.e. we could write the first and last word of it 
> and see whether it gets modified correctly [to keep the sanity check 
> fast]?

No such check is performed -- just like we don't at boot-time.

This may sound bad, but in my experience, such obvious bugs are quickly
found and fixed during the FW development phase.


Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24 16:02           ` Toshi Kani
@ 2013-07-25  0:17             ` Hush Bensen
  2013-07-25 15:47               ` Toshi Kani
  2013-07-25  0:44             ` Hush Bensen
  2013-07-25  0:56             ` Hush Bensen
  2 siblings, 1 reply; 28+ messages in thread
From: Hush Bensen @ 2013-07-25  0:17 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/25/2013 12:02 AM, Toshi Kani wrote:
> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>
>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>> ranges are provided?
>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>> information before adding a given memory address.  However, such change
>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>> use-case of this interface on x86.
>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>> Agreed.
>>>
>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>> kernel, or so?
>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>> RAM is present is different if the system supports hotplug or not.
>> Could you explain different methods to check the RAM is present if the
>> system supports hotplkug or not?
> e820 and UEFI memory descriptor tables are the boot-time interfaces.
> These interfaces are not required to reflect any run-time changes.
>
> ACPI memory device objects can be used at both boot-time and run-time,
> which reflect any run-time changes.  But they are optional to implement.
> They typically are not implemented unless the system supports hotplug.
>
>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>> a system that does not support memory hotplug.
>>>> All vendors implement e820 maps for the memory present at boot time.
>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>> new memory added.  Here is a quote from ACPI spec.
>>>
>>> ===
>>> 15.1 INT 15H, E820H - Query System Address Map
>>>    :
>>> The memory map conveyed by this interface is not required to reflect any
>>> changes in available physical memory that have occurred after the BIOS
>>> has initially passed control to the operating system. For example, if
>>> memory is added dynamically, this interface is not required to reflect
>>> the new system memory configuration.
>>> ===
>>>
>>> By definition, the "probe" interface is used for the kernel to recognize
>>> a new memory added at run-time.  So, it should check ACPI memory device
>>> objects (which represents run-time state) for the verification.  On x86,
>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>> the kernel to recognize the new physical memory properly.  Hence, users
>>> do not need this "probe" interface.
>>>
>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>> registry), right?
>>> If we focus on this test scenario on a system that does not support
>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>> check with e820 if the system supports hotplug, but there is no use in
>>> this interface for testing if the system supports hotplug.  So, this may
>>> be a good idea.
>>>
>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>> address for your testing?
>>>
>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>> could be utilized here as well.
>>> In high-level, here is how ACPI memory hotplug works:
>>>
>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>> hot-added.
>>> 2. The kernel is notified, and verifies if the new memory device object
>>> has not been attached by any handler yet.
>>> 3. The memory handler is called, and obtains a new memory range from the
>>> ACPI memory device object.
>>> 4. The memory handler calls add_memory() with the new address range.
>>>
>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>> and step 3 gets a valid address range from the firmware directly.  Step
>>> 4 is basically the same as the "probe" interface, but with all the
>>> verification up front, this step is safe.
>> This is hot-added part, could you also explain how ACPI memory hotplug
>> works for hot-remove?
> Sure.  Here is high-level.
>
> 1. ACPI sends a hotplug event to an ACPI memory device object that is
> requested to hot-remove.
> 2. The kernel is notified, and verifies if the memory device object is
> attached by a handler.
> 3. The memory handler is called (which is being attached), and obtains
> its memory range.
> 4. The memory handler calls remove_memory() with the address range.
> 5. The kernel calls eject method of the ACPI memory device object.

Thanks for your explaination, very useful for me. ;-) Btw, what's the 
eject method done?

>
> Thanks,
> -Toshi
>
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24 16:02           ` Toshi Kani
  2013-07-25  0:17             ` Hush Bensen
@ 2013-07-25  0:44             ` Hush Bensen
  2013-07-25  0:56             ` Hush Bensen
  2 siblings, 0 replies; 28+ messages in thread
From: Hush Bensen @ 2013-07-25  0:44 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/25/2013 12:02 AM, Toshi Kani wrote:
> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>
>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>> ranges are provided?
>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>> information before adding a given memory address.  However, such change
>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>> use-case of this interface on x86.
>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>> Agreed.
>>>
>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>> kernel, or so?
>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>> RAM is present is different if the system supports hotplug or not.
>> Could you explain different methods to check the RAM is present if the
>> system supports hotplkug or not?
> e820 and UEFI memory descriptor tables are the boot-time interfaces.
> These interfaces are not required to reflect any run-time changes.
>
> ACPI memory device objects can be used at both boot-time and run-time,
> which reflect any run-time changes.  But they are optional to implement.
> They typically are not implemented unless the system supports hotplug.
>
>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>> a system that does not support memory hotplug.
>>>> All vendors implement e820 maps for the memory present at boot time.
>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>> new memory added.  Here is a quote from ACPI spec.
>>>
>>> ===
>>> 15.1 INT 15H, E820H - Query System Address Map
>>>    :
>>> The memory map conveyed by this interface is not required to reflect any
>>> changes in available physical memory that have occurred after the BIOS
>>> has initially passed control to the operating system. For example, if
>>> memory is added dynamically, this interface is not required to reflect
>>> the new system memory configuration.
>>> ===
>>>
>>> By definition, the "probe" interface is used for the kernel to recognize
>>> a new memory added at run-time.  So, it should check ACPI memory device
>>> objects (which represents run-time state) for the verification.  On x86,
>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>> the kernel to recognize the new physical memory properly.  Hence, users
>>> do not need this "probe" interface.
>>>
>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>> registry), right?
>>> If we focus on this test scenario on a system that does not support
>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>> check with e820 if the system supports hotplug, but there is no use in
>>> this interface for testing if the system supports hotplug.  So, this may
>>> be a good idea.
>>>
>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>> address for your testing?
>>>
>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>> could be utilized here as well.
>>> In high-level, here is how ACPI memory hotplug works:
>>>
>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>> hot-added.
>>> 2. The kernel is notified, and verifies if the new memory device object
>>> has not been attached by any handler yet.
>>> 3. The memory handler is called, and obtains a new memory range from the
>>> ACPI memory device object.
>>> 4. The memory handler calls add_memory() with the new address range.
>>>
>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>> and step 3 gets a valid address range from the firmware directly.  Step
>>> 4 is basically the same as the "probe" interface, but with all the
>>> verification up front, this step is safe.
>> This is hot-added part, could you also explain how ACPI memory hotplug
>> works for hot-remove?
> Sure.  Here is high-level.
>
> 1. ACPI sends a hotplug event to an ACPI memory device object that is
> requested to hot-remove.
> 2. The kernel is notified, and verifies if the memory device object is
> attached by a handler.
> 3. The memory handler is called (which is being attached), and obtains
> its memory range.
> 4. The memory handler calls remove_memory() with the address range.
> 5. The kernel calls eject method of the ACPI memory device object.

Could you give me the calltrace of add_memory and remove_memory? I don't 
have machine support hotplug, but I hope to investigate how ACPI part 
works for memory hotplug. ;-)

>
> Thanks,
> -Toshi
>
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24 16:02           ` Toshi Kani
  2013-07-25  0:17             ` Hush Bensen
  2013-07-25  0:44             ` Hush Bensen
@ 2013-07-25  0:56             ` Hush Bensen
  2013-07-25  3:08               ` Yasuaki Ishimatsu
  2 siblings, 1 reply; 28+ messages in thread
From: Hush Bensen @ 2013-07-25  0:56 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On 07/25/2013 12:02 AM, Toshi Kani wrote:
> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>
>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>> ranges are provided?
>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>> information before adding a given memory address.  However, such change
>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>> use-case of this interface on x86.
>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>> Agreed.
>>>
>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>> kernel, or so?
>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>> RAM is present is different if the system supports hotplug or not.
>> Could you explain different methods to check the RAM is present if the
>> system supports hotplkug or not?
> e820 and UEFI memory descriptor tables are the boot-time interfaces.
> These interfaces are not required to reflect any run-time changes.
>
> ACPI memory device objects can be used at both boot-time and run-time,
> which reflect any run-time changes.  But they are optional to implement.
> They typically are not implemented unless the system supports hotplug.
>
>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>> a system that does not support memory hotplug.
>>>> All vendors implement e820 maps for the memory present at boot time.
>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>> new memory added.  Here is a quote from ACPI spec.
>>>
>>> ===
>>> 15.1 INT 15H, E820H - Query System Address Map
>>>    :
>>> The memory map conveyed by this interface is not required to reflect any
>>> changes in available physical memory that have occurred after the BIOS
>>> has initially passed control to the operating system. For example, if
>>> memory is added dynamically, this interface is not required to reflect
>>> the new system memory configuration.
>>> ===
>>>
>>> By definition, the "probe" interface is used for the kernel to recognize
>>> a new memory added at run-time.  So, it should check ACPI memory device
>>> objects (which represents run-time state) for the verification.  On x86,
>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>> the kernel to recognize the new physical memory properly.  Hence, users
>>> do not need this "probe" interface.
>>>
>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>> registry), right?
>>> If we focus on this test scenario on a system that does not support
>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>> check with e820 if the system supports hotplug, but there is no use in
>>> this interface for testing if the system supports hotplug.  So, this may
>>> be a good idea.
>>>
>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>> address for your testing?
>>>
>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>> could be utilized here as well.
>>> In high-level, here is how ACPI memory hotplug works:
>>>
>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>> hot-added.
>>> 2. The kernel is notified, and verifies if the new memory device object
>>> has not been attached by any handler yet.
>>> 3. The memory handler is called, and obtains a new memory range from the
>>> ACPI memory device object.
>>> 4. The memory handler calls add_memory() with the new address range.
>>>
>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>> and step 3 gets a valid address range from the firmware directly.  Step
>>> 4 is basically the same as the "probe" interface, but with all the
>>> verification up front, this step is safe.
>> This is hot-added part, could you also explain how ACPI memory hotplug
>> works for hot-remove?
> Sure.  Here is high-level.
>
> 1. ACPI sends a hotplug event to an ACPI memory device object that is
> requested to hot-remove.
> 2. The kernel is notified, and verifies if the memory device object is
> attached by a handler.
> 3. The memory handler is called (which is being attached), and obtains
> its memory range.
> 4. The memory handler calls remove_memory() with the address range.
> 5. The kernel calls eject method of the ACPI memory device object.

If hot remove the memory device by the hardware, or writing 1 to 
/sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method? 
What's the difference between these two methods? I guess the former will 
send SCI and the latter won't.

>
> Thanks,
> -Toshi
>
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-25  0:56             ` Hush Bensen
@ 2013-07-25  3:08               ` Yasuaki Ishimatsu
  2013-07-25  3:34                 ` Hush Bensen
  0 siblings, 1 reply; 28+ messages in thread
From: Yasuaki Ishimatsu @ 2013-07-25  3:08 UTC (permalink / raw)
  To: Hush Bensen
  Cc: Toshi Kani, Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, tangchen, vasilis.liaskovitis

(2013/07/25 9:56), Hush Bensen wrote:
> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>>
>>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>>> ranges are provided?
>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>> information before adding a given memory address.  However, such change
>>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>>> use-case of this interface on x86.
>>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>>> Agreed.
>>>>
>>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>>> kernel, or so?
>>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>>> RAM is present is different if the system supports hotplug or not.
>>> Could you explain different methods to check the RAM is present if the
>>> system supports hotplkug or not?
>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>> These interfaces are not required to reflect any run-time changes.
>>
>> ACPI memory device objects can be used at both boot-time and run-time,
>> which reflect any run-time changes.  But they are optional to implement.
>> They typically are not implemented unless the system supports hotplug.
>>
>>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>>> objects unless their systems support memory hotplug.  Dave Hansen is
>>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>>> a system that does not support memory hotplug.
>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>>> new memory added.  Here is a quote from ACPI spec.
>>>>
>>>> ===
>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>    :
>>>> The memory map conveyed by this interface is not required to reflect any
>>>> changes in available physical memory that have occurred after the BIOS
>>>> has initially passed control to the operating system. For example, if
>>>> memory is added dynamically, this interface is not required to reflect
>>>> the new system memory configuration.
>>>> ===
>>>>
>>>> By definition, the "probe" interface is used for the kernel to recognize
>>>> a new memory added at run-time.  So, it should check ACPI memory device
>>>> objects (which represents run-time state) for the verification.  On x86,
>>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>>> the kernel to recognize the new physical memory properly.  Hence, users
>>>> do not need this "probe" interface.
>>>>
>>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>>> registry), right?
>>>> If we focus on this test scenario on a system that does not support
>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>>> check with e820 if the system supports hotplug, but there is no use in
>>>> this interface for testing if the system supports hotplug.  So, this may
>>>> be a good idea.
>>>>
>>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>>> address for your testing?
>>>>
>>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>>> could be utilized here as well.
>>>> In high-level, here is how ACPI memory hotplug works:
>>>>
>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>>> hot-added.
>>>> 2. The kernel is notified, and verifies if the new memory device object
>>>> has not been attached by any handler yet.
>>>> 3. The memory handler is called, and obtains a new memory range from the
>>>> ACPI memory device object.
>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>
>>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>>> and step 3 gets a valid address range from the firmware directly.  Step
>>>> 4 is basically the same as the "probe" interface, but with all the
>>>> verification up front, this step is safe.
>>> This is hot-added part, could you also explain how ACPI memory hotplug
>>> works for hot-remove?
>> Sure.  Here is high-level.
>>
>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>> requested to hot-remove.
>> 2. The kernel is notified, and verifies if the memory device object is
>> attached by a handler.
>> 3. The memory handler is called (which is being attached), and obtains
>> its memory range.
>> 4. The memory handler calls remove_memory() with the address range.
>> 5. The kernel calls eject method of the ACPI memory device object.
>
> If hot remove the memory device by the hardware, or writing 1 to
> /sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method?

Yes.
Both operations will call eject method.

> What's the difference between these two methods? I guess the former will send SCI and the latter won't.

Triggers are different. Former is triggered by SCI, latter is triggered by
writing sysfs.

Thanks,
Yasuaki Ishimatsu


>
>>
>> Thanks,
>> -Toshi
>>
>>
>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-25  3:08               ` Yasuaki Ishimatsu
@ 2013-07-25  3:34                 ` Hush Bensen
  2013-07-25  4:55                   ` Yasuaki Ishimatsu
  0 siblings, 1 reply; 28+ messages in thread
From: Hush Bensen @ 2013-07-25  3:34 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Toshi Kani, Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, tangchen, vasilis.liaskovitis

On 07/25/2013 11:08 AM, Yasuaki Ishimatsu wrote:
> (2013/07/25 9:56), Hush Bensen wrote:
>> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>>>
>>>>>>>> Could we please also fix it to never crash the kernel, even if 
>>>>>>>> stupid
>>>>>>>> ranges are provided?
>>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>>> information before adding a given memory address. However, such 
>>>>>>> change
>>>>>>> would interfere its test use of "fake" hotplug, which is only 
>>>>>>> the known
>>>>>>> use-case of this interface on x86.
>>>>>> Not crashing the kernel is not a novel concept even for test 
>>>>>> interfaces...
>>>>> Agreed.
>>>>>
>>>>>> Where does the possible crash come from - from using invalid RAM 
>>>>>> ranges,
>>>>>> right? I.e. on x86 to fix the crash we need to check the RAM is 
>>>>>> present in
>>>>>> the e820 maps, is marked RAM there, and is not already registered 
>>>>>> with the
>>>>>> kernel, or so?
>>>>> Yes, the crash comes from using invalid RAM ranges.  How to check 
>>>>> if the
>>>>> RAM is present is different if the system supports hotplug or not.
>>>> Could you explain different methods to check the RAM is present if the
>>>> system supports hotplkug or not?
>>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>>> These interfaces are not required to reflect any run-time changes.
>>>
>>> ACPI memory device objects can be used at both boot-time and run-time,
>>> which reflect any run-time changes.  But they are optional to 
>>> implement.
>>> They typically are not implemented unless the system supports hotplug.
>>>
>>>>>>> In order to verify if a given memory address is enabled at 
>>>>>>> run-time (as
>>>>>>> opposed to boot-time), we need to check with ACPI memory device 
>>>>>>> objects
>>>>>>> on x86.  However, system vendors tend to not implement memory 
>>>>>>> device
>>>>>>> objects unless their systems support memory hotplug. Dave Hansen is
>>>>>>> using this interface for his testing as a way to fake a hotplug 
>>>>>>> event on
>>>>>>> a system that does not support memory hotplug.
>>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>>> Yes for boot time.  At run-time, e820 is not guaranteed to 
>>>>> represent a
>>>>> new memory added.  Here is a quote from ACPI spec.
>>>>>
>>>>> ===
>>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>>    :
>>>>> The memory map conveyed by this interface is not required to 
>>>>> reflect any
>>>>> changes in available physical memory that have occurred after the 
>>>>> BIOS
>>>>> has initially passed control to the operating system. For example, if
>>>>> memory is added dynamically, this interface is not required to 
>>>>> reflect
>>>>> the new system memory configuration.
>>>>> ===
>>>>>
>>>>> By definition, the "probe" interface is used for the kernel to 
>>>>> recognize
>>>>> a new memory added at run-time.  So, it should check ACPI memory 
>>>>> device
>>>>> objects (which represents run-time state) for the verification.  
>>>>> On x86,
>>>>> however, ACPI also sends a hotplug event to the kernel, which 
>>>>> triggers
>>>>> the kernel to recognize the new physical memory properly. Hence, 
>>>>> users
>>>>> do not need this "probe" interface.
>>>>>
>>>>>> How is the testing done by Dave Hansen? If it's done by booting 
>>>>>> with less
>>>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>>>> hot-adding some of the missing RAM, then this could be made safe 
>>>>>> via the
>>>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>>>> registry), right?
>>>>> If we focus on this test scenario on a system that does not support
>>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>>> assume that the system has no change after boot.  IOW, it is 
>>>>> unsafe to
>>>>> check with e820 if the system supports hotplug, but there is no 
>>>>> use in
>>>>> this interface for testing if the system supports hotplug.  So, 
>>>>> this may
>>>>> be a good idea.
>>>>>
>>>>> Dave, is this how you are testing?  Do you always specify a valid 
>>>>> memory
>>>>> address for your testing?
>>>>>
>>>>>> How does the hotplug event based approach solve double adds? 
>>>>>> Relies on the
>>>>>> hardware not sending a hot-add event twice for the same memory 
>>>>>> area or for
>>>>>> an invalid memory area, or does it include fail-safes and double 
>>>>>> checks as
>>>>>> well to avoid double adds and adding invalid memory? If yes then 
>>>>>> that
>>>>>> could be utilized here as well.
>>>>> In high-level, here is how ACPI memory hotplug works:
>>>>>
>>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object 
>>>>> that is
>>>>> hot-added.
>>>>> 2. The kernel is notified, and verifies if the new memory device 
>>>>> object
>>>>> has not been attached by any handler yet.
>>>>> 3. The memory handler is called, and obtains a new memory range 
>>>>> from the
>>>>> ACPI memory device object.
>>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>>
>>>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>>>> input (nor sysfs interface) is necessary.  Step 2 prevents double 
>>>>> adds
>>>>> and step 3 gets a valid address range from the firmware directly.  
>>>>> Step
>>>>> 4 is basically the same as the "probe" interface, but with all the
>>>>> verification up front, this step is safe.
>>>> This is hot-added part, could you also explain how ACPI memory hotplug
>>>> works for hot-remove?
>>> Sure.  Here is high-level.
>>>
>>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>>> requested to hot-remove.
>>> 2. The kernel is notified, and verifies if the memory device object is
>>> attached by a handler.
>>> 3. The memory handler is called (which is being attached), and obtains
>>> its memory range.
>>> 4. The memory handler calls remove_memory() with the address range.
>>> 5. The kernel calls eject method of the ACPI memory device object.
>>
>> If hot remove the memory device by the hardware, or writing 1 to
>> /sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method?
>
> Yes.
> Both operations will call eject method.
>
>> What's the difference between these two methods? I guess the former 
>> will send SCI and the latter won't.
>
> Triggers are different. Former is triggered by SCI, latter is 
> triggered by
> writing sysfs.

Thanks, another question, what's the role of udev in memory hotplug?

>
> Thanks,
> Yasuaki Ishimatsu
>
>
>>
>>>
>>> Thanks,
>>> -Toshi
>>>
>>>
>>
>
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-25  3:34                 ` Hush Bensen
@ 2013-07-25  4:55                   ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 28+ messages in thread
From: Yasuaki Ishimatsu @ 2013-07-25  4:55 UTC (permalink / raw)
  To: Hush Bensen
  Cc: Toshi Kani, Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, tangchen, vasilis.liaskovitis

(2013/07/25 12:34), Hush Bensen wrote:
> On 07/25/2013 11:08 AM, Yasuaki Ishimatsu wrote:
>> (2013/07/25 9:56), Hush Bensen wrote:
>>> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>>>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>>>> * Toshi Kani <toshi.kani@hp.com> wrote:
>>>>>>>
>>>>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>>>>> ranges are provided?
>>>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>>>> information before adding a given memory address. However, such change
>>>>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>>>>> use-case of this interface on x86.
>>>>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>>>>> Agreed.
>>>>>>
>>>>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>>>>> kernel, or so?
>>>>>> Yes, the crash comes from using invalid RAM ranges.  How to check if the
>>>>>> RAM is present is different if the system supports hotplug or not.
>>>>> Could you explain different methods to check the RAM is present if the
>>>>> system supports hotplkug or not?
>>>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>>>> These interfaces are not required to reflect any run-time changes.
>>>>
>>>> ACPI memory device objects can be used at both boot-time and run-time,
>>>> which reflect any run-time changes.  But they are optional to implement.
>>>> They typically are not implemented unless the system supports hotplug.
>>>>
>>>>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>>>>> on x86.  However, system vendors tend to not implement memory device
>>>>>>>> objects unless their systems support memory hotplug. Dave Hansen is
>>>>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>>>>> a system that does not support memory hotplug.
>>>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>>>> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
>>>>>> new memory added.  Here is a quote from ACPI spec.
>>>>>>
>>>>>> ===
>>>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>>>    :
>>>>>> The memory map conveyed by this interface is not required to reflect any
>>>>>> changes in available physical memory that have occurred after the BIOS
>>>>>> has initially passed control to the operating system. For example, if
>>>>>> memory is added dynamically, this interface is not required to reflect
>>>>>> the new system memory configuration.
>>>>>> ===
>>>>>>
>>>>>> By definition, the "probe" interface is used for the kernel to recognize
>>>>>> a new memory added at run-time.  So, it should check ACPI memory device
>>>>>> objects (which represents run-time state) for the verification. On x86,
>>>>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>>>>> the kernel to recognize the new physical memory properly. Hence, users
>>>>>> do not need this "probe" interface.
>>>>>>
>>>>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>>>>> registry), right?
>>>>>> If we focus on this test scenario on a system that does not support
>>>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>>>> assume that the system has no change after boot.  IOW, it is unsafe to
>>>>>> check with e820 if the system supports hotplug, but there is no use in
>>>>>> this interface for testing if the system supports hotplug.  So, this may
>>>>>> be a good idea.
>>>>>>
>>>>>> Dave, is this how you are testing?  Do you always specify a valid memory
>>>>>> address for your testing?
>>>>>>
>>>>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>>>>> could be utilized here as well.
>>>>>> In high-level, here is how ACPI memory hotplug works:
>>>>>>
>>>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>>>>> hot-added.
>>>>>> 2. The kernel is notified, and verifies if the new memory device object
>>>>>> has not been attached by any handler yet.
>>>>>> 3. The memory handler is called, and obtains a new memory range from the
>>>>>> ACPI memory device object.
>>>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>>>
>>>>>> The above step 1-4 proceeds automatically within the kernel.  No user
>>>>>> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
>>>>>> and step 3 gets a valid address range from the firmware directly. Step
>>>>>> 4 is basically the same as the "probe" interface, but with all the
>>>>>> verification up front, this step is safe.
>>>>> This is hot-added part, could you also explain how ACPI memory hotplug
>>>>> works for hot-remove?
>>>> Sure.  Here is high-level.
>>>>
>>>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>>>> requested to hot-remove.
>>>> 2. The kernel is notified, and verifies if the memory device object is
>>>> attached by a handler.
>>>> 3. The memory handler is called (which is being attached), and obtains
>>>> its memory range.
>>>> 4. The memory handler calls remove_memory() with the address range.
>>>> 5. The kernel calls eject method of the ACPI memory device object.
>>>
>>> If hot remove the memory device by the hardware, or writing 1 to
>>> /sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method?
>>
>> Yes.
>> Both operations will call eject method.
>>
>>> What's the difference between these two methods? I guess the former will send SCI and the latter
>>> won't.
>>
>> Triggers are different. Former is triggered by SCI, latter is triggered by
>> writing sysfs.
>
> Thanks, another question, what's the role of udev in memory hotplug?

Udev notifies user land of hotplug events. So if you want to do something
after memory hotplug automatically, you can do it by registering your script
to udev.

Thanks,
Yasuaki Ishimatsu


>
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>
>>>
>>>>
>>>> Thanks,
>>>> -Toshi
>>>>
>>>>
>>>
>>
>>
>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-25  0:17             ` Hush Bensen
@ 2013-07-25 15:47               ` Toshi Kani
  0 siblings, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-25 15:47 UTC (permalink / raw)
  To: Hush Bensen
  Cc: Ingo Molnar, akpm, linux-mm, linux-kernel, x86, dave,
	kosaki.motohiro, isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Thu, 2013-07-25 at 08:17 +0800, Hush Bensen wrote:
 :
> Thanks for your explaination, very useful for me. ;-) Btw, what's the 
> eject method done?

On bare metal, the eject method makes a target device electrically
isolated and physically removable.  On virtualized environment, the
eject method makes a target device unassigned and released.

> Could you give me the calltrace of add_memory and remove_memory? I
> don't have machine support hotplug, but I hope to investigate how
> ACPI part works for memory hotplug. ;-)

Here you go.

[   84.883517]  [<ffffffff8153e8a4>] add_memory+0x24/0x185
[   84.883525]  [<ffffffff812ca331>] ? acpi_get_pxm+0x2b/0x4e
[   84.883533]  [<ffffffff81303418>] acpi_memory_device_add+0x144/0x274
[   84.883539]  [<ffffffff812c1b25>] acpi_bus_device_attach+0x80/0xd9
[   84.883545]  [<ffffffff812c2b65>] acpi_bus_scan+0x65/0x9f
[   84.883551]  [<ffffffff812c2c1d>] acpi_scan_bus_device_check
+0x7e/0x10f
[   84.883556]  [<ffffffff812c2cc1>] acpi_scan_device_check+0x13/0x15
[   84.883562]  [<ffffffff812bbdab>] acpi_os_execute_deferred+0x25/0x32
[   84.883568]  [<ffffffff81057cbc>] process_one_work+0x229/0x3d3
[   84.883573]  [<ffffffff81057bfa>] ? process_one_work+0x167/0x3d3
[   84.883579]  [<ffffffff81058248>] worker_thread+0x133/0x200
[   84.883584]  [<ffffffff81058115>] ? rescuer_thread+0x280/0x280
[   84.883591]  [<ffffffff8105e0c3>] kthread+0xb1/0xb9
[   84.883598]  [<ffffffff8105e012>] ? __kthread_parkme+0x65/0x65
[   84.883604]  [<ffffffff81556c9c>] ret_from_fork+0x7c/0xb0
[   84.883610]  [<ffffffff8105e012>] ? __kthread_parkme+0x65/0x65

[  129.586622]  [<ffffffff8153f4b1>] remove_memory+0x22/0x85
[  129.586627]  [<ffffffff813031c6>] acpi_memory_device_remove+0x77/0xa7
[  129.586631]  [<ffffffff812c0f95>] acpi_bus_device_detach+0x3d/0x5e
[  129.586634]  [<ffffffff812c0ff8>] acpi_bus_trim+0x42/0x7a
[  129.586637]  [<ffffffff812c1587>] acpi_scan_hot_remove+0x1ba/0x261
[  129.586641]  [<ffffffff812c16ce>] acpi_bus_device_eject+0xa0/0xd0
[  129.586644]  [<ffffffff812bbdab>] acpi_os_execute_deferred+0x25/0x32
[  129.586648]  [<ffffffff81057cbc>] process_one_work+0x229/0x3d3
[  129.586652]  [<ffffffff81057bfa>] ? process_one_work+0x167/0x3d3
[  129.586655]  [<ffffffff81058248>] worker_thread+0x133/0x200
[  129.586658]  [<ffffffff81058115>] ? rescuer_thread+0x280/0x280
[  129.586663]  [<ffffffff8105e0c3>] kthread+0xb1/0xb9
[  129.586667]  [<ffffffff8105e012>] ? __kthread_parkme+0x65/0x65
[  129.586671]  [<ffffffff81556c9c>] ret_from_fork+0x7c/0xb0
[  129.586675]  [<ffffffff8105e012>] ? __kthread_parkme+0x65/0x65

-Toshi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-24 16:58           ` Toshi Kani
@ 2013-07-25 21:38             ` Ingo Molnar
  2013-07-25 22:36               ` Toshi Kani
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2013-07-25 21:38 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis


* Toshi Kani <toshi.kani@hp.com> wrote:

> > You claimed that the only purpose of this on x86 was 
> > that testing was done on non-hotplug systems, using 
> > this interface. Non-hotplug systems have e820 maps.
> 
> Right.  Sorry, I first thought that the interface needed 
> to work as defined, i.e. detect a new memory.  But for 
> the test purpose on non-hotplug systems, that is not 
> necessary.  So, I agree that we can check e820.
> 
> I summarized two options in the email below.
> https://lkml.org/lkml/2013/7/23/602
> 
> Option 1) adds a check with e820.  Option 2) deprecates 
> the interface by removing the config option from x86 
> Kconfig.  I was thinking that we could evaluate two 
> options after this patch gets in.  Does it make sense?

Yeah.

That having said, if the e820 check is too difficult to 
pull off straight away, I also don't mind keeping it as-is 
if it's useful for testing. Just make sure you document it 
as "you need to be careful with this" (beyond it being a 
root-only interface to begin with).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
  2013-07-25 21:38             ` Ingo Molnar
@ 2013-07-25 22:36               ` Toshi Kani
  0 siblings, 0 replies; 28+ messages in thread
From: Toshi Kani @ 2013-07-25 22:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-mm, linux-kernel, x86, dave, kosaki.motohiro,
	isimatu.yasuaki, tangchen, vasilis.liaskovitis

On Thu, 2013-07-25 at 23:38 +0200, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > > You claimed that the only purpose of this on x86 was 
> > > that testing was done on non-hotplug systems, using 
> > > this interface. Non-hotplug systems have e820 maps.
> > 
> > Right.  Sorry, I first thought that the interface needed 
> > to work as defined, i.e. detect a new memory.  But for 
> > the test purpose on non-hotplug systems, that is not 
> > necessary.  So, I agree that we can check e820.
> > 
> > I summarized two options in the email below.
> > https://lkml.org/lkml/2013/7/23/602
> > 
> > Option 1) adds a check with e820.  Option 2) deprecates 
> > the interface by removing the config option from x86 
> > Kconfig.  I was thinking that we could evaluate two 
> > options after this patch gets in.  Does it make sense?
> 
> Yeah.
> 
> That having said, if the e820 check is too difficult to 
> pull off straight away, I also don't mind keeping it as-is 
> if it's useful for testing. Just make sure you document it 
> as "you need to be careful with this" (beyond it being a 
> root-only interface to begin with).

Sounds good.  I will keep it as-is for now, and add more clarification
to the document.  I will send an updated patch shortly.

I will make sure to come back after this patch gets in and we get more
info about users.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2013-07-25 22:37 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-19 17:47 [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default Toshi Kani
2013-07-19 19:30 ` KOSAKI Motohiro
2013-07-19 19:35   ` Toshi Kani
2013-07-22  8:37 ` Ingo Molnar
2013-07-22 17:12   ` Toshi Kani
2013-07-22 20:57     ` KOSAKI Motohiro
2013-07-22 21:04       ` Dave Hansen
2013-07-23  0:34       ` Toshi Kani
2013-07-23  8:01     ` Ingo Molnar
2013-07-23 20:45       ` Toshi Kani
2013-07-23 20:59         ` Dave Hansen
2013-07-23 21:34           ` Toshi Kani
2013-07-24  0:18         ` Hush Bensen
2013-07-24 16:02           ` Toshi Kani
2013-07-25  0:17             ` Hush Bensen
2013-07-25 15:47               ` Toshi Kani
2013-07-25  0:44             ` Hush Bensen
2013-07-25  0:56             ` Hush Bensen
2013-07-25  3:08               ` Yasuaki Ishimatsu
2013-07-25  3:34                 ` Hush Bensen
2013-07-25  4:55                   ` Yasuaki Ishimatsu
2013-07-24  4:20         ` Ingo Molnar
2013-07-24 16:58           ` Toshi Kani
2013-07-25 21:38             ` Ingo Molnar
2013-07-25 22:36               ` Toshi Kani
2013-07-23  0:24 ` Yasuaki Ishimatsu
2013-07-23  0:45   ` Toshi Kani
2013-07-23  7:46 ` [tip:x86/mm] " tip-bot for Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).