From: joeyli <jlee@suse.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Kani Toshimitsu <toshi.kani@hpe.com>,
	Jiri Kosina <jkosina@suse.cz>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	linux-api@vger.kernel.org
Subject: Re: memory hotplug and force_remove
Date: Fri, 31 Mar 2017 18:49:05 +0800
Message-ID: <20170331104905.GA28365@linux-l9pv.suse>
In-Reply-To: <20170331083017.GK27098@dhcp22.suse.cz>

Hi Michal,

On Fri, Mar 31, 2017 at 10:30:17AM +0200, Michal Hocko wrote:
> [Fixed up email address of Toshimitsu - the email thread starts
> http://lkml.kernel.org/r/20170320192938.GA11363@dhcp22.suse.cz]
> 
> On Tue 28-03-17 17:22:58, Rafael J. Wysocki wrote:
> > On Tuesday, March 28, 2017 09:58:08 AM Michal Hocko wrote:
> > > On Mon 20-03-17 22:24:42, Rafael J. Wysocki wrote:
> > > > On Monday, March 20, 2017 03:29:39 PM Michal Hocko wrote:
> > > > > Hi Rafael,
> > > > 
> > > > Hi,
> > > > 
> > > > > we have been chasing the following BUG() triggering during the memory
> > > > > hotremove (remove_memory):
> > > > > 	ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
> > > > > 				check_memblock_offlined_cb);
> > > > > 	if (ret)
> > > > > 		BUG();
> > > > > 
> > > > > and it took a while to learn that the issue is caused by
> > > > > /sys/firmware/acpi/hotplug/force_remove being enabled. I was really
> > > > > surprised to see such an option because at least for the memory hotplug
> > > > > it cannot work at all. Memory hotplug fails when the memory is still
> > > > > in use. Even if we do not BUG() here enforcing the hotplug operation
> > > > > will lead to problematic behavior later like crash or a silent memory
> > > > > corruption if the memory gets onlined back and reused by somebody else.
> > > > > 
> > > > > I am wondering what was the motivation for introducing this behavior and
> > > > > whether there is a way to disallow it for memory hotplug. Or maybe drop
> > > > > it completely. What would break in such a case?
> > > > 
> > > > Honestly, I don't remember from the top of my head and I haven't looked at
> > > > that code for several months.
> > > > 
> > > > I need some time to recall that.
> > > 
> > > Did you have any chance to look into this?
> > 
> > Well, yes.
> > 
> > It looks like that was added for some people who depended on the old behavior
> > at that time.
> > 
> > I guess we can try to drop it and see what happens. :-)
> 
> OK, so what do you think about the following? It is based on the current
> linux-next and I have only compile tested it.
> ---
> From 6c5ae594ce938a1ae9b9718958401682bfab3980 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Fri, 31 Mar 2017 10:08:41 +0200
> Subject: [PATCH] acpi: drop support for force_remove
> 
> /sys/firmware/acpi/hotplug/force_remove was presumably added to support
> auto offlining in the past. This is, however, inherently dangerous for
> some hotplugable resources like memory. The memory offlining fails when
> the memory is still in use and cannot be dropped or migrated. If we
> ignore the failure we are basically allowing for subtle memory
> corruption or a crash.
> 
> We have actually noticed the latter while hitting BUG() during the memory
> hotremove (remove_memory):
> 	ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
> 			check_memblock_offlined_cb);
> 	if (ret)
> 		BUG();
> 
> It took us a non-trivial amount of time to realize that the customer had
> force_remove enabled. Even if the BUG was removed here and we could
> propagate the error up the call chain it wouldn't help at all because
> then we would hit a crash or a memory corruption later, which would be
> harder to debug. So force_remove is unfixable for the memory hotremove. We
> haven't checked whether other hotplugable resources are prone to similar
> problems.
> 
> Remove the force_remove functionality because it is not fixable currently.
> Keep the sysfs file and report an error if somebody tries to enable it.
> Encourage users to report the missing functionality and work with
> them on an alternative solution.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/ABI/obsolete/sysfs-firmware-acpi |  8 ++++++++
>  Documentation/ABI/testing/sysfs-firmware-acpi  | 10 ----------
>  drivers/acpi/internal.h                        |  2 --
>  drivers/acpi/scan.c                            | 17 +++--------------
>  drivers/acpi/sysfs.c                           |  9 +++++----
>  5 files changed, 16 insertions(+), 30 deletions(-)
>  create mode 100644 Documentation/ABI/obsolete/sysfs-firmware-acpi
> 
> diff --git a/Documentation/ABI/obsolete/sysfs-firmware-acpi b/Documentation/ABI/obsolete/sysfs-firmware-acpi
> new file mode 100644
> index 000000000000..6715a71bec3d
> --- /dev/null
> +++ b/Documentation/ABI/obsolete/sysfs-firmware-acpi
> @@ -0,0 +1,8 @@
> +What:		/sys/firmware/acpi/hotplug/force_remove
> +Date:		Mar 2017
> +Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> +Description:
> +		Since the force_remove is inherently broken and dangerous to
> +		use for some hotplugable resources like memory (because ignoring
> +		the offline failure might lead to memory corruption and crashes)
> +		enabling this knob is not safe and thus unsupported.
> diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi b/Documentation/ABI/testing/sysfs-firmware-acpi
> index c7fc72d4495c..613f42a9d5cd 100644
> --- a/Documentation/ABI/testing/sysfs-firmware-acpi
> +++ b/Documentation/ABI/testing/sysfs-firmware-acpi
> @@ -44,16 +44,6 @@ Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>  		or 0 (unset).  Attempts to write any other values to it will
>  		cause -EINVAL to be returned.
>  
> -What:		/sys/firmware/acpi/hotplug/force_remove
> -Date:		May 2013
> -Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> -Description:
> -		The number in this file (0 or 1) determines whether (1) or not
> -		(0) the ACPI subsystem will allow devices to be hot-removed even
> -		if they cannot be put offline gracefully (from the kernel's
> -		viewpoint).  That number can be changed by writing a boolean
> -		value to this file.
> -
>  What:		/sys/firmware/acpi/interrupts/
>  Date:		February 2008
>  Contact:	Len Brown <lenb@kernel.org>
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index f15900132912..66229ffa909b 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -65,8 +65,6 @@ static inline void acpi_cmos_rtc_init(void) {}
>  #endif
>  int acpi_rev_override_setup(char *str);
>  
> -extern bool acpi_force_hot_remove;
> -
>  void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>  				    const char *name);
>  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 192691880d55..a8d893fcedca 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -30,12 +30,6 @@ extern struct acpi_device *acpi_root;
>  
>  #define INVALID_ACPI_HANDLE	((acpi_handle)empty_zero_page)
>  
> -/*
> - * If set, devices will be hot-removed even if they cannot be put offline
> - * gracefully (from the kernel's standpoint).
> - */
> -bool acpi_force_hot_remove;
> -
>  static const char *dummy_hid = "device";
>  
>  static LIST_HEAD(acpi_dep_list);
> @@ -170,9 +164,6 @@ static acpi_status acpi_bus_offline(acpi_handle handle, u32 lvl, void *data,
>  			pn->put_online = false;
>  		}
>  		ret = device_offline(pn->dev);
> -		if (acpi_force_hot_remove)
> -			continue;
> -
>  		if (ret >= 0) {
>  			pn->put_online = !ret;
>  		} else {
> @@ -241,11 +232,10 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
>  		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>  				    NULL, acpi_bus_offline, (void *)true,
>  				    (void **)&errdev);
> -		if (!errdev || acpi_force_hot_remove)
> +		if (!errdev)
>  			acpi_bus_offline(handle, 0, (void *)true,
>  					 (void **)&errdev);
> -
> -		if (errdev && !acpi_force_hot_remove) {
> +		else {
              ^^^^^^^^^^^^^
This should still check errdev after the parent has been offlined, and
then roll the parent/children back to the online state if it is set:

-		if (errdev && !acpi_force_hot_remove) {
+		if (errdev) {

>  			dev_warn(errdev, "Offline failed.\n");
>  			acpi_bus_online(handle, 0, NULL, NULL);
>  			acpi_walk_namespace(ACPI_TYPE_ANY, handle,
[...snip]
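
To illustrate the difference, here is a minimal userspace sketch of the
second offline pass. It is only a simplified model: the model_* helpers
and the "child"/"parent" strings are hypothetical stand-ins, not kernel
APIs, and -1 stands in for the rollback plus error return.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified, hypothetical model of the second offline pass in
 * acpi_scan_try_to_offline(). These helpers are NOT kernel APIs; they
 * only mimic how errdev gets set when an offline attempt fails.
 */
static void model_offline_children(bool fail, const char **errdev)
{
	if (fail)
		*errdev = "child";
}

static void model_offline_parent(bool fail, const char **errdev)
{
	if (fail)
		*errdev = "parent";
}

/* Flow as in the posted hunk: the rollback is only the else branch. */
static int model_else_variant(bool child_fails, bool parent_fails)
{
	const char *errdev = NULL;

	model_offline_children(child_fails, &errdev);
	if (!errdev)
		model_offline_parent(parent_fails, &errdev);
	else
		return -1;	/* rollback and report the error */

	return 0;		/* caller goes on with the removal */
}

/* Suggested flow: test errdev again after the parent was tried. */
static int model_recheck_variant(bool child_fails, bool parent_fails)
{
	const char *errdev = NULL;

	model_offline_children(child_fails, &errdev);
	if (!errdev)
		model_offline_parent(parent_fails, &errdev);

	if (errdev)
		return -1;	/* rollback and report the error */

	return 0;		/* caller goes on with the removal */
}
```

In this model, when the children go offline but the parent does not
(child_fails = false, parent_fails = true), the else variant returns 0
and the removal would continue, while the recheck variant returns -1.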

Thanks a lot!
Joey Lee

