From: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Toshi Kani <toshi.kani@hp.com>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	isimatu.yasuaki@jp.fujitsu.com, Len Brown <lenb@kernel.org>,
	linux-mm@kvack.org, wency@cn.fujitsu.com
Subject: Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
Date: Tue, 7 May 2013 12:59:45 +0200
Message-ID: <20130507105945.GA4354@dhcp-192-168-178-175.profitbricks.localdomain>
In-Reply-To: <1809544.1r1JBXrr0i@vostro.rjw.lan>

Hi,

On Tue, May 07, 2013 at 02:59:05AM +0200, Rafael J. Wysocki wrote:
> On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> > On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Introduce .offline() and .online() callbacks for memory_subsys
> > > that will allow the generic device_offline() and device_online()
> > > to be used with device objects representing memory blocks.  That,
> > > in turn, allows the ACPI subsystem to use device_offline() to put
> > > removable memory blocks offline, if possible, before removing
> > > memory modules holding them.
> > > 
> > > The 'online' sysfs attribute of memory block devices will attempt to
> > > put them offline if 0 is written to it and will attempt to apply the
> > > previously used online type when onlining them (i.e. when 1 is
> > > written to it).
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
> > >  include/linux/memory.h |    1 
> > >  2 files changed, 81 insertions(+), 25 deletions(-)
> > >
> > [...]
> > 
> > > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > >  {
> > >  	int ret = 0;
> > >  
> > > +	lock_device_hotplug();
> > >  	mutex_lock(&mem->state_mutex);
> > > -	if (mem->state != MEM_OFFLINE)
> > > -		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > +	if (mem->state != MEM_OFFLINE) {
> > > +		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > > +							 MEM_ONLINE, -1);
> > > +		if (!ret)
> > > +			mem->dev.offline = true;
> > > +	}
> > >  	mutex_unlock(&mem->state_mutex);
> > > +	unlock_device_hotplug();
> > 
> > (Testing with qemu...)
> 
> Thanks!
> 
> > offline_memory_block is called from remove_memory, which in turn is called from
> > acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> > hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> > don't need to lock/unlock_device_hotplug in offline_memory_block.
> 
> Indeed.
> 
> First, it looks like offline_memory_block_cb() is the only place calling
> offline_memory_block(), is that right?  I'm wondering if it would make

correct.

> sense to use device_offline() in there and remove offline_memory_block()
> entirely?

possibly. I am not sure if we can get hold of the struct device from
mm/memory_hotplug.c; maybe we still need the helper function that operates
directly on the memory block.
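
The quoted hunk sets mem->dev.offline, which suggests the device object is
embedded in the memory block. Assuming that layout, a minimal userspace sketch
of moving between the two could look like the following. All model_* names are
hypothetical stand-ins, not kernel API, and only the fields seen in the hunk
are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: the
 * device is embedded in the memory block, as mem->dev in the hunk). */
struct model_device {
	bool offline;
};

struct model_memory_block {
	int state;
	struct model_device dev;
};

/* Userspace version of the container_of() idea: recover the enclosing
 * structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Memory block -> embedded device. */
static struct model_device *model_block_to_dev(struct model_memory_block *mem)
{
	return &mem->dev;
}

/* Embedded device -> enclosing memory block. */
static struct model_memory_block *model_dev_to_block(struct model_device *dev)
{
	return model_container_of(dev, struct model_memory_block, dev);
}
```

Whether code in mm/memory_hotplug.c can rely on this depends on what the real
headers expose, which this sketch does not settle.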

> 
> Second, if you ran into this issue during testing, that would mean that patch
> [1/2] actually worked for you, which would be nice. :-)  Was that really the
> case?

yes, the patchset works fine once the extra lock/unlock_device_hotplug is
removed. For various dimm hot-remove operations, I saw either successful
offlining and removal, or failed offlining and aborted removal.
You can add this to 1/2 (or, once the extra lock is removed, to 2/2 as well):

Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
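
For reference, the nested-lock problem discussed above can be modelled in
userspace roughly as follows. This is only a sketch: the model_* names are
hypothetical, the lock is a plain flag rather than a kernel mutex, and a failed
re-acquisition is reported as an error where a real non-recursive lock would
block.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive hotplug lock. */
struct model_lock {
	bool held;
};

/* Returns true on success, false if the lock is already held
 * (a real non-recursive mutex would block here instead). */
static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	l->held = false;
}

/* Model of the offline helper; take_lock mirrors the extra
 * lock/unlock pair added around the state change in the v2 hunk. */
static int model_offline_block(struct model_lock *hotplug, bool take_lock)
{
	if (take_lock && !model_trylock(hotplug))
		return -1;	/* caller already holds the lock */
	/* ... the state change would happen here ... */
	if (take_lock)
		model_unlock(hotplug);
	return 0;
}

/* Model of the hot-remove path, which holds the lock across the trim
 * and therefore across the call into the offline helper. */
static int model_hot_remove(bool take_lock_in_helper)
{
	struct model_lock hotplug = { .held = false };
	int ret;

	if (!model_trylock(&hotplug))
		return -1;
	ret = model_offline_block(&hotplug, take_lock_in_helper);
	model_unlock(&hotplug);
	return ret;
}
```

In this model, model_hot_remove(true) fails because the helper tries to take a
lock its caller already holds, while model_hot_remove(false) succeeds.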

> 
> > A more general issue is that there are now two memory offlining efforts:
> > 
> > 1) from acpi_bus_offline_companions during device offline
> > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > 
> > The 2nd is only called if the device offline operation was already successful, so
> > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > (unless the blocks were re-onlined in between).
> 
> Sure, and that should be OK for now.  Changing the detach behavior is not
> essential from the patch [2/2] perspective, we can do it later.

yes, ok.

> 
> > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > reworked 0baeab16. Maybe we need to consolidate the logic.
> 
> Hmm.  Perhaps it would make sense to implement that logic in
> memory_subsys_offline(), then?

the logic tries to offline the memory blocks of the device twice, because the
first memory block might be storing information for the subsequent memblocks.

memory_subsys_offline operates on one memory block at a time. Perhaps we can get
the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
acpi_scan_hot_remove, but it's probably not a good idea, since that would
affect non-memory devices as well.

I am not sure how important this intelligence is in practice (I am not using
mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
more details on 2-pass offlining effectiveness.
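
To illustrate why a second pass can help, here is a small userspace model. It
assumes, as a simplification of the description above, that block 0 stores
information for the later blocks and can only go offline once they are all
offline. The model_* names and the four-block device are made up for the
example and do not reflect the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_BLOCKS 4

/* Hypothetical memory device made of several blocks. */
struct model_memdev {
	bool online[MODEL_NR_BLOCKS];
};

/* Assumption: block 0 holds metadata for the other blocks, so it can
 * be offlined only after every later block is offline. */
static bool model_can_offline(const struct model_memdev *d, size_t i)
{
	size_t j;

	if (i != 0)
		return true;
	for (j = 1; j < MODEL_NR_BLOCKS; j++)
		if (d->online[j])
			return false;
	return true;
}

/* One pass over the blocks in address order; returns the number of
 * blocks still online afterwards. */
static size_t model_offline_pass(struct model_memdev *d)
{
	size_t i, remaining = 0;

	for (i = 0; i < MODEL_NR_BLOCKS; i++) {
		if (d->online[i] && model_can_offline(d, i))
			d->online[i] = false;
		if (d->online[i])
			remaining++;
	}
	return remaining;
}

/* Two-pass wrapper: retry once if the first pass left blocks online. */
static size_t model_offline_twice(struct model_memdev *d)
{
	size_t remaining = model_offline_pass(d);

	if (remaining)
		remaining = model_offline_pass(d);
	return remaining;
}
```

With all four blocks online, a single pass leaves block 0 online, and the
second pass then takes it offline. A per-block callback sees only one block at
a time, so it cannot express this retry on its own.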

> 
> > remove_memory is called from device_detach, during trim that can't fail, so it
> > should not fail. However this function can still fail in 2 cases:
> > - offline_memory_block_cb
> > - is_memblock_offlined_cb
> > in the case of re-onlined memblocks in between device-offline and device detach.
> > This seems possible I think, since we do not hold lock_memory_hotplug for the
> > duration of the hot-remove operation.
> 
> But we do hold device_hotplug_lock, so every code path that may race with
> acpi_scan_hot_remove() needs to take device_hotplug_lock as well.  Now,
> question is whether or not there are any code paths like that calling one of
> the two functions above without holding device_hotplug_lock?

I think you are right. The other code path I had in mind was userspace-initiated
online/offline operations from store_mem_state in drivers/base/memory.c. But we
do lock_device_hotplug in that case too, so it seems safe. If I find
something else when stress testing the paths simultaneously (or another code
path), I'll update.

thanks,

- Vasilis

