From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Rafael J. Wysocki"
Subject: Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
Date: Wed, 08 May 2013 02:24:40 +0200
Message-ID: <1738385.YBsAESXG5F@vostro.rjw.lan>
References: <1576321.HU0tZ4cGWk@vostro.rjw.lan> <228012439.MgiLXSqjLd@vostro.rjw.lan> <1367971156.30363.32.camel@misato.fc.hp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7Bit
Return-path:
In-Reply-To: <1367971156.30363.32.camel@misato.fc.hp.com>
Sender: owner-linux-mm@kvack.org
To: Toshi Kani
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List, LKML, isimatu.yasuaki@jp.fujitsu.com, Len Brown, linux-mm@kvack.org, wency@cn.fujitsu.com
List-Id: linux-acpi@vger.kernel.org

On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > >
> > > > > :
> > > > >
> > > > > > Updated patch is appended for completeness.
> > > > >
> > > > > Yes, this updated patch solved the locking issue.
> > > > >
> > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > >
> > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > >
> > > > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > >
> > > > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > >
> > > > > > > yes, ok.
> > > > > > >
> > > > > > > >
> > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > >
> > > > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > > > memory_subsys_offline(), then?
> > > > > > >
> > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > >
> > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > affect non-memory devices as well.
> > > > > > >
> > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > > > more details on 2-pass offlining effectiveness.
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > It may be added in a separate patch in any case.
> > > > >
> > > > > I had the same comment as Vasilis. And, I agree with you that we can
> > > > > enhance it in separate patches.
> > > > >
> > > > > :
> > > > >
> > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > +{
> > > > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	mutex_lock(&mem->state_mutex);
> > > > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > >
> > > > > This function needs to check mem->state just like
> > > > > offline_memory_block(). That is:
> > > > >
> > > > > 	int ret = 0;
> > > > > 	:
> > > > > 	if (mem->state != MEM_OFFLINE)
> > > > > 		ret = __memory_block_change_state(...);
> > > > >
> > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > __memory_block_change_state() since mem->state is already set to
> > > > > MEM_OFFLINE.
> > > > >
> > > > > With that change, for the series:
> > > > > Reviewed-by: Toshi Kani
> > > >
> > > > OK, one more update, then (appended).
> > > >
> > > > That said I thought that the check against dev->offline in device_offline()
> > > > would be sufficient to guard against that. Is there any "offline" code path
> > > > I didn't take into account?
> > >
> > > Oh, you are right about that. The real problem is that dev->offline is
> > > set to false (0) when a new memory is hot-added in off-line state. So,
> > > instead, dev->offline needs to be set properly.
> >
> > OK, where does that happen?
>
> It's a bit messy, but the following change seems to work. A tricky part
> is that online() is not called during boot, so I needed to update the
> offline flag in __memory_block_change_state().

I wonder why?
>
> ---
>  drivers/base/memory.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index b9dfd34..1c8d781 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
> memory_block *mem,
>  			mem->state = from_state_req;
>  	} else {
>  		mem->state = to_state;
> -		if (to_state == MEM_ONLINE)
> +		if (to_state == MEM_ONLINE) {
>  			mem->last_online = online_type;
> +			mem->dev.offline = false;
> +		}
>

__memory_block_change_state() is called by memory_subsys_online/offline() and by
__memory_block_change_state_uevent() only, so it should be sufficient to do this
under the switch () in the latter.

Still, though, __memory_block_change_state_uevent() is only called (indirectly)
from store_mem_state() and by offline_memory_block(), both of which update
dev->offline. What's the exact scenario you needed this for?

>  	}
>  	return ret;
>  }
> @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
> **memory,
>  	mem->state = state;
>  	mem->last_online = ONLINE_KEEP;
>  	mem->section_count++;
> +	mem->dev.offline = (state == MEM_OFFLINE) ? true : false;

You could write this as

+	mem->dev.offline = state == MEM_OFFLINE;

Moreover, it'd be better to do it in register_memory(), I think.

>  	mutex_init(&mem->state_mutex);
>  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  	mem->phys_device = arch_get_memory_phys_device(start_pfn);

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body
to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .