linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Cannot hot remove a memory device
@ 2013-08-01  8:37 Yasuaki Ishimatsu
  2013-08-01 21:43 ` Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Yasuaki Ishimatsu @ 2013-08-01  8:37 UTC (permalink / raw)
  To: rafael.j.wysocki, vasilis.liaskovitis, toshi.kani
  Cc: linux-kernel, linux-acpi, tangchen, wency

By following commit, I cannot hot remove a memory device.

ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
commit e2ff39400d81233374e780b133496a2296643d7d

Details are follows:
When I add a memory device, acpi_memory_enable_device() always fails
as follows:

...
[ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
[ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
[ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
[ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
[ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
[ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
[ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
[ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
[ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
[ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
[ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
[ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
[ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
...	
[ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
...	

By this failure, hot added acpi_memory_info->enalbed is not set to 1.
So when hot removing the memory device, acpi_memory_remove_memory() does
nothing, since acpi_memory_info->enabled is 0. Thus I cannot hot remove
the memory device.

According to the messages, add_memory() succeeded. So I think that it is
caused by failing acpi_bind_memory_blocks().

I'm now investigating why acpi_bind_memory_blocks() fails.

Thanks,
Yasuaki Ishimatsu


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-01  8:37 Cannot hot remove a memory device Yasuaki Ishimatsu
@ 2013-08-01 21:43 ` Rafael J. Wysocki
  2013-08-02 21:46   ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 21:43 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: rafael.j.wysocki, vasilis.liaskovitis, toshi.kani, linux-kernel,
	linux-acpi, tangchen, wency

Hi,

Thanks for your report.

On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> By following commit, I cannot hot remove a memory device.
> 
> ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> commit e2ff39400d81233374e780b133496a2296643d7d
> 
> Details are follows:
> When I add a memory device, acpi_memory_enable_device() always fails
> as follows:
> 
> ...
> [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> ...	
> [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error

Well, the only new way acpi_memory_enable_device() can fail after that commit
is a failure in acpi_bind_memory_blocks().

This means that either handle is NULL, which I think we can exclude, because
acpi_memory_enable_device() wouldn't be called at all if that were the case, or
there's a more subtle error in acpi_bind_one().

One that comes to mind is that we may be calling acpi_bind_one() twice for the
same memory region, in which it will trigger -EINVAL from the sanity check in
there.

Also you may try to increase ACPI_MAX_PHYSICAL_NODE (in acpi_bus.h) and see if
that helps.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-01 21:43 ` Rafael J. Wysocki
@ 2013-08-02 21:46   ` Toshi Kani
  2013-08-02 23:43     ` Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Toshi Kani @ 2013-08-02 21:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> Hi,
> 
> Thanks for your report.
> 
> On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > By following commit, I cannot hot remove a memory device.
> > 
> > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > commit e2ff39400d81233374e780b133496a2296643d7d
> > 
> > Details are follows:
> > When I add a memory device, acpi_memory_enable_device() always fails
> > as follows:
> > 
> > ...
> > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3

It appears that each memory object only has 64MB of memory.  This is
less than the memory block size, which is 128MB.  This means that a
single memory block associates with two ACPI memory device objects.

> > ...	
> > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> 
> Well, the only new way acpi_memory_enable_device() can fail after that commit
> is a failure in acpi_bind_memory_blocks().

Agreed.

> This means that either handle is NULL, which I think we can exclude, because
> acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> there's a more subtle error in acpi_bind_one().
> 
> One that comes to mind is that we may be calling acpi_bind_one() twice for the
> same memory region, in which it will trigger -EINVAL from the sanity check in
> there.

I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
handle is already set\n").  When two ACPI memory objects associate with
a same memory block, the bind procedure of the 2nd ACPI memory object
sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-02 21:46   ` Toshi Kani
@ 2013-08-02 23:43     ` Rafael J. Wysocki
  2013-08-03  0:04       ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-02 23:43 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > Thanks for your report.
> > 
> > On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > > By following commit, I cannot hot remove a memory device.
> > > 
> > > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > > commit e2ff39400d81233374e780b133496a2296643d7d
> > > 
> > > Details are follows:
> > > When I add a memory device, acpi_memory_enable_device() always fails
> > > as follows:
> > > 
> > > ...
> > > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> 
> It appears that each memory object only has 64MB of memory.  This is
> less than the memory block size, which is 128MB.  This means that a
> single memory block associates with two ACPI memory device objects.

That'd be bad.

How did that work before if that indeed is the case?

> > > ...	
> > > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> > 
> > Well, the only new way acpi_memory_enable_device() can fail after that commit
> > is a failure in acpi_bind_memory_blocks().
> 
> Agreed.
> 
> > This means that either handle is NULL, which I think we can exclude, because
> > acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> > there's a more subtle error in acpi_bind_one().
> > 
> > One that comes to mind is that we may be calling acpi_bind_one() twice for the
> > same memory region, in which it will trigger -EINVAL from the sanity check in
> > there.
> 
> I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> handle is already set\n").  When two ACPI memory objects associate with
> a same memory block, the bind procedure of the 2nd ACPI memory object
> sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.

That sound's plausible, but I wonder how we can fix that?

There's no way for a single physical device to have two different ACPI
"companions".  It looks like the memory blocks should be 64 M each in that
case.  Or we need to create two child devices for each memory block and
associate each of them with an ACPI object.  That would lead to complications
in the user space interface, though.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-02 23:43     ` Rafael J. Wysocki
@ 2013-08-03  0:04       ` Toshi Kani
  2013-08-03  1:01         ` Rafael J. Wysocki
  2013-08-08 17:15         ` Cannot hot remove a memory device Toshi Kani
  0 siblings, 2 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-03  0:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > Hi,
> > > 
> > > Thanks for your report.
> > > 
> > > On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > > > By following commit, I cannot hot remove a memory device.
> > > > 
> > > > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > > > commit e2ff39400d81233374e780b133496a2296643d7d
> > > > 
> > > > Details are follows:
> > > > When I add a memory device, acpi_memory_enable_device() always fails
> > > > as follows:
> > > > 
> > > > ...
> > > > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > > > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > > > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > > > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > > > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > > > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > > > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > > > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > > > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > > > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > > > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > > > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > > > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> > 
> > It appears that each memory object only has 64MB of memory.  This is
> > less than the memory block size, which is 128MB.  This means that a
> > single memory block associates with two ACPI memory device objects.
> 
> That'd be bad.
> 
> How did that work before if that indeed is the case?

Well, it looks to me that it has never worked before...

> > > > ...	
> > > > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> > > 
> > > Well, the only new way acpi_memory_enable_device() can fail after that commit
> > > is a failure in acpi_bind_memory_blocks().
> > 
> > Agreed.
> > 
> > > This means that either handle is NULL, which I think we can exclude, because
> > > acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> > > there's a more subtle error in acpi_bind_one().
> > > 
> > > One that comes to mind is that we may be calling acpi_bind_one() twice for the
> > > same memory region, in which it will trigger -EINVAL from the sanity check in
> > > there.
> > 
> > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > handle is already set\n").  When two ACPI memory objects associate with
> > a same memory block, the bind procedure of the 2nd ACPI memory object
> > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> 
> That sound's plausible, but I wonder how we can fix that?
> 
> There's no way for a single physical device to have two different ACPI
> "companions".  It looks like the memory blocks should be 64 M each in that
> case.  Or we need to create two child devices for each memory block and
> associate each of them with an ACPI object.  That would lead to complications
> in the user space interface, though.

Right.  Even bigger issue is that I do not think __add_pages() and
__remove_pages() can add / delete a memory chunk that is less than
128MB.  128MB is the granularity of them.  So, we may just have to fail
this case gracefully.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-03  0:04       ` Toshi Kani
@ 2013-08-03  1:01         ` Rafael J. Wysocki
  2013-08-04  0:37           ` Toshi Kani
  2013-08-08 17:15         ` Cannot hot remove a memory device Toshi Kani
  1 sibling, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-03  1:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
> On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > Hi,
> > > > 
> > > > Thanks for your report.
> > > > 
> > > > On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > > > > By following commit, I cannot hot remove a memory device.
> > > > > 
> > > > > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > > > > commit e2ff39400d81233374e780b133496a2296643d7d
> > > > > 
> > > > > Details are follows:
> > > > > When I add a memory device, acpi_memory_enable_device() always fails
> > > > > as follows:
> > > > > 
> > > > > ...
> > > > > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > > > > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > > > > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > > > > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > > > > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > > > > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > > > > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > > > > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > > > > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > > > > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > > > > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > > > > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > > > > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> > > 
> > > It appears that each memory object only has 64MB of memory.  This is
> > > less than the memory block size, which is 128MB.  This means that a
> > > single memory block associates with two ACPI memory device objects.
> > 
> > That'd be bad.
> > 
> > How did that work before if that indeed is the case?
> 
> Well, it looks to me that it has never worked before...
> 
> > > > > ...	
> > > > > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> > > > 
> > > > Well, the only new way acpi_memory_enable_device() can fail after that commit
> > > > is a failure in acpi_bind_memory_blocks().
> > > 
> > > Agreed.
> > > 
> > > > This means that either handle is NULL, which I think we can exclude, because
> > > > acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> > > > there's a more subtle error in acpi_bind_one().
> > > > 
> > > > One that comes to mind is that we may be calling acpi_bind_one() twice for the
> > > > same memory region, in which it will trigger -EINVAL from the sanity check in
> > > > there.
> > > 
> > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > handle is already set\n").  When two ACPI memory objects associate with
> > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > 
> > That sound's plausible, but I wonder how we can fix that?
> > 
> > There's no way for a single physical device to have two different ACPI
> > "companions".  It looks like the memory blocks should be 64 M each in that
> > case.  Or we need to create two child devices for each memory block and
> > associate each of them with an ACPI object.  That would lead to complications
> > in the user space interface, though.
> 
> Right.  Even bigger issue is that I do not think __add_pages() and
> __remove_pages() can add / delete a memory chunk that is less than
> 128MB.  128MB is the granularity of them.  So, we may just have to fail
> this case gracefully.

Sigh.

BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?

Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-03  1:01         ` Rafael J. Wysocki
@ 2013-08-04  0:37           ` Toshi Kani
  2013-08-04 14:12             ` Rafael J. Wysocki
  2013-08-05  4:00             ` Yasuaki Ishimatsu
  0 siblings, 2 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-04  0:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Sat, 2013-08-03 at 03:01 +0200, Rafael J. Wysocki wrote:
> On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
> > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > > Hi,
> > > > > 
> > > > > Thanks for your report.
> > > > > 
> > > > > On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > > > > > By following commit, I cannot hot remove a memory device.
> > > > > > 
> > > > > > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > > > > > commit e2ff39400d81233374e780b133496a2296643d7d
> > > > > > 
> > > > > > Details are follows:
> > > > > > When I add a memory device, acpi_memory_enable_device() always fails
> > > > > > as follows:
> > > > > > 
> > > > > > ...
> > > > > > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > > > > > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > > > > > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > > > > > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > > > > > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > > > > > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > > > > > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > > > > > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > > > > > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > > > > > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > > > > > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > > > > > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > > > > > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> > > > 
> > > > It appears that each memory object only has 64MB of memory.  This is
> > > > less than the memory block size, which is 128MB.  This means that a
> > > > single memory block associates with two ACPI memory device objects.
> > > 
> > > That'd be bad.
> > > 
> > > How did that work before if that indeed is the case?
> > 
> > Well, it looks to me that it has never worked before...
> > 
> > > > > > ...	
> > > > > > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> > > > > 
> > > > > Well, the only new way acpi_memory_enable_device() can fail after that commit
> > > > > is a failure in acpi_bind_memory_blocks().
> > > > 
> > > > Agreed.
> > > > 
> > > > > This means that either handle is NULL, which I think we can exclude, because
> > > > > acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> > > > > there's a more subtle error in acpi_bind_one().
> > > > > 
> > > > > One that comes to mind is that we may be calling acpi_bind_one() twice for the
> > > > > same memory region, in which it will trigger -EINVAL from the sanity check in
> > > > > there.
> > > > 
> > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > 
> > > That sound's plausible, but I wonder how we can fix that?
> > > 
> > > There's no way for a single physical device to have two different ACPI
> > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > case.  Or we need to create two child devices for each memory block and
> > > associate each of them with an ACPI object.  That would lead to complications
> > > in the user space interface, though.
> > 
> > Right.  Even bigger issue is that I do not think __add_pages() and
> > __remove_pages() can add / delete a memory chunk that is less than
> > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > this case gracefully.
> 
> Sigh.
> 
> BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?

Oops!  Sorry, I had confused the above messages with the one in
init_memory_mapping(), which shows a memory range being added, i.e. the
size of an ACPI memory device object.  But the above messages actually
came from vmemmap_populate_hugepages(), which was called during boot-up.
So, these messages have nothing to do with ACPI memory device objects.
And even worse, I do not seem to be able to count a number of zeros...
In the above messages, each memory range is 4MB (0x400000), not 64MB
(0x4000000)...  My bad. :-(

So, while we may still need to do something for the less-than-128MB
issue, Yasuaki may be hitting a different one.  Let's wait for Yasuaki
to give us more info.

Thanks,
-Toshi










^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-04  0:37           ` Toshi Kani
@ 2013-08-04 14:12             ` Rafael J. Wysocki
  2013-08-05  4:00             ` Yasuaki Ishimatsu
  1 sibling, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-04 14:12 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Saturday, August 03, 2013 06:37:26 PM Toshi Kani wrote:
> On Sat, 2013-08-03 at 03:01 +0200, Rafael J. Wysocki wrote:
> > On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
> > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > Thanks for your report.
> > > > > > 
> > > > > > On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> > > > > > > By following commit, I cannot hot remove a memory device.
> > > > > > > 
> > > > > > > ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> > > > > > > commit e2ff39400d81233374e780b133496a2296643d7d
> > > > > > > 
> > > > > > > Details are follows:
> > > > > > > When I add a memory device, acpi_memory_enable_device() always fails
> > > > > > > as follows:
> > > > > > > 
> > > > > > > ...
> > > > > > > [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> > > > > > > [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> > > > > > > [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> > > > > > > [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> > > > > > > [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> > > > > > > [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> > > > > > > [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> > > > > > > [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> > > > > > > [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> > > > > > > [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> > > > > > > [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> > > > > > > [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> > > > > > > [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> > > > > 
> > > > > It appears that each memory object only has 64MB of memory.  This is
> > > > > less than the memory block size, which is 128MB.  This means that a
> > > > > single memory block associates with two ACPI memory device objects.
> > > > 
> > > > That'd be bad.
> > > > 
> > > > How did that work before if that indeed is the case?
> > > 
> > > Well, it looks to me that it has never worked before...
> > > 
> > > > > > > ...	
> > > > > > > [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> > > > > > 
> > > > > > Well, the only new way acpi_memory_enable_device() can fail after that commit
> > > > > > is a failure in acpi_bind_memory_blocks().
> > > > > 
> > > > > Agreed.
> > > > > 
> > > > > > This means that either handle is NULL, which I think we can exclude, because
> > > > > > acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> > > > > > there's a more subtle error in acpi_bind_one().
> > > > > > 
> > > > > > One that comes to mind is that we may be calling acpi_bind_one() twice for the
> > > > > > same memory region, in which it will trigger -EINVAL from the sanity check in
> > > > > > there.
> > > > > 
> > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > 
> > > > That sound's plausible, but I wonder how we can fix that?
> > > > 
> > > > There's no way for a single physical device to have two different ACPI
> > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > case.  Or we need to create two child devices for each memory block and
> > > > associate each of them with an ACPI object.  That would lead to complications
> > > > in the user space interface, though.
> > > 
> > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > __remove_pages() can add / delete a memory chunk that is less than
> > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > this case gracefully.
> > 
> > Sigh.
> > 
> > BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?
> 
> Oops!  Sorry, I had confused the above messages with the one in
> init_memory_mapping(), which shows a memory range being added, i.e. the
> size of an ACPI memory device object.  But the above messages actually
> came from vmemmap_populate_hugepages(), which was called during boot-up.
> So, these messages have nothing to do with ACPI memory device objects.
> And even worse, I do not seem to be able to count a number of zeros...
> In the above messages, each memory range is 4MB (0x400000), not 64MB
> (0x4000000)...  My bad. :-(

Well, no problem. :-)

> So, while we may still need to do something for the less-than-128MB
> issue, Yasuaki may be hitting a different one.  Let's wait for Yasuaki
> to give us more info.

My somewhat educated guess is that ACPI_MAX_PHYSICAL_NODE is too low for that
particular system, but that only is a guess.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-04  0:37           ` Toshi Kani
  2013-08-04 14:12             ` Rafael J. Wysocki
@ 2013-08-05  4:00             ` Yasuaki Ishimatsu
  2013-08-05  7:59               ` Yasuaki Ishimatsu
  1 sibling, 1 reply; 27+ messages in thread
From: Yasuaki Ishimatsu @ 2013-08-05  4:00 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Rafael J. Wysocki, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

(2013/08/04 9:37), Toshi Kani wrote:
> On Sat, 2013-08-03 at 03:01 +0200, Rafael J. Wysocki wrote:
>> On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
>>> On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
>>>> On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
>>>>> On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for your report.
>>>>>>
>>>>>> On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
>>>>>>> By following commit, I cannot hot remove a memory device.
>>>>>>>
>>>>>>> ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
>>>>>>> commit e2ff39400d81233374e780b133496a2296643d7d
>>>>>>>
>>>>>>> Details are follows:
>>>>>>> When I add a memory device, acpi_memory_enable_device() always fails
>>>>>>> as follows:
>>>>>>>
>>>>>>> ...
>>>>>>> [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
>>>>>>> [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
>>>>>>> [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
>>>>>>> [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
>>>>>>> [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
>>>>>>> [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
>>>>>>> [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
>>>>>>> [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
>>>>>>> [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
>>>>>>> [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
>>>>>>> [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
>>>>>>> [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
>>>>>>> [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
>>>>>
>>>>> It appears that each memory object only has 64MB of memory.  This is
>>>>> less than the memory block size, which is 128MB.  This means that a
>>>>> single memory block associates with two ACPI memory device objects.
>>>>
>>>> That'd be bad.
>>>>
>>>> How did that work before if that indeed is the case?
>>>
>>> Well, it looks to me that it has never worked before...
>>>
>>>>>>> ...	
>>>>>>> [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
>>>>>>
>>>>>> Well, the only new way acpi_memory_enable_device() can fail after that commit
>>>>>> is a failure in acpi_bind_memory_blocks().
>>>>>
>>>>> Agreed.
>>>>>
>>>>>> This means that either handle is NULL, which I think we can exclude, because
>>>>>> acpi_memory_enable_device() wouldn't be called at all if that were the case, or
>>>>>> there's a more subtle error in acpi_bind_one().
>>>>>>
>>>>>> One that comes to mind is that we may be calling acpi_bind_one() twice for the
>>>>>> same memory region, in which it will trigger -EINVAL from the sanity check in
>>>>>> there.
>>>>>
>>>>> I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
>>>>> handle is already set\n").  When two ACPI memory objects associate with
>>>>> a same memory block, the bind procedure of the 2nd ACPI memory object
>>>>> sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
>>>>
>>>> That sound's plausible, but I wonder how we can fix that?
>>>>
>>>> There's no way for a single physical device to have two different ACPI
>>>> "companions".  It looks like the memory blocks should be 64 M each in that
>>>> case.  Or we need to create two child devices for each memory block and
>>>> associate each of them with an ACPI object.  That would lead to complications
>>>> in the user space interface, though.
>>>
>>> Right.  Even bigger issue is that I do not think __add_pages() and
>>> __remove_pages() can add / delete a memory chunk that is less than
>>> 128MB.  128MB is the granularity of them.  So, we may just have to fail
>>> this case gracefully.
>>
>> Sigh.
>>
>> BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?
>
> Oops!  Sorry, I had confused the above messages with the one in
> init_memory_mapping(), which shows a memory range being added, i.e. the
> size of an ACPI memory device object.  But the above messages actually
> came from vmemmap_populate_hugepages(), which was called during boot-up.
> So, these messages have nothing to do with ACPI memory device objects.
> And even worse, I do not seem to be able to count a number of zeros...
> In the above messages, each memory range is 4MB (0x400000), not 64MB
> (0x4000000)...  My bad. :-(
>
> So, while we may still need to do something for the less-than-128MB
> issue, Yasuaki may be hitting a different one.  Let's wait for Yasuaki
> to give us more info.

acpi_bind_memory_blocks() failed with -ENOSPC.

int acpi_bind_one(struct device *dev, acpi_handle handle)
{
...
	/* allocate physical node id according to physical_node_id_bitmap */
	physical_node->node_id =
		find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
		ACPI_MAX_PHYSICAL_NODE);
	if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
		retval = -ENOSPC; => here
		goto err_free;
	}

When adding memory device, acpi_bind_memroy_blocks() calls acpi_bind_one()
"memory device size / 128MiB" times. So ACPI_MAX_PHYSICAL_NODE need to
be set "memory device size / 128MiB" or more. But ACPI_MAX_PHYSICAL_NODE is 32.
So acpi_bind_memory_blocks() always failed with -ENOSPC.

I'll test again after increasing ACPI_MAX_PHYSICAL_NODE to enough size.

Thanks,
Yasuaki Ishimatsu

>
> Thanks,
> -Toshi
>
>
>
>
>
>
>
>
>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-05  4:00             ` Yasuaki Ishimatsu
@ 2013-08-05  7:59               ` Yasuaki Ishimatsu
  2013-08-05 13:14                 ` Cannot hot remove a memory device (patch) Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Yasuaki Ishimatsu @ 2013-08-05  7:59 UTC (permalink / raw)
  To: Toshi Kani, Rafael J. Wysocki
  Cc: rafael.j.wysocki, vasilis.liaskovitis, linux-kernel, linux-acpi,
	tangchen, wency

(2013/08/05 13:00), Yasuaki Ishimatsu wrote:
> (2013/08/04 9:37), Toshi Kani wrote:
>> On Sat, 2013-08-03 at 03:01 +0200, Rafael J. Wysocki wrote:
>>> On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
>>>> On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
>>>>> On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
>>>>>> On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your report.
>>>>>>>
>>>>>>> On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
>>>>>>>> By following commit, I cannot hot remove a memory device.
>>>>>>>>
>>>>>>>> ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
>>>>>>>> commit e2ff39400d81233374e780b133496a2296643d7d
>>>>>>>>
>>>>>>>> Details are follows:
>>>>>>>> When I add a memory device, acpi_memory_enable_device() always fails
>>>>>>>> as follows:
>>>>>>>>
>>>>>>>> ...
>>>>>>>> [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
>>>>>>>> [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
>>>>>>>> [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
>>>>>>>> [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
>>>>>>>> [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
>>>>>>>> [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
>>>>>>>> [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
>>>>>>>> [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
>>>>>>>> [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
>>>>>>>> [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
>>>>>>>> [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
>>>>>>>> [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
>>>>>>>> [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
>>>>>>
>>>>>> It appears that each memory object only has 64MB of memory.  This is
>>>>>> less than the memory block size, which is 128MB.  This means that a
>>>>>> single memory block associates with two ACPI memory device objects.
>>>>>
>>>>> That'd be bad.
>>>>>
>>>>> How did that work before if that indeed is the case?
>>>>
>>>> Well, it looks to me that it has never worked before...
>>>>
>>>>>>>> ...
>>>>>>>> [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
>>>>>>>
>>>>>>> Well, the only new way acpi_memory_enable_device() can fail after that commit
>>>>>>> is a failure in acpi_bind_memory_blocks().
>>>>>>
>>>>>> Agreed.
>>>>>>
>>>>>>> This means that either handle is NULL, which I think we can exclude, because
>>>>>>> acpi_memory_enable_device() wouldn't be called at all if that were the case, or
>>>>>>> there's a more subtle error in acpi_bind_one().
>>>>>>>
>>>>>>> One that comes to mind is that we may be calling acpi_bind_one() twice for the
>>>>>>> same memory region, in which it will trigger -EINVAL from the sanity check in
>>>>>>> there.
>>>>>>
>>>>>> I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
>>>>>> handle is already set\n").  When two ACPI memory objects associate with
>>>>>> a same memory block, the bind procedure of the 2nd ACPI memory object
>>>>>> sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
>>>>>
>>>>> That sound's plausible, but I wonder how we can fix that?
>>>>>
>>>>> There's no way for a single physical device to have two different ACPI
>>>>> "companions".  It looks like the memory blocks should be 64 M each in that
>>>>> case.  Or we need to create two child devices for each memory block and
>>>>> associate each of them with an ACPI object.  That would lead to complications
>>>>> in the user space interface, though.
>>>>
>>>> Right.  Even bigger issue is that I do not think __add_pages() and
>>>> __remove_pages() can add / delete a memory chunk that is less than
>>>> 128MB.  128MB is the granularity of them.  So, we may just have to fail
>>>> this case gracefully.
>>>
>>> Sigh.
>>>
>>> BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?
>>
>> Oops!  Sorry, I had confused the above messages with the one in
>> init_memory_mapping(), which shows a memory range being added, i.e. the
>> size of an ACPI memory device object.  But the above messages actually
>> came from vmemmap_populate_hugepages(), which was called during boot-up.
>> So, these messages have nothing to do with ACPI memory device objects.
>> And even worse, I do not seem to be able to count a number of zeros...
>> In the above messages, each memory range is 4MB (0x400000), not 64MB
>> (0x4000000)...  My bad. :-(
>>
>> So, while we may still need to do something for the less-than-128MB
>> issue, Yasuaki may be hitting a different one.  Let's wait for Yasuaki
>> to give us more info.
>
> acpi_bind_memory_blocks() failed with -ENOSPC.
>
> int acpi_bind_one(struct device *dev, acpi_handle handle)
> {
> ...
>      /* allocate physical node id according to physical_node_id_bitmap */
>      physical_node->node_id =
>          find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
>          ACPI_MAX_PHYSICAL_NODE);
>      if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
>          retval = -ENOSPC; => here
>          goto err_free;
>      }
>
> When adding memory device, acpi_bind_memroy_blocks() calls acpi_bind_one()
> "memory device size / 128MiB" times. So ACPI_MAX_PHYSICAL_NODE need to
> be set "memory device size / 128MiB" or more. But ACPI_MAX_PHYSICAL_NODE is 32.
> So acpi_bind_memory_blocks() always failed with -ENOSPC.
>

> I'll test again after increasing ACPI_MAX_PHYSICAL_NODE to enough size.

Additional info:
When I increased ACPI_MAX_PHYSICAL_NODE to 1024 and change size of
acpi_device_physical_node->node_it into u32, I could hot remove memory
device. But even if ACPI_MAX_PHYSICAL_NODE is set to 1024, same problem
will occurs since it just supports 124GiB memory. So we need a way to change
ACPI_MAX_PHYSICAL_NODE dynamically.

Thanks,
Yasuaki Ishimatsu

>
> Thanks,
> Yasuaki Ishimatsu
>
>>
>> Thanks,
>> -Toshi
>>
>>
>>
>>
>>
>>
>>
>>
>>
>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch)
  2013-08-05  7:59               ` Yasuaki Ishimatsu
@ 2013-08-05 13:14                 ` Rafael J. Wysocki
  2013-08-05 23:19                   ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-05 13:14 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Toshi Kani, rafael.j.wysocki, vasilis.liaskovitis, linux-kernel,
	linux-acpi, tangchen, wency

On Monday, August 05, 2013 04:59:20 PM Yasuaki Ishimatsu wrote:
> (2013/08/05 13:00), Yasuaki Ishimatsu wrote:
> > (2013/08/04 9:37), Toshi Kani wrote:
> >> On Sat, 2013-08-03 at 03:01 +0200, Rafael J. Wysocki wrote:
> >>> On Friday, August 02, 2013 06:04:40 PM Toshi Kani wrote:
> >>>> On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> >>>>> On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> >>>>>> On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Thanks for your report.
> >>>>>>>
> >>>>>>> On Thursday, August 01, 2013 05:37:21 PM Yasuaki Ishimatsu wrote:
> >>>>>>>> By following commit, I cannot hot remove a memory device.
> >>>>>>>>
> >>>>>>>> ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
> >>>>>>>> commit e2ff39400d81233374e780b133496a2296643d7d
> >>>>>>>>
> >>>>>>>> Details are follows:
> >>>>>>>> When I add a memory device, acpi_memory_enable_device() always fails
> >>>>>>>> as follows:
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>> [ 1271.114116]  [ffffea121c400000-ffffea121c7fffff] PMD -> [ffff880813c00000-ffff880813ffffff] on node 3
> >>>>>>>> [ 1271.128682]  [ffffea121c800000-ffffea121cbfffff] PMD -> [ffff880813800000-ffff880813bfffff] on node 3
> >>>>>>>> [ 1271.143298]  [ffffea121cc00000-ffffea121cffffff] PMD -> [ffff880813000000-ffff8808133fffff] on node 3
> >>>>>>>> [ 1271.157799]  [ffffea121d000000-ffffea121d3fffff] PMD -> [ffff880812c00000-ffff880812ffffff] on node 3
> >>>>>>>> [ 1271.172341]  [ffffea121d400000-ffffea121d7fffff] PMD -> [ffff880812800000-ffff880812bfffff] on node 3
> >>>>>>>> [ 1271.186872]  [ffffea121d800000-ffffea121dbfffff] PMD -> [ffff880812400000-ffff8808127fffff] on node 3
> >>>>>>>> [ 1271.201481]  [ffffea121dc00000-ffffea121dffffff] PMD -> [ffff880812000000-ffff8808123fffff] on node 3
> >>>>>>>> [ 1271.216041]  [ffffea121e000000-ffffea121e3fffff] PMD -> [ffff880811c00000-ffff880811ffffff] on node 3
> >>>>>>>> [ 1271.230623]  [ffffea121e400000-ffffea121e7fffff] PMD -> [ffff880811800000-ffff880811bfffff] on node 3
> >>>>>>>> [ 1271.245148]  [ffffea121e800000-ffffea121ebfffff] PMD -> [ffff880811400000-ffff8808117fffff] on node 3
> >>>>>>>> [ 1271.259683]  [ffffea121ec00000-ffffea121effffff] PMD -> [ffff880811000000-ffff8808113fffff] on node 3
> >>>>>>>> [ 1271.274194]  [ffffea121f000000-ffffea121f3fffff] PMD -> [ffff880810c00000-ffff880810ffffff] on node 3
> >>>>>>>> [ 1271.288764]  [ffffea121f400000-ffffea121f7fffff] PMD -> [ffff880810800000-ffff880810bfffff] on node 3
> >>>>>>
> >>>>>> It appears that each memory object only has 64MB of memory.  This is
> >>>>>> less than the memory block size, which is 128MB.  This means that a
> >>>>>> single memory block associates with two ACPI memory device objects.
> >>>>>
> >>>>> That'd be bad.
> >>>>>
> >>>>> How did that work before if that indeed is the case?
> >>>>
> >>>> Well, it looks to me that it has never worked before...
> >>>>
> >>>>>>>> ...
> >>>>>>>> [ 1271.325841] acpi PNP0C80:03: acpi_memory_enable_device() error
> >>>>>>>
> >>>>>>> Well, the only new way acpi_memory_enable_device() can fail after that commit
> >>>>>>> is a failure in acpi_bind_memory_blocks().
> >>>>>>
> >>>>>> Agreed.
> >>>>>>
> >>>>>>> This means that either handle is NULL, which I think we can exclude, because
> >>>>>>> acpi_memory_enable_device() wouldn't be called at all if that were the case, or
> >>>>>>> there's a more subtle error in acpi_bind_one().
> >>>>>>>
> >>>>>>> One that comes to mind is that we may be calling acpi_bind_one() twice for the
> >>>>>>> same memory region, in which it will trigger -EINVAL from the sanity check in
> >>>>>>> there.
> >>>>>>
> >>>>>> I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> >>>>>> handle is already set\n").  When two ACPI memory objects associate with
> >>>>>> a same memory block, the bind procedure of the 2nd ACPI memory object
> >>>>>> sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> >>>>>
> >>>>> That sound's plausible, but I wonder how we can fix that?
> >>>>>
> >>>>> There's no way for a single physical device to have two different ACPI
> >>>>> "companions".  It looks like the memory blocks should be 64 M each in that
> >>>>> case.  Or we need to create two child devices for each memory block and
> >>>>> associate each of them with an ACPI object.  That would lead to complications
> >>>>> in the user space interface, though.
> >>>>
> >>>> Right.  Even bigger issue is that I do not think __add_pages() and
> >>>> __remove_pages() can add / delete a memory chunk that is less than
> >>>> 128MB.  128MB is the granularity of them.  So, we may just have to fail
> >>>> this case gracefully.
> >>>
> >>> Sigh.
> >>>
> >>> BTW, why do you think they are 64 M each (it's late and I'm obviously tired)?
> >>
> >> Oops!  Sorry, I had confused the above messages with the one in
> >> init_memory_mapping(), which shows a memory range being added, i.e. the
> >> size of an ACPI memory device object.  But the above messages actually
> >> came from vmemmap_populate_hugepages(), which was called during boot-up.
> >> So, these messages have nothing to do with ACPI memory device objects.
> >> And even worse, I do not seem to be able to count a number of zeros...
> >> In the above messages, each memory range is 4MB (0x400000), not 64MB
> >> (0x4000000)...  My bad. :-(
> >>
> >> So, while we may still need to do something for the less-than-128MB
> >> issue, Yasuaki may be hitting a different one.  Let's wait for Yasuaki
> >> to give us more info.
> >
> > acpi_bind_memory_blocks() failed with -ENOSPC.
> >
> > int acpi_bind_one(struct device *dev, acpi_handle handle)
> > {
> > ...
> >      /* allocate physical node id according to physical_node_id_bitmap */
> >      physical_node->node_id =
> >          find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
> >          ACPI_MAX_PHYSICAL_NODE);
> >      if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
> >          retval = -ENOSPC; => here
> >          goto err_free;
> >      }
> >
> > When adding memory device, acpi_bind_memroy_blocks() calls acpi_bind_one()
> > "memory device size / 128MiB" times. So ACPI_MAX_PHYSICAL_NODE need to
> > be set "memory device size / 128MiB" or more. But ACPI_MAX_PHYSICAL_NODE is 32.
> > So acpi_bind_memory_blocks() always failed with -ENOSPC.
> >
> 
> > I'll test again after increasing ACPI_MAX_PHYSICAL_NODE to enough size.
> 
> Additional info:
> When I increased ACPI_MAX_PHYSICAL_NODE to 1024 and change size of
> acpi_device_physical_node->node_it into u32, I could hot remove memory
> device. But even if ACPI_MAX_PHYSICAL_NODE is set to 1024, same problem
> will occurs since it just supports 124GiB memory. So we need a way to change
> ACPI_MAX_PHYSICAL_NODE dynamically.

Can you please test the appended patch?  I tested it somewhat, but since the
greatest number of physical nodes per ACPI device object I can get on my test
machines is 2 (and even that after hacking the kernel somewhat), that was kind
of unconclusive.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device

The physical_node_id_bitmap in struct acpi_device is only used for
looking up the first currently unused phyiscal dependent node ID
by acpi_bind_one().  It is not really necessary, however, because
acpi_bind_one() walks the entire physical_node_list of the given
device object for sanity checking anyway and if that list is always
sorted by node_id, it is straightforward to find the first gap
between the currently used node IDs and use that number as the ID
of the new list node.

This also removes the artificial limit of the maximum number of
dependent physical devices per ACPI device object, which now depends
only on the capacity of unsigend int.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/glue.c     |   31 +++++++++++++++++--------------
 include/acpi/acpi_bus.h |    8 ++------
 2 files changed, 19 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/acpi/glue.c
===================================================================
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -112,7 +112,9 @@ int acpi_bind_one(struct device *dev, ac
 	struct acpi_device *acpi_dev;
 	acpi_status status;
 	struct acpi_device_physical_node *physical_node, *pn;
-	char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 2];
+	struct list_head *physnode_list;
+	unsigned int node_id;
+	char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 10];
 	int retval = -EINVAL;
 
 	if (ACPI_HANDLE(dev)) {
@@ -139,8 +141,14 @@ int acpi_bind_one(struct device *dev, ac
 
 	mutex_lock(&acpi_dev->physical_node_lock);
 
-	/* Sanity check. */
-	list_for_each_entry(pn, &acpi_dev->physical_node_list, node)
+	/*
+	 * Keep the list sorted by node_id so that node_ids of removed nodes can be
+	 * recycled.
+	 */
+	physnode_list = &acpi_dev->physical_node_list;
+	node_id = 0;
+	list_for_each_entry(pn, &acpi_dev->physical_node_list, node) {
+		/* Sanity check. */
 		if (pn->dev == dev) {
 			dev_warn(dev, "Already associated with ACPI node\n");
 			if (ACPI_HANDLE(dev) == handle)
@@ -148,19 +156,15 @@ int acpi_bind_one(struct device *dev, ac
 
 			goto out_free;
 		}
-
-	/* allocate physical node id according to physical_node_id_bitmap */
-	physical_node->node_id =
-		find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
-		ACPI_MAX_PHYSICAL_NODE);
-	if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
-		retval = -ENOSPC;
-		goto out_free;
+		if (pn->node_id == node_id) {
+			physnode_list = &pn->node;
+			node_id++;
+		}
 	}
 
-	set_bit(physical_node->node_id, acpi_dev->physical_node_id_bitmap);
+	physical_node->node_id = node_id;
 	physical_node->dev = dev;
-	list_add_tail(&physical_node->node, &acpi_dev->physical_node_list);
+	list_add(&physical_node->node, physnode_list);
 	acpi_dev->physical_node_count++;
 
 	mutex_unlock(&acpi_dev->physical_node_lock);
@@ -223,7 +227,6 @@ int acpi_unbind_one(struct device *dev)
 			continue;
 
 		list_del(node);
-		clear_bit(entry->node_id, acpi_dev->physical_node_id_bitmap);
 
 		acpi_dev->physical_node_count--;
 
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -284,15 +284,12 @@ struct acpi_device_wakeup {
 };
 
 struct acpi_device_physical_node {
-	u8 node_id;
+	unsigned int node_id;
 	struct list_head node;
 	struct device *dev;
 	bool put_online:1;
 };
 
-/* set maximum of physical nodes to 32 for expansibility */
-#define ACPI_MAX_PHYSICAL_NODE	32
-
 /* Device */
 struct acpi_device {
 	int device_type;
@@ -312,10 +309,9 @@ struct acpi_device {
 	struct acpi_driver *driver;
 	void *driver_data;
 	struct device dev;
-	u8 physical_node_count;
+	unsigned int physical_node_count;
 	struct list_head physical_node_list;
 	struct mutex physical_node_lock;
-	DECLARE_BITMAP(physical_node_id_bitmap, ACPI_MAX_PHYSICAL_NODE);
 	struct list_head power_dependent;
 	void (*remove)(struct acpi_device *);
 };


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch)
  2013-08-05 13:14                 ` Cannot hot remove a memory device (patch) Rafael J. Wysocki
@ 2013-08-05 23:19                   ` Toshi Kani
  2013-08-06  0:15                     ` Cannot hot remove a memory device (patch, updated) Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Toshi Kani @ 2013-08-05 23:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Mon, 2013-08-05 at 15:14 +0200, Rafael J. Wysocki wrote:
  :
> Can you please test the appended patch?  I tested it somewhat, but since the
> greatest number of physical nodes per ACPI device object I can get on my test
> machines is 2 (and even that after hacking the kernel somewhat), that was kind
> of unconclusive.
> 
> Thanks,
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
> 
> The physical_node_id_bitmap in struct acpi_device is only used for
> looking up the first currently unused phyiscal dependent node ID
> by acpi_bind_one().  It is not really necessary, however, because
> acpi_bind_one() walks the entire physical_node_list of the given
> device object for sanity checking anyway and if that list is always
> sorted by node_id, it is straightforward to find the first gap
> between the currently used node IDs and use that number as the ID
> of the new list node.
> 
> This also removes the artificial limit of the maximum number of
> dependent physical devices per ACPI device object, which now depends
> only on the capacity of unsigend int.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I like the change. Much better :-)

Acked-by: Toshi Kani <toshi.kani@hp.com>

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch, updated)
  2013-08-05 23:19                   ` Toshi Kani
@ 2013-08-06  0:15                     ` Rafael J. Wysocki
  2013-08-06  2:12                       ` Yasuaki Ishimatsu
  2013-08-06 15:28                       ` Toshi Kani
  0 siblings, 2 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-06  0:15 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Monday, August 05, 2013 05:19:56 PM Toshi Kani wrote:
> On Mon, 2013-08-05 at 15:14 +0200, Rafael J. Wysocki wrote:
>   :
> > Can you please test the appended patch?  I tested it somewhat, but since the
> > greatest number of physical nodes per ACPI device object I can get on my test
> > machines is 2 (and even that after hacking the kernel somewhat), that was kind
> > of unconclusive.
> > 
> > Thanks,
> > Rafael
> > 
> > 
> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
> > 
> > The physical_node_id_bitmap in struct acpi_device is only used for
> > looking up the first currently unused phyiscal dependent node ID
> > by acpi_bind_one().  It is not really necessary, however, because
> > acpi_bind_one() walks the entire physical_node_list of the given
> > device object for sanity checking anyway and if that list is always
> > sorted by node_id, it is straightforward to find the first gap
> > between the currently used node IDs and use that number as the ID
> > of the new list node.
> > 
> > This also removes the artificial limit of the maximum number of
> > dependent physical devices per ACPI device object, which now depends
> > only on the capacity of unsigend int.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> I like the change. Much better :-)
> 
> Acked-by: Toshi Kani <toshi.kani@hp.com>

However, it introduces a bug in acpi_unbind_one(), because the size of the name
array in there has to be increased too.  Updated patch follows.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device

The physical_node_id_bitmap in struct acpi_device is only used for
looking up the first currently unused dependent phyiscal node ID
by acpi_bind_one().  It is not really necessary, however, because
acpi_bind_one() walks the entire physical_node_list of the given
device object for sanity checking anyway and if that list is always
sorted by node_id, it is straightforward to find the first gap
between the currently used node IDs and use that number as the ID
of the new list node.

This also removes the artificial limit of the maximum number of
dependent physical devices per ACPI device object, which now depends
only on the capacity of unsigend int.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/glue.c     |   34 +++++++++++++++++++---------------
 include/acpi/acpi_bus.h |    8 ++------
 2 files changed, 21 insertions(+), 21 deletions(-)

Index: linux-pm/drivers/acpi/glue.c
===================================================================
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -31,6 +31,7 @@ static LIST_HEAD(bus_type_list);
 static DECLARE_RWSEM(bus_type_sem);
 
 #define PHYSICAL_NODE_STRING "physical_node"
+#define PHYSICAL_NODE_NAME_SIZE (sizeof(PHYSICAL_NODE_STRING) + 10)
 
 int register_acpi_bus_type(struct acpi_bus_type *type)
 {
@@ -112,7 +113,9 @@ int acpi_bind_one(struct device *dev, ac
 	struct acpi_device *acpi_dev;
 	acpi_status status;
 	struct acpi_device_physical_node *physical_node, *pn;
-	char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 2];
+	char physical_node_name[PHYSICAL_NODE_NAME_SIZE];
+	struct list_head *physnode_list;
+	unsigned int node_id;
 	int retval = -EINVAL;
 
 	if (ACPI_HANDLE(dev)) {
@@ -139,8 +142,14 @@ int acpi_bind_one(struct device *dev, ac
 
 	mutex_lock(&acpi_dev->physical_node_lock);
 
-	/* Sanity check. */
-	list_for_each_entry(pn, &acpi_dev->physical_node_list, node)
+	/*
+	 * Keep the list sorted by node_id so that the IDs of removed nodes can
+	 * be recycled.
+	 */
+	physnode_list = &acpi_dev->physical_node_list;
+	node_id = 0;
+	list_for_each_entry(pn, &acpi_dev->physical_node_list, node) {
+		/* Sanity check. */
 		if (pn->dev == dev) {
 			dev_warn(dev, "Already associated with ACPI node\n");
 			if (ACPI_HANDLE(dev) == handle)
@@ -148,19 +157,15 @@ int acpi_bind_one(struct device *dev, ac
 
 			goto out_free;
 		}
-
-	/* allocate physical node id according to physical_node_id_bitmap */
-	physical_node->node_id =
-		find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
-		ACPI_MAX_PHYSICAL_NODE);
-	if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
-		retval = -ENOSPC;
-		goto out_free;
+		if (pn->node_id == node_id) {
+			physnode_list = &pn->node;
+			node_id++;
+		}
 	}
 
-	set_bit(physical_node->node_id, acpi_dev->physical_node_id_bitmap);
+	physical_node->node_id = node_id;
 	physical_node->dev = dev;
-	list_add_tail(&physical_node->node, &acpi_dev->physical_node_list);
+	list_add(&physical_node->node, physnode_list);
 	acpi_dev->physical_node_count++;
 
 	mutex_unlock(&acpi_dev->physical_node_lock);
@@ -215,7 +220,7 @@ int acpi_unbind_one(struct device *dev)
 
 	mutex_lock(&acpi_dev->physical_node_lock);
 	list_for_each_safe(node, next, &acpi_dev->physical_node_list) {
-		char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 2];
+		char physical_node_name[PHYSICAL_NODE_NAME_SIZE];
 
 		entry = list_entry(node, struct acpi_device_physical_node,
 			node);
@@ -223,7 +228,6 @@ int acpi_unbind_one(struct device *dev)
 			continue;
 
 		list_del(node);
-		clear_bit(entry->node_id, acpi_dev->physical_node_id_bitmap);
 
 		acpi_dev->physical_node_count--;
 
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -284,15 +284,12 @@ struct acpi_device_wakeup {
 };
 
 struct acpi_device_physical_node {
-	u8 node_id;
+	unsigned int node_id;
 	struct list_head node;
 	struct device *dev;
 	bool put_online:1;
 };
 
-/* set maximum of physical nodes to 32 for expansibility */
-#define ACPI_MAX_PHYSICAL_NODE	32
-
 /* Device */
 struct acpi_device {
 	int device_type;
@@ -312,10 +309,9 @@ struct acpi_device {
 	struct acpi_driver *driver;
 	void *driver_data;
 	struct device dev;
-	u8 physical_node_count;
+	unsigned int physical_node_count;
 	struct list_head physical_node_list;
 	struct mutex physical_node_lock;
-	DECLARE_BITMAP(physical_node_id_bitmap, ACPI_MAX_PHYSICAL_NODE);
 	struct list_head power_dependent;
 	void (*remove)(struct acpi_device *);
 };


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch, updated)
  2013-08-06  0:15                     ` Cannot hot remove a memory device (patch, updated) Rafael J. Wysocki
@ 2013-08-06  2:12                       ` Yasuaki Ishimatsu
  2013-08-06 14:17                         ` Rafael J. Wysocki
  2013-08-06 15:28                       ` Toshi Kani
  1 sibling, 1 reply; 27+ messages in thread
From: Yasuaki Ishimatsu @ 2013-08-06  2:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, rafael.j.wysocki, vasilis.liaskovitis, linux-kernel,
	linux-acpi, tangchen, wency

(2013/08/06 9:15), Rafael J. Wysocki wrote:
> On Monday, August 05, 2013 05:19:56 PM Toshi Kani wrote:
>> On Mon, 2013-08-05 at 15:14 +0200, Rafael J. Wysocki wrote:
>>    :
>>> Can you please test the appended patch?  I tested it somewhat, but since the
>>> greatest number of physical nodes per ACPI device object I can get on my test
>>> machines is 2 (and even that after hacking the kernel somewhat), that was kind
>>> of unconclusive.
>>>
>>> Thanks,
>>> Rafael
>>>
>>>
>>> ---
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
>>>
>>> The physical_node_id_bitmap in struct acpi_device is only used for
>>> looking up the first currently unused phyiscal dependent node ID
>>> by acpi_bind_one().  It is not really necessary, however, because
>>> acpi_bind_one() walks the entire physical_node_list of the given
>>> device object for sanity checking anyway and if that list is always
>>> sorted by node_id, it is straightforward to find the first gap
>>> between the currently used node IDs and use that number as the ID
>>> of the new list node.
>>>
>>> This also removes the artificial limit of the maximum number of
>>> dependent physical devices per ACPI device object, which now depends
>>> only on the capacity of unsigend int.
>>>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> I like the change. Much better :-)
>>
>> Acked-by: Toshi Kani <toshi.kani@hp.com>
>
> However, it introduces a bug in acpi_unbind_one(), because the size of the name
> array in there has to be increased too.  Updated patch follows.
>
> Thanks,
> Rafael
>
>
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
>
> The physical_node_id_bitmap in struct acpi_device is only used for
> looking up the first currently unused dependent phyiscal node ID
> by acpi_bind_one().  It is not really necessary, however, because
> acpi_bind_one() walks the entire physical_node_list of the given
> device object for sanity checking anyway and if that list is always
> sorted by node_id, it is straightforward to find the first gap
> between the currently used node IDs and use that number as the ID
> of the new list node.
>
> This also removes the artificial limit of the maximum number of
> dependent physical devices per ACPI device object, which now depends
> only on the capacity of unsigend int.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

I confirmed that I can add and remove a memory device correctly.

Thanks,
Yasuaki Ishimatsu

> ---
>   drivers/acpi/glue.c     |   34 +++++++++++++++++++---------------
>   include/acpi/acpi_bus.h |    8 ++------
>   2 files changed, 21 insertions(+), 21 deletions(-)
>
> Index: linux-pm/drivers/acpi/glue.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/glue.c
> +++ linux-pm/drivers/acpi/glue.c
> @@ -31,6 +31,7 @@ static LIST_HEAD(bus_type_list);
>   static DECLARE_RWSEM(bus_type_sem);
>
>   #define PHYSICAL_NODE_STRING "physical_node"
> +#define PHYSICAL_NODE_NAME_SIZE (sizeof(PHYSICAL_NODE_STRING) + 10)
>
>   int register_acpi_bus_type(struct acpi_bus_type *type)
>   {
> @@ -112,7 +113,9 @@ int acpi_bind_one(struct device *dev, ac
>   	struct acpi_device *acpi_dev;
>   	acpi_status status;
>   	struct acpi_device_physical_node *physical_node, *pn;
> -	char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 2];
> +	char physical_node_name[PHYSICAL_NODE_NAME_SIZE];
> +	struct list_head *physnode_list;
> +	unsigned int node_id;
>   	int retval = -EINVAL;
>
>   	if (ACPI_HANDLE(dev)) {
> @@ -139,8 +142,14 @@ int acpi_bind_one(struct device *dev, ac
>
>   	mutex_lock(&acpi_dev->physical_node_lock);
>
> -	/* Sanity check. */
> -	list_for_each_entry(pn, &acpi_dev->physical_node_list, node)
> +	/*
> +	 * Keep the list sorted by node_id so that the IDs of removed nodes can
> +	 * be recycled.
> +	 */
> +	physnode_list = &acpi_dev->physical_node_list;
> +	node_id = 0;
> +	list_for_each_entry(pn, &acpi_dev->physical_node_list, node) {
> +		/* Sanity check. */
>   		if (pn->dev == dev) {
>   			dev_warn(dev, "Already associated with ACPI node\n");
>   			if (ACPI_HANDLE(dev) == handle)
> @@ -148,19 +157,15 @@ int acpi_bind_one(struct device *dev, ac
>
>   			goto out_free;
>   		}
> -
> -	/* allocate physical node id according to physical_node_id_bitmap */
> -	physical_node->node_id =
> -		find_first_zero_bit(acpi_dev->physical_node_id_bitmap,
> -		ACPI_MAX_PHYSICAL_NODE);
> -	if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
> -		retval = -ENOSPC;
> -		goto out_free;
> +		if (pn->node_id == node_id) {
> +			physnode_list = &pn->node;
> +			node_id++;
> +		}
>   	}
>
> -	set_bit(physical_node->node_id, acpi_dev->physical_node_id_bitmap);
> +	physical_node->node_id = node_id;
>   	physical_node->dev = dev;
> -	list_add_tail(&physical_node->node, &acpi_dev->physical_node_list);
> +	list_add(&physical_node->node, physnode_list);
>   	acpi_dev->physical_node_count++;
>
>   	mutex_unlock(&acpi_dev->physical_node_lock);
> @@ -215,7 +220,7 @@ int acpi_unbind_one(struct device *dev)
>
>   	mutex_lock(&acpi_dev->physical_node_lock);
>   	list_for_each_safe(node, next, &acpi_dev->physical_node_list) {
> -		char physical_node_name[sizeof(PHYSICAL_NODE_STRING) + 2];
> +		char physical_node_name[PHYSICAL_NODE_NAME_SIZE];
>
>   		entry = list_entry(node, struct acpi_device_physical_node,
>   			node);
> @@ -223,7 +228,6 @@ int acpi_unbind_one(struct device *dev)
>   			continue;
>
>   		list_del(node);
> -		clear_bit(entry->node_id, acpi_dev->physical_node_id_bitmap);
>
>   		acpi_dev->physical_node_count--;
>
> Index: linux-pm/include/acpi/acpi_bus.h
> ===================================================================
> --- linux-pm.orig/include/acpi/acpi_bus.h
> +++ linux-pm/include/acpi/acpi_bus.h
> @@ -284,15 +284,12 @@ struct acpi_device_wakeup {
>   };
>
>   struct acpi_device_physical_node {
> -	u8 node_id;
> +	unsigned int node_id;
>   	struct list_head node;
>   	struct device *dev;
>   	bool put_online:1;
>   };
>
> -/* set maximum of physical nodes to 32 for expansibility */
> -#define ACPI_MAX_PHYSICAL_NODE	32
> -
>   /* Device */
>   struct acpi_device {
>   	int device_type;
> @@ -312,10 +309,9 @@ struct acpi_device {
>   	struct acpi_driver *driver;
>   	void *driver_data;
>   	struct device dev;
> -	u8 physical_node_count;
> +	unsigned int physical_node_count;
>   	struct list_head physical_node_list;
>   	struct mutex physical_node_lock;
> -	DECLARE_BITMAP(physical_node_id_bitmap, ACPI_MAX_PHYSICAL_NODE);
>   	struct list_head power_dependent;
>   	void (*remove)(struct acpi_device *);
>   };
>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch, updated)
  2013-08-06  2:12                       ` Yasuaki Ishimatsu
@ 2013-08-06 14:17                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-06 14:17 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Toshi Kani, rafael.j.wysocki, vasilis.liaskovitis, linux-kernel,
	linux-acpi, tangchen, wency

On Tuesday, August 06, 2013 11:12:51 AM Yasuaki Ishimatsu wrote:
> (2013/08/06 9:15), Rafael J. Wysocki wrote:
> > On Monday, August 05, 2013 05:19:56 PM Toshi Kani wrote:
> >> On Mon, 2013-08-05 at 15:14 +0200, Rafael J. Wysocki wrote:
> >>    :
> >>> Can you please test the appended patch?  I tested it somewhat, but since the
> >>> greatest number of physical nodes per ACPI device object I can get on my test
> >>> machines is 2 (and even that after hacking the kernel somewhat), that was kind
> >>> of unconclusive.
> >>>
> >>> Thanks,
> >>> Rafael
> >>>
> >>>
> >>> ---
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>> Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
> >>>
> >>> The physical_node_id_bitmap in struct acpi_device is only used for
> >>> looking up the first currently unused phyiscal dependent node ID
> >>> by acpi_bind_one().  It is not really necessary, however, because
> >>> acpi_bind_one() walks the entire physical_node_list of the given
> >>> device object for sanity checking anyway and if that list is always
> >>> sorted by node_id, it is straightforward to find the first gap
> >>> between the currently used node IDs and use that number as the ID
> >>> of the new list node.
> >>>
> >>> This also removes the artificial limit of the maximum number of
> >>> dependent physical devices per ACPI device object, which now depends
> >>> only on the capacity of unsigend int.
> >>>
> >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> I like the change. Much better :-)
> >>
> >> Acked-by: Toshi Kani <toshi.kani@hp.com>
> >
> > However, it introduces a bug in acpi_unbind_one(), because the size of the name
> > array in there has to be increased too.  Updated patch follows.
> >
> > Thanks,
> > Rafael
> >
> >
> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
> >
> > The physical_node_id_bitmap in struct acpi_device is only used for
> > looking up the first currently unused dependent phyiscal node ID
> > by acpi_bind_one().  It is not really necessary, however, because
> > acpi_bind_one() walks the entire physical_node_list of the given
> > device object for sanity checking anyway and if that list is always
> > sorted by node_id, it is straightforward to find the first gap
> > between the currently used node IDs and use that number as the ID
> > of the new list node.
> >
> > This also removes the artificial limit of the maximum number of
> > dependent physical devices per ACPI device object, which now depends
> > only on the capacity of unsigend int.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> 
> I confirmed that I can add and remove a memory device correctly.

Great, thanks for the confirmation!

Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device (patch, updated)
  2013-08-06  0:15                     ` Cannot hot remove a memory device (patch, updated) Rafael J. Wysocki
  2013-08-06  2:12                       ` Yasuaki Ishimatsu
@ 2013-08-06 15:28                       ` Toshi Kani
  1 sibling, 0 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-06 15:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Tue, 2013-08-06 at 02:15 +0200, Rafael J. Wysocki wrote:
> On Monday, August 05, 2013 05:19:56 PM Toshi Kani wrote:
> > On Mon, 2013-08-05 at 15:14 +0200, Rafael J. Wysocki wrote:
> >   :
> > > Can you please test the appended patch?  I tested it somewhat, but since the
> > > greatest number of physical nodes per ACPI device object I can get on my test
> > > machines is 2 (and even that after hacking the kernel somewhat), that was kind
> > > of unconclusive.
> > > 
> > > Thanks,
> > > Rafael
> > > 
> > > 
> > > ---
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Subject: ACPI: Drop physical_node_id_bitmap from struct acpi_device
> > > 
> > > The physical_node_id_bitmap in struct acpi_device is only used for
> > > looking up the first currently unused phyiscal dependent node ID
> > > by acpi_bind_one().  It is not really necessary, however, because
> > > acpi_bind_one() walks the entire physical_node_list of the given
> > > device object for sanity checking anyway and if that list is always
> > > sorted by node_id, it is straightforward to find the first gap
> > > between the currently used node IDs and use that number as the ID
> > > of the new list node.
> > > 
> > > This also removes the artificial limit of the maximum number of
> > > dependent physical devices per ACPI device object, which now depends
> > > only on the capacity of unsigend int.
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > I like the change. Much better :-)
> > 
> > Acked-by: Toshi Kani <toshi.kani@hp.com>
> 
> However, it introduces a bug in acpi_unbind_one(), because the size of the name
> array in there has to be increased too.  Updated patch follows.

Right.  Sorry, I overlooked this one.  Please apply my ask to this new
version. 

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-03  0:04       ` Toshi Kani
  2013-08-03  1:01         ` Rafael J. Wysocki
@ 2013-08-08 17:15         ` Toshi Kani
  2013-08-08 22:12           ` Rafael J. Wysocki
  1 sibling, 1 reply; 27+ messages in thread
From: Toshi Kani @ 2013-08-08 17:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
 :
> > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > handle is already set\n").  When two ACPI memory objects associate with
> > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > 
> > That sound's plausible, but I wonder how we can fix that?
> > 
> > There's no way for a single physical device to have two different ACPI
> > "companions".  It looks like the memory blocks should be 64 M each in that
> > case.  Or we need to create two child devices for each memory block and
> > associate each of them with an ACPI object.  That would lead to complications
> > in the user space interface, though.
> 
> Right.  Even bigger issue is that I do not think __add_pages() and
> __remove_pages() can add / delete a memory chunk that is less than
> 128MB.  128MB is the granularity of them.  So, we may just have to fail
> this case gracefully.

FYI: I have submitted the patch blow to close this part of the issue...

https://lkml.org/lkml/2013/8/8/396

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-08 17:15         ` Cannot hot remove a memory device Toshi Kani
@ 2013-08-08 22:12           ` Rafael J. Wysocki
  2013-08-08 22:50             ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-08 22:12 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
>  :
> > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > 
> > > That sound's plausible, but I wonder how we can fix that?
> > > 
> > > There's no way for a single physical device to have two different ACPI
> > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > case.  Or we need to create two child devices for each memory block and
> > > associate each of them with an ACPI object.  That would lead to complications
> > > in the user space interface, though.
> > 
> > Right.  Even bigger issue is that I do not think __add_pages() and
> > __remove_pages() can add / delete a memory chunk that is less than
> > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > this case gracefully.
> 
> FYI: I have submitted the patch blow to close this part of the issue...
> 
> https://lkml.org/lkml/2013/8/8/396

That looks good to me, but we'd still need to make it possible to have
memory blocks smaller than 128 MB ...

Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-08 22:12           ` Rafael J. Wysocki
@ 2013-08-08 22:50             ` Toshi Kani
  2013-08-08 23:14               ` Rafael J. Wysocki
  2013-08-11 21:13               ` Rafael J. Wysocki
  0 siblings, 2 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-08 22:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> >  :
> > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > 
> > > > That sound's plausible, but I wonder how we can fix that?
> > > > 
> > > > There's no way for a single physical device to have two different ACPI
> > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > case.  Or we need to create two child devices for each memory block and
> > > > associate each of them with an ACPI object.  That would lead to complications
> > > > in the user space interface, though.
> > > 
> > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > __remove_pages() can add / delete a memory chunk that is less than
> > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > this case gracefully.
> > 
> > FYI: I have submitted the patch blow to close this part of the issue...
> > 
> > https://lkml.org/lkml/2013/8/8/396
> 
> That looks good to me, but we'd still need to make it possible to have
> memory blocks smaller than 128 MB ...

Do you mean acpi_bind_one() needs to be able to handle such case?  If
so, it should not be a problem since a memory block device won't be
created when add_memory() fails with the change above.  So, there is no
binding to be done.  If you mean add_memory() needs to be able to handle
a smaller range, that's quite a tough one unless we make the section
size smaller.

BTW, when add_memory() fails, the memory hot-add request still succeeds
with no driver attached.  This seems logical, but the added device is
useless when no handler is attached.  And it does not allow ejecting the
device with no handler.  I am not too worry about this since this is a
rare case, but it reminded me that the framework won't handle rollback.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-08 22:50             ` Toshi Kani
@ 2013-08-08 23:14               ` Rafael J. Wysocki
  2013-08-08 23:35                 ` Toshi Kani
  2013-08-11 21:13               ` Rafael J. Wysocki
  1 sibling, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-08 23:14 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > >  :
> > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > 
> > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > 
> > > > > There's no way for a single physical device to have two different ACPI
> > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > case.  Or we need to create two child devices for each memory block and
> > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > in the user space interface, though.
> > > > 
> > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > this case gracefully.
> > > 
> > > FYI: I have submitted the patch blow to close this part of the issue...
> > > 
> > > https://lkml.org/lkml/2013/8/8/396
> > 
> > That looks good to me, but we'd still need to make it possible to have
> > memory blocks smaller than 128 MB ...
> 
> Do you mean acpi_bind_one() needs to be able to handle such case?  If
> so, it should not be a problem since a memory block device won't be
> created when add_memory() fails with the change above.  So, there is no
> binding to be done.  If you mean add_memory() needs to be able to handle
> a smaller range, that's quite a tough one unless we make the section
> size smaller.
> 
> BTW, when add_memory() fails, the memory hot-add request still succeeds
> with no driver attached.  This seems logical, but the added device is
> useless when no handler is attached.  And it does not allow ejecting the
> device with no handler.  I am not too worry about this since this is a
> rare case, but it reminded me that the framework won't handle rollback.

That is a valid observation.

Perhaps we should add a flag that will cause acpi_bus_offline_companions()
to fail immediately if that flag is set for the given device?

Then, acpi_memory_enable_device() could set that flag for mem_device->device
when either add_memory() or acpi_bind_memory_blocks() fails?

Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-08 23:14               ` Rafael J. Wysocki
@ 2013-08-08 23:35                 ` Toshi Kani
  0 siblings, 0 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-08 23:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Fri, 2013-08-09 at 01:14 +0200, Rafael J. Wysocki wrote:
> On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > >  :
> > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > 
> > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > 
> > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > in the user space interface, though.
> > > > > 
> > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > this case gracefully.
> > > > 
> > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > 
> > > > https://lkml.org/lkml/2013/8/8/396
> > > 
> > > That looks good to me, but we'd still need to make it possible to have
> > > memory blocks smaller than 128 MB ...
> > 
> > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > so, it should not be a problem since a memory block device won't be
> > created when add_memory() fails with the change above.  So, there is no
> > binding to be done.  If you mean add_memory() needs to be able to handle
> > a smaller range, that's quite a tough one unless we make the section
> > size smaller.
> > 
> > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > with no driver attached.  This seems logical, but the added device is
> > useless when no handler is attached.  And it does not allow ejecting the
> > device with no handler.  I am not too worry about this since this is a
> > rare case, but it reminded me that the framework won't handle rollback.
> 
> That is a valid observation.
> 
> Perhaps we should add a flag that will cause acpi_bus_offline_companions()
> to fail immediately if that flag is set for the given device?
> 
> Then, acpi_memory_enable_device() could set that flag for mem_device->device
> when either add_memory() or acpi_bind_memory_blocks() fails?

I am not sure if we need a new flag, but I agree that we could allow
ejecting a device with no handler attached.  We can probably skip any
device specific part (such as offlining) when no handler is attached.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-08 22:50             ` Toshi Kani
  2013-08-08 23:14               ` Rafael J. Wysocki
@ 2013-08-11 21:13               ` Rafael J. Wysocki
  2013-08-12 20:40                 ` Toshi Kani
  1 sibling, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-11 21:13 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > >  :
> > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > 
> > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > 
> > > > > There's no way for a single physical device to have two different ACPI
> > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > case.  Or we need to create two child devices for each memory block and
> > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > in the user space interface, though.
> > > > 
> > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > this case gracefully.
> > > 
> > > FYI: I have submitted the patch blow to close this part of the issue...
> > > 
> > > https://lkml.org/lkml/2013/8/8/396
> > 
> > That looks good to me, but we'd still need to make it possible to have
> > memory blocks smaller than 128 MB ...
> 
> Do you mean acpi_bind_one() needs to be able to handle such case?  If
> so, it should not be a problem since a memory block device won't be
> created when add_memory() fails with the change above.  So, there is no
> binding to be done.  If you mean add_memory() needs to be able to handle
> a smaller range, that's quite a tough one unless we make the section
> size smaller.
> 
> BTW, when add_memory() fails, the memory hot-add request still succeeds
> with no driver attached.  This seems logical, but the added device is
> useless when no handler is attached.  And it does not allow ejecting the
> device with no handler.  I am not too worry about this since this is a
> rare case, but it reminded me that the framework won't handle rollback.

I'm not sure which rollback you mean.  During removal?

There are two slight problems here in my view.  First, even if the device
cannot be ejected directly, it still will be removed when its parent is
ejected, so it may be more consistent to just allow everything to be ejected
regardless of whether or not it has a scan handler.  Second, I guess the
removal is undesirable for memory devices for which the registration of the
scan handler failed, so it would be good to fail the "offline" of such devices
regardless of how we get there.  That's why I thought it would be good to have
an "offline disabled" flag in struct acpi_device.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-11 21:13               ` Rafael J. Wysocki
@ 2013-08-12 20:40                 ` Toshi Kani
  2013-08-13  0:45                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Toshi Kani @ 2013-08-12 20:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Sun, 2013-08-11 at 23:13 +0200, Rafael J. Wysocki wrote:
> On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > >  :
> > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > 
> > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > 
> > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > in the user space interface, though.
> > > > > 
> > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > this case gracefully.
> > > > 
> > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > 
> > > > https://lkml.org/lkml/2013/8/8/396
> > > 
> > > That looks good to me, but we'd still need to make it possible to have
> > > memory blocks smaller than 128 MB ...
> > 
> > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > so, it should not be a problem since a memory block device won't be
> > created when add_memory() fails with the change above.  So, there is no
> > binding to be done.  If you mean add_memory() needs to be able to handle
> > a smaller range, that's quite a tough one unless we make the section
> > size smaller.
> > 
> > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > with no driver attached.  This seems logical, but the added device is
> > useless when no handler is attached.  And it does not allow ejecting the
> > device with no handler.  I am not too worry about this since this is a
> > rare case, but it reminded me that the framework won't handle rollback.
> 
> I'm not sure which rollback you mean.  During removal?

I meant rollback during hot-add.  Ideally, a device should be either
added in usable state (success) or failed back to the original state
(rollback).  Added in un-usable state is not really a success for users,
and creates an odd state to deal with.  But it is still a LOT better
than crashing the system.  So, I think this outcome is reasonable on
this framework because adding rollback at this point will complicate the
things unnecessarily.

> There are two slight problems here in my view.  First, even if the device
> cannot be ejected directly, it still will be removed when its parent is
> ejected, so it may be more consistent to just allow everything to be ejected
> regardless of whether or not it has a scan handler.  

Agreed.

> Second, I guess the
> removal is undesirable for memory devices for which the registration of the
> scan handler failed, so it would be good to fail the "offline" of such devices
> regardless of how we get there.  That's why I thought it would be good to have
> an "offline disabled" flag in struct acpi_device.

I see.  But when attach() failed, the memory device may not be used by
the kernel.  So, I think it should be safe to remove it.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-12 20:40                 ` Toshi Kani
@ 2013-08-13  0:45                   ` Rafael J. Wysocki
  2013-08-13  1:02                     ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-13  0:45 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Monday, August 12, 2013 02:40:43 PM Toshi Kani wrote:
> On Sun, 2013-08-11 at 23:13 +0200, Rafael J. Wysocki wrote:
> > On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > >  :
> > > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > > 
> > > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > > 
> > > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > > in the user space interface, though.
> > > > > > 
> > > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > > this case gracefully.
> > > > > 
> > > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > > 
> > > > > https://lkml.org/lkml/2013/8/8/396
> > > > 
> > > > That looks good to me, but we'd still need to make it possible to have
> > > > memory blocks smaller than 128 MB ...
> > > 
> > > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > > so, it should not be a problem since a memory block device won't be
> > > created when add_memory() fails with the change above.  So, there is no
> > > binding to be done.  If you mean add_memory() needs to be able to handle
> > > a smaller range, that's quite a tough one unless we make the section
> > > size smaller.
> > > 
> > > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > > with no driver attached.  This seems logical, but the added device is
> > > useless when no handler is attached.  And it does not allow ejecting the
> > > device with no handler.  I am not too worry about this since this is a
> > > rare case, but it reminded me that the framework won't handle rollback.
> > 
> > I'm not sure which rollback you mean.  During removal?
> 
> I meant rollback during hot-add.  Ideally, a device should be either
> added in usable state (success) or failed back to the original state
> (rollback).  Added in un-usable state is not really a success for users,
> and creates an odd state to deal with.  But it is still a LOT better
> than crashing the system.  So, I think this outcome is reasonable on
> this framework because adding rollback at this point will complicate the
> things unnecessarily.
> 
> > There are two slight problems here in my view.  First, even if the device
> > cannot be ejected directly, it still will be removed when its parent is
> > ejected, so it may be more consistent to just allow everything to be ejected
> > regardless of whether or not it has a scan handler.  
> 
> Agreed.
> 
> > Second, I guess the
> > removal is undesirable for memory devices for which the registration of the
> > scan handler failed, so it would be good to fail the "offline" of such devices
> > regardless of how we get there.  That's why I thought it would be good to have
> > an "offline disabled" flag in struct acpi_device.
> 
> I see.  But when attach() failed, the memory device may not be used by
> the kernel.  So, I think it should be safe to remove it.

The failure of .attach() need not mean that the memory is not used by the
kernel, though, and the ACPI device object is still there and still may
be involved in some removal scenarios (through its parent).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-13  0:45                   ` Rafael J. Wysocki
@ 2013-08-13  1:02                     ` Toshi Kani
  2013-08-13 12:02                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Toshi Kani @ 2013-08-13  1:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Tue, 2013-08-13 at 02:45 +0200, Rafael J. Wysocki wrote:
> On Monday, August 12, 2013 02:40:43 PM Toshi Kani wrote:
> > On Sun, 2013-08-11 at 23:13 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > > > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > > >  :
> > > > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > > > 
> > > > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > > > 
> > > > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > > > in the user space interface, though.
> > > > > > > 
> > > > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > > > this case gracefully.
> > > > > > 
> > > > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > > > 
> > > > > > https://lkml.org/lkml/2013/8/8/396
> > > > > 
> > > > > That looks good to me, but we'd still need to make it possible to have
> > > > > memory blocks smaller than 128 MB ...
> > > > 
> > > > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > > > so, it should not be a problem since a memory block device won't be
> > > > created when add_memory() fails with the change above.  So, there is no
> > > > binding to be done.  If you mean add_memory() needs to be able to handle
> > > > a smaller range, that's quite a tough one unless we make the section
> > > > size smaller.
> > > > 
> > > > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > > > with no driver attached.  This seems logical, but the added device is
> > > > useless when no handler is attached.  And it does not allow ejecting the
> > > > device with no handler.  I am not too worry about this since this is a
> > > > rare case, but it reminded me that the framework won't handle rollback.
> > > 
> > > I'm not sure which rollback you mean.  During removal?
> > 
> > I meant rollback during hot-add.  Ideally, a device should be either
> > added in usable state (success) or failed back to the original state
> > (rollback).  Added in un-usable state is not really a success for users,
> > and creates an odd state to deal with.  But it is still a LOT better
> > than crashing the system.  So, I think this outcome is reasonable on
> > this framework because adding rollback at this point will complicate the
> > things unnecessarily.
> > 
> > > There are two slight problems here in my view.  First, even if the device
> > > cannot be ejected directly, it still will be removed when its parent is
> > > ejected, so it may be more consistent to just allow everything to be ejected
> > > regardless of whether or not it has a scan handler.  
> > 
> > Agreed.
> > 
> > > Second, I guess the
> > > removal is undesirable for memory devices for which the registration of the
> > > scan handler failed, so it would be good to fail the "offline" of such devices
> > > regardless of how we get there.  That's why I thought it would be good to have
> > > an "offline disabled" flag in struct acpi_device.
> > 
> > I see.  But when attach() failed, the memory device may not be used by
> > the kernel.  So, I think it should be safe to remove it.
> 
> The failure of .attach() need not mean that the memory is not used by the
> kernel, though, and the ACPI device object is still there and still may
> be involved in some removal scenarios (through its parent).

As long as .attach() is failed cleanly, which I think is the case for
the memory handler (if not, we need to fix it), the rest of the kernel
code may not know what this device is.  That is, the ACPI memory handler
is the only one that knows PNP0C80 is a memory device.  So, I do not
think the memory can be used in such case...

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-13  1:02                     ` Toshi Kani
@ 2013-08-13 12:02                       ` Rafael J. Wysocki
  2013-08-13 17:14                         ` Toshi Kani
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2013-08-13 12:02 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Monday, August 12, 2013 07:02:44 PM Toshi Kani wrote:
> On Tue, 2013-08-13 at 02:45 +0200, Rafael J. Wysocki wrote:
> > On Monday, August 12, 2013 02:40:43 PM Toshi Kani wrote:
> > > On Sun, 2013-08-11 at 23:13 +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > > > > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > > > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > > > >  :
> > > > > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > > > > 
> > > > > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > > > > 
> > > > > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > > > > in the user space interface, though.
> > > > > > > > 
> > > > > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > > > > this case gracefully.
> > > > > > > 
> > > > > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > > > > 
> > > > > > > https://lkml.org/lkml/2013/8/8/396
> > > > > > 
> > > > > > That looks good to me, but we'd still need to make it possible to have
> > > > > > memory blocks smaller than 128 MB ...
> > > > > 
> > > > > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > > > > so, it should not be a problem since a memory block device won't be
> > > > > created when add_memory() fails with the change above.  So, there is no
> > > > > binding to be done.  If you mean add_memory() needs to be able to handle
> > > > > a smaller range, that's quite a tough one unless we make the section
> > > > > size smaller.
> > > > > 
> > > > > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > > > > with no driver attached.  This seems logical, but the added device is
> > > > > useless when no handler is attached.  And it does not allow ejecting the
> > > > > device with no handler.  I am not too worry about this since this is a
> > > > > rare case, but it reminded me that the framework won't handle rollback.
> > > > 
> > > > I'm not sure which rollback you mean.  During removal?
> > > 
> > > I meant rollback during hot-add.  Ideally, a device should be either
> > > added in usable state (success) or failed back to the original state
> > > (rollback).  Added in un-usable state is not really a success for users,
> > > and creates an odd state to deal with.  But it is still a LOT better
> > > than crashing the system.  So, I think this outcome is reasonable on
> > > this framework because adding rollback at this point will complicate the
> > > things unnecessarily.
> > > 
> > > > There are two slight problems here in my view.  First, even if the device
> > > > cannot be ejected directly, it still will be removed when its parent is
> > > > ejected, so it may be more consistent to just allow everything to be ejected
> > > > regardless of whether or not it has a scan handler.  
> > > 
> > > Agreed.
> > > 
> > > > Second, I guess the
> > > > removal is undesirable for memory devices for which the registration of the
> > > > scan handler failed, so it would be good to fail the "offline" of such devices
> > > > regardless of how we get there.  That's why I thought it would be good to have
> > > > an "offline disabled" flag in struct acpi_device.
> > > 
> > > I see.  But when attach() failed, the memory device may not be used by
> > > the kernel.  So, I think it should be safe to remove it.
> > 
> > The failure of .attach() need not mean that the memory is not used by the
> > kernel, though, and the ACPI device object is still there and still may
> > be involved in some removal scenarios (through its parent).
> 
> As long as .attach() is failed cleanly, which I think is the case for
> the memory handler (if not, we need to fix it), the rest of the kernel
> code may not know what this device is.  That is, the ACPI memory handler
> is the only one that knows PNP0C80 is a memory device.  So, I do not
> think the memory can be used in such case...

Yes, it can, if the kernel discovers it during boot before .attach() is first
called.  So say this happens and the ACPI memory device object has a parent
with existing _EJ0.

The memory handler's .attach() fails, so the initial acpi_bus_scan() won't walk
the namespace below the memory device, but it won't return an error code
either (the root device object is always there).  The parent of the failed
node is regarded as operational in particular.

Now, acpi_scan_hot_remove() is called for the parent.  Since the memory device
object doesn't have "physical" devices associated with it,
acpi_bus_offline_companions() will ignore it and acpi_scan_hot_remove() will go
for acpi_bus_trim() and straight for _EJ0 for the parent.  Splat!

I agree that this is a corner case, but I wonder if leaving it this way is a
good idea. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Cannot hot remove a memory device
  2013-08-13 12:02                       ` Rafael J. Wysocki
@ 2013-08-13 17:14                         ` Toshi Kani
  0 siblings, 0 replies; 27+ messages in thread
From: Toshi Kani @ 2013-08-13 17:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yasuaki Ishimatsu, rafael.j.wysocki, vasilis.liaskovitis,
	linux-kernel, linux-acpi, tangchen, wency

On Tue, 2013-08-13 at 14:02 +0200, Rafael J. Wysocki wrote:
> On Monday, August 12, 2013 07:02:44 PM Toshi Kani wrote:
> > On Tue, 2013-08-13 at 02:45 +0200, Rafael J. Wysocki wrote:
> > > On Monday, August 12, 2013 02:40:43 PM Toshi Kani wrote:
> > > > On Sun, 2013-08-11 at 23:13 +0200, Rafael J. Wysocki wrote:
> > > > > On Thursday, August 08, 2013 04:50:42 PM Toshi Kani wrote:
> > > > > > On Fri, 2013-08-09 at 00:12 +0200, Rafael J. Wysocki wrote:
> > > > > > > On Thursday, August 08, 2013 11:15:20 AM Toshi Kani wrote:
> > > > > > > > On Fri, 2013-08-02 at 18:04 -0600, Toshi Kani wrote:
> > > > > > > > > On Sat, 2013-08-03 at 01:43 +0200, Rafael J. Wysocki wrote:
> > > > > > > > > > On Friday, August 02, 2013 03:46:15 PM Toshi Kani wrote:
> > > > > > > > > > > On Thu, 2013-08-01 at 23:43 +0200, Rafael J. Wysocki wrote:
> > > > > > > >  :
> > > > > > > > > > > I think it fails with -EINVAL at the place with dev_warn(dev, "ACPI
> > > > > > > > > > > handle is already set\n").  When two ACPI memory objects associate with
> > > > > > > > > > > a same memory block, the bind procedure of the 2nd ACPI memory object
> > > > > > > > > > > sees that ACPI_HANDLE(dev) is already set to the 1st ACPI memory object.
> > > > > > > > > > 
> > > > > > > > > > That sound's plausible, but I wonder how we can fix that?
> > > > > > > > > > 
> > > > > > > > > > There's no way for a single physical device to have two different ACPI
> > > > > > > > > > "companions".  It looks like the memory blocks should be 64 M each in that
> > > > > > > > > > case.  Or we need to create two child devices for each memory block and
> > > > > > > > > > associate each of them with an ACPI object.  That would lead to complications
> > > > > > > > > > in the user space interface, though.
> > > > > > > > > 
> > > > > > > > > Right.  Even bigger issue is that I do not think __add_pages() and
> > > > > > > > > __remove_pages() can add / delete a memory chunk that is less than
> > > > > > > > > 128MB.  128MB is the granularity of them.  So, we may just have to fail
> > > > > > > > > this case gracefully.
> > > > > > > > 
> > > > > > > > FYI: I have submitted the patch blow to close this part of the issue...
> > > > > > > > 
> > > > > > > > https://lkml.org/lkml/2013/8/8/396
> > > > > > > 
> > > > > > > That looks good to me, but we'd still need to make it possible to have
> > > > > > > memory blocks smaller than 128 MB ...
> > > > > > 
> > > > > > Do you mean acpi_bind_one() needs to be able to handle such case?  If
> > > > > > so, it should not be a problem since a memory block device won't be
> > > > > > created when add_memory() fails with the change above.  So, there is no
> > > > > > binding to be done.  If you mean add_memory() needs to be able to handle
> > > > > > a smaller range, that's quite a tough one unless we make the section
> > > > > > size smaller.
> > > > > > 
> > > > > > BTW, when add_memory() fails, the memory hot-add request still succeeds
> > > > > > with no driver attached.  This seems logical, but the added device is
> > > > > > useless when no handler is attached.  And it does not allow ejecting the
> > > > > > device with no handler.  I am not too worry about this since this is a
> > > > > > rare case, but it reminded me that the framework won't handle rollback.
> > > > > 
> > > > > I'm not sure which rollback you mean.  During removal?
> > > > 
> > > > I meant rollback during hot-add.  Ideally, a device should be either
> > > > added in usable state (success) or failed back to the original state
> > > > (rollback).  Added in un-usable state is not really a success for users,
> > > > and creates an odd state to deal with.  But it is still a LOT better
> > > > than crashing the system.  So, I think this outcome is reasonable on
> > > > this framework because adding rollback at this point will complicate the
> > > > things unnecessarily.
> > > > 
> > > > > There are two slight problems here in my view.  First, even if the device
> > > > > cannot be ejected directly, it still will be removed when its parent is
> > > > > ejected, so it may be more consistent to just allow everything to be ejected
> > > > > regardless of whether or not it has a scan handler.  
> > > > 
> > > > Agreed.
> > > > 
> > > > > Second, I guess the
> > > > > removal is undesirable for memory devices for which the registration of the
> > > > > scan handler failed, so it would be good to fail the "offline" of such devices
> > > > > regardless of how we get there.  That's why I thought it would be good to have
> > > > > an "offline disabled" flag in struct acpi_device.
> > > > 
> > > > I see.  But when attach() failed, the memory device may not be used by
> > > > the kernel.  So, I think it should be safe to remove it.
> > > 
> > > The failure of .attach() need not mean that the memory is not used by the
> > > kernel, though, and the ACPI device object is still there and still may
> > > be involved in some removal scenarios (through its parent).
> > 
> > As long as .attach() is failed cleanly, which I think is the case for
> > the memory handler (if not, we need to fix it), the rest of the kernel
> > code may not know what this device is.  That is, the ACPI memory handler
> > is the only one that knows PNP0C80 is a memory device.  So, I do not
> > think the memory can be used in such case...
> 
> Yes, it can, if the kernel discovers it during boot before .attach() is first
> called.  So say this happens and the ACPI memory device object has a parent
> with existing _EJ0.
> 
> The memory handler's .attach() fails, so the initial acpi_bus_scan() won't walk
> the namespace below the memory device, but it won't return an error code
> either (the root device object is always there).  The parent of the failed
> node is regarded as operational in particular.
> 
> Now, acpi_scan_hot_remove() is called for the parent.  Since the memory device
> object doesn't have "physical" devices associated with it,
> acpi_bus_offline_companions() will ignore it and acpi_scan_hot_remove() will go
> for acpi_bus_trim() and straight for _EJ0 for the parent.  Splat!
> 
> I agree that this is a corner case, but I wonder if leaving it this way is a
> good idea. :-)

Oh, I see.  In practice, this case is unlikely to happen because
add_memory() fails with -EEXIST at the beginning during boot-time, and
is essentially a no-op.  But I agree with your point that such case may
happen and is safer to not allow ejecting a device with no handler.  A
system crash is much worse than not being able to eject.       

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-08-13 17:15 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-01  8:37 Cannot hot remove a memory device Yasuaki Ishimatsu
2013-08-01 21:43 ` Rafael J. Wysocki
2013-08-02 21:46   ` Toshi Kani
2013-08-02 23:43     ` Rafael J. Wysocki
2013-08-03  0:04       ` Toshi Kani
2013-08-03  1:01         ` Rafael J. Wysocki
2013-08-04  0:37           ` Toshi Kani
2013-08-04 14:12             ` Rafael J. Wysocki
2013-08-05  4:00             ` Yasuaki Ishimatsu
2013-08-05  7:59               ` Yasuaki Ishimatsu
2013-08-05 13:14                 ` Cannot hot remove a memory device (patch) Rafael J. Wysocki
2013-08-05 23:19                   ` Toshi Kani
2013-08-06  0:15                     ` Cannot hot remove a memory device (patch, updated) Rafael J. Wysocki
2013-08-06  2:12                       ` Yasuaki Ishimatsu
2013-08-06 14:17                         ` Rafael J. Wysocki
2013-08-06 15:28                       ` Toshi Kani
2013-08-08 17:15         ` Cannot hot remove a memory device Toshi Kani
2013-08-08 22:12           ` Rafael J. Wysocki
2013-08-08 22:50             ` Toshi Kani
2013-08-08 23:14               ` Rafael J. Wysocki
2013-08-08 23:35                 ` Toshi Kani
2013-08-11 21:13               ` Rafael J. Wysocki
2013-08-12 20:40                 ` Toshi Kani
2013-08-13  0:45                   ` Rafael J. Wysocki
2013-08-13  1:02                     ` Toshi Kani
2013-08-13 12:02                       ` Rafael J. Wysocki
2013-08-13 17:14                         ` Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).