linux-kernel.vger.kernel.org archive mirror
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, <linux-acpi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <toshi.kani@hp.com>,
	<lenb@kernel.org>, <wency@cn.fujitsu.com>,
	<vasilis.liaskovitis@profitbricks.com>
Subject: Re: [PATCH v2] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Date: Wed, 31 Oct 2012 19:52:54 +0900
Message-ID: <50910306.2030205@jp.fujitsu.com>
In-Reply-To: <20121026152544.GC15840@kroah.com>

Hi Greg,

2012/10/27 0:25, Greg Kroah-Hartman wrote:
> On Fri, Oct 26, 2012 at 04:33:49PM +0900, Yasuaki Ishimatsu wrote:
>> Hi Greg,
>>
>> Sorry for the late reply.
>>
>> 2012/10/20 2:59, Greg Kroah-Hartman wrote:
>>> On Fri, Oct 19, 2012 at 06:29:52AM +0200, Rafael J. Wysocki wrote:
>>>> On Thursday 11 of October 2012 19:12:28 Yasuaki Ishimatsu wrote:
>>>>> acpi_bus_trim() stops removing devices when acpi_bus_remove() returns an
>>>>> error code. But acpi_bus_remove() does not report errors correctly: it
>>>>> only returns -EINVAL, and only when the dev argument is NULL. Thus even if
>>>>> a device cannot be removed, acpi_bus_trim() ignores the failure and
>>>>> continues to remove devices. acpi_bus_hot_remove_device() uses
>>>>> acpi_bus_trim() to remove devices. Therefore acpi_bus_hot_remove_device()
>>>>> can send "_EJ0" to firmware even if the device is still in use by the
>>>>> system. In this case, the system cannot work correctly.
>>>>>
>>>>> Vasilis hit the bug with memory hotplug and reported it as follows:
>>>>> https://lkml.org/lkml/2012/9/26/318
>>>>>
>>>>> So acpi_bus_trim() should check whether each device was actually removed.
>>>>> The patch adds error checks to the functions that remove the device.
>>>>>
>>>>> With the patch applied, acpi_bus_trim() stops removing devices when it
>>>>> fails to remove a device. But I think there is no impact except on the
>>>>> CPU and memory hotplug paths, because other devices fail only in
>>>>> irregular cases, such as the device being NULL.
>>>>>
>>>>> v1->v2
>>>>> - add a rollback for reinstalling a notify handler.
>>>>>
>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>
>>>> Greg, do you think there may be any problems with the changes in dd.c?
>>>
>>> Yes, I don't like it.
>>>
>>> remove should always work, just like the exit call in a module.  It
>>> means that the core wants to remove the driver, so it is going to
>>> happen, a driver can't refuse it.
>>>
>>> Which brings me to the larger question, why would this solve anything?
>>
>> We are currently developing physical memory hotplug.
>>
>> https://lkml.org/lkml/2012/10/23/213
>>
>> With that patch set applied, we can hot remove physical memory
>> as follows:
>>
>> "echo 1 > /sys/bus/acpi/devices/PNP/eject"
>>
>> In this case, acpi_bus_hot_remove_device() tries to remove the memory
>> device with acpi_bus_trim(). But if the device contains memory that
>> cannot be removed, memory hot remove fails and the memory remains in the
>> kernel. However, acpi_bus_trim() cannot notice that memory hot remove
>> failed and returns 0. So acpi_bus_hot_remove_device() continues to remove
>> memory devices and sends the _EJ0 method to firmware. The memory device
>> can then no longer be used, but the kernel still treats the memory as
>> present. So if anything accesses that memory, a kernel panic occurs.
>
> Why can't you check to find out if you can do the remove operation
> before you enter the driver core asking to actually remove the devices?
> That would allow you to "know" if you can do this before having to go
> through the whole operation.  What happens if you can complete half of
> the removal, and do that, but not the whole thing?  Don't you end up
> with half of the memory chunk gone from the system now?
>

> In other words, please solve this at a higher level than the driver
> core if at all possible.

O.K.
I'll check whether the problem can be solved at a higher level.
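
A minimal sketch of the "check first, then remove" idea suggested in the
quoted text above, written as stand-alone C rather than kernel code. All
names (sketch_dev, sketch_can_remove_all, sketch_remove_all,
sketch_hot_remove) are hypothetical and are not part of the ACPI or driver
core API. The "ejected" flag only stands in for evaluating _EJ0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-removable device; not a kernel structure. */
struct sketch_dev {
	bool removable;	/* false models memory that cannot be offlined */
	bool present;	/* true while the kernel still uses the device */
};

/* Phase 1: check only. Nothing is modified, so failing here is harmless. */
int sketch_can_remove_all(const struct sketch_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].present && !devs[i].removable)
			return -EBUSY;
	return 0;
}

/* Phase 2: removal itself cannot fail, matching the rule for remove. */
void sketch_remove_all(struct sketch_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		devs[i].present = false;
}

/*
 * Higher-level entry point: devices are torn down and the eject is issued
 * only if every device passed the check. On failure nothing has changed.
 */
int sketch_hot_remove(struct sketch_dev *devs, size_t n, bool *ejected)
{
	int ret;

	assert(devs != NULL && ejected != NULL);
	*ejected = false;

	ret = sketch_can_remove_all(devs, n);
	if (ret)
		return ret;	/* no partial removal, no eject */

	sketch_remove_all(devs, n);
	*ejected = true;	/* stands in for evaluating _EJ0 */
	return 0;
}
```

With two removable devices, sketch_hot_remove() returns 0 and sets the
eject flag. If any device is not removable, it returns -EBUSY and leaves
every device present. Real code would also have to keep the state from
changing between the check and the removal, which this sketch ignores.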

Thanks,
Yasuaki Ishimatsu

>
> greg k-h
>



