From: Laszlo Ersek <lersek@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: boris.ostrovsky@oracle.com, Peter Krempa <pkrempa@redhat.com>,
	liran.alon@oracle.com, qemu-devel@nongnu.org
Subject: Re: [RFC 3/3] x86: acpi: trigger SMI before scanning for hotplugged CPUs
Date: Wed, 15 Jul 2020 14:38:00 +0200
Message-ID: <0fd38252-b16d-fee8-31de-71e35475e3bc@redhat.com>
In-Reply-To: <20200714171935.10507f90@redhat.com>

On 07/14/20 17:19, Igor Mammedov wrote:
> On Tue, 14 Jul 2020 14:41:28 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 07/14/20 14:28, Laszlo Ersek wrote:
>>> (CC'ing Peter Krempa due to virsh setvcpu (singular) / setvcpus (plural)
>>> references)
>>>
>>> On 07/10/20 18:17, Igor Mammedov wrote:  
>>>> In case firmware has negotiated CPU hotplug SMI feature, generate
>>>> AML to describe SMI IO port region and send SMI to firmware
>>>> on each CPU hotplug SCI.
>>>>
>>>> It might not be really usable yet, but it should serve as a starting
>>>> point for discussing how best to deal with the split hotplug sequence
>>>> during hot-add.
>>>> (An example scenario where it will break:
>>>>    hot-add  
>>>>       -> (QEMU) add CPU in hotplug regs
>>>>       -> (QEMU) SCI  
>>>>            -1-> (OS) scan
>>>>                -1-> (OS) SMI
>>>>                -1-> (FW) pull in CPU1 ***
>>>>                -1-> (OS) start iterating hotplug regs
>>>>    hot-add  
>>>>       -> (QEMU) add CPU in hotplug regs
>>>>       -> (QEMU) SCI  
>>>>             -2-> (OS) scan (blocked on mutex till previous scan is finished)
>>>>                -1-> (OS) 1st added CPU1 send device check event -> INIT/SIPI
>>>>                -1-> (OS) 1st added CPU2 send device check event -> INIT/SIPI
>>>>                        that's where it explodes, since FW didn't see CPU2
>>>>                        when SMI was called
>>>> )
>>>>
>>>> Hot remove will throw in yet another set of problems, so let's discuss
>>>> both here and see whether we can really share the hotplug register
>>>> block between FW and AML, or whether we should do something else with it.
>>>
>>> This issue is generally triggered by management applications such as
>>> libvirt that issue device_add commands in quick succession. For libvirt,
>>> the command is "virsh setvcpus" (plural) with multiple CPUs specified
>>> for plugging. The singular "virsh setvcpu" command, which is more
>>> friendly towards guest NUMA, does not run into the symptom.
>>>
>>> The scope of the scan method lock is not large enough, with SMI in the
>>> picture.
>>>
>>> I suggest that we not uproot the existing AML code or the hotplug
>>> register block. Instead, I suggest that we add serialization at a higher
>>> level, with sufficient scope.
>>>
>>> QEMU:
>>>
>>> - introduce a new flag standing for "CPU plug operation in progress"
>>>
>>> - if ICH9_LPC_SMI_F_BROADCAST_BIT has been negotiated:
>>>
>>>   - "device_add" and "device_del" should enforce
>>>     ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT and
>>>     ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT, respectively
>>>
>>>   - both device_add and device_del (for VCPUs) should check the
>>>     "in progress" flag:
>>>
>>>     - If set, reject the request synchronously
>>>
>>>     - Otherwise, set the flag, and commence the operation
>>>
>>>   - in cpu_hotplug_wr(), where we emit the ACPI_DEVICE_OST event with
>>>     qapi_event_send_acpi_device_ost(), clear the "in-progress" flag.
>>>
>>> - If QEMU executes the QMP command processing and the cpu_hotplug_wr()
>>> function on different (host OS) threads, then perhaps we should use an
>>> atomic type for the flag. (Not sure about locking between QEMU threads,
>>> sorry.) I don't really expect race conditions, but in case we ever get
>>> stuck with the flag, we should make sure that the stuck state is "in
>>> progress", and not "not in progress". (The former state can prevent
>>> further plug operations, but cannot cause the guest to lose state.)  
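(For concreteness, the guard proposed above could look roughly like
this -- a minimal sketch with invented names, using C11 atomics purely
for illustration rather than QEMU's own atomic helpers:

  #include <stdatomic.h>
  #include <stdbool.h>

  /* Hypothetical flag -- not an existing QEMU field. */
  static atomic_bool cpu_plug_in_progress;

  /* Entered at the start of device_add / device_del for a VCPU;
   * returns false if another plug/unplug is still under way, in
   * which case the request is rejected synchronously. */
  static bool cpu_plug_begin(void)
  {
      /* atomic test-and-set, so the QMP command path and the
       * guest-triggered path cannot both proceed */
      return !atomic_exchange(&cpu_plug_in_progress, true);
  }

  /* Called at the spot in cpu_hotplug_wr() that emits the
   * ACPI_DEVICE_OST event with qapi_event_send_acpi_device_ost(). */
  static void cpu_plug_done(void)
  {
      atomic_store(&cpu_plug_in_progress, false);
  }

The exchange-based guard also keeps the failure mode as "stuck in
progress" rather than "stuck not in progress", per the last paragraph
above.)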
>>
>> Furthermore, the "CPU plug operation in progress" flag should be:
>> - either migrated,
>> - or a migration blocker.
>>
>> Because on the destination host, device_add should be possible if and
>> only if the plug operation completed (either still on the source host,
>> or on the destination host).
> 
> I have a much simpler alternative idea, which doesn't involve libvirt.
> 
> We can change the AML to:
>   1. cache hotplugged CPUs from the controller
>   2. send an SMI
>   3. send a Notify event to the OS to online the cached CPUs
> This way we never INIT/SIPI CPUs that the FW hasn't seen yet.
> As for the FW, it can relocate an extra CPU that arrives after step 1;
> that won't cause any harm, since on the next SCI the AML will pick up
> those CPUs and the SMI upcall will be just a NOP.
> 
> I'll post a patch here on top of this series for you to try
> (without any of your comments addressed yet, as it's already written
> and I've been testing it for a while to make sure it won't explode
> with various Windows versions).

Sounds good, I'll be happy to test it.
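Just to confirm I understand the ordering, here's how I picture the
flow you describe, as a rough ASL sketch -- every name in it (CSCN,
CPLK, CACH, MCNT, CSEL, CINS, SMIP, CTFY) is a placeholder I made up,
not necessarily what the generated AML will use:

  Method (CSCN, 0, Serialized)
  {
      Acquire (CPLK, 0xFFFF)            // the existing scan mutex
      Name (CACH, Package (0xFF) {})    // cache of pending selectors
      Store (Zero, Local1)              // number of cached CPUs

      // step 1: cache hotplugged CPUs from the controller
      Store (Zero, Local0)              // selector being probed
      While (LLess (Local0, MCNT))      // MCNT: max CPU count
      {
          Store (Local0, CSEL)          // select the CPU
          If (CINS)                     // "insert event" pending?
          {
              Store (Local0, Index (CACH, Local1))
              Increment (Local1)
              Store (One, CINS)         // write-one-to-clear the event
          }
          Increment (Local0)
      }

      // step 2: a single SMI upcall; the FW relocates every CPU
      // cached above (plus any that arrived later -- harmless)
      If (LGreater (Local1, Zero))
      {
          Store (0x04, SMIP)            // hypothetical command value
      }

      // step 3: only now notify the OS, so INIT/SIPI can never reach
      // a CPU that the FW hasn't relocated yet
      Store (Zero, Local2)
      While (LLess (Local2, Local1))
      {
          CTFY (DerefOf (Index (CACH, Local2)), One)  // device check
          Increment (Local2)
      }
      Release (CPLK)
  }

The key point being that the Notify loop starts strictly after the SMI
has returned. Is that about right?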

Indeed "no event" is something that the fw deals with gracefully. (IIRC
I wanted to cover a "spurious SMI" explicitly.)

It didn't occur to me that you could dynamically size e.g. a package
object in AML. Based on my reading of the ACPI spec, "VarPackageOp" can
take a *runtime* "NumElements", so if you did two loops, the first loop
could count the pending CPUs, and then a VarPackageOp could be created
with just the right NumElements... Anyway, I digress :)

> 
>> I guess that the "migration blocker" option is easier.
>>
>> Anyway I assume this is already handled with memory hotplug somehow
>> (i.e., migration attempt between device_add and ACPI_DEVICE_OST).
> 
> Thanks for the comments.
> I'll need some time to ponder the other comments and do some
> palaeontology research to answer the questions
> (aka I need to make up excuses for the code I wrote :) )

haha, thanks :)
Laszlo



