All of lore.kernel.org
 help / color / mirror / Atom feed
From: George Dunlap <george.dunlap@citrix.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>,
	"Zhang, Xiong Y" <xiong.y.zhang@intel.com>,
	Jan Beulich <JBeulich@suse.com>
Cc: "george.dunlap@eu.citrix.com" <george.dunlap@eu.citrix.com>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	"paul.durrant@citrix.com" <paul.durrant@citrix.com>,
	"Lv, Zhiyuan" <zhiyuan.lv@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
Date: Tue, 9 May 2017 10:24:44 +0100	[thread overview]
Message-ID: <54600161-9155-a636-3e95-96245fcbca5d@citrix.com> (raw)
In-Reply-To: <591151F1.5080402@linux.intel.com>

On 09/05/17 06:21, Yu Zhang wrote:
> 
> 
> On 5/8/2017 7:12 PM, George Dunlap wrote:
>> On 08/05/17 11:52, Zhang, Xiong Y wrote:
>>>>>>> On 06.05.17 at 03:51, <xiong.y.zhang@intel.com> wrote:
>>>>>>>>> On 05.05.17 at 05:52, <xiong.y.zhang@intel.com> wrote:
>>>>>>> 'commit 1679e0df3df6 ("x86/ioreq server: asynchronously reset
>>>>>>> outstanding p2m_ioreq_server entries")' will call
>>>>>>> p2m_change_entry_type_global() which set entry.recalc=1. Then
>>>>>>> the following get_entry(p2m_ioreq_server) will return
>>>>>>> p2m_ram_rw type.
>>>>>>> But 'commit 6d774a951696 ("x86/ioreq server: synchronously reset
>>>>>>> outstanding p2m_ioreq_server entries when an ioreq server unmaps")'
>>>>>>> assume get_entry(p2m_ioreq_server) will return p2m_ioreq_server
>>>>>>> type, then reset p2m_ioreq_server entries. The fact is the
>>>>>>> assumption
>>>>>>> isn't true, and sysnchronously reset function couldn't work. Then
>>>>>>> ioreq.entry_count is larger than zero after an ioreq server unmaps,
>>>>>>> finally this results DomU couldn't reboot.
>>>>>> I've had trouble understanding this part already on v1 (btw, why is
>>>>>> this one not tagged v2?), and since I still can't figure it I have
>>>>>> to ask:
>>>>>> Why is it that guest reboot is being impacted here? From what I
>>>>>> recall
>>>>>> a non-zero count should only prevent migration.
>>>>> [Zhang, Xiong Y] Sorry, although they solve the same issue, the
>>>>> solution is
>>>>> totally different, so I didn't mark this as V2, I will mark the
>>>>> following
>>>>> as v2 with this solution.
>>>>> During DomU reboot, it will first unmap ioreq server in shutdown
>>>>> process,
>>>>> then it call map ioreq server in boot process. The following
>>>>> sentence in
>>>>> p2m_set_ioreq_server() result mapping ioreq server failure, then DomU
>>>>> couldn't continue booting.
>>>>> If ( read_atomic(&p->ioreq.entry_count))
>>>>>     goto out;
>>>> It is clear that it would be this statement to be the problem one,
>>>> but I continue to not see why this would affect reboot: The rebooted
>>>> guest runs in another VM with, hence, a different p2m. I cannot see
>>>> why there would be a non-zero ioreq.entry_count the first time an
>>>> ioreq server claims the p2m_ioreq_server type for this new domain.
>>>>
>>> [Zhang, Xiong Y] This is what I see from xl dmesg when a DomU reboot
>>> 1) unmap io_req_server with old domid
>>> 2) map io_req_server with old domid
>>> 3)unmap io_req_server with old domid
>>> 4) map io_req_server with new domid
>>>
>>> The 1) and 2) are triggered by our device reset handler in qemu, it will
>>> destroy old device handler, then create device handler with the old
>>> domid
>>> again. so we could see ioreq.entry_could > 0 with old domid, then reboot
>>> process terminated.
>> Oh, so it prevents reboot of XenGT, but not of normal guests?
>>
>> Why does a reboot cause the device to detach, re-attach, and then
>> re-detach again?
>>
>> Also, I'm sorry for missing the bug during review, but it's a bit
>> annoying to find out that the core functionality of patch -- detaching
>> and re-attaching -- wasn't tested at all before submission.
> 
> Thanks for your reply, George.
> This error was introduced in the last version of patch "x86/ioreq
> server: Asynchronously reset
> outstanding p2m_ioreq_server entries.", which included 2 changes from
> its previous version:
> 
> 1> Do not reset p2m_ioreq_server back to p2m_ram_rw when an ioreq server
> is mapped;
> 
> 2> Use a helper function to return the p2m type -  this new helper
> routine returns p2m_ram_rw
> instead of p2m_ioreq_server if there's no mapping ioreq server. In
> previous versions I just returned
> p2m_ioreq_server, and XenGT tests all passed successfully. But I had not
> realized this issue for this
> last version and did not have enough time to do the test for this
> version in the last moment before
> code freeze...

Indeed, we were under a lot of time pressure and you did spend a lot of
time outside of your normal office hours to try to get things fixed; and
your (plural) testing did eventually turn this up before the release.
Sorry for being a bit irritable.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-05-09  9:24 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-05  3:52 [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type Xiong Zhang
2017-05-05 14:40 ` Jan Beulich
2017-05-06  1:51   ` Zhang, Xiong Y
2017-05-08  7:04     ` Jan Beulich
2017-05-08 10:52       ` Zhang, Xiong Y
2017-05-08 11:12         ` George Dunlap
2017-05-08 11:59           ` Zhang, Xiong Y
2017-05-08 12:13             ` Jan Beulich
2017-05-09  5:21           ` Yu Zhang
2017-05-09  9:24             ` George Dunlap [this message]
2017-05-08 13:26   ` George Dunlap
2017-05-08 13:55     ` George Dunlap
2017-05-08 15:47       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54600161-9155-a636-3e95-96245fcbca5d@citrix.com \
    --to=george.dunlap@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=paul.durrant@citrix.com \
    --cc=xen-devel@lists.xen.org \
    --cc=xiong.y.zhang@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.