From: George Dunlap <george.dunlap@citrix.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Tim Deegan <tim@xen.org>,
xen-devel@lists.xen.org, Paul Durrant <paul.durrant@citrix.com>,
zhiyuan.lv@intel.com, JunNakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Mon, 20 Jun 2016 11:43:55 +0100 [thread overview]
Message-ID: <def216bb-7f85-9307-005b-7e7ecd6eef47@citrix.com> (raw)
In-Reply-To: <5767C5C9.5090505@linux.intel.com>
On 20/06/16 11:30, Yu Zhang wrote:
>
>
> On 6/20/2016 6:10 PM, George Dunlap wrote:
>> On 20/06/16 10:03, Yu Zhang wrote:
>>>
>>> On 6/17/2016 6:17 PM, George Dunlap wrote:
>>>> On 16/06/16 10:55, Jan Beulich wrote:
>>>>>> Previously, in the 2nd version, I used p2m_change_entry_type_global()
>>>>>> to reset the outstanding p2m_ioreq_server entries back to p2m_ram_rw
>>>>>> asynchronously after the de-registration. But we later realized that
>>>>>> this approach means we cannot support live migration. And forcefully
>>>>>> recalculating the whole p2m table when de-registration happens costs
>>>>>> too much.
>>>>>>
>>>>>> Further discussion with Paul concluded that we can leave the
>>>>>> responsibility for resetting the p2m type to the device model side;
>>>>>> even if a device model fails to do so, only the current VM is
>>>>>> affected, and neither other VMs nor the hypervisor will get hurt.
>>>>>>
>>>>>> I thought we had reached agreement on this in the review of
>>>>>> version 2, so I removed this part from version 3.
>>>>> In which case I would appreciate the commit message explaining
>>>>> this (in particular, I admit I don't recall why live migration would
>>>>> be affected by the p2m_change_entry_type_global() approach; the
>>>>> request is also so that later readers have at least some source of
>>>>> information other than searching the mailing list).
>>>> Yes, I don't see why either. You wouldn't de-register the ioreq server
>>>> until after the final sweep, after the VM has been paused, right? At
>>>> that point the lazy p2m re-calculation shouldn't really matter much, I
>>>> would think.
>>> Oh, it seems I need to give some explanation; sorry for the late reply.
>>>
>>> IIUC, p2m_change_entry_type_global() only sets the e.emt field to an
>>> invalid value and turns on the e.recalc flag; the real p2m reset is done
>>> in resolve_misconfig() when an EPT misconfiguration occurs or when
>>> ept_set_entry() is called.
>>>
>>> In the 2nd version patch, we leveraged this approach by adding
>>> p2m_ioreq_server to the P2M_CHANGEABLE_TYPES and triggering
>>> p2m_change_entry_type_global() when an ioreq server is unbound, hoping
>>> that later accesses to these gfns would reset the p2m type back to
>>> p2m_ram_rw. And for the recalculation itself, it works.
>>>
>>> However, there are conflicts if we take live migration into account:
>>> if live migration is triggered by the user (perhaps unintentionally)
>>> during the GPU emulation process, resolve_misconfig() will set all the
>>> outstanding p2m_ioreq_server entries to p2m_log_dirty, which is not
>>> what we expect, because our intention is to reset only the outdated
>>> p2m_ioreq_server entries back to p2m_ram_rw.
>> Well the real problem in the situation you describe is that a second
>> "lazy" p2m_change_entry_type_global() operation is starting before the
>> first one is finished. All that's needed to resolve the situation is
>> that if you get a second p2m_change_entry_type_global() operation while
>> there are outstanding entries from the first type change, you have to
>> finish the first operation (i.e., go "eagerly" find all the
>> misconfigured entries and change them to the new type) before starting
>> the second one.
>
> Thanks for your reply, George. :)
> I think this could also happen even when there's no first round of
> p2m_change_entry_type_global(): resolve_misconfig() will also change
> normal p2m_ioreq_server entries to p2m_log_dirty.
>
> By "go 'eagerly'", do you mean traversing the EPT table? Wouldn't that
> also be time consuming?
Yes, but it would only need to be done in the cases where there happened
to be a collision. And isn't it the case that we have to do things the
long way for all non-EPT guests (either shadow or AMD HAP) anyway?
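To make the collision concrete, here is a minimal toy model (sketched in Python; all names are invented for illustration, the real logic lives in Xen's C code in resolve_misconfig()). It shows why an entry flagged for lazy recalculation resolves based only on *global* state, so enabling log-dirty mode sweeps even live p2m_ioreq_server entries to p2m_log_dirty:

```python
# Toy model of lazy p2m type recalculation. Invented names throughout;
# this is a sketch of the mechanism under discussion, not Xen code.

RAM_RW, IOREQ_SERVER, LOG_DIRTY = "ram_rw", "ioreq_server", "log_dirty"
CHANGEABLE = {RAM_RW, IOREQ_SERVER, LOG_DIRTY}

class ToyP2M:
    def __init__(self, types):
        self.types = dict(types)   # gfn -> stored p2m type
        self.recalc = set()        # gfns flagged for lazy recalculation
        self.logdirty = False      # global log-dirty mode (live migration)
        self.server_bound = True   # is an ioreq server still registered?

    def recalc_type(self, old):
        # What a flagged entry resolves to, based only on global state --
        # mirroring the fact that resolve_misconfig() cannot know which
        # earlier change_type_global() call flagged a given entry.
        if self.logdirty and old in CHANGEABLE:
            return LOG_DIRTY
        if old == IOREQ_SERVER and not self.server_bound:
            return RAM_RW          # outdated entry: the server went away
        return old

    def resolve_misconfig(self, gfn):
        # Lazy resolution, e.g. on an EPT misconfig fault in real code.
        if gfn in self.recalc:
            self.types[gfn] = self.recalc_type(self.types[gfn])
            self.recalc.discard(gfn)
        return self.types[gfn]
```

In this model, the "eager" fix corresponds to draining self.recalc (calling resolve_misconfig() on every flagged gfn) before flipping self.logdirty, so pending entries from the first type change are settled before the second change starts.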
>>> So one solution is to disallow the log-dirty feature in XenGT, i.e.
>>> just return failure when enable_logdirty() is called in the toolstack.
>>> But I'm afraid this would rule out XenGT's future live migration
>>> feature.
>> I don't understand this -- you can return -EBUSY if live migration is
>> attempted while there are outstanding ioreq_server entries for the time
>> being, and at some point in the future when this actually works, you can
>> return success.
>>
>
> Well, the problem is we cannot easily tell whether there are any
> outstanding p2m_ioreq_server entries.
Well, at the very least we could keep a count if we needed to. :-)
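A sketch of what such counting could look like (toy Python; names and the errno-style return value are invented, not actual Xen interfaces): keep a per-domain counter of entries still pending from an ioreq-server type change, and refuse to enter log-dirty mode while it is non-zero.

```python
EBUSY = -16  # errno-style failure, matching the -EBUSY suggestion above

class ToyDomain:
    def __init__(self):
        self.outstanding = 0       # entries flagged but not yet resolved
        self.logdirty = False

    def flag_ioreq_entries(self, n):
        # Called when an ioreq server is unbound and its entries are
        # lazily marked for recalculation.
        self.outstanding += n

    def resolve_one(self):
        # Called from the lazy-resolution path once an entry is reset.
        assert self.outstanding > 0
        self.outstanding -= 1

    def enable_logdirty(self):
        # Refuse live migration while entries are still outstanding.
        if self.outstanding:
            return EBUSY
        self.logdirty = True
        return 0
```

The counter makes "are there outstanding p2m_ioreq_server entries?" an O(1) question, at the cost of updating it on every flag/resolve.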
> Besides, do you agree it is the responsibility of the device model to do
> the cleanup?
I don't necessarily think so. When qemu exits, for instance, dom0 will
automatically unmap all the references dom0 had to the guests' RAM --
that's part of the job of what operating systems do. It just seems like
a more robust interface to have Xen clean up regardless of what the
guest does.
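The analogy above can be sketched as follows (toy Python; invented names, not Xen code): tie the reset of p2m_ioreq_server entries to the teardown of the ioreq server object itself, so they are cleaned up even if the device model exits without unmapping them.

```python
# Toy sketch of hypervisor-side cleanup on ioreq server teardown.
RAM_RW, IOREQ_SERVER = "ram_rw", "ioreq_server"

class ToyIoreqServer:
    def __init__(self, p2m_types):
        self.p2m = p2m_types       # gfn -> type, shared with the "domain"
        self.mapped = set()        # gfns this server has claimed

    def map_gfn(self, gfn):
        self.p2m[gfn] = IOREQ_SERVER
        self.mapped.add(gfn)

    def unmap_gfn(self, gfn):
        # The well-behaved device-model path.
        self.p2m[gfn] = RAM_RW
        self.mapped.discard(gfn)

    def teardown(self):
        # Safety net run by the "hypervisor" when the server goes away
        # (e.g. the device model crashed) -- analogous to dom0 unmapping
        # a dead qemu's foreign mappings.
        for gfn in list(self.mapped):
            self.unmap_gfn(gfn)
```

The design point is simply that teardown() runs regardless of whether the device model behaved, which is what makes the interface robust.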
-George