From: Razvan Cojocaru <rcojocaru@bitdefender.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: "kevin.tian@intel.com" <kevin.tian@intel.com>,
"tamas@tklengyel.com" <tamas@tklengyel.com>,
"wei.liu2@citrix.com" <wei.liu2@citrix.com>,
"jbeulich@suse.com" <jbeulich@suse.com>,
"george.dunlap@eu.citrix.com" <george.dunlap@eu.citrix.com>,
"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
"Mihai Donțu" <mdontu@bitdefender.com>,
"Andrei Vlad LUTAS" <vlutas@bitdefender.com>,
"jun.nakajima@intel.com" <jun.nakajima@intel.com>,
"Alexandru Stefan ISAILA" <aisaila@bitdefender.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"Anshul Makkar" <anshul.makkar@citrix.com>
Subject: Re: [PATCH v1] x86/hvm: Generic instruction re-execution mechanism for execute faults
Date: Thu, 22 Nov 2018 14:48:07 +0200 [thread overview]
Message-ID: <98f57a8a-288d-45ec-ef01-889fce63eeff@bitdefender.com> (raw)
In-Reply-To: <20181122105821.6ihjcq5dy2lqjj6j@mac>
On 11/22/18 12:58 PM, Roger Pau Monné wrote:
> On Thu, Nov 22, 2018 at 12:14:59PM +0200, Razvan Cojocaru wrote:
>> On 11/22/18 12:05 PM, Roger Pau Monné wrote:
>>> On Wed, Nov 21, 2018 at 08:55:48PM +0200, Razvan Cojocaru wrote:
>>>> On 11/16/18 7:04 PM, Roger Pau Monné wrote:
>>>>>> + if ( a == v )
>>>>>> + continue;
>>>>>> +
>>>>>> + /* Pause, synced. */
>>>>>> + while ( !a->arch.in_host )
>>>>> Why not use a->is_running as a way to know whether the vCPU is
>>>>> running?
>>>>>
>>>>> I think the logic of using vcpu_pause and expecting the running vcpu
>>>>> to take a vmexit and thus set in_host is wrong: a vcpu that wasn't
>>>>> running when vcpu_pause_nosync was called won't get scheduled
>>>>> anymore, thus never taking a vmexit, and this function will lock up.
>>>>>
>>>>> I don't think you need the in_host boolean at all.
>>>>>
>>>>>> + cpu_relax();
>>>>> Is this really better than using vcpu_pause?
>>>>>
>>>>> I assume this is done to avoid waiting on each vcpu, and instead doing
>>>>> it here likely means less wait time?
>>>>
>>>> The problem with plain vcpu_pause() is that we weren't able to use it,
>>>> for the same (as yet unclear) reason that we couldn't use
>>>> a->is_running: we get CPU-stuck hypervisor crashes that way. Here's
>>>> one that uses the same logic, but loops on a->is_running instead of
>>>> !a->arch.in_host:
>
> [...]
>
>>>> Some scheduler magic appears to happen here where it is unclear why
>>>> is_running doesn't seem to end up being 0 as expected in our case. We'll
>>>> keep digging.
>>>
>>> There seems to be some kind of deadlock between
>>> vmx_start_reexecute_instruction and hap_track_dirty_vram/handle_mmio.
>>> Are you holding a lock while trying to put the other vcpus to sleep?
>>
>> d->arch.rexec_lock, but I don't see how that would matter in this case.
>
> The trace from pCPU#0:
>
> (XEN) [ 3668.016989] RFLAGS: 0000000000000202 CONTEXT: hypervisor (d0v0)
> [...]
> (XEN) [ 3668.275417] Xen call trace:
> (XEN) [ 3668.278714] [<ffff82d0801327d2>] vcpu_sleep_sync+0x40/0x71
> (XEN) [ 3668.284952] [<ffff82d08010735b>] domain.c#do_domain_pause+0x33/0x4f
> (XEN) [ 3668.291973] [<ffff82d08010879a>] domain_pause+0x25/0x27
> (XEN) [ 3668.297952] [<ffff82d080245e69>] hap_track_dirty_vram+0x2c1/0x4a7
> (XEN) [ 3668.304797] [<ffff82d0801dd8f5>] do_hvm_op+0x18be/0x2b58
> (XEN) [ 3668.310864] [<ffff82d080172aca>] pv_hypercall+0x1e5/0x402
> (XEN) [ 3668.317017] [<ffff82d080250899>] entry.o#test_all_events+0/0x3d
>
> This shows a hypercall executed from Dom0 that's trying to pause
> the domain, thus pausing all the vCPUs.
>
> Then pCPU#3:
>
> (XEN) [ 3669.062841] RFLAGS: 0000000000000202 CONTEXT: hypervisor (d1v0)
> [...]
> (XEN) [ 3669.322832] Xen call trace:
> (XEN) [ 3669.326128] [<ffff82d08021006a>] vmx_start_reexecute_instruction+0x107/0x68a
> (XEN) [ 3669.333925] [<ffff82d080210b3e>] p2m_mem_access_check+0x551/0x64d
> (XEN) [ 3669.340774] [<ffff82d0801dee9e>] hvm_hap_nested_page_fault+0x2f2/0x631
> (XEN) [ 3669.348051] [<ffff82d080202c00>] vmx_vmexit_handler+0x156c/0x1e45
> (XEN) [ 3669.354899] [<ffff82d08020820c>] vmx_asm_vmexit_handler+0xec/0x250
>
> It seems to be blocked in vmx_start_reexecute_instruction, thus never
> getting paused, which triggers the watchdog on pCPU#0?
>
> You should check which vCPU the trace from pCPU#0 is waiting on; if
> that's the vCPU running on pCPU#3 (d1v0), you will have to check what's
> taking so long in vmx_start_reexecute_instruction.
Right, so this is what appears to be happening, if the output of my test
is to be trusted: https://pastebin.com/YEDqNuwh
1. vmx_start_reexecute_instruction() pauses all VCPUs but self (which
appears to be VCPU 1):
(XEN) [ 195.427141] 0 pause_count 0
(XEN) [ 195.427142] 2 pause_count 0
(XEN) [ 195.427143] 3 pause_count 0
(XEN) [ 195.427144] 4 pause_count 0
(XEN) [ 195.427146] 5 pause_count 0
(XEN) [ 195.427147] 6 pause_count 0
(XEN) [ 195.427148] 7 pause_count 0
2. The hypercall happens, which calls domain_pause(), which I've
modified thus:
@@ -959,7 +961,10 @@ static void do_domain_pause(struct domain *d,
     atomic_inc(&d->pause_count);

     for_each_vcpu( d, v )
+    {
+        printk("domain_pause %d\n", v->vcpu_id);
         sleep_fn(v);
+    }

     arch_domain_pause(d);
 }
and which says:
(XEN) [ 195.492064] domain_pause 0
3. At this point, according to addr2line,
vmx_start_reexecute_instruction() does "while ( a->is_running )
cpu_relax();" for all VCPUs but itself.
Now, d1v0, which, if I'm reading this correctly, is the VCPU that
domain_pause() is stuck waiting for, does:
(XEN) [ 200.829874] Xen call trace:
(XEN) [ 200.833166] [<ffff82d0801278c6>] queue_read_lock_slowpath+0x25/0x4d
(XEN) [ 200.840186] [<ffff82d08020c1f6>] get_page_from_gfn_p2m+0x14e/0x3b0
(XEN) [ 200.847121] [<ffff82d080247213>] hap_p2m_ga_to_gfn_4_levels+0x48/0x299
(XEN) [ 200.854400] [<ffff82d080247480>] hap_gva_to_gfn_4_levels+0x1c/0x1e
(XEN) [ 200.861331] [<ffff82d08021275c>] paging_gva_to_gfn+0x10e/0x11d
(XEN) [ 200.867918] [<ffff82d0801d66e0>] hvm.c#__hvm_copy+0x98/0x37f
(XEN) [ 200.874329] [<ffff82d0801d848d>] hvm_fetch_from_guest_virt_nofault+0x14/0x16
(XEN) [ 200.882130] [<ffff82d0801d141a>] emulate.c#_hvm_emulate_one+0x118/0x2bc
(XEN) [ 200.889496] [<ffff82d0801d16b4>] hvm_emulate_one+0x10/0x12
(XEN) [ 200.895735] [<ffff82d0801e0902>] handle_mmio+0x52/0xc9
(XEN) [ 200.901626] [<ffff82d0801e09ba>] handle_mmio_with_translation+0x41/0x43
(XEN) [ 200.908994] [<ffff82d0801ded1f>] hvm_hap_nested_page_fault+0x133/0x631
(XEN) [ 200.916271] [<ffff82d080202c40>] vmx_vmexit_handler+0x156c/0x1e45
(XEN) [ 200.923117] [<ffff82d08020824c>] vmx_asm_vmexit_handler+0xec/0x250
I hope I'm not reading this wrong.
Thanks,
Razvan