On 03/29/2017 05:00 PM, Razvan Cojocaru wrote:
> On 03/29/2017 04:55 PM, Jan Beulich wrote:
>>>>> On 28.03.17 at 12:50, wrote:
>>> On 03/28/2017 01:47 PM, Jan Beulich wrote:
>>>>>>> On 28.03.17 at 12:27, wrote:
>>>>> On 03/28/2017 01:03 PM, Jan Beulich wrote:
>>>>>>>>> On 28.03.17 at 11:14, wrote:
>>>>>>> I'm not sure that the RETRY model is what the guest OS expects. AFAIK, a
>>>>>>> failed CMPXCHG should happen just once, with the proper registers and ZF
>>>>>>> set. The guest surely expects neither that the instruction resume until
>>>>>>> it succeeds, nor that some hidden loop goes on for an indeterminate
>>>>>>> amount of time until a CMPXCHG succeeds.
>>>>>>
>>>>>> The guest doesn't observe the CMPXCHG failing - RETRY leads to
>>>>>> the instruction being restarted instead of completed.
>>>>>
>>>>> Indeed, but it works differently with hvm_emulate_one_vm_event() where
>>>>> RETRY currently would have the instruction be re-executed (properly
>>>>> re-executed, not just re-emulated) by the guest.
>>>>
>>>> Right - see my other reply to Andrew: The function likely would
>>>> need to tell apart guest CMPXCHG uses from us using the insn to
>>>> carry out the write by some other one. That may involve
>>>> adjustments to the memory write logic in x86_emulate() itself, as
>>>> the late failure of the comparison then would also need to be
>>>> communicated back (via ZF clear) to the guest.
>>>
>>> Exactly, it would require quite some reworking of x86_emulate().
>>
>> I had imagined it to be less intrusive (outside of x86_emulate()),
>> but I've now learned why Andrew was able to get rid of
>> X86EMUL_CMPXCHG_FAILED - the apparently intended behavior
>> was never implemented. Attached a first take at it, which has
>> seen smoke testing, but nothing more. The way it ends up being
>> I don't think this can reasonably be considered for 4.9 at this
>> point in time. (Also Cc-ing Tim for the shadow code changes,
>> even if this isn't really a proper patch submission.)
>
> Thanks! I'll give it a spin with a modified version of my CMPXCHG patch as
> soon as possible.

With the attached patch, with hvmemul_cmpxchg() now returning
X86EMUL_CMPXCHG_FAILED if __cmpxchg() fails, my (32-bit) Windows 7 guest
gets stuck at the "Starting Windows" screen. Its state appears to be:

# ./xenctx -a 3
cs:eip: 0008:8bcd85d6
flags: 00200246 cid i z p
ss:esp: 0010:82736b9c
eax: 00000000   ebx: 84f3a678   ecx: 84ee2610   edx: 001eb615
esi: 40008000   edi: 82739d20   ebp: 82736c20
 ds: 0023        es: 0023        fs: 0030        gs: 0000

cr0: 8001003b
cr2: 8fd94000
cr3: 00185000
cr4: 000406f9

dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: fffe0ff0
dr7: 00000400
Code (instr addr 8bcd85d6)
47 fc 83 c7 14 4e 75 ef 5f 5e c3 cc cc cc cc cc cc 8b ff fb f4 cc cc cc
cc cc 8b ff 55 8b ec

# ./xenctx -a 3
cs:eip: 0008:8bcd85d6
flags: 00200246 cid i z p
ss:esp: 0010:82736b9c
eax: 00000000   ebx: 84f3a678   ecx: 84ee2610   edx: 002ca60d
esi: 40008000   edi: 82739d20   ebp: 82736c20
 ds: 0023        es: 0023        fs: 0030        gs: 0000

cr0: 8001003b
cr2: 8fd94000
cr3: 00185000
cr4: 000406f9

dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: fffe0ff0
dr7: 00000400
Code (instr addr 8bcd85d6)
47 fc 83 c7 14 4e 75 ef 5f 5e c3 cc cc cc cc cc cc 8b ff fb f4 cc cc cc
cc cc 8b ff 55 8b ec

This only happens in SMP scenarios (my guest had 10 VCPUs for easy
reproduction). With a single VCPU, the guest booted fine. So something is
still not right when a CMPXCHG fails in a race-type situation (unless
something's obviously wrong with my patch, but I don't see it).

Thanks,
Razvan
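
For reference, the architectural behaviour discussed in the quoted thread
(a failed guest CMPXCHG completes exactly once, with ZF clear and the
accumulator loaded from memory, rather than being restarted) can be
sketched roughly as below. This is only an illustration in plain C11, not
the attached patch and not Xen code: emulate_cmpxchg32(), struct
guest_regs, the EMUL_* return codes and cmpxchg_demo() are hypothetical
names made up for the example, and the race is replayed sequentially
rather than on real VCPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical return codes, loosely named after those in the thread. */
enum emul_rc { EMUL_OKAY, EMUL_CMPXCHG_FAILED };

/* Hypothetical, minimal guest state: accumulator and zero flag only. */
struct guest_regs {
    uint32_t eax;
    bool zf;
};

/*
 * Emulate one guest "lock cmpxchg src, (mem)".
 * Match:    src is stored to *mem and ZF is set.
 * Mismatch: the value found in *mem is loaded into eax and ZF is
 *           cleared. The instruction still completes; nothing is retried.
 */
static enum emul_rc emulate_cmpxchg32(struct guest_regs *regs,
                                      _Atomic uint32_t *mem, uint32_t src)
{
    uint32_t expected = regs->eax;

    if (atomic_compare_exchange_strong(mem, &expected, src)) {
        regs->zf = true;
        return EMUL_OKAY;
    }

    /* On failure, 'expected' now holds the value actually in memory. */
    regs->eax = expected;
    regs->zf = false;
    return EMUL_CMPXCHG_FAILED;
}

/* Two VCPUs race for the same word; the race is replayed in order. */
static void cmpxchg_demo(void)
{
    _Atomic uint32_t lock_word = 0;
    struct guest_regs vcpu0 = { .eax = 0, .zf = false };
    struct guest_regs vcpu1 = { .eax = 0, .zf = false };
    enum emul_rc rc;

    /* vcpu0 gets there first: comparison matches, the word becomes 1. */
    rc = emulate_cmpxchg32(&vcpu0, &lock_word, 1);
    assert(rc == EMUL_OKAY && vcpu0.zf);
    assert(atomic_load(&lock_word) == 1);

    /* vcpu1 loses: memory untouched, eax updated, ZF clear. */
    rc = emulate_cmpxchg32(&vcpu1, &lock_word, 2);
    assert(rc == EMUL_CMPXCHG_FAILED && !vcpu1.zf);
    assert(vcpu1.eax == 1 && atomic_load(&lock_word) == 1);
}
```

The point of the sketch is only that the losing VCPU sees a completed
instruction with updated registers, which is what the guest's own retry
loop is written against; how the emulator should report that outcome
back is the open question in this thread.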