From: Juergen Gross <jgross@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Wei Liu <wei.liu2@citrix.com>, Jan Beulich <JBeulich@suse.com>
Subject: Re: [xen-unstable test] 123379: regressions - FAIL
Date: Wed, 13 Jun 2018 11:02:04 +0200
Message-ID: <7a9c3db6-6f47-8817-a2cd-ca21061ee00f@suse.com>
In-Reply-To: <8b71c6fa-c8a2-9f64-6d25-45a713571ce8@citrix.com>

On 13/06/18 10:58, Andrew Cooper wrote:
> On 13/06/18 09:52, Juergen Gross wrote:
>> On 12/06/18 17:58, Juergen Gross wrote:
>>> On 08/06/18 12:12, Juergen Gross wrote:
>>>> On 07/06/18 13:30, Juergen Gross wrote:
>>>>> On 06/06/18 11:40, Juergen Gross wrote:
>>>>>> On 06/06/18 11:35, Jan Beulich wrote:
>>>>>>>>>> On 05.06.18 at 18:19, <ian.jackson@citrix.com> wrote:
>>>>>>>>>>  test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 14 guest-saverestore.2 
>>>>>>>> I thought I would reply again with the key point from my earlier mail
>>>>>>>> highlighted, and go a bit further.  The first thing to go wrong in
>>>>>>>> this was:
>>>>>>>>
>>>>>>>> 2018-05-30 22:12:49.320+0000: xc: Failed to get types for pfn batch (14 = Bad address): Internal error
>>>>>>>> 2018-05-30 22:12:49.483+0000: xc: Save failed (14 = Bad address): Internal error
>>>>>>>> 2018-05-30 22:12:49.648+0000: libxl-save-helper: complete r=-1: Bad address
>>>>>>>>
>>>>>>>> You can see similar messages in the other logfile:
>>>>>>>>
>>>>>>>> 2018-05-30 22:12:49.650+0000: libxl: libxl_stream_write.c:350:libxl__xc_domain_save_done: Domain 3:saving domain: domain responded to suspend request: Bad address
>>>>>>>>
>>>>>>>> All of these are reports of the same thing: xc_get_pfn_type_batch at
>>>>>>>> xc_sr_save.c:133 failed with EFAULT.  I'm afraid I don't know why.
>>>>>>>>
>>>>>>>> There is no corresponding message in the host's serial log nor the
>>>>>>>> dom0 kernel log.
>>>>>>> I vaguely recall from the time when I had looked at the similar Windows
>>>>>>> migration issues that the guest is already in the process of being cleaned
>>>>>>> up when these occur. Commit 2dbe9c3cd2 ("x86/mm: silence a pointless
>>>>>>> warning") intentionally suppressed a log message here, and the
>>>>>>> immediately following debugging code (933f966bcd x86/mm: add
>>>>>>> temporary debugging code to get_page_from_gfn_p2m()) was reverted
>>>>>>> a little over a month later. This wasn't as a follow-up to another patch
>>>>>>> (fix), but following the discussion rooted at
>>>>>>> https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg00324.html
>>>>>> That was -ESRCH, not -EFAULT.
>>>>> I've looked a little bit more into this.
>>>>>
>>>>> As we are seeing EFAULT being returned by the hypervisor this either
>>>>> means the tools are specifying an invalid address (quite unlikely)
>>>>> or the buffers are not as MAP_LOCKED as we wish them to be.
>>>>>
>>>>> Is there a way to see whether the host was experiencing some memory
>>>>> shortage, so the buffers might have been swapped out?
>>>>>
>>>>> man mmap tells me: "This implementation will try to populate (prefault)
>>>>> the whole range but the mmap call doesn't fail with ENOMEM if this
>>>>> fails. Therefore major faults might happen later on."
>>>>>
>>>>> And: "One should use mmap(2) plus mlock(2) when major faults are not
>>>>> acceptable after the initialization of the mapping."
>>>>>
>>>>> With osdep_alloc_pages() in tools/libs/call/linux.c touching all the
>>>>> hypercall buffer pages before doing the hypercall I'm not sure this
>>>>> could be an issue.
>>>>>
>>>>> Any thoughts on that?
>>>> Ian, is there a chance to dedicate a machine to a specific test trying
>>>> to reproduce the problem? In case we manage to get this failure in a
>>>> reasonable time frame I guess the most promising approach would be to
>>>> use a test hypervisor producing more debug data. If you think this is
>>>> worth doing I can write a patch.
>>> Trying to reproduce the problem in a limited test environment finally
>>> worked: doing a loop of "xl save -c" produced the problem after 198
>>> iterations.
>>>
>>> I have asked a SUSE engineer doing kernel memory management if he
>>> could think of something. His idea is that maybe some kthread could be
>>> the reason for our problem, e.g. trying page migration or compaction
>>> (at least on the test machine I've looked at compaction of mlocked
>>> pages is allowed: /proc/sys/vm/compact_unevictable_allowed is 1).
>>>
>>> In order to be really sure nothing in the kernel can temporarily
>>> switch hypercall buffer pages read-only or invalid for the hypervisor
>>> we'll have to modify the privcmd driver interface: it will have to
>>> gain knowledge which pages are handed over to the hypervisor as buffers
>>> in order to be able to lock them accordingly via get_user_pages().
>>>
>>> While this is a possible explanation of the fault we are seeing it might
>>> be related to another reason. So I'm going to apply some modifications
>>> to the hypervisor to get some more diagnostics in order to verify the
>>> suspected kernel behavior is really the reason for the hypervisor to
>>> return EFAULT.
>> I was lucky. Took only 39 iterations this time.
>>
>> The debug data confirms the theory that the kernel is setting the PTE to
>> invalid or read only for a short amount of time:
>>
>> (XEN) fixup for address 00007ffb9904fe44, error_code 0002:
>> (XEN) Pagetable walk from 00007ffb9904fe44:
>> (XEN)  L4[0x0ff] = 0000000458da6067 0000000000019190
>> (XEN)  L3[0x1ee] = 0000000457d26067 0000000000018210
>> (XEN)  L2[0x0c8] = 0000000445ab3067 0000000000006083
>> (XEN)  L1[0x04f] = 8000000458cdc107 000000000001925a
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d0802abe31>] __copy_to_user_ll+0x27/0x30
>> (XEN)    [<ffff82d080272edb>] arch_do_domctl+0x5a8/0x2648
>> (XEN)    [<ffff82d080206d5d>] do_domctl+0x18fb/0x1c4e
>> (XEN)    [<ffff82d08036d1ba>] pv_hypercall+0x1f4/0x43e
>> (XEN)    [<ffff82d0803734a6>] lstar_enter+0x116/0x120
>>
>> The page was writable again when the page walk data has been collected,
>> but A and D bits still are 0 (which should not be the case in case the
>> kernel didn't touch the PTE, as the hypervisor read from that page some
>> instructions before the failed write).
>>
>> Starting with the Xen patches now...
> 
> Given that walk, I'd expect the spurious pagefault logic to have kicked
> in, and retried.
> 
> Presumably the spurious walk logic saw the non-present/read-only mappings?

I guess so.

Otherwise my debug code wouldn't have been called...


Juergen
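
As a reading aid for the quoted "(XEN)" page walk, the following is a minimal sketch that decodes the faulting address, the L1 entry and the error code taken from that output. It assumes the architectural x86-64 4-level paging layout and the architectural flag bit positions; Xen or a PV kernel may give some of these bits a software-defined meaning. The names table_indices, decode, PTE_FLAGS and FAULT_BITS are hypothetical helpers invented for this illustration, not part of Xen or Linux.

```python
# Sketch only: decode the values printed in the quoted Xen debug output.
# Bit positions follow the architectural x86-64 layout (an assumption here;
# software may reuse some bits for its own purposes).

# 4-level paging: each level consumes 9 bits of the virtual address.
LEVEL_SHIFTS = {"L4": 39, "L3": 30, "L2": 21, "L1": 12}

# Flag bits of a leaf (L1) page-table entry.
PTE_FLAGS = {
    "present": 1 << 0,
    "writable": 1 << 1,
    "user": 1 << 2,
    "accessed": 1 << 5,
    "dirty": 1 << 6,
    "global": 1 << 8,
    "no_execute": 1 << 63,
}

# Low bits of the page-fault error code.
FAULT_BITS = {
    "protection_violation": 1 << 0,  # clear: the page was not present
    "write": 1 << 1,                 # set: the access was a write
    "user_mode": 1 << 2,             # clear: the access came from supervisor mode
}


def table_indices(vaddr):
    """Return the page-table index used at each level for a virtual address."""
    return {level: (vaddr >> shift) & 0x1FF for level, shift in LEVEL_SHIFTS.items()}


def decode(value, bits):
    """Map each named bit to True/False depending on whether it is set in value."""
    return {name: bool(value & mask) for name, mask in bits.items()}


if __name__ == "__main__":
    # Values copied from the quoted debug output.
    vaddr = 0x00007FFB9904FE44
    l1_entry = 0x8000000458CDC107
    error_code = 0x0002

    print({level: hex(index) for level, index in table_indices(vaddr).items()})
    print(decode(l1_entry, PTE_FLAGS))
    print(decode(error_code, FAULT_BITS))
```

Decoded this way, the indices agree with the L4/L3/L2/L1 slots in the walk, and the L1 entry has the present and writable bits set while the accessed and dirty bits are clear, which is the observation the quoted analysis relies on.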


