stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hans van Kranenburg <hans@knorrie.org>
To: Juergen Gross <jgross@suse.com>,
	linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
	x86@kernel.org
Cc: sstabellini@kernel.org, stable@vger.kernel.org, mingo@redhat.com,
	bp@alien8.de, hpa@zytor.com, boris.ostrovsky@oracle.com,
	tglx@linutronix.de
Subject: Re: [Xen-devel] [PATCH v2] xen: Fix x86 sched_clock() interface for xen
Date: Fri, 11 Jan 2019 22:46:22 +0100	[thread overview]
Message-ID: <e4dde9d7-6f4d-c219-41e7-49ce2bded907@knorrie.org> (raw)
In-Reply-To: <cceafa86-e7b7-6e8d-851c-1c21ada0f445@knorrie.org>

On 1/11/19 4:57 PM, Hans van Kranenburg wrote:
> On 1/11/19 3:01 PM, Juergen Gross wrote:
>> On 11/01/2019 14:12, Hans van Kranenburg wrote:
>>> Hi,
>>>
>>> On 1/11/19 1:08 PM, Juergen Gross wrote:
>>>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>>>> sched_clock() interface") broke Xen guest time handling across
>>>> migration:
>>>>
>>>> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
>>>> [  187.251137] OOM killer disabled.
>>>> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>>> [  187.252299] suspending xenstore...
>>>> [  187.266987] xen:grant_table: Grant tables using version 1 layout
>>>> [18446743811.706476] OOM killer enabled.
>>>> [18446743811.706478] Restarting tasks ... done.
>>>> [18446743811.720505] Setting capacity to 16777216
>>>>
>>>> Fix that by setting xen_sched_clock_offset at resume time to ensure a
>>>> monotonic clock value.
>>>>
>>>> [...]
>>>
>>> I'm throwing around a PV domU over a bunch of test servers with live
>>> migrate now, and in between the kernel logging, I'm seeing this:
>>>
>>> [Fri Jan 11 13:58:42 2019] Freezing user space processes ... (elapsed
>>> 0.002 seconds) done.
>>> [Fri Jan 11 13:58:42 2019] OOM killer disabled.
>>> [Fri Jan 11 13:58:42 2019] Freezing remaining freezable tasks ...
>>> (elapsed 0.000 seconds) done.
>>> [Fri Jan 11 13:58:42 2019] suspending xenstore...
>>> [Fri Jan 11 13:58:42 2019] ------------[ cut here ]------------
>>> [Fri Jan 11 13:58:42 2019] Current state: 1
>>> [Fri Jan 11 13:58:42 2019] WARNING: CPU: 3 PID: 0 at
>>> kernel/time/clockevents.c:133 clockevents_switch_state+0x48/0xe0
>>> [...]
>>>
>>> This always happens on every *first* live migrate that I do after
>>> starting the domU.
>>
>> Yeah, its a WARN_ONCE().

Ok, false alarm. It's there, but not caused by this patch.

I changed the WARN_ONCE to WARN for funs, and now I get it a lot more
already (v2):

https://paste.debian.net/plainh/d535a379

>> And you didn't see it with v1 of the patch?
> 
> No.

I was wrong. I tried a bit more, and I can also reproduce without v1 or
v2 patch at all, and I can reproduce it with v4.19.9. Just sometimes
needs a dozen times more live migrating it before it shows up.

I cannot make it happen to show up with the Debian 4.19.9 distro kernel,
that's why it was new for me.

So, let's ignore it in this thread now.

>> On the first glance this might be another bug just being exposed by
>> my patch.
>>
>> I'm investigating further, but this might take some time. Could you
>> meanwhile verify the same happens with kernel 5.0-rc1? That was the
>> one I tested with and I didn't spot that WARN.
> 
> I have Linux 5.0-rc1 with v2 on top now, which gives me this on live
> migrate:
> 
> [...]
> [   51.871076] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000098
> [   51.871091] #PF error: [normal kernel read fault]
> [   51.871100] PGD 0 P4D 0
> [   51.871109] Oops: 0000 [#1] SMP NOPTI
> [   51.871117] CPU: 0 PID: 36 Comm: xenwatch Not tainted 5.0.0-rc1 #1
> [   51.871132] RIP: e030:blk_mq_map_swqueue+0x103/0x270
> [...]

Dunno about all the 5.0-rc1 crashes yet. I can provide more feedback
about that if you want, but not in here I presume.

Hans

  reply	other threads:[~2019-01-11 21:46 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11 12:08 [PATCH v2] xen: Fix x86 sched_clock() interface for xen Juergen Gross
2019-01-11 13:12 ` Hans van Kranenburg
2019-01-11 14:01   ` [Xen-devel] " Juergen Gross
2019-01-11 15:57     ` Hans van Kranenburg
2019-01-11 21:46       ` Hans van Kranenburg [this message]
2019-01-11 20:35 ` Boris Ostrovsky
2019-01-14 12:43   ` Juergen Gross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e4dde9d7-6f4d-c219-41e7-49ce2bded907@knorrie.org \
    --to=hans@knorrie.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=sstabellini@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).