linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels
@ 2016-05-16 18:10 Linus Torvalds
  2016-05-16 18:37 ` Andy Lutomirski
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2016-05-16 18:10 UTC (permalink / raw)
  To: Andy Lutomirski, Marcelo Tosatti, Paolo Bonzini
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List

There is something odd being reported in Ubuntu.

There's a Mono SIGSEGV that was bisected to Andy's commit 1ddf0b1b11aa
("x86, vdso: Use asm volatile in __getcpu"), and then reported to be
fixed with commits

  80f7fdb1c7f0 ("x86: vdso: fix pvclock races with task migration")
  0a4e6be9ca17 ("x86: kvm: Revert "remove sched notifier for cross-cpu
migrations"")

and when those were backported all looked well.

But then those two commits in turn were reverted with

  73459e2a1ada ("x86: pvclock: Really remove the sched notifier for
cross-cpu migrations")

and people seem to report that it's back as a result:

  https://bugzilla.xamarin.com/show_bug.cgi?id=29212#c16

so apparently that task migration notifier somehow does matter.

Comments?

                  Linus


On Mon, May 16, 2016 at 1:13 AM,  <okhalzov@team.vestbery.com> wrote:
>
> Hello Linus,
>
> Am am sorry to bother you, but it seems that the bug from old kernels was
> copied to new >=4.1 kernels. We use Ubuntu/Docker/Mono and we had to
> rollback to 3.19.0-54 kernel for the work around.
>
> We found that a year ago there was a discussion on the launchpad
> (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1450584) regarding
> SIGSEGV on multi-cpu vm.
> It seems to me that the commits around that bug
> https://github.com/torvalds/linux/commits/master/arch/x86/kernel/pvclock.c
> caused 4.1 and up kernels to keep that bug.
> Please review pvclock.c to fix that problem.
>
> Kiitos! Thank you!
>
> --
> Oleg Khalzov
> SDE
> Vestbery
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels
  2016-05-16 18:10 Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels Linus Torvalds
@ 2016-05-16 18:37 ` Andy Lutomirski
  2016-05-16 18:56   ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Lutomirski @ 2016-05-16 18:37 UTC (permalink / raw)
  To: Linus Torvalds, okhalzov
  Cc: Marcelo Tosatti, Paolo Bonzini, the arch/x86 maintainers,
	Linux Kernel Mailing List

On Mon, May 16, 2016 at 11:10 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> There is something odd being reported in Ubuntu.
>
> There's a Mono SIGSEGV that was bisected to Andy's commit 1ddf0b1b11aa
> ("x86, vdso: Use asm volatile in __getcpu"), and then reported to be
> fixed with commits

I'm reasonably confident that the addition of "volatile" is not the
root cause...

>
>   80f7fdb1c7f0 ("x86: vdso: fix pvclock races with task migration")
>   0a4e6be9ca17 ("x86: kvm: Revert "remove sched notifier for cross-cpu
> migrations"")
>
> and when those were backported all looked well.
>
> But then those two commits in turn were reverted with
>
>   73459e2a1ada ("x86: pvclock: Really remove the sched notifier for
> cross-cpu migrations")
>
> and people seem to report that it's back as a result:
>
>   https://bugzilla.xamarin.com/show_bug.cgi?id=29212#c16
>
> so apparently that task migration notifier somehow does matter.

All of those fixes were intended to fix incorrect times being
reported, not segfaults.  Weird.

I tried to sign up for Xamarin bugzilla and made zero progress (the
confirmation email was never sent).  That being said, this bug report
is very confusing.  Could someone who can reproduce this provide the
following information:

1. What is the contents of
/sys/devices/system/clocksource/clocksource0/current_clocksource

2. If you do:

 echo tsc >/sys/devices/system/clocksource/clocksource0/current_clocksource

can you still reproduce it?

3. I rewrote the whole vdso pvclock mess in Linux 4.5.  Does the bug
exist in Linux 4.5?

4. What is actually crashing?  The stack trace says:

Method (wrapper managed-to-managed) string:.ctor (char[],int,int)
emitted at 0x40b5b1b0 to 0x40b5b1d9 (code length 41)

[bug-18026.exe]
converting method (wrapper managed-to-native)
object:__icall_wrapper_mono_gc_alloc_string (intptr,intptr,int)
Method (wrapper managed-to-native)
object:__icall_wrapper_mono_gc_alloc_string (intptr,intptr,int)
emitted at 0x40b5b1f0 to 0x40b5b284 (code length 148) [bug-18026.exe]

Unhandled Exception:
System.NullReferenceException: Object reference not set to an instance
of an object
  at Test.Main () [0x00000] in <filename unknown>:0
[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException:
Object reference not set to an instance of an object
  at Test.Main () [0x00000] in <filename unknown>:0


What on earth does that mean?  Is mono crashing in the vdso?  Is mono
crashing because time went backwards?  Is mono crashing because its GC
is just weirdly buggy, uses clock_gettime, and has a race condition
that is or is not triggered depending on how long the function takes?

An actual stack dump of the segfault (the native stack, not what mono
thinks the stack is) would be nice.


FWIW, the pvclock host code is complicated and it's not obvious to me
that it has any particular guarantee of monotonicity.  (That's not to
say it has a bug that breaks monotonicity -- it's just that I, as a
reader of the code, have never had a clear understanding of what it's
trying to do or why it's trying to do it.)

If it's fixed in 4.5, I suppose the big rewrite could be backported,
but I'd rather have some understanding of what's going on.

--Andy

>
> Comments?
>
>                   Linus
>
>
> On Mon, May 16, 2016 at 1:13 AM,  <okhalzov@team.vestbery.com> wrote:
>>
>> Hello Linus,
>>
>> Am am sorry to bother you, but it seems that the bug from old kernels was
>> copied to new >=4.1 kernels. We use Ubuntu/Docker/Mono and we had to
>> rollback to 3.19.0-54 kernel for the work around.
>>
>> We found that a year ago there was a discussion on the launchpad
>> (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1450584) regarding
>> SIGSEGV on multi-cpu vm.
>> It seems to me that the commits around that bug
>> https://github.com/torvalds/linux/commits/master/arch/x86/kernel/pvclock.c
>> caused 4.1 and up kernels to keep that bug.
>> Please review pvclock.c to fix that problem.
>>
>> Kiitos! Thank you!
>>
>> --
>> Oleg Khalzov
>> SDE
>> Vestbery
>>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels
  2016-05-16 18:37 ` Andy Lutomirski
@ 2016-05-16 18:56   ` Linus Torvalds
  2016-05-17 11:06     ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2016-05-16 18:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: okhalzov, Marcelo Tosatti, Paolo Bonzini,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Mon, May 16, 2016 at 11:37 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> All of those fixes were intended to fix incorrect times being
> reported, not segfaults.  Weird.

I'm assuming it's "time going backwards". I can easily see that
causing segfaults.

I've seen lots of code that timestamps events, and can easily imagine
confusion if the end result is not ordered (ie walking off the
beginning/end of a list or array or similar because the algorithm
"knows" that the events are ordered).

I agree that the original bisection result is a bit questionable, and
it might just be about exposing a timing issue.

                   Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels
  2016-05-16 18:56   ` Linus Torvalds
@ 2016-05-17 11:06     ` Paolo Bonzini
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2016-05-17 11:06 UTC (permalink / raw)
  To: Linus Torvalds, Andy Lutomirski
  Cc: okhalzov, Marcelo Tosatti, the arch/x86 maintainers,
	Linux Kernel Mailing List, Taloth Saldono



On 16/05/2016 20:56, Linus Torvalds wrote:
> On Mon, May 16, 2016 at 11:37 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> All of those fixes were intended to fix incorrect times being
>> reported, not segfaults.  Weird.
> 
> I'm assuming it's "time going backwards". I can easily see that
> causing segfaults.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1450584 says the
crashes occurs "more frequently on a vbox vm with multiple CPUs
configured".  If vbox tells the VM to use the vdso when it shouldn't
(because the TSC is not stable on the host), bad things could happen.

However, based on the Xamarin bug comments and
http://lists.ximian.com/pipermail/mono-devel-list/2015-August/043181.html,
it looks like vbox after all doesn't use pvclock and the trigger seems
to be https://github.com/torvalds/linux/commit/c70e1b475f37:

   With __always_inline on vread_pvclock, mono crashed. With noinline on
   vread_pvclock, mono doesn't crash. Weirdest part is that the pvclock
   isn't even used during my tests.

Oleg, what is your environment exactly?  You mentioned Docker, but are
you also virtualizing and if so what is your hypervisor?

Taloth, do you know if it can it be reproduced under Xen or KVM or
bare-metal?  I'd trust them more than VirtualBox regarding timekeeping.
 I saw Amazon Linux mentioned in the mono mailing list archives which
would point to Xen.  Xen doesn't use the pvclock vdso code either, though.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-05-17 11:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-16 18:10 Please review arch/x86/kernel/pvclock.c to fix Docker/Mono crashes in new Kernels Linus Torvalds
2016-05-16 18:37 ` Andy Lutomirski
2016-05-16 18:56   ` Linus Torvalds
2016-05-17 11:06     ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).