All of lore.kernel.org
 help / color / mirror / Atom feed
* 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
@ 2017-08-28 12:26 Takashi Iwai
  2017-08-28 13:06 ` Adam Borowski
  2017-08-28 13:52 ` Mike Galbraith
  0 siblings, 2 replies; 6+ messages in thread
From: Takashi Iwai @ 2017-08-28 12:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, x86, linux-kernel

Hi,

I seem to get a kernel warning when running KVM on Dell desktop with
IvyBridge like below.  As you can see, a bad page BUG is triggered
after that, too.  The problem is not triggered always, but it happens
occasionally.

I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
since rc5.  So this might be a regression at rc5.  But, as it doesn't
happen always, I can't be 100% sure about it, and it's quite difficult
to bisect (the test case isn't reliable), unfortunately.

Any hint for further debugging this?


thanks,

Takashi

===

 WARNING: CPU: 2 PID: 7188 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0xe7/0x100 [kvm]
 CPU: 2 PID: 7188 Comm: qemu-system-x86 Not tainted 4.13.0-rc7-test+ #26
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 task: ffff997fdc1320c0 task.stack: ffffb9394cf70000
 RIP: 0010:mmu_spte_clear_track_bits+0xe7/0x100 [kvm]
 RSP: 0018:ffffb9394cf73ab8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 400000010552cc77 RCX: dead0000000000ff
 RDX: 0000000000000000 RSI: ffff997f53659108 RDI: ffffe7c684154b00
 RBP: ffffb9394cf73ad0 R08: 0000000000000100 R09: 000000000000000c
 R10: ffff997fc2ba0008 R11: ffff997fc2ba0000 R12: 000000000010552c
 R13: ffff997edd0f0000 R14: ffff997f69c44348 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff9980de280000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffd57a77ff8 CR3: 0000000105704000 CR4: 00000000001426e0
 Call Trace:
  drop_spte+0x1a/0xb0 [kvm]
  mmu_page_zap_pte+0x9d/0xe0 [kvm]
  kvm_mmu_prepare_zap_page+0x56/0x300 [kvm]
  kvm_mmu_invalidate_zap_all_pages+0x10d/0x160 [kvm]
  kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
  kvm_mmu_notifier_release+0x2c/0x40 [kvm]
  __mmu_notifier_release+0x44/0xc0
  exit_mmap+0x11e/0x130
  ? __switch_to+0x216/0x3f0
  ? __khugepaged_exit+0x114/0x120
  mmput+0x46/0x120
  do_exit+0x277/0xb10
  ? preempt_schedule_common+0x1f/0x30
  ? preempt_schedule+0x27/0x30
  ? ___preempt_schedule+0x16/0x18
  do_group_exit+0x43/0xb0
  get_signal+0x260/0x600
  ? get_futex_key+0x294/0x350
  do_signal+0x28/0x710
  ? recalc_sigpending+0x1b/0x50
  exit_to_usermode_loop+0x6e/0xc0
  syscall_return_slowpath+0x59/0x60
  entry_SYSCALL_64_fastpath+0xa7/0xa9
 RIP: 0033:0x7f95aab8f8ec
 RSP: 002b:00007f959c36c7e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
 RAX: fffffffffffffe00 RBX: 000056156743b010 RCX: 00007f95aab8f8ec
 RDX: 0000000000000002 RSI: 0000000000000080 RDI: 000056156626c080
 RBP: 0000000000000000 R08: 000056156626c080 R09: 000056156626c000
 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
 R13: 000056156626c080 R14: 0000000000000000 R15: 000056156743b03c
 Code: 4f 04 00 48 85 c0 75 1c 4c 89 e7 e8 54 4d fe ff 48 8b 05 9d 4f 04 00 48 85 c0 74 c1 48 85 c3 0f 95 c3 eb bf 48 85 c3 74 e7 eb dd <0f> ff eb 9e 4c 89 e7 e8 2d 4d fe ff eb a4 90 66 2e 0f 1f 84 00 
 ---[ end trace dd04b99ed8787467 ]---
 BUG: Bad page state in process python  pfn:10552c
 page:ffffe7c684154b00 count:0 mapcount:0 mapping:          (null) index:0x1
 flags: 0x2ffff800000014(referenced|dirty)
 raw: 002ffff800000014 0000000000000000 0000000000000001 00000000ffffffff
 raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
 bad because of flags: 0x14(referenced|dirty)
 CPU: 7 PID: 9171 Comm: python Tainted: G        W       4.13.0-rc7-test+ #26
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 Call Trace:
  dump_stack+0x63/0x8d
  bad_page+0xcb/0x120
  check_new_page_bad+0x67/0x80
  get_page_from_freelist+0x9d1/0xb00
  __alloc_pages_nodemask+0xce/0x230
  alloc_pages_vma+0x88/0x1f0
  __handle_mm_fault+0x7c9/0x1000
  handle_mm_fault+0xde/0x1e0
  __do_page_fault+0x23b/0x4f0
  do_page_fault+0x22/0x30
  page_fault+0x28/0x30
 RIP: 0033:0x7f4a2761c2c3
 RSP: 002b:00007fffbe1b2988 EFLAGS: 00010206
 RAX: 0000557a87b725b0 RBX: 00000000000000e8 RCX: 0000557a87b75000
 RDX: 0000557a87b78a00 RSI: 00000000000000ba RDI: 0000557a87b725b0
 RBP: 00007f4a279362a8 R08: 0000000000000003 R09: 000000000000007c
 R10: 00000000000001c1 R11: 0000000000000077 R12: 00007f4a27935b38
 R13: 0000000000006460 R14: 0000557a87b725b0 R15: 00007f4a27935ae0
 Disabling lock debugging due to kernel taint

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
  2017-08-28 12:26 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter) Takashi Iwai
@ 2017-08-28 13:06 ` Adam Borowski
  2017-08-28 13:15   ` Takashi Iwai
  2017-08-28 13:52 ` Mike Galbraith
  1 sibling, 1 reply; 6+ messages in thread
From: Adam Borowski @ 2017-08-28 13:06 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Paolo Bonzini, kvm, x86, linux-kernel

On Mon, Aug 28, 2017 at 02:26:06PM +0200, Takashi Iwai wrote:
> I seem to get a kernel warning when running KVM on Dell desktop with
> IvyBridge like below.  As you can see, a bad page BUG is triggered
> after that, too.  The problem is not triggered always, but it happens
> occasionally.

See the thread starting with 20170820231302.s732zclznrqxwr46@angband.pl

> I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
> since rc5.  So this might be a regression at rc5.  But, as it doesn't
> happen always, I can't be 100% sure about it, and it's quite difficult
> to bisect (the test case isn't reliable), unfortunately.

Same here -- it sometimes takes a few hours of trying to reproduce, which
makes proving the negative greatly unpleasant.

And all I've been able to tell so far is that the problem is between
4.13-rc4 and 4.13-rc5, just like you say.

> ===
> 
>  WARNING: CPU: 2 PID: 7188 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0xe7/0x100 [kvm]
>  CPU: 2 PID: 7188 Comm: qemu-system-x86 Not tainted 4.13.0-rc7-test+ #26
>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>  task: ffff997fdc1320c0 task.stack: ffffb9394cf70000
>  RIP: 0010:mmu_spte_clear_track_bits+0xe7/0x100 [kvm]
>  RSP: 0018:ffffb9394cf73ab8 EFLAGS: 00010246
>  RAX: 0000000000000000 RBX: 400000010552cc77 RCX: dead0000000000ff
>  RDX: 0000000000000000 RSI: ffff997f53659108 RDI: ffffe7c684154b00
>  RBP: ffffb9394cf73ad0 R08: 0000000000000100 R09: 000000000000000c
>  R10: ffff997fc2ba0008 R11: ffff997fc2ba0000 R12: 000000000010552c
>  R13: ffff997edd0f0000 R14: ffff997f69c44348 R15: 0000000000000000
>  FS:  0000000000000000(0000) GS:ffff9980de280000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007ffd57a77ff8 CR3: 0000000105704000 CR4: 00000000001426e0
>  Call Trace:
>   drop_spte+0x1a/0xb0 [kvm]
>   mmu_page_zap_pte+0x9d/0xe0 [kvm]
>   kvm_mmu_prepare_zap_page+0x56/0x300 [kvm]
>   kvm_mmu_invalidate_zap_all_pages+0x10d/0x160 [kvm]
>   kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
>   kvm_mmu_notifier_release+0x2c/0x40 [kvm]
>   __mmu_notifier_release+0x44/0xc0
>   exit_mmap+0x11e/0x130
>   ? __switch_to+0x216/0x3f0
>   ? __khugepaged_exit+0x114/0x120
>   mmput+0x46/0x120
>   do_exit+0x277/0xb10
>   ? preempt_schedule_common+0x1f/0x30
>   ? preempt_schedule+0x27/0x30
>   ? ___preempt_schedule+0x16/0x18
>   do_group_exit+0x43/0xb0
>   get_signal+0x260/0x600
>   ? get_futex_key+0x294/0x350
>   do_signal+0x28/0x710
>   ? recalc_sigpending+0x1b/0x50
>   exit_to_usermode_loop+0x6e/0xc0
>   syscall_return_slowpath+0x59/0x60
>   entry_SYSCALL_64_fastpath+0xa7/0xa9
>  RIP: 0033:0x7f95aab8f8ec
>  RSP: 002b:00007f959c36c7e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>  RAX: fffffffffffffe00 RBX: 000056156743b010 RCX: 00007f95aab8f8ec
>  RDX: 0000000000000002 RSI: 0000000000000080 RDI: 000056156626c080
>  RBP: 0000000000000000 R08: 000056156626c080 R09: 000056156626c000
>  R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
>  R13: 000056156626c080 R14: 0000000000000000 R15: 000056156743b03c
>  Code: 4f 04 00 48 85 c0 75 1c 4c 89 e7 e8 54 4d fe ff 48 8b 05 9d 4f 04 00 48 85 c0 74 c1 48 85 c3 0f 95 c3 eb bf 48 85 c3 74 e7 eb dd <0f> ff eb 9e 4c 89 e7 e8 2d 4d fe ff eb a4 90 66 2e 0f 1f 84 00 
>  ---[ end trace dd04b99ed8787467 ]---

The first WARN is always the above.  But the rest seems to be totally random
-- a nasty case of fandango on core whose results range from harmless
through crash to massive data loss (just guess what would happen if some
idiot picked balancing the disk as a test load -- no one would be that
stupid, right?  At least an incomplete idiot has checksums and backups).

>  BUG: Bad page state in process python  pfn:10552c
>  page:ffffe7c684154b00 count:0 mapcount:0 mapping:          (null) index:0x1
>  flags: 0x2ffff800000014(referenced|dirty)
>  raw: 002ffff800000014 0000000000000000 0000000000000001 00000000ffffffff
>  raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
>  page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
>  bad because of flags: 0x14(referenced|dirty)
>  CPU: 7 PID: 9171 Comm: python Tainted: G        W       4.13.0-rc7-test+ #26
>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>  Call Trace:
>   dump_stack+0x63/0x8d
>   bad_page+0xcb/0x120
>   check_new_page_bad+0x67/0x80
>   get_page_from_freelist+0x9d1/0xb00
>   __alloc_pages_nodemask+0xce/0x230
>   alloc_pages_vma+0x88/0x1f0
>   __handle_mm_fault+0x7c9/0x1000
>   handle_mm_fault+0xde/0x1e0
>   __do_page_fault+0x23b/0x4f0
>   do_page_fault+0x22/0x30
>   page_fault+0x28/0x30
>  RIP: 0033:0x7f4a2761c2c3
>  RSP: 002b:00007fffbe1b2988 EFLAGS: 00010206
>  RAX: 0000557a87b725b0 RBX: 00000000000000e8 RCX: 0000557a87b75000
>  RDX: 0000557a87b78a00 RSI: 00000000000000ba RDI: 0000557a87b725b0
>  RBP: 00007f4a279362a8 R08: 0000000000000003 R09: 000000000000007c
>  R10: 00000000000001c1 R11: 0000000000000077 R12: 00007f4a27935b38
>  R13: 0000000000006460 R14: 0000557a87b725b0 R15: 00007f4a27935ae0
>  Disabling lock debugging due to kernel taint
> 

-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀                                 -- Genghis Ht'rok'din
⠈⠳⣄⠀⠀⠀⠀ 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
  2017-08-28 13:06 ` Adam Borowski
@ 2017-08-28 13:15   ` Takashi Iwai
  2017-08-28 13:29     ` Richard Weinberger
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2017-08-28 13:15 UTC (permalink / raw)
  To: Adam Borowski; +Cc: Paolo Bonzini, kvm, x86, linux-kernel

On Mon, 28 Aug 2017 15:06:00 +0200,
Adam Borowski wrote:
> 
> On Mon, Aug 28, 2017 at 02:26:06PM +0200, Takashi Iwai wrote:
> > I seem to get a kernel warning when running KVM on Dell desktop with
> > IvyBridge like below.  As you can see, a bad page BUG is triggered
> > after that, too.  The problem is not triggered always, but it happens
> > occasionally.
> 
> See the thread starting with 20170820231302.s732zclznrqxwr46@angband.pl
> 
> > I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
> > since rc5.  So this might be a regression at rc5.  But, as it doesn't
> > happen always, I can't be 100% sure about it, and it's quite difficult
> > to bisect (the test case isn't reliable), unfortunately.
> 
> Same here -- it sometimes takes a few hours of trying to reproduce, which
> makes proving the negative greatly unpleasant.
> 
> And all I've been able to tell so far is that the problem is between
> 4.13-rc4 and 4.13-rc5, just like you say.

Good to hear that we can chorus!
So if it's really a regression between rc4 and rc5, I see no obvious
changes in arch/x86, i.e. it's likely somewhere else.

(snip)
> The first WARN is always the above.  But the rest seems to be totally random
> -- a nasty case of fandango on core whose results range from harmless
> through crash to massive data loss (just guess what would happen if some
> idiot picked balancing the disk as a test load -- no one would be that
> stupid, right?  At least an incomplete idiot has checksums and backups).

Yeah, the crash after the WARNING seems quite random.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
  2017-08-28 13:15   ` Takashi Iwai
@ 2017-08-28 13:29     ` Richard Weinberger
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Weinberger @ 2017-08-28 13:29 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Adam Borowski, Paolo Bonzini, kvm, x86, LKML

On Mon, Aug 28, 2017 at 3:15 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Mon, 28 Aug 2017 15:06:00 +0200,
> Adam Borowski wrote:
>>
>> On Mon, Aug 28, 2017 at 02:26:06PM +0200, Takashi Iwai wrote:
>> > I seem to get a kernel warning when running KVM on Dell desktop with
>> > IvyBridge like below.  As you can see, a bad page BUG is triggered
>> > after that, too.  The problem is not triggered always, but it happens
>> > occasionally.
>>
>> See the thread starting with 20170820231302.s732zclznrqxwr46@angband.pl
>>
>> > I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
>> > since rc5.  So this might be a regression at rc5.  But, as it doesn't
>> > happen always, I can't be 100% sure about it, and it's quite difficult
>> > to bisect (the test case isn't reliable), unfortunately.
>>
>> Same here -- it sometimes takes a few hours of trying to reproduce, which
>> makes proving the negative greatly unpleasant.
>>
>> And all I've been able to tell so far is that the problem is between
>> 4.13-rc4 and 4.13-rc5, just like you say.
>
> Good to hear that we can chorus!
> So if it's really a regression between rc4 and rc5, I see no obvious
> changes in arch/x86, i.e. it's likely somewhere else.
>
> (snip)
>> The first WARN is always the above.  But the rest seems to be totally random
>> -- a nasty case of fandango on core whose results range from harmless
>> through crash to massive data loss (just guess what would happen if some
>> idiot picked balancing the disk as a test load -- no one would be that
>> stupid, right?  At least an incomplete idiot has checksums and backups).
>
> Yeah, the crash after the WARNING seems quite random.

I don't see that warning but running qemu-kvm crashed by laptop
(4.13-rc4) two times in a row and
some userspace applications such has firefox or bash crashed randomly.

pstore-efi is now enabled to capture the stacktrace.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
  2017-08-28 12:26 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter) Takashi Iwai
  2017-08-28 13:06 ` Adam Borowski
@ 2017-08-28 13:52 ` Mike Galbraith
  2017-08-28 14:41   ` Mike Galbraith
  1 sibling, 1 reply; 6+ messages in thread
From: Mike Galbraith @ 2017-08-28 13:52 UTC (permalink / raw)
  To: Takashi Iwai, Paolo Bonzini; +Cc: kvm, x86, linux-kernel

On Mon, 2017-08-28 at 14:26 +0200, Takashi Iwai wrote:
> Hi,
> 
> I seem to get a kernel warning when running KVM on Dell desktop with
> IvyBridge like below.  As you can see, a bad page BUG is triggered
> after that, too.  The problem is not triggered always, but it happens
> occasionally.
> 
> I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
> since rc5.  So this might be a regression at rc5.  But, as it doesn't
> happen always, I can't be 100% sure about it, and it's quite difficult
> to bisect (the test case isn't reliable), unfortunately.
> 
> Any hint for further debugging this?

Maybe a way to make failure more likely.  This is an RT kernel, but
trying to build a fat kernel over NFS from a KVM clone of my
workstation (full topology, half of ram) didn't survive one build.

[ 2583.153312] WARNING: CPU: 7 PID: 9323 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x82/0x100 [kvm]
[ 2583.153899] WARNING: CPU: 7 PID: 9323 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x82/0x100 [kvm]
[ 2583.154016] WARNING: CPU: 7 PID: 9323 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x82/0x100 [kvm]
[ 2583.158810] WARNING: CPU: 7 PID: 9323 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x82/0x100 [kvm]
[ 2768.419797] BUG: Bad page state in process as  pfn:048b3
[ 2768.419932] BUG: Bad page state in process as  pfn:04983
[ 2775.097980] BUG: Bad page state in process cc1  pfn:04982
[ 2782.487748] BUG: Bad page state in process cc1  pfn:04980
[ 2782.622636] BUG: Bad page state in process cc1  pfn:048b0
[ 2782.622899] BUG: Bad page state in process cc1  pfn:04984
[ 2782.623053] BUG: Bad page state in process cc1  pfn:04986
[ 2782.673705] BUG: Bad page state in process cc1  pfn:048b4
[ 2782.673903] BUG: Bad page state in process cc1  pfn:048b6
[ 2782.674044] BUG: Bad page state in process cc1  pfn:04989
[ 2782.674185] BUG: Bad page state in process cc1  pfn:0498a
[ 2784.895701] BUG: Bad page state in process cc1  pfn:04990
[ 2784.895921] BUG: Bad page state in process cc1  pfn:04992
[ 2784.896100] BUG: Bad page state in process cc1  pfn:04994
[ 2784.896255] BUG: Bad page state in process cc1  pfn:04996
[ 2784.905232] BUG: Bad page state in process cc1  pfn:0499c
[ 2784.905501] BUG: Bad page state in process cc1  pfn:0499e
[ 2785.762044] BUG: Bad page state in process cc1  pfn:040cb
[ 2787.052976] BUG: Bad page state in process cc1  pfn:048ca
[ 2787.208480] BUG: Bad page state in process kdesu  pfn:048a8
[ 2787.208694] BUG: Bad page state in process kdesu  pfn:048aa
[ 2787.208862] BUG: Bad page state in process kdesu  pfn:048ac
[ 2787.208957] BUG: Bad page state in process kdesu  pfn:048ae
[ 2787.211725] BUG: Bad page state in process cc1  pfn:04884
[ 2787.219784] BUG: Bad page state in process kdesu  pfn:04888
[ 2787.226212] BUG: Bad page state in process cc1  pfn:049a0
[ 2788.955108] BUG: Bad page state in process cc1  pfn:048e9
[ 2788.959686] BUG: Bad page state in process cc1  pfn:048f1
[ 2788.959882] BUG: Bad page state in process cc1  pfn:048f2
[ 2788.977485] BUG: Bad page state in process cc1  pfn:048fe
[ 2789.295335] BUG: Bad page state in process cc1  pfn:04807
[ 2794.661501] BUG: Bad page state in process cc1  pfn:04819
[ 2794.661658] BUG: Bad page state in process cc1  pfn:0481b
[ 2794.661747] BUG: Bad page state in process cc1  pfn:0481d
[ 2794.680432] BUG: Bad page state in process cc1  pfn:0482a
[ 2794.692849] BUG: Bad page state in process cc1  pfn:04834
[ 2794.705438] BUG: Bad page state in process cc1  pfn:0483c
[ 2794.784882] BUG: Bad page state in process gcc  pfn:0485c
[ 2794.785105] BUG: Bad page state in process gcc  pfn:0485e
[ 2796.541058] BUG: Bad page state in process Xorg  pfn:04011
[ 2808.425625] BUG: Bad page state in process Xorg  pfn:04a09
[ 3605.187591] BUG: unable to handle kernel paging request at 000000000001bcf4
[ 3605.202446] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33
[ 3605.203322] BUG: stack guard page was hit at ffffc9000e483ff8 (stack is ffffc9000e484000..ffffc9000e487fff)
[ 3605.279108] BUG: scheduling while atomic: ld/5485/0x00000002
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
  2017-08-28 13:52 ` Mike Galbraith
@ 2017-08-28 14:41   ` Mike Galbraith
  0 siblings, 0 replies; 6+ messages in thread
From: Mike Galbraith @ 2017-08-28 14:41 UTC (permalink / raw)
  To: Takashi Iwai, Paolo Bonzini; +Cc: kvm, x86, linux-kernel

On Mon, 2017-08-28 at 15:52 +0200, Mike Galbraith wrote:
> On Mon, 2017-08-28 at 14:26 +0200, Takashi Iwai wrote:
> > Hi,
> > 
> > I seem to get a kernel warning when running KVM on Dell desktop with
> > IvyBridge like below.  As you can see, a bad page BUG is triggered
> > after that, too.  The problem is not triggered always, but it happens
> > occasionally.
> > 
> > I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
> > since rc5.  So this might be a regression at rc5.  But, as it doesn't
> > happen always, I can't be 100% sure about it, and it's quite difficult
> > to bisect (the test case isn't reliable), unfortunately.
> > 
> > Any hint for further debugging this?
> 
> Maybe a way to make failure more likely.  This is an RT kernel, but
> trying to build a fat kernel over NFS from a KVM clone of my
> workstation (full topology, half of ram) didn't survive one build.

Bah, plain master in guest/host survived.

	-Mike

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-28 14:41 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-28 12:26 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter) Takashi Iwai
2017-08-28 13:06 ` Adam Borowski
2017-08-28 13:15   ` Takashi Iwai
2017-08-28 13:29     ` Richard Weinberger
2017-08-28 13:52 ` Mike Galbraith
2017-08-28 14:41   ` Mike Galbraith

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.