From: Radha Mohan <mohun106@gmail.com>
To: qemu-devel@nongnu.org,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>
Subject: [Qemu-devel] host stalls when qemu-system-aarch64 with kvm and pflash
Date: Tue, 28 Mar 2017 12:58:24 -0700	[thread overview]
Message-ID: <CAC8NTUUWuH4k7SZSus_hz6a6V+OLGYUFcSOUwVbji2YDAZbVdQ@mail.gmail.com> (raw)

Hi,
I am seeing an issue with qemu-system-aarch64 when using pflash
(booting the kernel via a UEFI BIOS).

Host kernel: 4.11.0-rc3-next-20170323
Qemu version: v2.9.0-rc1

Command used:
./aarch64-softmmu/qemu-system-aarch64 -cpu host -enable-kvm -M
virt,gic_version=3 -nographic -smp 1 -m 2048 -drive
if=none,id=hd0,file=/root/zesty-server-cloudimg-arm64.img,id=0 -device
virtio-blk-device,drive=hd0 -pflash /root/flash0.img -pflash
/root/flash1.img


As soon as the guest kernel boots, the host starts to stall and prints
the messages below, and the system never recovers. I can neither
power off the guest nor the host, so I have to resort to an external
power reset of the host.

==================
[  116.199077] NMI watchdog: BUG: soft lockup - CPU#25 stuck for 23s! [kworker/25:1:454]
[  116.206901] Modules linked in: binfmt_misc nls_iso8859_1 aes_ce_blk shpchp crypto_simd gpio_keys cryptd aes_ce_cipher ghash_ce sha2_ce sha1_ce uio_pdrv_genirq uio autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm nicvf ahci nicpf libahci thunder_bgx thunder_xcv mdio_thunder mdio_cavium

[  116.206995] CPU: 25 PID: 454 Comm: kworker/25:1 Not tainted 4.11.0-rc3-next-20170323 #1
[  116.206997] Hardware name: www.cavium.com crb-1s/crb-1s, BIOS 0.3 Feb 23 2017
[  116.207010] Workqueue: events netstamp_clear
[  116.207015] task: ffff801f906b5400 task.stack: ffff801f901a4000
[  116.207020] PC is at smp_call_function_many+0x284/0x2e8
[  116.207023] LR is at smp_call_function_many+0x244/0x2e8
[  116.207026] pc : [<ffff000008156ecc>] lr : [<ffff000008156e8c>] pstate: 80000145
[  116.207028] sp : ffff801f901a7be0
[  116.207030] x29: ffff801f901a7be0 x28: ffff000009139000
[  116.207036] x27: ffff000009139434 x26: 0000000000000080
[  116.207041] x25: 0000000000000000 x24: ffff0000081565d0
[  116.207047] x23: 0000000000000001 x22: ffff000008e11e00
[  116.207052] x21: ffff801f6d5cff00 x20: ffff801f6d5cff08
[  116.207057] x19: ffff000009138e38 x18: 0000000000000a03
[  116.207063] x17: 0000ffffb77c9028 x16: ffff0000082e81d8
[  116.207068] x15: 00003d0d6dd44d08 x14: 0036312196549b4a
[  116.207073] x13: 0000000058dabe4c x12: 0000000000000018
[  116.207079] x11: 00000000366e2f04 x10: 00000000000009f0
[  116.207084] x9 : ffff801f901a7d30 x8 : 0000000000000002
[  116.207089] x7 : 0000000000000000 x6 : 0000000000000000
[  116.207095] x5 : ffffffff00000000 x4 : 0000000000000020
[  116.207100] x3 : 0000000000000020 x2 : 0000000000000000
[  116.207105] x1 : ffff801f6d682578 x0 : 0000000000000003

[  150.443116] INFO: rcu_sched self-detected stall on CPU
[  150.448261]  25-...: (14997 ticks this GP) idle=47a/140000000000001/0 softirq=349/349 fqs=7495
[  150.451115] INFO: rcu_sched detected stalls on CPUs/tasks:
[  150.451123]  25-...: (14997 ticks this GP) idle=47a/140000000000001/0 softirq=349/349 fqs=7495
[  150.451124]  (detected by 13, t=15002 jiffies, g=805, c=804, q=8384)
[  150.451136] Task dump for CPU 25:
[  150.451138] kworker/25:1    R  running task        0   454      2 0x00000002
[  150.451155] Workqueue: events netstamp_clear
[  150.451158] Call trace:
[  150.451164] [<ffff000008086188>] __switch_to+0x90/0xa8
[  150.451172] [<ffff0000081f6240>] static_key_slow_inc+0x128/0x138
[  150.451175] [<ffff0000081f6284>] static_key_enable+0x34/0x60
[  150.451178] [<ffff000008843268>] netstamp_clear+0x68/0x80
[  150.451181] [<ffff0000080e49a0>] process_one_work+0x158/0x478
[  150.451183] [<ffff0000080e4d10>] worker_thread+0x50/0x4a8
[  150.451187] [<ffff0000080ebd78>] kthread+0x108/0x138
[  150.451190] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
[  150.477451]   (t=15008 jiffies g=805 c=804 q=8384)
[  150.482242] Task dump for CPU 25:
[  150.482245] kworker/25:1    R  running task        0   454      2 0x00000002
[  150.482259] Workqueue: events netstamp_clear
[  150.482264] Call trace:
[  150.482271] [<ffff00000808a530>] dump_backtrace+0x0/0x2b0
[  150.482277] [<ffff00000808a804>] show_stack+0x24/0x30
[  150.482281] [<ffff0000080fb750>] sched_show_task+0x128/0x178
[  150.482285] [<ffff0000080fd298>] dump_cpu_task+0x48/0x58
[  150.482288] [<ffff0000081f81e4>] rcu_dump_cpu_stacks+0xa0/0xe8
[  150.482297] [<ffff00000813983c>] rcu_check_callbacks+0x774/0x938
[  150.482305] [<ffff00000813fcb4>] update_process_times+0x34/0x60
[  150.482314] [<ffff000008151b80>] tick_sched_handle.isra.7+0x38/0x70
[  150.482319] [<ffff000008151c04>] tick_sched_timer+0x4c/0x98
[  150.482324] [<ffff000008140510>] __hrtimer_run_queues+0xd8/0x2b8
[  150.482328] [<ffff000008141180>] hrtimer_interrupt+0xa8/0x228
[  150.482334] [<ffff0000087f2a2c>] arch_timer_handler_phys+0x3c/0x50
[  150.482341] [<ffff00000812c194>] handle_percpu_devid_irq+0x8c/0x230
[  150.482344] [<ffff000008126174>] generic_handle_irq+0x34/0x50
[  150.482347] [<ffff000008126898>] __handle_domain_irq+0x68/0xc0
[  150.482351] [<ffff0000080817e4>] gic_handle_irq+0xc4/0x170
[  150.482356] Exception stack(0xffff801f901a7ab0 to 0xffff801f901a7be0)
[  150.482360] 7aa0: 0000000000000003 ffff801f6d682578
[  150.482364] 7ac0: 0000000000000000 0000000000000020 0000000000000020 ffffffff00000000
[  150.482367] 7ae0: 0000000000000000 0000000000000000 0000000000000002 ffff801f901a7d30
[  150.482371] 7b00: 00000000000009f0 00000000366e2f04 0000000000000018 0000000058dabe4c
[  150.482375] 7b20: 0036312196549b4a 00003d0d6dd44d08 ffff0000082e81d8 0000ffffb77c9028
[  150.482378] 7b40: 0000000000000a03 ffff000009138e38 ffff801f6d5cff08 ffff801f6d5cff00
[  150.482382] 7b60: ffff000008e11e00 0000000000000001 ffff0000081565d0 0000000000000000
[  150.482386] 7b80: 0000000000000080 ffff000009139434 ffff000009139000 ffff801f901a7be0
[  150.482390] 7ba0: ffff000008156e8c ffff801f901a7be0 ffff000008156ecc 0000000080000145
[  150.482394] 7bc0: ffff801f901a7be0 ffff000008156e68 ffffffffffffffff ffff000008156e8c
[  150.482397] [<ffff000008082ff4>] el1_irq+0xb4/0x140
[  150.482401] [<ffff000008156ecc>] smp_call_function_many+0x284/0x2e8
[  150.482405] [<ffff000008157020>] kick_all_cpus_sync+0x30/0x38
[  150.482409] [<ffff00000897c6cc>] aarch64_insn_patch_text+0xec/0xf8
[  150.482415] [<ffff000008095978>] arch_jump_label_transform+0x60/0x98
[  150.482420] [<ffff0000081f593c>] __jump_label_update+0x8c/0xa8
[  150.482423] [<ffff0000081f6088>] jump_label_update+0x58/0xe8
[  150.482429] [<ffff0000081f6240>] static_key_slow_inc+0x128/0x138
[  150.482434] [<ffff0000081f6284>] static_key_enable+0x34/0x60
[  150.482438] [<ffff000008843268>] netstamp_clear+0x68/0x80
[  150.482441] [<ffff0000080e49a0>] process_one_work+0x158/0x478
[  150.482444] [<ffff0000080e4d10>] worker_thread+0x50/0x4a8
[  150.482448] [<ffff0000080ebd78>] kthread+0x108/0x138
[  150.482451] [<ffff0000080836c0>] ret_from_fork+0x10/0x50

====================================

I am observing that this usually happens when the guest tries to
bring up or use the default virtio-net interface.
I am unable to reproduce it when booting the guest kernel directly,
without the UEFI BIOS.
Has anyone observed a similar issue?

regards,
Radha Mohan


Thread overview: 28+ messages
2017-03-28 19:58 Radha Mohan [this message]
2017-03-28 20:16 ` Christoffer Dall
2017-03-28 20:24   ` Radha Mohan
2017-03-29 18:17     ` Radha Mohan
2017-03-29 18:34       ` Peter Maydell
2017-03-29 18:56     ` Christoffer Dall
2017-03-29 20:51       ` Radha Mohan
2017-03-29 21:06         ` Christoffer Dall
2017-03-29 21:36           ` Radha Mohan
2017-03-30 10:51       ` Marc Zyngier
2017-04-07 20:05         ` Wei Huang
2017-03-30 16:47       ` Laszlo Ersek
2017-03-31 23:16         ` Radha Mohan
2017-04-05 19:12           ` Radha Mohan