* [Qemu-devel] sda abort with virtio-scsi
@ 2016-02-03 21:46 Jim Minter
  2016-02-03 23:19 ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Minter @ 2016-02-03 21:46 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 10421 bytes --]

Hello,

I am hitting the following VM lockup issue running a VM with latest 
RHEL7 kernel on a host also running latest RHEL7 kernel.  FWIW I'm using 
virtio-scsi because I want to use discard=unmap.  I ran the VM as follows:

/usr/libexec/qemu-kvm -nodefaults \
   -cpu host \
   -smp 4 \
   -m 8192 \
   -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
   -device virtio-scsi-pci \
   -device scsi-disk,drive=disk1 \
   -netdev bridge,id=net0,br=br0 \
   -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
   -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
   -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
   -monitor unix:tmp/vm.sock,server,nowait \
   -device virtio-serial-pci \
   -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
   -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
   -device cirrus-vga \
   -vnc none \
   -usbdevice tablet
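
For reference, a quick way to check from inside the guest that discard/unmap
requests actually reach the virtio-scsi disk (a sketch; it assumes the root
filesystem is on sda and supports fstrim):

   lsblk -D /dev/sda   # non-zero DISC-GRAN / DISC-MAX means the disk advertises discard
   fstrim -v /         # issues unmap for free blocks and reports how much was trimmed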

The host was busyish at the time, but not excessively (IMO).  Nothing 
untoward in the host's kernel log; host storage subsystem is fine.  I 
didn't get any qemu logs this time around, but I will when the issue 
next recurs.  The VM's full kernel log is attached; here are the highlights:

sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
sd 2:0:0:0: [sda] abort
INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
sending NMI to all CPUs:
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
RSP: 0018:ffff8802341a7e98  EFLAGS: 00000286
RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
  ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
  ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
  ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
Call Trace:
  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 
48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 
0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
NMI backtrace for cpu 0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffffffff81951440 ti: ffffffff8193c000 task.ti: ffffffff8193c000
RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
RSP: 0018:ffffffff8193fea0  EFLAGS: 00000286
RAX: 00000000ffffffed RBX: ffffffff8193c000 RCX: 0100000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
RBP: ffffffff8193fea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000b975 R12: 0000000000000000
R13: ffffffff8193c000 R14: ffffffff8193c000 R15: ffffffff8193ffa8
FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f497d2dca60 CR3: 00000001cd69b000 CR4: 00000000003407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
  ffffffff8193fec0 ffffffff8101dbcf ffffffff8193c000 ffffffff81a68260
  ffffffff8193fed0 ffffffff8101e4d6 ffffffff8193ff28 ffffffff810d62e5
  ffffffff8193ffd8 ffffffff8193c000 d30a24e273173538 b42178990e5a2834
Call Trace:
  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
  [<ffffffff81624db7>] rest_init+0x77/0x80
  [<ffffffff81a8d057>] start_kernel+0x429/0x44a
  [<ffffffff81a8ca37>] ? repair_env_string+0x5c/0x5c
  [<ffffffff81a8c120>] ? early_idt_handlers+0x120/0x120
  [<ffffffff81a8c5ee>] x86_64_start_reservations+0x2a/0x2c
  [<ffffffff81a8c742>] x86_64_start_kernel+0x152/0x175
Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 
48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 
0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
NMI backtrace for cpu 2
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffff88023417dc00 ti: ffff8802341a8000 task.ti: ffff8802341a8000
RIP: 0010:[<ffffffff81058d5a>]  [<ffffffff81058d5a>] native_write_msr_safe+0xa/0x10
RSP: 0018:ffff88023fd03d78  EFLAGS: 00000046
RAX: 0000000000000400 RBX: 0000000000000002 RCX: 0000000000000830
RDX: 0000000000000002 RSI: 0000000000000400 RDI: 0000000000000830
RBP: ffff88023fd03d78 R08: ffffffff81a658e0 R09: 00000000000003d8
R10: 0000000000000000 R11: ffff88023fd03afe R12: ffffffff81a658e0
R13: 0000000000000002 R14: 000000000000a022 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c2087f1200 CR3: 00000002330ba000 CR4: 00000000003407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
  ffff88023fd03dc8 ffffffff8104fa22 0000000000000096 0000000000000002
  000800003fd03da8 0000000000000001 0000000000000002 0000000000000004
  ffff88023fd0df80 ffffffff819a6080 ffff88023fd03dd8 ffffffff8104fac3
Call Trace:
  <IRQ>
  [<ffffffff8104fa22>] __x2apic_send_IPI_mask+0xb2/0xe0
  [<ffffffff8104fac3>] x2apic_send_IPI_mask+0x13/0x20
  [<ffffffff8104b2ad>] arch_trigger_all_cpu_backtrace+0x11d/0x130
  [<ffffffff8112695d>] rcu_check_callbacks+0x5bd/0x610
  [<ffffffff810e0600>] ? tick_sched_handle.isra.14+0x60/0x60
  [<ffffffff8108e8c7>] update_process_times+0x47/0x80
  [<ffffffff810e05c5>] tick_sched_handle.isra.14+0x25/0x60
  [<ffffffff810e0641>] tick_sched_timer+0x41/0x70
  [<ffffffff810a9d32>] __hrtimer_run_queues+0xd2/0x260
  [<ffffffff810aa2d0>] hrtimer_interrupt+0xb0/0x1e0
  [<ffffffff81049537>] local_apic_timer_interrupt+0x37/0x60
  [<ffffffff81647e4f>] smp_apic_timer_interrupt+0x3f/0x60
  [<ffffffff8164651d>] apic_timer_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
Code: 00 55 89 f9 48 89 e5 0f 32 31 c9 89 c0 48 c1 e2 20 89 0e 48 09 c2 
48 89 d0 5d c3 66 0f 1f 44 00 00 55 89 f0 89 f9 48 89 e5 0f 30 <31> c0 
5d c3 66 90 55 89 f9 48 89 e5 0f 33 89 c0 48 c1 e2 20 48
NMI backtrace for cpu 3
CPU: 3 PID: 30 Comm: migration/3 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffff8802341bf300 ti: ffff880234208000 task.ti: ffff880234208000
RIP: 0010:[<ffffffff8163ca37>]  [<ffffffff8163ca37>] _raw_spin_unlock+0x17/0x30
RSP: 0018:ffff88023fd83e58  EFLAGS: 00000082
RAX: 00000000f90cf90c RBX: ffff88023fc14780 RCX: 00000000000052f8
RDX: 000000000000f90c RSI: 00000000f90ef90c RDI: ffff88023fc14780
RBP: ffff88023fd83e98 R08: ffffffff81c6ec80 R09: ffff88023fd8dbc1
R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000989680
R13: ffff880209f73a40 R14: 0000000a4dbdb2da R15: ffff8801faed1c00
FS:  0000000000000000(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f91d9cab000 CR3: 0000000230c56000 CR4: 00000000003407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
  ffffffff810c4487 ffff8801faed1d00 000000004f324af7 ffff880209f73980
  ffff880209f73948 0000000000989680 ffff880209f73a40 0000000a4dbdb2da
  ffff88023fd83ed0 ffffffff810c46d7 ffff880209f73980 ffff88023fd8d960
Call Trace:
  <IRQ>
  [<ffffffff810c4487>] ? distribute_cfs_runtime+0x87/0x100
  [<ffffffff810c46d7>] sched_cfs_period_timer+0x107/0x140
  [<ffffffff810c45d0>] ? sched_cfs_slack_timer+0xd0/0xd0
  [<ffffffff810a9d32>] __hrtimer_run_queues+0xd2/0x260
  [<ffffffff810aa2d0>] hrtimer_interrupt+0xb0/0x1e0
  [<ffffffff81049537>] local_apic_timer_interrupt+0x37/0x60
  [<ffffffff81647e4f>] smp_apic_timer_interrupt+0x3f/0x60
  [<ffffffff8164651d>] apic_timer_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff81064cba>] ? native_set_fixmap+0x3a/0x40
  [<ffffffff811031f5>] ? multi_cpu_stop+0xa5/0xf0
  [<ffffffff81103150>] ? cpu_stop_should_run+0x50/0x50
  [<ffffffff811033d8>] cpu_stopper_thread+0x88/0x160
  [<ffffffff810b4268>] ? finish_task_switch+0x108/0x170
  [<ffffffff8163a228>] ? __schedule+0x2d8/0x900
  [<ffffffff810ae51f>] smpboot_thread_fn+0xff/0x1a0
  [<ffffffff8163a879>] ? schedule+0x29/0x70
  [<ffffffff810ae420>] ? lg_double_unlock+0x90/0x90
  [<ffffffff810a5aef>] kthread+0xcf/0xe0
  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
  [<ffffffff81645818>] ret_from_fork+0x58/0x90
  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Code: 39 d0 75 ea b8 01 00 00 00 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 
44 00 00 e9 06 00 00 00 66 83 07 02 c3 90 8b 37 f0 66 83 07 02 <f6> 47 
02 01 74 f1 55 48 89 e5 e8 fd 2c ff ff 5d c3 0f 1f 84 00

Does anyone have any quick thoughts on what the issue could be, or any 
advice on how to get further appropriate debugging information to help 
get to the bottom of it?

Thanks,

Jim

[-- Attachment #2: vm-dmesg.gz --]
[-- Type: application/gzip, Size: 37692 bytes --]


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-03 21:46 [Qemu-devel] sda abort with virtio-scsi Jim Minter
@ 2016-02-03 23:19 ` Paolo Bonzini
  2016-02-03 23:34   ` Jim Minter
  2016-02-04  6:59   ` Hannes Reinecke
  0 siblings, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2016-02-03 23:19 UTC (permalink / raw)
  To: Jim Minter, qemu-devel, Hannes Reinecke



On 03/02/2016 22:46, Jim Minter wrote:
> I am hitting the following VM lockup issue running a VM with latest
> RHEL7 kernel on a host also running latest RHEL7 kernel.  FWIW I'm using
> virtio-scsi because I want to use discard=unmap.  I ran the VM as follows:
> 
> /usr/libexec/qemu-kvm -nodefaults \
>   -cpu host \
>   -smp 4 \
>   -m 8192 \
>   -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
>   -device virtio-scsi-pci \
>   -device scsi-disk,drive=disk1 \
>   -netdev bridge,id=net0,br=br0 \
>   -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
>   -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
>   -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
>   -monitor unix:tmp/vm.sock,server,nowait \
>   -device virtio-serial-pci \
>   -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
>   -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
>   -device cirrus-vga \
>   -vnc none \
>   -usbdevice tablet
> 
> The host was busyish at the time, but not excessively (IMO).  Nothing
> untoward in the host's kernel log; host storage subsystem is fine.  I
> didn't get any qemu logs this time around, but I will when the issue
> next recurs.  The VM's full kernel log is attached; here are the
> highlights:

Hannes, were you going to send a patch to disable time outs?

> 
> INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
> sending NMI to all CPUs:
> NMI backtrace for cpu 1
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
> RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
> RSP: 0018:ffff8802341a7e98  EFLAGS: 00000286
> RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
> RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
>  ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
>  ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
> Call Trace:
>  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
>  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
>  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
>  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
> NMI backtrace for cpu 0 

This is the NMI watchdog firing; the CPU got stuck for 20 seconds.  The
issue was not a busy host, but a busy storage (could it be a network
partition if the disk was hosted on NFS???)

Firing the NMI watchdog is fixed in more recent QEMU, which has
asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
(try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).

Thanks,

Paolo


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-03 23:19 ` Paolo Bonzini
@ 2016-02-03 23:34   ` Jim Minter
  2016-02-04 10:23     ` Paolo Bonzini
  2016-02-04  6:59   ` Hannes Reinecke
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Minter @ 2016-02-03 23:34 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, Hannes Reinecke

Hi again, thanks for replying,

On 03/02/16 23:19, Paolo Bonzini wrote:
> On 03/02/2016 22:46, Jim Minter wrote:
>> I am hitting the following VM lockup issue running a VM with latest
>> RHEL7 kernel on a host also running latest RHEL7 kernel.  FWIW I'm using
>> virtio-scsi because I want to use discard=unmap.  I ran the VM as follows:
>>
>> /usr/libexec/qemu-kvm -nodefaults \
>>    -cpu host \
>>    -smp 4 \
>>    -m 8192 \
>>    -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
>>    -device virtio-scsi-pci \
>>    -device scsi-disk,drive=disk1 \
>>    -netdev bridge,id=net0,br=br0 \
>>    -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
>>    -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
>>    -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
>>    -monitor unix:tmp/vm.sock,server,nowait \
>>    -device virtio-serial-pci \
>>    -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
>>    -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
>>    -device cirrus-vga \
>>    -vnc none \
>>    -usbdevice tablet
>>
>> The host was busyish at the time, but not excessively (IMO).  Nothing
>> untoward in the host's kernel log; host storage subsystem is fine.  I
>> didn't get any qemu logs this time around, but I will when the issue
>> next recurs.  The VM's full kernel log is attached; here are the
>> highlights:
>
> Hannes, were you going to send a patch to disable time outs?
>
>>
>> INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
>> sending NMI to all CPUs:
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
>> RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
>> RSP: 0018:ffff8802341a7e98  EFLAGS: 00000286
>> RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
>> RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>> R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
>> FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Stack:
>>   ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
>>   ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
>>   ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
>> Call Trace:
>>   [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
>>   [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
>>   [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
>>   [<ffffffff810475fa>] start_secondary+0x1ba/0x230
>> Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
>> NMI backtrace for cpu 0
>
> This is the NMI watchdog firing; the CPU got stuck for 20 seconds.  The
> issue was not a busy host, but a busy storage (could it be a network
> partition if the disk was hosted on NFS???)

The VM qcow2 storage is on host-local SSD, and although there's some 
competition for the host CPU and storage, it seems surprising to me that 
the VM should be starved of CPU to this extent.  I was worried there was 
some way in which the contention could cause an abort and perhaps thence 
the lockup (which does not seem to recover when the host load goes down).

> Firing the NMI watchdog is fixed in more recent QEMU, which has
> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).

/usr/libexec/qemu-kvm --version reports QEMU emulator version 1.5.3 
(qemu-kvm-1.5.3-105.el7_2.3)

Cheers,

Jim


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-03 23:19 ` Paolo Bonzini
  2016-02-03 23:34   ` Jim Minter
@ 2016-02-04  6:59   ` Hannes Reinecke
  2016-02-04 11:27     ` Paolo Bonzini
  1 sibling, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2016-02-04  6:59 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Minter, qemu-devel

On 02/04/2016 12:19 AM, Paolo Bonzini wrote:
> 
> 
> On 03/02/2016 22:46, Jim Minter wrote:
>> I am hitting the following VM lockup issue running a VM with latest
>> RHEL7 kernel on a host also running latest RHEL7 kernel.  FWIW I'm using
>> virtio-scsi because I want to use discard=unmap.  I ran the VM as follows:
>>
>> /usr/libexec/qemu-kvm -nodefaults \
>>   -cpu host \
>>   -smp 4 \
>>   -m 8192 \
>>   -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
>>   -device virtio-scsi-pci \
>>   -device scsi-disk,drive=disk1 \
>>   -netdev bridge,id=net0,br=br0 \
>>   -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
>>   -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
>>   -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
>>   -monitor unix:tmp/vm.sock,server,nowait \
>>   -device virtio-serial-pci \
>>   -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
>>   -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
>>   -device cirrus-vga \
>>   -vnc none \
>>   -usbdevice tablet
>>
>> The host was busyish at the time, but not excessively (IMO).  Nothing
>> untoward in the host's kernel log; host storage subsystem is fine.  I
>> didn't get any qemu logs this time around, but I will when the issue
>> next recurs.  The VM's full kernel log is attached; here are the
>> highlights:
> 
> Hannes, were you going to send a patch to disable time outs?
> 
Rah. Didn't I do it already?
Seems like I didn't; will be doing so shortly.

>>
>> INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
>> sending NMI to all CPUs:
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
>> RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
>> RSP: 0018:ffff8802341a7e98  EFLAGS: 00000286
>> RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
>> RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>> R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
>> FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Stack:
>>  ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
>>  ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
>>  ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
>> Call Trace:
>>  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
>>  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
>>  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
>>  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
>> Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
>> NMI backtrace for cpu 0 
> 
> This is the NMI watchdog firing; the CPU got stuck for 20 seconds.  The
> issue was not a busy host, but a busy storage (could it be a network
> partition if the disk was hosted on NFS???)
> 
> Firing the NMI watchdog is fixed in more recent QEMU, which has
> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).
> 
Actually, you still cannot do _real_ async cancellation of I/O; the
linux aio subsystem implements io_cancel(), but the cancellation
just aborts the (internal) waitqueue element, not the I/O itself.
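
For reference, a minimal user-space illustration of that limitation (a sketch
using libaio; the file name, sizes and build command are made up, and error
handling is trimmed): io_cancel() can be called, but for ordinary disk I/O the
kernel usually cannot stop a request that is already in flight, so the call
tends to fail and the original completion still arrives.

/* Sketch: submit one read with libaio and attempt to cancel it.
 * Build with something like: gcc -o cancel cancel.c -laio */
#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	io_context_t ctx;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	void *buf;
	int fd;

	memset(&ctx, 0, sizeof(ctx));
	io_setup(1, &ctx);

	fd = open("testfile", O_RDONLY | O_DIRECT);	/* hypothetical file */
	posix_memalign(&buf, 512, 4096);

	io_prep_pread(&cb, fd, buf, 4096, 0);
	io_submit(ctx, 1, cbs);

	/* Usually fails (e.g. -EINVAL/-EAGAIN): the request is already
	 * queued to the device and cannot really be aborted. */
	printf("io_cancel returned %d\n", io_cancel(ctx, &cb, &ev));

	/* The original completion is still delivered. */
	io_getevents(ctx, 1, 1, &ev, NULL);
	io_destroy(ctx);
	return 0;
}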

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-03 23:34   ` Jim Minter
@ 2016-02-04 10:23     ` Paolo Bonzini
  2016-02-04 11:00       ` Denis V. Lunev
  2016-02-04 13:41       ` Jim Minter
  0 siblings, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2016-02-04 10:23 UTC (permalink / raw)
  To: Jim Minter, qemu-devel, Hannes Reinecke



On 04/02/2016 00:34, Jim Minter wrote:
> I was worried there was
> some way in which the contention could cause an abort and perhaps thence
> the lockup (which does not seem to recover when the host load goes down).

I don't know... It's not the most tested code, but it is not very
complicated either.

The certain points that can be extracted from the kernel messages are:
1) there was a cancellation request that took a long time, >20 seconds;
2) despite taking a long time, it _did_ recover sooner or later because
otherwise you'd not have the lockup splat either.

Paolo

>> Firing the NMI watchdog is fixed in more recent QEMU, which has
>> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
>> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).
> 
> /usr/libexec/qemu-kvm --version reports QEMU emulator version 1.5.3
> (qemu-kvm-1.5.3-105.el7_2.3)


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 10:23     ` Paolo Bonzini
@ 2016-02-04 11:00       ` Denis V. Lunev
  2016-02-04 13:41       ` Jim Minter
  1 sibling, 0 replies; 12+ messages in thread
From: Denis V. Lunev @ 2016-02-04 11:00 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Minter, qemu-devel, Hannes Reinecke; +Cc: Roman Kagan

On 02/04/2016 01:23 PM, Paolo Bonzini wrote:
>
> On 04/02/2016 00:34, Jim Minter wrote:
>> I was worried there was
>> some way in which the contention could cause an abort and perhaps thence
>> the lockup (which does not seem to recover when the host load goes down).
> I don't know... It's not the most tested code, but it is not very
> complicated either.
>
> The certain points that can be extracted from the kernel messages are:
> 1) there was a cancellation request that took a long time, >20 seconds;
> 2) despite taking a long time, it _did_ recover sooner or later because
> otherwise you'd not have the lockup splat either.
>
> Paolo
>
>>> Firing the NMI watchdog is fixed in more recent QEMU, which has
>>> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
>>> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).
>> /usr/libexec/qemu-kvm --version reports QEMU emulator version 1.5.3
>> (qemu-kvm-1.5.3-105.el7_2.3)

My $0.02: this could be related, or it could be unrelated.

Speaking of NMIs, we do observe similar problems on older AMD hosts
with a large enough number of VCPUs in the guest.  On a simple boot we
see something like the trace below (the probability of hitting the
problem is around 1 in 10).  RHEV 2.3 QEMU is used, and the same
kernel is running on the host node:

[   72.189005] Kernel panic - not syncing: softlockup: hung tasks
[   72.189005] CPU: 5 PID: 593 Comm: systemd-udevd Tainted: G             L ------------   3.10.0-327.4.4.el7.x86_64 #1
[   72.189005] Hardware name: Red Hat KVM, BIOS seabios-1.7.5-11.vz7.2 04/01/2014
[   72.189005]  ffffffff81871a03 0000000050291887 ffff88007fd43e18 ffffffff8163515c
[   72.189005]  ffff88007fd43e98 ffffffff8162e9d7 0000000000000008 ffff88007fd43ea8
[   72.189005]  ffff88007fd43e48 0000000050291887 0000000000002710 0000000000000000
[   72.189005] Call Trace:
[   72.189005]  <IRQ>  [<ffffffff8163515c>] dump_stack+0x19/0x1b
[   72.189005]  [<ffffffff8162e9d7>] panic+0xd8/0x1e7
[   72.189005]  [<ffffffff8111b376>] watchdog_timer_fn+0x1b6/0x1c0
[   72.189005]  [<ffffffff8111b1c0>] ? watchdog_enable+0xc0/0xc0
[   72.189005]  [<ffffffff810a9d42>] __hrtimer_run_queues+0xd2/0x260
[   72.189005]  [<ffffffff810aa2e0>] hrtimer_interrupt+0xb0/0x1e0
[   72.189005]  [<ffffffff816471dc>] ? call_softirq+0x1c/0x30
[   72.189005]  [<ffffffff81065d40>] ? flush_tlb_func+0xb0/0xb0
[   72.189005]  [<ffffffff81049537>] local_apic_timer_interrupt+0x37/0x60
[   72.189005]  [<ffffffff81647e4f>] smp_apic_timer_interrupt+0x3f/0x60
[   72.189005]  [<ffffffff8164651d>] apic_timer_interrupt+0x6d/0x80
[   72.189005]  <EOI>  [<ffffffff812f27e9>] ? free_cpumask_var+0x9/0x10
[   72.189005]  [<ffffffff810e6de2>] ? smp_call_function_many+0x202/0x260
[   72.189005]  [<ffffffff81065d40>] ? flush_tlb_func+0xb0/0xb0
[   72.189005]  [<ffffffff810e6e9d>] on_each_cpu+0x2d/0x60
[   72.189005]  [<ffffffff81066119>] flush_tlb_kernel_range+0x59/0xa0
[   72.189005]  [<ffffffff811a4730>] __purge_vmap_area_lazy+0x1a0/0x210
[   72.189005]  [<ffffffff811a4927>] vm_unmap_aliases+0x187/0x1b0
[   72.189005]  [<ffffffff810620e8>] change_page_attr_set_clr+0xe8/0x4d0
[   72.189005]  [<ffffffff81131127>] ? ring_buffer_time_stamp+0x7/0x10
[   72.189005]  [<ffffffff8106299f>] set_memory_ro+0x2f/0x40
[   72.189005]  [<ffffffff8162f7c4>] set_section_ro_nx+0x3a/0x71
[   72.189005]  [<ffffffff810ed19c>] load_module+0x103c/0x1b50
[   72.189005]  [<ffffffff810e9743>] ? copy_module_from_fd.isra.42+0x53/0x150
[   72.189005]  [<ffffffff810ede66>] SyS_finit_module+0xa6/0xd0
[   72.189005]  [<ffffffff816458c9>] system_call_fastpath+0x16/0x1b
[   72.189005] Shutting down cpus with NMI


Changing time sources does not help.  There is also some strange behaviour, like this:

Having configured the serial port to point at a unix socket, and timestamping
the messages on the host side, we observe

guest timestamp | host timestamp
        0.000000 | 20:09:26.805461
        2.587056 | 20:09:30.000993
        7.607329 | 20:09:35.062367
       12.645539 | 20:09:40.057634
       22.608054 | 20:09:50.028727
       32.395499 | 20:10:00.041215
       42.571265 | 20:10:10.041960
       47.606661 | 20:10:15.028973
       48.627059 | 20:10:20.022359
       49.029059 | 20:10:25.047857
       49.399065 | 20:10:30.066884
       49.809077 | 20:10:35.036467
       58.159132 | 20:10:40.013387
       68.043371 | 20:10:40.266714

Note the anomaly around 47 seconds from boot.
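
For anyone wanting to reproduce the comparison above, one possible way to
capture and timestamp the guest serial output on the host side (a sketch;
the socket path and the shell loop are assumptions, not necessarily what we
used):

   socat -u UNIX-CONNECT:/tmp/guest-serial.sock STDOUT | \
   while IFS= read -r line; do
       printf '%s | %s\n' "$(date +%H:%M:%S.%N)" "$line"
   done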

Anyway, this story is far from over, and we are unable to provide more 
details yet.  Roma is digging into it.  Den


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04  6:59   ` Hannes Reinecke
@ 2016-02-04 11:27     ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2016-02-04 11:27 UTC (permalink / raw)
  To: Hannes Reinecke, Jim Minter, qemu-devel



On 04/02/2016 07:59, Hannes Reinecke wrote:
> On 02/04/2016 12:19 AM, Paolo Bonzini wrote:
>>
>>
>> On 03/02/2016 22:46, Jim Minter wrote:
>>> I am hitting the following VM lockup issue running a VM with latest
>>> RHEL7 kernel on a host also running latest RHEL7 kernel.  FWIW I'm using
>>> virtio-scsi because I want to use discard=unmap.  I ran the VM as follows:
>>>
>>> /usr/libexec/qemu-kvm -nodefaults \
>>>   -cpu host \
>>>   -smp 4 \
>>>   -m 8192 \
>>>   -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
>>>   -device virtio-scsi-pci \
>>>   -device scsi-disk,drive=disk1 \
>>>   -netdev bridge,id=net0,br=br0 \
>>>   -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
>>>   -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
>>>   -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
>>>   -monitor unix:tmp/vm.sock,server,nowait \
>>>   -device virtio-serial-pci \
>>>   -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
>>>   -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
>>>   -device cirrus-vga \
>>>   -vnc none \
>>>   -usbdevice tablet
>>>
>>> The host was busyish at the time, but not excessively (IMO).  Nothing
>>> untoward in the host's kernel log; host storage subsystem is fine.  I
>>> didn't get any qemu logs this time around, but I will when the issue
>>> next recurs.  The VM's full kernel log is attached; here are the
>>> highlights:
>>
>> Hannes, were you going to send a patch to disable time outs?
>>
> Rah. Didn't I do it already?
> Seems like I didn't; will be doing so shortly.
> 
>>>
>>> INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
>>> sending NMI to all CPUs:
>>> NMI backtrace for cpu 1
>>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
>>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
>>> RIP: 0010:[<ffffffff81058e96>]  [<ffffffff81058e96>] native_safe_halt+0x6/0x10
>>> RSP: 0018:ffff8802341a7e98  EFLAGS: 00000286
>>> RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
>>> RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>>> R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
>>> FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Stack:
>>>  ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
>>>  ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
>>>  ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
>>> Call Trace:
>>>  [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
>>>  [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
>>>  [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
>>>  [<ffffffff810475fa>] start_secondary+0x1ba/0x230
>>> Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
>>> NMI backtrace for cpu 0 
>>
>> This is the NMI watchdog firing; the CPU got stuck for 20 seconds.  The
>> issue was not a busy host, but a busy storage (could it be a network
>> partition if the disk was hosted on NFS???)
>>
>> Firing the NMI watchdog is fixed in more recent QEMU, which has
>> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
>> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).
>>
> Actually, you still cannot do _real_ async cancellation of I/O; the
> linux aio subsystem implements io_cancel(), but the cancellation
> just aborts the (internal) waitqueue element, not the I/O itself.

Right, but at least the TMF is asynchronous.  Synchronous TMFs keep the
VCPUs in QEMU for many seconds and cause the watchdog to fire.

Paolo


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 10:23     ` Paolo Bonzini
  2016-02-04 11:00       ` Denis V. Lunev
@ 2016-02-04 13:41       ` Jim Minter
  2016-02-04 13:54         ` Hannes Reinecke
  2016-02-08 20:02         ` Jim Minter
  1 sibling, 2 replies; 12+ messages in thread
From: Jim Minter @ 2016-02-04 13:41 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, Hannes Reinecke

FWIW, I've now done:

echo 300 >/sys/block/sda/device/timeout

Not entirely sure whether it would help or not, but so far I haven't had 
a recurrence.
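
If more than one disk needs the same treatment, a small sketch (the
300-second value is just the one above, and the setting does not persist
across reboots):

   # raise the SCSI command timeout for every sd* disk in the guest
   for t in /sys/block/sd*/device/timeout; do
       echo 300 > "$t"
   done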

Cheers,

Jim

-- 
Jim Minter
Principal Solution Architect, Red Hat UK
e: jminter@redhat.com
m: +44 (0)7906 098697
cal: 
https://www.google.com/calendar/embed?src=jminter@redhat.com&ctz=Europe/London&mode=week

On 04/02/16 10:23, Paolo Bonzini wrote:
>
>
> On 04/02/2016 00:34, Jim Minter wrote:
>> I was worried there was
>> some way in which the contention could cause an abort and perhaps thence
>> the lockup (which does not seem to recover when the host load goes down).
>
> I don't know... It's not the most tested code, but it is not very
> complicated either.
>
> The certain points that can be extracted from the kernel messages are:
> 1) there was a cancellation request that took a long time, >20 seconds;
> 2) despite taking a long time, it _did_ recover sooner or later because
> otherwise you'd not have the lockup splat either.
>
> Paolo
>
>>> Firing the NMI watchdog is fixed in more recent QEMU, which has
>>> asynchronous cancellation, assuming you're running RHEL's QEMU 1.5.3
>>> (try /usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm).
>>
>> /usr/libexec/qemu-kvm --version reports QEMU emulator version 1.5.3
>> (qemu-kvm-1.5.3-105.el7_2.3)


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 13:41       ` Jim Minter
@ 2016-02-04 13:54         ` Hannes Reinecke
  2016-02-04 15:03           ` Paolo Bonzini
  2016-02-08 20:02         ` Jim Minter
  1 sibling, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2016-02-04 13:54 UTC (permalink / raw)
  To: Jim Minter, Paolo Bonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 589 bytes --]

On 02/04/2016 02:41 PM, Jim Minter wrote:
> FWIW, I've now done:
> 
> echo 300 >/sys/block/sda/device/timeout
> 
> Not entirely sure whether it would help or not, but so far I haven't
> had a recurrence.
> 
Can you try the attached patch?
It should work as well, but you shouldn't need to tweak anything.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

[-- Attachment #2: 0001-virtio_scsi-Implement-eh_timed_out-callback.patch --]
[-- Type: text/x-patch, Size: 2424 bytes --]

From 330b4f056957eed60cb3cd1bfa1486ba7db96d8d Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Wed, 18 Nov 2015 10:33:13 +0100
Subject: [PATCH] virtio_scsi: Implement eh_timed_out callback

If a command timeout occurs on the host, SCSI error handling is
entered, which typically takes quite some time until the error is
resolved and the command can be returned to the guest.
This error handling typically overruns any timeout the guest might
have set on that command, causing the guest to abort the command.
However, aborting commands from the guest is tricky and, in most
cases, downright impossible.
So instead this patch implements an eh_timed_out callback for
virtio_scsi, which issues a QUERY TASK TMF. If that succeeds, the
command is still active on the host and the guest command timeout
is reset.

References: bsc#936530

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/virtio_scsi.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 595af1a..f712db5 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -543,6 +543,33 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
 	return virtscsi_tmf(vscsi, cmd);
 }
 
+static enum blk_eh_timer_return virtscsi_timed_out(struct scsi_cmnd *sc)
+{
+	struct virtio_scsi *vscsi = shost_priv(sc->device->host);
+	struct virtio_scsi_cmd *cmd;
+
+	scmd_printk(KERN_INFO, sc, "timeout\n");
+	cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
+	if (!cmd)
+		return FAILED;
+
+	memset(cmd, 0, sizeof(*cmd));
+	cmd->sc = sc;
+	cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
+		.type = VIRTIO_SCSI_T_TMF,
+		.subtype = VIRTIO_SCSI_T_TMF_QUERY_TASK,
+		.lun[0] = 1,
+		.lun[1] = sc->device->id,
+		.lun[2] = (sc->device->lun >> 8) | 0x40,
+		.lun[3] = sc->device->lun & 0xff,
+		.tag = (unsigned long)sc,
+	};
+	if (virtscsi_tmf(vscsi, cmd) == FAILED)
+		return BLK_EH_NOT_HANDLED;
+
+	return BLK_EH_RESET_TIMER;
+}
+
 static struct scsi_host_template virtscsi_host_template = {
 	.module = THIS_MODULE,
 	.name = "Virtio SCSI HBA",
@@ -551,6 +578,7 @@ static struct scsi_host_template virtscsi_host_template = {
 	.this_id = -1,
 	.eh_abort_handler = virtscsi_abort,
 	.eh_device_reset_handler = virtscsi_device_reset,
+	.eh_timed_out = virtscsi_timed_out,
 
 	.can_queue = 1024,
 	.dma_boundary = UINT_MAX,
-- 
1.8.5.6



* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 13:54         ` Hannes Reinecke
@ 2016-02-04 15:03           ` Paolo Bonzini
  2016-02-04 15:11             ` Hannes Reinecke
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2016-02-04 15:03 UTC (permalink / raw)
  To: Hannes Reinecke, Jim Minter, qemu-devel

> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 595af1a..f712db5 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -543,6 +543,33 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
>  	return virtscsi_tmf(vscsi, cmd);
>  }
>  
> +static enum blk_eh_timer_return virtscsi_timed_out(struct scsi_cmnd *sc)
> +{
> +	struct virtio_scsi *vscsi = shost_priv(sc->device->host);
> +	struct virtio_scsi_cmd *cmd;
> +
> +	scmd_printk(KERN_INFO, sc, "timeout\n");
> +	cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
> +	if (!cmd)
> +		return FAILED;

Should this be BLK_EH_NOT_HANDLED?  Apart from this, the patch looks
good.  And of course this doesn't affect Jim, who can test this patch
anyway.
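
For what it's worth, the adjustment being suggested would presumably just
change the allocation-failure branch so that it returns a blk_eh_timer_return
value rather than the SCSI eh constant, along these lines (a sketch, not the
final patch):

	cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
	if (!cmd)
		return BLK_EH_NOT_HANDLED;	/* was: return FAILED; */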

Thanks!

Paolo

> +
> +	memset(cmd, 0, sizeof(*cmd));
> +	cmd->sc = sc;
> +	cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
> +		.type = VIRTIO_SCSI_T_TMF,
> +		.subtype = VIRTIO_SCSI_T_TMF_QUERY_TASK,
> +		.lun[0] = 1,
> +		.lun[1] = sc->device->id,
> +		.lun[2] = (sc->device->lun >> 8) | 0x40,
> +		.lun[3] = sc->device->lun & 0xff,
> +		.tag = (unsigned long)sc,
> +	};
> +	if (virtscsi_tmf(vscsi, cmd) == FAILED)
> +		return BLK_EH_NOT_HANDLED;
> +
> +	return BLK_EH_RESET_TIMER;
> +}
> +
>  static struct scsi_host_template virtscsi_host_template = {
>  	.module = THIS_MODULE,
>  	.name = "Virtio SCSI HBA",
> @@ -551,6 +578,7 @@ static struct scsi_host_template virtscsi_host_template = {
>  	.this_id = -1,
>  	.eh_abort_handler = virtscsi_abort,
>  	.eh_device_reset_handler = virtscsi_device_reset,
> +	.eh_timed_out = virtscsi_timed_out,
>  
>  	.can_queue = 1024,
>  	.dma_boundary = UINT_MAX,
> -- 1.8.5.6
> 


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 15:03           ` Paolo Bonzini
@ 2016-02-04 15:11             ` Hannes Reinecke
  0 siblings, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2016-02-04 15:11 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Minter, qemu-devel

On 02/04/2016 04:03 PM, Paolo Bonzini wrote:
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 595af1a..f712db5 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -543,6 +543,33 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
>>  	return virtscsi_tmf(vscsi, cmd);
>>  }
>>  
>> +static enum blk_eh_timer_return virtscsi_timed_out(struct scsi_cmnd *sc)
>> +{
>> +	struct virtio_scsi *vscsi = shost_priv(sc->device->host);
>> +	struct virtio_scsi_cmd *cmd;
>> +
>> +	scmd_printk(KERN_INFO, sc, "timeout\n");
>> +	cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
>> +	if (!cmd)
>> +		return FAILED;
> 
> Should this be BLK_EH_NOT_HANDLED?  Apart from this, the patch looks
> good.  And of course this doesn't affect Jim, who can test this patch
> anyway.
> 
Why, but of course.

I'll be sending the official patch to linux-scsi.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: [Qemu-devel] sda abort with virtio-scsi
  2016-02-04 13:41       ` Jim Minter
  2016-02-04 13:54         ` Hannes Reinecke
@ 2016-02-08 20:02         ` Jim Minter
  1 sibling, 0 replies; 12+ messages in thread
From: Jim Minter @ 2016-02-08 20:02 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, Hannes Reinecke

Again FWIW:

No recurrence of the SCSI abort notices since increasing the timeout, 
but I am still getting guest userspace lockups.  The guest kernel logs 
show RCU "detected stalls" messages and NMIs being sent across the 
CPUs.  These consistently show CPU 2 sitting in the CFS scheduler via 
the timer interrupt, appearing to make some progress (i.e. RIP changes 
over time), and the other CPUs all sitting idle.  Although the guest 
kernel keeps going and keeps logging these issues, none of the guest 
userspace processes make any progress at all over several hours.

I'm upgrading to the QEMU version shipped with RHEV 
(qemu-kvm-rhev-2.3.0-31.el7_2.7) to see if that helps - so far so good. 
My best guess is that a bugfix is missing from the RHEL 7 QEMU 1.5.3 
codebase but is present upstream and in the RHEV QEMU release.

Cheers,

Jim

On 04/02/16 13:41, Jim Minter wrote:
> FWIW, I've now done:
>
> echo 300 >/sys/block/sda/device/timeout
>
> Not entirely sure whether it would help or not, but so far I haven't had
> a recurrence.
>
> Cheers,
>
> Jim
>
