From: David Hill <dhill@redhat.com>
To: Jason Wang <jasowang@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
Date: Fri, 24 Nov 2017 11:22:35 -0500
Message-ID: <9c912f3b-081c-8b02-17c8-453ebf36f42c@redhat.com>
In-Reply-To: <a0ec66f5-ebc0-3c54-26a8-dfba06801084@redhat.com>

Each VM has two vNICs. This is the network configuration on the hypervisor:

[root@zappa ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.525400914858       yes             virbr0-nic
                                                        vnet0
                                                        vnet1


1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
     inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
     inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
     link/ether 84:2b:2b:13:f2:91 brd ff:ff:ff:ff:ff:ff
     inet redacted/24 brd 173.178.138.255 scope global dynamic eno1
        valid_lft 48749sec preferred_lft 48749sec
     inet6 fe80::862b:2bff:fe13:f291/64 scope link
        valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
     link/ether 84:2b:2b:13:f2:92 brd ff:ff:ff:ff:ff:ff
     inet 192.168.1.3/24 brd 192.168.1.255 scope global eno2
        valid_lft forever preferred_lft forever
     inet6 fe80::862b:2bff:fe13:f292/64 scope link
        valid_lft forever preferred_lft forever
4: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 52:54:00:91:48:58 brd ff:ff:ff:ff:ff:ff
     inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.10/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.11/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.12/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.15/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.16/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.17/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.18/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.31/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.32/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.33/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.34/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.35/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.36/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.37/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.45/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.46/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.47/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.48/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.49/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.50/32 scope global virbr0
        valid_lft forever preferred_lft forever
     inet 192.168.122.51/32 scope global virbr0
        valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
     link/ether 52:54:00:91:48:58 brd ff:ff:ff:ff:ff:ff
125: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1360 qdisc fq_codel state UNKNOWN group default qlen 100
     link/none
     inet 10.10.122.28/21 brd 10.10.127.255 scope global tun0
        valid_lft forever preferred_lft forever
     inet6 fe80::1f9b:bfd4:e9c9:2059/64 scope link stable-privacy
        valid_lft forever preferred_lft forever
402: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
     link/ether fe:54:00:09:27:39 brd ff:ff:ff:ff:ff:ff
     inet6 fe80::fc54:ff:fe09:2739/64 scope link
        valid_lft forever preferred_lft forever
403: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
     link/ether fe:54:00:ea:6b:18 brd ff:ff:ff:ff:ff:ff
     inet6 fe80::fc54:ff:feea:6b18/64 scope link
        valid_lft forever preferred_lft forever
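
For the qdisc question, the relevant fields can be pulled out of the interface header lines in the listing above. What follows is only a minimal sketch in Python: it assumes each interface header sits on a single line (wrapped continuation lines rejoined), and the helper names `parse_link_line` and `bridge_ports` are hypothetical, introduced here purely for illustration.

```python
import re

# Header line of one interface, e.g.
# "402: vnet0: <BROADCAST,...> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN ..."
HEADER_RE = re.compile(r"^(\d+):\s+([^:@\s]+)(?:@[^:\s]+)?:\s+<([^>]*)>\s*(.*)$")


def parse_link_line(line):
    """Return name/flags/qdisc/master/state for a header line, or None otherwise."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    rest = m.group(4)

    def field(key):
        # Fields appear as "<key> <value>" pairs after the flag list.
        found = re.search(r"(?:^|\s)" + re.escape(key) + r"\s+(\S+)", rest)
        return found.group(1) if found else None

    return {
        "name": m.group(2),
        "flags": m.group(3).split(","),
        "qdisc": field("qdisc"),
        "master": field("master"),
        "state": field("state"),
    }


def bridge_ports(lines, bridge):
    """List (interface, qdisc) for every interface enslaved to `bridge`."""
    ports = []
    for line in lines:
        info = parse_link_line(line)
        if info is not None and info["master"] == bridge:
            ports.append((info["name"], info["qdisc"]))
    return ports


# Header lines copied from the listing above, continuation lines rejoined.
SAMPLE = """\
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
     link/ether 84:2b:2b:13:f2:91 brd ff:ff:ff:ff:ff:ff
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
402: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
403: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
"""

for name, qdisc in bridge_ports(SAMPLE.splitlines(), "virbr0"):
    print(name, qdisc)
```

Run against the listing above, this reports the bridge ports with their qdiscs, while the physical NICs (eno1, eno2) are skipped because they have no master.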


On 2017-11-23 10:11 PM, Jason Wang wrote:
>
>
> On 2017-11-24 07:48, Paolo Bonzini wrote:
>> Jason, any ideas?
>>
>> Thanks,
>>
>> Paolo
>>
>> On 22/11/2017 19:22, David Hill wrote:
>>> ore than 120 seconds.
>>>      [ 7496.552987]       Tainted: G          I
>>>      4.14.0-0.rc1.git3.1.fc28.x86_64 #1
>>>      [ 7496.552996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>>      disables this message.
>>>      [ 7496.553006] qemu-system-x86 D12240  5978      1 0x00000004
>>>      [ 7496.553024] Call Trace:
>>>      [ 7496.553044]  __schedule+0x2dc/0xbb0
>>>      [ 7496.553055]  ? trace_hardirqs_on+0xd/0x10
>>>      [ 7496.553074]  schedule+0x3d/0x90
>>>      [ 7496.553087]  vhost_net_ubuf_put_and_wait+0x73/0xa0 [vhost_net]
>>>      [ 7496.553100]  ? finish_wait+0x90/0x90
>>>      [ 7496.553115]  vhost_net_ioctl+0x542/0x910 [vhost_net]
>>>      [ 7496.553144]  do_vfs_ioctl+0xa6/0x6c0
>>>      [ 7496.553166]  SyS_ioctl+0x79/0x90
>>>      [ 7496.553182]  entry_SYSCALL_64_fastpath+0x1f/0xbe
>>>      [ 7496.553190] RIP: 0033:0x7fa1ea0e1817
>>>      [ 7496.553196] RSP: 002b:00007ffe3d854bc8 EFLAGS: 00000246
>>>      ORIG_RAX: 0000000000000010
>>>      [ 7496.553209] RAX: ffffffffffffffda RBX: 000000000000001d RCX:
>>>      00007fa1ea0e1817
>>>      [ 7496.553215] RDX: 00007ffe3d854bd0 RSI: 000000004008af30 RDI:
>>>      0000000000000021
>>>      [ 7496.553222] RBP: 000055e33352b610 R08: 000055e33024a6f0 R09:
>>>      000055e330245d92
>>>      [ 7496.553228] R10: 000055e33344e7f0 R11: 0000000000000246 R12:
>>>      000055e33351a000
>>>      [ 7496.553235] R13: 0000000000000001 R14: 0000000400000000 R15:
>>>      0000000000000000
>>>      [ 7496.553284]
>>>                     Showing all locks held in the system:
>>>      [ 7496.553313] 1 lock held by khungtaskd/161:
>>>      [ 7496.553319]  #0:  (tasklist_lock){.+.+}, at:
>>>      [<ffffffff8111740d>] debug_show_all_locks+0x3d/0x1a0
>>>      [ 7496.553373] 1 lock held by in:imklog/1194:
>>>      [ 7496.553379]  #0:  (&f->f_pos_lock){+.+.}, at:
>>>      [<ffffffff8130ecfc>] __fdget_pos+0x4c/0x60
>>>      [ 7496.553541] 1 lock held by qemu-system-x86/5978:
>>>      [ 7496.553547]  #0:  (&dev->mutex#3){+.+.}, at:
>>>      [<ffffffffc077e498>] vhost_net_ioctl+0x358/0x910 [vhost_net]
>
> Hi:
>
> The backtrace shows that a zero-copied skb was not sent for a long
> time for some reason. This could be a bug in vhost_net or somewhere in
> the host driver, the qdiscs, or elsewhere.
>
> What is your network setup on the host (e.g. the qdiscs or network
> driver)? Can you still hit the issue if you switch to another type of
> Ethernet driver/card? Is this still reproducible in net.git
> (https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/)?
>
> Will try to reproduce this locally.
>
> Thanks
