From: Jason Wang <jasowang@redhat.com>
To: David Hill <dhill@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org
Cc: Willem de Bruijn <willemb@google.com>, netdev <netdev@vger.kernel.org>
Subject: Re: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
Date: Tue, 19 Dec 2017 11:36:19 +0800	[thread overview]
Message-ID: <cd0ea113-34ff-d24a-b798-819dfb536c76@redhat.com> (raw)
In-Reply-To: <c3b3f34e-fe64-8ab3-3617-f98313526f9f@redhat.com>



On 2017-12-12 11:53, David Hill wrote:
>
>
> On 2017-12-08 01:03 PM, David Hill wrote:
>>
>>
>> On 2017-12-07 12:13 AM, Jason Wang wrote:
>>>
>>>
>>> On 2017-12-07 12:42, David Hill wrote:
>>>>
>>>>
>>>> On 2017-12-06 11:34 PM, David Hill wrote:
>>>>>
>>>>>
>>>>> On 2017-12-04 02:51 PM, David Hill wrote:
>>>>>>
>>>>>> On 2017-12-03 11:08 PM, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017-12-02 00:38, David Hill wrote:
>>>>>>>>>
>>>>>>>>> Finally, I reverted 581fe0ea61584d88072527ae9fb9dcb9d1f2783e 
>>>>>>>>> too ... compiling and I'll keep you posted.
>>>>>>>>
>>>>>>>> So I'm still able to reproduce this issue even after reverting
>>>>>>>> these 3 commits.  Do you have any other suspect commits?
>>>>>>>
>>>>>>> Thanks for the testing. No, I don't have other suspect commits.
>>>>>>>
>>>>>>> Looks like somebody else is hitting your issue too (see
>>>>>>> https://www.spinics.net/lists/netdev/msg468319.html)
>>>>>>>
>>>>>>> But he claims the issue was fixed by using qemu 2.10.1.
>>>>>>>
>>>>>>> So you may:
>>>>>>>
>>>>>>> -try to see if qemu 2.10.1 solves your issue
>>>>>> It didn't solve it for him... it's only harder to reproduce. [1]
>>>>>>> -if not, try to see if commit 
>>>>>>> 2ddf71e23cc246e95af72a6deed67b4a50a7b81c ("net: add notifier 
>>>>>>> hooks for devmap bpf map") is the first bad commit
>>>>>> I'll try to see what I can do here
>>>>> I'm looking at that commit and, if I'm not mistaken, it was
>>>>> introduced before v4.13, while this issue appeared between v4.13
>>>>> and v4.14-rc1.  There are 1352 commits between those two releases.
>>>>> Is there a quick way to find out which of them touch vhost-net or
>>>>> zerocopy?
>>>>>
>>>>>
>>>>> [ 7496.553044]  __schedule+0x2dc/0xbb0
>>>>> [ 7496.553055]  ? trace_hardirqs_on+0xd/0x10
>>>>> [ 7496.553074]  schedule+0x3d/0x90
>>>>> [ 7496.553087]  vhost_net_ubuf_put_and_wait+0x73/0xa0 [vhost_net]
>>>>> [ 7496.553100]  ? finish_wait+0x90/0x90
>>>>> [ 7496.553115]  vhost_net_ioctl+0x542/0x910 [vhost_net]
>>>>> [ 7496.553144]  do_vfs_ioctl+0xa6/0x6c0
>>>>> [ 7496.553166]  SyS_ioctl+0x79/0x90
>>>>> [ 7496.553182]  entry_SYSCALL_64_fastpath+0x1f/0xbe
>>>>
>>>> That vhost_net_ubuf_put_and_wait call was changed in this
>>>> commit, with the following commit message:
>>>>
>>>> commit 0ad8b480d6ee916aa84324f69acf690142aecd0e
>>>> Author: Michael S. Tsirkin <mst@redhat.com>
>>>> Date:   Thu Feb 13 11:42:05 2014 +0200
>>>>
>>>>     vhost: fix ref cnt checking deadlock
>>>>
>>>>     vhost checked the counter within the refcnt before decrementing.  It
>>>>     really wanted to know that it is the one that has the last reference, as
>>>>     a way to batch freeing resources a bit more efficiently.
>>>>
>>>>     Note: we only let refcount go to 0 on device release.
>>>>
>>>>     This works well but we now access the ref counter twice so there's a
>>>>     race: all users might see a high count and decide to defer freeing
>>>>     resources.
>>>>     In the end no one initiates freeing resources until the last reference
>>>>     is gone (which is on VM shotdown so might happen after a looooong time).
>>>>
>>>>     Let's do what we probably should have done straight away:
>>>>     switch from kref to plain atomic, documenting the
>>>>     semantics, return the refcount value atomically after decrement,
>>>>     then use that to avoid the deadlock.
>>>>
>>>>     Reported-by: Qin Chuanyu <qinchuanyu@huawei.com>
>>>>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>     Acked-by: Jason Wang <jasowang@redhat.com>
>>>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>>>
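
The pattern that commit message describes, taking the counter value from the decrement itself instead of reading it in a separate step, can be sketched in user-space C11. This is only an illustration under stated assumptions: `struct ubuf_ref` and the `ubuf_ref_*` helpers are hypothetical names, not the vhost_net code, and the kernel uses its own atomic primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for a vhost-net ubuf reference. */
struct ubuf_ref {
    atomic_int refcount; /* starts at 1: the owner holds that reference */
};

static void ubuf_ref_init(struct ubuf_ref *u)
{
    atomic_init(&u->refcount, 1);
}

/* Take one more reference, e.g. for a packet still in flight. */
static void ubuf_ref_get(struct ubuf_ref *u)
{
    atomic_fetch_add(&u->refcount, 1);
}

/*
 * Drop one reference and return the value after the decrement.
 * The decrement and the read are one atomic operation, so exactly
 * one caller can ever observe 0. Reading the counter first and
 * decrementing afterwards would let several callers see a stale
 * high value and all defer the cleanup.
 */
static int ubuf_ref_put(struct ubuf_ref *u)
{
    return atomic_fetch_sub(&u->refcount, 1) - 1;
}
```

With this shape, a caller that gets 0 back from `ubuf_ref_put` knows it dropped the last reference. A waiter that drops its own reference and then sleeps until the count reaches 0 would still block for as long as any other holder keeps a reference, which is consistent with the stack trace above.
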
>>>>
>>>>
>>>> So at this point, are we hitting a deadlock when using 
>>>> experimental_zcopytx ? 
>>>
>>> Yes. But another possibility is that it is not caused by vhost_net
>>> itself but by some other place that is holding on to a packet.
>>>
>>> Thanks
>>
>> While bisecting, when I reach commit
>> 46d4b68f891bee5d83a32508bfbd9778be6b1b63, the kernel panics
>> when I run virt-customize:
>>
>> Message from syslogd@zappa at Dec  8 12:52:06 ...
>>  kernel:[  350.016376] Kernel panic - not syncing: Fatal exception in interrupt
>>
>> I marked that commit as bad again.   Will continue bisecting!
>>
>
> It looks like the first bad commit would be the following:
>
> [jenkins@zappa linux-stable-new]$ sudo bash bisect.sh -g
> 3ece782693c4b64d588dd217868558ab9a19bfe7 is the first bad commit
> commit 3ece782693c4b64d588dd217868558ab9a19bfe7
> Author: Willem de Bruijn <willemb@google.com>
> Date:   Thu Aug 3 16:29:38 2017 -0400
>
>     sock: skb_copy_ubufs support for compound pages
>
>     Refine skb_copy_ubufs to support compound pages. With upcoming TCP
>     zerocopy sendmsg, such fragments may appear.
>
>     The existing code replaces each page one for one. Splitting each
>     compound page into an independent number of regular pages can result
>     in exceeding limit MAX_SKB_FRAGS if data is not exactly page aligned.
>
>     Instead, fill all destination pages but the last to PAGE_SIZE.
>     Split the existing alloc + copy loop into separate stages:
>     1. compute bytelength and minimum number of pages to store this.
>     2. allocate
>     3. copy, filling each page except the last to PAGE_SIZE bytes
>     4. update skb frag array
>
>     Signed-off-by: Willem de Bruijn <willemb@google.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> :040000 040000 f1b652be7e59b1046400cad8e6be25028a88b8e2 6ecf86d9f06a2d98946f531f1e4cf803de071b10 M    include
> :040000 040000 8420cf451fcf51f669ce81437ce7e0aacc33d2eb 4fc8384362693e4619fab39b0a945f6f2349226b M    net
>
> Here is the bisect log:

Thanks for the hard work bisecting.

Cc netdev and Willem.
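
For netdev readers, the staged copy described in the quoted commit message can be sketched in user-space C. This is a rough illustration, not the kernel's skb_copy_ubufs: `struct frag`, `SKETCH_PAGE_SIZE`, `pages_needed` and `copy_frags_packed` are hypothetical names, plain malloc stands in for page allocation, and stage 4 (updating the skb frag array) is reduced to handing back the page array and the length of the last page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical source fragment: a byte range of arbitrary length. */
struct frag {
    const unsigned char *data;
    size_t len;
};

/* Stage 1: minimum number of pages needed to hold `bytes` bytes. */
static size_t pages_needed(size_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Stages 2 and 3: allocate the pages, then copy all fragments back to
 * back so that every page except the last is filled to the page size.
 * Returns the page count (0 on empty input or allocation failure) and
 * reports the page array and the bytes used in the last page.
 */
static size_t copy_frags_packed(const struct frag *frags, size_t n,
                                unsigned char ***pages_out, size_t *last_len)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += frags[i].len;

    size_t npages = pages_needed(total);
    if (npages == 0)
        return 0;

    unsigned char **pages = calloc(npages, sizeof(*pages));
    if (!pages)
        return 0;
    for (size_t p = 0; p < npages; p++) {
        pages[p] = malloc(SKETCH_PAGE_SIZE);
        if (!pages[p]) {
            while (p--)          /* undo the pages allocated so far */
                free(pages[p]);
            free(pages);
            return 0;
        }
    }

    size_t page = 0, off = 0;
    for (size_t i = 0; i < n; i++) {
        size_t done = 0;
        while (done < frags[i].len) {
            if (off == SKETCH_PAGE_SIZE) { /* current page is full */
                page++;
                off = 0;
            }
            size_t chunk = frags[i].len - done;
            if (chunk > SKETCH_PAGE_SIZE - off)
                chunk = SKETCH_PAGE_SIZE - off;
            memcpy(pages[page] + off, frags[i].data + done, chunk);
            off += chunk;
            done += chunk;
        }
    }

    *pages_out = pages;
    *last_len = off;
    return npages;
}
```

Packing the data this way keeps the destination page count at the minimum for the total byte length, independent of how many source fragments there were or how they were aligned, which is the property the commit message relies on to stay within MAX_SKB_FRAGS.
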


>
> [root@zappa linux-stable-new]# git bisect log
> git bisect start
> # bad: [2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e] Linux 4.14-rc1
> git bisect bad 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
> # good: [e87c13993f16549e77abce9744af844c55154349] Linux 4.13.16
> git bisect good e87c13993f16549e77abce9744af844c55154349
> # good: [569dbb88e80deb68974ef6fdd6a13edb9d686261] Linux 4.13
> git bisect good 569dbb88e80deb68974ef6fdd6a13edb9d686261
> # good: [569dbb88e80deb68974ef6fdd6a13edb9d686261] Linux 4.13
> git bisect good 569dbb88e80deb68974ef6fdd6a13edb9d686261
> # bad: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> git bisect bad aae3dbb4776e7916b6cd442d00159bea27a695c1
> # good: [bf1d6b2c76eda86159519bf5c427b1fa8f51f733] Merge tag 'staging-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> git bisect good bf1d6b2c76eda86159519bf5c427b1fa8f51f733
> # bad: [e833251ad813168253fef9915aaf6a8c883337b0] rxrpc: Add notification of end-of-Tx phase
> git bisect bad e833251ad813168253fef9915aaf6a8c883337b0
> # bad: [46d4b68f891bee5d83a32508bfbd9778be6b1b63] Merge tag 'wireless-drivers-next-for-davem-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
> git bisect bad 46d4b68f891bee5d83a32508bfbd9778be6b1b63
> # good: [cf6c6ea352faadb15d1373d890bf857080b218a4] iwlwifi: mvm: fix the FIFO numbers in A000 devices
> git bisect good cf6c6ea352faadb15d1373d890bf857080b218a4
> # good: [65205cc465e9b37abbdbb3d595c46081b97e35bc] sctp: remove the typedef sctp_addiphdr_t
> git bisect good 65205cc465e9b37abbdbb3d595c46081b97e35bc
> # bad: [ecbd87b8430419199cc9dd91598d5552a180f558] phylink: add support for MII ioctl access to Clause 45 PHYs
> git bisect bad ecbd87b8430419199cc9dd91598d5552a180f558
> # bad: [52267790ef52d7513879238ca9fac22c1733e0e3] sock: add MSG_ZEROCOPY
> git bisect bad 52267790ef52d7513879238ca9fac22c1733e0e3
> # good: [04b1d4e50e82536c12da00ee04a77510c459c844] net: core: Make the FIB notification chain generic
> git bisect good 04b1d4e50e82536c12da00ee04a77510c459c844
> # good: [9217d8c2fe743f02a3ce6d430fe3b5d514fd5f1c] ipv6: Regenerate host route according to node pointer upon loopback up
> git bisect good 9217d8c2fe743f02a3ce6d430fe3b5d514fd5f1c
> # good: [0a7fd1ac2a6b316ceeb9a57a41ce0c45f6bff549] mlxsw: spectrum_router: Add support for route replace
> git bisect good 0a7fd1ac2a6b316ceeb9a57a41ce0c45f6bff549
> # good: [84b7187ca2338832e3af58eb5123c02bb6921e4e] Merge branch 'mlxsw-Support-for-IPv6-UC-router'
> git bisect good 84b7187ca2338832e3af58eb5123c02bb6921e4e
> # bad: [3ece782693c4b64d588dd217868558ab9a19bfe7] sock: skb_copy_ubufs support for compound pages
> git bisect bad 3ece782693c4b64d588dd217868558ab9a19bfe7
> # good: [98ba0bd5505dcbb90322a4be07bcfe6b8a18c73f] sock: allocate skbs from optmem
> git bisect good 98ba0bd5505dcbb90322a4be07bcfe6b8a18c73f
> # first bad commit: [3ece782693c4b64d588dd217868558ab9a19bfe7] sock: skb_copy_ubufs support for compound pages
>
>
