All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
@ 2017-11-13 15:35 bugzilla-daemon
  2017-12-04 20:33 ` [Bug 197861] " bugzilla-daemon
                   ` (40 more replies)
  0 siblings, 41 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-11-13 15:35 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

            Bug ID: 197861
           Summary: Shutting down a VM with Kernel 4.14 will sometime hang
                    and a reboot is the only way to recover.
           Product: Virtualization
           Version: unspecified
    Kernel Version: 4.14-rc8
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: hilld@binarystorm.net
        Regression: No

[ 7496.552971] INFO: task qemu-system-x86:5978 blocked for more than 120
seconds.
[ 7496.552987]       Tainted: G          I     4.14.0-0.rc1.git3.1.fc28.x86_64
#1
[ 7496.552996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 7496.553006] qemu-system-x86 D12240  5978      1 0x00000004
[ 7496.553024] Call Trace:
[ 7496.553044]  __schedule+0x2dc/0xbb0
[ 7496.553055]  ? trace_hardirqs_on+0xd/0x10
[ 7496.553074]  schedule+0x3d/0x90
[ 7496.553087]  vhost_net_ubuf_put_and_wait+0x73/0xa0 [vhost_net]
[ 7496.553100]  ? finish_wait+0x90/0x90
[ 7496.553115]  vhost_net_ioctl+0x542/0x910 [vhost_net]
[ 7496.553144]  do_vfs_ioctl+0xa6/0x6c0
[ 7496.553166]  SyS_ioctl+0x79/0x90
[ 7496.553182]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7496.553190] RIP: 0033:0x7fa1ea0e1817
[ 7496.553196] RSP: 002b:00007ffe3d854bc8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[ 7496.553209] RAX: ffffffffffffffda RBX: 000000000000001d RCX:
00007fa1ea0e1817
[ 7496.553215] RDX: 00007ffe3d854bd0 RSI: 000000004008af30 RDI:
0000000000000021
[ 7496.553222] RBP: 000055e33352b610 R08: 000055e33024a6f0 R09:
000055e330245d92
[ 7496.553228] R10: 000055e33344e7f0 R11: 0000000000000246 R12:
000055e33351a000
[ 7496.553235] R13: 0000000000000001 R14: 0000000400000000 R15:
0000000000000000
[ 7496.553284]
               Showing all locks held in the system:
[ 7496.553313] 1 lock held by khungtaskd/161:
[ 7496.553319]  #0:  (tasklist_lock){.+.+}, at: [<ffffffff8111740d>]
debug_show_all_locks+0x3d/0x1a0
[ 7496.553373] 1 lock held by in:imklog/1194:
[ 7496.553379]  #0:  (&f->f_pos_lock){+.+.}, at: [<ffffffff8130ecfc>]
__fdget_pos+0x4c/0x60
[ 7496.553541] 1 lock held by qemu-system-x86/5978:
[ 7496.553547]  #0:  (&dev->mutex#3){+.+.}, at: [<ffffffffc077e498>]
vhost_net_ioctl+0x358/0x910 [vhost_net]



I'm currently bisecting to figure out which commit breaks this but for some
reasons, when hitting this commit:

# bad: [46d4b68f891bee5d83a32508bfbd9778be6b1b63] Merge tag
'wireless-drivers-next-for-davem-2017-08-07' of
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
git bisect bad 46d4b68f891bee5d83a32508bfbd9778be6b1b63

the host will not allow SSHd to establish a new session and when starting a KVM
guest, the host will hard lock.   I'm still bisecting but I marked that commit
as bad even though perhaps it would be good.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
@ 2017-12-04 20:33 ` bugzilla-daemon
  2017-12-11 11:34 ` bugzilla-daemon
                   ` (39 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-04 20:33 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #1 from David Hill (hilld@binarystorm.net) ---
The same happens with v4.15.0-rc*

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
  2017-12-04 20:33 ` [Bug 197861] " bugzilla-daemon
@ 2017-12-11 11:34 ` bugzilla-daemon
  2017-12-12 18:55 ` bugzilla-daemon
                   ` (38 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-11 11:34 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

Ian Kumlien (Ian.kumlien@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Ian.kumlien@gmail.com

--- Comment #2 from Ian Kumlien (Ian.kumlien@gmail.com) ---
I'm hitting the exact same issue on multiple hosts - shouldn't this be a
regression?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
  2017-12-04 20:33 ` [Bug 197861] " bugzilla-daemon
  2017-12-11 11:34 ` bugzilla-daemon
@ 2017-12-12 18:55 ` bugzilla-daemon
  2017-12-12 19:15 ` bugzilla-daemon
                   ` (37 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-12 18:55 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

Klaus Mueller (kmueller@justmail.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmueller@justmail.de

--- Comment #3 from Klaus Mueller (kmueller@justmail.de) ---
Take a look here: 
https://www.mail-archive.com/netdev@vger.kernel.org/msg204727.html

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (2 preceding siblings ...)
  2017-12-12 18:55 ` bugzilla-daemon
@ 2017-12-12 19:15 ` bugzilla-daemon
  2017-12-12 19:18 ` bugzilla-daemon
                   ` (36 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-12 19:15 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #4 from David Hill (hilld@binarystorm.net) ---
Great that's a nice thing I didn't find prior today ...   With
4.15.0-0.rc2.git2.1 it's even worse:

[jenkins@zappa linux-stable-new]$ sudo virsh list
 Id    Name                           State
----------------------------------------------------
 2     undercloud-0-rhosp9            in shutdown
 17    ceph-0-rhosp9                  in shutdown
 18    control-2-rhosp9               in shutdown
 19    control-1-rhosp9               in shutdown
 20    control-0-rhosp9               in shutdown

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (3 preceding siblings ...)
  2017-12-12 19:15 ` bugzilla-daemon
@ 2017-12-12 19:18 ` bugzilla-daemon
  2017-12-12 22:13 ` bugzilla-daemon
                   ` (35 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-12 19:18 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #5 from David Hill (hilld@binarystorm.net) ---
I posted that to the KVM mailing list yesterday:

It looks like the first bad commit would be the following:

[jenkins@zappa linux-stable-new]$ sudo bash bisect.sh -g
3ece782693c4b64d588dd217868558ab9a19bfe7 is the first bad commit
commit 3ece782693c4b64d588dd217868558ab9a19bfe7
Author: Willem de Bruijn <willemb@google.com>
Date:   Thu Aug 3 16:29:38 2017 -0400

    sock: skb_copy_ubufs support for compound pages

    Refine skb_copy_ubufs to support compound pages. With upcoming TCP
    zerocopy sendmsg, such fragments may appear.

    The existing code replaces each page one for one. Splitting each
    compound page into an independent number of regular pages can result
    in exceeding limit MAX_SKB_FRAGS if data is not exactly page aligned.

    Instead, fill all destination pages but the last to PAGE_SIZE.
    Split the existing alloc + copy loop into separate stages:
    1. compute bytelength and minimum number of pages to store this.
    2. allocate
    3. copy, filling each page except the last to PAGE_SIZE bytes
    4. update skb frag array

    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 f1b652be7e59b1046400cad8e6be25028a88b8e2
6ecf86d9f06a2d98946f531f1e4cf803de071b10 M    include
:040000 040000 8420cf451fcf51f669ce81437ce7e0aacc33d2eb
4fc8384362693e4619fab39b0a945f6f2349226b M    net

Here is the bisect log:

[root@zappa linux-stable-new]# git bisect log
git bisect start
# bad: [2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e] Linux 4.14-rc1
git bisect bad 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
# good: [e87c13993f16549e77abce9744af844c55154349] Linux 4.13.16
git bisect good e87c13993f16549e77abce9744af844c55154349
# good: [569dbb88e80deb68974ef6fdd6a13edb9d686261] Linux 4.13
git bisect good 569dbb88e80deb68974ef6fdd6a13edb9d686261
# good: [569dbb88e80deb68974ef6fdd6a13edb9d686261] Linux 4.13
git bisect good 569dbb88e80deb68974ef6fdd6a13edb9d686261
# bad: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad aae3dbb4776e7916b6cd442d00159bea27a695c1
# good: [bf1d6b2c76eda86159519bf5c427b1fa8f51f733] Merge tag 'staging-4.14-rc1'
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect good bf1d6b2c76eda86159519bf5c427b1fa8f51f733
# bad: [e833251ad813168253fef9915aaf6a8c883337b0] rxrpc: Add notification of
end-of-Tx phase
git bisect bad e833251ad813168253fef9915aaf6a8c883337b0
# bad: [46d4b68f891bee5d83a32508bfbd9778be6b1b63] Merge tag
'wireless-drivers-next-for-davem-2017-08-07' of
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
git bisect bad 46d4b68f891bee5d83a32508bfbd9778be6b1b63
# good: [cf6c6ea352faadb15d1373d890bf857080b218a4] iwlwifi: mvm: fix the FIFO
numbers in A000 devices
git bisect good cf6c6ea352faadb15d1373d890bf857080b218a4
# good: [65205cc465e9b37abbdbb3d595c46081b97e35bc] sctp: remove the typedef
sctp_addiphdr_t
git bisect good 65205cc465e9b37abbdbb3d595c46081b97e35bc
# bad: [ecbd87b8430419199cc9dd91598d5552a180f558] phylink: add support for MII
ioctl access to Clause 45 PHYs
git bisect bad ecbd87b8430419199cc9dd91598d5552a180f558
# bad: [52267790ef52d7513879238ca9fac22c1733e0e3] sock: add MSG_ZEROCOPY
git bisect bad 52267790ef52d7513879238ca9fac22c1733e0e3
# good: [04b1d4e50e82536c12da00ee04a77510c459c844] net: core: Make the FIB
notification chain generic
git bisect good 04b1d4e50e82536c12da00ee04a77510c459c844
# good: [9217d8c2fe743f02a3ce6d430fe3b5d514fd5f1c] ipv6: Regenerate host route
according to node pointer upon loopback up
git bisect good 9217d8c2fe743f02a3ce6d430fe3b5d514fd5f1c
# good: [0a7fd1ac2a6b316ceeb9a57a41ce0c45f6bff549] mlxsw: spectrum_router: Add
support for route replace
git bisect good 0a7fd1ac2a6b316ceeb9a57a41ce0c45f6bff549
# good: [84b7187ca2338832e3af58eb5123c02bb6921e4e] Merge branch
'mlxsw-Support-for-IPv6-UC-router'
git bisect good 84b7187ca2338832e3af58eb5123c02bb6921e4e
# bad: [3ece782693c4b64d588dd217868558ab9a19bfe7] sock: skb_copy_ubufs support
for compound pages
git bisect bad 3ece782693c4b64d588dd217868558ab9a19bfe7
# good: [98ba0bd5505dcbb90322a4be07bcfe6b8a18c73f] sock: allocate skbs from
optmem
git bisect good 98ba0bd5505dcbb90322a4be07bcfe6b8a18c73f
# first bad commit: [3ece782693c4b64d588dd217868558ab9a19bfe7] sock:
skb_copy_ubufs support for compound pages

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (4 preceding siblings ...)
  2017-12-12 19:18 ` bugzilla-daemon
@ 2017-12-12 22:13 ` bugzilla-daemon
  2017-12-12 22:25 ` bugzilla-daemon
                   ` (34 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-12 22:13 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

Willem de Bruijn (willemb@google.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |willemb@google.com

--- Comment #6 from Willem de Bruijn (willemb@google.com) ---
Does reverting only that patch resolve the issue?

The logic in it is quite complex, but it is only needed for zerocopy with
MSG_ZEROCOPY. And then only in edge cases.

This is likely not used here. The code is blocking on vhost_net use of
zerocopy. Which does not build skbuffs with zerocopy data in compound pages.

The patch adds checks against shared and cloned skbs that were not present
before. It is not safe to modify skb frags[] on on either, but perhaps this
changed return path causes buffers to not be released, causing
vhost_net_ubuf_put_and_wait to wait seemingly indefinitely.

+       if (skb_shared(skb) || skb_unclone(skb, gfp_mask))
+               return -EINVAL;

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (5 preceding siblings ...)
  2017-12-12 22:13 ` bugzilla-daemon
@ 2017-12-12 22:25 ` bugzilla-daemon
  2017-12-13  3:48 ` bugzilla-daemon
                   ` (33 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-12 22:25 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #7 from Willem de Bruijn (willemb@google.com) ---
If that fixes it, the output with this patch would be informative.

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a95877a8ac8b..7c4ba11cb429 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -941,8 +941,14 @@ int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
        if (!num_frags)
                return 0;

-       if (skb_shared(skb) || skb_unclone(skb, gfp_mask))
+       if (skb_shared(skb)) {
+               WARN_ONCE(1, "shared skb");
                return -EINVAL;
+       }
+       if (skb_unclone(skb, gfp_mask)) {
+               WARN_ONCE(1, "cloned skb");
+               return -EINVAL;
+       }

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (6 preceding siblings ...)
  2017-12-12 22:25 ` bugzilla-daemon
@ 2017-12-13  3:48 ` bugzilla-daemon
  2017-12-13  4:21 ` bugzilla-daemon
                   ` (32 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13  3:48 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #8 from David Hill (hilld@binarystorm.net) ---
If I revert the patch 3ece782693c4b64d588dd217868558ab9a19bfe7, I can't apply
the patch you mentionned unless you want me to apply it on the unreverted
branch ?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (7 preceding siblings ...)
  2017-12-13  3:48 ` bugzilla-daemon
@ 2017-12-13  4:21 ` bugzilla-daemon
  2017-12-13  4:22 ` bugzilla-daemon
                   ` (31 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13  4:21 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #9 from Willem de Bruijn (willemb@google.com) ---
Yes, I mean that if that revert fixes it, then this patch is the cause. Then it
would be helpful to see whether it is due to those more stringent precondition
checks and, if so, which codepath it is that leads to this function with an skb
that does not match those checks. Existing code should pass them too, even if
the checks where not there in practice before. Sorry for the confusion.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (8 preceding siblings ...)
  2017-12-13  4:21 ` bugzilla-daemon
@ 2017-12-13  4:22 ` bugzilla-daemon
  2017-12-13 18:05 ` bugzilla-daemon
                   ` (30 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13  4:22 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #10 from Willem de Bruijn (willemb@google.com) ---
So please apply it on the unreverted branch.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (9 preceding siblings ...)
  2017-12-13  4:22 ` bugzilla-daemon
@ 2017-12-13 18:05 ` bugzilla-daemon
  2017-12-13 19:17 ` bugzilla-daemon
                   ` (29 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13 18:05 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #11 from David Hill (hilld@binarystorm.net) ---
I've applied the suggested patch on the current branch but the patch didn't
apply somehow ... I had to change it manually so I guess it's the tabs that are
used for indentation and the spaces I copy/pasted from this BZ.   I couldn't
revert the commit using "git revert" so my guess here is that I'd have to go
back to before this commit, compile and retry to reproduce but we already know
it works according to the bisection so I was wondering if we still had to try
that ?  I tried checking out skbuff.c from the working 4.13.0 branch but the
kernel didn't compile because some functions were probably changed and the
validation failed at the end.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (10 preceding siblings ...)
  2017-12-13 18:05 ` bugzilla-daemon
@ 2017-12-13 19:17 ` bugzilla-daemon
  2017-12-13 20:27 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13 19:17 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #12 from Willem de Bruijn (willemb@google.com) ---
> we already know it works according to the bisection
> so I was wondering if we still had to try that ?

It is not strictly necessary, but it would increase confidence
that we found the culprit. I don't know how easy or hard it is
to reproduce for you, so skip if this it too much work.

But in case you can reproduce it, I pushed v4.14 with the
(indeed non-trivial) revert to

  https://github.com/wdebruij/linux/commits/v4.14-revert-copyubufs

If only time for either this or the WARN_ONCE instrumentation,
then that variant is more helpful. Thanks.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (11 preceding siblings ...)
  2017-12-13 19:17 ` bugzilla-daemon
@ 2017-12-13 20:27 ` bugzilla-daemon
  2017-12-13 22:48 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13 20:27 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #13 from David Hill (hilld@binarystorm.net) ---
It's really easy to reproduce but I hit issues at various stages of the CI I'm
running when I hit this issue.

Sometimes the VMs is stuck in shutting down, sometimes one of the VM won't get
an IP address from the DHCP server it has to contact, sometimes the server will
simply kernel panic.   

I've applied that patch to my current v4.14.5 branch and am compiling it now
and as soon as it's done, I'll reboot and retry reproducing this again.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (12 preceding siblings ...)
  2017-12-13 20:27 ` bugzilla-daemon
@ 2017-12-13 22:48 ` bugzilla-daemon
  2017-12-13 23:09 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13 22:48 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #14 from David Hill (hilld@binarystorm.net) ---
So I can still reproduce this issue with the patch reverted... Now I'm puzzled.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (13 preceding siblings ...)
  2017-12-13 22:48 ` bugzilla-daemon
@ 2017-12-13 23:09 ` bugzilla-daemon
  2017-12-14  1:54 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-13 23:09 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #15 from Willem de Bruijn (willemb@google.com) ---
It is quite likely that this patch --or patchset-- is involved, as the hang
happens in cleaning up zerocopy skb state and this patchset extends zerocopy.

Because of conflicts I had to resolve this patch is not a pure revert, but the
changes should be limited to replacing

        uarg->callback(uarg, false);
        skb_shinfo(skb)->tx_flags &= ~SKBTX_DEV_ZEROCOPY;

with an skb_zcopy_clear(skb, false) that boils down to the same.

One option would be to test while reverting the entire patchset. I may be able
to do that, no idea how many conflicts I'd have to resolve.

Perhaps first, could you try the other suggestion with the warnings. If those
fire, then we know for sure that this is implicated.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (14 preceding siblings ...)
  2017-12-13 23:09 ` bugzilla-daemon
@ 2017-12-14  1:54 ` bugzilla-daemon
  2017-12-14  2:36 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14  1:54 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #16 from David Hill (hilld@binarystorm.net) ---
I tried the WARN_ONCE on the v4.14.5 branch and I didn't find those output in
the logs.  Does this make sense?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (15 preceding siblings ...)
  2017-12-14  1:54 ` bugzilla-daemon
@ 2017-12-14  2:36 ` bugzilla-daemon
  2017-12-14  3:23 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14  2:36 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #17 from Willem de Bruijn (willemb@google.com) ---
Thanks for trying. You did encounter the hangs? Then it's not due to having
added those checks and failing skb_copy_ubufs as of this patch.

It's increasingly likely that it is another patch in the patchset, then.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (16 preceding siblings ...)
  2017-12-14  2:36 ` bugzilla-daemon
@ 2017-12-14  3:23 ` bugzilla-daemon
  2017-12-14 17:39 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14  3:23 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #18 from Klaus Mueller (kmueller@justmail.de) ---
(In reply to David Hill from comment #16)
> I tried the WARN_ONCE on the v4.14.5 branch and I didn't find those output
> in the logs.  Does this make sense?

Could you please try to revert patchset "Remove UDP
Fragmentation Offload support" as described by Andreas here: 
https://www.mail-archive.com/netdev@vger.kernel.org/msg204727.html ?

He added the complete revert patch for 4.14.4

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (17 preceding siblings ...)
  2017-12-14  3:23 ` bugzilla-daemon
@ 2017-12-14 17:39 ` bugzilla-daemon
  2017-12-14 17:58 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14 17:39 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #19 from Willem de Bruijn (willemb@google.com) ---
Andreas just mentioned in that thread that the full revert resolves all issues
for him, including this hang in

  vhost_net_ubuf_put_and_wait

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (18 preceding siblings ...)
  2017-12-14 17:39 ` bugzilla-daemon
@ 2017-12-14 17:58 ` bugzilla-daemon
  2017-12-14 18:06 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14 17:58 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #20 from David Hill (hilld@binarystorm.net) ---
So will it get fixed or will we have to rollback those commits ?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (19 preceding siblings ...)
  2017-12-14 17:58 ` bugzilla-daemon
@ 2017-12-14 18:06 ` bugzilla-daemon
  2017-12-27  4:00 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-14 18:06 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #21 from Willem de Bruijn (willemb@google.com) ---
I meant to give a +1 to Klaus's request to test that that solves your
regression, too.

We will have to dig deeper and fix the actual remaining issue, instead of
reverting the full patchset. That revert solved a lot of bugs in the UFO stack,
so an wholesale revert is not a solution.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (20 preceding siblings ...)
  2017-12-14 18:06 ` bugzilla-daemon
@ 2017-12-27  4:00 ` bugzilla-daemon
  2017-12-27 16:58 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27  4:00 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #22 from Willem de Bruijn (willemb@google.com) ---
The following fix has been merged

  skbuff: orphan frags before zerocopy clone
  268b790679422a89e9ab0685d9f291edae780c98
  v4.15-rc5~10^2^2~1
  http://patchwork.ozlabs.org/patch/851715/

That resolved another report of what is most likely
the same issue. Tested together with this follow-up
that is currently awaiting review to net-next 

  skbuff: in skb_segment, call zerocopy functions once per nskb
  http://patchwork.ozlabs.org/patch/852598/

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (21 preceding siblings ...)
  2017-12-27  4:00 ` bugzilla-daemon
@ 2017-12-27 16:58 ` bugzilla-daemon
  2017-12-27 17:40 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 16:58 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #23 from David Hill (hilld@binarystorm.net) ---
I'll try reproducing the problem with 4.15-rc5 and will keep you updated .

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (22 preceding siblings ...)
  2017-12-27 16:58 ` bugzilla-daemon
@ 2017-12-27 17:40 ` bugzilla-daemon
  2017-12-27 17:50 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 17:40 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #24 from Dan Moulding (dmoulding@me.com) ---
I can confirm that with 4.14.9 with the fix applied, I can no longer reproduce
the problem. My VMs are shutting down nicely now.

I can finally get off the EOL 4.13 series. Thank you Willem for your efforts to
resolve this!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (23 preceding siblings ...)
  2017-12-27 17:40 ` bugzilla-daemon
@ 2017-12-27 17:50 ` bugzilla-daemon
  2017-12-27 19:22 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 17:50 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #25 from Willem de Bruijn (willemb@google.com) ---
Glad to hear it! Thanks for updating the bug. Time to close it?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (24 preceding siblings ...)
  2017-12-27 17:50 ` bugzilla-daemon
@ 2017-12-27 19:22 ` bugzilla-daemon
  2017-12-27 22:04 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 19:22 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #26 from Dan Moulding (dmoulding@me.com) ---
Maybe we should wait and see if David Hill can also confirm the problem goes
away on his setup before closing it?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (25 preceding siblings ...)
  2017-12-27 19:22 ` bugzilla-daemon
@ 2017-12-27 22:04 ` bugzilla-daemon
  2017-12-27 23:26 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 22:04 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

LimeTech (tomm@lime-technology.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tomm@lime-technology.com

--- Comment #27 from LimeTech (tomm@lime-technology.com) ---
(In reply to Willem de Bruijn from comment #22)
> The following fix has been merged
> 
>   skbuff: orphan frags before zerocopy clone
>   268b790679422a89e9ab0685d9f291edae780c98
>   v4.15-rc5~10^2^2~1
>   http://patchwork.ozlabs.org/patch/851715/
> 
> That resolved another report of what is most likely
> the same issue. Tested together with this follow-up
> that is currently awaiting review to net-next 
> 
>   skbuff: in skb_segment, call zerocopy functions once per nskb
>   http://patchwork.ozlabs.org/patch/852598/

The second patch is partially un-doing something in the first patch.  Is this
correct, and all that is necessary for the 4.14.9 kernel?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (26 preceding siblings ...)
  2017-12-27 22:04 ` bugzilla-daemon
@ 2017-12-27 23:26 ` bugzilla-daemon
  2017-12-28 17:20 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-27 23:26 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #28 from Willem de Bruijn (willemb@google.com) ---

> The second patch is partially un-doing something in the first patch.

It changes how often the calls are performed. They only depend on the state of
the origin and destination skb. Calling them more often (once for each frag)
serves no purpose.

> Is this correct,

I would not have sent it if I thought otherwise.

> and all that is necessary for the 4.14.9 kernel?

No, the second one is an optimization sent onto to net-next, which will be
4.16.

I split them to make the actual bugfix that needs to be backported to stable as
small and obvious as possible.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (27 preceding siblings ...)
  2017-12-27 23:26 ` bugzilla-daemon
@ 2017-12-28 17:20 ` bugzilla-daemon
  2018-01-01 16:01 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2017-12-28 17:20 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #29 from David Hill (hilld@binarystorm.net) ---
I just retried with 4.15-rc5 and it looks like I'm no longer able to reproduce
this issue.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (28 preceding siblings ...)
  2017-12-28 17:20 ` bugzilla-daemon
@ 2018-01-01 16:01 ` bugzilla-daemon
  2018-01-02 10:01 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-01 16:01 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

Giancarlo Razzolini (grazzolini@archlinux.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |grazzolini@archlinux.org

--- Comment #30 from Giancarlo Razzolini (grazzolini@archlinux.org) ---
Patches for this issue were deployed to kernel 4.14.9 or they'll only land on
4.16? I have been affected by this issue ever since kernel 4.14 landed on
archlinux. Sometimes it happens when rebooting a guest also. Another thing
that's worth nothing is that this doesn't seem to happen when the guest was
just started and there was not much network traffic yet. But, after you start
using the guest, the qemu process remains in D state after shutting down the
guest. And the only way to not (possibly) loose data on the host (due to a
forced shutdown) is to use the Magic SysRq keys and either REISUB or REISUO
combinations.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (29 preceding siblings ...)
  2018-01-01 16:01 ` bugzilla-daemon
@ 2018-01-02 10:01 ` bugzilla-daemon
  2018-01-02 13:33 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-02 10:01 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #31 from Willem de Bruijn (willemb@google.com) ---
The patch is in 4.15-rc5 and the backport to 4.14 is queued:

https://patchwork.kernel.org/patch/10138995/

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (30 preceding siblings ...)
  2018-01-02 10:01 ` bugzilla-daemon
@ 2018-01-02 13:33 ` bugzilla-daemon
  2018-01-02 16:43 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-02 13:33 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #32 from Giancarlo Razzolini (grazzolini@archlinux.org) ---
Thanks for the info Willem. There is a workaround for this issue, which is to
pass the following option to the vhost_net module:

options vhost_net  experimental_zcopytx=0

I have tested this and it works.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (31 preceding siblings ...)
  2018-01-02 13:33 ` bugzilla-daemon
@ 2018-01-02 16:43 ` bugzilla-daemon
  2018-01-03  9:35 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-02 16:43 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #33 from LimeTech (tomm@lime-technology.com) ---
(In reply to Willem de Bruijn from comment #31)
> The patch is in 4.15-rc5 and the backport to 4.14 is queued:
> 
> https://patchwork.kernel.org/patch/10138995/

With that patch, we still sometimes see Ubuntu VM terminate but qemu process is
left in "uninterruptible sleep".  Related?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (32 preceding siblings ...)
  2018-01-02 16:43 ` bugzilla-daemon
@ 2018-01-03  9:35 ` bugzilla-daemon
  2018-01-03 11:41 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-03  9:35 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #34 from Willem de Bruijn (willemb@google.com) ---
Hard to say without knowing what the process is sleeping on.

If it still happens with Giancarlo's suggestion from #32 then we could exclude
this bug.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (33 preceding siblings ...)
  2018-01-03  9:35 ` bugzilla-daemon
@ 2018-01-03 11:41 ` bugzilla-daemon
  2018-01-04 22:55 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-03 11:41 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #35 from Giancarlo Razzolini (grazzolini@archlinux.org) ---
LimeTech, the kernel should be telling you exactly why the task is in blocked
state, unless you have disabled the hung_task_timeout_secs with:

echo 0 > /proc/sys/kernel/hung_task_timeout_secs

If you have done so, revert that, and you should be able to know why. If, on
the call trace you see vhost_net, then I think it is the same issue. That
workaround works just fine, I have been able to shutdown/reboot several guests,
with different OS'es, across 2 hosts with it.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (34 preceding siblings ...)
  2018-01-03 11:41 ` bugzilla-daemon
@ 2018-01-04 22:55 ` bugzilla-daemon
  2018-01-10 13:21 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-04 22:55 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #36 from LimeTech (tomm@lime-technology.com) ---
We have confirmed this issue seems be solved in 4.14.11.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (35 preceding siblings ...)
  2018-01-04 22:55 ` bugzilla-daemon
@ 2018-01-10 13:21 ` bugzilla-daemon
  2018-01-10 14:25 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-10 13:21 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

bubez (michele.mase@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michele.mase@gmail.com

--- Comment #37 from bubez (michele.mase@gmail.com) ---
Host: ubuntu 17.10, vanilla kernel 4.14.12, nested virtualization and vhost_net
workaround aplied
options kvm_intel nested=1
options vhost_net experimental_zcopytx=0

Problem: can always be reproduced on redhat/centos7.x, after about 8 hour of
guest uptime, guest machine hangs

How to reproduce: boot a centos/redhat7.x guest vm (a minimal installation
should be ok), and wait about 8hours, the period may vary. You can give a tail
command on syslog to see some detailed message (for example tail -f
/var/log/messages)

Guest kernel: 3.10.0-693.11.6.el7.x86_64

Syslog output: /var/log/messages
Jan 10 12:56:03 kvm178 dbus[756]: [system] Activating via systemd: service
name='org.freedesktop.nm_dispatcher'
unit='dbus-org.freedesktop.nm-dispatcher.service'
Jan 10 12:56:03 kvm178 dhclient[911]: bound to 192.168.122.178 -- renewal in
1257 seconds.
Jan 10 12:56:28 kvm178 dbus[756]: [system] Failed to activate service
'org.freedesktop.nm_dispatcher': timed out
Jan 10 12:56:28 kvm178 dbus-daemon: dbus[756]: [system] Failed to activate
service 'org.freedesktop.nm_dispatcher': timed out
Jan 10 12:58:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 12:58:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 12:58:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 12:58:40 kvm178 kernel: Call Trace:
Jan 10 12:58:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 12:58:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 12:58:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 12:58:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 12:58:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 12:58:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 12:58:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:00:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:00:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:00:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:00:40 kvm178 kernel: Call Trace:
Jan 10 13:00:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:00:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:00:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:00:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:00:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:00:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:00:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:01:26 kvm178 systemd-logind: Failed to start session scope
session-23.scope: Connection timed out
Jan 10 13:02:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:02:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:02:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:02:40 kvm178 kernel: Call Trace:
Jan 10 13:02:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:02:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:02:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:02:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:02:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:02:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:02:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:04:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:04:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:04:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:04:40 kvm178 kernel: Call Trace:
Jan 10 13:04:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:04:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:04:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:04:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:04:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:04:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:04:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:06:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:06:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:06:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:06:40 kvm178 kernel: Call Trace:
Jan 10 13:06:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:06:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:06:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:06:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:06:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:06:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:06:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:08:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:08:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:08:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:08:40 kvm178 kernel: Call Trace:
Jan 10 13:08:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:08:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:08:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:08:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:08:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:08:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:08:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:10:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:10:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:10:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:10:40 kvm178 kernel: Call Trace:
Jan 10 13:10:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:10:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:10:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:10:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:10:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:10:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:10:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:12:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:12:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:12:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:12:40 kvm178 kernel: Call Trace:
Jan 10 13:12:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:12:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:12:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:12:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:12:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:12:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:12:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:14:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:14:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:14:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:14:40 kvm178 kernel: Call Trace:
Jan 10 13:14:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:14:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:14:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:14:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:14:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:14:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:14:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jan 10 13:16:40 kvm178 kernel: INFO: task systemd:1 blocked for more than 120
seconds.
Jan 10 13:16:40 kvm178 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 13:16:40 kvm178 kernel: systemd         D ffff88004d1d8000     0     1  
   0 0x00000000
Jan 10 13:16:40 kvm178 kernel: Call Trace:
Jan 10 13:16:40 kvm178 kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 10 13:16:40 kvm178 kernel: [<ffffffff810625bf>]
kvm_async_pf_task_wait+0x1df/0x230
Jan 10 13:16:40 kvm178 kernel: [<ffffffff810b34b0>] ?
wake_up_atomic_t+0x30/0x30
Jan 10 13:16:40 kvm178 kernel: [<ffffffff816afc00>] ? error_swapgs+0x61/0x18d
Jan 10 13:16:40 kvm178 kernel: [<ffffffff816afcef>] ? error_swapgs+0x150/0x18d
Jan 10 13:16:40 kvm178 kernel: [<ffffffff816b32d6>]
do_async_page_fault+0x96/0xd0
Jan 10 13:16:40 kvm178 kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
....
guest died, guest cpu 100%, hard reset on guest needed.

Guests with redhat/centos6.x (kernel 2.6.32-696.18.7.el6.x86_64) and windows10
doesn't have problems.
Hope this could help.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (36 preceding siblings ...)
  2018-01-10 13:21 ` bugzilla-daemon
@ 2018-01-10 14:25 ` bugzilla-daemon
  2018-01-15 13:26 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-10 14:25 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #38 from Giancarlo Razzolini (grazzolini@archlinux.org) ---
@bubez

4.14.12 does not require the workaround anymore. Can you remove it and test
again? Also, I think this issue has another patchset to improve it further on
4.16.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (37 preceding siblings ...)
  2018-01-10 14:25 ` bugzilla-daemon
@ 2018-01-15 13:26 ` bugzilla-daemon
  2018-01-15 13:33 ` bugzilla-daemon
  2018-01-15 13:34 ` bugzilla-daemon
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-15 13:26 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #39 from bubez (michele.mase@gmail.com) ---
@Razzolini
host kernel: vanilla 4.14.13
Tried again without the
options vhost_net experimental_zcopytx=0
and without the dhcp server of libvirtd/dnsmasq
and after 5 days, all redhat7.x machines Guest kernel:
3.10.0-693.11.6.el7.x86_64 were ko again.
5 days of guests uptime is better than 8 hours, but is not enough.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (38 preceding siblings ...)
  2018-01-15 13:26 ` bugzilla-daemon
@ 2018-01-15 13:33 ` bugzilla-daemon
  2018-01-15 13:34 ` bugzilla-daemon
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-15 13:33 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #40 from Giancarlo Razzolini (grazzolini@archlinux.org) ---
@bubez

As far as I know this bug doesn't kill or freeze the guests nor the host. It
just prevents the guests qemu processes from being ended, they remain on D
state. Last week I had a guest running for more than 4 days. But since this is
not a server, I did suspend the host a few times in between. I'll have more
data into servers this week.

Also, the log you sent is not the same for this bug, which is on the vhost_net
module. I think you might be hitting another issue.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [Bug 197861] Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover.
  2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
                   ` (39 preceding siblings ...)
  2018-01-15 13:33 ` bugzilla-daemon
@ 2018-01-15 13:34 ` bugzilla-daemon
  40 siblings, 0 replies; 42+ messages in thread
From: bugzilla-daemon @ 2018-01-15 13:34 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=197861

--- Comment #41 from Ian Kumlien (Ian.kumlien@gmail.com) ---
@bubez: I suspect that your issue is related to:
https://marc.info/?l=linux-kernel&m=151602104019357&w=2

And is not related to this issue.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2018-01-15 13:34 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-13 15:35 [Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometime hang and a reboot is the only way to recover bugzilla-daemon
2017-12-04 20:33 ` [Bug 197861] " bugzilla-daemon
2017-12-11 11:34 ` bugzilla-daemon
2017-12-12 18:55 ` bugzilla-daemon
2017-12-12 19:15 ` bugzilla-daemon
2017-12-12 19:18 ` bugzilla-daemon
2017-12-12 22:13 ` bugzilla-daemon
2017-12-12 22:25 ` bugzilla-daemon
2017-12-13  3:48 ` bugzilla-daemon
2017-12-13  4:21 ` bugzilla-daemon
2017-12-13  4:22 ` bugzilla-daemon
2017-12-13 18:05 ` bugzilla-daemon
2017-12-13 19:17 ` bugzilla-daemon
2017-12-13 20:27 ` bugzilla-daemon
2017-12-13 22:48 ` bugzilla-daemon
2017-12-13 23:09 ` bugzilla-daemon
2017-12-14  1:54 ` bugzilla-daemon
2017-12-14  2:36 ` bugzilla-daemon
2017-12-14  3:23 ` bugzilla-daemon
2017-12-14 17:39 ` bugzilla-daemon
2017-12-14 17:58 ` bugzilla-daemon
2017-12-14 18:06 ` bugzilla-daemon
2017-12-27  4:00 ` bugzilla-daemon
2017-12-27 16:58 ` bugzilla-daemon
2017-12-27 17:40 ` bugzilla-daemon
2017-12-27 17:50 ` bugzilla-daemon
2017-12-27 19:22 ` bugzilla-daemon
2017-12-27 22:04 ` bugzilla-daemon
2017-12-27 23:26 ` bugzilla-daemon
2017-12-28 17:20 ` bugzilla-daemon
2018-01-01 16:01 ` bugzilla-daemon
2018-01-02 10:01 ` bugzilla-daemon
2018-01-02 13:33 ` bugzilla-daemon
2018-01-02 16:43 ` bugzilla-daemon
2018-01-03  9:35 ` bugzilla-daemon
2018-01-03 11:41 ` bugzilla-daemon
2018-01-04 22:55 ` bugzilla-daemon
2018-01-10 13:21 ` bugzilla-daemon
2018-01-10 14:25 ` bugzilla-daemon
2018-01-15 13:26 ` bugzilla-daemon
2018-01-15 13:33 ` bugzilla-daemon
2018-01-15 13:34 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.