All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Bligh <alex@alex.org.uk>
To: Joe Jin <joe.jin@oracle.com>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	zheng.x.li@oracle.com, Xen Devel <xen-devel@lists.xen.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Alex Bligh <alex@alex.org.uk>
Subject: Re: kernel panic in skb_copy_bits
Date: Sun, 30 Jun 2013 10:13:35 +0100	[thread overview]
Message-ID: <6BFD5AF235F72F13CE646A0D@nimrod.local> (raw)
In-Reply-To: <51CD0E67.4000008@oracle.com>



--On 28 June 2013 12:17:43 +0800 Joe Jin <joe.jin@oracle.com> wrote:

> Find a similar issue
> http://www.gossamer-threads.com/lists/xen/devel/265611 So copied to Xen
> developer as well.

I thought this sounded familiar. I haven't got the start of this
thread, but what version of Xen are you running and what device
model? If before 4.3, there is a page lifetime bug in the kernel
(not the xen code) which can affect anything where the guest accesses
the host's block stack and that in turn accesses the networking
stack (it may in fact be wider than that). So, e.g. domU on
iCSSI will do it. It tends to get triggered by a TCP retransmit
or (on NFS) the RPC equivalent. Essentially block operation
is considered complete, returning through xen and freeing the
grant table entry, and yet something in the kernel (e.g. tcp
retransmit) can still access the data. The nature of the bug
is extensively discussed in that thread - you'll also find
a reference to a thread on linux-nfs which concludes it
isn't an nfs problem, and even some patches to fix it in the
kernel adding reference counting.

A workaround is to turn off O_DIRECT use by Xen as that ensures
the pages are copied. Xen 4.3 does this by default.

I believe fixes for this are in 4.3 and 4.2.2 if using the
qemu upstream DM. Note these aren't real fixes, just a workaround
of a kernel bug.

To fix on a local build of xen you will need something like this:
https://github.com/abligh/qemu-upstream-4.2-testing/commit/9a97c011e1a682eed9bc7195a25349eaf23ff3f9
and something like this (NB: obviously insert your own git
repo and commit numbers)
https://github.com/abligh/xen/commit/f5c344afac96ced8b980b9659fb3e81c4a0db5ca

Also note those fixes are (technically) unsafe for live migration
unless there is an ordering change made in qemu's block open
call.

Of course this might be something completely different.

-- 
Alex Bligh

WARNING: multiple messages have this Message-ID (diff)
From: Alex Bligh <alex@alex.org.uk>
To: Joe Jin <joe.jin@oracle.com>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>,
	zheng.x.li@oracle.com, Ian Campbell <Ian.Campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Alex Bligh <alex@alex.org.uk>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Xen Devel <xen-devel@lists.xen.org>,
	Jan Beulich <JBeulich@suse.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: kernel panic in skb_copy_bits
Date: Sun, 30 Jun 2013 10:13:35 +0100	[thread overview]
Message-ID: <6BFD5AF235F72F13CE646A0D@nimrod.local> (raw)
In-Reply-To: <51CD0E67.4000008@oracle.com>



--On 28 June 2013 12:17:43 +0800 Joe Jin <joe.jin@oracle.com> wrote:

> Find a similar issue
> http://www.gossamer-threads.com/lists/xen/devel/265611 So copied to Xen
> developer as well.

I thought this sounded familiar. I haven't got the start of this
thread, but what version of Xen are you running and what device
model? If before 4.3, there is a page lifetime bug in the kernel
(not the xen code) which can affect anything where the guest accesses
the host's block stack and that in turn accesses the networking
stack (it may in fact be wider than that). So, e.g. domU on
iCSSI will do it. It tends to get triggered by a TCP retransmit
or (on NFS) the RPC equivalent. Essentially block operation
is considered complete, returning through xen and freeing the
grant table entry, and yet something in the kernel (e.g. tcp
retransmit) can still access the data. The nature of the bug
is extensively discussed in that thread - you'll also find
a reference to a thread on linux-nfs which concludes it
isn't an nfs problem, and even some patches to fix it in the
kernel adding reference counting.

A workaround is to turn off O_DIRECT use by Xen as that ensures
the pages are copied. Xen 4.3 does this by default.

I believe fixes for this are in 4.3 and 4.2.2 if using the
qemu upstream DM. Note these aren't real fixes, just a workaround
of a kernel bug.

To fix on a local build of xen you will need something like this:
https://github.com/abligh/qemu-upstream-4.2-testing/commit/9a97c011e1a682eed9bc7195a25349eaf23ff3f9
and something like this (NB: obviously insert your own git
repo and commit numbers)
https://github.com/abligh/xen/commit/f5c344afac96ced8b980b9659fb3e81c4a0db5ca

Also note those fixes are (technically) unsafe for live migration
unless there is an ordering change made in qemu's block open
call.

Of course this might be something completely different.

-- 
Alex Bligh

  parent reply	other threads:[~2013-06-30  9:24 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-27  2:58 kernel panic in skb_copy_bits Joe Jin
2013-06-27  2:58 ` Joe Jin
2013-06-27  5:31 ` Eric Dumazet
2013-06-27  5:31   ` Eric Dumazet
2013-06-27  7:15   ` Joe Jin
2013-06-27  7:15     ` Joe Jin
2013-06-28  4:17   ` Joe Jin
2013-06-28  4:17     ` Joe Jin
2013-06-28  6:52     ` Eric Dumazet
2013-06-28  6:52       ` Eric Dumazet
2013-06-28  9:37       ` Eric Dumazet
2013-06-28  9:37       ` Eric Dumazet
2013-06-28 11:33         ` Joe Jin
2013-06-28 11:33         ` Joe Jin
2013-06-28 23:36         ` Joe Jin
2013-06-28 23:36           ` Joe Jin
2013-06-29  7:04           ` Eric Dumazet
2013-06-29  7:04           ` Eric Dumazet
2013-06-29  7:20           ` Eric Dumazet
2013-06-29  7:20           ` Eric Dumazet
2013-06-29  7:20             ` Eric Dumazet
2013-06-29 16:11             ` Ben Greear
2013-06-29 16:11             ` Ben Greear
2013-06-29 16:11               ` Ben Greear
2013-06-29 16:26               ` Eric Dumazet
2013-06-29 16:31                 ` Ben Greear
2013-06-29 16:31                 ` Ben Greear
2013-06-29 16:26               ` Eric Dumazet
2013-06-30  0:26             ` Joe Jin
2013-06-30  0:26               ` Joe Jin
2013-06-30  7:50               ` Eric Dumazet
2013-06-30  7:50               ` Eric Dumazet
2013-06-30  0:26             ` Joe Jin
2013-06-28 23:36         ` Joe Jin
2013-07-01 20:36         ` David Miller
2013-07-01 20:36         ` David Miller
2013-06-28  6:52     ` Eric Dumazet
2013-06-30  9:13     ` Alex Bligh [this message]
2013-06-30  9:13       ` Alex Bligh
2013-06-30  9:35       ` Alex Bligh
2013-06-30  9:35       ` Alex Bligh
2013-07-01  3:18       ` Joe Jin
2013-07-01  8:11         ` Ian Campbell
2013-07-01  8:11         ` Ian Campbell
2013-07-01 13:00           ` Joe Jin
2013-07-01 13:00           ` Joe Jin
2013-07-04  8:55           ` Joe Jin
2013-07-04  8:55           ` Joe Jin
2013-07-04  8:59             ` Ian Campbell
2013-07-04  8:59             ` Ian Campbell
2013-07-04  9:34               ` Eric Dumazet
2013-07-04  9:34               ` Eric Dumazet
2013-07-04  9:52                 ` Ian Campbell
2013-07-04  9:52                 ` Ian Campbell
2013-07-04 10:12                   ` Eric Dumazet
2013-07-04 10:12                   ` Eric Dumazet
2013-07-04 12:57                     ` Alex Bligh
2013-07-04 12:57                     ` Alex Bligh
2013-07-04 21:32                     ` David Miller
2013-07-04 21:32                     ` David Miller
2013-07-01  8:29         ` Alex Bligh
2013-07-01  8:29         ` Alex Bligh
2013-07-01  3:18       ` Joe Jin
2013-06-28  4:17   ` Joe Jin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6BFD5AF235F72F13CE646A0D@nimrod.local \
    --to=alex@alex.org.uk \
    --cc=Ian.Campbell@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=frank.blaschka@de.ibm.com \
    --cc=joe.jin@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xen.org \
    --cc=zheng.x.li@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.