From: Ian Campbell <Ian.Campbell@citrix.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>,
	zheng.x.li@oracle.com, Jan Beulich <JBeulich@suse.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	netdev@vger.kernel.org, Joe Jin <joe.jin@oracle.com>,
	linux-kernel@vger.kernel.org, Xen Devel <xen-devel@lists.xen.org>,
	Alex Bligh <alex@alex.org.uk>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: kernel panic in skb_copy_bits
Date: Thu, 4 Jul 2013 10:52:45 +0100
Message-ID: <1372931565.7184.32.camel__43466.6985217065$1372931693$gmane$org@kazak.uk.xensource.com>
In-Reply-To: <1372930465.4979.82.camel@edumazet-glaptop>

On Thu, 2013-07-04 at 02:34 -0700, Eric Dumazet wrote:
> On Thu, 2013-07-04 at 09:59 +0100, Ian Campbell wrote:
> > On Thu, 2013-07-04 at 16:55 +0800, Joe Jin wrote:
> > > 
> > > Another way is to add a new page flag like PG_send: when sendpage() is
> > > called, set the bit; when the page is put, clear the bit. Then
> > > xen-blkback can wait on the pagequeue.
> > 
> > These schemes don't work when you have multiple simultaneous I/Os
> > referencing the same underlying page.
> 
> So this is a page property, yet the patches I saw tried to address
> this problem by adding networking stuff (destructors) to the skbs.
> 
> Given that a page refcount can be transferred between entities, say using
> splice() system call, I do not really understand why the fix would imply
> networking only.
> 
> Let's try to fix it properly, or else we must disable zero copies
> because they are not reliable.
> 
> Why doesn't sendfile() have the problem, while vmsplice()+splice() does
> have this issue?

Might just be that no one has observed it with vmsplice()+splice()? Most
of the time this happens silently and you'll probably never notice. It's
just the behaviour of Xen which escalates the issue into one you can
see.

> As soon as a page fragment reference is taken somewhere, the only way to
> properly reuse the page is to rely on put_page() and page being freed.

Xen's out-of-tree netback used to fix this with a destructor callback on
page free, but that was a core mm patch in the hot memory free path,
which wasn't popular, and it doesn't solve anything for the non-Xen
instances of this issue.

> Adding workarounds in TCP stack to always copy the page fragments in
> case of a retransmit is a partial solution, as the remote peer could be
> malicious and send ACK _before_ page content is actually read by the
> NIC.
> 
> So if we rely on networking stacks to give the signal for page reuse, we
> can have a major security issue.

If you ignore the Xen case and consider just the native case, then the
issue isn't page reuse in the sense of the page getting mapped into
another process. It's the same page in the same process, but the process
has written something new to the buffer, e.g.
	memset(buf, 0xaa, 4096);
	write(fd, buf, 4096);
	memset(buf, 0x55, 4096);
(where fd is O_DIRECT on NFS) can result in 0x55 being seen on the wire
in the TCP retransmit.
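
The sequence above can be modelled in user space. What follows is only a
sketch, not kernel code: struct queued_segment, model_write() and the
retransmit_*() helpers are hypothetical names invented for illustration,
and the "segment" simply keeps either a reference to the caller's buffer
(the zero-copy case) or a private snapshot of it (the copying case).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 4096

/* Hypothetical model of a segment queued for (re)transmission. */
struct queued_segment {
	const unsigned char *zero_copy_ref; /* reference to caller's buffer */
	unsigned char copied[BUF_SIZE];     /* private snapshot of the data */
};

/* Model of write(): queue the data both ways so they can be compared. */
static void model_write(struct queued_segment *seg, const unsigned char *buf)
{
	assert(seg != NULL && buf != NULL);
	seg->zero_copy_ref = buf;           /* zero copy: keep a pointer only */
	memcpy(seg->copied, buf, BUF_SIZE); /* copying: snapshot the bytes now */
}

/* First byte that a retransmit would put on the wire in each model. */
static unsigned char retransmit_zero_copy(const struct queued_segment *seg)
{
	return seg->zero_copy_ref[0];
}

static unsigned char retransmit_copied(const struct queued_segment *seg)
{
	return seg->copied[0];
}

/* Replays memset(0xaa); write(); memset(0x55); then "retransmits". */
static void run_scenario(unsigned char *zc_byte, unsigned char *copy_byte)
{
	static unsigned char buf[BUF_SIZE];
	static struct queued_segment seg;

	memset(buf, 0xaa, BUF_SIZE);
	model_write(&seg, buf);      /* the write "returns" to the caller here */
	memset(buf, 0x55, BUF_SIZE); /* caller reuses its buffer */

	*zc_byte = retransmit_zero_copy(&seg);
	*copy_byte = retransmit_copied(&seg);
}
```

Calling run_scenario() yields 0x55 for the zero-copy model and 0xaa for
the copying model: the segment that only holds a reference retransmits
whatever the buffer contains at retransmit time.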

If the retransmit is at the RPC layer then you get a resend of the NFS
write RPC, but the XDR sequence stuff catches that case (I think, memory
is fuzzy).

If the retransmit is at the TCP level then the TCP sequence/ack will
cause the receiver to ignore the corrupt version, but if you replace the
second memset with write_critical_secret_key(buf), then you have an
information leak.

Ian.
