All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Christian Lamparter <chunkeey@googlemail.com>
Cc: netdev@vger.kernel.org, Eric Dumazet <eric.dumazet@gmail.com>,
	Alan Curry <rlwinm@sdf.org>,
	alexmcwhirter@triadic.us, David Miller <davem@davemloft.net>
Subject: Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)
Date: Sun, 12 Feb 2017 05:42:18 +0000	[thread overview]
Message-ID: <20170212054209.GQ13195@ZenIV.linux.org.uk> (raw)
In-Reply-To: <6129494.xvkFqTsVzW@debian64>

On Sat, Feb 11, 2017 at 08:37:06PM +0100, Christian Lamparter wrote:

> I think if you follow through with this argument. You have the problem of:
> How to handle EFAULT from skb_copy_datagram_* (and all it's "wrappers")?
> 
> Because on one hand, the iovec could be partially bad. I remember that 
> the application could do the following shenanigans during recvmsg: 
>  - mprotect() could've flipped page read-only and back to read-write.
>  - Or truncate() could've shortened the mmapped file,
>  - etc.
> 
> In this case the error should be propagated back to the userspace.
> 
> But OTOH, it could just be a temporary failure (*) and restoring the
> iovec and trying again is needed.

No.  You can't _rely_ upon -EFAULT being repeated, but it's not something
you would expect to retry your way out of.

The sane semantics is
	* fail read/recvmsg (with EFAULT) if it's a datagram socket
	* fail if it's a stream socket and nothing has been read by
that point
	* a short read if something has been already read.

> Is this a correct/complete assessment of the problem at hand? Or did
> I make a mistake / wrong assumption in there?

> I'm looking at:
> <http://lxr.free-electrons.com/source/net/ipv4/tcp_input.c#L4668>
> <http://lxr.free-electrons.com/source/net/ipv4/tcp_input.c#L5232>
> <http://lxr.free-electrons.com/source/net/ipv4/tcp_input.c#L5465>
> 
> >From what I can see, the tcp functions tcp_data_queue(),
> tcp_copy_to_iovec() and tcp_rcv_established() would need to be
> extended to handle EFAULT. Because if the iovec is restored
> and the application did something bad (mprotect(), truncate(),
>  ...), this code would sort of loop?

tcp_v4_do_rcv() has every right to copy nothing whatsoever - it's a fastpath
and when e.g. it's called in context of another thread or when skb isn't the
next fragment expected it won't bother with tcp_copy_to_iovec() at all.
Failure to copy anything in there is just fine, as long as you don't end
up buggering tp->ucopy state (in particular, tp->ucopy.msg->msg_iter).

> If this is the case: How many retries do we want, before we can
> say it is a permament failure (and abort)?

We don't want any.  What happens is that this path won't copy anything and
when that skb gets to
                        err = skb_copy_datagram_msg(skb, offset, msg, used);
                        if (err) {
                                /* Exception. Bailout! */
                                if (!copied)
                                        copied = -EFAULT;
                                break;
                        }
in tcp_recvmsg() we'll get our short read.

Again, the trouble is not with tcp_v4_do_rcv() failing to copy something -
it's failing to copy and ending up with iov_iter advanced that might be
a problem.  E.g. tp->ucopy.len getting out of sync with tp->ucopy.msg->msg_iter,
etc.

Short read on fault is fine.  So's full copy if somebody had been flipping
memprotect() and slow path ends up catching the moment when the buffer is
writable.  Both outcomes are fine.  Having the same memprotect() flipping
leave ->msg_iter more than one would expect by tp->ucopy.len and everything
back with copy_to_user working again, OTOH, might confuse tcp_input.c.

  reply	other threads:[~2017-02-12  5:43 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-24  3:35 PROBLEM: network data corruption (bisected to e5a4b0bb803b) Alan Curry
2016-07-24 17:45 ` Christian Lamparter
2016-07-24 17:45   ` Christian Lamparter
2016-07-24 19:02   ` Al Viro
2016-07-24 19:02     ` Al Viro
2016-07-26  4:57     ` Alan Curry
2016-07-26 13:59       ` Christian Lamparter
2016-07-26 18:15         ` alexmcwhirter
2016-07-27  6:39           ` Kalle Valo
2016-07-27  1:14         ` Alan Curry
2016-07-27 10:32     ` Alan Curry
2016-07-27 18:04       ` alexmcwhirter
2016-07-27 23:02         ` alexmcwhirter
2016-07-27 23:45           ` David Miller
2016-07-28  0:31             ` Al Viro
2016-07-28  0:26               ` alexmcwhirter
2016-07-28  1:22                 ` Al Viro
2016-07-28  1:22                   ` Al Viro
2016-08-03  3:49                   ` Alan Curry
2016-08-03 12:43                     ` Christian Lamparter
2016-08-03 23:25                       ` Alan Curry
     [not found]                     ` <20160803054118.GG2356@ZenIV.linux.org.uk>
     [not found]                       ` <2363167.YiBS7sFNO2@debian64>
     [not found]                         ` <20160809145836.GQ2356@ZenIV.linux.org.uk>
     [not found]                           ` <20170210081126.GA14157@ZenIV.linux.org.uk>
2017-02-10 21:45                             ` Al Viro
2017-02-11 19:37                               ` Christian Lamparter
2017-02-12  5:42                                 ` Al Viro [this message]
2017-02-13 21:56                                   ` Christian Lamparter
2017-02-14  1:33                                     ` [PATCH][CFT] Saner error handling in skb_copy_datagram_iter() et.al. (was Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)) Al Viro
2017-02-17 15:54                                       ` [PATCH][CFT] Saner error handling in skb_copy_datagram_iter() et.al David Miller
2017-02-17 17:03                                         ` Al Viro
2017-02-18  0:02                                           ` Al Viro
2017-02-18  2:24                                             ` Al Viro
2017-02-19 19:19                                             ` Christian Lamparter
2017-02-20 15:14                                             ` David Miller
2017-02-21 13:25                                             ` David Laight
2016-07-26  4:32   ` PROBLEM: network data corruption (bisected to e5a4b0bb803b) Alan Curry
2016-07-26  4:38   ` alexmcwhirter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170212054209.GQ13195@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=alexmcwhirter@triadic.us \
    --cc=chunkeey@googlemail.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=rlwinm@sdf.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.