linux-kernel.vger.kernel.org archive mirror
From: Michal Kubecek <mkubecek@suse.cz>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Oskolkov <posk@google.com>,
	Gustavo Figueira <gfigueira@suse.com>
Subject: Re: [PATCH net] net: ipv4: do not handle duplicate fragments as overlapping
Date: Thu, 13 Dec 2018 12:27:48 +0100
Message-ID: <20181213112748.GF21324@unicorn.suse.cz>
In-Reply-To: <cb96bca9-1dda-b243-b581-91f1b51f1517@gmail.com>

On Wed, Dec 12, 2018 at 10:20:42PM -0800, Eric Dumazet wrote:
> On 12/12/2018 06:28 PM, Michal Kubecek wrote:
> > Since commit 7969e5c40dfd ("ip: discard IPv4 datagrams with overlapping
> > segments.") IPv4 reassembly code drops the whole queue whenever an
> > overlapping fragment is received. However, the test is written in a way
> > which detects duplicate fragments as overlapping so that in environments
> > with many duplicate packets, fragmented packets may be undeliverable.
> > 
> > Add an extra test and, for a (potentially) duplicate fragment, drop only
> > the new fragment rather than the whole queue. Only the starting offset and
> > length are checked, not the contents of the fragments, as that would be too
> > expensive. Check for a duplicate of the last (tail) fragment first, as in
> > real-life scenarios this should be the most frequent case and we would
> > otherwise have to iterate through the whole "run".
> > 
> > Fixes: 7969e5c40dfd ("ip: discard IPv4 datagrams with overlapping segments.")
> > Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> > ---
> >  net/ipv4/ip_fragment.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> > index aa0b22697998..f09e3683b209 100644
> > --- a/net/ipv4/ip_fragment.c
> > +++ b/net/ipv4/ip_fragment.c
> > @@ -436,6 +436,10 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
> >  			ip4_frag_append_to_last_run(&qp->q, skb);
> >  		else
> >  			ip4_frag_create_run(&qp->q, skb);
> > +	} else if (offset == prev_tail->ip_defrag_offset &&
> > +		   skb->len == prev_tail->len) {
> > +		/* potential duplicate of last fragment */
> > +		goto err;
> 
> What value is in @err variable at this point ?
> 
> Are you sure callers expect to receive -EINVAL ?

That's what they get if one of the earlier sanity checks fails so I
thought it would be the safest bet as that's something they certainly
can get already.

I tracked down the callers and almost all of them eventually ignore the
return value and only care if it's zero or not. The only exception was
one path in openvswitch where the value would be propagated to the
doit() genetlink handler.

> >  	} else {
> >  		/* Binary search. Note that skb can become the first fragment,
> >  		 * but not the last (covered above).
> > @@ -449,8 +453,16 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
> >  			else if (offset >= skb1->ip_defrag_offset +
> >  						FRAG_CB(skb1)->frag_run_len)
> >  				rbn = &parent->rb_right;
> > -			else /* Found an overlap with skb1. */
> > +			else {
> > +				/* check for potential duplicate */
> > +				while (skb1 && skb1->ip_defrag_offset < offset)
> > +					skb1 = FRAG_CB(skb1)->next_frag;
> > +				if (skb1 && offset == skb1->ip_defrag_offset &&
> > +				    skb->len == skb1->len)
> > +					goto err;
> 
> Maybe we should not care, if the node in the rbtree contains the range of this
> incoming fragment, do not worry about finding if it is overlap or not ?
> 
> I am nervous about adding back a linear scan.

After rethinking it again, I agree. Unlike in the IPv6 case, we don't
have an RFC strictly requiring us to drop the whole queue and the
requirement from RFC 5722 (for IPv6) seems to be motivated by the risk
of later fragments rewriting header fields. That cannot happen if we
drop the later fragment (which doesn't bring any new data anyway).

And for a FragmentSmack-type attack, a thorough check would in fact help
the attacker, as you pointed out.

I'll send a v2.
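
A minimal sketch of the containment idea discussed above, written as
standalone C rather than kernel code. It is only one reading of the
suggestion, not the v2 patch: classify_fragment(), enum frag_verdict and
the half-open [start, end) ranges are hypothetical names and conventions
chosen for illustration.

```c
#include <assert.h>

/* Possible outcomes when an incoming fragment meets one existing run. */
enum frag_verdict {
	FRAG_GO_LEFT,		/* entirely before the run: descend left */
	FRAG_GO_RIGHT,		/* entirely after the run: descend right */
	FRAG_DROP_FRAGMENT,	/* fully inside the run: brings no new data */
	FRAG_DROP_QUEUE,	/* partial overlap: discard the whole queue */
};

/*
 * Hypothetical helper, not kernel code. Ranges are half-open:
 * the run covers [run_start, run_end), the fragment [offset, end).
 */
static enum frag_verdict classify_fragment(int run_start, int run_end,
					   int offset, int end)
{
	assert(run_start < run_end && offset < end);

	if (end <= run_start)
		return FRAG_GO_LEFT;
	if (offset >= run_end)
		return FRAG_GO_RIGHT;
	if (offset >= run_start && end <= run_end)
		return FRAG_DROP_FRAGMENT;
	return FRAG_DROP_QUEUE;
}
```

Each verdict needs only a fixed number of comparisons against the run's
bounds, so no walk through the fragments inside a run is required. An
exact duplicate is simply one case of a fully contained fragment.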

> > +				/* Found an overlap */
> >  				goto overlap;
> > +			}
> >  		} while (*rbn);
> >  		/* Here we have parent properly set, and rbn pointing to
> >  		 * one of its NULL left/right children. Insert skb.

Thread overview: 4+ messages
2018-12-13  2:28 [PATCH net] net: ipv4: do not handle duplicate fragments as overlapping Michal Kubecek
2018-12-13  5:42 ` David Miller
2018-12-13  6:20 ` Eric Dumazet
2018-12-13 11:27   ` Michal Kubecek [this message]