* PROBLEM: DoS Attack on Fragment Cache
       [not found] <02917697-4CE2-4BBE-BF47-31F58BC89025@hxcore.ol>
@ 2021-04-16 23:09 ` Keyu Man
  2021-04-17  0:31 ` David Ahern
  1 sibling, 0 replies; 12+ messages in thread
From: Keyu Man @ 2021-04-16 23:09 UTC (permalink / raw)
  To: davem, yoshfuji, dsahern, Jakub Kicinski
  Cc: netdev, linux-kernel, Zhiyun Qian

Hi,

    My name is Keyu Man. We are a group of researchers from the
University of California, Riverside. Zhiyun Qian is my advisor. We found
that the code that processes IPv4/IPv6 fragments can potentially lead to
DoS attacks. Specifically, after the latest kernel receives an IPv4
fragment, it tries to fit it into a queue by calling the function
    struct inet_frag_queue *inet_frag_find(struct fqdir *fqdir, void
*key) in net/ipv4/inet_fragment.c.

    However, this function first checks whether the fragment memory
already in use exceeds fqdir->high_thresh. If it does, the fragment is
dropped regardless of whether it belongs to a new queue or an existing
queue.
    An attacker can therefore fill the cache with fragments that will
never be reassembled (i.e., send only the first fragment, with a new
IPID every time) to exceed the threshold, so that all subsequent
incoming fragmented IPv4 traffic is blocked and dropped. Since there is
no GC mechanism, the victim host has to wait 30s for the fragments to
expire before it can receive incoming fragments normally again.
    In practice, given the 4MB fragment cache, the attacker only needs
to send 1766 fragments to exhaust the cache and DoS the victim for
30s, which is a very low cost. Besides, IPv6 is also affected since
the issue resides in the inet part.
    This issue was introduced in commit
648700f76b03b7e8149d13cc2bdb3355035258a9 (inet: frags: use rhashtables
for reassembly units), which removed fqdir->low_thresh as well as the
GC worker. We kindly request that the GC worker be brought back to the
kernel to prevent these DoS attacks.
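
The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel code: the class `ToyFragCache`, its `receive` method, and the per-fragment memory charge `FRAG_COST` are hypothetical names and values chosen for illustration. The only property modelled is that the memory ceiling is checked before the queue lookup, so a fragment for an existing queue is refused as well once the ceiling is exceeded.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFragCache:
    """Toy reassembly cache with a single memory ceiling (not kernel code)."""
    high_thresh: int                            # byte budget, in the spirit of fqdir->high_thresh
    mem: int = 0                                # bytes currently charged to the cache
    queues: dict = field(default_factory=dict)  # queue key -> bytes held

    def receive(self, key, size):
        """Return True if the fragment is accepted, False if it is dropped."""
        # The ceiling is checked before any lookup, so fragments that
        # belong to an existing queue are refused too.
        if self.mem > self.high_thresh:
            return False
        self.queues[key] = self.queues.get(key, 0) + size
        self.mem += size
        return True


FRAG_COST = 2376  # hypothetical memory charge per queued fragment, in bytes

cache = ToyFragCache(high_thresh=4 * 1024 * 1024)
cache.receive("legit", FRAG_COST)  # first fragment of a legitimate datagram

# The attacker sends first fragments with a fresh key each time until one is refused.
sent = 0
while cache.receive(("attack", sent), FRAG_COST):
    sent += 1

# The second fragment of the legitimate datagram now arrives.
legit_second_fragment_accepted = cache.receive("legit", FRAG_COST)
print(sent, legit_second_fragment_accepted)
```

In this model the legitimate datagram cannot complete until memory is released, which in the report's scenario only happens when the queued fragments time out.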

    Looking forward to hearing from you.

    Thanks,

Keyu Man



* Re: PROBLEM: DoS Attack on Fragment Cache
       [not found] <02917697-4CE2-4BBE-BF47-31F58BC89025@hxcore.ol>
  2021-04-16 23:09 ` PROBLEM: DoS Attack on Fragment Cache Keyu Man
@ 2021-04-17  0:31 ` David Ahern
  2021-04-17  4:44   ` Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: David Ahern @ 2021-04-17  0:31 UTC (permalink / raw)
  To: Keyu Man, davem, yoshfuji, dsahern, Jakub Kicinski, Eric Dumazet
  Cc: netdev, linux-kernel, Zhiyun Qian

[ cc author of 648700f76b03b7e8149d13cc2bdb3355035258a9 ]


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-17  0:31 ` David Ahern
@ 2021-04-17  4:44   ` Eric Dumazet
  2021-04-17  7:27     ` Willy Tarreau
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2021-04-17  4:44 UTC (permalink / raw)
  To: David Ahern, Florian Westphal
  Cc: Keyu Man, davem, yoshfuji, dsahern, Jakub Kicinski, netdev,
	linux-kernel, Zhiyun Qian

On Sat, Apr 17, 2021 at 2:31 AM David Ahern <dsahern@gmail.com> wrote:
>
> [ cc author of 648700f76b03b7e8149d13cc2bdb3355035258a9 ]



I think this has been discussed already. There is no strategy that
makes IP reassembly units immune to DDOS attacks.

We added rb-tree and sysctls to let admins choose to use GB of RAM if
they really care.




* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-17  4:44   ` Eric Dumazet
@ 2021-04-17  7:27     ` Willy Tarreau
       [not found]       ` <CAMqUL6bkp2Dy3AMFZeNLjE1f-sAwnuBWpXH_FSYTSh8=Ac3RKg@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Willy Tarreau @ 2021-04-17  7:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Ahern, Florian Westphal, Keyu Man, davem, yoshfuji,
	dsahern, Jakub Kicinski, netdev, linux-kernel, Zhiyun Qian

On Sat, Apr 17, 2021 at 06:44:40AM +0200, Eric Dumazet wrote:
> On Sat, Apr 17, 2021 at 2:31 AM David Ahern <dsahern@gmail.com> wrote:
> >
> > [ cc author of 648700f76b03b7e8149d13cc2bdb3355035258a9 ]
> 
> I think this has been discussed already. There is no strategy that
> makes IP reassembly units immune to DDOS attacks.

Having tried to deal with this in the past as well, I agree with
this conclusion, which is also another good example of why fragments
should really be avoided as much as possible over hostile networks.

However, I also found that randomly dropping previous entries is the
approach that seems to give legitimate traffic the best statistical
chance of still working under attack (albeit really poorly,
considering that any lost fragment requires retransmission of the
whole series). In this case the chance for a packet to be successfully
reassembled would vary in proportion to the inverse of its number of
fragments, which reasonably limits the impact of attacks (without being
an ultimate solution, of course).

> We added rb-tree and sysctls to let admins choose to use GB of RAM if
> they really care.

I agree that for those who care, the real solution is to make sure they
can store all the traffic they receive during a reassembly period.
Legitimate traffic mostly reassembles quickly so keeping 1 second of
traffic at 10 Gbps is only 1.25 GB of RAM after all...
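
The 1.25 GB figure follows from converting the link rate to bytes and multiplying by the holding time. A minimal sketch of that arithmetic, where the helper name `reassembly_buffer_bytes` is hypothetical:

```python
def reassembly_buffer_bytes(link_bits_per_s, hold_seconds):
    """Bytes needed to hold everything a link can deliver in hold_seconds."""
    return link_bits_per_s * hold_seconds / 8


# 10 Gbps of traffic kept for 1 second, as in the example above.
needed = reassembly_buffer_bytes(10e9, 1.0)
print(f"{needed / 1e9:.2f} GB")
```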

Willy


* Re: PROBLEM: DoS Attack on Fragment Cache
       [not found]       ` <CAMqUL6bkp2Dy3AMFZeNLjE1f-sAwnuBWpXH_FSYTSh8=Ac3RKg@mail.gmail.com>
@ 2021-04-17  7:50         ` Willy Tarreau
  2021-04-18  1:30           ` Matt Corallo
  0 siblings, 1 reply; 12+ messages in thread
From: Willy Tarreau @ 2021-04-17  7:50 UTC (permalink / raw)
  To: Keyu Man
  Cc: Eric Dumazet, David Ahern, Florian Westphal, davem, yoshfuji,
	dsahern, Jakub Kicinski, netdev, linux-kernel, Zhiyun Qian

On Sat, Apr 17, 2021 at 12:42:39AM -0700, Keyu Man wrote:
> How about at least allowing the existing queues to finish? Currently a tiny
> new fragment would potentially invalidate all previous fragments by letting
> them time out without allowing the remaining fragments to come in and finish
> the assembly.

Because this is exactly the principle of how attacks are built: reserve
resources claiming that you'll send everything so that others can't make
use of the resources that are reserved to you. The best solution is
precisely *not* to wait for anyone to finish, hence *not* to reserve
valuable resources that are unusable by others.

Willy


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-17  7:50         ` Willy Tarreau
@ 2021-04-18  1:30           ` Matt Corallo
  2021-04-18  1:38             ` Keyu Man
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Corallo @ 2021-04-18  1:30 UTC (permalink / raw)
  To: Willy Tarreau, Keyu Man
  Cc: Eric Dumazet, David Ahern, Florian Westphal, davem, yoshfuji,
	dsahern, Jakub Kicinski, netdev, linux-kernel, Zhiyun Qian

See also "[PATCH] Reduce IP_FRAG_TIME fragment-reassembly timeout to 1s, from 30s" (and the two resends of it): given
the size of the default cache (4MB) and the time that it takes before we flush the cache (30 seconds), you only need
about 1Mbps of fragments to hit this issue. While DoS attacks are concerning, it's also incredibly practical to hit
this issue (and I do) in normal, non-adversarial conditions.

Matt


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-18  1:30           ` Matt Corallo
@ 2021-04-18  1:38             ` Keyu Man
  2021-04-18  2:26               ` Matt Corallo
  0 siblings, 1 reply; 12+ messages in thread
From: Keyu Man @ 2021-04-18  1:38 UTC (permalink / raw)
  To: Matt Corallo
  Cc: Willy Tarreau, Eric Dumazet, David Ahern, Florian Westphal,
	davem, yoshfuji, dsahern, Jakub Kicinski, netdev, linux-kernel,
	Zhiyun Qian

Willy's words make sense to me and I agree that the existing fragments
should be evicted when the new one comes in and the cache is full.
Though the attacker can still leverage this to flush the victim's
cache, as mentioned previously, since fragments are likely to be
assembled in a very short time, it would be hard to launch the
attack (evicting the legit fragment before it's assembled requires a
large packet sending rate). And this seems better than the existing
solution (drop all incoming fragments when full).

Keyu


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-18  1:38             ` Keyu Man
@ 2021-04-18  2:26               ` Matt Corallo
  2021-04-18  4:39                 ` Willy Tarreau
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Corallo @ 2021-04-18  2:26 UTC (permalink / raw)
  To: Keyu Man
  Cc: Willy Tarreau, Eric Dumazet, David Ahern, Florian Westphal,
	davem, yoshfuji, dsahern, Jakub Kicinski, netdev, linux-kernel,
	Zhiyun Qian

Sure, there are better ways to handle the reassembly cache overflowing, but that is pretty unrelated to the fact that 
waiting 30 full seconds for a fragment to come in doesn't really make sense in today's networks (the 30 second delay 
that is used today appears to even be higher than RFC 791 suggested in 1981!). You get a lot more bang for your buck if 
you don't wait around so long (or we could restructure things to kick out the oldest fragments, but that is a lot more 
work, and probably extra indexes that just aren't worth it).

Matt


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-18  2:26               ` Matt Corallo
@ 2021-04-18  4:39                 ` Willy Tarreau
  2021-04-18 14:31                   ` Matt Corallo
  0 siblings, 1 reply; 12+ messages in thread
From: Willy Tarreau @ 2021-04-18  4:39 UTC (permalink / raw)
  To: Matt Corallo
  Cc: Keyu Man, Eric Dumazet, David Ahern, Florian Westphal, davem,
	yoshfuji, dsahern, Jakub Kicinski, netdev, linux-kernel,
	Zhiyun Qian

On Sat, Apr 17, 2021 at 10:26:30PM -0400, Matt Corallo wrote:
> Sure, there are better ways to handle the reassembly cache overflowing, but
> that is pretty unrelated to the fact that waiting 30 full seconds for a
> fragment to come in doesn't really make sense in today's networks (the 30
> second delay that is used today appears to even be higher than RFC 791
> suggested in 1981!).

Not exactly actually, because you forget the TTL here. With most hosts
sending an initial TTL around 64, after crossing 10-15 hops it's still
around 50 so that would result in ~50 seconds by default, even according
to the 40-year-old RFC 791. The 15s there was the absolute minimum. While
I do agree that we shouldn't keep them that long nowadays, we can't go
too low without risking to break some slow transmission stacks (SLIP/PPP
over modems for example). In addition even cutting that in 3 will remain
trivially DoSable.

> You get a lot more bang for your buck if you don't wait
> around so long (or we could restructure things to kick out the oldest
> fragments, but that is a lot more work, and probably extra indexes that just
> aren't worth it).

Kicking out the oldest ones is a bad approach in a system made only of
independent elements, because it tends to result in a lot of damage once
all of them behave similarly. I.e. if you need to kick out an old entry
in valid traffic, it's because you do need to wait that long, and if all
datagrams need to wait that long, then new datagrams systematically
prevent the oldest one from being reassembled, and none gets reassembled.
With a random approach at least your success ratio converges towards 1/e
(i.e. about 37%), which is better.
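
One way to read the 1/e figure is sketched below. The model is an assumption, not something spelled out in this thread: the cache is full with n entries, every new arrival evicts exactly one entry, and a given entry has to survive n further arrivals before it can complete. The function names are hypothetical.

```python
import math


def random_eviction_survival(slots, arrivals):
    """Probability that one entry survives `arrivals` uniformly random evictions."""
    # Each arrival evicts this particular entry with probability 1/slots.
    return (1.0 - 1.0 / slots) ** arrivals


def oldest_first_survives(slots, arrivals):
    """In a full cache with oldest-first eviction, a new entry lasts slots - 1 arrivals."""
    return arrivals <= slots - 1


for n in (10, 100, 1000, 10000):
    p = random_eviction_survival(n, n)
    print(f"slots={n:6d}  random={p:.4f}  oldest_first={oldest_first_survives(n, n)}")

print(f"1/e = {math.exp(-1):.4f}")
```

Under these assumptions the random policy lets a bit over a third of such entries through as n grows, while the oldest-first policy evicts every one of them just before it would complete.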

Willy


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-18  4:39                 ` Willy Tarreau
@ 2021-04-18 14:31                   ` Matt Corallo
  2021-04-19  9:43                     ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Corallo @ 2021-04-18 14:31 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Keyu Man, Eric Dumazet, David Ahern, Florian Westphal, davem,
	yoshfuji, dsahern, Jakub Kicinski, netdev, linux-kernel,
	Zhiyun Qian

Should the default, though, be so low? If someone is still using an old modem they can crank up the sysctl; it does seem
like such things are pretty rare these days :). It's rather trivial to hit 1Mbps of lost fragments in today's networks
without any kind of attack, at which point all fragments are dropped. After all, I submitted the patch to "scratch my
own itch" :).

Matt

On 4/18/21 00:39, Willy Tarreau wrote:
> I do agree that we shouldn't keep them that long nowadays, we can't go
> too low without risking to break some slow transmission stacks (SLIP/PPP
> over modems for example).


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-18 14:31                   ` Matt Corallo
@ 2021-04-19  9:43                     ` Eric Dumazet
  2021-04-19 17:20                       ` Matt Corallo
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2021-04-19  9:43 UTC (permalink / raw)
  To: Matt Corallo
  Cc: Willy Tarreau, Keyu Man, David Ahern, Florian Westphal,
	David Miller, Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski,
	netdev, LKML, Zhiyun Qian

On Sun, Apr 18, 2021 at 4:31 PM Matt Corallo
<netdev-list@mattcorallo.com> wrote:
>
> Should the default, though, be so low? If someone is still using a old modem they can crank up the sysctl, it does seem
> like such things are pretty rare these days :). Its rather trivial to, without any kind of attack, hit 1Mbps of lost
> fragments in today's networks, at which point all fragments are dropped. After all, I submitted the patch to "scratch my
> own itch" :).

Again, even if you increase the values by 1000x, it is trivial for an
attacker to use all the memory you allowed.

And allowing a significant portion of memory to be eaten like that
might cause OOM on hosts where jobs are consuming all physical memory.

It is a sysctl, I changed things so that one could really reserve/use
16GB of memory if she/he is desperate about frags.

>
> Matt
>
> On 4/18/21 00:39, Willy Tarreau wrote:
> > I do agree that we shouldn't keep them that long nowadays, we can't go
> > too low without risking to break some slow transmission stacks (SLIP/PPP
> > over modems for example).


* Re: PROBLEM: DoS Attack on Fragment Cache
  2021-04-19  9:43                     ` Eric Dumazet
@ 2021-04-19 17:20                       ` Matt Corallo
  0 siblings, 0 replies; 12+ messages in thread
From: Matt Corallo @ 2021-04-19 17:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Willy Tarreau, Keyu Man, David Ahern, Florian Westphal,
	David Miller, Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski,
	netdev, LKML, Zhiyun Qian

Note that there are two completely separate sysctls here: the timeout on fragments, and the amount of memory available
for fragment reassembly. You have to combine them (memory divided by timeout) to reach the "Mbps of lost or
deliberately-lost fragments before we start dropping all future fragments". See the calculation in the description of
the patch I mentioned above for exact details, but turning the time down to 1s already gives you 32Mbps, and you can
tune the memory usage separately (e.g. 128MB, really 256MB between v4 and v6, would give you 1Gbps of "lost" fragments).
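
The figures quoted in this thread can be reproduced with a short calculation: the sustained rate of never-completed fragments needed to keep the cache full is the cache size divided by the timeout. This is a rough sketch that assumes binary units (MiB, Mibit/s) and ignores per-fragment bookkeeping overhead; the function name `fill_rate_mibit_per_s` is hypothetical. The two knobs are presumably the `net.ipv4.ipfrag_high_thresh` and `net.ipv4.ipfrag_time` sysctls.

```python
MIB = 1024 * 1024


def fill_rate_mibit_per_s(cache_bytes, timeout_s):
    """Rate of lost fragments (Mibit/s) that keeps a cache of this size full."""
    return cache_bytes * 8 / timeout_s / MIB


scenarios = [
    ("4 MiB cache, 30 s timeout", 4 * MIB, 30),
    ("4 MiB cache, 1 s timeout", 4 * MIB, 1),
    ("128 MiB cache, 1 s timeout", 128 * MIB, 1),
]

for label, cache_bytes, timeout_s in scenarios:
    rate = fill_rate_mibit_per_s(cache_bytes, timeout_s)
    print(f"{label:28s} {rate:8.1f} Mibit/s")
```

The three rows correspond to the roughly 1Mbps, 32Mbps and 1Gbps figures mentioned in the thread.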

It's true, an attacker can use a lot of memory in that case, but 128MiB isn't actually something that rises to the level
of "trivial for an attacker to use all the memory you allowed" or "cause OOM".

I only chimed in on this thread to note that this isn't just a theoretical attack concern, however: this is a
real-world, non-attack-scenario issue that's pretty trivial to hit. Just losing 1Mbps of traffic on a modern residential
internet connection is pretty doable; make that flow mostly frags and suddenly your VPN drops out for 30 seconds at a
time just because.

I agree with others here that actually solving the DoS issue isn't trivial, but making it less absurdly trivial to have 
30 second dropouts of your VPN connection would also be a nice change.

Matt

