* Re: [Fwd: Re: [PATCH] Multicast packet reassembly can fail]
       [not found] ` <4AE86420.3040607@gmail.com>
@ 2009-10-28 17:05   ` Steve Chen
  2009-10-28 17:26     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Chen @ 2009-10-28 17:05 UTC
  To: Eric Dumazet; +Cc: netdev

On Wed, 2009-10-28 at 16:32 +0100, Eric Dumazet wrote:
> Steve Chen wrote:
> > Following is the e-mail I sent in reply to Rick Jones, in case you
> > haven't got it yet.  I have not posted the test code yet.  The test
> > code is a tar.gz file about 11k in size.  A lot of people will get
> > very upset if I send it to the mailing list.  If you can suggest a
> > place to upload it, I'll be happy to do so.
> > 
> > Regards,
> 
> > Here is the scenario this patch tries to address
> > 
> > <src node> ---->  <switch>  ----> <eth0 dest node>
> >                             \--->  <eth1 dest node>
> 
> If each fragment is received twice on the host, once by eth0 and once
> by eth1, should we deliver the datagram once or twice?

The application receives it once.  IIRC the duplicate packet is dropped
in the routing code.

> 
> Once should be enough, even if in the non-fragmented case it will be
> delivered twice (the kernel cannot detect duplicates; the user app
> might do so itself).

The routing code drops the duplicate packet in the non-fragmented case
as well.

> 
> 
> > 
> > For this specific case, src/dst address, protocol, IP ID, and fragment
> > offset are all identical.  The only difference is the ingress interface.
> > A good follow-up question would be: why would anyone in their right mind
> > multicast to the same destination?  Well, I don't know.  I cannot get
> > the people who reported the problem to tell me either.  Since someone
> > found the need to do this, perhaps others may find it useful too.
> > 
> 
> Then, if a 2000-byte message is fragmented into two packets, one coming
> from eth0 and one coming from eth1, I suspect your patch drops the
> message, unless eth0/eth1 are part of a bonding device...

Actually, the patch tries to prevent packet drops for this exact
scenario.  Please consider the following scenarios:
1. Fragments enter the reassembly code in the following order:
(eth0 frag1), (eth0 frag2), (eth1 frag1), (eth1 frag2)
The packets from both interfaces get reassembled and are further processed.

2. Fragments can sometimes arrive in this order (perhaps others as well):
(eth0 frag1), (eth1 frag1), (eth0 frag2), (eth1 frag2)
Without this patch, eth0 frag1/2 are overwritten by eth1 frag1/2, and the
packet from eth1 is dropped in the routing code.
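
For illustration, here is a minimal sketch of the context match involved
(simplified and hypothetical, not the actual net/ipv4/ip_fragment.c code):
without the ingress interface in the key, fragments from eth0 and eth1
select the same reassembly context, which is why order 2 goes wrong.

#include <linux/types.h>	/* __be32, __be16, u8 */

/* Hypothetical, simplified reassembly-context key. */
struct frag_key {
	__be32	saddr;		/* source address from the IP header */
	__be32	daddr;		/* destination address */
	__be16	id;		/* IP identification field */
	u8	protocol;	/* L4 protocol */
};

static int frag_match(const struct frag_key *ctx, const struct frag_key *in)
{
	/* No ifindex here: eth0 and eth1 fragments match the same context,
	 * so in order 2 the eth1 fragments look like duplicate data.
	 */
	return ctx->saddr == in->saddr &&
	       ctx->daddr == in->daddr &&
	       ctx->id == in->id &&
	       ctx->protocol == in->protocol;
}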

> 
> That would break common routing setups using two links to aggregate bandwidth?

I don't believe it would.  Bandwidth aggregation will work the same as
before.  The attributes (src/dst addr, protocol, interface, etc.) should
generate a unique hash key.  If a hash collision should happen with the
addition of iif << 5, the code still compares the original src addr along
with the interface number, so there should be no issue.
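
As a sketch of what that hashing change amounts to (simplified and
hypothetical; the real kernel hashes these fields with jhash, and the
table size here is an assumption):

#include <linux/types.h>	/* u32, u16, u8 */

#define FRAG_HASH_SIZE	64	/* assumed power-of-two bucket count */

/* Hypothetical, simplified bucket selection mixing in the ingress
 * ifindex.  A bucket collision is harmless because lookup still
 * compares the full key, including the interface.
 */
static unsigned int frag_hash(u32 saddr, u32 daddr, u16 id, u8 prot, int iif)
{
	u32 h = saddr ^ daddr ^ ((u32)id << 16 | prot);

	h ^= (u32)iif << 5;	/* the iif << 5 under discussion */
	return h & (FRAG_HASH_SIZE - 1);
}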

> Nothing in IP forces all packets of a datagram to follow a unique route.
> 
My understanding as well.

Regards,

Steve



* Re: [Fwd: Re: [PATCH] Multicast packet reassembly can fail]
  2009-10-28 17:05   ` [Fwd: Re: [PATCH] Multicast packet reassembly can fail] Steve Chen
@ 2009-10-28 17:26     ` Eric Dumazet
  2009-10-28 18:25       ` Steve Chen
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2009-10-28 17:26 UTC
  To: Steve Chen; +Cc: netdev

Steve Chen wrote:
> On Wed, 2009-10-28 at 16:32 +0100, Eric Dumazet wrote:
>> If each fragment is received twice on the host, once by eth0 and once
>> by eth1, should we deliver the datagram once or twice?
> 
> The application receives it once.  IIRC the duplicate packet is dropped
> in the routing code.
> 
>> Once should be enough, even if in the non-fragmented case it will be
>> delivered twice (the kernel cannot detect duplicates; the user app
>> might do so itself).
> 
> The routing code drops the duplicate packet in the non-fragmented case
> as well.

Really?  How so?  Receiving two copies of the same packet is legal.

> 
>>
>>> For this specific case, src/dst address, protocol, IP ID, and fragment
>>> offset are all identical.  The only difference is the ingress interface.
>>> A good follow-up question would be: why would anyone in their right mind
>>> multicast to the same destination?  Well, I don't know.  I cannot get
>>> the people who reported the problem to tell me either.  Since someone
>>> found the need to do this, perhaps others may find it useful too.
>>>
>> Then, if a 2000-byte message is fragmented into two packets, one coming
>> from eth0 and one coming from eth1, I suspect your patch drops the
>> message, unless eth0/eth1 are part of a bonding device...
> 
> Actually, the patch tries to prevent packet drops for this exact
> scenario.  Please consider the following scenarios:
> 1. Fragments enter the reassembly code in the following order:
> (eth0 frag1), (eth0 frag2), (eth1 frag1), (eth1 frag2)
> The packets from both interfaces get reassembled and are further processed.

Yes, your patch does this, so each multicast application receives two
copies of the same datagram.

> 
> 2. Fragments can sometimes arrive in this order (perhaps others as well):
> (eth0 frag1), (eth1 frag1), (eth0 frag2), (eth1 frag2)
> Without this patch, eth0 frag1/2 are overwritten by eth1 frag1/2, and the
> packet from eth1 is dropped in the routing code.

Really?  How so?  I don't see how it can happen, unless you use RPF?

The current situation should be:

(eth0 frag1) : We create a context and store frag1 in it.
(eth1 frag1) : We find this context and drop frag1, since we already have the data.
                  (Maybe the bug is here, if we cannot cope with a duplicate?)
(eth0 frag2) : We find this context, store frag2 -> complete datagram, and deliver it.
(eth1 frag2) : We find the context and drop frag2, since the datagram was completed.

               (Or maybe we create a new context that will time out later; maybe this is your problem?)

Net effect: we deliver the datagram correctly.
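
A rough sketch of that per-fragment decision (simplified and hypothetical,
not the exact ip_frag_queue() logic):

/* Hypothetical, simplified decision for an incoming fragment. */
enum frag_action { FRAG_STORE, FRAG_DROP_DUP, FRAG_DROP_DONE };

static enum frag_action frag_decide(int datagram_complete,
				    int range_already_covered)
{
	if (datagram_complete)
		return FRAG_DROP_DONE;	/* step 4: datagram already delivered */
	if (range_already_covered)
		return FRAG_DROP_DUP;	/* step 2: duplicate of stored data */
	return FRAG_STORE;		/* steps 1 and 3: new data, keep it */
}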


> 
>> That would break common routing setups using two links to aggregate bandwidth?
> 
> I don't believe it would.  Bandwidth aggregation will work the same as
> before.  The attributes (src/dst addr, protocol, interface, etc.) should
> generate a unique hash key.  If a hash collision should happen with the
> addition of iif << 5, the code still compares the original src addr along
> with the interface number, so there should be no issue.

What about the obvious:

(eth0 frag1), (eth1 frag2)

Your patch creates two contexts since the hashes are different; both will
time out and no packet is delivered at all.



* Re: [Fwd: Re: [PATCH] Multicast packet reassembly can fail]
  2009-10-28 17:26     ` Eric Dumazet
@ 2009-10-28 18:25       ` Steve Chen
  2009-10-28 20:22         ` David Stevens
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Chen @ 2009-10-28 18:25 UTC
  To: Eric Dumazet; +Cc: mhuth, netdev

On Wed, 2009-10-28 at 18:26 +0100, Eric Dumazet wrote:
> Steve Chen wrote:
> > On Wed, 2009-10-28 at 16:32 +0100, Eric Dumazet wrote:
> >> If each fragment is received twice on the host, once by eth0 and once
> >> by eth1, should we deliver the datagram once or twice?
> > 
> > The application receives it once.  IIRC the duplicate packet is dropped
> > in the routing code.
> > 
> >> Once should be enough, even if in the non-fragmented case it will be
> >> delivered twice (the kernel cannot detect duplicates; the user app
> >> might do so itself).
> > 
> > The routing code drops the duplicate packet in the non-fragmented case
> > as well.
> 
> Really?  How so?  Receiving two copies of the same packet is legal.

I will have to double-check exactly where the packet drop happens.  I
thought it was somewhere in routing, but it could be in netfilter.

> 
> > 
> >>
> >>> For this specific case, src/dst address, protocol, IP ID, and fragment
> >>> offset are all identical.  The only difference is the ingress interface.
> >>> A good follow-up question would be: why would anyone in their right mind
> >>> multicast to the same destination?  Well, I don't know.  I cannot get
> >>> the people who reported the problem to tell me either.  Since someone
> >>> found the need to do this, perhaps others may find it useful too.
> >>>
> >> Then, if a 2000-byte message is fragmented into two packets, one coming
> >> from eth0 and one coming from eth1, I suspect your patch drops the
> >> message, unless eth0/eth1 are part of a bonding device...
> > 
> > Actually, the patch tries to prevent packet drops for this exact
> > scenario.  Please consider the following scenarios:
> > 1. Fragments enter the reassembly code in the following order:
> > (eth0 frag1), (eth0 frag2), (eth1 frag1), (eth1 frag2)
> > The packets from both interfaces get reassembled and are further processed.
> 
> Yes, your patch does this, so each multicast application receives two
> copies of the same datagram.
> 
> > 
> > 2. Fragments can sometimes arrive in this order (perhaps others as well):
> > (eth0 frag1), (eth1 frag1), (eth0 frag2), (eth1 frag2)
> > Without this patch, eth0 frag1/2 are overwritten by eth1 frag1/2, and the
> > packet from eth1 is dropped in the routing code.
> 
> Really?  How so?  I don't see how it can happen, unless you use RPF?
> 
> The current situation should be:
> 
> (eth0 frag1) : We create a context and store frag1 in it.
> (eth1 frag1) : We find this context and drop frag1, since we already have the data.
>                   (Maybe the bug is here, if we cannot cope with a duplicate?)
> (eth0 frag2) : We find this context, store frag2 -> complete datagram, and deliver it.
> (eth1 frag2) : We find the context and drop frag2, since the datagram was completed.

Yes, this is exactly what is happening in the current code.

> 
>                (Or maybe we create a new context that will time out later; maybe this is your problem?)
> 
> Net effect: we deliver the datagram correctly.
> 
> 
> > 
> >> That would break common routing setups using two links to aggregate bandwidth?
> > 
> > I don't believe it would.  Bandwidth aggregation will work the same as
> > before.  The attributes (src/dst addr, protocol, interface, etc.) should
> > generate a unique hash key.  If a hash collision should happen with the
> > addition of iif << 5, the code still compares the original src addr along
> > with the interface number, so there should be no issue.
> 
> What about the obvious:
> 
> (eth0 frag1), (eth1 frag2)
> 
> Your patch creates two contexts since the hashes are different; both will
> time out and no packet is delivered at all.
> 
I see the point you are making.  I assumed, probably incorrectly, that
since eth0 and eth1 have different IP addresses, I would get a complete
series of fragments for each interface.  Perhaps I should really be
looking up the stack to see why packets were dropped.  Please correct me
if I'm mistaken: the normal behavior is that the application should
receive either 2 (scenario 1) or 1 (scenario 2) packets.

Regards,

Steve



* Re: [Fwd: Re: [PATCH] Multicast packet reassembly can fail]
  2009-10-28 18:25       ` Steve Chen
@ 2009-10-28 20:22         ` David Stevens
  2009-10-28 21:11           ` Steve Chen
  0 siblings, 1 reply; 5+ messages in thread
From: David Stevens @ 2009-10-28 20:22 UTC
  To: Steve Chen; +Cc: Eric Dumazet, mhuth, netdev, netdev-owner

netdev-owner@vger.kernel.org wrote on 10/28/2009 11:25:39 AM:

> I see the point you are making.  I assumed, probably incorrectly, that
> since eth0 and eth1 have different IP addresses, I would get a complete
> series of fragments for each interface.  Perhaps I should really be
> looking up the stack to see why packets were dropped.  Please correct me
> if I'm mistaken: the normal behavior is that the application should
> receive either 2 (scenario 1) or 1 (scenario 2) packets.

Steve,
        If you didn't join the group on both interfaces, you won't receive
two copies in the first place; the unjoined NIC won't deliver anything
up the stack that isn't in its multicast address filter.
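
(For reference, joining the group on each interface explicitly would look
roughly like this; the group address and interface names are only examples:)

#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Join the same multicast group on one specific interface (Linux,
 * using struct ip_mreqn so the interface can be named by index).
 */
static int join_group_on(int fd, const char *group, const char *ifname)
{
	struct ip_mreqn mreq;

	memset(&mreq, 0, sizeof(mreq));
	inet_pton(AF_INET, group, &mreq.imr_multiaddr);
	mreq.imr_ifindex = if_nametoindex(ifname);
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			  &mreq, sizeof(mreq));
}

/* e.g.:
 *	join_group_on(fd, "239.1.2.3", "eth0");
 *	join_group_on(fd, "239.1.2.3", "eth1");
 */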

 +-DLS



* Re: [Fwd: Re: [PATCH] Multicast packet reassembly can fail]
  2009-10-28 20:22         ` David Stevens
@ 2009-10-28 21:11           ` Steve Chen
  0 siblings, 0 replies; 5+ messages in thread
From: Steve Chen @ 2009-10-28 21:11 UTC
  To: David Stevens; +Cc: Eric Dumazet, mhuth, netdev, netdev-owner

On Wed, 2009-10-28 at 13:22 -0700, David Stevens wrote:
> netdev-owner@vger.kernel.org wrote on 10/28/2009 11:25:39 AM:
> 
> > I see the point you are making.  I assumed, probably incorrectly, that
> > since eth0 and eth1 have different IP addresses, I would get a complete
> > series of fragments for each interface.  Perhaps I should really be
> > looking up the stack to see why packets were dropped.  Please correct me
> > if I'm mistaken: the normal behavior is that the application should
> > receive either 2 (scenario 1) or 1 (scenario 2) packets.
> 
> Steve,
>         If you didn't join the group on both interfaces, you won't receive
> two copies in the first place; the unjoined NIC won't deliver anything
> up the stack that isn't in its multicast address filter.
> 
>  +-DLS

Thanks for the input.  I'll revisit the issue.

Steve


