* Skipping past TCP lost packet in userspace
@ 2011-05-31  1:19 Josh Lehan
  2011-05-31  3:30 ` Marcus D. Leech
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Josh Lehan @ 2011-05-31  1:19 UTC (permalink / raw)
  To: netdev

Hello.  I looked, but could not find an answer.  Is there already an
ioctl() or something like that in Linux, that would allow a userspace
TCP socket to skip past a lost packet?

The kernel already will continue to queue up packets, and with TCP SACK,
the kernel can acknowledge reception of further packets beyond the lost
packet, allowing the queue to continue growing.  However, all these
queued packets won't be delivered to userspace until the original lost
packet is received again, after it has been retransmitted.
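
The stall described above can be illustrated with a small user-space model.
This is only a sketch, not kernel code: the ReceiveQueue class and its method
names are made up for illustration, and it assumes segments never partially
overlap.

```python
class ReceiveQueue:
    """Toy model of a TCP receiver: the application only sees in-order bytes."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq   # next byte the application is waiting for
        self.out_of_order = {}        # seq -> payload, held until contiguous
        self.readable = bytearray()   # in-order bytes ready for the application

    def segment_arrived(self, seq, payload):
        if seq + len(payload) <= self.next_seq:
            return                    # old duplicate, nothing new in it
        self.out_of_order[seq] = payload
        # Move every segment that has become contiguous to the readable queue.
        while self.next_seq in self.out_of_order:
            data = self.out_of_order.pop(self.next_seq)
            self.readable += data
            self.next_seq += len(data)

    def recv(self):
        data, self.readable = bytes(self.readable), bytearray()
        return data


q = ReceiveQueue()
q.segment_arrived(0, b"AAAA")
q.segment_arrived(8, b"CCCC")      # the segment at seq 4 was lost
first = q.recv()                   # b"AAAA"
stalled = q.recv()                 # b"" even though CCCC is already queued
q.segment_arrived(4, b"BBBB")      # the retransmission finally arrives
rest = q.recv()                    # b"BBBBCCCC"
```

The segment at sequence 8 sits in memory the whole time; it becomes readable
only once the hole at sequence 4 is filled.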

Is there a way for a userspace program to prevent this needless stall?
It would be great if there was an ioctl() or similar call, that would
tell the kernel that it's OK to leave a gap in the data stream, and
resume supplying userspace with more data.  An obvious application would
be media streaming, and many high-level media protocols do their own
block framing anyway, so resynchronization after the data gap would not
be a problem.
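
As a rough sketch of what resynchronization after a gap could look like, here
is a made-up framing format: a two-byte sync marker, a two-byte length, then
the payload.  The marker value and function names are assumptions for
illustration and are not taken from any real media protocol.  The sketch also
assumes payloads never contain the marker; a real format would need a checksum
or escaping to rule out false matches.

```python
import struct

SYNC = b"\xa5\x5a"   # hypothetical sync marker


def pack_frame(payload):
    """Frame layout: SYNC, 16-bit big-endian length, payload."""
    return SYNC + struct.pack("!H", len(payload)) + payload


def frames_after_gap(data):
    """Return the complete frames in data, which may start mid-frame."""
    frames = []
    pos = data.find(SYNC)             # skip the partial frame after the gap
    while pos != -1:
        header_end = pos + len(SYNC) + 2
        if header_end > len(data):
            break                     # header itself is incomplete
        (length,) = struct.unpack("!H", data[pos + len(SYNC):header_end])
        end = header_end + length
        if end > len(data):
            break                     # payload is incomplete
        frames.append(data[header_end:end])
        pos = data.find(SYNC, end)
    return frames


stream = pack_frame(b"one") + pack_frame(b"two") + pack_frame(b"three")
# Pretend the first 5 bytes were lost, so the data starts inside frame "one".
recovered = frames_after_gap(stream[5:])   # [b"two", b"three"]
```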

This sounds like something that would be a FAQ, and if so, please point
me to the answer.  Thank you!

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-05-31  1:19 Skipping past TCP lost packet in userspace Josh Lehan
@ 2011-05-31  3:30 ` Marcus D. Leech
  2011-05-31  4:12   ` Josh Lehan
  2011-05-31  4:05 ` Mikael Abrahamsson
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Marcus D. Leech @ 2011-05-31  3:30 UTC (permalink / raw)
  To: Josh Lehan; +Cc: netdev

>
> Hello.  I looked, but could not find an answer.  Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?
>
> The kernel already will continue to queue up packets, and with TCP SACK,
> the kernel can acknowledge reception of further packets beyond the lost
> packet, allowing the queue to continue growing.  However, all these
> queued packets won't be delivered to userspace until the original lost
> packet is received again, after it has been retransmitted.
>
> Is there a way for a userspace program to prevent this needless stall?
> It would be great if there was an ioctl() or similar call, that would
> tell the kernel that it's OK to leave a gap in the data stream, and
> resume supplying userspace with more data.  An obvious application would
> be media streaming, and many high-level media protocols do their own
> block framing anyway, so resynchronization after the data gap would not
> be a problem.
>
> This sounds like something that would be a FAQ, and if so, please point
> me to the answer.  Thank you!
>
>   
This sounds like you want UDP, not TCP.

Unless I'm misunderstanding what you want, you want a protocol that has
a different "contract" than TCP.  Doing what you want basically requires
breaking TCP.  That isn't going to happen.




-- 
Principal Investigator
Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org




* Re: Skipping past TCP lost packet in userspace
  2011-05-31  1:19 Skipping past TCP lost packet in userspace Josh Lehan
  2011-05-31  3:30 ` Marcus D. Leech
@ 2011-05-31  4:05 ` Mikael Abrahamsson
  2011-05-31 11:12 ` Neil Horman
  2011-05-31 17:23 ` Yuchung Cheng
  3 siblings, 0 replies; 17+ messages in thread
From: Mikael Abrahamsson @ 2011-05-31  4:05 UTC (permalink / raw)
  To: Josh Lehan; +Cc: netdev

On Mon, 30 May 2011, Josh Lehan wrote:

> Hello.  I looked, but could not find an answer.  Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?

The basic operation of TCP is that it delivers a character stream to the 
application. That requires TCP to make sure everything is in order and 
nothing is missing.

If you want to do something else, you have to choose another protocol.

> Is there a way for a userspace program to prevent this needless stall?

That's like saying your body has all that needless blood.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Skipping past TCP lost packet in userspace
  2011-05-31  3:30 ` Marcus D. Leech
@ 2011-05-31  4:12   ` Josh Lehan
  0 siblings, 0 replies; 17+ messages in thread
From: Josh Lehan @ 2011-05-31  4:12 UTC (permalink / raw)
  To: Marcus D. Leech; +Cc: Josh Lehan, netdev

On 05/30/2011 08:30 PM, Marcus D. Leech wrote:
> This sounds like you want UDP, not TCP.
> 
> Unless I'm misunderstanding what you want, you want a protocol that has
> a different "contract"
>   than TCP.  Doing what you want basically requires breaking TCP.  That
> isn't going to happen.

Thanks.  This wouldn't break the TCP protocol on the wire, though.
Instead, it would merely provide a way for a userspace application to
"peek" at the arrived data that's behind the missing packet.  There's
already an ioctl() to peek at unread data, but it considers the missing
packet to be a barrier, and will not allow the application to see beyond it.
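
For reference, the existing ways to look at unread data on Linux are recv()
with MSG_PEEK and the FIONREAD ioctl (SIOCINQ for sockets).  Both only cover
data that is already in order.  The helper names below are made up; the
sketch uses a local socketpair() instead of a real TCP connection so that it
runs without a network.

```python
import array
import fcntl
import socket
import termios


def peek(sock, max_bytes=4096):
    """Return queued in-order bytes without removing them from the queue."""
    return sock.recv(max_bytes, socket.MSG_PEEK)


def bytes_queued(sock):
    """Ask the kernel how many in-order bytes are waiting to be read."""
    buf = array.array("i", [0])
    fcntl.ioctl(sock.fileno(), termios.FIONREAD, buf)
    return buf[0]


sender, receiver = socket.socketpair()
sender.sendall(b"hello")
peeked = peek(receiver)          # data is still queued after this call
queued = bytes_queued(receiver)
consumed = receiver.recv(16)     # a normal read sees the same bytes
sender.close()
receiver.close()
```

Neither call exposes anything that is sitting in the out-of-order queue behind
a hole, which is the limitation being discussed.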

The reason for TCP is for maximum compatibility with firewalls, proxies,
and all the other annoyances of the modern commercialized Internet.
Using UDP would indeed solve this problem, but defeat the point of being
compatible.  Using other exotic protocols such as SCTP or DCCP is a
nonstarter.

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-05-31  1:19 Skipping past TCP lost packet in userspace Josh Lehan
  2011-05-31  3:30 ` Marcus D. Leech
  2011-05-31  4:05 ` Mikael Abrahamsson
@ 2011-05-31 11:12 ` Neil Horman
  2011-05-31 17:23 ` Yuchung Cheng
  3 siblings, 0 replies; 17+ messages in thread
From: Neil Horman @ 2011-05-31 11:12 UTC (permalink / raw)
  To: Josh Lehan; +Cc: netdev

On Mon, May 30, 2011 at 06:19:20PM -0700, Josh Lehan wrote:
> Hello.  I looked, but could not find an answer.  Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?
> 
> The kernel already will continue to queue up packets, and with TCP SACK,
> the kernel can acknowledge reception of further packets beyond the lost
> packet, allowing the queue to continue growing.  However, all these
> queued packets won't be delivered to userspace until the original lost
> packet is received again, after it has been retransmitted.
> 
> Is there a way for a userspace program to prevent this needless stall?
> It would be great if there was an ioctl() or similar call, that would
> tell the kernel that it's OK to leave a gap in the data stream, and
> resume supplying userspace with more data.  An obvious application would
> be media streaming, and many high-level media protocols do their own
> block framing anyway, so resynchronization after the data gap would not
> be a problem.
> 
> This sounds like something that would be a FAQ, and if so, please point
> me to the answer.  Thank you!
> 
No, TCP doesn't and won't do that, by design.

If you want to allow frames to come in to an application as they arrive at the
system regardless of prior loss, use UDP.

If you still want ordering and reliability, look at SCTP.
Neil



* Re: Skipping past TCP lost packet in userspace
  2011-05-31  1:19 Skipping past TCP lost packet in userspace Josh Lehan
                   ` (2 preceding siblings ...)
  2011-05-31 11:12 ` Neil Horman
@ 2011-05-31 17:23 ` Yuchung Cheng
  2011-06-01  8:10   ` Josh Lehan
  3 siblings, 1 reply; 17+ messages in thread
From: Yuchung Cheng @ 2011-05-31 17:23 UTC (permalink / raw)
  To: Josh Lehan; +Cc: netdev, jiyengar

On Mon, May 30, 2011 at 6:19 PM, Josh Lehan <linux@krellan.com> wrote:
>
> Hello.  I looked, but could not find an answer.  Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?
>
> The kernel already will continue to queue up packets, and with TCP SACK,
> the kernel can acknowledge reception of further packets beyond the lost
> packet, allowing the queue to continue growing.  However, all these
> queued packets won't be delivered to userspace until the original lost
> packet is received again, after it has been retransmitted.
>
> Is there a way for a userspace program to prevent this needless stall?

This paper may have a solution to your problem
"Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
Transports"
http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf

> It would be great if there was an ioctl() or similar call, that would
> tell the kernel that it's OK to leave a gap in the data stream, and
> resume supplying userspace with more data.  An obvious application would
> be media streaming, and many high-level media protocols do their own
> block framing anyway, so resynchronization after the data gap would not
> be a problem.
>
> This sounds like something that would be a FAQ, and if so, please point
> me to the answer.  Thank you!
>


* Re: Skipping past TCP lost packet in userspace
  2011-05-31 17:23 ` Yuchung Cheng
@ 2011-06-01  8:10   ` Josh Lehan
  2011-06-01 16:57     ` Bill Sommerfeld
                       ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Josh Lehan @ 2011-06-01  8:10 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Josh Lehan, netdev, jiyengar

On 05/31/2011 10:23 AM, Yuchung Cheng wrote:
> This paper may have a solution to your problem
> "Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
> Transports"
> http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf

Nice, thanks for pointing me to this.  I appreciate the helpful answer,
instead of just saying "use UDP" or "use SCTP".  That's not the point.

For better or for worse, TCP is realistically the only viable protocol
for streaming to the largest possible audience these days, hence my
question about adding this feature to the Linux TCP implementation.

As for the Linux receiver, I was thinking of 2 features:

1) Assuming the application has already read all data that would be
possible without blocking, perform something like an ioctl() to peek at
the data, and peek to see if there is any out-of-order data behind a gap.

2) If the out-of-order data is enough to be useful to the application,
another ioctl() could be done, to ask the kernel to jump over the gap
and deliver the data to the application.  At this point the missing data
would be considered lost.  The data behind the gap would then be
delivered to the application as part of its normal reading stream.  The
TCP sequence numbers would advance, as usual, just as if the missing
data had been there all along.

This will need some more thought in order to work in real life, such as
needing to be blocked on (select(), poll(), etc.) without having to
busywait the "peek" ioctl.  Also, it would be nice to have an ioctl() to
reclaim the missing data that was skipped over, should that data happen
to successfully arrive late, so that the application could still save it
(or whatever) instead of it having to be discarded.
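
To make the two proposed features (plus the late-data reclaim) concrete, here
is a toy user-space model.  Nothing in it is an existing kernel interface: the
class and the peek_past_gap(), skip_gap() and reclaim_late() names are
hypothetical, and segments are assumed never to overlap partially.

```python
class GapSkippingReceiver:
    """Toy model of the proposed receive-side calls (not a real kernel API)."""

    def __init__(self):
        self.next_seq = 0        # next byte the application will be given
        self.out_of_order = {}   # seq -> payload not yet delivered
        self.skipped = []        # (start, end) ranges the application gave up on
        self.late = {}           # skipped data that arrived afterwards

    def segment_arrived(self, seq, payload):
        if seq >= self.next_seq:
            self.out_of_order[seq] = payload
        elif any(start <= seq < end for start, end in self.skipped):
            self.late[seq] = payload     # keep for reclaim_late()
        # otherwise: duplicate of data already delivered, drop it

    def recv(self):
        out = bytearray()
        while self.next_seq in self.out_of_order:
            data = self.out_of_order.pop(self.next_seq)
            out += data
            self.next_seq += len(data)
        return bytes(out)

    def peek_past_gap(self):
        """Feature 1: return (gap_size, contiguous_bytes_behind_the_gap)."""
        if not self.out_of_order:
            return (0, 0)
        seq = min(self.out_of_order)
        gap = seq - self.next_seq
        available = 0
        while seq in self.out_of_order:
            length = len(self.out_of_order[seq])
            available += length
            seq += length
        return (gap, available)

    def skip_gap(self):
        """Feature 2: treat the missing bytes as lost and move past them."""
        gap, _ = self.peek_past_gap()
        if gap:
            self.skipped.append((self.next_seq, self.next_seq + gap))
            self.next_seq += gap
        return gap

    def reclaim_late(self):
        """Hand back skipped data that showed up after the skip."""
        late, self.late = self.late, {}
        return late
```

A media player might call peek_past_gap() after recv() returns nothing, decide
the data behind the hole is worth having, call skip_gap(), and later use
reclaim_late() to fill its rewind buffer.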

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-06-01  8:10   ` Josh Lehan
@ 2011-06-01 16:57     ` Bill Sommerfeld
  2011-06-01 17:35     ` Rick Jones
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Bill Sommerfeld @ 2011-06-01 16:57 UTC (permalink / raw)
  To: Josh Lehan; +Cc: Yuchung Cheng, netdev, jiyengar

On Wed, Jun 1, 2011 at 01:10, Josh Lehan <linux@krellan.com> wrote:
> 2) If the out-of-order data is enough to be useful to the application,
> another ioctl() could be done, to ask the kernel to jump over the gap
> and deliver the data to the application.

this sounds like a different way to spell lseek(fd, gapsize, SEEK_CUR)

of course, allowing a subset of lseek for sockets could very well mess
with any code which (ab)uses lseek to guess the type of a random
descriptor.
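
For context, lseek() on a socket currently fails with ESPIPE, and that failure
is what such descriptor-guessing code relies on.  A minimal sketch of the
check (the is_seekable() helper name is made up):

```python
import errno
import os
import socket
import tempfile


def is_seekable(fd):
    """Guess whether fd is a seekable file by trying a no-op lseek()."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:   # sockets, pipes and FIFOs
            return False
        raise


left, right = socket.socketpair()
socket_seekable = is_seekable(left.fileno())      # False
with tempfile.TemporaryFile() as tmp:
    file_seekable = is_seekable(tmp.fileno())     # True
left.close()
right.close()
```

If sockets started accepting even a subset of lseek(), a check like this
would misclassify them.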


* Re: Skipping past TCP lost packet in userspace
  2011-06-01  8:10   ` Josh Lehan
  2011-06-01 16:57     ` Bill Sommerfeld
@ 2011-06-01 17:35     ` Rick Jones
  2011-06-24 14:58       ` Janardhan Iyengar
  2011-06-01 19:36     ` juice
  2011-06-03 11:51     ` Ilpo Järvinen
  3 siblings, 1 reply; 17+ messages in thread
From: Rick Jones @ 2011-06-01 17:35 UTC (permalink / raw)
  To: Josh Lehan; +Cc: Yuchung Cheng, netdev, jiyengar

On Wed, 2011-06-01 at 01:10 -0700, Josh Lehan wrote:
> On 05/31/2011 10:23 AM, Yuchung Cheng wrote:
> > This paper may have a solution to your problem
> > "Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
> > Transports"
> > http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf
> 
> Nice, thanks for pointing me to this.  I appreciate the helpful answer,
> instead of just saying "use UDP" or "use SCTP".  That's not the point.
> 
> For better or for worse, TCP is realistically the only viable protocol
> for streaming to the largest possible audience these days, hence my
> question about adding this feature to the Linux TCP implementation.

Isn't that treating the symptoms of problems at layers 8 and 9 (*) with
kludges (perhaps hacks if one is feeling charitable) at the user
interface to layer 4?  Just how many more little bits can we add to the
great pile before the aroma is overpowering?  Or to abuse another
metaphor, is there really any camel's back left here?

And while Linux has had some slightly non-trivial, non-portable
enhancements to its interface to a TCP endpoint (TCP_CORK is something
that comes to mind) I don't think any of them have been anywhere nearly
as large a change to a fundamental semantic of a TCP connection as what
you propose.
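
For comparison, TCP_CORK is an ordinary per-socket option that only affects
how the sender batches data; it does not change what the receiver is allowed
to see.  A minimal sketch of toggling it (Linux only; the set_cork() helper is
a made-up name):

```python
import socket


def set_cork(sock, enabled):
    """Hold back partial frames while corked; uncorking flushes them."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1 if enabled else 0)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_cork(s, True)
corked = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CORK)
set_cork(s, False)
uncorked = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CORK)
s.close()
```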

rick jones

*
http://www.isc.org/store/logoware-clothing/isc-9-layer-osi-model-cotton-t-shirt




* Re: Skipping past TCP lost packet in userspace
  2011-06-01  8:10   ` Josh Lehan
  2011-06-01 16:57     ` Bill Sommerfeld
  2011-06-01 17:35     ` Rick Jones
@ 2011-06-01 19:36     ` juice
  2011-06-03 11:51     ` Ilpo Järvinen
  3 siblings, 0 replies; 17+ messages in thread
From: juice @ 2011-06-01 19:36 UTC (permalink / raw)
  To: Josh Lehan, Yuchung Cheng, Josh Lehan, netdev, jiyengar


> Nice, thanks for pointing me to this.  I appreciate the helpful answer,
> instead of just saying "use UDP" or "use SCTP".  That's not the point.
>
> For better or for worse, TCP is realistically the only viable protocol
> for streaming to the largest possible audience these days, hence my
> question about adding this feature to the Linux TCP implementation.
>

For better or for worse, I think the problem with your proposal is that it
just will not be portable, even if you implemented it in the Linux stack.
I am quite sure you will not be able to bend other operating systems to
accept this kind of kludge in the TCP stack, so the benefits are minimal.

The correct way to address this problem is to make sure that end-to-end
connectivity of all the needed protocols is maintained.




* Re: Skipping past TCP lost packet in userspace
  2011-06-01  8:10   ` Josh Lehan
                       ` (2 preceding siblings ...)
  2011-06-01 19:36     ` juice
@ 2011-06-03 11:51     ` Ilpo Järvinen
  2011-06-06  6:30       ` Josh Lehan
  3 siblings, 1 reply; 17+ messages in thread
From: Ilpo Järvinen @ 2011-06-03 11:51 UTC (permalink / raw)
  To: Josh Lehan; +Cc: netdev


On Wed, 1 Jun 2011, Josh Lehan wrote:

> On 05/31/2011 10:23 AM, Yuchung Cheng wrote:
> > This paper may have a solution to your problem
> > "Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
> > Transports"
> > http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf
> 
> Nice, thanks for pointing me to this.  I appreciate the helpful answer,
> instead of just saying "use UDP" or "use SCTP".  That's not the point.
> 
> For better or for worse, TCP is realistically the only viable protocol
> for streaming to the largest possible audience these days, hence my
> question about adding this feature to the Linux TCP implementation.
> 
> As for the Linux receiver, I was thinking of 2 features:
> 
> 1) Assuming the application has already read all data that would be
> possible without blocking, perform something like an ioctl() to peek at
> the data, and peek to see if there is any out-of-order data behind a gap.
> 
> 2) If the out-of-order data is enough to be useful to the application,
> another ioctl() could be done, to ask the kernel to jump over the gap
> and deliver the data to the application.  At this point the missing data
> would be considered lost.  The data behind the gap would then be
> delivered to the application as part of its normal reading stream.  The
> TCP sequence numbers would advance, as usual, just as if the missing
> data had been there all along.

And you'd send a cumulative ACK without the actual data segment...? 
...That's gonna break many middleboxes which would want to see that 
data segment too ...And there goes your "viability" (though with luck it 
will _sometimes_ work as rexmit of the data segment is already in flight). 

...The fact is that such a change to TCP wire behavior is no longer TCP 
enough to work reliably.

In addition, such a non-legitimate cumulative ACK probably violates a number 
of TCP RFCs, or at least assumptions made in them... e.g., for starters, 
please explain which timestamp you would put into that 
particular cumulative ACK?


-- 
 i.


* Re: Skipping past TCP lost packet in userspace
  2011-06-03 11:51     ` Ilpo Järvinen
@ 2011-06-06  6:30       ` Josh Lehan
  0 siblings, 0 replies; 17+ messages in thread
From: Josh Lehan @ 2011-06-06  6:30 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Josh Lehan, netdev

On 06/03/2011 04:51 AM, Ilpo Järvinen wrote:
> And you'd send a cumulative ACK without the actual data segment...? 
> ...That's gonna break many middleboxes which would want to see that 
> data segment too ...And there goes your "viability" (though with luck it 
> will _sometimes_ work as rexmit of the data segment is already in flight). 

No, there would be no wire-visible change.  This idea was explored at
first, and then rejected.  As you mentioned, this would break many
middleboxes.  It would rightfully be considered an "optimistic ACK attack".

The late data segment would have to eventually arrive.  It would either
be dropped, if the userspace application had already skipped beyond that
point, or better yet, it could be re-inserted into the data stream (if
too late for live playback, then it could at least be saved into the
rewind buffer, or saved to disk if the user is doing that).
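
A small sketch of why nothing changes on the wire: the ACK fields depend only
on what TCP itself has received, not on how far the application has read or
skipped.  The ack_fields() function is a made-up simplification; real stacks
limit the number of SACK blocks and report the most recently changed block
first.

```python
def ack_fields(rcv_nxt, out_of_order):
    """Return (cumulative_ack, sack_blocks) for a toy receiver.

    rcv_nxt is the first byte TCP is still missing; out_of_order maps
    sequence number -> payload length for segments held behind the hole.
    """
    blocks = []
    for seq in sorted(out_of_order):
        end = seq + out_of_order[seq]
        if blocks and seq <= blocks[-1][1]:
            # Adjacent or overlapping with the previous block: merge them.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((seq, end))
    return rcv_nxt, blocks


# Bytes 0..3 received, 4..7 missing, 8..15 and 20..23 queued behind the hole.
ack, sack = ack_fields(4, {8: 4, 12: 4, 20: 4})
# ack == 4, sack == [(8, 16), (20, 24)]
```

Even if the application skips ahead to byte 8, the cumulative ACK stays at 4
until the retransmitted segment actually arrives, so the sender and any
middlebox see ordinary TCP behavior.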

> In addition, such a non-legitimate cumulative ACK probably violates a number 
> of TCP RFCs, or at least assumptions made in them... e.g., for starters, 
> please explain which timestamp you would be putting there into that 
> particular cumulative ACK?

It wouldn't change anything on the wire.  As you mentioned, timestamps
remain a good defense for guarding against optimistic ACK attacks.

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-06-01 17:35     ` Rick Jones
@ 2011-06-24 14:58       ` Janardhan Iyengar
  2011-06-30  8:38         ` Josh Lehan
  0 siblings, 1 reply; 17+ messages in thread
From: Janardhan Iyengar @ 2011-06-24 14:58 UTC (permalink / raw)
  To: rick.jones2; +Cc: Josh Lehan, Yuchung Cheng, netdev, Bryan Ford

[Reviving this thread... apologies for dropping it.]

Rick,

Thanks for your note.  I agree that it does seem like we're simply adding to the metaphorical pile.  And my first knee-jerk response would be that there's not much else one can do in the modern IPv4 Internet :-)

That said, I'd like to point out that when you say that problems at layer 4 need to be fixed, there are two kinds of changes that can be considered -- the layer 4 protocol, and the API it offers to apps.  A change to the API, which is what we're proposing, is not a modification to the transport layer protocol per se.  In other words, we are changing the service that TCP offers to apps, and not the protocol.

The non-portability that you point out in the second part of your note, while a completely legitimate point, is again an API issue.  While this API is non-portable because it doesn't exist in other OSes yet, that is a matter of time (hopefully), and we're hoping to start the process with Linux.  And, it is easy enough for an app to fail over to using simple TCP behavior where the sockopt is not supported.
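
A sketch of the fail-over described above: try to enable the new behavior,
and keep using plain TCP if the kernel rejects the option.  The option number
below is invented for illustration; no such sockopt exists in Linux, so on any
current kernel the call is expected to fail and the function returns False.

```python
import socket

# Made-up option number standing in for the proposed sockopt (an assumption).
HYPOTHETICAL_TCP_SKIP_GAPS = 0x7FFF


def enable_gap_skipping(sock, optname=HYPOTHETICAL_TCP_SKIP_GAPS):
    """Return True if the option was accepted, False to use plain TCP."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, optname, 1)
        return True
    except OSError:
        # Unknown option (typically ENOPROTOOPT): fall back to normal reads.
        return False


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
supported = enable_gap_skipping(s)
s.close()
```

The application then branches on the result once at connection setup and
otherwise shares the same read loop.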

You also point out that we could perhaps fix the shortcomings of TCP by actively building and deploying alternative transports -- we've tried that and failed exactly because we've missed the point that the narrow waist of the Internet hourglass is no longer just IP, but includes TCP and UDP as well (and sometimes just TCP.)  For the entire time I've been working with SCTP (since 2001), we've been working on and trying to get SCTP through middleboxes, but as it stands, we still only have pockets of success.  Try using SCTP in the wild; your packets are quite likely to get black-holed within your home/ISP network.  The problem with middleboxes is that the IETF missed the boat on providing standards for these devices, and there are very few (if any) points of pressure that can be applied to change these devices in the network.  As a result, getting any new non-network-compatible transport deployed over the network remains untenable.

Our goal is to be able to provide new network services while remaining compatible with the network.  As we see it, that is the only option that remains if we are to consider any new transport services beyond TCP's straitjacketed one.  As it turns out, our work shows that we _can_ offer more services using the same TCP protocol, which is a win-win, since the new services remain network-compatible.

Note:  I'm talking largely about the v4 Internet.  The v6 Internet will hopefully have fewer devices that interpose on the transport layer, esp. NAPTs;  however, I expect fully that firewalls and PEPs will still use transport layer information, requiring them to be able to read/understand transport header information.

- jana

On 6/1/11 1:35 PM, Rick Jones wrote:
> On Wed, 2011-06-01 at 01:10 -0700, Josh Lehan wrote:
>> On 05/31/2011 10:23 AM, Yuchung Cheng wrote:
>>> This paper may have a solution to your problem
>>> "Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
>>> Transports"
>>> http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf
>>
>> Nice, thanks for pointing me to this.  I appreciate the helpful answer,
>> instead of just saying "use UDP" or "use SCTP".  That's not the point.
>>
>> For better or for worse, TCP is realistically the only viable protocol
>> for streaming to the largest possible audience these days, hence my
>> question about adding this feature to the Linux TCP implementation.
>
> Isn't that treating the symptoms of problems at layers 8 and 9 (*) with
> kludges (perhaps hacks if one is feeling charitable) at the user
> interface to layer 4?  Just how many more little bits can we add to the
> great pile before the aroma is overpowering?  Or to abuse another
> metaphor, is there really any camel's back left here?
>
> And while Linux has had some slightly non-trivial, non-portable
> enhancements to its interface to a TCP endpoint (TCP_CORK is something
> that comes to mind) I don't think any of them have been anywhere nearly
> as large a change to a fundamental semantic of a TCP connection as what
> you propose.
>
> rick jones
>
> *
> http://www.isc.org/store/logoware-clothing/isc-9-layer-osi-model-cotton-t-shirt
>
>

-- 
Janardhan Iyengar
Assistant Professor, Computer Science
Franklin & Marshall College
http://www.fandm.edu/jiyengar


* Re: Skipping past TCP lost packet in userspace
  2011-06-24 14:58       ` Janardhan Iyengar
@ 2011-06-30  8:38         ` Josh Lehan
  2011-06-30 14:36           ` Neil Horman
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Lehan @ 2011-06-30  8:38 UTC (permalink / raw)
  To: janardhan.iyengar
  Cc: Janardhan Iyengar, rick.jones2, Josh Lehan, Yuchung Cheng,
	netdev, Bryan Ford

On 06/24/2011 07:58 AM, Janardhan Iyengar wrote:
> Thanks for your note.  I agree that it does seem like we're simply
> adding to the metaphorical pile.  And my first knee-jerk response would
> be that there's not much else one can do in the modern IPv4 Internet :-)

Thanks, I also appreciate you reviving this thread.  I was surprised at
the hostility here, towards an idea that we both think is necessary and
practical, given the realities of today's Internet.

TCP is at the middle of the hourglass, as you said.  Even UDP isn't
universally allowed (it's not all that uncommon to see UDP blocked,
except for DNS packets to whitelisted DNS servers).  At least one ISP,
"AT&T U-Verse", no longer allows the customer their choice of Internet
router, and the ISP's mandated router will filter all traffic in both
directions, so if the packet isn't recognized by its simple little
stateful firewall, into the bit bucket it goes.  Have fun trying to pass
SCTP or DCCP through that!

> Changes to the API, which is what we're proposing, is not a modification
> to the transport layer protocol per se.  In other words, we are changing
> the service that TCP offers to apps, and not the protocol.

Agreed, and the freedom of Linux to do this is what makes it great.  API
compatibility with other OS's is not an issue, since as you said the app
can always fall back to classical TCP behavior, and since nothing on the
wire changes, it won't break the other OS on the other side of the wire.

> Note:  I'm talking largely about the v4 Internet.  The v6 Internet will
> hopefully have fewer devices that interpose on the transport layer, esp.
> NAPTs;  however, I expect fully that firewalls and PEPs will still use
> transport layer information, requiring them to be able to
> read/understand transport header information.

Like IPv4, most (all?) IPv6 firewalls are stateful, so the firewall has
to be aware of the transport protocol in order to know which packets to
allow back through as replies.  And, for servers behind the firewall,
the firewall must offer a way to punch a hole through it without opening
too wide of a hole, and keep state for incoming connections, so
transport protocol awareness is important there as well.
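
A toy model of why a stateful firewall has to understand the transport
protocol: it can only match replies against flows it was able to parse on the
way out.  The class, its methods and the addresses (taken from the
documentation ranges) are illustrative assumptions, not a description of any
real product.

```python
class StatefulFirewall:
    """Tracks outbound flows and admits only matching inbound replies."""

    def __init__(self, known_protocols=("tcp", "udp")):
        self.known = set(known_protocols)
        self.flows = set()

    def outbound(self, proto, src, sport, dst, dport):
        if proto not in self.known:
            return False              # cannot parse ports, so drop it
        self.flows.add((proto, src, sport, dst, dport))
        return True

    def inbound(self, proto, src, sport, dst, dport):
        # A reply is the recorded outbound tuple with the endpoints swapped.
        return (proto, dst, dport, src, sport) in self.flows


fw = StatefulFirewall()
sent = fw.outbound("tcp", "192.0.2.10", 40000, "203.0.113.5", 80)
reply = fw.inbound("tcp", "203.0.113.5", 80, "192.0.2.10", 40000)
unsolicited = fw.inbound("tcp", "203.0.113.9", 80, "192.0.2.10", 40000)
sctp_sent = fw.outbound("sctp", "192.0.2.10", 40000, "203.0.113.5", 80)
```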

So, even though we're vastly outnumbered on this mailing list, I remain
interested in your "Minion" paper and its ideas for providing a richer
API to make TCP more versatile to suit a wide variety of needs.

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-06-30  8:38         ` Josh Lehan
@ 2011-06-30 14:36           ` Neil Horman
  2011-07-01  8:39             ` Josh Lehan
  0 siblings, 1 reply; 17+ messages in thread
From: Neil Horman @ 2011-06-30 14:36 UTC (permalink / raw)
  To: Josh Lehan
  Cc: janardhan.iyengar, Janardhan Iyengar, rick.jones2, Yuchung Cheng,
	netdev, Bryan Ford

On Thu, Jun 30, 2011 at 01:38:12AM -0700, Josh Lehan wrote:
> On 06/24/2011 07:58 AM, Janardhan Iyengar wrote:
> > Thanks for your note.  I agree that it does seem like we're simply
> > adding to the metaphorical pile.  And my first knee-jerk response would
> > be that there's not much else one can do in the modern IPv4 Internet :-)
> 
> Thanks, I also appreciate you reviving this thread.  I was surprised at
> the hostility here, towards an idea that we both think is necessary and
> practical, given the realities of today's Internet.
> 
> TCP is at the middle of the hourglass, as you said.  Even UDP isn't
> universally allowed (it's not all that uncommon to see UDP blocked,
> except for DNS packets to whitelisted DNS servers).  At least one ISP,
> "AT&T U-Verse", no longer allows the customer their choice of Internet
> router, and the ISP's mandated router will filter all traffic in both
> directions, so if the packet isn't recognized by its simple little
> stateful firewall, into the bit bucket it goes.  Have fun trying to pass
> SCTP or DCCP through that!
> 
I'll leave the rest of this alone, since it's pretty obvious that no one is going
to break TCP for you, but just so that you're aware, the only reason you have to
use the 2-Wire gateway that AT&T provides is because there are no commercially
available routers that support the uplink interface (which I expect will change
eventually).  In the meantime, if you want to use a different router, place
the RG in bridge mode by selecting a host as your DMZ device.  That will assign
the WAN address to that connected device via DHCP and allow you to pass whatever
traffic you want through it.  I use it to pass SCTP and IPv6 traffic all the
time, works great.
Neil



* Re: Skipping past TCP lost packet in userspace
  2011-06-30 14:36           ` Neil Horman
@ 2011-07-01  8:39             ` Josh Lehan
  2011-07-01 13:37               ` Neil Horman
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Lehan @ 2011-07-01  8:39 UTC (permalink / raw)
  To: Neil Horman
  Cc: Josh Lehan, janardhan.iyengar, Janardhan Iyengar, rick.jones2,
	Yuchung Cheng, netdev, Bryan Ford

On 06/30/2011 07:36 AM, Neil Horman wrote:
> I'll leave the rest of this alone, since it's pretty obvious that no one is going
> to break TCP for you, but just so that you're aware, the only reason you have to

That's the fundamental disconnect we've been trying to communicate: TCP
*won't break*.  None of the rules of TCP are broken, from the wire's
point of view.  The OS merely gets a richer API, from the application's
point of view, to optimize the TCP protocol implementation to serve a
wider variety of needs.

> use the 2-Wire gateway that AT&T provides is because there are no commercially
> available routers that support the uplink interface (which I expect will change

That would be good to give the customer a choice of access devices with
which to get on the network, and let the market decide what is best,
instead of AT&T dictating what's allowed.  I'm getting deja vu of a
famous legal case from 27 years ago.

> eventually).  In the meantime, if you want to use a different router, place
> the RG in bridge mode by selecting a host as your DMZ device.  That will assign
> the WAN address to that connected device via DHCP and allow you to pass whatever
> traffic you want through it.  I use it to pass SCTP and IPv6 traffic all the
> time, works great.

Wow, that's news to me, that it allows this.

http://www.ka9q.net/Uverse/

Have the limitations in these documents been addressed?  If so, kudos to
AT&T.

Josh Lehan


* Re: Skipping past TCP lost packet in userspace
  2011-07-01  8:39             ` Josh Lehan
@ 2011-07-01 13:37               ` Neil Horman
  0 siblings, 0 replies; 17+ messages in thread
From: Neil Horman @ 2011-07-01 13:37 UTC (permalink / raw)
  To: Josh Lehan
  Cc: janardhan.iyengar, Janardhan Iyengar, rick.jones2, Yuchung Cheng,
	netdev, Bryan Ford

On Fri, Jul 01, 2011 at 01:39:18AM -0700, Josh Lehan wrote:
> On 06/30/2011 07:36 AM, Neil Horman wrote:
> > I'll leave the rest of this alone, since it's pretty obvious that no one is going
> > to break TCP for you, but just so that you're aware, the only reason you have to
> 
> That's the fundamental disconnect we've been trying to communicate: TCP
> *won't break*.  None of the rules of TCP are broken, from the wire's
> point of view.  The OS merely gets a richer API, from the application's
> point of view, to optimize the TCP protocol implementation to serve a
> wider variety of needs.
> 
I get what you're saying, but an API change that is only available on Linux is
still a break.  I get that it's not an on-wire change, but API differences that
make code non-portable go unused, for exactly that reason - people don't write
apps that can only work on Linux, they write standard apps that comply with
specifications.  Deviations from those standards go unused.

I suppose it comes down to a difference of opinion about what "broken" amounts
to.  Either way, if you want to see this happen, I'm certain it will start with
you presenting code and illustrating its benefit.  No one else is going to write
this for you.

> > use the 2-Wire gateway that AT&T provides is because there are no commercially
> > available routers that support the uplink interface (which I expect will change
> 
> It would be good to give the customer a choice of access devices with
> which to get on the network, and let the market decide what is best,
> instead of AT&T dictating what's allowed.  I'm getting deja vu of a
> famous legal case from 27 years ago.
> 
> > eventually).  For the time being, if you want to use a different router, place
> > the RG in bridge mode by selecting a host as your DMZ device.  That will assign
> > the wan address to that connected device via DHCP and allow you to pass whatever
> > traffic you want through it.  I use it to pass SCTP and IPv6 traffic all the
> > time, works great.
> 
> Wow, that's news to me, that it allows this.
> 
> http://www.ka9q.net/Uverse/
> 
> Have the limitations in these documents been addressed?  If so, kudos to
> AT&T.
> 
The limitations are overstated in the link above.  NAT is mandatory, but only
over the HPNA interface.  The idea is to prevent your set-top box that AT&T
communicates with from having a public IP address (to maintain quality of
service to the TV and prevent outside attacks).  They have one or two ports
forwarded from the public IP address to the private address of the set-top box.

In comparison, the RJ-45 interfaces can be made wide open.  Specifically, you can
attach a single device and mark it as the DMZ for the residential gateway.
Anything not forwarded to the set-top box gets passed to the DMZ device.

So it's not 100% NAT-free.  It's wide open, minus a single port to allow AT&T to
push content to your set-top box.

It works pretty well.  I marked v6rotuer.think-freely.org as my DMZ device and
use it to do my own internal NAT-ing for IPv4 as well as serve as the tunnel
endpoint and router for my IPv6 network.
Neil

> Josh Lehan
> 


Thread overview: 17+ messages
2011-05-31  1:19 Skipping past TCP lost packet in userspace Josh Lehan
2011-05-31  3:30 ` Marcus D. Leech
2011-05-31  4:12   ` Josh Lehan
2011-05-31  4:05 ` Mikael Abrahamsson
2011-05-31 11:12 ` Neil Horman
2011-05-31 17:23 ` Yuchung Cheng
2011-06-01  8:10   ` Josh Lehan
2011-06-01 16:57     ` Bill Sommerfeld
2011-06-01 17:35     ` Rick Jones
2011-06-24 14:58       ` Janardhan Iyengar
2011-06-30  8:38         ` Josh Lehan
2011-06-30 14:36           ` Neil Horman
2011-07-01  8:39             ` Josh Lehan
2011-07-01 13:37               ` Neil Horman
2011-06-01 19:36     ` juice
2011-06-03 11:51     ` Ilpo Järvinen
2011-06-06  6:30       ` Josh Lehan