From: "David S. Miller" <davem@redhat.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: e0206@foo21.com, linux-kernel@vger.kernel.org, kuznet@ms2.inr.ac.ru
Subject: Re: [Patch][RFC] epoll and half closed TCP connections
Date: Sat, 12 Jul 2003 22:24:57 -0700
Message-ID: <20030712222457.3d132897.davem@redhat.com>
In-Reply-To: <Pine.LNX.4.55.0307121256200.4720@bigblue.dev.mcafeelabs.com>
On Sat, 12 Jul 2003 13:01:21 -0700 (PDT)
Davide Libenzi <davidel@xmailserver.org> wrote:
>
> [Cc:ing DaveM ]
[Cc:ing Alexey :-) ]
Alexey, they seem to want to add some kind of POLLRDHUP thing,
comments wrt. TCP and elsewhere in the networking? See below...
> On Sat, 12 Jul 2003, Eric Varsanyi wrote:
>
> > I'm proposing adding a new POLL event type (POLLRDHUP) as a way to solve
> > a new race introduced by having an edge triggered event mechanism
> > (epoll). The problem occurs when a client writes data and then does a
> > write side shutdown(). The server (using epoll) sees only one event for
> > the read data ready and the read EOF condition and has no way to tell
> > that an EOF occurred.
> >
> > -Eric Varsanyi
> >
> > Details
> > -----------
> > - remote sends data and does a shutdown
> >   [ we see a data bearing packet and FIN from client on the wire ]
> >
> > - user mode server gets around to doing accept() and registers
> >   for EPOLLIN events (along with HUP and ERR which are forced on)
> >
> > - epoll_wait() returns a single EPOLLIN event on the FD representing
> >   both the 1/2 shutdown state and data available
> >
> > At this point there is no way the app can tell if there is a half closed
> > connection so it may issue a close() back to the client after writing
> > results. Normally the server would distinguish these events by assuming
> > EOF if it got a read ready indication and the first read returned 0 bytes,
> > or would issue read calls until less data was returned than was asked for.
> >
> > In a level triggered world this all just works because the read ready
> > indication is driven back to the app as long as the socket state is half
> > closed. The event driven epoll mechanism folds these two indications
> > together and thus loses one 'edge'.
> >
> > One would be tempted to issue an extra read() after getting back less than
> > expected, but this is an extra system call on every read event and you get
> > back the same '0' bytes that you get if the buffer is just empty. The only
> > sure bet seems to be CTL_MODding the FD to force a re-poll (which would
> > cost a syscall and hash-lookup in eventpoll for every read event).
> >
>
> Yes, this is overhead that should be avoided. It is true that you could
> use Level Triggered events, but if you structured your app on edge you
> should be able to solve this w/out overhead.
>
>
>
> > 2) add a new 1/2 closed event type that a poll routine can return
> >
> > The implementation is trivial, a patch is included below. If this idea sees
> > favor I'll fix the other architectures, ipv6, epoll.h, and make a 'real'
> > patch. I do not believe any drivers deserve to be modified to return this
> > new event.
>
> This looks good to me. David what do you think ?
>
>
>
> > diff -Naur linux-2.4.20/include/asm-i386/poll.h linux-2.4.20_ev/include/asm-i386/poll.h
> > --- linux-2.4.20/include/asm-i386/poll.h Thu Jan 23 13:01:28 1997
> > +++ linux-2.4.20_ev/include/asm-i386/poll.h Sat Jul 12 12:29:11 2003
> > @@ -15,6 +15,7 @@
> > #define POLLWRNORM 0x0100
> > #define POLLWRBAND 0x0200
> > #define POLLMSG 0x0400
> > +#define POLLRDHUP 0x0800
> >
> >  struct pollfd {
> >  	int fd;
> > diff -Naur linux-2.4.20/net/ipv4/tcp.c linux-2.4.20_ev/net/ipv4/tcp.c
> > --- linux-2.4.20/net/ipv4/tcp.c Tue Jul 8 09:40:42 2003
> > +++ linux-2.4.20_ev/net/ipv4/tcp.c Sat Jul 12 12:29:56 2003
> > @@ -424,7 +424,7 @@
> >  	if (sk->shutdown == SHUTDOWN_MASK || sk->state == TCP_CLOSE)
> >  		mask |= POLLHUP;
> >  	if (sk->shutdown & RCV_SHUTDOWN)
> > -		mask |= POLLIN | POLLRDNORM;
> > +		mask |= POLLIN | POLLRDNORM | POLLRDHUP;
> >
> >  	/* Connected? */
> >  	if ((1 << sk->state) & ~(TCPF_SYN_SENT|TCPF_SYN_RECV)) {
> >
>
>
>
> - Davide
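
For illustration only, here is a rough user-space sketch of how an edge-triggered
epoll server might consume the proposed event. It assumes a kernel and libc that
expose the flag to epoll as EPOLLRDHUP (mainline Linux gained it later than this
thread, in 2.6.17). The helper name half_close_events() is hypothetical, and a
connected AF_UNIX stream pair stands in for the TCP connection purely to keep
the example self-contained.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the epoll event mask the "server" end sees
 * after the "client" end writes data and half-closes, or 0 on any failure.
 * Assumption: an AF_UNIX stream socketpair stands in for a TCP connection.
 */
static uint32_t half_close_events(void)
{
    int sv[2];
    int epfd;
    uint32_t seen = 0;
    struct epoll_event ev = { 0 };
    struct epoll_event out = { 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;

    /* Client side: data followed by a write-side shutdown, as in the report. */
    if (write(sv[1], "req", 3) != 3 || shutdown(sv[1], SHUT_WR) < 0)
        goto out_close;

    epfd = epoll_create1(0);
    if (epfd < 0)
        goto out_close;

    /* Server side: register late, edge-triggered, asking for the half-close bit. */
    ev.events = EPOLLIN | EPOLLRDHUP | EPOLLET;
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0 &&
        epoll_wait(epfd, &out, 1, 1000) == 1)
        seen = out.events;  /* one wakeup covers both the data and the EOF */

    close(epfd);
out_close:
    close(sv[0]);
    close(sv[1]);
    return seen;
}
```

A caller would test the returned mask for EPOLLRDHUP: if it is set, the peer has
finished sending, so the server can drain the remaining data and knows EOF
follows without an extra read() or an EPOLL_CTL_MOD re-poll.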