[PATCH] tcp: select(writefds) don't hang up when a peer close connection

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] tcp: select(writefds) don't hang up when a peer close connection
@ 2010-08-25  2:05 KOSAKI Motohiro
  2010-08-25 22:34 ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: KOSAKI Motohiro @ 2010-08-25  2:05 UTC (permalink / raw)
  To: netdev, LKML; +Cc: kosaki.motohiro, YOSHIFUJI Hideaki

This issue come from ruby language community. Below test program
hang up when only run on Linux.

	% uname -mrsv
	Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
	% ruby -rsocket -ve '
	BasicSocket.do_not_reverse_lookup = true
	serv = TCPServer.open("127.0.0.1", 0)
	s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
	s2 = serv.accept
	s2.close
	s1.write("a") rescue p $!
	s1.write("a") rescue p $!
	Thread.new {
	  s1.write("a")
	}.join'
	ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
	#<Errno::EPIPE: Broken pipe>
	[Hang Here]

FreeBSD, Solaris, Mac doesn't. because Ruby's write() method call
select() internally. and tcp_poll has a bug.

SUS defined 'ready for writing' of select() as following.

|  A descriptor shall be considered ready for writing when a call to an output
|  function with O_NONBLOCK clear would not block, whether or not the function
|  would transfer data successfully.

That said, EPIPE situation is clearly one of 'ready for writing'.

We don't have read-side issue because tcp_poll() already has read side
shutdown care.

|        if (sk->sk_shutdown & RCV_SHUTDOWN)
|                mask |= POLLIN | POLLRDNORM | POLLRDHUP;

So, Let's insert same logic in write side.

- reference url
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 net/ipv4/tcp.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 176e11a..2497e48 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -451,7 +451,8 @@ unsigned int tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 				if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk))
 					mask |= POLLOUT | POLLWRNORM;
 			}
-		}
+		} else
+			mask |= POLLOUT | POLLWRNORM;
 
 		if (tp->urg_data & TCP_URG_VALID)
 			mask |= POLLPRI;
-- 
1.6.5.2




^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-25  2:05 [PATCH] tcp: select(writefds) don't hang up when a peer close connection KOSAKI Motohiro
@ 2010-08-25 22:34 ` David Miller
  2010-08-26  1:24   ` Ben Hutchings
  2010-08-26  3:09   ` KOSAKI Motohiro
  0 siblings, 2 replies; 10+ messages in thread
From: David Miller @ 2010-08-25 22:34 UTC (permalink / raw)
  To: kosaki.motohiro; +Cc: netdev, linux-kernel, yoshfuji

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 25 Aug 2010 11:05:48 +0900 (JST)

> This issue come from ruby language community. Below test program
> hang up when only run on Linux.
> 
> 	% uname -mrsv
> 	Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
> 	% ruby -rsocket -ve '
> 	BasicSocket.do_not_reverse_lookup = true
> 	serv = TCPServer.open("127.0.0.1", 0)
> 	s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
> 	s2 = serv.accept
> 	s2.close
> 	s1.write("a") rescue p $!
> 	s1.write("a") rescue p $!
> 	Thread.new {
> 	  s1.write("a")
> 	}.join'
> 	ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
> 	#<Errno::EPIPE: Broken pipe>
> 	[Hang Here]
> 
> FreeBSD, Solaris, Mac doesn't. because Ruby's write() method call
> select() internally. and tcp_poll has a bug.

In your opinion.

> SUS defined 'ready for writing' of select() as following.
> 
> |  A descriptor shall be considered ready for writing when a call to an output
> |  function with O_NONBLOCK clear would not block, whether or not the function
> |  would transfer data successfully.
> 
> That said, EPIPE situation is clearly one of 'ready for writing'.

How Linux should behave is defined by many things, and often it is
simply defined by how we've behaved for a very long time.  This is
because changing behavior can often break as many applications as it
can fix.  Standards don't necessarily tell us how we must behave,
since often is it impractical to follow their definions.

And in this case here, I call into question the behavior of Ruby and
the application from two perspectives:

1) Unlike all of the other conditions signalled by poll() this is
   one the application explicitly created and therefore knows about.

   If the application calls close() or shutdown() with the send flag
   set, IT KNOWS what is going to happen on a write() attempt.

2) Ruby and this script will have to deal with the past 13 years
   worth of Linux kernels.  Even if I were to apply this fix now
   it is not going to propagate onto a user's system any time soon.

   Many systems would never ever get this fix.

   Therefore it behooves Ruby and this script to make a very reasonable
   change, which is to track when close() or send shutdown() calls occur
   and behave appropriately on a write() call.

I'm therefore not applying this patch, because not only can applications
handle this properly with information they already have, the change has
the potential to break applications.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-25 22:34 ` David Miller
@ 2010-08-26  1:24   ` Ben Hutchings
  2010-08-26  3:55     ` KOSAKI Motohiro
  2010-08-26  3:09   ` KOSAKI Motohiro
  1 sibling, 1 reply; 10+ messages in thread
From: Ben Hutchings @ 2010-08-26  1:24 UTC (permalink / raw)
  To: David Miller; +Cc: kosaki.motohiro, netdev, linux-kernel, yoshfuji

On Wed, 2010-08-25 at 15:34 -0700, David Miller wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 25 Aug 2010 11:05:48 +0900 (JST)
> 
> > This issue come from ruby language community. Below test program
> > hang up when only run on Linux.
> > 
> > 	% uname -mrsv
> > 	Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
> > 	% ruby -rsocket -ve '
> > 	BasicSocket.do_not_reverse_lookup = true
> > 	serv = TCPServer.open("127.0.0.1", 0)
> > 	s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
> > 	s2 = serv.accept
> > 	s2.close
> > 	s1.write("a") rescue p $!
> > 	s1.write("a") rescue p $!
> > 	Thread.new {
> > 	  s1.write("a")
> > 	}.join'
> > 	ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
> > 	#<Errno::EPIPE: Broken pipe>
> > 	[Hang Here]
[...]
> And in this case here, I call into question the behavior of Ruby and
> the application from two perspectives:
> 
> 1) Unlike all of the other conditions signalled by poll() this is
>    one the application explicitly created and therefore knows about.
>
>    If the application calls close() or shutdown() with the send flag
>    set, IT KNOWS what is going to happen on a write() attempt.
[...]

This example seems to have both server (serv, s2) and client (s1) in the
same process for simplicity.  The server socket (s2) is closed and the
client cannot be expected to know that.  Of course the client ought to
drop the connection after the first EPIPE, but it's reasonable to expect
that this is a sticky condition just as it would be for a pipe.

Here's a similar test case in C:

#include <assert.h>
#include <signal.h>
#include <stdio.h>

#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr sa;
    struct timeval tv;
    int serv, s1, s2;
    socklen_t len;
    fd_set fds;

    signal(SIGPIPE, SIG_IGN);

    serv = socket(AF_INET, SOCK_STREAM, 0);
    assert(serv >= 0);
    assert(!listen(serv, 1));
    len = sizeof(sa);
    assert(!getsockname(serv, &sa, &len));

    s1 = socket(AF_INET, SOCK_STREAM, 0);
    assert(s1 >= 0);
    assert(!connect(s1, &sa, len));
    len = sizeof(sa);

    s2 = accept(serv, &sa, &len);
    assert(s2 >= 0);
    close(s2);

    for (;;) {
	printf("write: %d\n", write(s1, "a", 1));
	FD_ZERO(&fds);
	FD_SET(s1, &fds);
	tv.tv_sec = 1;
	tv.tv_usec = 0;
	printf("select: %d\n", select(s1 + 1, NULL, &fds, NULL, &tv));
    }
    return 0;
}

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-25 22:34 ` David Miller
  2010-08-26  1:24   ` Ben Hutchings
@ 2010-08-26  3:09   ` KOSAKI Motohiro
  2010-08-26  3:51     ` David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: KOSAKI Motohiro @ 2010-08-26  3:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, yoshfuji

Hi

Thank you for quick responce!


>> SUS defined 'ready for writing' of select() as following.
>>
>> |  A descriptor shall be considered ready for writing when a call to an output
>> |  function with O_NONBLOCK clear would not block, whether or not the function
>> |  would transfer data successfully.
>>
>> That said, EPIPE situation is clearly one of 'ready for writing'.
>
> How Linux should behave is defined by many things, and often it is
> simply defined by how we've behaved for a very long time.  This is
> because changing behavior can often break as many applications as it
> can fix.  Standards don't necessarily tell us how we must behave,
> since often is it impractical to follow their definions.
>
> And in this case here, I call into question the behavior of Ruby and
> the application from two perspectives:
>
> 1) Unlike all of the other conditions signalled by poll() this is
>   one the application explicitly created and therefore knows about.
>
>   If the application calls close() or shutdown() with the send flag
>   set, IT KNOWS what is going to happen on a write() attempt.

Umm??

Probably my example is not so good. That's not my point.
In the example application, client and server socket is in the same process.
But it's NOT generic. usually, client and server are another process. then,
client can't expect when server close socket.

The most big matter is, this is can't be avoided in userland. In addition,
EVERY application don't want application hang up. we don't hesitate
userland change.

> 2) Ruby and this script will have to deal with the past 13 years
>   worth of Linux kernels.  Even if I were to apply this fix now
>   it is not going to propagate onto a user's system any time soon.
>
>   Many systems would never ever get this fix.
>
>   Therefore it behooves Ruby and this script to make a very reasonable
>   change, which is to track when close() or send shutdown() calls occur
>   and behave appropriately on a write() call.
>
> I'm therefore not applying this patch, because not only can applications
> handle this properly with information they already have, the change has
> the potential to break applications.

At first, I was thinking two fix plan.
 1) this patch
 2) adding POLLWRHUP as POLLRDHUP.

However I couldn't find any regression rick in (1). then I did choice (1).

So, Can you please tell us what rick you worry? My thinking is, If
select(writefds)
returned, an application naturally call to write. (why not? If not,
why do you call select?)
and write return EPIPE. every network application have EPIPE error checking.

But, there is any rick. I can remake a patch as (2).

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26  3:09   ` KOSAKI Motohiro
@ 2010-08-26  3:51     ` David Miller
  2010-08-26  4:02       ` KOSAKI Motohiro
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2010-08-26  3:51 UTC (permalink / raw)
  To: kosaki.motohiro; +Cc: netdev, linux-kernel, yoshfuji, bhutchings

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Thu, 26 Aug 2010 12:09:55 +0900

> Probably my example is not so good. That's not my point.
> In the example application, client and server socket is in the same process.
> But it's NOT generic. usually, client and server are another process. then,
> client can't expect when server close socket.
> 
> The most big matter is, this is can't be avoided in userland. In addition,
> EVERY application don't want application hang up. we don't hesitate
> userland change.

Ok, you and Ben Hutchings have convinced me to reconsider.

And this matches what even BSD4.4-Lite does (I checked yesterday before
my initial reply), so I will apply this patch.

Thanks for your patience.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26  1:24   ` Ben Hutchings
@ 2010-08-26  3:55     ` KOSAKI Motohiro
  2010-08-26 12:07       ` Ben Hutchings
  0 siblings, 1 reply; 10+ messages in thread
From: KOSAKI Motohiro @ 2010-08-26  3:55 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: kosaki.motohiro, David Miller, netdev, linux-kernel, yoshfuji

> On Wed, 2010-08-25 at 15:34 -0700, David Miller wrote:
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > Date: Wed, 25 Aug 2010 11:05:48 +0900 (JST)
> > 
> > > This issue come from ruby language community. Below test program
> > > hang up when only run on Linux.
> > > 
> > > 	% uname -mrsv
> > > 	Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
> > > 	% ruby -rsocket -ve '
> > > 	BasicSocket.do_not_reverse_lookup = true
> > > 	serv = TCPServer.open("127.0.0.1", 0)
> > > 	s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
> > > 	s2 = serv.accept
> > > 	s2.close
> > > 	s1.write("a") rescue p $!
> > > 	s1.write("a") rescue p $!
> > > 	Thread.new {
> > > 	  s1.write("a")
> > > 	}.join'
> > > 	ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
> > > 	#<Errno::EPIPE: Broken pipe>
> > > 	[Hang Here]
> [...]
> > And in this case here, I call into question the behavior of Ruby and
> > the application from two perspectives:
> > 
> > 1) Unlike all of the other conditions signalled by poll() this is
> >    one the application explicitly created and therefore knows about.
> >
> >    If the application calls close() or shutdown() with the send flag
> >    set, IT KNOWS what is going to happen on a write() attempt.
> [...]
> 
> This example seems to have both server (serv, s2) and client (s1) in the
> same process for simplicity.  The server socket (s2) is closed and the
> client cannot be expected to know that.  Of course the client ought to
> drop the connection after the first EPIPE, but it's reasonable to expect
> that this is a sticky condition just as it would be for a pipe.
> 
> Here's a similar test case in C:
> 
> #include <assert.h>
> #include <signal.h>
> #include <stdio.h>
> 
> #include <sys/select.h>
> #include <sys/socket.h>
> #include <sys/types.h>
> #include <unistd.h>
> 
> int main(void)
> {
>     struct sockaddr sa;
>     struct timeval tv;
>     int serv, s1, s2;
>     socklen_t len;
>     fd_set fds;
> 
>     signal(SIGPIPE, SIG_IGN);
> 
>     serv = socket(AF_INET, SOCK_STREAM, 0);
>     assert(serv >= 0);
>     assert(!listen(serv, 1));
>     len = sizeof(sa);
>     assert(!getsockname(serv, &sa, &len));
> 
>     s1 = socket(AF_INET, SOCK_STREAM, 0);
>     assert(s1 >= 0);
>     assert(!connect(s1, &sa, len));
>     len = sizeof(sa);
> 
>     s2 = accept(serv, &sa, &len);
>     assert(s2 >= 0);
>     close(s2);
> 
>     for (;;) {
> 	printf("write: %d\n", write(s1, "a", 1));
> 	FD_ZERO(&fds);
> 	FD_SET(s1, &fds);
> 	tv.tv_sec = 1;
> 	tv.tv_usec = 0;
> 	printf("select: %d\n", select(s1 + 1, NULL, &fds, NULL, &tv));
>     }
>     return 0;
> }

Cool!

Ben, I think your code is cleaner than mine. If you allow me, I hope to
include this one into my patch description.

Thanks.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26  3:51     ` David Miller
@ 2010-08-26  4:02       ` KOSAKI Motohiro
  0 siblings, 0 replies; 10+ messages in thread
From: KOSAKI Motohiro @ 2010-08-26  4:02 UTC (permalink / raw)
  To: David Miller; +Cc: kosaki.motohiro, netdev, linux-kernel, yoshfuji, bhutchings

> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Thu, 26 Aug 2010 12:09:55 +0900
> 
> > Probably my example is not so good. That's not my point.
> > In the example application, client and server socket is in the same process.
> > But it's NOT generic. usually, client and server are another process. then,
> > client can't expect when server close socket.
> > 
> > The most big matter is, this is can't be avoided in userland. In addition,
> > EVERY application don't want application hang up. we don't hesitate
> > userland change.
> 
> Ok, you and Ben Hutchings have convinced me to reconsider.
> 
> And this matches what even BSD4.4-Lite does (I checked yesterday before
> my initial reply), so I will apply this patch.
> 
> Thanks for your patience.

Great!

Thank you!!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26  3:55     ` KOSAKI Motohiro
@ 2010-08-26 12:07       ` Ben Hutchings
  2010-08-26 19:21         ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Hutchings @ 2010-08-26 12:07 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: David Miller, netdev, linux-kernel, yoshfuji

On Thu, 2010-08-26 at 12:55 +0900, KOSAKI Motohiro wrote:
[...]
> Cool!
> 
> Ben, I think your code is cleaner than mine. If you allow me, I hope to
> include this one into my patch description.

You can do that if you like, but it seems a bit long for a commit
message.

By the way, I was able to reproduce this bug in RHEL 3 (kernel based on
2.4.21) so it seems to have been around for a while.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26 12:07       ` Ben Hutchings
@ 2010-08-26 19:21         ` David Miller
  2010-08-30  1:08           ` KOSAKI Motohiro
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2010-08-26 19:21 UTC (permalink / raw)
  To: bhutchings; +Cc: kosaki.motohiro, netdev, linux-kernel, yoshfuji

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 26 Aug 2010 13:07:28 +0100

> By the way, I was able to reproduce this bug in RHEL 3 (kernel based on
> 2.4.21) so it seems to have been around for a while.

It's existed since September 1998:

commit 6275af56642b5b9ed9ef15bf5d9718661c5fe79d
Author: freitag <freitag>
Date:   Sat Sep 26 06:20:25 1998 +0000

    Some small fixes.
    
    - Remove second check for sk->err in poll as discussed with Linus.
    - Don't signal writability when the local end has been shut down.
    - Comment fixes.

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5a09639..16d4308 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -5,7 +5,7 @@
  *
  *		Implementation of the Transmission Control Protocol(TCP).
  *
- * Version:	$Id: tcp.c,v 1.123 1998-09-23 19:52:39 davem Exp $
+ * Version:	$Id: tcp.c,v 1.124 1998-09-26 06:20:25 freitag Exp $
  *
  * Authors:	Ross Biro, <bir7@leland.Stanford.Edu>
  *		Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
@@ -591,18 +591,19 @@ unsigned int tcp_poll(struct file * file, struct socket *sock, poll_table *wait)
 		mask |= POLLHUP;
 
 	/* Connected? */
-	if ((1 << sk->state) & ~(TCPF_SYN_SENT|TCPF_SYN_RECV)) {
+	if ((1 << sk->state) & ~(TCPF_SYN_SENT|TCPF_SYN_RECV)) { 
 		if ((tp->rcv_nxt != tp->copied_seq) &&
 		    (tp->urg_seq != tp->copied_seq ||
 		     tp->rcv_nxt != tp->copied_seq+1 ||
 		     sk->urginline || !tp->urg_data))
 			mask |= POLLIN | POLLRDNORM;
 
-		/* Always wake the user up when an error occurred */
-		if (sock_wspace(sk) >= tcp_min_write_space(sk, tp) || sk->err) {
-			mask |= POLLOUT | POLLWRNORM;
-		} else {
-			sk->socket->flags |= SO_NOSPACE; /* send SIGIO later */
+		if (!(sk->shutdown & SEND_SHUTDOWN)) { 
+			if (sock_wspace(sk) >= tcp_min_write_space(sk, tp)) { 
+				mask |= POLLOUT | POLLWRNORM;
+			} else {  /* send SIGIO later */
+				sk->socket->flags |= SO_NOSPACE;  
+			}
 		}
 
 		if (tp->urg_data & URG_VALID)
@@ -733,6 +734,9 @@ static void wait_for_tcp_memory(struct sock * sk)
 	lock_sock(sk);
 }
 
+/* When all user supplied data has been queued set the PSH bit */ 
+#define PSH_NEEDED (seglen == 0 && iovlen == 0) 
+
 /*
  *	This routine copies from a user buffer into a socket,
  *	and starts the transmit system.
@@ -768,6 +772,9 @@ int tcp_do_sendmsg(struct sock *sk, int iovlen, struct iovec *iov, int flags)
 		while(seglen > 0) {
 			int copy, tmp, queue_it;
 
+			if (err) 
+				goto do_fault2;
+
 			/* Stop on errors. */
 			if (sk->err)
 				goto do_sock_err;
@@ -810,12 +817,24 @@ int tcp_do_sendmsg(struct sock *sk, int iovlen, struct iovec *iov, int flags)
 							from, skb_put(skb, copy),
 							copy, skb->csum, &err);
 					}
+					/*
+					 * FIXME: the *_user functions should
+					 *	  return how much data was
+					 *	  copied before the fault
+					 *	  occured and then a partial
+					 *	  packet with this data should
+					 *	  be sent.  Unfortunately 
+					 *	  csum_and_copy_from_user doesn't
+					 *	  return this information. 
+					 *	  ATM it might send partly zeroed
+					 *	  data in this case.
+					 */ 
 					tp->write_seq += copy;
 					TCP_SKB_CB(skb)->end_seq += copy;
 					from += copy;
 					copied += copy;
 					seglen -= copy;
-					if(!seglen && !iovlen)
+					if (PSH_NEEDED)
 						TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_PSH;
 					continue;
 				}
@@ -831,7 +850,7 @@ int tcp_do_sendmsg(struct sock *sk, int iovlen, struct iovec *iov, int flags)
 			 * actually do is use the whole MSS.  Since
 			 * the results in the right edge of the packet
 			 * being outside the window, it will be queued
-			 * for later rather than sent.
+			 * for later rather than sent.  
 			 */
 			copy = tp->snd_wnd - (tp->snd_nxt - tp->snd_una);
 			if(copy >= (tp->max_window >> 1))
@@ -884,7 +903,7 @@ int tcp_do_sendmsg(struct sock *sk, int iovlen, struct iovec *iov, int flags)
 
 			/* Prepare control bits for TCP header creation engine. */
 			TCP_SKB_CB(skb)->flags = (TCPCB_FLAG_ACK |
-						  ((!seglen && !iovlen) ?
+						  (PSH_NEEDED ?
 						   TCPCB_FLAG_PSH : 0));
 			TCP_SKB_CB(skb)->sacked = 0;
 			if (flags & MSG_OOB) {
@@ -933,9 +952,12 @@ do_interrupted:
 	return err;
 do_fault:
 	kfree_skb(skb);
+do_fault2: 
 	return -EFAULT;
 }
 
+#undef PSH_NEEDED
+
 /*
  *	Send an ack if one is backlogged at this point. Ought to merge
  *	this with tcp_send_ack().
@@ -1050,8 +1072,6 @@ static void cleanup_rbuf(struct sock *sk, int copied)
 		tcp_eat_skb(sk, skb);
 	}
 
-	SOCK_DEBUG(sk, "sk->rspace = %lu\n", sock_rspace(sk));
-
   	/* We send an ACK if we can now advertise a non-zero window
 	 * which has been raised "significantly".
   	 */

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] tcp: select(writefds) don't hang up when a peer close connection
  2010-08-26 19:21         ` David Miller
@ 2010-08-30  1:08           ` KOSAKI Motohiro
  0 siblings, 0 replies; 10+ messages in thread
From: KOSAKI Motohiro @ 2010-08-30  1:08 UTC (permalink / raw)
  To: David Miller; +Cc: kosaki.motohiro, bhutchings, netdev, linux-kernel, yoshfuji

> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Thu, 26 Aug 2010 13:07:28 +0100
> 
> > By the way, I was able to reproduce this bug in RHEL 3 (kernel based on
> > 2.4.21) so it seems to have been around for a while.
> 
> It's existed since September 1998:
> 
> commit 6275af56642b5b9ed9ef15bf5d9718661c5fe79d
> Author: freitag <freitag>
> Date:   Sat Sep 26 06:20:25 1998 +0000
> 
>     Some small fixes.
>     
>     - Remove second check for sk->err in poll as discussed with Linus.
>     - Don't signal writability when the local end has been shut down.
>     - Comment fixes.

because (almostly) nobody were using this feature? ;)




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-08-30  1:08 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-25  2:05 [PATCH] tcp: select(writefds) don't hang up when a peer close connection KOSAKI Motohiro
2010-08-25 22:34 ` David Miller
2010-08-26  1:24   ` Ben Hutchings
2010-08-26  3:55     ` KOSAKI Motohiro
2010-08-26 12:07       ` Ben Hutchings
2010-08-26 19:21         ` David Miller
2010-08-30  1:08           ` KOSAKI Motohiro
2010-08-26  3:09   ` KOSAKI Motohiro
2010-08-26  3:51     ` David Miller
2010-08-26  4:02       ` KOSAKI Motohiro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.