linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Cassella <pwc@speakeasy.net>
To: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org
Subject: 2.4.0-ac3 write() to tcp socket returning errno of -3 (ESRCH: "No such process")
Date: Sun, 7 Jan 2001 23:55:28 -0600 (CST)	[thread overview]
Message-ID: <Pine.LNX.4.21.0101072217440.703-100000@localhost> (raw)

[1.] One line summary of the problem:    

write() returns -1 and sets errno non-sensically.  2.4.0{,-ac[23]}

[2.] Full description of the problem/report:

write() sometimes returns -1 and sets errno to one of
1 (EPERM: "Operation not permitted")
3 (ESRCH: "No such process")
8 (ENOEXEC: "Exec format error")

1 seems to be the most common.

I have verified that at least the ESRCH case is caused by the kernel.  
With the following patch to sys_write(),

--- fs/read_write.c~       Sat Nov 11 18:07:38 2000
+++ fs/read_write.c        Sun Jan  7 14:00:25 2001
@@ -146,6 +146,8 @@
        ssize_t ret;
        struct file * file;
 
+extern struct file_operations socket_file_ops;
+
        ret = -EBADF;
        file = fget(fd);
        if (file) {
@@ -165,6 +167,18 @@
                                 DN_MODIFY);
                fput(file);
        }
+       if(ret < 0 && file && file->f_op == &socket_file_ops ) {
+         switch(-ret) {
+               case EAGAIN: case EBADF: case EPIPE: case ENOSPC: case EIO: case ECONNRESET:
+               case EINTR: case ETIMEDOUT: case EFAULT: case EINVAL: case EMSGSIZE: case ENOMEM:
+               case ENOBUFS: case ENOTCONN: 
+                 break;
+
+               default:
+               printk(KERN_ERR "sys_write: ret is unexpectedly %d.\n", ret);
+         }
+       }
+
        return ret;
 }

and socket_file_ops made non-static, I eventually got this logged:

Jan  7 22:15:29 localhost kernel: sys_write: ret is unexpectedly -3.


And this user code:

r = write(fd_that_is_a_tcp_socket, ...); 

if (r > 0) { ...
} else if(r == 0) { ...
} else if (errno == EAGAIN || errno == EINTR) { ...
} else if (errno == EPIPE || errno == ENOSPC || errno == EIO || errno == ECONNRESET || errno == ETIMEDOUT) { ...
} else {
  g_error("node_write: write failed with unexpected errno: %d (%s)\n", errno, g_strerror(errno));
}

at the same time produced

** ERROR **: node_write: write failed with unexpected errno: 3 (No such process)

aborting...

This socket is O_NONBLOCK'd, as well as SO_KEEPALIVE'd and SO_REUSEADDR'd,
and was an outgoing connection.

TCP ECN is on.  Syncookies are compiled in but turned off.

This app has previously failed with the same thing for errno's 1 and 8,
but without the kernel change to verify that those values are actually
coming from the kernel.

That had happened on 2.4.0, 2.4.0-ac2, and 2.4.0-ac3.  I didn't notice it
on 2.4.0-test11-pre2, which was the last kernel I'd been running.  But
the problem's not particularly deterministic, and I may not have been
running it long enough.  At least two other people are seeing the same
thing on 2.4 kernels.

My guess is that something mistakenly thinks it's returning a positive
number.

[3.] Keywords (i.e., modules, networking, kernel):

networking


[4.] Kernel version (from /proc/version):

Linux version 2.4.0-ac3 (fortytwo@manetheren) (gcc version 2.95.2 20000220 (Debian GNU/Linux)) #4 Sun Jan 7 16:18:26 CST 2001

The machine and kernel are UP.


[6.] A small shell script or example program which triggers the
     problem (if possible)

This happens after the app runs usually for several hours with lots of
outgoing and incoming TCP connections to lots of different hosts.


[7.1.] Software (add the output of the ver_linux script here)

Linux manetheren 2.4.0-ac3 #4 Sun Jan 7 16:18:26 CST 2001 i686 unknown
Kernel modules         2.3.23
Gnu C                  2.95.2
Gnu Make               3.79.1
Binutils               2.10.1.0.2
Linux C Library        > libc.2.2
Dynamic linker         ldd (GNU libc) 2.2
Procps                 2.0.6
Mount                  2.10q
Net-tools              2.05
Console-tools          0.2.3
Sh-utils               2.0.11
Modules Loaded         parport_pc lp parport rtc usbcore


[7.2.] Processor information (from /proc/cpuinfo):
[7.3.] Module information (from /proc/modules):
[7.5.] PCI information ('lspci -vvv' as root)
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)

These doesn't seem particularly relevant, so I'm putting them, as well as
.config, at

 http://manetheren.eigenray.com/~fortytwo/bad_write_info

Don't hesitate to ask me for more info or to try a patch or something.


[7.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):

In 2.4.0 and -ac2, I was getting the reset_xmit_timer() messages that -ac3
fixed.  I'm also getting TCP peer shrinks window messages, but had none
this boot before this error.

-- 
Paul Cassella

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

             reply	other threads:[~2001-01-08  5:57 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-01-08  5:55 Paul Cassella [this message]
2001-01-08  5:58 ` 2.4.0-ac3 write() to tcp socket returning errno of -3 (ESRCH: "No such process") David S. Miller
2001-01-08  7:16   ` Paul Cassella
2001-01-08  7:05     ` David S. Miller
2001-01-08  6:35 ` Andi Kleen
2001-01-08  7:54   ` Paul Cassella
     [not found] <3A5B0A59.1EB60A88@uow.edu.au>
2001-01-09 14:03 ` Paul Cassella
2001-01-11 16:45   ` Paul Cassella
2001-01-11 17:18     ` David S. Miller
2001-01-12 10:52       ` 2.4.0-ac3 write() to tcp socket returning errno of -3 (ESRCH:"No " Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.21.0101072217440.703-100000@localhost \
    --to=pwc@speakeasy.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-net@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).