From: Alan Burlison <Alan.Burlison@oracle.com>
To: Al Viro <viro@ZenIV.linux.org.uk>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	netdev@vger.kernel.org, dholland-tech@netbsd.org,
	Casper Dik <casper.dik@oracle.com>
Subject: Re: Fw: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect for sockets in accept(3)
Date: Wed, 21 Oct 2015 15:38:51 +0100
Message-ID: <5627A37B.4090208@oracle.com>
In-Reply-To: <20151021034950.GL22011@ZenIV.linux.org.uk>

On 21/10/2015 04:49, Al Viro wrote:

Firstly, thank you for the comprehensive and considered reply.

> Refcount is an implementation detail, of course.  However, in any Unix I know
> of, there are two separate notions - descriptor losing connection to opened
> file (be it from close(), exit(), execve(), dup2(), etc.) and opened file
> getting closed.

Yep, it's an implementation detail inside the kernel; Solaris also keeps
a refcount in its vnodes. However, that is only dimly visible at the
process level, where all you have is an integer file descriptor.

> The latter cannot happen while there are descriptors connected to the
> file in question, of course.  However, that is not the only thing
> that might prevent an opened file from getting closed - e.g. sending an
> SCM_RIGHTS datagram with attached descriptor connected to the opened file
> in question *at* *the* *moment* *of* *sendmsg(2)* will carry said opened
> file until it is successfully received or discarded (in the former case
> recipient will get a new descriptor referring to that opened file, of course).
> Having the original descriptor closed right after sendmsg(2) does *not*
> do anything to opened file.  On any Unix that implements descriptor-passing.
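
A minimal sketch of the quoted point, assuming Python 3.9+ on a Unix
system (socket.send_fds and socket.recv_fds wrap sendmsg/recvmsg with
SCM_RIGHTS). The function name and the pipe standing in for the "opened
file" are illustrative choices, not anything from this thread:

```python
import os
import socket


def pass_descriptor_demo():
    """Send a pipe's write end over a Unix socket, then close the original."""
    left, right = socket.socketpair()      # AF_UNIX pair on Unix systems
    r, w = os.pipe()
    os.set_blocking(r, False)              # lets us probe for EOF without hanging

    socket.send_fds(left, [b"fd"], [w])    # descriptor is now in flight
    os.close(w)                            # original descriptor is gone

    # If the opened file had been closed, read() would return b"" (EOF).
    try:
        os.read(r, 1)
        in_flight = "eof"
    except BlockingIOError:
        in_flight = "still open"

    _msg, fds, _flags, _addr = socket.recv_fds(right, 16, 1)
    os.write(fds[0], b"hello")             # new descriptor, same opened file
    data = os.read(r, 16)

    for fd in (fds[0], r):
        os.close(fd)
    left.close()
    right.close()
    return in_flight, data
```

The probe between close() and recv_fds() is the interesting part: the
pipe does not report EOF even though no descriptor in the process refers
to its write end at that moment.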

I believe in-flight async I/O is another way that a file can remain live
after a close(). From the close() section of IEEE Std 1003.1:

"An I/O operation that is not canceled completes as if the close() 
operation had not yet occurred"

> There's going to be a notion of "last close"; that's what this refcount is
> about and _that_ is more than implementation detail.

Yes, POSIX distinguishes between "file descriptor" and "file 
description" (ugh!) and the close() page says:

"When all file descriptors associated with an open file description have 
been closed, the open file description shall be freed."
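
A small sketch of that "last close" rule, assuming a Unix system and
using a pipe so that the freeing of the open file description is
observable as EOF on the read end (the function name is made up for
illustration):

```python
import os


def last_close_demo():
    """Show that only the last close() releases the open file description."""
    r, w = os.pipe()
    w2 = os.dup(w)                 # two descriptors, one open file description
    os.set_blocking(r, False)

    os.close(w)                    # not the last descriptor
    try:
        os.read(r, 1)
        after_first = "eof"
    except BlockingIOError:        # EAGAIN: a writer still exists
        after_first = "still open"

    os.close(w2)                   # last descriptor for the write end
    after_last = "eof" if os.read(r, 1) == b"" else "data"

    os.close(r)
    return after_first, after_last
```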

In the context of this discussion I believe it's the behaviour of the 
integer file descriptor that's the issue. Once it's had close() called 
on it then it's invalid, and any IO on it should fail, even if the 
underlying file description is still 'live'.
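
To make the distinction concrete, here is a hedged sketch (Unix only,
single-threaded so the descriptor number cannot be reused in between;
the function name is illustrative). The closed integer fails with EBADF
while a dup()ed descriptor for the same file description keeps working:

```python
import errno
import os


def closed_descriptor_demo():
    """I/O on a closed descriptor fails even though the file is still live."""
    r, w = os.pipe()
    w2 = os.dup(w)                 # keeps the file description alive
    os.close(w)                    # invalidates only the integer w

    try:
        os.write(w, b"x")
        err = None
    except OSError as exc:
        err = exc.errno            # expected: EBADF

    os.write(w2, b"ok")            # the underlying file is still usable
    data = os.read(r, 2)

    os.close(w2)
    os.close(r)
    return err, data
```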

> In other words, is that destruction of
> 	* any descriptor referring to this socket [utterly insane for obvious
> reasons]
> 	* the last descriptor referring to this socket (modulo descriptor
> passing, etc.) [a bitch to implement, unless we treat a syscall in progress
> as keeping the opened file open], or
> 	* _the_ descriptor used to issue accept(2) [a bitch to implement,
> with a lot of fun races in an already race-prone area]?

From reading the POSIX close() page I believe the second option is the
correct one.

> Additional question is whether it's
> 	* just a magical behaviour of close(2) [ugly], or
> 	* something that happens when descriptor gets dissociated from
> opened file [obviously more consistent]?

The second, I believe.

> BTW, for real fun, consider this:
> 7)
> // fd is a socket
> fd2 = dup(fd);
> in thread A: accept(fd);
> in thread B: accept(fd);
> in thread C: accept(fd2);
> in thread D: close(fd);
>
> Which threads (if any), should get hit where it hurts?

A & B should return from the accept with an error and C should continue,
which is what happens on Solaris.
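
For anyone who wants to try scenario 7 on their own system, a rough
harness follows. It is a sketch under assumptions: Python 3 on a Unix
system, a loopback TCP listener, and fixed sleeps to let threads settle.
It only reports what each accept() thread was doing shortly after
close(fd); it does not assert any particular outcome for A and B, since
that is exactly what differs between systems. All names are illustrative.

```python
import os
import socket
import threading
import time


def accept_close_demo(settle=0.3):
    """Run scenario 7; report each accept() thread's state after close(fd)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    addr = listener.getsockname()
    fd = listener.detach()             # raw descriptor, as in the quoted example
    fd2 = os.dup(fd)                   # same socket, second descriptor

    outcome = {}

    def worker(name, descriptor):
        sock = socket.socket(fileno=descriptor)
        try:
            conn, _peer = sock.accept()
            conn.close()
            outcome[name] = "accepted"
        except OSError as exc:
            outcome[name] = "error %s" % exc.errno
        finally:
            sock.detach()              # never let this wrapper close the fd

    threads = {
        name: threading.Thread(target=worker, args=(name, d), daemon=True)
        for name, d in (("A", fd), ("B", fd), ("C", fd2))
    }
    for t in threads.values():
        t.start()
    time.sleep(settle)                 # let all three block in accept()

    os.close(fd)                       # thread D's close(fd)
    time.sleep(settle)
    state = {name: outcome.get(name, "still blocked") for name in threads}

    # Clean-up only: hand a connection to any thread that is still waiting.
    clients = [socket.create_connection(addr) for _ in threads]
    for t in threads.values():
        t.join(timeout=2)
    for c in clients:
        c.close()
    os.close(fd2)
    return state
```

Calling accept_close_demo() returns a dict keyed by "A", "B" and "C".
On a system with the Solaris behaviour described here, A and B would be
expected to show an error and C to show "still blocked".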

> I have no idea what semantics does Solaris have in that area and how racy
> their descriptor table handling is.  And no, I'm not going to RTFS their
> kernel, CDDL being what it is.

I can answer that for you :-) I've looked through the appropriate bits 
of the Solaris kernel code and my colleague Casper has written an 
excellent summary of what happens, so with his permission I've just 
copied it verbatim below:

----------
Since at least Solaris 7 (1998), a thread which is sleeping
on a file descriptor which is being closed by another thread,
will be woken up.

To this end each thread keeps a list of file descriptors
in use by the current active system call.

When a file descriptor is closed and this file descriptor
is marked as being in use by other threads, the kernel
will search all threads to see which have this file descriptor
listed as in use. For each such thread, the kernel tells
the thread that its active fds list is now stale and, if
possible, makes the thread run.

While this algorithm is pretty expensive, it is not often invoked.

The thread running close() will NOT return until all other threads
using that file descriptor have released it.

When run, the thread will return from its syscall and will in most cases
return EBADF. A second thread trying to close this same file descriptor
may return earlier with close() returning EBADF.
----------
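
The summary above can be modelled in user space roughly as follows. This
is a toy, not Solaris kernel code: the class, its method names and the
use of a single condition variable are all assumptions made for
illustration. It tracks which threads are "in a syscall" on a
descriptor, marks them stale on close(), wakes them, and makes close()
wait until they have all released the descriptor.

```python
import errno
import threading
import time


class FdTable:
    """Toy model of close() waking threads that sleep on a descriptor."""

    def __init__(self):
        self._cond = threading.Condition()
        self._open = set()
        self._users = {}       # fd -> idents of threads inside a "syscall"
        self._stale = set()    # idents whose active-fd list is now stale

    def open(self, fd):
        with self._cond:
            self._open.add(fd)

    def waiters(self, fd):
        with self._cond:
            return len(self._users.get(fd, ()))

    def blocking_call(self, fd):
        """Sleep on fd until another thread closes it; returns an errno."""
        me = threading.get_ident()
        with self._cond:
            if fd not in self._open:
                return errno.EBADF
            self._users.setdefault(fd, set()).add(me)
            while me not in self._stale:
                self._cond.wait()
            self._stale.discard(me)
            self._users[fd].discard(me)    # release the descriptor
            self._cond.notify_all()        # let close() re-check
            return errno.EBADF

    def close(self, fd):
        with self._cond:
            if fd not in self._open:
                return errno.EBADF         # a second closer returns early
            self._open.discard(fd)
            self._stale |= self._users.get(fd, set())
            self._cond.notify_all()        # "make the thread run"
            while self._users.get(fd):     # wait for every user to let go
                self._cond.wait()
            return 0


def model_demo():
    table = FdTable()
    table.open(3)
    results = []
    sleepers = [
        threading.Thread(target=lambda: results.append(table.blocking_call(3)))
        for _ in range(2)
    ]
    for t in sleepers:
        t.start()
    while table.waiters(3) < 2:            # wait until both are asleep on fd 3
        time.sleep(0.01)
    first = table.close(3)
    for t in sleepers:
        t.join()
    second = table.close(3)
    return first, sorted(results), second
```

In model_demo() the first close() returns 0 only after both sleepers
have been woken and have released descriptor 3; each sleeper returns
EBADF, and a second close() of the same descriptor returns EBADF as well.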

-- 
Alan Burlison
--
