From: Jason Baron <jbaron@akamai.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Rainer Weikusat <rweikusat@talktalk.net>, netdev@vger.kernel.org
Subject: Re: [RFC] nasty corner case in unix_dgram_sendmsg()
Date: Wed, 27 Feb 2019 11:45:40 -0500
Message-ID: <59657502-5154-a2ff-ab5f-a432b217f9d6@akamai.com>
In-Reply-To: <20190226235912.GL2217@ZenIV.linux.org.uk>



On 2/26/19 6:59 PM, Al Viro wrote:
> On Tue, Feb 26, 2019 at 03:35:39PM -0500, Jason Baron wrote:
> 
>>> I understand what the unix_dgram_peer_wake_me() is doing; I understand
>>> what unix_dgram_poll() is using it for.  What I do not understand is
>>> what's the point of doing that in unix_dgram_sendmsg()...
>>>
>>
>> Hi,
>>
>> So the unix_dgram_peer_wake_me() in unix_dgram_sendmsg() is there for
>> epoll in edge-triggered mode. In that case, we want to ensure that if
>> -EAGAIN is returned a subsequent epoll_wait() is not stuck indefinitely.
>> Probably could use a comment...
> 
> *owwww*
> 
> Let me see if I've got it straight - you want the forwarding rearmed,
> so that it would match the behaviour of ep_poll_callback() (i.e.
> removing only when POLLFREE is passed)?  Looks like an odd way to
> do it, if that's what's happening...

If unix_dgram_sendmsg() returns -EAGAIN in this case, then a subsequent call
to poll()/select()/epoll_wait() is normally going to do the forwarding rearm
via unix_dgram_poll() (unless it's already writeable). However, in the
special case of edge-triggered epoll, the call to epoll_wait() does not
call unix_dgram_poll(), and thus the rearm has to happen in
unix_dgram_sendmsg().
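
To make the edge-triggered case concrete, here is a minimal userspace sketch
(not kernel code, and not taken from this thread). It assumes Linux, uses the
autobind feature from unix(7) to get a receiver address, and the helper names
make_pair() and edge_triggered_demo() are made up for illustration. The sender
consumes its initial EPOLLOUT edge, fills the receiver until send() fails with
EAGAIN, and then relies on epoll_wait() alone to learn that it can write again.

```c
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: a bound receiver plus a non-blocking sender that is
 * connect()ed to it.  The receiver is not connected back, which is the
 * n:1 case where the peer_wait forwarding is used.  Error paths leak fds
 * for brevity. */
static int make_pair(int *rx, int *tx)
{
	struct sockaddr_un addr;
	socklen_t len = sizeof(addr);

	*rx = socket(AF_UNIX, SOCK_DGRAM, 0);
	*tx = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
	if (*rx < 0 || *tx < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* Family-only bind: Linux autobinds to an abstract address. */
	if (bind(*rx, (struct sockaddr *)&addr, sizeof(sa_family_t)) < 0)
		return -1;
	if (getsockname(*rx, (struct sockaddr *)&addr, &len) < 0)
		return -1;
	return connect(*tx, (struct sockaddr *)&addr, len);
}

/* Returns 1 if EPOLLOUT is reported once the receiver drains its queue,
 * 0 if it is not, and -1 if the setup did not behave as assumed. */
int edge_triggered_demo(void)
{
	struct epoll_event ev = { .events = EPOLLOUT | EPOLLET }, out;
	char byte = 'x', buf[16];
	int rx, tx, ep, n;

	if (make_pair(&rx, &tx) < 0 || (ep = epoll_create1(0)) < 0)
		return -1;
	ev.data.fd = tx;
	if (epoll_ctl(ep, EPOLL_CTL_ADD, tx, &ev) < 0)
		return -1;
	/* Consume the initial "writable" edge. */
	if (epoll_wait(ep, &out, 1, 0) != 1)
		return -1;
	/* Fill the receiver until the non-blocking send fails. */
	while (send(tx, &byte, 1, 0) == 1)
		;
	if (errno != EAGAIN)
		return -1;
	/* The receiver makes room.  If the send failed because the receive
	 * queue was full, the forwarding armed by that failed send is what
	 * lets the next epoll_wait() see a new edge. */
	while (recv(rx, buf, sizeof(buf), MSG_DONTWAIT) > 0)
		;
	n = epoll_wait(ep, &out, 1, 1000);
	close(ep);
	close(tx);
	close(rx);
	return n == 1 && (out.events & EPOLLOUT) ? 1 : 0;
}
```

Depending on net.unix.max_dgram_qlen and the send buffer size, the EAGAIN may
come from the sender's own buffer limit rather than from a full receive queue;
the sketch drains everything so that either path ends with EPOLLOUT.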

> 
> While we are at it, why disarm a forwarder upon noticing that peer
> is dead?  Wouldn't it be simpler to move that
>         wake_up_interruptible_all(&u->peer_wait);
> in unix_release_sock() to just before
>         unix_state_unlock(sk);
> a line prior?  Then anyone seeing SOCK_DEAD on (locked) peer
> would be guaranteed that all forwarders are gone...
>

The condition we are checking here is unix_recvq_full(), so even if
the wakeup happens under the lock, we could end up waking a waiter
that still sees unix_recvq_full(), because the skbs aren't freed
until *after* the wakeup call. The race is described here:

51f7e95 af_unix: ensure POLLOUT on remote close() for connected dgram socket

Note that I did have an earlier version of that patch that moved
the wakeup call (instead of checking for SOCK_DEAD), see:
https://patchwork.ozlabs.org/patch/944593/

However, I thought that the explicit check for SOCK_DEAD made the
intent clearer, i.e. we don't wait on a SOCK_DEAD socket.
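
The user-visible outcome of "we don't wait on a SOCK_DEAD socket" can be
sketched from userspace as follows. This is an illustration under assumptions
(Linux, autobind per unix(7), made-up names make_pair(),
pollout_after_close_demo() and last_send_errno). Because it runs sequentially,
it shows only the end state, not the race itself: a sender stuck behind a full
receiver sees POLLOUT after the receiver closes, and the next send fails
instead of waiting.

```c
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* errno of the send attempted after the receiver has gone away */
int last_send_errno;

/* Hypothetical helper: bound receiver, non-blocking sender connect()ed to
 * it.  Error paths leak fds for brevity. */
static int make_pair(int *rx, int *tx)
{
	struct sockaddr_un addr;
	socklen_t len = sizeof(addr);

	*rx = socket(AF_UNIX, SOCK_DGRAM, 0);
	*tx = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
	if (*rx < 0 || *tx < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* Family-only bind: Linux autobinds to an abstract address. */
	if (bind(*rx, (struct sockaddr *)&addr, sizeof(sa_family_t)) < 0)
		return -1;
	if (getsockname(*rx, (struct sockaddr *)&addr, &len) < 0)
		return -1;
	return connect(*tx, (struct sockaddr *)&addr, len);
}

/* Returns 1 if the sender was not writable while the receiver was full and
 * became writable after the receiver closed, 0 otherwise, -1 on setup
 * problems. */
int pollout_after_close_demo(void)
{
	struct pollfd pfd = { .events = POLLOUT };
	char byte = 'x';
	int rx, tx, blocked, writable;

	if (make_pair(&rx, &tx) < 0)
		return -1;
	pfd.fd = tx;
	/* Fill the receiver until the non-blocking send fails. */
	while (send(tx, &byte, 1, 0) == 1)
		;
	if (errno != EAGAIN)
		return -1;
	/* While the receiver is alive and full, there is nothing to report. */
	blocked = poll(&pfd, 1, 0) == 0;
	/* The receiver goes away without reading anything. */
	close(rx);
	writable = poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLOUT);
	/* Writing to the dead peer fails right away rather than waiting. */
	last_send_errno = send(tx, &byte, 1, 0) < 0 ? errno : 0;
	close(tx);
	return blocked && writable;
}
```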

> Another fun question about the same dgram sendmsg:
>                 if (unix_peer(sk) == other) {
>                         unix_peer(sk) = NULL;
>                         unix_dgram_peer_wake_disconnect_wakeup(sk, other);
> 
>                         unix_state_unlock(sk);
> 
>                         unix_dgram_disconnected(sk, other);
> 
> ... and we are not holding any locks at the last line.  What happens
> if we have thread A doing
> 	decide which address to talk to
> 	connect(fd, that address)
> 	send request over fd (with send(2) or write(2))
> 	read reply from fd (recv(2) or read(2))
> in a loop, with thread B doing explicit sendto(2) over the same
> socket?
> 
> Suppose B happens to send to the last server thread A was talking
> to and finds it just closed (e.g. because the last request from
> A had been "shut down", which server has honoured).  B gets ECONNREFUSED,
> as it ought to, but it can also end up disrupting the next exchange
> of A.
> 
> Shouldn't we rather extract the skbs from that queue *before*
> dropping sk->lock?  E.g. move them to a temporary queue, and flush
> that queue after we'd unlocked sk...
> 

If I understand your concern: B drops the lock as above, then A does a
connect() to somewhere else, and then B ends up dropping skbs that came
from the new peer. Looks plausible. In general, I think A and B would
probably be coordinating if they are both reading/writing the same
socket, but it probably makes sense to fix this case. Note that
unix_dgram_disconnected() is also called in unix_dgram_connect() after
the lock is dropped, so that would need a similar fix.
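
The "move them to a temporary queue, then flush after unlocking" idea can be
sketched with a userspace analogy. This is not the kernel fix: a pthread mutex
stands in for the socket state lock, a singly linked list stands in for the
receive queue, and every name (struct endpoint, endpoint_detach_peer(),
flush_list(), run_demo(), demo_ep) is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

struct msg {
	struct msg *next;
};

struct endpoint {
	pthread_mutex_t lock;
	struct msg *queue;	/* messages received from the current peer */
	int peer;		/* id of the connected peer, -1 if none */
};

struct endpoint demo_ep = { PTHREAD_MUTEX_INITIALIZER, NULL, -1 };

void endpoint_connect(struct endpoint *ep, int peer)
{
	pthread_mutex_lock(&ep->lock);
	ep->peer = peer;
	pthread_mutex_unlock(&ep->lock);
}

int endpoint_enqueue(struct endpoint *ep)
{
	struct msg *m = malloc(sizeof(*m));

	if (!m)
		return -1;
	pthread_mutex_lock(&ep->lock);
	m->next = ep->queue;
	ep->queue = m;
	pthread_mutex_unlock(&ep->lock);
	return 0;
}

int endpoint_queue_len(struct endpoint *ep)
{
	struct msg *m;
	int n = 0;

	pthread_mutex_lock(&ep->lock);
	for (m = ep->queue; m; m = m->next)
		n++;
	pthread_mutex_unlock(&ep->lock);
	return n;
}

/* Forget a dead peer and take its queued messages onto a private list
 * while still holding the lock.  Returns that list (possibly NULL). */
struct msg *endpoint_detach_peer(struct endpoint *ep, int dead_peer)
{
	struct msg *stale = NULL;

	pthread_mutex_lock(&ep->lock);
	if (ep->peer == dead_peer) {
		ep->peer = -1;
		stale = ep->queue;
		ep->queue = NULL;
	}
	pthread_mutex_unlock(&ep->lock);
	return stale;
}

/* Free a private list outside the lock; returns how many were dropped. */
int flush_list(struct msg *list)
{
	struct msg *next;
	int dropped = 0;

	for (; list; list = next) {
		next = list->next;
		free(list);
		dropped++;
	}
	return dropped;
}

/* Replays the interleaving from the discussion in a single thread. */
int run_demo(void)
{
	struct msg *stale;
	int i;

	endpoint_connect(&demo_ep, 1);
	for (i = 0; i < 3; i++)
		endpoint_enqueue(&demo_ep);
	stale = endpoint_detach_peer(&demo_ep, 1);	/* "B", under the lock */
	endpoint_connect(&demo_ep, 2);			/* "A" reconnects */
	endpoint_enqueue(&demo_ep);			/* new peer's message */
	return flush_list(stale);			/* "B", after unlock */
}
```

Since the stale messages are picked while the lock is held, anything queued
after the reconnect is never on the private list, so the late flush cannot
touch it: run_demo() drops the three old messages and leaves the new one in
demo_ep.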

Thanks,

-Jason


