netdev.vger.kernel.org archive mirror
* Re: [PATCH net-next v2 2/2] unix: Show number of pending scm files of receive queue in fdinfo
       [not found] <157588582628.223723.6787992203555637280.stgit@localhost.localdomain>
@ 2020-02-24 10:15 ` Paolo Abeni
  2020-02-25  8:07   ` Kirill Tkhai
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2020-02-24 10:15 UTC (permalink / raw)
  To: Kirill Tkhai, netdev; +Cc: Willem de Bruijn

Hi,

On Mon, 2019-12-09 at 10:03 +0000, Kirill Tkhai wrote:
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index 3426d6dacc45..17e10fba2152 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -41,6 +41,10 @@ struct unix_skb_parms {
>  	u32			consumed;
>  } __randomize_layout;
>  
> +struct scm_stat {
> +	u32 nr_fds;
> +};
> +

I'd like to drop the 'destructor' argument from
__skb_try_recv_datagram() and friends - that will both clean up the
datagram code a bit and avoid an indirect call in the fast path.

unix_dgram_recvmsg() needs special care: with the proposed change
scm_stat_del() will be called explicitly after __skb_try_recv_datagram(),
while 'nr_fds' must be updated under the receive queue lock.

Any of the following should work:
- change 'nr_fds' to an atomic type, and drop all the lockdep stuff
- re-acquire the receive queue spinlock before calling
scm_stat_del(), possibly doing that only if 'UNIXCB(skb).fp' is set
- open-code a variant of __skb_try_recv_datagram() which takes care
of scm_stat_del() under the receive queue lock.

Do you have any preferences? If you don't plan to add more fields to
'struct scm_stat', I would go for the first option.
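
Just to make the first option concrete, here is an untested, purely
illustrative sketch - the exact shape is hypothetical, but it only
relies on the plain atomic_t helpers:

struct scm_stat {
	atomic_t nr_fds;
};

static void scm_stat_add(struct sock *sk, struct sk_buff *skb)
{
	struct scm_fp_list *fp = UNIXCB(skb).fp;
	struct unix_sock *u = unix_sk(sk);

	/* atomic counter: neither the receive queue lock nor the
	 * lockdep annotation is needed for the accounting anymore
	 */
	if (unlikely(fp && fp->count))
		atomic_add(fp->count, &u->scm_stat.nr_fds);
}

static void scm_stat_del(struct sock *sk, struct sk_buff *skb)
{
	struct scm_fp_list *fp = UNIXCB(skb).fp;
	struct unix_sock *u = unix_sk(sk);

	if (unlikely(fp && fp->count))
		atomic_sub(fp->count, &u->scm_stat.nr_fds);
}

unix_show_fdinfo() would then read the counter with
atomic_read(&u->scm_stat.nr_fds) instead of READ_ONCE(), and the
send/receive paths would no longer need to extend the
sk_receive_queue.lock critical sections just for the accounting.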

Thanks!

Paolo



* Re: [PATCH net-next v2 2/2] unix: Show number of pending scm files of receive queue in fdinfo
  2020-02-24 10:15 ` [PATCH net-next v2 2/2] unix: Show number of pending scm files of receive queue in fdinfo Paolo Abeni
@ 2020-02-25  8:07   ` Kirill Tkhai
  0 siblings, 0 replies; 3+ messages in thread
From: Kirill Tkhai @ 2020-02-25  8:07 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: Willem de Bruijn

Hi,

On 24.02.2020 13:15, Paolo Abeni wrote:
> Hi,
> 
> On Mon, 2019-12-09 at 10:03 +0000, Kirill Tkhai wrote:
>> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
>> index 3426d6dacc45..17e10fba2152 100644
>> --- a/include/net/af_unix.h
>> +++ b/include/net/af_unix.h
>> @@ -41,6 +41,10 @@ struct unix_skb_parms {
>>  	u32			consumed;
>>  } __randomize_layout;
>>  
>> +struct scm_stat {
>> +	u32 nr_fds;
>> +};
>> +
> 
> I'd like to drop the 'destructor' argument from
> __skb_try_recv_datagram() and friends - that will both clean up the
> datagram code a bit and avoid an indirect call in the fast path.
> 
> unix_dgram_recvmsg() needs special care: with the proposed change
> scm_stat_del() will be called explicitly after __skb_try_recv_datagram(),
> while 'nr_fds' must be updated under the receive queue lock.
> 
> Any of the following should work:
> - change 'nr_fds' to an atomic type, and drop all the lockdep stuff
> - re-acquire the receive queue spinlock before calling
> scm_stat_del(), possibly doing that only if 'UNIXCB(skb).fp' is set
> - open-code a variant of __skb_try_recv_datagram() which takes care
> of scm_stat_del() under the receive queue lock.
> 
> Do you have any preferences? If you don't plan to add more fields to
> 'struct scm_stat', I would go for the first option.

The first option looks best to me.

Kirill


* [PATCH net-next v2 2/2] unix: Show number of pending scm files of receive queue in fdinfo
  2019-12-09 10:03 [PATCH net-next v2 0/2] unix: Show number of scm files " Kirill Tkhai
@ 2019-12-09 10:03 ` Kirill Tkhai
  0 siblings, 0 replies; 3+ messages in thread
From: Kirill Tkhai @ 2019-12-09 10:03 UTC (permalink / raw)
  To: netdev
  Cc: davem, axboe, pankaj.laxminarayan.bharadiya, keescook, viro,
	hare, tglx, edumazet, arnd, ktkhai

Unix sockets are like a black box: you never know what is stored
there. There may be a file descriptor holding a mount or a block
device, or there may be whole universes with namespaces, sockets
with receive queues full of other sockets, etc.

This patch adds a bit of debugging support: it accounts the number
of files (non-recursively) sitting in the receive queue of a unix
socket. Sometimes this is useful to determine which socket should
be investigated, or which task should be killed to release a
reference on a resource.
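
The counter shows up in /proc/<pid>/fdinfo/<fd> as an 'scm_fds'
line, e.g. for a socket with two in-flight file descriptors queued
(hypothetical values):

	# grep scm_fds /proc/$pid/fdinfo/$fd
	scm_fds: 2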

v2: Pass correct argument to lockdep

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/af_unix.h |    5 ++++
 net/unix/af_unix.c    |   56 +++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index 3426d6dacc45..17e10fba2152 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -41,6 +41,10 @@ struct unix_skb_parms {
 	u32			consumed;
 } __randomize_layout;
 
+struct scm_stat {
+	u32 nr_fds;
+};
+
 #define UNIXCB(skb)	(*(struct unix_skb_parms *)&((skb)->cb))
 
 #define unix_state_lock(s)	spin_lock(&unix_sk(s)->lock)
@@ -65,6 +69,7 @@ struct unix_sock {
 #define UNIX_GC_MAYBE_CYCLE	1
 	struct socket_wq	peer_wq;
 	wait_queue_entry_t	peer_wake;
+	struct scm_stat		scm_stat;
 };
 
 static inline struct unix_sock *unix_sk(const struct sock *sk)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f0a074356012..71d2aa83911a 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -676,6 +676,16 @@ static int unix_set_peek_off(struct sock *sk, int val)
 	return 0;
 }
 
+static void unix_show_fdinfo(struct seq_file *m, struct socket *sock)
+{
+	struct sock *sk = sock->sk;
+	struct unix_sock *u;
+
+	if (sk) {
+		u = unix_sk(sock->sk);
+		seq_printf(m, "scm_fds: %u\n", READ_ONCE(u->scm_stat.nr_fds));
+	}
+}
 
 static const struct proto_ops unix_stream_ops = {
 	.family =	PF_UNIX,
@@ -701,6 +711,7 @@ static const struct proto_ops unix_stream_ops = {
 	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
 	.set_peek_off =	unix_set_peek_off,
+	.show_fdinfo =	unix_show_fdinfo,
 };
 
 static const struct proto_ops unix_dgram_ops = {
@@ -726,6 +737,7 @@ static const struct proto_ops unix_dgram_ops = {
 	.mmap =		sock_no_mmap,
 	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
+	.show_fdinfo =	unix_show_fdinfo,
 };
 
 static const struct proto_ops unix_seqpacket_ops = {
@@ -751,6 +763,7 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.mmap =		sock_no_mmap,
 	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
+	.show_fdinfo =	unix_show_fdinfo,
 };
 
 static struct proto unix_proto = {
@@ -788,6 +801,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
 	mutex_init(&u->bindlock); /* single task binding lock */
 	init_waitqueue_head(&u->peer_wait);
 	init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay);
+	memset(&u->scm_stat, 0, sizeof(struct scm_stat));
 	unix_insert_socket(unix_sockets_unbound(sk), sk);
 out:
 	if (sk == NULL)
@@ -1572,6 +1586,28 @@ static bool unix_skb_scm_eq(struct sk_buff *skb,
 	       unix_secdata_eq(scm, skb);
 }
 
+static void scm_stat_add(struct sock *sk, struct sk_buff *skb)
+{
+	struct scm_fp_list *fp = UNIXCB(skb).fp;
+	struct unix_sock *u = unix_sk(sk);
+
+	lockdep_assert_held(&sk->sk_receive_queue.lock);
+
+	if (unlikely(fp && fp->count))
+		u->scm_stat.nr_fds += fp->count;
+}
+
+static void scm_stat_del(struct sock *sk, struct sk_buff *skb)
+{
+	struct scm_fp_list *fp = UNIXCB(skb).fp;
+	struct unix_sock *u = unix_sk(sk);
+
+	lockdep_assert_held(&sk->sk_receive_queue.lock);
+
+	if (unlikely(fp && fp->count))
+		u->scm_stat.nr_fds -= fp->count;
+}
+
 /*
  *	Send AF_UNIX data.
  */
@@ -1757,7 +1793,10 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (sock_flag(other, SOCK_RCVTSTAMP))
 		__net_timestamp(skb);
 	maybe_add_creds(skb, sock, other);
-	skb_queue_tail(&other->sk_receive_queue, skb);
+	spin_lock(&other->sk_receive_queue.lock);
+	scm_stat_add(other, skb);
+	__skb_queue_tail(&other->sk_receive_queue, skb);
+	spin_unlock(&other->sk_receive_queue.lock);
 	unix_state_unlock(other);
 	other->sk_data_ready(other);
 	sock_put(other);
@@ -1859,7 +1898,10 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 			goto pipe_err_free;
 
 		maybe_add_creds(skb, sock, other);
-		skb_queue_tail(&other->sk_receive_queue, skb);
+		spin_lock(&other->sk_receive_queue.lock);
+		scm_stat_add(other, skb);
+		__skb_queue_tail(&other->sk_receive_queue, skb);
+		spin_unlock(&other->sk_receive_queue.lock);
 		unix_state_unlock(other);
 		other->sk_data_ready(other);
 		sent += size;
@@ -2058,8 +2100,8 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg,
 		mutex_lock(&u->iolock);
 
 		skip = sk_peek_offset(sk, flags);
-		skb = __skb_try_recv_datagram(sk, flags, NULL, &skip, &err,
-					      &last);
+		skb = __skb_try_recv_datagram(sk, flags, scm_stat_del,
+					      &skip, &err, &last);
 		if (skb)
 			break;
 
@@ -2353,8 +2395,12 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
 
 			sk_peek_offset_bwd(sk, chunk);
 
-			if (UNIXCB(skb).fp)
+			if (UNIXCB(skb).fp) {
+				spin_lock(&sk->sk_receive_queue.lock);
+				scm_stat_del(sk, skb);
+				spin_unlock(&sk->sk_receive_queue.lock);
 				unix_detach_fds(&scm, skb);
+			}
 
 			if (unix_skb_len(skb))
 				break;



