Subject: Re: [PATCH bpf v2 2/2] xsk: change the tx writeable condition
To: Xuan Zhuo, magnus.karlsson@gmail.com
Cc: Björn Töpel, Magnus Karlsson, Jonathan Lemon, "David S. Miller",
 Jakub Kicinski, Alexei Starovoitov, Jesper Dangaard Brouer,
 John Fastabend, Martin KaFai Lau, Song Liu, Yonghong Song,
 Andrii Nakryiko, KP Singh, "open list:XDP SOCKETS (AF_XDP)",
 "open list:XDP SOCKETS (AF_XDP)", open list
References: <4fd58d473f4548dc6e9e24ea9876c802d5d584b4.1606285978.git.xuanzhuo@linux.alibaba.com>
From: Daniel Borkmann
Message-ID: <15bae73e-e753-123a-7535-0ab5c1178b40@iogearbox.net>
Date: Fri, 27 Nov 2020 22:29:37 +0100
In-Reply-To: <4fd58d473f4548dc6e9e24ea9876c802d5d584b4.1606285978.git.xuanzhuo@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/25/20 7:48 AM, Xuan Zhuo wrote:
> Modify the tx writeable condition from the queue is not full to the
> number of present tx queues is less than the half of the total number
> of queues. Because the tx queue not full is a very short time, this will
> cause a large number of EPOLLOUT events, and cause a large number of
> process wake up.
>
> Signed-off-by: Xuan Zhuo

This one doesn't apply cleanly against bpf tree, please rebase.
Small comment inline while looking over the patch:

> ---
>  net/xdp/xsk.c       | 16 +++++++++++++---
>  net/xdp/xsk_queue.h |  6 ++++++
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 0df8651..22e35e9 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -211,6 +211,14 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len,
>  	return 0;
>  }
>
> +static bool xsk_tx_writeable(struct xdp_sock *xs)
> +{
> +	if (xskq_cons_present_entries(xs->tx) > xs->tx->nentries / 2)
> +		return false;
> +
> +	return true;
> +}
> +
>  static bool xsk_is_bound(struct xdp_sock *xs)
>  {
>  	if (READ_ONCE(xs->state) == XSK_BOUND) {
> @@ -296,7 +304,8 @@ void xsk_tx_release(struct xsk_buff_pool *pool)
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
>  		__xskq_cons_release(xs->tx);
> -		xs->sk.sk_write_space(&xs->sk);
> +		if (xsk_tx_writeable(xs))
> +			xs->sk.sk_write_space(&xs->sk);
>  	}
>  	rcu_read_unlock();
>  }
> @@ -499,7 +508,8 @@ static int xsk_generic_xmit(struct sock *sk)
>
>  out:
>  	if (sent_frame)
> -		sk->sk_write_space(sk);
> +		if (xsk_tx_writeable(xs))
> +			sk->sk_write_space(sk);
>
>  	mutex_unlock(&xs->mutex);
>  	return err;
> @@ -556,7 +566,7 @@ static __poll_t xsk_poll(struct file *file, struct socket *sock,
>
>  	if (xs->rx && !xskq_prod_is_empty(xs->rx))
>  		mask |= EPOLLIN | EPOLLRDNORM;
> -	if (xs->tx && !xskq_cons_is_full(xs->tx))
> +	if (xs->tx && xsk_tx_writeable(xs))
>  		mask |= EPOLLOUT | EPOLLWRNORM;
>
>  	return mask;
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index b936c46..b655004 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -307,6 +307,12 @@ static inline bool xskq_cons_is_full(struct xsk_queue *q)
>  		q->nentries;
>  }
>
> +static inline __u64 xskq_cons_present_entries(struct xsk_queue *q)

Types prefixed with __ are mainly for user-space facing things like uapi headers,
so in-kernel should be u64.
Is there a reason this is not done as u32 (and thus the same as producer and
consumer)?

> +{
> +	/* No barriers needed since data is not accessed */
> +	return READ_ONCE(q->ring->producer) - READ_ONCE(q->ring->consumer);
> +}
> +
>  /* Functions for producers */
>
>  static inline u32 xskq_prod_nb_free(struct xsk_queue *q, u32 max)
>
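
To illustrate the u32 question above, here is a minimal user-space sketch,
assuming free-running 32-bit producer and consumer indices and a power-of-two
ring size. struct demo_ring and the demo_* helpers are hypothetical stand-ins,
not the kernel's struct xsk_queue or its accessors. It only shows that the
difference of two 32-bit indices already gives the number of present entries
across index wraparound, and it applies the same half-full threshold as
xsk_tx_writeable() in the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring state; not the real struct xsk_queue. */
struct demo_ring {
	uint32_t producer;	/* free-running index, advanced by the producer */
	uint32_t consumer;	/* free-running index, advanced by the consumer */
	uint32_t nentries;	/* ring size, assumed to be a power of two */
};

/*
 * Entries produced but not yet consumed. Unsigned 32-bit subtraction is
 * modulo 2^32, so the result stays correct after producer wraps past
 * UINT32_MAX while consumer has not wrapped yet.
 */
static uint32_t demo_present_entries(const struct demo_ring *r)
{
	return r->producer - r->consumer;
}

/*
 * Same threshold as the quoted xsk_tx_writeable(): writeable as long as
 * no more than half of the ring is occupied.
 */
static bool demo_tx_writeable(const struct demo_ring *r)
{
	assert(r->nentries != 0 && (r->nentries & (r->nentries - 1)) == 0);
	return demo_present_entries(r) <= r->nentries / 2;
}
```

With producer = 3 and consumer = UINT32_MAX - 1, demo_present_entries() yields
5, so an 8-entry ring is reported as not writeable until one more entry has
been consumed.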