From mboxrd@z Thu Jan  1 00:00:00 1970
From: Eric Dumazet
Subject: [PATCH v2 net 1/3] net: bql: add __netdev_tx_sent_queue()
Date: Wed, 31 Oct 2018 08:39:12 -0700
Message-ID: <20181031153914.132127-2-edumazet@google.com>
References: <20181031153914.132127-1-edumazet@google.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Cc: netdev, Tariq Toukan, Willem de Bruijn, Eric Dumazet, Eric Dumazet
To: "David S . Miller"
In-Reply-To: <20181031153914.132127-1-edumazet@google.com>

When qdisc_run() tries to use the BQL budget to bulk-dequeue a batch
of packets, GSO can later transform this list into another list of
skbs, and each skb is sent to the device ndo_start_xmit(), one at a
time, with skb->xmit_more set to one for all but the last skb.

The problem is that, very often, the BQL limit is hit in the middle
of the packet train, forcing dev_hard_start_xmit() to stop the bulk
send and requeue the end of the list.

BQL's role is to avoid head-of-line blocking, making sure a qdisc
can deliver high priority packets before low priority ones.

But there is no way requeued packets can be bypassed by fresh
packets in the qdisc.

Aborting the bulk send increases TX softirqs, and hot cache lines
(after skb_segment()) are wasted.

Note that for TSO packets, we never split a packet in the middle
because the BQL limit is hit.

Drivers should be able to update BQL counters without flipping or
caring about BQL status, if the current skb has xmit_more set.

Upper layers are ultimately responsible for stopping the next packet
train when the BQL limit is hit.

The code template in a driver might look like the following:

	send_doorbell = __netdev_tx_sent_queue(tx_queue, nr_bytes, skb->xmit_more);

Note that use of __netdev_tx_sent_queue() is not mandatory, since the
following patch will change dev_hard_start_xmit() to not care about
BQL status. But it is highly recommended, so that the full benefits
of xmit_more can be reached (fewer doorbells sent, and fewer atomic
operations as well).

Signed-off-by: Eric Dumazet
---
 include/linux/netdevice.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index dc1d9ed33b3192e9406b17c3107b3235b28ff1b9..857f8abf7b91bc79731873fc8f68e31f6bff4d03 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3190,6 +3190,26 @@ static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue,
 #endif
 }
 
+/* Variant of netdev_tx_sent_queue() for drivers that are aware
+ * that they should not test BQL status themselves.
+ * We do want to change __QUEUE_STATE_STACK_XOFF only for the last
+ * skb of a batch.
+ * Returns true if the doorbell must be used to kick the NIC.
+ */
+static inline bool __netdev_tx_sent_queue(struct netdev_queue *dev_queue,
+					  unsigned int bytes,
+					  bool xmit_more)
+{
+	if (xmit_more) {
+#ifdef CONFIG_BQL
+		dql_queued(&dev_queue->dql, bytes);
+#endif
+		return netif_tx_queue_stopped(dev_queue);
+	}
+	netdev_tx_sent_queue(dev_queue, bytes);
+	return true;
+}
+
 /**
  * netdev_sent_queue - report the number of bytes queued to hardware
  * @dev: network device
-- 
2.19.1.930.g4563a0d9d0-goog
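
For context, a minimal sketch of how a driver's ndo_start_xmit() could
adopt this helper. The foo_* helpers and struct foo_priv are
hypothetical placeholders standing in for hardware-specific driver
code; they are not part of this patch:

	/* Sketch only: foo_post_descriptor() and foo_ring_doorbell()
	 * stand in for a real driver's TX-ring and doorbell code.
	 */
	static netdev_tx_t foo_start_xmit(struct sk_buff *skb,
					  struct net_device *dev)
	{
		struct foo_priv *priv = netdev_priv(dev);
		struct netdev_queue *txq;
		unsigned int nr_bytes = skb->len;
		bool send_doorbell;

		txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

		/* Post the skb to the hardware TX ring. */
		foo_post_descriptor(priv, skb);

		/* Update BQL counters. For an skb with xmit_more set,
		 * this only accounts the bytes, and requests a doorbell
		 * only if BQL already stopped the queue mid-train; for
		 * the last skb of a batch it behaves like
		 * netdev_tx_sent_queue() and always requests one.
		 */
		send_doorbell = __netdev_tx_sent_queue(txq, nr_bytes,
						       skb->xmit_more);
		if (send_doorbell)
			foo_ring_doorbell(priv);

		return NETDEV_TX_OK;
	}

With this pattern, a train of N skbs typically triggers a single
doorbell write instead of N, while BQL byte accounting stays exact and
the driver never tests BQL status itself for the intermediate skbs.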