From: Eric Dumazet <edumazet@google.com>
Subject: Re: [RFC] net: remove busylock
Date: Fri, 20 May 2016 07:16:55 -0700
To: Jesper Dangaard Brouer
Cc: Alexander Duyck, netdev, Alexander Duyck, John Fastabend, Jamal Hadi Salim
In-Reply-To: <1463752069.18194.294.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, 2016-05-20 at 06:47 -0700, Eric Dumazet wrote:
> On Fri, 2016-05-20 at 06:11 -0700, Eric Dumazet wrote:
> > On Fri, 2016-05-20 at 09:29 +0200, Jesper Dangaard Brouer wrote:
> > >
> > > The whole idea behind allowing bulk qdisc dequeue was to mitigate this,
> > > by allowing dequeue to do more work while holding the lock.
> > >
> > > You mention HTB. Notice HTB does not take advantage of bulk dequeue.
> > > Have you tried to enable/allow HTB to bulk dequeue?
> > >
> >
> > Well, __QDISC___STATE_RUNNING means exactly that: one cpu is dequeueing
> > many packets from the qdisc and transmitting them to the device.
> >
> > It is generic for any kind of qdisc.
> >
> > HTB bulk dequeue would have to call ->dequeue() multiple times. If you do
> > this while holding the qdisc spinlock, you block other cpus from doing
> > concurrent ->enqueue(), adding latencies (always the same trade-off...)
> >
> > HTB won't have separate protections for the ->enqueue() and the
> > ->dequeue() paths anytime soon. Have you looked at this monster? I did,
> > many times...
> >
> > Note that I am working on a patch to transform __QDISC___STATE_RUNNING
> > into a seqcount so that we can grab stats without holding the qdisc lock.
>
> Side note: __qdisc_run() could probably avoid a __netif_schedule()
> when it breaks the loop, if another cpu is busy spinning on the qdisc lock.
>
> -> Fewer (spurious) TX softirq invocations, so less chance to trigger the
> infamous ksoftirqd bug we discussed lately.

Also note that in our case, we have HTB on a bonding device, and FQ/pacing
on the slaves.

Since bonding pretends to be multiqueue, TCQ_F_ONETXQUEUE is not set on
sch->flags when HTB is installed at the bonding device root.

We might add a flag to tell the qdisc layer that a device is virtual and
could benefit from bulk dequeue, since the ultimate TX queue is located on
another (physical) netdev, eventually MQ enabled.