From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753921AbdIGGXn (ORCPT ); Thu, 7 Sep 2017 02:23:43 -0400
Received: from mail-oi0-f53.google.com ([209.85.218.53]:34068 "EHLO
	mail-oi0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751055AbdIGGXm (ORCPT );
	Thu, 7 Sep 2017 02:23:42 -0400
X-Google-Smtp-Source: ADKCNb7AiGPqrg5CaEyStyPO4PmhDrV5JY9WdtHNfBehwNnUG7XiwBLnApHHcvn8K/FzJiOYccIm+uhMDtcQOKrNQxI=
MIME-Version: 1.0
In-Reply-To:
References: <20170906041337.GC23250@localhost.localdomain>
	<20170906061545.GA20519@lst.de>
From: Dan Williams
Date: Wed, 6 Sep 2017 23:23:41 -0700
Message-ID:
Subject: Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
To: Thomas Gleixner
Cc: Christoph Hellwig, Yu Chen, X86 ML, Ingo Molnar, "H. Peter Anvin",
	Rui Zhang, LKML, "Rafael J. Wysocki", Len Brown, Peter Zijlstra,
	Jeff Kirsher
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 6, 2017 at 10:59 PM, Thomas Gleixner wrote:
> On Wed, 6 Sep 2017, Dan Williams wrote:
>
>> On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwig wrote:
>> > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote:
>> >> I agree, the driver could be rewritten, but it might take some time, so
>> >> meanwhile I'm looking at also other possible optimization.
>> >
>> > Which driver are we talking about anyway?  Let's start looking at it
>> > and fix the issue there.
>>
>> As far as I understand, it's already fixed there:
>>
>> commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1
>
> -ENOSUCHCOMMIT

Sorry, that's still pending in -next.

>> Author: Carolyn Wyborny
>> Date:   Tue Jun 20 15:16:53 2017 -0700
>>
>>     i40e: Fix for trace found with S4 state
>>
>>     This patch fixes a problem found in systems when entering
>>     S4 state.  This patch fixes the problem by ensuring that
>>     the misc vector's IRQ is disabled as well.  Without this
>>     patch a stack trace can be seen upon entering S4 state.
>>
>> However this seems like something that should be handled generically
>> in the irq-core especially since commit c5cb83bb337c
>> "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in
>> that direction. It's otherwise non-obvious when a driver needs to
>> release and re-acquire interrupts or be reworked to use managed
>> interrupts.
>
> There are two problems here:
>
>   1) The driver allocates 300 interrupts and uses exactly 8 randomly chosen
>      ones.
>
>   2) It's not using the managed affinity mechanics, so the interrupts cannot
>      be sanely handled by the kernel, neither affinity wise nor at hotplug
>      time.

Ok, this driver is an obvious candidate, but is there a general
guideline of when a driver must use affinity management? Should we be
emitting a message when a driver exceeds a certain threshold of
unmanaged interrupts to flag this in the future?
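
To make the threshold question concrete, here is a rough userspace sketch
in C of what such a check might look like. It is not kernel code and not
an existing irq-core interface: struct drv_irq_stats, account_irq() and
the threshold of 32 are all hypothetical names and values chosen only for
illustration. The thread does not propose a number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical threshold; no value is suggested anywhere in the thread. */
#define UNMANAGED_IRQ_WARN_THRESHOLD 32

/* Hypothetical per-driver bookkeeping, not a real kernel structure. */
struct drv_irq_stats {
	const char *name;
	unsigned int unmanaged;	/* vectors allocated without managed affinity */
	unsigned int managed;	/* vectors allocated with managed affinity */
	bool warned;		/* set once the message has been emitted */
};

/*
 * Account one newly allocated vector for a driver.
 *
 * Returns true only the first time the driver's unmanaged count goes
 * above the threshold, so the message is emitted once per driver.
 */
static bool account_irq(struct drv_irq_stats *drv, bool is_managed)
{
	assert(drv != NULL);

	if (is_managed) {
		drv->managed++;
		return false;
	}

	drv->unmanaged++;
	if (drv->unmanaged > UNMANAGED_IRQ_WARN_THRESHOLD && !drv->warned) {
		drv->warned = true;
		fprintf(stderr,
			"%s: %u unmanaged interrupts, consider managed affinity\n",
			drv->name, drv->unmanaged);
		return true;
	}
	return false;
}
```

With these assumed numbers, a driver that allocates 300 unmanaged vectors
would trigger the message once, when it allocates its 33rd, while a driver
that only allocates managed vectors would never trigger it.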