From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3AA9C433B4 for ; Mon, 17 May 2021 16:57:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B528861241 for ; Mon, 17 May 2021 16:57:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240625AbhEQQ7B (ORCPT ); Mon, 17 May 2021 12:59:01 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:33690 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236000AbhEQQ67 (ORCPT ); Mon, 17 May 2021 12:58:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1621270662; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=PWn0ER8ktL8CjGxRv5Ynu626ORHTDXR+dt3CTUoFfXU=; b=VvMX1LiTbpVHFM7EN0DZDlmpmmJV/omkhuGhec5WZxv4rDMS2Y2c1oVCvbwa34VFoJIYRf pmlxiDdIZOV+P34z/eWlV+iiC3wwJdr6NOBbakwXxHG9s5vGSDySUc5pnWq5x6HIv7kO34 jdyOmKPUuLF3vwEKnwM9g/EM1rZHrz0= Received: from mail-lf1-f71.google.com (mail-lf1-f71.google.com [209.85.167.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-140-1h8fvseCO_qJqfU5LNaqLw-1; Mon, 17 May 2021 12:57:40 -0400 X-MC-Unique: 1h8fvseCO_qJqfU5LNaqLw-1 Received: by mail-lf1-f71.google.com with SMTP id g16-20020ac24d900000b0290239d19e27ffso278963lfe.8 for ; Mon, 17 May 2021 09:57:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=PWn0ER8ktL8CjGxRv5Ynu626ORHTDXR+dt3CTUoFfXU=; b=hm9mpW0B6mom/12mrmens3BgZ/bxdmQY0eENk/7PTTpy++zK881XtbVaOVpygg9GkJ 0UpHdKpyQvzhN4HvaLT/luwYQd2Ve2co/hPHC1jj1cRfpQb5FA+tSzo05ZG8dirxboBA cM4VRDb4NuDpuPPKwRL2QtgSEmBsvxIeNKQzl5utkecmaFFbi1p7eypHnuQXJjl+8H+/ eoynnySEuJfycUG7Uh+mf/uBdNSr+opskidyYtYBGX40UOkGxUx4PUgxttyYWuiQzqOg Qefh1GzHV2VrB/EKJQv3s7SdF5FukGOB2zHSxcatkL15nzlGedy5xVpdElZnoJb6TmO0 gwYA== X-Gm-Message-State: AOAM533y9aNl9KQRZlXhG8P45EpwlFnlud7wqCj2NWNfG5XncEDXwTxX qJWisnopFzvoQcO98DGfHLxfNu/uB+7sHlkXG1wPoTEI9l1MFFZCxTXI5PPVv6J2HXsaFngQbOI A3hC1bTGPuJMtW/SAf/Hat/JmgdAkKORt76vfpv9l X-Received: by 2002:ac2:51ce:: with SMTP id u14mr93676lfm.252.1621270658863; Mon, 17 May 2021 09:57:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzyffO4JpeGGpWTWQd6pBCcoqe0gRG4gxJB9gvjMRQo0SkkR936sREfwD/ThH7qQxY0d+BxlHTLIVDrxnuu2xs= X-Received: by 2002:ac2:51ce:: with SMTP id u14mr93653lfm.252.1621270658676; Mon, 17 May 2021 09:57:38 -0700 (PDT) MIME-Version: 1.0 References: <20210501021832.743094-1-jesse.brandeburg@intel.com> <16d8ca67-30c6-bb4b-8946-79de8629156e@arm.com> <20210504092340.00006c61@intel.com> In-Reply-To: <20210504092340.00006c61@intel.com> From: Nitesh Lal Date: Mon, 17 May 2021 12:57:26 -0400 Message-ID: Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint To: Jesse Brandeburg , Thomas Gleixner , Robin Murphy , "frederic@kernel.org" , "juri.lelli@redhat.com" , Marcelo Tosatti Cc: Ingo Molnar , linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org, jbrandeb@kernel.org, Alex Belits , "linux-api@vger.kernel.org" , "bhelgaas@google.com" , "linux-pci@vger.kernel.org" , "rostedt@goodmis.org" , "peterz@infradead.org" , "davem@davemloft.net" , "akpm@linux-foundation.org" , "sfr@canb.auug.org.au" , "stephen@networkplumber.org" , "rppt@linux.vnet.ibm.com" , "jinyuqi@huawei.com" , "zhangshaokun@hisilicon.com" , netdev@vger.kernel.org, chris.friesen@windriver.com, Marc Zyngier Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 4, 2021 at 12:25 PM Jesse Brandeburg wrote: > > Robin Murphy wrote: > > > On 2021-05-01 03:18, Jesse Brandeburg wrote: > > > It was pointed out by Nitesh that the original work I did in 2014 > > > to automatically set the interrupt affinity when requesting a > > > mask is no longer necessary. The kernel has moved on and no > > > longer has the original problem, BUT the original patch > > > introduced a subtle bug when booting a system with reserved or > > > excluded CPUs. Drivers calling this function with a mask value > > > that included a CPU that was currently or in the future > > > unavailable would generally not update the hint. > > > > > > I'm sure there are a million ways to solve this, but the simplest > > > one is to just remove a little code that tries to force the > > > affinity, as Nitesh has shown it fixes the bug and doesn't seem > > > to introduce immediate side effects. > > > > Unfortunately, I think there are quite a few other drivers now relying > > on this behaviour, since they are really using irq_set_affinity_hint() > > as a proxy for irq_set_affinity(). Partly since the latter isn't > > exported to modules, but also I have a vague memory of it being said > > that it's nice to update the user-visible hint to match when the > > affinity does have to be forced to something specific. > > > > Robin. > > Thanks for your feedback Robin, but there is definitely a bug here that > is being exposed by this code. The fact that people are using this > function means they're all exposed to this bug. > > Not sure if you saw, but this analysis from Nitesh explains what > happened chronologically to the kernel w.r.t this code, it's a useful > analysis! [1] > > I'd add in addition that irqbalance daemon *stopped* paying attention > to hints quite a while ago, so I'm not quite sure what purpose they > serve. > > [1] > https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA@mail.gmail.com/ > Wanted to follow up to see if there are any more objections or even suggestions to take this forward? -- Thanks Nitesh From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nitesh Lal Date: Mon, 17 May 2021 12:57:26 -0400 Subject: [Intel-wired-lan] [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint In-Reply-To: <20210504092340.00006c61@intel.com> References: <20210501021832.743094-1-jesse.brandeburg@intel.com> <16d8ca67-30c6-bb4b-8946-79de8629156e@arm.com> <20210504092340.00006c61@intel.com> Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: intel-wired-lan@osuosl.org List-ID: On Tue, May 4, 2021 at 12:25 PM Jesse Brandeburg wrote: > > Robin Murphy wrote: > > > On 2021-05-01 03:18, Jesse Brandeburg wrote: > > > It was pointed out by Nitesh that the original work I did in 2014 > > > to automatically set the interrupt affinity when requesting a > > > mask is no longer necessary. The kernel has moved on and no > > > longer has the original problem, BUT the original patch > > > introduced a subtle bug when booting a system with reserved or > > > excluded CPUs. Drivers calling this function with a mask value > > > that included a CPU that was currently or in the future > > > unavailable would generally not update the hint. > > > > > > I'm sure there are a million ways to solve this, but the simplest > > > one is to just remove a little code that tries to force the > > > affinity, as Nitesh has shown it fixes the bug and doesn't seem > > > to introduce immediate side effects. > > > > Unfortunately, I think there are quite a few other drivers now relying > > on this behaviour, since they are really using irq_set_affinity_hint() > > as a proxy for irq_set_affinity(). Partly since the latter isn't > > exported to modules, but also I have a vague memory of it being said > > that it's nice to update the user-visible hint to match when the > > affinity does have to be forced to something specific. > > > > Robin. > > Thanks for your feedback Robin, but there is definitely a bug here that > is being exposed by this code. The fact that people are using this > function means they're all exposed to this bug. > > Not sure if you saw, but this analysis from Nitesh explains what > happened chronologically to the kernel w.r.t this code, it's a useful > analysis! [1] > > I'd add in addition that irqbalance daemon *stopped* paying attention > to hints quite a while ago, so I'm not quite sure what purpose they > serve. > > [1] > https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA at mail.gmail.com/ > Wanted to follow up to see if there are any more objections or even suggestions to take this forward? -- Thanks Nitesh