LKML Archive on lore.kernel.org
From: Frederic Weisbecker <frederic@kernel.org>
To: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-pci@vger.kernel.org, mtosatti@redhat.com,
	sassmann@redhat.com, jeffrey.t.kirsher@intel.com,
	jacob.e.keller@intel.com, jlelli@redhat.com, hch@infradead.org,
	bhelgaas@google.com, mike.marciniszyn@intel.com,
	dennis.dalessandro@intel.com, thomas.lendacky@amd.com,
	jerinj@marvell.com, mathias.nyman@intel.com, jiri@nvidia.com
Subject: Re: [RFC][Patch v1 2/3] i40e: limit msix vectors based on housekeeping CPUs
Date: Tue, 22 Sep 2020 11:54:41 +0200
Message-ID: <20200922095440.GA5217@lenoir>
In-Reply-To: <65513ee8-4678-1f96-1850-0e13dbf1810c@redhat.com>

On Mon, Sep 21, 2020 at 11:08:20PM -0400, Nitesh Narayan Lal wrote:
> 
> On 9/21/20 6:58 PM, Frederic Weisbecker wrote:
> > On Thu, Sep 17, 2020 at 11:23:59AM -0700, Jesse Brandeburg wrote:
> >> Nitesh Narayan Lal wrote:
> >>
> >>> In a realtime environment, it is essential to isolate unwanted IRQs from
> >>> isolated CPUs to prevent latency overheads. Creating MSI-X vectors based
> >>> only on the online CPUs could lead to a potential issue on an RT setup
> >>> that has several isolated CPUs but very few housekeeping CPUs. In these
> >>> kinds of setups, an attempt to move the IRQs from the isolated CPUs to
> >>> the limited housekeeping CPUs might fail due to the per-CPU vector
> >>> limit. This could eventually result in latency spikes because of the IRQ
> >>> threads that we fail to move off the isolated CPUs.
> >>>
> >>> This patch prevents i40e from creating vectors based only on the online
> >>> CPUs, by also taking num_housekeeping_cpus() into account.
> >>>
> >>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
> >> The driver changes are straightforward, but this isn't the only driver
> >> with this issue, right?  I'm sure ixgbe and ice both have this problem
> >> too, you should fix them as well, at a minimum, and probably other
> >> vendors drivers:
> >>
> >> $ rg -c --stats num_online_cpus drivers/net/ethernet
> >> ...
> >> 50 files contained matches
> > Ouch, I was indeed surprised that these MSI vector allocations were done
> > at the driver level and not at some $SUBSYSTEM level.
> >
> > The logic is already there in the driver, so I wouldn't oppose this very
> > patch, but would a shared infrastructure make sense for this? Something
> > that would also handle hotplug operations?
> >
> > Does it possibly go even beyond networking drivers?
> 
> From a generic solution perspective, I think it makes sense to come up with a
> shared infrastructure.
> Something that can be consumed by all the drivers and maybe hotplug operations
> as well (I will have to further explore the hotplug part).

That would be great. I'm completely clueless about those MSI things and the
actual needs of those drivers. Now it seems to me that if several CPUs become
offline, or as is planned in the future, CPU isolation gets enabled/disabled
through cpuset, then the vectors may need some reorganization.

But I also don't want to push toward a complicated solution to handle CPU
hotplug if there is no actual problem to solve there. So I'll let you guys judge.

> However, there are RT workloads that are getting affected because of this
> issue, so does it make sense to go ahead with this per-driver approach
> for now?

Yep that sounds good.

> 
> A generic solution will require a fair amount of testing and an
> understanding of different drivers. Having said that, I can definitely
> start looking in that direction.

Thanks a lot!

Thread overview: 30+ messages
2020-09-09 15:08 [RFC] [PATCH v1 0/3] isolation: " Nitesh Narayan Lal
2020-09-09 15:08 ` [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs Nitesh Narayan Lal
2020-09-17 18:18   ` Jesse Brandeburg
2020-09-17 18:43     ` Nitesh Narayan Lal
2020-09-17 20:11   ` Bjorn Helgaas
2020-09-17 21:48     ` Jacob Keller
2020-09-17 22:09     ` Nitesh Narayan Lal
2020-09-21 23:40   ` Frederic Weisbecker
2020-09-22  3:16     ` Nitesh Narayan Lal
2020-09-22 10:08       ` Frederic Weisbecker
2020-09-22 13:50         ` Nitesh Narayan Lal
2020-09-22 20:58           ` Frederic Weisbecker
2020-09-22 21:15             ` Nitesh Narayan Lal
2020-09-22 21:26             ` Andrew Lunn
2020-09-22 22:20               ` Nitesh Narayan Lal
2020-09-09 15:08 ` [RFC][Patch v1 2/3] i40e: limit msix vectors based on housekeeping CPUs Nitesh Narayan Lal
2020-09-11 15:23   ` Marcelo Tosatti
2020-09-17 18:23   ` Jesse Brandeburg
2020-09-17 18:31     ` Nitesh Narayan Lal
2020-09-21 22:58     ` Frederic Weisbecker
2020-09-22  3:08       ` Nitesh Narayan Lal
2020-09-22  9:54         ` Frederic Weisbecker [this message]
2020-09-22 13:34           ` Nitesh Narayan Lal
2020-09-22 20:44             ` Frederic Weisbecker
2020-09-22 21:05               ` Nitesh Narayan Lal
2020-09-09 15:08 ` [RFC][Patch v1 3/3] PCI: Limit pci_alloc_irq_vectors as per " Nitesh Narayan Lal
2020-09-10 19:22   ` Marcelo Tosatti
2020-09-10 19:31     ` Nitesh Narayan Lal
2020-09-22 13:54       ` Nitesh Narayan Lal
2020-09-22 21:08         ` Frederic Weisbecker
