From: Bjorn Helgaas <helgaas@kernel.org>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Alex G." <mr.nuke.me@gmail.com>, Tal Gilboa <talgi@mellanox.com>,
	linux-pci@vger.kernel.org, bhelgaas@google.com,
	jakub.kicinski@netronome.com, keith.busch@intel.com,
	alex_gagniuc@dellteam.com, austin_bolen@dell.com,
	shyam_iyer@dell.com, Ariel Elior <ariel.elior@cavium.com>,
	everest-linux-l2@cavium.com,
	"David S. Miller" <davem@davemloft.net>,
	Michael Chan <michael.chan@broadcom.com>,
	Ganesh Goudar <ganeshgr@chelsio.com>,
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	Saeed Mahameed <saeedm@mellanox.com>,
	Dirk van der Merwe <dirk.vandermerwe@netronome.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org, linux-rdma@vger.kernel.org,
	oss-drivers@netronome.com
Subject: Re: [PATCH v6 8/9] net/mlx5: Do not call pcie_print_link_status()
Date: Thu, 9 Aug 2018 09:02:55 -0500
Message-ID: <20180809140255.GG49411@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20180808172736.GB13378@mtr-leonro.mtl.com>

On Wed, Aug 08, 2018 at 08:27:36PM +0300, Leon Romanovsky wrote:
> On Wed, Aug 08, 2018 at 11:33:51AM -0500, Alex G. wrote:
> >
> >
> > On 08/08/2018 10:56 AM, Tal Gilboa wrote:
> > > On 8/8/2018 6:41 PM, Leon Romanovsky wrote:
> > > > On Wed, Aug 08, 2018 at 05:23:12PM +0300, Tal Gilboa wrote:
> > > > > On 8/8/2018 9:08 AM, Leon Romanovsky wrote:
> > > > > > On Mon, Aug 06, 2018 at 06:25:42PM -0500, Alexandru Gagniuc wrote:
> > > > > > > This is now done by the PCI core to warn of sub-optimal bandwidth.
> > > > > > >
> > > > > > > Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
> > > > > > > ---
> > > > > > >    drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ----
> > > > > > >    1 file changed, 4 deletions(-)
> > > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> > > > > >
> > > > >
> > > > > Alex,
> > > > > > I loaded the mlx5 driver with and without this series. The report
> > > > > > in dmesg is now missing. From what I understood, the status should
> > > > > > be reported at least once, even if everything is in order.
> > > >
> > > > That is not what this series does; it removes the prints completely if
> > > > the fabric can deliver more than the card is capable of.
> > > >
> > > > > We need this functionality to stay.
> > > >
> > > > I'm not sure that you need this information in driver's dmesg output,
> > > > but most probably something globally visible and accessible per-pci
> > > > device.
> > >
> > > Currently we have users that look for it. If we remove the dmesg print
> > > we need this to be reported elsewhere. Adding it to sysfs for example
> > > should be a valid solution for our case.
> >
> > I think a stop-gap measure is to leave the pcie_print_link_status() call in
> > drivers that really need it for whatever reason. Implementing a reliable
> > reporting through sysfs might take some tinkering, and I don't think it's a
> > sufficient reason to block the heart of this series -- being able to detect
> > bottlenecks and link downtraining.
> 
> IMHO, you made the right change, and it is better to replace this print with
> a more generic solution now, while you are at it, and not leave leftovers.

I'd like to make forward progress on this, so I propose we merge only
the PCI core change (patch 1/9) and drop the individual driver
changes.  That would mean:

  - We'll get a message from every NIC driver that calls
    pcie_print_link_status() as before.

  - We'll get a new message from the core for every downtrained link.

  - If a link leading to the NIC is downtrained, there will be
    duplicate messages.  Maybe that's overkill but it's not terrible.

I provisionally put the patch below on my pci/enumeration branch.
Objections?
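
To make the three cases above concrete, here is a small userspace sketch
(not kernel code) of the message logic.  link_status_msg() copies the
branch structure of __pcie_print_link_status() from the patch below;
enum link_msg, nic_message_count() and the bandwidth numbers used with
them are made up for illustration only.  It assumes the core call
(verbose = false) happens once at enumeration and the driver call
(verbose = true) once at probe.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch only; none of this is kernel API. */
enum link_msg {
	LINK_MSG_NONE,		/* nothing is printed */
	LINK_MSG_AVAILABLE,	/* "... available PCIe bandwidth (...)" */
	LINK_MSG_LIMITED,	/* "... limited by ... link at ..." */
};

/*
 * Same branch structure as __pcie_print_link_status() in the patch:
 * an unconstrained device is reported only when verbose is set, a
 * constrained device is always reported.
 */
static enum link_msg link_status_msg(unsigned int bw_avail,
				     unsigned int bw_cap, bool verbose)
{
	if (bw_avail >= bw_cap && verbose)
		return LINK_MSG_AVAILABLE;
	else if (bw_avail < bw_cap)
		return LINK_MSG_LIMITED;

	return LINK_MSG_NONE;
}

/*
 * Count the messages logged for one NIC: the PCI core call
 * (verbose = false) plus, if the driver still calls
 * pcie_print_link_status(), the driver call (verbose = true).
 */
static int nic_message_count(unsigned int bw_avail, unsigned int bw_cap,
			     bool driver_prints)
{
	int n = 0;

	if (link_status_msg(bw_avail, bw_cap, false) != LINK_MSG_NONE)
		n++;
	if (driver_prints &&
	    link_status_msg(bw_avail, bw_cap, true) != LINK_MSG_NONE)
		n++;

	return n;
}
```

With made-up values in Mb/s, nic_message_count(32000, 32000, true) is 1
(driver message only), nic_message_count(16000, 32000, true) is 2 (the
duplicate case), and nic_message_count(32000, 32000, false) is 0.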


commit c870cc8cbc4d79014f3daa74d1e412f32e42bf1b
Author: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Date:   Mon Aug 6 18:25:35 2018 -0500

    PCI: Check for PCIe Link downtraining
    
    When both ends of a PCIe Link are capable of a higher bandwidth than is
    currently in use, the Link is said to be "downtrained".  A downtrained Link
    may indicate hardware or configuration problems in the system, but it's
    hard to identify such Links from userspace.
    
    Refactor pcie_print_link_status() so it continues to always print PCIe
    bandwidth information, as several NIC drivers desire.
    
    Add a new internal __pcie_print_link_status() to emit a message only when a
    device's bandwidth is constrained by the fabric and call it from the PCI
    core for all devices, which identifies all downtrained Links.  It also
    emits messages for a few cases that are technically not downtrained, such
    as a x4 device in an open-ended x1 slot.
    
    Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
    [bhelgaas: changelog, move __pcie_print_link_status() declaration to
    drivers/pci/, rename pcie_check_upstream_link() to
    pcie_report_downtraining()]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 97acba712e4e..a84d341504a5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5264,14 +5264,16 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
 }
 
 /**
- * pcie_print_link_status - Report the PCI device's link speed and width
+ * __pcie_print_link_status - Report the PCI device's link speed and width
  * @dev: PCI device to query
+ * @verbose: Print info even when enough bandwidth is available
  *
- * Report the available bandwidth at the device.  If this is less than the
- * device is capable of, report the device's maximum possible bandwidth and
- * the upstream link that limits its performance to less than that.
+ * If the available bandwidth at the device is less than the device is
+ * capable of, report the device's maximum possible bandwidth and the
+ * upstream link that limits its performance.  If @verbose, always print
+ * the available bandwidth, even if the device isn't constrained.
  */
-void pcie_print_link_status(struct pci_dev *dev)
+void __pcie_print_link_status(struct pci_dev *dev, bool verbose)
 {
 	enum pcie_link_width width, width_cap;
 	enum pci_bus_speed speed, speed_cap;
@@ -5281,11 +5283,11 @@ void pcie_print_link_status(struct pci_dev *dev)
 	bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
 	bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed, &width);
 
-	if (bw_avail >= bw_cap)
+	if (bw_avail >= bw_cap && verbose)
 		pci_info(dev, "%u.%03u Gb/s available PCIe bandwidth (%s x%d link)\n",
 			 bw_cap / 1000, bw_cap % 1000,
 			 PCIE_SPEED2STR(speed_cap), width_cap);
-	else
+	else if (bw_avail < bw_cap)
 		pci_info(dev, "%u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)\n",
 			 bw_avail / 1000, bw_avail % 1000,
 			 PCIE_SPEED2STR(speed), width,
@@ -5293,6 +5295,17 @@ void pcie_print_link_status(struct pci_dev *dev)
 			 bw_cap / 1000, bw_cap % 1000,
 			 PCIE_SPEED2STR(speed_cap), width_cap);
 }
+
+/**
+ * pcie_print_link_status - Report the PCI device's link speed and width
+ * @dev: PCI device to query
+ *
+ * Report the available bandwidth at the device.
+ */
+void pcie_print_link_status(struct pci_dev *dev)
+{
+	__pcie_print_link_status(dev, true);
+}
 EXPORT_SYMBOL(pcie_print_link_status);
 
 /**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 70808c168fb9..ce880dab5bc8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -263,6 +263,7 @@ enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
 enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
 u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
 			   enum pcie_link_width *width);
+void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
 
 /* Single Root I/O Virtualization */
 struct pci_sriov {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index bc147c586643..387fc8ac54ec 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2231,6 +2231,25 @@ static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 	return dev;
 }
 
+static void pcie_report_downtraining(struct pci_dev *dev)
+{
+	if (!pci_is_pcie(dev))
+		return;
+
+	/* Look from the device up to avoid downstream ports with no devices */
+	if ((pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT) &&
+	    (pci_pcie_type(dev) != PCI_EXP_TYPE_LEG_END) &&
+	    (pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM))
+		return;
+
+	/* Multi-function PCIe devices share the same link/status */
+	if (PCI_FUNC(dev->devfn) != 0 || dev->is_virtfn)
+		return;
+
+	/* Print link status only if the device is constrained by the fabric */
+	__pcie_print_link_status(dev, false);
+}
+
 static void pci_init_capabilities(struct pci_dev *dev)
 {
 	/* Enhanced Allocation */
@@ -2266,6 +2285,8 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	/* Advanced Error Reporting */
 	pci_aer_init(dev);
 
+	pcie_report_downtraining(dev);
+
 	if (pci_probe_reset_function(dev) == 0)
 		dev->reset_fn = 1;
 }
