All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-kernel@vger.kernel.org, alexander.shishkin@linux.intel.com,
	linux-pci@vger.kernel.org, linuxarm@huawei.com
Subject: Re: [RFC PATCH] hwtracing: Add HiSilicon PCIe Tune and Trace device driver
Date: Thu, 16 Jul 2020 16:31:20 -0500	[thread overview]
Message-ID: <20200716213120.GA648781@bjorn-Precision-5520> (raw)
In-Reply-To: <6f945bfe-eaec-464e-9f01-8c76fb467a1d@hisilicon.com>

On Thu, Jul 16, 2020 at 05:06:19PM +0800, Yicong Yang wrote:
> On 2020/7/11 7:09, Bjorn Helgaas wrote:
> > On Sat, Jun 13, 2020 at 05:32:13PM +0800, Yicong Yang wrote:
> >> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
> >> integrated Endpoint(RCiEP) device, providing the capability
> >> to dynamically monitor and tune the PCIe traffic parameters(tune),
> >> and trace the TLP headers to the memory(trace).
> >>
> >> Add the driver for the device to enable its functions. The driver
> >> will create debugfs directory for each PTT device, and users can
> >> operate the device through the files under its directory.

> >> +Tune
> >> +====
> >> +
> >> +PTT tune is designed for monitoring and adjusting PCIe link parameters(events).
> >> +Currently we support events 4 classes. The scope of the events
> >> +covers the PCIe core with which the PTT device belongs to.
> > All of these look like things that have the potential to break the
> > PCIe protocol and cause errors like timeouts, receiver overflows, etc.
> > That's OK for a debug/analysis situation, but it should taint the
> > kernel somehow because I don't want to debug problems like that if
> > they're caused by users tweaking things.
> >
> > That might even be a reason *not* to merge the tune side of this.  I
> > can see how it might be useful for you internally, but it's not
> > obvious to me how it will benefit other users.  Maybe that part should
> > be an out-of-tree module?
> 
> All the tuning values are not accurate, but abstracted to several
> _levels_ of each events. The levels are delicately designed to
> guarantee by the hardware that they are always valid and will not
> break the PCIe link.  The possible level values exposed to the users
> is tested and safe and other values will not be accepted.
> 
> The final tuning events is not settled and we'll not exposed the
> events which will may lead to the link broken. Furthermore, maybe we
> could default disable the tune events' level adjustment and make
> them readonly. The user can enable the full tune function by a
> module parameters or in the BIOS, and a warning message will be
> displayed.
> 
> The tune part is beneficial for the users and not only for our
> internal use.  We intends to provide a way to tune the link
> depending on the downstream components and link configuration. For
> example, users can tune the data path QoS level to get better
> performance according to the link width is x8 or x16, or according
> to the endpoints' class is a network card or a nvme disk.  It will
> make our controller adapt to different condition with high
> performance, so we hope this feature to be merged.

OK.  This driver itself is outside my area, so I guess merging it is
up to Alexander.

Do you have any measurements of performance improvements?  I think it
would be good to have real numbers showing that this is useful.

You mentioned a warning message, so I assume you'll add some kind of
dmesg logging when tuning happens?

Is this protected so it's only usable by root or other appropriate
privileged user?

> >> +		 * The PTT can designate function for trace.
> >> +		 * Add the root port's subordinates in the list as we
> >> +		 * can specify certain function.
> >> +		 */
> >> +		child_bus = tpdev->subordinate;
> >> +		list_for_each_entry(tpdev, &child_bus->devices, bus_list) {
> > *This* looks like a potential problem with hotplug.  How do you deal
> > with devices being added/removed after this loop?
> 
> Yes. I have considered the add/remove situation but not intend to address it
> in this RFC and assume the topology is static after probing.
> I will manage the situation in next version.

What happens if a device is added or removed after boot?  If the only
limitation is that you can't tune or trace a hot-added device, that's
fine.  (I mean, it's really *not* fine because it's a poor user
experience, but at least it's just a usability issue, not a crash.)

But if hot-adding or hot-removing a device can cause an oops or a
crash or something, *that* is definitely a problem.

Bjorn

  reply	other threads:[~2020-07-16 21:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-13  9:32 [RFC PATCH] hwtracing: Add HiSilicon PCIe Tune and Trace device driver Yicong Yang
2020-06-24 10:20 ` Yicong Yang
2020-07-10 23:09 ` Bjorn Helgaas
2020-07-16  9:06   ` Yicong Yang
2020-07-16 21:31     ` Bjorn Helgaas [this message]
2020-07-20  7:46       ` Yicong Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200716213120.GA648781@bjorn-Precision-5520 \
    --to=helgaas@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=yangyicong@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.