All of lore.kernel.org
 help / color / mirror / Atom feed
From: Evan Green <evan@rivosinc.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: "Palmer Dabbelt" <palmer@rivosinc.com>,
	"Heiko Stübner" <heiko@sntech.de>,
	"Conor Dooley" <conor@kernel.org>,
	slewis@rivosinc.com, "Vineet Gupta" <vineetg@rivosinc.com>,
	"Albert Ou" <aou@eecs.berkeley.edu>,
	"Andrew Bresticker" <abrestic@rivosinc.com>,
	"Andrew Jones" <ajones@ventanamicro.com>,
	"Anup Patel" <apatel@ventanamicro.com>,
	"Atish Patra" <atishp@rivosinc.com>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Celeste Liu" <coelacanthus@outlook.com>,
	"Conor.Dooley" <conor.dooley@microchip.com>,
	guoren <guoren@kernel.org>, "Jonathan Corbet" <corbet@lwn.net>,
	"Niklas Cassel" <niklas.cassel@wdc.com>,
	"Palmer Dabbelt" <palmer@dabbelt.com>,
	"Paul Walmsley" <paul.walmsley@sifive.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Ruizhe Pan" <c141028@gmail.com>,
	"Sunil V L" <sunilvl@ventanamicro.com>,
	"Tobias Klauser" <tklauser@distanz.ch>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH v3 2/7] RISC-V: Add a syscall for HW probing
Date: Fri, 31 Mar 2023 10:51:55 -0700	[thread overview]
Message-ID: <CALs-HsuiyocyK+RW_5O3JniOE7ROHHDJ4K=+6NZRzGad7wJpMQ@mail.gmail.com> (raw)
In-Reply-To: <d91ffb1e-261b-4b2f-a78f-f2846600a3e7@app.fastmail.com>

Hi Arnd,

On Fri, Mar 31, 2023 at 6:21 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Mar 30, 2023, at 20:30, Evan Green wrote:
> > On Thu, Feb 23, 2023 at 2:06 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >> > +    long sys_riscv_hwprobe(struct riscv_hwprobe *pairs, size_t
> >> > pair_count,
> >> > +                           size_t cpu_count, cpu_set_t *cpus,
> >> > +                           unsigned long flags);
> >>
> >> The cpu set argument worries me more: there should never be a
> >> need to optimize for broken hardware that has an asymmetric set
> >> of features. Just let the kernel figure out the minimum set
> >> of features that works across all CPUs and report that like we
> >> do with HWCAP. If there is a SoC that is so broken that it has
> >> important features on a subset of cores that some user might
> >> actually want to rely on, then have them go through the slow
> >> sysfs interface for probing the CPUs indidually, but don't make
> >> the broken case easier at the expense of normal users that
> >> run on working hardware.
> >
> > I'm not so sure. While I agree with you for major classes of features
> > (eg one CPU has floating point support but another does not), I expect
> > these bits to contain more subtle details as well, which might vary
> > across asymmetric implementations without breaking ABI compatibility
> > per-se. Maybe some vendor has implemented exotic video decoding
> > acceleration instructions that only work on the big core. Or maybe the
> > big cores support v3.1 of some extension (where certain things run
> > faster), but the little cores only have v3.0, where it's a little
> > slower. Certain apps would likely want to know these things so they
> > can allocate their work optimally across cores.
>
> Do you have a specific feature in mind where hardware would be
> intentionally designed this way? I still can't come up with a
> scenario where this would actually work in practice, as having
> asymmetric features is incompatible with so many other things
> we normally do.

Isn't this already a reality, with the ARM folks building systems
where only some cores can run arm32 code, and the rest are aarch64
only?
https://lwn.net/Articles/838339/

The way I see it, it's going to be close to impossible to _not_ design
hardware this way. As long as die area and energy efficiency are a
constraint, big.LITTLE and other asymmetric architectures seem
inevitable. Given this interface is sort of a software cpuid
instruction, pretending there won't be differences between cores feels
like an incomplete API.

>
> - In a virtual machine, the VCPU tents to get scheduled arbitrarily
>   to physical CPUs, so setting affinity in a guest won't actually
>   guarantee that the feature is still there.

Sure, but I wouldn't expect the guest to be able to observe physical
properties from a transient host CPU (I'd consider that a bug). I'd
expect either the VCPU is pinned to a physical CPU with no chance of
migration, in which case you might reasonably plumb down some physical
properties to the guest, or the guest sees a watered down generic CPU
containing the least common denominator of features.

>
> - Using a CPU feature from library code is practically impossible
>   if it requires special CPU affinity, as the application may
>   already be started on specific CPUs for another reason, and
>   having a library call sched_setaffinity will conflict with those.
>
> - Even in the simplest case of having a standalone application
>   without any shared libraries try to pick a sensible CPU to
>   run on is hard to do in a generic way, as it would need to
>   weigh availabilty of features on certain cores against the
>   number of cores with or without the feature and their current
>   and expected system load.

In ChromeOS at least, we struggled a lot with chrome getting scheduled
on the big vs little cores on some ARM platforms, as we'd get things
like dropped video frames if we stayed stuck on the wrong cores. I
think we managed to solve that with uclamp, but again that
acknowledges that there are differences between cores, which I think
justifies the cpuset parameter to a cpu feature interface like this.

>
> As long as there isn't a specific requirement, I think it's better
> to not actually encourage hardware vendors to implement designs
> like that, or at least not designing an interface to make getting
> this information a few microseconds faster that what already
> exists.

As this is ABI, there is a bit of "crystal ball gazing" I'm doing,
though I hope I've backed it up with enough to show it's not a
completely wild parameter.

-Evan

WARNING: multiple messages have this Message-ID (diff)
From: Evan Green <evan@rivosinc.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: "Niklas Cassel" <niklas.cassel@wdc.com>,
	"Heiko Stübner" <heiko@sntech.de>,
	linux-doc@vger.kernel.org,
	"Andrew Bresticker" <abrestic@rivosinc.com>,
	"Atish Patra" <atishp@rivosinc.com>,
	"Palmer Dabbelt" <palmer@rivosinc.com>,
	"Conor.Dooley" <conor.dooley@microchip.com>,
	"Celeste Liu" <coelacanthus@outlook.com>,
	slewis@rivosinc.com, "Bagas Sanjaya" <bagasdotme@gmail.com>,
	linux-riscv@lists.infradead.org,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Tobias Klauser" <tklauser@distanz.ch>,
	"Andrew Jones" <ajones@ventanamicro.com>,
	"Albert Ou" <aou@eecs.berkeley.edu>,
	"Vineet Gupta" <vineetg@rivosinc.com>,
	"Paul Walmsley" <paul.walmsley@sifive.com>,
	"Ruizhe Pan" <c141028@gmail.com>,
	"Anup Patel" <apatel@ventanamicro.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, "Conor Dooley" <conor@kernel.org>,
	"Palmer Dabbelt" <palmer@dabbelt.com>, guoren <guoren@kernel.org>
Subject: Re: [PATCH v3 2/7] RISC-V: Add a syscall for HW probing
Date: Fri, 31 Mar 2023 10:51:55 -0700	[thread overview]
Message-ID: <CALs-HsuiyocyK+RW_5O3JniOE7ROHHDJ4K=+6NZRzGad7wJpMQ@mail.gmail.com> (raw)
In-Reply-To: <d91ffb1e-261b-4b2f-a78f-f2846600a3e7@app.fastmail.com>

Hi Arnd,

On Fri, Mar 31, 2023 at 6:21 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Mar 30, 2023, at 20:30, Evan Green wrote:
> > On Thu, Feb 23, 2023 at 2:06 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >> > +    long sys_riscv_hwprobe(struct riscv_hwprobe *pairs, size_t
> >> > pair_count,
> >> > +                           size_t cpu_count, cpu_set_t *cpus,
> >> > +                           unsigned long flags);
> >>
> >> The cpu set argument worries me more: there should never be a
> >> need to optimize for broken hardware that has an asymmetric set
> >> of features. Just let the kernel figure out the minimum set
> >> of features that works across all CPUs and report that like we
> >> do with HWCAP. If there is a SoC that is so broken that it has
> >> important features on a subset of cores that some user might
> >> actually want to rely on, then have them go through the slow
> >> sysfs interface for probing the CPUs indidually, but don't make
> >> the broken case easier at the expense of normal users that
> >> run on working hardware.
> >
> > I'm not so sure. While I agree with you for major classes of features
> > (eg one CPU has floating point support but another does not), I expect
> > these bits to contain more subtle details as well, which might vary
> > across asymmetric implementations without breaking ABI compatibility
> > per-se. Maybe some vendor has implemented exotic video decoding
> > acceleration instructions that only work on the big core. Or maybe the
> > big cores support v3.1 of some extension (where certain things run
> > faster), but the little cores only have v3.0, where it's a little
> > slower. Certain apps would likely want to know these things so they
> > can allocate their work optimally across cores.
>
> Do you have a specific feature in mind where hardware would be
> intentionally designed this way? I still can't come up with a
> scenario where this would actually work in practice, as having
> asymmetric features is incompatible with so many other things
> we normally do.

Isn't this already a reality, with the ARM folks building systems
where only some cores can run arm32 code, and the rest are aarch64
only?
https://lwn.net/Articles/838339/

The way I see it, it's going to be close to impossible to _not_ design
hardware this way. As long as die area and energy efficiency are a
constraint, big.LITTLE and other asymmetric architectures seem
inevitable. Given this interface is sort of a software cpuid
instruction, pretending there won't be differences between cores feels
like an incomplete API.

>
> - In a virtual machine, the VCPU tents to get scheduled arbitrarily
>   to physical CPUs, so setting affinity in a guest won't actually
>   guarantee that the feature is still there.

Sure, but I wouldn't expect the guest to be able to observe physical
properties from a transient host CPU (I'd consider that a bug). I'd
expect either the VCPU is pinned to a physical CPU with no chance of
migration, in which case you might reasonably plumb down some physical
properties to the guest, or the guest sees a watered down generic CPU
containing the least common denominator of features.

>
> - Using a CPU feature from library code is practically impossible
>   if it requires special CPU affinity, as the application may
>   already be started on specific CPUs for another reason, and
>   having a library call sched_setaffinity will conflict with those.
>
> - Even in the simplest case of having a standalone application
>   without any shared libraries try to pick a sensible CPU to
>   run on is hard to do in a generic way, as it would need to
>   weigh availabilty of features on certain cores against the
>   number of cores with or without the feature and their current
>   and expected system load.

In ChromeOS at least, we struggled a lot with chrome getting scheduled
on the big vs little cores on some ARM platforms, as we'd get things
like dropped video frames if we stayed stuck on the wrong cores. I
think we managed to solve that with uclamp, but again that
acknowledges that there are differences between cores, which I think
justifies the cpuset parameter to a cpu feature interface like this.

>
> As long as there isn't a specific requirement, I think it's better
> to not actually encourage hardware vendors to implement designs
> like that, or at least not designing an interface to make getting
> this information a few microseconds faster that what already
> exists.

As this is ABI, there is a bit of "crystal ball gazing" I'm doing,
though I hope I've backed it up with enough to show it's not a
completely wild parameter.

-Evan

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2023-03-31 17:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-21 19:08 [PATCH v3 0/7] RISC-V Hardware Probing User Interface Evan Green
2023-02-21 19:08 ` Evan Green
2023-02-21 19:08 ` [PATCH v3 1/7] RISC-V: Move struct riscv_cpuinfo to new header Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-21 19:08 ` [PATCH v3 2/7] RISC-V: Add a syscall for HW probing Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-23 10:06   ` Arnd Bergmann
2023-02-23 10:06     ` Arnd Bergmann
2023-03-30 18:30     ` Evan Green
2023-03-30 18:30       ` Evan Green
2023-03-30 20:20       ` Heiko Stübner
2023-03-30 20:20         ` Heiko Stübner
2023-03-30 21:24         ` Evan Green
2023-03-30 21:24           ` Evan Green
2023-03-31 13:21       ` Arnd Bergmann
2023-03-31 13:21         ` Arnd Bergmann
2023-03-31 17:51         ` Evan Green [this message]
2023-03-31 17:51           ` Evan Green
2023-02-27 22:19   ` Conor Dooley
2023-02-27 22:19     ` Conor Dooley
2023-02-21 19:08 ` [PATCH v3 3/7] RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-27 22:47   ` Conor Dooley
2023-02-27 22:47     ` Conor Dooley
2023-03-03  0:56     ` Evan Green
2023-03-03  0:56       ` Evan Green
2023-02-21 19:08 ` [PATCH v3 4/7] dt-bindings: Add RISC-V misaligned access performance Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-27 22:57   ` Conor Dooley
2023-02-27 22:57     ` Conor Dooley
2023-02-28 14:57     ` Rob Herring
2023-02-28 14:57       ` Rob Herring
2023-02-21 19:08 ` [PATCH v3 5/7] RISC-V: hwprobe: Support probing of " Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-22  9:39   ` Joe Perches
2023-02-22  9:39     ` Joe Perches
2023-02-27 23:14   ` Conor Dooley
2023-02-27 23:14     ` Conor Dooley
2023-02-21 19:08 ` [PATCH v3 6/7] selftests: Test the new RISC-V hwprobe interface Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-21 22:45   ` Mark Brown
2023-02-21 22:45     ` Mark Brown
2023-02-21 19:08 ` [PATCH v3 7/7] RISC-V: Add hwprobe vDSO function and data Evan Green
2023-02-21 19:08   ` Evan Green
2023-02-21 21:17   ` kernel test robot
2023-02-21 21:17     ` kernel test robot
2023-02-22  6:55   ` kernel test robot
2023-02-22  6:55     ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALs-HsuiyocyK+RW_5O3JniOE7ROHHDJ4K=+6NZRzGad7wJpMQ@mail.gmail.com' \
    --to=evan@rivosinc.com \
    --cc=abrestic@rivosinc.com \
    --cc=ajones@ventanamicro.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=apatel@ventanamicro.com \
    --cc=arnd@arndb.de \
    --cc=atishp@rivosinc.com \
    --cc=bagasdotme@gmail.com \
    --cc=c141028@gmail.com \
    --cc=coelacanthus@outlook.com \
    --cc=conor.dooley@microchip.com \
    --cc=conor@kernel.org \
    --cc=corbet@lwn.net \
    --cc=guoren@kernel.org \
    --cc=heiko@sntech.de \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=niklas.cassel@wdc.com \
    --cc=palmer@dabbelt.com \
    --cc=palmer@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    --cc=rdunlap@infradead.org \
    --cc=slewis@rivosinc.com \
    --cc=sunilvl@ventanamicro.com \
    --cc=tklauser@distanz.ch \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.