* [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1
@ 2022-07-08 18:56 Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 1/7] soc/tegra: Changes " Thierry Reding
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

here's a set of Tegra pull requests for v5.20-rc1.

One thing that I would very much have liked to include in this is the
Multi-Gigabit Ethernet (MGBE) controller support for Tegra234. That's
very important because it unlocks much broader coverage using our test
farms and makes the devices generally a lot more useful. Unfortunately
there were some unforeseen delays and I don't have a final Acked-by on
the device tree bindings yet. I do expect that we'll finalize this in
the next few days, though. If we do, would it be acceptable if I sent
an update to two of these PRs (dt-bindings and arm64/dt) to enable this
critical piece of functionality?

Thanks,
Thierry


* [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-12 13:27   ` Arnd Bergmann
  2022-07-08 18:56 ` [GIT PULL 2/7] firmware: tegra: " Thierry Reding
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-soc

for you to fetch changes up to 4773d1c739e22101a92f89c0ae0983190ddbe112:

  soc/tegra: fuse: Add missing of_node_put() (2022-07-08 17:27:26 +0200)

Thanks,
Thierry

----------------------------------------------------------------
soc/tegra: Changes for v5.20-rc1

The bulk of these changes is the new CBB driver which is used to provide
(a lot of) information about SErrors when things go wrong, instead of
the kernel just crashing or hanging.

In addition more SoC information is exposed to sysfs and various minor
issues are fixed.

----------------------------------------------------------------
Bitan Biswas (1):
      soc/tegra: fuse: Expose Tegra production status

Liang He (2):
      soc/tegra: fuse: Add missing of_node_put() in tegra_init_fuse()
      soc/tegra: fuse: Add missing of_node_put()

Sumit Gupta (4):
      soc/tegra: Set ERD bit to mask inband errors
      soc/tegra: cbb: Add CBB 1.0 driver for Tegra194
      soc/tegra: cbb: Add driver for Tegra234 CBB 2.0
      soc/tegra: cbb: Add support for Tegra241 (Grace)

YueHaibing (1):
      soc/tegra: fuse: Add missing DMADEVICES dependency

 drivers/soc/tegra/Kconfig              |   11 +-
 drivers/soc/tegra/Makefile             |    1 +
 drivers/soc/tegra/cbb/Makefile         |    9 +
 drivers/soc/tegra/cbb/tegra-cbb.c      |  190 +++
 drivers/soc/tegra/cbb/tegra194-cbb.c   | 2365 ++++++++++++++++++++++++++++++++
 drivers/soc/tegra/cbb/tegra234-cbb.c   | 1114 +++++++++++++++
 drivers/soc/tegra/fuse/fuse-tegra.c    |   16 +
 drivers/soc/tegra/fuse/tegra-apbmisc.c |   36 +-
 include/soc/tegra/fuse.h               |    7 +
 include/soc/tegra/tegra-cbb.h          |   47 +
 10 files changed, 3791 insertions(+), 5 deletions(-)
 create mode 100644 drivers/soc/tegra/cbb/Makefile
 create mode 100644 drivers/soc/tegra/cbb/tegra-cbb.c
 create mode 100644 drivers/soc/tegra/cbb/tegra194-cbb.c
 create mode 100644 drivers/soc/tegra/cbb/tegra234-cbb.c
 create mode 100644 include/soc/tegra/tegra-cbb.h


* [GIT PULL 2/7] firmware: tegra: Changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 1/7] soc/tegra: Changes " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 3/7] dt-bindings: " Thierry Reding
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-firmware

for you to fetch changes up to afcdb8e55c91c6ff0700ab272fd0f74e899ab884:

  firmware: tegra: Fix error check return value of debugfs_create_file() (2022-07-08 17:56:03 +0200)

Thanks,
Thierry

----------------------------------------------------------------
firmware: tegra: Changes for v5.20-rc1

Contains a fix that avoids unaligned accesses to IPC memory, which in
turn helps in suppressing speculative reads from out-of-bounds memory.

Another patch performs some minor cleanup around debugfs API usage.

----------------------------------------------------------------
Lv Ruyi (1):
      firmware: tegra: Fix error check return value of debugfs_create_file()

Timo Alho (1):
      firmware: tegra: bpmp: Do only aligned access to IPC memory area

 drivers/firmware/tegra/bpmp-debugfs.c | 10 +++++-----
 drivers/firmware/tegra/bpmp.c         |  6 +++---
 2 files changed, 8 insertions(+), 8 deletions(-)


* [GIT PULL 3/7] dt-bindings: Changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 1/7] soc/tegra: Changes " Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 2/7] firmware: tegra: " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 4/7] memory: tegra: " Thierry Reding
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-dt-bindings

for you to fetch changes up to 015a166368d72e3bd22368aa9fc793db953ec9c4:

  dt-bindings: tegra-ccplex-cluster: Remove status from required properties (2022-07-08 16:50:30 +0200)

Thanks,
Thierry

----------------------------------------------------------------
dt-bindings: Changes for v5.20-rc1

These changes add clock, reset, memory client and power domain
definitions for various devices found on Tegra234 along with a few
device tree bindings for new hardware.

----------------------------------------------------------------
Akhil R (1):
      dt-bindings: Add headers for Tegra234 GPCDMA

Kartik (1):
      dt-bindings: timer: Add Tegra186 & Tegra234 Timer

Mikko Perttunen (1):
      dt-bindings: Add headers for Host1x and VIC on Tegra234

Sumit Gupta (3):
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 CBB 1.0 binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 AXI2APB binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra234 CBB 2.0 binding

Thierry Reding (4):
      dt-bindings: power: Add Tegra234 MGBE power domains
      dt-bindings: Add Tegra234 MGBE clocks and resets
      dt-bindings: memory: Add Tegra234 MGBE memory clients
      dt-bindings: tegra-ccplex-cluster: Remove status from required properties

 .../arm/tegra/nvidia,tegra-ccplex-cluster.yaml     |   1 -
 .../arm/tegra/nvidia,tegra194-axi2apb.yaml         |  40 ++++++++
 .../bindings/arm/tegra/nvidia,tegra194-cbb.yaml    |  97 ++++++++++++++++++
 .../bindings/arm/tegra/nvidia,tegra234-cbb.yaml    |  74 ++++++++++++++
 .../bindings/timer/nvidia,tegra186-timer.yaml      | 109 +++++++++++++++++++++
 include/dt-bindings/clock/tegra234-clock.h         | 105 ++++++++++++++++++++
 include/dt-bindings/memory/tegra234-mc.h           |  26 +++++
 include/dt-bindings/power/tegra234-powergate.h     |   2 +
 include/dt-bindings/reset/tegra234-reset.h         |  10 ++
 9 files changed, 463 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-axi2apb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-cbb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra234-cbb.yaml
 create mode 100644 Documentation/devicetree/bindings/timer/nvidia,tegra186-timer.yaml


* [GIT PULL 4/7] memory: tegra: Changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (2 preceding siblings ...)
  2022-07-08 18:56 ` [GIT PULL 3/7] dt-bindings: " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 5/7] ARM: tegra: Device tree changes " Thierry Reding
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-memory

for you to fetch changes up to 5060237b0ee1b1c70fcd9a8048664905b74ba6a4:

  memory: tegra: Add MGBE memory clients for Tegra234 (2022-07-08 10:29:40 +0200)

Thanks,
Thierry

----------------------------------------------------------------
memory: tegra: Changes for v5.20-rc1

Add memory client definitions for the Multi-Gigabit Ethernet (MGBE)
controllers found on Tegra234.

----------------------------------------------------------------
Akhil R (1):
      dt-bindings: Add headers for Tegra234 GPCDMA

Sumit Gupta (3):
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 CBB 1.0 binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 AXI2APB binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra234 CBB 2.0 binding

Thierry Reding (5):
      dt-bindings: power: Add Tegra234 MGBE power domains
      dt-bindings: Add Tegra234 MGBE clocks and resets
      dt-bindings: memory: Add Tegra234 MGBE memory clients
      Merge branch 'for-5.20/dt-bindings' into for-5.20/memory
      memory: tegra: Add MGBE memory clients for Tegra234

 .../arm/tegra/nvidia,tegra194-axi2apb.yaml         |  40 ++++++++
 .../bindings/arm/tegra/nvidia,tegra194-cbb.yaml    |  97 ++++++++++++++++++++
 .../bindings/arm/tegra/nvidia,tegra234-cbb.yaml    |  74 +++++++++++++++
 drivers/memory/tegra/tegra234.c                    |  80 ++++++++++++++++
 include/dt-bindings/clock/tegra234-clock.h         | 101 +++++++++++++++++++++
 include/dt-bindings/memory/tegra234-mc.h           |  21 +++++
 include/dt-bindings/power/tegra234-powergate.h     |   1 +
 include/dt-bindings/reset/tegra234-reset.h         |   9 ++
 8 files changed, 423 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-axi2apb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-cbb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra234-cbb.yaml


* [GIT PULL 5/7] ARM: tegra: Device tree changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (3 preceding siblings ...)
  2022-07-08 18:56 ` [GIT PULL 4/7] memory: tegra: " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 6/7] arm64: " Thierry Reding
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-arm-dt

for you to fetch changes up to 799270e9b8be0846d418ed6c3f61408aef8059a1:

  ARM: tegra: Align gpio-keys node names with dtschema (2022-06-24 18:15:53 +0200)

Thanks,
Thierry

----------------------------------------------------------------
ARM: tegra: Device tree changes for v5.20-rc1

Two minor fixes to help reduce the noise from the DT validation tooling.

----------------------------------------------------------------
Krzysztof Kozlowski (2):
      ARM: tegra: Adjust whitespace around '='
      ARM: tegra: Align gpio-keys node names with dtschema

 arch/arm/boot/dts/tegra114-asus-tf701t.dts                |  8 ++++----
 arch/arm/boot/dts/tegra114-dalmore.dts                    |  8 ++++----
 arch/arm/boot/dts/tegra114-roth.dts                       |  6 +++---
 arch/arm/boot/dts/tegra114-tn7.dts                        |  6 +++---
 arch/arm/boot/dts/tegra124-apalis-eval.dts                |  2 +-
 arch/arm/boot/dts/tegra124-apalis-v1.2-eval.dts           |  2 +-
 arch/arm/boot/dts/tegra124-jetson-tk1.dts                 |  2 +-
 arch/arm/boot/dts/tegra124-nyan.dtsi                      |  4 ++--
 arch/arm/boot/dts/tegra124-venice2.dts                    |  2 +-
 arch/arm/boot/dts/tegra20-acer-a500-picasso.dts           | 10 +++++-----
 arch/arm/boot/dts/tegra20-asus-tf101.dts                  | 10 +++++-----
 arch/arm/boot/dts/tegra20-colibri-eval-v3.dts             |  2 +-
 arch/arm/boot/dts/tegra20-colibri-iris.dts                |  2 +-
 arch/arm/boot/dts/tegra20-harmony.dts                     |  2 +-
 arch/arm/boot/dts/tegra20-paz00.dts                       |  2 +-
 arch/arm/boot/dts/tegra20-seaboard.dts                    |  4 ++--
 arch/arm/boot/dts/tegra20-trimslice.dts                   |  2 +-
 arch/arm/boot/dts/tegra20-ventana.dts                     |  2 +-
 arch/arm/boot/dts/tegra30-apalis-eval.dts                 |  2 +-
 arch/arm/boot/dts/tegra30-apalis-v1.1-eval.dts            |  2 +-
 arch/arm/boot/dts/tegra30-asus-nexus7-grouper-common.dtsi | 14 +++++++-------
 arch/arm/boot/dts/tegra30-asus-nexus7-tilapia.dtsi        |  2 +-
 arch/arm/boot/dts/tegra30-asus-transformer-common.dtsi    | 10 +++++-----
 arch/arm/boot/dts/tegra30-cardhu.dtsi                     |  6 +++---
 arch/arm/boot/dts/tegra30-colibri-eval-v3.dts             |  2 +-
 arch/arm/boot/dts/tegra30-colibri.dtsi                    |  6 +++---
 arch/arm/boot/dts/tegra30-ouya.dts                        |  2 +-
 arch/arm/boot/dts/tegra30-pegatron-chagall.dts            | 10 +++++-----
 28 files changed, 66 insertions(+), 66 deletions(-)


* [GIT PULL 6/7] arm64: tegra: Device tree changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (4 preceding siblings ...)
  2022-07-08 18:56 ` [GIT PULL 5/7] ARM: tegra: Device tree changes " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-08 18:56 ` [GIT PULL 7/7] arm64: tegra: Default configuration updates " Thierry Reding
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-arm64-dt

for you to fetch changes up to b415bb7c976f1d595ed752001c0938f702645dab:

  arm64: tegra: Fix SDMMC1 CD on P2888 (2022-07-08 18:00:13 +0200)

Thanks,
Thierry

----------------------------------------------------------------
arm64: tegra: Device tree changes for v5.20-rc1

This adds and enables various hardware on Tegra234 (host1x, VIC, GPCDMA)
as well as the Control BackBone related device tree nodes on Tegra194
and Tegra234.

Native timers are enabled on Tegra186, Tegra194 and Tegra234, which
allow keeping track of SoC-wide timestamps as well as hardware watchdog
functionality.

The audio subsystem is enhanced with the Output Processing Engine (OPE)
on Tegra210 and later.

Finally there are a handful of minor cleanups and fixes.

----------------------------------------------------------------
Akhil R (3):
      dt-bindings: Add headers for Tegra234 GPCDMA
      arm64: tegra: Add Tegra234 GPCDMA device tree node
      arm64: tegra: Update compatible for Tegra234 GPCDMA

Kartik (3):
      dt-bindings: timer: Add Tegra186 & Tegra234 Timer
      arm64: tegra: Enable native timers on Tegra186
      arm64: tegra: Enable native timers on Tegra234

Krzysztof Kozlowski (2):
      arm64: tegra: Adjust whitespace around '='
      arm64: tegra: Align gpio-keys node names with dtschema

Mikko Perttunen (4):
      dt-bindings: Add headers for Host1x and VIC on Tegra234
      arm64: tegra: Mark BPMP channels as no-memory-wc
      arm64: tegra: Add Host1x context stream IDs on Tegra186+
      arm64: tegra: Add Host1x and VIC on Tegra234

Sameer Pujar (2):
      arm64: tegra: Add OPE device on Tegra210 and later
      arm64: tegra: Enable OPE on various platforms

Sumit Gupta (5):
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 CBB 1.0 binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra194 AXI2APB binding
      dt-bindings: arm: tegra: Add NVIDIA Tegra234 CBB 2.0 binding
      arm64: tegra: Add node for CBB 1.0 on Tegra194
      arm64: tegra: Add node for CBB 2.0 on Tegra234

Tamás Szűcs (1):
      arm64: tegra: Fix SDMMC1 CD on P2888

Thierry Reding (6):
      dt-bindings: power: Add Tegra234 MGBE power domains
      dt-bindings: Add Tegra234 MGBE clocks and resets
      dt-bindings: memory: Add Tegra234 MGBE memory clients
      dt-bindings: tegra-ccplex-cluster: Remove status from required properties
      Merge branch 'for-5.20/dt-bindings' into for-5.20/arm64/dt
      arm64: tegra: Enable native timers on Tegra194

 .../arm/tegra/nvidia,tegra-ccplex-cluster.yaml     |   1 -
 .../arm/tegra/nvidia,tegra194-axi2apb.yaml         |  40 +++++
 .../bindings/arm/tegra/nvidia,tegra194-cbb.yaml    |  97 +++++++++++
 .../bindings/arm/tegra/nvidia,tegra234-cbb.yaml    |  74 +++++++++
 .../bindings/timer/nvidia,tegra186-timer.yaml      | 109 +++++++++++++
 arch/arm64/boot/dts/nvidia/tegra132-norrin.dts     |   4 +-
 arch/arm64/boot/dts/nvidia/tegra186-p2771-0000.dts |  49 +++++-
 .../dts/nvidia/tegra186-p3509-0000+p3636-0001.dts  |   6 +-
 arch/arm64/boot/dts/nvidia/tegra186.dtsi           |  37 ++++-
 arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi     |   2 +-
 arch/arm64/boot/dts/nvidia/tegra194-p2972-0000.dts |  47 +++++-
 .../arm64/boot/dts/nvidia/tegra194-p3509-0000.dtsi |  47 +++++-
 arch/arm64/boot/dts/nvidia/tegra194.dtsi           | 113 ++++++++++++-
 arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts |  84 ++++++++++
 arch/arm64/boot/dts/nvidia/tegra210-p2597.dtsi     |   6 +-
 arch/arm64/boot/dts/nvidia/tegra210-p2894.dtsi     |   2 +-
 arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts |  88 +++++++++-
 arch/arm64/boot/dts/nvidia/tegra210-smaug.dts      |  10 +-
 arch/arm64/boot/dts/nvidia/tegra210.dtsi           |  42 ++++-
 .../dts/nvidia/tegra234-p3737-0000+p3701-0000.dts  |  49 +++++-
 arch/arm64/boot/dts/nvidia/tegra234.dtsi           | 177 +++++++++++++++++++++
 include/dt-bindings/clock/tegra234-clock.h         | 105 ++++++++++++
 include/dt-bindings/memory/tegra234-mc.h           |  26 +++
 include/dt-bindings/power/tegra234-powergate.h     |   2 +
 include/dt-bindings/reset/tegra234-reset.h         |  10 ++
 25 files changed, 1196 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-axi2apb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra194-cbb.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/tegra/nvidia,tegra234-cbb.yaml
 create mode 100644 Documentation/devicetree/bindings/timer/nvidia,tegra186-timer.yaml


* [GIT PULL 7/7] arm64: tegra: Default configuration updates for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (5 preceding siblings ...)
  2022-07-08 18:56 ` [GIT PULL 6/7] arm64: " Thierry Reding
@ 2022-07-08 18:56 ` Thierry Reding
  2022-07-11 16:20 ` [GIT PULL 0/7] NVIDIA Tegra changes " patchwork-bot+linux-soc
  2022-07-12 21:30 ` patchwork-bot+linux-soc
  8 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-08 18:56 UTC (permalink / raw)
  To: arm, soc; +Cc: Thierry Reding, Jon Hunter, linux-tegra, linux-arm-kernel

Hi ARM SoC maintainers,

The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:

  Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-arm64-defconfig

for you to fetch changes up to 28b4dcc8f0108e2c430b63b0325c61ae97d6dc79:

  arm64: defconfig: Build Tegra OPE module (2022-06-08 15:39:04 +0200)

Thanks,
Thierry

----------------------------------------------------------------
arm64: tegra: Default configuration updates for v5.20-rc1

Updates the arm64 default configuration to enable the OPE driver as a
module.

----------------------------------------------------------------
Sameer Pujar (1):
      arm64: defconfig: Build Tegra OPE module

 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)


* Re: [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (6 preceding siblings ...)
  2022-07-08 18:56 ` [GIT PULL 7/7] arm64: tegra: Default configuration updates " Thierry Reding
@ 2022-07-11 16:20 ` patchwork-bot+linux-soc
  2022-07-12 21:30 ` patchwork-bot+linux-soc
  8 siblings, 0 replies; 24+ messages in thread
From: patchwork-bot+linux-soc @ 2022-07-11 16:20 UTC (permalink / raw)
  To: Thierry Reding; +Cc: soc

Hello:

This series was applied to soc/soc.git (for-next)
by Arnd Bergmann <arnd@arndb.de>:

On Fri,  8 Jul 2022 20:56:01 +0200 you wrote:
> Hi ARM SoC maintainers,
> 
> here's a set of Tegra pull requests for v5.20-rc1.
> 
> One thing that I would very much have liked to include in this is the
> Multi-Gigabit Ethernet (MGBE) controller support for Tegra234. That's
> very important because it unlocks much broader coverage using our test
> farms and makes the devices generally a lot more useful. Unfortunately
> there were some unforeseen delays and I don't have a final Acked-by on
> the device tree bindings yet. I do expect that we'll finalize this in
> the next few days, though. If we do, would it be acceptable if I sent
> an update to two of these PRs (dt-bindings and arm64/dt) to enable this
> critical piece of functionality?
> 
> [...]

Here is the summary with links:
  - [GIT,PULL,1/7] soc/tegra: Changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,2/7] firmware: tegra: Changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,3/7] dt-bindings: Changes for v5.20-rc1
    https://git.kernel.org/soc/soc/c/a41bf1aabd46
  - [GIT,PULL,4/7] memory: tegra: Changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,5/7] ARM: tegra: Device tree changes for v5.20-rc1
    https://git.kernel.org/soc/soc/c/ed1646fe0232
  - [GIT,PULL,6/7] arm64: tegra: Device tree changes for v5.20-rc1
    https://git.kernel.org/soc/soc/c/c784744b0405
  - [GIT,PULL,7/7] arm64: tegra: Default configuration updates for v5.20-rc1
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-08 18:56 ` [GIT PULL 1/7] soc/tegra: Changes " Thierry Reding
@ 2022-07-12 13:27   ` Arnd Bergmann
  2022-07-13 10:58     ` Thierry Reding
  0 siblings, 1 reply; 24+ messages in thread
From: Arnd Bergmann @ 2022-07-12 13:27 UTC (permalink / raw)
  To: Thierry Reding
  Cc: arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter

On Fri, Jul 8, 2022 at 8:56 PM Thierry Reding <thierry.reding@gmail.com> wrote:
>   git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-soc
...
> ----------------------------------------------------------------
> soc/tegra: Changes for v5.20-rc1
>
> The bulk of these changes is the new CBB driver which is used to provide
> (a lot of) information about SErrors when things go wrong, instead of
> the kernel just crashing or hanging.
>
> In addition more SoC information is exposed to sysfs and various minor
> issues are fixed.
>

Hi Thierry,

I fear I'm going to skip this for the current merge window. It looks like
the CBB driver you add here would fit into the existing drivers/edac/
subsystem, or at the minimum should have been reviewed by the
corresponding maintainers (added to Cc)  to decide whether it goes
there or not.

I had not previously seen this driver, but I'll let them have a look first.

For the other patches, I found two more problems:

> Bitan Biswas (1):
>       soc/tegra: fuse: Expose Tegra production status

Please don't just add random attributes in the soc device infrastructure.
This one has a completely generic name but a SoC specific
meaning, and it lacks a description in Documentation/ABI.
Not sure what the right ABI is here, but this is something that needs
to be discussed more broadly when you send a new version.

I see there are already some custom attributes in the same device,
we should probably not have added those either, but I suppose
we are stuck with those, so please add the missing documentation.

> YueHaibing (1):
>      soc/tegra: fuse: Add missing DMADEVICES dependency

This one fixes the warning the wrong way: we don't 'select' random
drivers from other subsystems, and selecting the entire
subsystem makes it worse. Just drop the 'select' here and
enable the drivers in the defconfig.

         Arnd


* Re: [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1
  2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
                   ` (7 preceding siblings ...)
  2022-07-11 16:20 ` [GIT PULL 0/7] NVIDIA Tegra changes " patchwork-bot+linux-soc
@ 2022-07-12 21:30 ` patchwork-bot+linux-soc
  8 siblings, 0 replies; 24+ messages in thread
From: patchwork-bot+linux-soc @ 2022-07-12 21:30 UTC (permalink / raw)
  To: Thierry Reding; +Cc: soc

Hello:

This series was applied to soc/soc.git (for-next)
by Arnd Bergmann <arnd@arndb.de>:

On Fri,  8 Jul 2022 20:56:01 +0200 you wrote:
> Hi ARM SoC maintainers,
> 
> here's a set of Tegra pull requests for v5.20-rc1.
> 
> One thing that I would very much have liked to include in this is the
> Multi-Gigabit Ethernet (MGBE) controller support for Tegra234. That's
> very important because it unlocks much broader coverage using our test
> farms and makes the devices generally a lot more useful. Unfortunately
> there were some unforeseen delays and I don't have a final Acked-by on
> the device tree bindings yet. I do expect that we'll finalize this in
> the next few days, though. If we do, would it be acceptable if I sent
> an update to two of these PRs (dt-bindings and arm64/dt) to enable this
> critical piece of functionality?
> 
> [...]

Here is the summary with links:
  - [GIT,PULL,1/7] soc/tegra: Changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,2/7] firmware: tegra: Changes for v5.20-rc1
    https://git.kernel.org/soc/soc/c/933edcde4383
  - [GIT,PULL,3/7] dt-bindings: Changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,4/7] memory: tegra: Changes for v5.20-rc1
    https://git.kernel.org/soc/soc/c/f10c00ae8628
  - [GIT,PULL,5/7] ARM: tegra: Device tree changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,6/7] arm64: tegra: Device tree changes for v5.20-rc1
    (no matching commit)
  - [GIT,PULL,7/7] arm64: tegra: Default configuration updates for v5.20-rc1
    https://git.kernel.org/soc/soc/c/8c1541b161a6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-12 13:27   ` Arnd Bergmann
@ 2022-07-13 10:58     ` Thierry Reding
  2022-07-13 12:14       ` Arnd Bergmann
  0 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2022-07-13 10:58 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter


On Tue, Jul 12, 2022 at 03:27:16PM +0200, Arnd Bergmann wrote:
> On Fri, Jul 8, 2022 at 8:56 PM Thierry Reding <thierry.reding@gmail.com> wrote:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git tags/tegra-for-5.20-soc
> ...
> > ----------------------------------------------------------------
> > soc/tegra: Changes for v5.20-rc1
> >
> > The bulk of these changes is the new CBB driver which is used to provide
> > (a lot of) information about SErrors when things go wrong, instead of
> > the kernel just crashing or hanging.
> >
> > In addition more SoC information is exposed to sysfs and various minor
> > issues are fixed.
> >
> 
> Hi Thierry,
> 
> I fear I'm going to skip this for the current merge window. It looks like
> the CBB driver you add here would fit into the existing drivers/edac/
> subsystem, or at the minimum should have been reviewed by the
> corresponding maintainers (added to Cc)  to decide whether it goes
> there or not.
> 
> I had not previously seen this driver, but I'll let them have a look first.

EDAC looks like it's used primarily for memory controllers, which this
is not. But then I also see explicit references to non-memory-controller
devices in the infrastructure, so perhaps this does fit in there. The
CBB driver is primarily a means to provide additional information about
runtime errors, so it's not directly a means of discovering the errors
(they would be detected anyway and cause a crash) and I don't think we
have a means of correcting any of these errors.

I'll ask Sumit to work with the EDAC maintainers on this.

> For the other patches, I found two more problems:
> 
> > Bitan Biswas (1):
> >       soc/tegra: fuse: Expose Tegra production status
> 
> Please don't just add random attributes in the soc device infrastructure.
> This one has a completely generic name but a SoC specific
> meaning, and it lacks a description in Documentation/ABI.
> Not sure what the right ABI is here, but this is something that needs
> to be discussed more broadly when you send a new version.

I wasn't aware that the SoC device infrastructure was restricted to only
standardized attributes. Looks like there are a few other outliers that
add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.

Do we have some other place where this kind of thing can be exposed? Or
do we just need to come up with some better way of namespacing these?
Perhaps it would also be sufficient if all of these were better
documented so that people know what to look for on their platform of
interest.

> I see there are already some custom attributes in the same device,
> we should probably not have added those either, but I suppose
> we are stuck with those, so please add the missing documentation.

Yeah, that's a good point. These should definitely be documented
properly.

> 
> > YueHaibing (1):
> >      soc/tegra: fuse: Add missing DMADEVICES dependency
> 
> This one fixes the warning the wrong way: we don't 'select' random
> drivers from other subsystems, and selecting the entire
> subsystem makes it worse. Just drop the 'select' here and
> enable the drivers in the defconfig.

This doesn't actually select the DMADEVICES symbol. It adds a
dependency on DMADEVICES and if that is met it will select
TEGRA20_APB_DMA.

Thierry

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 10:58     ` Thierry Reding
@ 2022-07-13 12:14       ` Arnd Bergmann
  2022-07-13 12:19         ` Jon Hunter
                           ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Arnd Bergmann @ 2022-07-13 12:14 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Arnd Bergmann, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter

On Wed, Jul 13, 2022 at 12:58 PM Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Tue, Jul 12, 2022 at 03:27:16PM +0200, Arnd Bergmann wrote:
> > On Fri, Jul 8, 2022 at 8:56 PM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > I fear I'm going to skip this for the current merge window. It looks like
> > the CBB driver you add here would fit into the existing drivers/edac/
> > subsystem, or at the minimum should have been reviewed by the
> > corresponding maintainers (added to Cc)  to decide whether it goes
> > there or not.
> >
> > I had not previously seen this driver, but I'll let them have a look first.
>
> EDAC looks like it's used primarily for memory controllers, which this
> is not. But then I also see explicit references to non-memory-controller
> devices in the infrastructure, so perhaps this does fit in there. The
> CBB driver is primarily a means to provide additional information about
> runtime errors, so it's not directly a means of discovering the errors
> (they would be detected anyway and cause a crash) and I don't think we
> have a means of correcting any of these errors.

I think this is just a reflection of what other hardware can do:
most machines only detect memory errors, but the EDAC subsystem
can work with any type in principle. There are also a lot of
conditions elsewhere that can be detected but not corrected.

> I'll ask Sumit to work with the EDAC maintainers on this.

Thanks

> > For the other patches, I found two more problems:
> >
> > > Bitan Biswas (1):
> > >       soc/tegra: fuse: Expose Tegra production status
> >
> > Please don't just add random attributes in the soc device infrastructure.
> > This one has a completely generic name but a SoC specific
> > meaning, and it lacks a description in Documentation/ABI.
> > Not sure what the right ABI is here, but this is something that needs
> > to be discussed more broadly when you send a new version.
>
> I wasn't aware that the SoC device infrastructure was restricted to only
> standardized attributes. Looks like there are a few other outliers that
> add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
>
> Do we have some other place where this kind of thing can be exposed? Or
> do we just need to come up with some better way of namespacing these?
> Perhaps it would also be sufficient if all of these were better
> documented so that people know what to look for on their platform of
> interest.

It's not a 100% strict rule, I've just tried to limit it as much as possible,
and sometimes missed drivers doing it anyway. My main goal here is
to make things consistent between SoC families, so if one piece of
information is provided by a number of them, I'd rather have a standard
attribute, or a common way of encoding this in the existing attributes
than to have too many custom attributes with similar names.

> > > YueHaibing (1):
> > >      soc/tegra: fuse: Add missing DMADEVICES dependency
> >
> > This one fixes the warning the wrong way: we don't 'select' random
> > drivers from other subsystems, and selecting the entire
> > subsystem makes it worse. Just drop the 'select' here and
> > enable the drivers in the defconfig.
>
> This doesn't actually select the DMADEVICES symbol. It adds a
> dependency on DMADEVICES and if that is met it will select
> TEGRA20_APB_DMA.

My mistake. However, I still think it's wrong to select
TEGRA20_APB_DMA here, unless there is a build-time
dependency that prevents it from being compiled otherwise.

The dmaengine subsystem is meant to abstract the relation
between the drivers using DMA and those providing the feature,
the same way we abstract all the other subsystems. The
fuse driver may only be used on machines that use
TEGRA20_APB_DMA, but neither the driver code nor
Kconfig should care about that.

        Arnd

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 12:14       ` Arnd Bergmann
@ 2022-07-13 12:19         ` Jon Hunter
  2022-07-13 12:36           ` Arnd Bergmann
  2022-07-13 20:22         ` Thierry Reding
  2022-07-14 13:31         ` Borislav Petkov
  2 siblings, 1 reply; 24+ messages in thread
From: Jon Hunter @ 2022-07-13 12:19 UTC (permalink / raw)
  To: Arnd Bergmann, Thierry Reding
  Cc: arm-soc, SoC Team, open list:TEGRA ARCHITECTURE SUPPORT,
	Linux ARM, linux-edac, Borislav Petkov, Mauro Carvalho Chehab,
	Tony Luck, James Morse, Robert Richter


On 13/07/2022 13:14, Arnd Bergmann wrote:

...

>>> For the other patches, I found two more problems:
>>>
>>>> Bitan Biswas (1):
>>>>        soc/tegra: fuse: Expose Tegra production status
>>>
>>> Please don't just add random attributes in the soc device infrastructure.
>>> This one has a completely generic name but a SoC specific
>>> meaning, and it lacks a description in Documentation/ABI.
>>> Not sure what the right ABI is here, but this is something that needs
>>> to be discussed more broadly when you send a new version.
>>
>> I wasn't aware that the SoC device infrastructure was restricted to only
>> standardized attributes. Looks like there are a few other outliers that
>> add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
>>
>> Do we have some other place where this kind of thing can be exposed? Or
>> do we just need to come up with some better way of namespacing these?
>> Perhaps it would also be sufficient if all of these were better
>> documented so that people know what to look for on their platform of
>> interest.
> 
> It's not a 100% strict rule, I've just tried to limit it as much as possible,
> and sometimes missed drivers doing it anyway. My main goal here is
> to make things consistent between SoC families, so if one piece of
> information is provided by a number of them, I'd rather have a standard
> attribute, or a common way of encoding this in the existing attributes
> than to have too many custom attributes with similar names.


Makes sense. Any recommendations for this specific attribute? I could 
imagine other vendors may have engineering devices and production 
versions. This is slightly different from the silicon version.

Cheers
Jon

-- 
nvpublic

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 12:19         ` Jon Hunter
@ 2022-07-13 12:36           ` Arnd Bergmann
  2022-07-14  6:49             ` Jon Hunter
  0 siblings, 1 reply; 24+ messages in thread
From: Arnd Bergmann @ 2022-07-13 12:36 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Arnd Bergmann, Thierry Reding, arm-soc, SoC Team,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter

On Wed, Jul 13, 2022 at 2:19 PM Jon Hunter <jonathanh@nvidia.com> wrote:
> On 13/07/2022 13:14, Arnd Bergmann wrote:
> >>> For the other patches, I found two more problems:
> >>>
> >>>> Bitan Biswas (1):
> >>>>        soc/tegra: fuse: Expose Tegra production status
> >>>
> >>> Please don't just add random attributes in the soc device infrastructure.
> >>> This one has a completely generic name but a SoC specific
> >>> meaning, and it lacks a description in Documentation/ABI.
> >>> Not sure what the right ABI is here, but this is something that needs
> >>> to be discussed more broadly when you send a new version.
> >>
> >> I wasn't aware that the SoC device infrastructure was restricted to only
> >> standardized attributes. Looks like there are a few other outliers that
> >> add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
> >>
> >> Do we have some other place where this kind of thing can be exposed? Or
> >> do we just need to come up with some better way of namespacing these?
> >> Perhaps it would also be sufficient if all of these were better
> >> documented so that people know what to look for on their platform of
> >> interest.
> >
> > It's not a 100% strict rule, I've just tried to limit it as much as possible,
> > and sometimes missed drivers doing it anyway. My main goal here is
> > to make things consistent between SoC families, so if one piece of
> > information is provided by a number of them, I'd rather have a standard
> > attribute, or a common way of encoding this in the existing attributes
> > than to have too many custom attributes with similar names.
>
>
> Makes sense. Any recommendations for this specific attribute? I could
> imagine other vendors may have engineering devices and production
> versions. This is slightly different from the silicon version.

Not sure, I haven't seen this one referenced elsewhere so far.

What is the actual information this encodes in your case? Is this fused
down in a way that production devices lose access to certain features
that could be security critical but are useful for development?

         Arnd

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 12:14       ` Arnd Bergmann
  2022-07-13 12:19         ` Jon Hunter
@ 2022-07-13 20:22         ` Thierry Reding
  2022-07-14  6:30           ` Jon Hunter
  2022-07-14 14:45           ` Arnd Bergmann
  2022-07-14 13:31         ` Borislav Petkov
  2 siblings, 2 replies; 24+ messages in thread
From: Thierry Reding @ 2022-07-13 20:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter

On Wed, Jul 13, 2022 at 02:14:27PM +0200, Arnd Bergmann wrote:
> On Wed, Jul 13, 2022 at 12:58 PM Thierry Reding
> <thierry.reding@gmail.com> wrote:
> > On Tue, Jul 12, 2022 at 03:27:16PM +0200, Arnd Bergmann wrote:
> > > On Fri, Jul 8, 2022 at 8:56 PM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > I fear I'm going to skip this for the current merge window. It looks like
> > > the CBB driver you add here would fit into the existing drivers/edac/
> > > subsystem, or at the minimum should have been reviewed by the
> > > corresponding maintainers (added to Cc)  to decide whether it goes
> > > there or not.
> > >
> > > I had not previously seen this driver, but I'll let them have a look first.
> >
> > EDAC looks like it's used primarily for memory controllers, which this
> > is not. But then I also see explicit references to non-memory-controller
> > devices in the infrastructure, so perhaps this does fit in there. The
> > CBB driver is primarily a means to provide additional information about
> > runtime errors, so it's not directly a means of discovering the errors
> > (they would be detected anyway and cause a crash) and I don't think we
> > have a means of correcting any of these errors.
> 
> I think this is just a reflection of what other hardware can do:
> most machines only detect memory errors, but the EDAC subsystem
> can work with any type in principle. There are also a lot of
> conditions elsewhere that can be detected but not corrected.
> 
> > I'll ask Sumit to work with the EDAC maintainers on this.
> 
> Thanks
> 
> > > For the other patches, I found two more problems:
> > >
> > > > Bitan Biswas (1):
> > > >       soc/tegra: fuse: Expose Tegra production status
> > >
> > > Please don't just add random attributes in the soc device infrastructure.
> > > This one has a completely generic name but a SoC specific
> > > meaning, and it lacks a description in Documentation/ABI.
> > > Not sure what the right ABI is here, but this is something that needs
> > > to be discussed more broadly when you send a new version.
> >
> > I wasn't aware that the SoC device infrastructure was restricted to only
> > standardized attributes. Looks like there are a few other outliers that
> > add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
> >
> > Do we have some other place where this kind of thing can be exposed? Or
> > do we just need to come up with some better way of namespacing these?
> > Perhaps it would also be sufficient if all of these were better
> > documented so that people know what to look for on their platform of
> > interest.
> 
> It's not a 100% strict rule, I've just tried to limit it as much as possible,
> and sometimes missed drivers doing it anyway. My main goal here is
> to make things consistent between SoC families, so if one piece of
> information is provided by a number of them, I'd rather have a standard
> attribute, or a common way of encoding this in the existing attributes
> than to have too many custom attributes with similar names.

The major/minor attributes that we have on Tegra SoCs should be easy to
standardize. It seems like those could be fairly common. The other one
that we have is the "platform" one, which I suppose is not as easy to
standardize. I don't recall the exact details, but I think we're mostly
interested in whether or not the platform is simulation or silicon. The
exact simulation value is not something that userspace scripts will look
at, as far as I recall.

Jon, correct me if I'm wrong.

Perhaps this can be deprecated in favour of a more standardized property
that can more easily be implemented on other SoCs.

The production mode is something that is read from a fuse and we expose
those via the nvmem subsystem already. Currently nvmem exposes only a
binary attribute in sysfs that userspace would need to parse and ideally
we'd have something a little easier to work with, but perhaps nvmem can
be enhanced to expose individual cells as separate attributes in some
standard format. We also have some other values in the fuses that we
want to make available to userspace (IDs and that sort of thing), so
it's good that you noticed this now before we would've added even more.
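
As a rough sketch of what that userspace parsing amounts to today: read
the binary attribute into memory and pick a 32-bit word out at a known
offset. The little-endian layout, the offsets and the helper name
blob_get_le32() are assumptions made for illustration only, not the
actual Tegra fuse map or an existing API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Extract a 32-bit little-endian word at @offset from a raw buffer,
 * such as the contents of an nvmem binary attribute read into memory.
 * Hypothetical helper: returns 0 on success, -1 if the word would
 * fall outside the buffer.
 */
int blob_get_le32(const uint8_t *buf, size_t len, size_t offset,
		  uint32_t *out)
{
	/* Reject offsets that leave fewer than four bytes available. */
	if (offset > len || len - offset < 4)
		return -1;

	*out = (uint32_t)buf[offset] |
	       ((uint32_t)buf[offset + 1] << 8) |
	       ((uint32_t)buf[offset + 2] << 16) |
	       ((uint32_t)buf[offset + 3] << 24);
	return 0;
}
```

Every consumer has to carry its own copy of the offsets, which is what
exposing individual cells as separate attributes would avoid.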

> > > > YueHaibing (1):
> > > >      soc/tegra: fuse: Add missing DMADEVICES dependency
> > >
> > > This one fixes the warning the wrong way: we don't 'select' random
> > > drivers from other subsystems, and selecting the entire
> > > subsystem makes it worse. Just drop the 'select' here and
> > > enable the drivers in the defconfig.
> >
> > This doesn't actually select the DMADEVICES symbol. It adds a
> > dependency on DMADEVICES and if that is met it will select
> > TEGRA20_APB_DMA.
> 
> My mistake. However, I still think it's wrong to select
> TEGRA20_APB_DMA here, unless there is a build-time
> dependency that prevents it from being compiled otherwise.
> 
> The dmaengine subsystem is meant to abstract the relation
> between the drivers using DMA and those providing the feature,
> the same way we abstract all the other subsystems. The
> fuse driver may only be used on machines that use
> TEGRA20_APB_DMA, but neither the driver code nor
> Kconfig should care about that.

This dependency has existed for quite a while and my recollection is
that we wanted to make this very explicit because the lack of the
TEGRA20_APB_DMA driver makes the FUSE driver completely useless on
Tegra20 and that in turn has a very negative impact on the rest of the
system, so we deemed a default configuration change insufficient.

Perhaps a better way to solve this would be to make TEGRA20_APB_DMA
default to "y" if ARCH_TEGRA_2x_SOC. And then perhaps make the FUSE
driver depend on DMADEVICES. That still wouldn't ensure that we get
SOC_TEGRA_FUSE enabled automatically all the time, but perhaps it'd
document the dependency a bit more explicitly.

Thierry

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 20:22         ` Thierry Reding
@ 2022-07-14  6:30           ` Jon Hunter
  2022-07-14 14:45           ` Arnd Bergmann
  1 sibling, 0 replies; 24+ messages in thread
From: Jon Hunter @ 2022-07-14  6:30 UTC (permalink / raw)
  To: Thierry Reding, Arnd Bergmann
  Cc: arm-soc, SoC Team, open list:TEGRA ARCHITECTURE SUPPORT,
	Linux ARM, linux-edac, Borislav Petkov, Mauro Carvalho Chehab,
	Tony Luck, James Morse, Robert Richter


On 13/07/2022 21:22, Thierry Reding wrote:

...

>>>>> Bitan Biswas (1):
>>>>>        soc/tegra: fuse: Expose Tegra production status
>>>>
>>>> Please don't just add random attributes in the soc device infrastructure.
>>>> This one has a completely generic name but a SoC specific
>>>> meaning, and it lacks a description in Documentation/ABI.
>>>> Not sure what the right ABI is here, but this is something that needs
>>>> to be discussed more broadly when you send a new version.
>>>
>>> I wasn't aware that the SoC device infrastructure was restricted to only
>>> standardized attributes. Looks like there are a few other outliers that
>>> add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
>>>
>>> Do we have some other place where this kind of thing can be exposed? Or
>>> do we just need to come up with some better way of namespacing these?
>>> Perhaps it would also be sufficient if all of these were better
>>> documented so that people know what to look for on their platform of
>>> interest.
>>
>> It's not a 100% strict rule, I've just tried to limit it as much as possible,
>> and sometimes missed drivers doing it anyway. My main goal here is
>> to make things consistent between SoC families, so if one piece of
>> information is provided by a number of them, I'd rather have a standard
>> attribute, or a common way of encoding this in the existing attributes
>> than to have too many custom attributes with similar names.
> 
> The major/minor attributes that we have on Tegra SoCs should be easy to
> standardize. It seems like those could be fairly common. The other one
> that we have is the "platform" one, which I suppose is not as easy to
> standardize. I don't recall the exact details, but I think we're mostly
> interested in whether or not the platform is simulation or silicon. The
> exact simulation value is not something that userspace scripts will look
> at, as far as I recall.
> 
> Jon, correct me if I'm wrong.

There are a few different simulation types and I have seen some userspace
code convert the value and display the actual type. In reality I am not
sure how much this is used, but at least identifying that this is
silicon is widely used, from what I have seen.

Jon

-- 
nvpublic

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 12:36           ` Arnd Bergmann
@ 2022-07-14  6:49             ` Jon Hunter
  0 siblings, 0 replies; 24+ messages in thread
From: Jon Hunter @ 2022-07-14  6:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thierry Reding, arm-soc, SoC Team,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter


On 13/07/2022 13:36, Arnd Bergmann wrote:
> On Wed, Jul 13, 2022 at 2:19 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>> On 13/07/2022 13:14, Arnd Bergmann wrote:
>>>>> For the other patches, I found two more problems:
>>>>>
>>>>>> Bitan Biswas (1):
>>>>>>         soc/tegra: fuse: Expose Tegra production status
>>>>>
>>>>> Please don't just add random attributes in the soc device infrastructure.
>>>>> This one has a completely generic name but a SoC specific
>>>>> meaning, and it lacks a description in Documentation/ABI.
>>>>> Not sure what the right ABI is here, but this is something that needs
>>>>> to be discussed more broadly when you send a new version.
>>>>
>>>> I wasn't aware that the SoC device infrastructure was restricted to only
>>>> standardized attributes. Looks like there are a few other outliers that
>>>> add custom attributes: UX500, ARM Integrator and RealView, and OMAP2.
>>>>
>>>> Do we have some other place where this kind of thing can be exposed? Or
>>>> do we just need to come up with some better way of namespacing these?
>>>> Perhaps it would also be sufficient if all of these were better
>>>> documented so that people know what to look for on their platform of
>>>> interest.
>>>
>>> It's not a 100% strict rule, I've just tried to limit it as much as possible,
>>> and sometimes missed drivers doing it anyway. My main goal here is
>>> to make things consistent between SoC families, so if one piece of
>>> information is provided by a number of them, I'd rather have a standard
>>> attribute, or a common way of encoding this in the existing attributes
>>> than to have too many custom attributes with similar names.
>>
>>
>> Makes sense. Any recommendations for this specific attribute? I could
>> imagine other vendors may have engineering devices and production
>> versions. This is slightly different from the silicon version.
> 
> Not sure, I haven't seen this one referenced elsewhere so far.
> 
> What is the actual information this encodes in your case? Is this fused
> down in a way that production devices lose access to certain features
> that could be security critical but are useful for development?

Yes, I believe it is precisely that. I am not clear on the exact details,
but I see a lot of references to this throughout our userspace and
testing code.

Jon

-- 
nvpublic

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 12:14       ` Arnd Bergmann
  2022-07-13 12:19         ` Jon Hunter
  2022-07-13 20:22         ` Thierry Reding
@ 2022-07-14 13:31         ` Borislav Petkov
  2022-07-15  8:06           ` Sumit Gupta
  2022-09-27 16:00           ` Thierry Reding
  2 siblings, 2 replies; 24+ messages in thread
From: Borislav Petkov @ 2022-07-14 13:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thierry Reding, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter

On Wed, Jul 13, 2022 at 02:14:27PM +0200, Arnd Bergmann wrote:
> I think this is just a reflection of what other hardware can do:
> most machines only detect memory errors, but the EDAC subsystem
> can work with any type in principle. There are also a lot of
> conditions elsewhere that can be detected but not corrected.

Just a couple of thoughts from looking at this:

So the EDAC thing reports *hardware* errors by using the RAS
capabilities built into an IP block. So it started with memory
controllers but it is getting extended to other blocks. AMD are looking
at how to integrate GPU hw error reporting into it, for example.

Looking at that CBB thing, it looks like it is supposed to report not
so much hardware errors but operational errors. Some of the hw errors
reported by RAS hw are also operation-related but not the majority.

Then, EDAC has these counters exposed in:

$ grep -r . /sys/devices/system/edac/
/sys/devices/system/edac/power/runtime_active_time:0
/sys/devices/system/edac/power/runtime_status:unsupported
/sys/devices/system/edac/power/runtime_suspended_time:0
/sys/devices/system/edac/power/control:auto
/sys/devices/system/edac/pci/edac_pci_log_pe:1
/sys/devices/system/edac/pci/pci0/pe_count:0
/sys/devices/system/edac/pci/pci0/npe_count:0
/sys/devices/system/edac/pci/pci_parity_count:0
/sys/devices/system/edac/pci/pci_nonparity_count:0
/sys/devices/system/edac/pci/edac_pci_log_npe:1
/sys/devices/system/edac/pci/edac_pci_panic_on_pe:0
/sys/devices/system/edac/pci/check_pci_errors:0
/sys/devices/system/edac/mc/power/runtime_active_time:0
/sys/devices/system/edac/mc/power/runtime_status:unsupported
...

with the respective hierarchy: memory controllers, PCI errors, etc.
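
For tooling that wants one of these counters rather than a full grep, a
minimal sketch in C follows. The helper name read_counter() is
hypothetical; the only assumption about the files is that each holds a
single decimal value, as in the listing above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read a single decimal counter from a sysfs-style file.
 * Hypothetical helper: returns 0 and stores the value in *out on
 * success, -1 if the file cannot be read or does not hold a number.
 */
int read_counter(const char *path, long *out)
{
	char buf[64];
	char *end;
	long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -1;

	*out = val;
	return 0;
}
```

Called with /sys/devices/system/edac/pci/pci_parity_count, for example,
it would hand back the number shown after the colon in the listing.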

So the main question is, does it make sense for you to fit this into the
EDAC hierarchy and what would even be the advantage of making it part of
EDAC?

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-13 20:22         ` Thierry Reding
  2022-07-14  6:30           ` Jon Hunter
@ 2022-07-14 14:45           ` Arnd Bergmann
  1 sibling, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2022-07-14 14:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Arnd Bergmann, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse,
	Robert Richter

On Wed, Jul 13, 2022 at 10:22 PM Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Wed, Jul 13, 2022 at 02:14:27PM +0200, Arnd Bergmann wrote:
> >
> > It's not a 100% strict rule, I've just tried to limit it as much as possible,
> > and sometimes missed drivers doing it anyway. My main goal here is
> > to make things consistent between SoC families, so if one piece of
> > information is provided by a number of them, I'd rather have a standard
> > attribute, or a common way of encoding this in the existing attributes
> > than to have too many custom attributes with similar names.
>
> The major/minor attributes that we have on Tegra SoCs should be easy to
> standardize. It seems like those could be fairly common.

I think these can just be folded into one of the other attributes, probably
either revision or soc_id depending on what they actually refer to.

These properties are intentionally free-text fields that you can match
using wildcards with the soc_device_match() function. If I read this
part right, the information is already available in the soc_id field,
so we don't even need to change anything here.
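
To illustrate the wildcard matching, here is a minimal userspace sketch.
It is not the kernel implementation: it uses fnmatch(3) as a stand-in
for the kernel's glob matching, and struct soc_attrs, soc_attrs_match()
and any attribute values passed to it are hypothetical names invented
for this example.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the free-text soc attributes discussed above. */
struct soc_attrs {
	const char *family;
	const char *revision;
	const char *soc_id;
};

/* Match one field: a NULL pattern means "don't care". */
static bool field_matches(const char *pattern, const char *value)
{
	if (!pattern)
		return true;
	if (!value)
		return false;
	/* fnmatch() returns 0 when the value matches the pattern. */
	return fnmatch(pattern, value, 0) == 0;
}

/*
 * Return true if every non-NULL pattern in @match matches the
 * corresponding field of @soc.
 */
bool soc_attrs_match(const struct soc_attrs *soc,
		     const struct soc_attrs *match)
{
	return field_matches(match->family, soc->family) &&
	       field_matches(match->revision, soc->revision) &&
	       field_matches(match->soc_id, soc->soc_id);
}
```

With this, a revision pattern such as "A0*" matches "A01" and "A02"
alike, which is the kind of matching that lets major/minor information
live in an existing free-text field instead of a custom attribute.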

> The other one
> that we have is the "platform" one, which I suppose is not as easy to
> standardize. I don't recall the exact details, but I think we're mostly
> interested in whether or not the platform is simulation or silicon. The
> exact simulation value is not something that userspace scripts will look
> at, as far as I recall.

This also looks like it's part of the chip_id.

> > > > > YueHaibing (1):
> > > > >      soc/tegra: fuse: Add missing DMADEVICES dependency
> > > >
> > > > This one fixes the warning the wrong way: we don't 'select' random
> > > > drivers from other subsystems, and selecting the entire
> > > > subsystem makes it worse. Just drop the 'select' here and
> > > > enable the drivers in the defconfig.
> > >
> > > This doesn't actually select the DMADEVICES symbol. It adds a
> > > dependency on DMADEVICES and if that is met it will select
> > > TEGRA20_APB_DMA.
> >
> > My mistake. However, I still think it's wrong to select
> > TEGRA20_APB_DMA here, unless there is a build-time
> > dependency that prevents it from being compiled otherwise.
> >
> > The dmaengine subsystem is meant to abstract the relation
> > between the drivers using DMA and those providing the feature,
> > the same way we abstract all the other subsystems. The
> > fuse driver may only be used on machines that use
> > TEGRA20_APB_DMA, but neither the driver code nor
> > Kconfig should care about that.
>
> This dependency has existed for quite a while and my recollection is
> that we wanted to make this very explicit because the lack of the
> TEGRA20_APB_DMA driver makes the FUSE driver completely useless on
> Tegra20 and that in turn has a very negative impact on the rest of the
> system, so we deemed a default configuration change insufficient.
>
> Perhaps a better way to solve this would be to make TEGRA20_APB_DMA
> default to "y" if ARCH_TEGRA_2x_SOC. And then perhaps make the FUSE
> driver depend on DMADEVICES. That still wouldn't ensure that we get
> SOC_TEGRA_FUSE enabled automatically all the time, but perhaps it'd
> document the dependency a bit more explicitly.

Ok, this sounds good to me.

          Arnd

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-14 13:31         ` Borislav Petkov
@ 2022-07-15  8:06           ` Sumit Gupta
  2022-07-28 17:34             ` Thierry Reding
  2022-09-27 16:00           ` Thierry Reding
  1 sibling, 1 reply; 24+ messages in thread
From: Sumit Gupta @ 2022-07-15  8:06 UTC (permalink / raw)
  To: Borislav Petkov, Arnd Bergmann
  Cc: Thierry Reding, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter,
	Sumit Gupta, bbasu, Vikram Sethi

Hi Arnd, Boris,

Thank you for your inputs.

>> I think this is just a reflection of what other hardware can do:
>> most machines only detect memory errors, but the EDAC subsystem
>> can work with any type in principle. There are also a lot of
>> conditions elsewhere that can be detected but not corrected.
> 
> Just a couple of thoughts from looking at this:
> 
> So the EDAC thing reports *hardware* errors by using the RAS
> capabilities built into an IP block. So it started with memory
> controllers but it is getting extended to other blocks. AMD are looking
> at how to integrate GPU hw errors reporting into it, for example.
> 
> Looking at that CBB thing, it looks like it is supposed to report not
> so much hardware errors but operational errors. Some of the hw errors
> reported by RAS hw are also operation-related but not the majority.
> 

The CBB driver reports errors caused by bad MMIO accesses made by software.
The vast majority of CBB errors tend to be programming errors in
setting up address windows, which lead to decode errors.

> Then, EDAC has these counters exposed in:
> 
> $ grep -r . /sys/devices/system/edac/
> /sys/devices/system/edac/power/runtime_active_time:0
> /sys/devices/system/edac/power/runtime_status:unsupported
> /sys/devices/system/edac/power/runtime_suspended_time:0
> /sys/devices/system/edac/power/control:auto
> /sys/devices/system/edac/pci/edac_pci_log_pe:1
> /sys/devices/system/edac/pci/pci0/pe_count:0
> /sys/devices/system/edac/pci/pci0/npe_count:0
> /sys/devices/system/edac/pci/pci_parity_count:0
> /sys/devices/system/edac/pci/pci_nonparity_count:0
> /sys/devices/system/edac/pci/edac_pci_log_npe:1
> /sys/devices/system/edac/pci/edac_pci_panic_on_pe:0
> /sys/devices/system/edac/pci/check_pci_errors:0
> /sys/devices/system/edac/mc/power/runtime_active_time:0
> /sys/devices/system/edac/mc/power/runtime_status:unsupported
> ...
> 
> with the respective hierarchy: memory controllers, PCI errors, etc.
> 
> So the main question is, does it make sense for you to fit this into the
> EDAC hierarchy and what would even be the advantage of making it part of
> EDAC?
> 

I also think this doesn't fit with the errors reported by EDAC, which
are mainly hardware errors, as Boris explained.
Please share your thoughts and let us know whether we can merge the
patches as they are.

> HTH.
> 
> --
> Regards/Gruss,
>      Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-15  8:06           ` Sumit Gupta
@ 2022-07-28 17:34             ` Thierry Reding
  2022-08-22  9:31               ` Sumit Gupta
  0 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2022-07-28 17:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Sumit Gupta, Borislav Petkov, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter,
	bbasu, Vikram Sethi

On Fri, Jul 15, 2022 at 01:36:16PM +0530, Sumit Gupta wrote:
> Hi Arnd, Boris,
> 
> Thank you for your inputs.
> 
> > > I think this is just a reflection of what other hardware can do:
> > > most machines only detect memory errors, but the EDAC subsystem
> > > can work with any type in principle. There are also a lot of
> > > conditions elsewhere that can be detected but not corrected.
> > 
> > Just a couple of thoughts from looking at this:
> > 
> > So the EDAC thing reports *hardware* errors by using the RAS
> > capabilities built into an IP block. So it started with memory
> > controllers but it is getting extended to other blocks. AMD are looking
> > at how to integrate GPU hw errors reporting into it, for example.
> > 
> > Looking at that CBB thing, it looks like it is supposed to report not
> > so much hardware errors but operational errors. Some of the hw errors
> > reported by RAS hw are also operation-related but not the majority.
> > 
> 
> CBB driver reports errors due to bad MMIO accesses within software.
> The vast majority of the CBB errors tend to be programming errors in setting
> up address windows leading to decode errors.
> 
> > Then, EDAC has this counters exposed in:
> > 
> > $ grep -r . /sys/devices/system/edac/
> > /sys/devices/system/edac/power/runtime_active_time:0
> > /sys/devices/system/edac/power/runtime_status:unsupported
> > /sys/devices/system/edac/power/runtime_suspended_time:0
> > /sys/devices/system/edac/power/control:auto
> > /sys/devices/system/edac/pci/edac_pci_log_pe:1
> > /sys/devices/system/edac/pci/pci0/pe_count:0
> > /sys/devices/system/edac/pci/pci0/npe_count:0
> > /sys/devices/system/edac/pci/pci_parity_count:0
> > /sys/devices/system/edac/pci/pci_nonparity_count:0
> > /sys/devices/system/edac/pci/edac_pci_log_npe:1
> > /sys/devices/system/edac/pci/edac_pci_panic_on_pe:0
> > /sys/devices/system/edac/pci/check_pci_errors:0
> > /sys/devices/system/edac/mc/power/runtime_active_time:0
> > /sys/devices/system/edac/mc/power/runtime_status:unsupported
> > ...
> > 
> > with the respective hierarchy: memory controllers, PCI errors, etc.
> > 
> > So the main question is, does it make sense for you to fit this into the
> > EDAC hierarchy and what would even be the advantage of making it part of
> > EDAC?
> > 
> 
> I also think this doesn't seem to fit with the errors reported by EDAC which
> are mainly hardware errors as Boris explained.
> Please share your thoughts and if we can merge the patches as it is.

Arnd,

any more thoughts on this? Looks like there is no consensus on where
this should go. If it's okay for this to go in via ARM SoC after all,
I could prepare another pull request including only the CBB changes
along with some of the reference count fixes. I could possibly also
rework the DMADEVICES dependency patch as discussed, or we could defer
it if it's too risky at this point.

Thierry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-28 17:34             ` Thierry Reding
@ 2022-08-22  9:31               ` Sumit Gupta
  0 siblings, 0 replies; 24+ messages in thread
From: Sumit Gupta @ 2022-08-22  9:31 UTC (permalink / raw)
  To: Arnd Bergmann, Thierry Reding
  Cc: Borislav Petkov, arm-soc, SoC Team, Jon Hunter,
	open list:TEGRA ARCHITECTURE SUPPORT, Linux ARM, linux-edac,
	Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter,
	bbasu, Vikram Sethi, Sumit Gupta



> On Fri, Jul 15, 2022 at 01:36:16PM +0530, Sumit Gupta wrote:
>> Hi Arnd, Boris,
>>
>> Thank you for your inputs.
>>
>>>> I think this is just a reflection of what other hardware can do:
>>>> most machines only detect memory errors, but the EDAC subsystem
>>>> can work with any type in principle. There are also a lot of
>>>> conditions elsewhere that can be detected but not corrected.
>>> Just a couple of thoughts from looking at this:
>>>
>>> So the EDAC thing reports *hardware* errors by using the RAS
>>> capabilities built into an IP block. So it started with memory
>>> controllers but it is getting extended to other blocks. AMD are looking
>>> at how to integrate GPU hw errors reporting into it, for example.
>>>
>>> Looking at that CBB thing, it looks like it is supposed to report not
>>> so much hardware errors but operational errors. Some of the hw errors
>>> reported by RAS hw are also operation-related but not the majority.
>>>
>> CBB driver reports errors due to bad MMIO accesses within software.
>> The vast majority of the CBB errors tend to be programming errors in setting
>> up address windows leading to decode errors.
>>
>>> Then, EDAC has this counters exposed in:
>>>
>>> $ grep -r . /sys/devices/system/edac/
>>> /sys/devices/system/edac/power/runtime_active_time:0
>>> /sys/devices/system/edac/power/runtime_status:unsupported
>>> /sys/devices/system/edac/power/runtime_suspended_time:0
>>> /sys/devices/system/edac/power/control:auto
>>> /sys/devices/system/edac/pci/edac_pci_log_pe:1
>>> /sys/devices/system/edac/pci/pci0/pe_count:0
>>> /sys/devices/system/edac/pci/pci0/npe_count:0
>>> /sys/devices/system/edac/pci/pci_parity_count:0
>>> /sys/devices/system/edac/pci/pci_nonparity_count:0
>>> /sys/devices/system/edac/pci/edac_pci_log_npe:1
>>> /sys/devices/system/edac/pci/edac_pci_panic_on_pe:0
>>> /sys/devices/system/edac/pci/check_pci_errors:0
>>> /sys/devices/system/edac/mc/power/runtime_active_time:0
>>> /sys/devices/system/edac/mc/power/runtime_status:unsupported
>>> ...
>>>
>>> with the respective hierarchy: memory controllers, PCI errors, etc.
>>>
>>> So the main question is, does it make sense for you to fit this into the
>>> EDAC hierarchy and what would even be the advantage of making it part of
>>> EDAC?
>>>
>> I also think this doesn't seem to fit with the errors reported by EDAC which
>> are mainly hardware errors as Boris explained.
>> Please share your thoughts and if we can merge the patches as it is.
> Arnd,
> 
> any more thoughts on this? Looks like there is no consensus on where
> this should go. If it's okay for this to go in via ARM SoC after all,
> I could prepare another pull request including only the CBB changes
> along with some of the reference count fixes. I could possibly also
> rework the DMADEVICES dependency patch as discussed, or we could defer
> it if it's too risky at this point.
> 
> Thierry

Hi Arnd, Thierry,
Gentle ping.

If we are OK with the reasoning, can we please queue the patch
series for 6.1?

Thank you,
Sumit

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [GIT PULL 1/7] soc/tegra: Changes for v5.20-rc1
  2022-07-14 13:31         ` Borislav Petkov
  2022-07-15  8:06           ` Sumit Gupta
@ 2022-09-27 16:00           ` Thierry Reding
  1 sibling, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2022-09-27 16:00 UTC (permalink / raw)
  To: Borislav Petkov, Arnd Bergmann
  Cc: arm, soc, Jon Hunter, linux-tegra, linux-arm-kernel, linux-edac,
	Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter,
	Rahul Bedarkar

On Thu, Jul 14, 2022 at 03:31:07PM +0200, Borislav Petkov wrote:
> On Wed, Jul 13, 2022 at 02:14:27PM +0200, Arnd Bergmann wrote:
> > I think this is just a reflection of what other hardware can do:
> > most machines only detect memory errors, but the EDAC subsystem
> > can work with any type in principle. There are also a lot of
> > conditions elsewhere that can be detected but not corrected.
> 
> Just a couple of thoughts from looking at this:
> 
> So the EDAC thing reports *hardware* errors by using the RAS
> capabilities built into an IP block. So it started with memory
> controllers but it is getting extended to other blocks. AMD are looking
> at how to integrate GPU hw errors reporting into it, for example.
> 
> Looking at that CBB thing, it looks like it is supposed to report not
> so much hardware errors but operational errors. Some of the hw errors
> reported by RAS hw are also operation-related but not the majority.
> 
> Then, EDAC has this counters exposed in:
> 
> $ grep -r . /sys/devices/system/edac/
> /sys/devices/system/edac/power/runtime_active_time:0
> /sys/devices/system/edac/power/runtime_status:unsupported
> /sys/devices/system/edac/power/runtime_suspended_time:0
> /sys/devices/system/edac/power/control:auto
> /sys/devices/system/edac/pci/edac_pci_log_pe:1
> /sys/devices/system/edac/pci/pci0/pe_count:0
> /sys/devices/system/edac/pci/pci0/npe_count:0
> /sys/devices/system/edac/pci/pci_parity_count:0
> /sys/devices/system/edac/pci/pci_nonparity_count:0
> /sys/devices/system/edac/pci/edac_pci_log_npe:1
> /sys/devices/system/edac/pci/edac_pci_panic_on_pe:0
> /sys/devices/system/edac/pci/check_pci_errors:0
> /sys/devices/system/edac/mc/power/runtime_active_time:0
> /sys/devices/system/edac/mc/power/runtime_status:unsupported
> ...
> 
> with the respective hierarchy: memory controllers, PCI errors, etc.
> 
> So the main question is, does it make sense for you to fit this into the
> EDAC hierarchy and what would even be the advantage of making it part of
> EDAC?

Closing the loop on this: we've decided to keep this in drivers/soc for
now, with the option of re-evaluating when we encounter similar
functionality on other hardware.

I'm also going to hijack the thread because something else came up
recently that fits the audience here and is along the same lines:
Tegra234 has a mechanism called FSI (Functional Safety Island) that
reports failures to an external MCU monitoring the system.

Special hardware exists in the SoC that can send these errors to the
MCU via different transports, and the idea is to report software-
detected failures from kernel drivers such as I2C or PCI via this
mechanism, so appropriate action can be taken. So essentially we're
looking at adding some new API, preferably something generic, to these
bus drivers along with "provider" drivers that get notified of these
reports so that they can be forwarded to the FSI (and then the MCU).
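
To illustrate the shape of what is being described, here is a rough
userspace sketch. Every name in it (failure_report, failure_provider,
failure_report_submit and so on) is hypothetical; no such kernel API exists
today, and the stand-in FSI provider only counts reports instead of talking
to any hardware. A real implementation would also need locking and
unregistration, which are left out.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical and exist only for this sketch. */
enum failure_source {
	FAILURE_SRC_I2C,
	FAILURE_SRC_PCI,
};

/* What a bus driver would hand over when it detects a failure. */
struct failure_report {
	enum failure_source source;
	int error;           /* negative errno-style code */
	const char *detail;  /* short human-readable description */
};

/* A "provider" gets notified of every report, e.g. to forward it. */
struct failure_provider {
	const char *name;
	void (*notify)(struct failure_provider *p,
		       const struct failure_report *r);
	void *priv;
	struct failure_provider *next;
};

static struct failure_provider *providers;

/* Add a provider to the front of the singly linked provider list. */
static void failure_provider_register(struct failure_provider *p)
{
	assert(p != NULL && p->notify != NULL);

	p->next = providers;
	providers = p;
}

/* Called by a bus driver; returns how many providers were notified. */
static int failure_report_submit(const struct failure_report *r)
{
	int notified = 0;

	for (struct failure_provider *p = providers; p; p = p->next) {
		p->notify(p, r);
		notified++;
	}

	return notified;
}

/* Stand-in for an FSI provider: counts reports instead of forwarding them. */
static void fsi_notify(struct failure_provider *p,
		       const struct failure_report *r)
{
	(void)r;
	(*(int *)p->priv)++;
}
```

In this sketch an I2C or PCI driver would only ever call
failure_report_submit(), and would not need to know whether an FSI provider
(or any other provider) is registered behind it.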

This again doesn't seem to be a great fit for EDAC as it is today, but
I also can't find anything better looking around the kernel. So I'm
wondering if this is something that others have encountered and might
have solved already and I just haven't found it, or if this is something
that would be worth creating a new subsystem for. Or perhaps this could
be integrated into EDAC somehow? I'm a bit reluctant to add yet another
custom infrastructure for this, given that it's functionality that
likely exists in other SoCs as well.

Any thoughts on this?

Thierry

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-09-27 16:00 UTC | newest]

Thread overview: 24+ messages
2022-07-08 18:56 [GIT PULL 0/7] NVIDIA Tegra changes for v5.20-rc1 Thierry Reding
2022-07-08 18:56 ` [GIT PULL 1/7] soc/tegra: Changes " Thierry Reding
2022-07-12 13:27   ` Arnd Bergmann
2022-07-13 10:58     ` Thierry Reding
2022-07-13 12:14       ` Arnd Bergmann
2022-07-13 12:19         ` Jon Hunter
2022-07-13 12:36           ` Arnd Bergmann
2022-07-14  6:49             ` Jon Hunter
2022-07-13 20:22         ` Thierry Reding
2022-07-14  6:30           ` Jon Hunter
2022-07-14 14:45           ` Arnd Bergmann
2022-07-14 13:31         ` Borislav Petkov
2022-07-15  8:06           ` Sumit Gupta
2022-07-28 17:34             ` Thierry Reding
2022-08-22  9:31               ` Sumit Gupta
2022-09-27 16:00           ` Thierry Reding
2022-07-08 18:56 ` [GIT PULL 2/7] firmware: tegra: " Thierry Reding
2022-07-08 18:56 ` [GIT PULL 3/7] dt-bindings: " Thierry Reding
2022-07-08 18:56 ` [GIT PULL 4/7] memory: tegra: " Thierry Reding
2022-07-08 18:56 ` [GIT PULL 5/7] ARM: tegra: Device tree changes " Thierry Reding
2022-07-08 18:56 ` [GIT PULL 6/7] arm64: " Thierry Reding
2022-07-08 18:56 ` [GIT PULL 7/7] arm64: tegra: Default configuration updates " Thierry Reding
2022-07-11 16:20 ` [GIT PULL 0/7] NVIDIA Tegra changes " patchwork-bot+linux-soc
2022-07-12 21:30 ` patchwork-bot+linux-soc
