From: Oded Gabbay <>
Subject: [git pull] habanalabs pull request for kernel 5.14
Date: Tue, 22 Jun 2021 12:36:27 +0300
Message-ID: <20210622093627.GA4857@CORE.localdomain>

Hi Greg,

This is the habanalabs pull request for the merge window of kernel 5.14.
The bulk of the commits relate to improvements of the communication
between the driver and the firmware that is running inside the ASIC.

There are also many small changes, new debug features, improvements, etc.
Details are in the tag.


The following changes since commit 91812dd0937cc6457e85f7733813c701ee971da5:

  bus: fsl-mc: mc-io: Correct misdocumentation of 'dpmcp_dev' param (2021-06-17 13:44:33 +0200)

are available in the Git repository at: tags/misc-habanalabs-next-2021-06-22

for you to fetch changes up to b7a71fddc0ddfdd66cdefcf5bf1f59a0f0bdea57:

  habanalabs/gaudi: refactor hard-reset related code (2021-06-21 10:21:51 +0300)

This tag contains habanalabs driver changes for v5.14:

- Change communication protocol with f/w. The new protocol allows better
  backward compatibility between different f/w versions and is more
- Send hard-reset cause to f/w after a hard-reset has happened.
- Move to indirection when generating interrupts to f/w.
- Better progress and error messages during the f/w load stage.
- Recognize that f/w has security enabled according to the device ID.
- Add validity check to event queue mechanism.
- Add new event from f/w that will indicate a daemon has been terminated
  inside the f/w.

- Move to TLB cache range invalidation in the device's MMU.
- Disable memory scrubbing by default for performance.

- Many fixes for sparse/smatch reported errors.
- Enable stop-on-err in the ASIC by default.
- Move to ASYNC device probing to speed up loading of the driver in
  servers with multiple devices.
- Fix to stop using disabled NIC ports when doing collective operations.
- Use standard error codes instead of positive values.
- Add support for resetting device after user has finished using it.
- Add debugfs option to avoid reset when a CS has got stuck.
- Add print of the last 8 CS pointers in case of error in QMANs.
- Add statistics on opening of the FD of a device.
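
As background for the "Use standard error codes instead of positive
values" item above: kernel code conventionally returns 0 on success and a
negative errno value on failure, so callers can test "if (rc)" and
propagate the code unchanged. The following is only a minimal sketch of
that convention. The helper check_port() and its arguments are
hypothetical and are not taken from the habanalabs driver.

```c
#include <errno.h>

/*
 * Hypothetical helper, for illustration only (not driver code).
 * Returns 0 on success or a negative errno value on failure,
 * instead of an ad-hoc positive status value.
 */
static int check_port(unsigned int port, unsigned int enabled_mask)
{
	/* Only ports 0..31 fit in the 32-bit mask. */
	if (port >= 32)
		return -EINVAL;

	/* The port exists but is not enabled in the mask. */
	if (!(enabled_mask & (1u << port)))
		return -ENODEV;

	return 0;
}
```

A caller would then write "rc = check_port(port, mask); if (rc) return rc;"
and the failure reason reaches the upper layers without translation.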

Alon Mizrahi (1):
      habanalabs: use mmu cache range invalidation

Bharat Jauhari (1):
      habanalabs: enable dram scramble before linux f/w

Christophe JAILLET (1):
      habanalabs: Fix an error handling path in 'hl_pci_probe()'

Guy Nisan (1):
      habanalabs: modify progress status messages

Koby Elbaz (15):
      habanalabs/gaudi: use scratchpad regs instead of GIC controller
      habanalabs/gaudi: send hard reset cause to preboot
      habanalabs/gaudi: read GIC sts after FW is loaded
      habanalabs: read preboot status bits in an earlier stage
      habanalabs/gaudi: disable GIC usage if security is enabled
      habanalabs/gaudi: use COMMS to reset device / halt CPU
      habanalabs/gaudi: set the correct cpu_id on MME2_QM failure
      habanalabs: small code refactoring
      habanalabs: zero complex structures using memset
      habanalabs: set rc as 'valid' in case of intentional func exit
      habanalabs: remove node from list before freeing the node
      habanalabs/gaudi: set the correct rc in case of err
      habanalabs/goya: add '__force' attribute to suppress false alarm
      habanalabs: get lower/upper 32 bits via masking
      habanalabs/gaudi: refactor hard-reset related code

Moti Haimovski (1):
      habanalabs: increase ELBI reset timeout for PLDM

Oded Gabbay (15):
      habanalabs: update firmware files to latest
      habanalabs: update to latest f/w headers
      habanalabs: use dev_dbg upon hint address failure
      habanalabs: ignore device unusable status
      habanalabs: better error print for pin failure
      habanalabs: notify before f/w loading
      habanalabs: set memory scrubbing to disabled by default
      habanalabs: check running index in eqe control
      habanalabs/gaudi: refactor reset code
      habanalabs/gaudi: don't use nic_ports_mask in compute
      habanalabs: prefer ASYNC device probing
      habanalabs/gaudi: update to latest f/w specs
      habanalabs/gaudi: use standard error codes
      habanalabs: print firmware versions
      habanalabs: remove a rogue #ifdef

Ofir Bitton (10):
      habanalabs: give FW a grace time for configuring iATU
      habanalabs/gaudi: do not move HBM bar if iATU done by FW
      habanalabs/gaudi: split host irq interfaces towards FW
      habanalabs/gaudi: don't use disabled ports in collective wait
      habanalabs/gaudi: add FW alive event support
      habanalabs: reset device upon FD close if not idle
      habanalabs: enable stop on error for all QMANs and engines
      habanalabs/gaudi: correct driver events numbering
      habanalabs: allow reset upon device release
      habanalabs/gaudi: add support for NIC DERR

Ohad Sharabi (17):
      habanalabs: prepare preboot stage to dynamic f/w load
      habanalabs: request f/w in separate function
      habanalabs: refactor init device cpu code
      habanalabs: use common fw_version read
      habanalabs: dynamic fw load reset protocol
      habanalabs: expose ASIC specific PCI info to common code
      habanalabs: load boot fit to device
      habanalabs: load linux image to device
      habanalabs: set dma mask from fw once fw done iatu config
      habanalabs: avoid using uninitialized pointer
      habanalabs: read f/w's 2-nd sts and err registers
      habanalabs: check if asic secured with asic type
      habanalabs: track security status using positive logic
      habanalabs: skip valid test for boot_dev_sts regs
      habanalabs: fix mask to obtain page offset
      habanalabs: report EQ fault during heartbeat
      habanalabs/gaudi: print last QM PQEs on error

Omer Shpigelman (2):
      habanalabs: add missing space after casting
      habanalabs: add hard reset timeout for PLDM

Tal Albo (1):
      habanalabs/gaudi: update coresight configuration

Tomer Tayar (2):
      habanalabs/gaudi: add ARB to QM stop on error masks
      habanalabs: print more info when failing to pin user memory

Yuri Nudelman (3):
      habanalabs: add debug flag to prevent failure on timeout
      habanalabs: added open_stats info ioctl
      debugfs: add skip_reset_on_timeout option

Zvika Yehudai (1):
      habanalabs: fix typo

farah kassabri (1):
      habanalabs: add validity check for signal cs

 .../ABI/testing/debugfs-driver-habanalabs          |    8 +
 .../misc/habanalabs/common/command_submission.c    |   81 +-
 drivers/misc/habanalabs/common/context.c           |    9 -
 drivers/misc/habanalabs/common/debugfs.c           |    5 +
 drivers/misc/habanalabs/common/device.c            |   82 +-
 drivers/misc/habanalabs/common/firmware_if.c       | 1806 ++++++++++++++++++--
 drivers/misc/habanalabs/common/habanalabs.h        |  280 ++-
 drivers/misc/habanalabs/common/habanalabs_drv.c    |   24 +-
 drivers/misc/habanalabs/common/habanalabs_ioctl.c  |   23 +-
 drivers/misc/habanalabs/common/hw_queue.c          |   42 +-
 drivers/misc/habanalabs/common/irq.c               |   24 +-
 drivers/misc/habanalabs/common/memory.c            |   22 +-
 drivers/misc/habanalabs/common/mmu/mmu.c           |   14 +-
 drivers/misc/habanalabs/common/pci/pci.c           |   34 +-
 drivers/misc/habanalabs/common/sysfs.c             |    2 +-
 drivers/misc/habanalabs/gaudi/gaudi.c              | 1010 ++++++-----
 drivers/misc/habanalabs/gaudi/gaudiP.h             |    1 +
 drivers/misc/habanalabs/gaudi/gaudi_coresight.c    |    6 +-
 drivers/misc/habanalabs/gaudi/gaudi_security.c     |   15 +-
 drivers/misc/habanalabs/goya/goya.c                |  251 +--
 drivers/misc/habanalabs/goya/goyaP.h               |    2 +-
 drivers/misc/habanalabs/goya/goya_coresight.c      |    2 +-
 drivers/misc/habanalabs/include/common/cpucp_if.h  |   45 +-
 .../misc/habanalabs/include/common/hl_boot_if.h    |  184 +-
 .../habanalabs/include/gaudi/gaudi_async_events.h  |   14 +-
 .../include/gaudi/gaudi_async_ids_map_extended.h   |   31 +-
 .../misc/habanalabs/include/gaudi/gaudi_fw_if.h    |   46 +
 .../misc/habanalabs/include/gaudi/gaudi_masks.h    |   15 +-
 .../misc/habanalabs/include/gaudi/gaudi_reg_map.h  |   10 +
 include/uapi/misc/habanalabs.h                     |   13 +
 30 files changed, 3216 insertions(+), 885 deletions(-)
