From: Matthew Brost <matthew.brost@intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH 00/14] Introduce xe_devcoredump.
Date: Tue, 2 May 2023 08:11:32 +0000
Message-ID: <ZFDFtDvdqsnCn5H0@DUT025-TGLU.fm.intel.com>
In-Reply-To: <20230426205713.512695-1-rodrigo.vivi@intel.com>

On Wed, Apr 26, 2023 at 04:56:59PM -0400, Rodrigo Vivi wrote:
> Xe needs to align with other drivers on the way that the error states are
> dumped, avoiding a Xe only error_state solution. The goal is to use devcoredump
> infrastructure to report error states, since it produces a standardized way
> by exposing a virtual and temporary /sys/class/devcoredump device.
> 
> The initial goal is to have the simple_error_state in the devcoredump
> so we start using the infrastructure.
> 
> But this is just a starting point for building a useful and
> organized crash dump, using standard infrastructure. Later this
> will be changed to have output that can be parsed by tools and
> used for error replay.
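
For context, here is a minimal userspace sketch of how such a dump could be
collected once the driver reports through devcoredump. It assumes the usual
devcoredump sysfs layout, where each dump shows up as a devcd<N> entry with a
"data" file under /sys/class/devcoredump. The function name
read_devcoredumps and its "root" parameter are made up for illustration and
are not part of this series.

```python
from pathlib import Path


def read_devcoredumps(root="/sys/class/devcoredump"):
    """Return {entry name: raw dump bytes} for every devcd* entry under root.

    Hypothetical helper: 'root' is a parameter only so the function can be
    pointed at a test directory instead of the real sysfs path.
    """
    dumps = {}
    base = Path(root)
    if not base.is_dir():
        # devcoredump not available, or no class directory at this path
        return dumps
    for entry in sorted(base.glob("devcd*")):
        data = entry / "data"
        if data.is_file():
            # The dump is exposed as a single blob; read it whole.
            dumps[entry.name] = data.read_bytes()
    return dumps
```

The entries are temporary: the kernel drops a dump after a timeout or once
userspace writes to its "data" file, so a collector would need to read it
promptly after the error is reported.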

We are certainly missing the GuC log. It would also be really nice to
get the ftrace included too. Not sure if the latter is easy; I know I
looked into this on i915 and couldn't figure it out, but this was a
while ago and I admittedly didn't try all that hard.

Matt 

> 
> Later, when we are in-tree, the goal is to collaborate with devcoredump
> infrastructure with overall possible improvements, like multiple file support
> for better organization of the dumps, snapshot support, dmesg extra print,
> and whatever may make sense and help the overall infrastructure.
> 
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Rodrigo Vivi (14):
>   drm/xe: Fix print of RING_EXECLIST_SQ_CONTENTS_HI
>   drm/xe: Introduce the dev_coredump infrastructure.
>   drm/xe: Do not take any action if our device was removed.
>   drm/xe: Extract non mapped regions out of GuC CTB into its own struct.
>   drm/xe: Convert GuC CT print to snapshot capture and print.
>   drm/xe: Add GuC CT snapshot to xe_devcoredump.
>   drm/xe: Introduce guc_submit_types.h with relevant structs.
>   drm/xe: Convert GuC Engine print to snapshot capture and print.
>   drm/xe: Add GuC Submit Engine snapshot to xe_devcoredump.
>   drm/xe: Convert Xe HW Engine print to snapshot capture and print.
>   drm/xe: Add HW Engine snapshot to xe_devcoredump.
>   drm/xe: Limit CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE to itself.
>   drm/xe: Convert VM print to snapshot capture and print.
>   drm/xe: Add VM snapshot to xe_devcoredump.
> 
>  drivers/gpu/drm/xe/Kconfig                |   1 +
>  drivers/gpu/drm/xe/Makefile               |   1 +
>  drivers/gpu/drm/xe/regs/xe_engine_regs.h  |   3 +-
>  drivers/gpu/drm/xe/xe_devcoredump.c       | 227 ++++++++++++++++++
>  drivers/gpu/drm/xe/xe_devcoredump.h       |  22 ++
>  drivers/gpu/drm/xe/xe_devcoredump_types.h |  60 +++++
>  drivers/gpu/drm/xe/xe_device_types.h      |   4 +
>  drivers/gpu/drm/xe/xe_execlist.c          |   4 +-
>  drivers/gpu/drm/xe/xe_gt_debugfs.c        |   2 +-
>  drivers/gpu/drm/xe/xe_guc_ct.c            | 275 +++++++++++++++-------
>  drivers/gpu/drm/xe/xe_guc_ct.h            |   7 +-
>  drivers/gpu/drm/xe/xe_guc_ct_types.h      |  46 +++-
>  drivers/gpu/drm/xe/xe_guc_fwif.h          |  29 ---
>  drivers/gpu/drm/xe/xe_guc_submit.c        | 258 ++++++++++++++------
>  drivers/gpu/drm/xe/xe_guc_submit.h        |  10 +-
>  drivers/gpu/drm/xe/xe_guc_submit_types.h  | 155 ++++++++++++
>  drivers/gpu/drm/xe/xe_hw_engine.c         | 210 ++++++++++++-----
>  drivers/gpu/drm/xe/xe_hw_engine.h         |   8 +-
>  drivers/gpu/drm/xe/xe_hw_engine_types.h   |  78 ++++++
>  drivers/gpu/drm/xe/xe_pci.c               |   2 +
>  drivers/gpu/drm/xe/xe_vm.c                | 140 +++++++++--
>  drivers/gpu/drm/xe/xe_vm.h                |   6 +-
>  drivers/gpu/drm/xe/xe_vm_types.h          |  18 ++
>  23 files changed, 1288 insertions(+), 278 deletions(-)
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.c
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.h
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump_types.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_submit_types.h
> 
> --
> 2.39.2
