All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matt Roper <matthew.d.roper@intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org
Subject: [PATCH v5 0/6] i915: SSEU handling updates
Date: Wed,  1 Jun 2022 08:07:19 -0700	[thread overview]
Message-ID: <20220601150725.521468-1-matthew.d.roper@intel.com> (raw)

This series reworks i915's internal handling of slice/subslice/EU (SSEU)                                                                             
data to represent platforms like Xe_HP in a more natural manner and to                                                                               
prepare for future platforms where the masks will need to grow in size.                                                                              
One key idea of this series is that although we have a fixed ABI to                                                                                  
convey SSEU data to userspace (i.e., multiple u8[] arrays with data                                                                                  
stored at different strides), we don't need to use this cumbersome                                                                                   
format for the driver's own internal storage.  As long as we can convert                                                                             
into the uapi form properly when responding to the I915_QUERY ioctl,                                                                                 
it's preferable to use an internal storage format that's easier for the                                                                              
driver to work with.                                                                                                                                 
                                                                                                                                                     
Another key point here is that we're reaching the point where subslice                                                                               
(DSS) masks will soon not fit within simple u32/u64 integer values.                                                                                  
Xe_HP SDV and DG2 platforms today have subslice (DSS) masks that are 32                                                                              
bits, which maxes out the current storage of a u32.  With PVC the masks                                                                              
are represented by a pair of 32-bit registers, requiring a bump up to at                                                                             
least 64-bits of storage internally.  We could switch to u64 for that in                                                                             
the short term, but since we already know that upcoming architectures                                                                                
intend to provide DSS fuse bits via three or more registers it's best to                                                                             
switch to a representation that's more future-proof but still easy to                                                                                
work with in the driver code.  To accomodate this, we start storing our                                                                              
subslice mask for Xe_HP and beyond in a new typedef that can be                                                                                      
processed by the linux/bitmap.h operations.                                                                                                          
                                                                                                                                                     
Finally, since no userspace for Xe_HP or beyond is using the legacy                                                                                  
I915_GETPARAM ioctl lookups for I915_PARAM_SLICE_MASK and                                                                                            
I915_PARAM_SUBSLICE_MASK (since they've migrated to the more flexible                                                                                
I915_QUERY ioctl that can return more than a simple u32 value), we take                                                                              
the opportunity to officially drop support for those GETPARAM lookups on                                                                             
modern platforms.  Maintaining support for these GETPARAM lookups don't                                                                              
make sense for a number of reasons:                                                                                                                  
                                                                                                                                                     
 * Traditional slices no longer exist, and newer ideas like gslices,                                                                                 
   cslices, mslices, etc. aren't something userspace needs to query                                                                                  
   since it can be inferred from other information.                                                                                                  
 * The GETPARAM ioctl doesn't have a way to distinguish between geometry                                                                             
   subslice masks and compute subslice masks, which are distinct on                                                                                  
   Xe_HP and beyond.                                                                                                                                 
 * The I915_GETPARAM ioctl is limited to returning a 32-bit value, so                                                                                
   when subslice masks begin to exceed 32-bits (on PVC), it simply can't                                                                             
   return the entire mask.                                                                                                                           
 * The GETPARAM ioctl doesn't have a way to give sensible information                                                                                
   for multi-tile devices.                                                                                                                           
                                                                                                                                                     
v2:                                                                                                                                                  
 - Switch to union of hsw/xehp formats to keep the representation in a                                                                               
   natural format for different types of hardware.                                                                                                   
 - Avoid accessing internals of intel_sseu_ss_mask_t directly outside of                                                                             
   intel_sseu.[ch].                                                                                                                                  
 - Include PVC SSEU which needs the larger SS mask storage enabled by                                                                                
   this series.                                                                                                                                      
                                                                                                                                                     
v3:                                                                                                                                                  
 - Correct a BIT(s) typo that should have been BIT(ss), causing                                                                                      
   incorrect handling on gen9 platforms.                                                                                                             
                                                                                                                                                     
v4:                                                                                                                                                  
 - Eliminate sseu->{ss,eu}_stride fields and just calculate the proper                                                                               
   value in the UAPI code that needs them.                                                                                                           
 - Handle unwanted ~u8 sign extension at the callsite instead of inside                                                                              
   sseu_set_eus.                                                                                                                                     
 - Use BITMAP_BITS() macro rather than passing I915_MAX_SS_FUSE_BITS                                                                                 
   around directly to bitmap operations.                                                                                                             
 - Improved debugfs / dmesg reporting for Xe_HP dumps                                                                                                
 - Various assertion check improvements.                                                                                                             
                                                                                                                                                     
v5:
 - Rebase to latest drm-tip (resolve trivial conflicts)
 - Move XEHP_BITMAP_BITS() to the header so that we can also replace a usage of
   I915_MAX_SS_FUSE_BITS in one of the inline functions.  (Bala)
 - Change the local variable in intel_slicemask_from_xehp_dssmask() from u16 to
   'unsigned long' to make it a bit more future-proof.
 - Incorporate ack's received from Tvrtko and Lionel.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Matt Roper (6):
  drm/i915/xehp: Use separate sseu init function
  drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK
  drm/i915/sseu: Simplify gen11+ SSEU handling
  drm/i915/sseu: Don't try to store EU mask internally in UAPI format
  drm/i915/sseu: Disassociate internal subslice mask representation from
    uapi
  drm/i915/pvc: Add SSEU changes

 drivers/gpu/drm/i915/gem/i915_gem_context.c  |   5 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c           |  12 +-
 drivers/gpu/drm/i915/gt/intel_gt_regs.h      |   1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c         | 450 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  92 ++--
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c |  30 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  |  24 +-
 drivers/gpu/drm/i915/i915_drv.h              |   2 +
 drivers/gpu/drm/i915/i915_getparam.c         |  11 +-
 drivers/gpu/drm/i915/i915_pci.c              |   3 +-
 drivers/gpu/drm/i915/i915_query.c            |  26 +-
 drivers/gpu/drm/i915/intel_device_info.h     |   1 +
 13 files changed, 397 insertions(+), 264 deletions(-)

-- 
2.35.3


WARNING: multiple messages have this Message-ID (diff)
From: Matt Roper <matthew.d.roper@intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: [Intel-gfx] [PATCH v5 0/6] i915: SSEU handling updates
Date: Wed,  1 Jun 2022 08:07:19 -0700	[thread overview]
Message-ID: <20220601150725.521468-1-matthew.d.roper@intel.com> (raw)

This series reworks i915's internal handling of slice/subslice/EU (SSEU)                                                                             
data to represent platforms like Xe_HP in a more natural manner and to                                                                               
prepare for future platforms where the masks will need to grow in size.                                                                              
One key idea of this series is that although we have a fixed ABI to                                                                                  
convey SSEU data to userspace (i.e., multiple u8[] arrays with data                                                                                  
stored at different strides), we don't need to use this cumbersome                                                                                   
format for the driver's own internal storage.  As long as we can convert                                                                             
into the uapi form properly when responding to the I915_QUERY ioctl,                                                                                 
it's preferable to use an internal storage format that's easier for the                                                                              
driver to work with.                                                                                                                                 
                                                                                                                                                     
Another key point here is that we're reaching the point where subslice                                                                               
(DSS) masks will soon not fit within simple u32/u64 integer values.                                                                                  
Xe_HP SDV and DG2 platforms today have subslice (DSS) masks that are 32                                                                              
bits, which maxes out the current storage of a u32.  With PVC the masks                                                                              
are represented by a pair of 32-bit registers, requiring a bump up to at                                                                             
least 64-bits of storage internally.  We could switch to u64 for that in                                                                             
the short term, but since we already know that upcoming architectures                                                                                
intend to provide DSS fuse bits via three or more registers it's best to                                                                             
switch to a representation that's more future-proof but still easy to                                                                                
work with in the driver code.  To accomodate this, we start storing our                                                                              
subslice mask for Xe_HP and beyond in a new typedef that can be                                                                                      
processed by the linux/bitmap.h operations.                                                                                                          
                                                                                                                                                     
Finally, since no userspace for Xe_HP or beyond is using the legacy                                                                                  
I915_GETPARAM ioctl lookups for I915_PARAM_SLICE_MASK and                                                                                            
I915_PARAM_SUBSLICE_MASK (since they've migrated to the more flexible                                                                                
I915_QUERY ioctl that can return more than a simple u32 value), we take                                                                              
the opportunity to officially drop support for those GETPARAM lookups on                                                                             
modern platforms.  Maintaining support for these GETPARAM lookups don't                                                                              
make sense for a number of reasons:                                                                                                                  
                                                                                                                                                     
 * Traditional slices no longer exist, and newer ideas like gslices,                                                                                 
   cslices, mslices, etc. aren't something userspace needs to query                                                                                  
   since it can be inferred from other information.                                                                                                  
 * The GETPARAM ioctl doesn't have a way to distinguish between geometry                                                                             
   subslice masks and compute subslice masks, which are distinct on                                                                                  
   Xe_HP and beyond.                                                                                                                                 
 * The I915_GETPARAM ioctl is limited to returning a 32-bit value, so                                                                                
   when subslice masks begin to exceed 32-bits (on PVC), it simply can't                                                                             
   return the entire mask.                                                                                                                           
 * The GETPARAM ioctl doesn't have a way to give sensible information                                                                                
   for multi-tile devices.                                                                                                                           
                                                                                                                                                     
v2:                                                                                                                                                  
 - Switch to union of hsw/xehp formats to keep the representation in a                                                                               
   natural format for different types of hardware.                                                                                                   
 - Avoid accessing internals of intel_sseu_ss_mask_t directly outside of                                                                             
   intel_sseu.[ch].                                                                                                                                  
 - Include PVC SSEU which needs the larger SS mask storage enabled by                                                                                
   this series.                                                                                                                                      
                                                                                                                                                     
v3:                                                                                                                                                  
 - Correct a BIT(s) typo that should have been BIT(ss), causing                                                                                      
   incorrect handling on gen9 platforms.                                                                                                             
                                                                                                                                                     
v4:                                                                                                                                                  
 - Eliminate sseu->{ss,eu}_stride fields and just calculate the proper                                                                               
   value in the UAPI code that needs them.                                                                                                           
 - Handle unwanted ~u8 sign extension at the callsite instead of inside                                                                              
   sseu_set_eus.                                                                                                                                     
 - Use BITMAP_BITS() macro rather than passing I915_MAX_SS_FUSE_BITS                                                                                 
   around directly to bitmap operations.                                                                                                             
 - Improved debugfs / dmesg reporting for Xe_HP dumps                                                                                                
 - Various assertion check improvements.                                                                                                             
                                                                                                                                                     
v5:
 - Rebase to latest drm-tip (resolve trivial conflicts)
 - Move XEHP_BITMAP_BITS() to the header so that we can also replace a usage of
   I915_MAX_SS_FUSE_BITS in one of the inline functions.  (Bala)
 - Change the local variable in intel_slicemask_from_xehp_dssmask() from u16 to
   'unsigned long' to make it a bit more future-proof.
 - Incorporate ack's received from Tvrtko and Lionel.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Matt Roper (6):
  drm/i915/xehp: Use separate sseu init function
  drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK
  drm/i915/sseu: Simplify gen11+ SSEU handling
  drm/i915/sseu: Don't try to store EU mask internally in UAPI format
  drm/i915/sseu: Disassociate internal subslice mask representation from
    uapi
  drm/i915/pvc: Add SSEU changes

 drivers/gpu/drm/i915/gem/i915_gem_context.c  |   5 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c           |  12 +-
 drivers/gpu/drm/i915/gt/intel_gt_regs.h      |   1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c         | 450 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  92 ++--
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c |  30 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  |  24 +-
 drivers/gpu/drm/i915/i915_drv.h              |   2 +
 drivers/gpu/drm/i915/i915_getparam.c         |  11 +-
 drivers/gpu/drm/i915/i915_pci.c              |   3 +-
 drivers/gpu/drm/i915/i915_query.c            |  26 +-
 drivers/gpu/drm/i915/intel_device_info.h     |   1 +
 13 files changed, 397 insertions(+), 264 deletions(-)

-- 
2.35.3


             reply	other threads:[~2022-06-01 15:07 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-01 15:07 Matt Roper [this message]
2022-06-01 15:07 ` [Intel-gfx] [PATCH v5 0/6] i915: SSEU handling updates Matt Roper
2022-06-01 15:07 ` [PATCH v5 1/6] drm/i915/xehp: Use separate sseu init function Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:07 ` [PATCH v5 2/6] drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:07 ` [PATCH v5 3/6] drm/i915/sseu: Simplify gen11+ SSEU handling Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:07 ` [PATCH v5 4/6] drm/i915/sseu: Don't try to store EU mask internally in UAPI format Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:07 ` [PATCH v5 5/6] drm/i915/sseu: Disassociate internal subslice mask representation from uapi Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:07 ` [PATCH v5 6/6] drm/i915/pvc: Add SSEU changes Matt Roper
2022-06-01 15:07   ` [Intel-gfx] " Matt Roper
2022-06-01 15:59 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for i915: SSEU handling updates Patchwork
2022-06-01 16:22 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-06-01 20:52 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2022-06-02 14:24   ` Matt Roper
2022-06-02 12:32 ` [PATCH v5 0/6] " Balasubramani Vivekanandan
2022-06-02 12:32   ` [Intel-gfx] " Balasubramani Vivekanandan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220601150725.521468-1-matthew.d.roper@intel.com \
    --to=matthew.d.roper@intel.com \
    --cc=balasubramani.vivekanandan@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.