From: Jason Ekstrand
Subject: Re: [PATCH] i915, drm/fourcc: Improve the CCS modifier documentation
Date: Fri, 25 Aug 2017 15:50:04 -0700
To: Ville Syrjälä
Cc: Jason Ekstrand, Intel GFX, Ben Widawsky, Maling list - DRI developers
In-Reply-To: <20170825151004.GU4914@intel.com>
References: <1503081280-1813-1-git-send-email-jason.ekstrand@intel.com> <20170825151004.GU4914@intel.com>
List-Id: dri-devel@lists.freedesktop.org

On Fri, Aug 25, 2017 at 8:10 AM, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:

> On Fri, Aug 18, 2017 at 11:34:40AM -0700, Jason Ekstrand wrote:
> > This updates the documentation on the intel CCS modifiers for a couple
> > of reasons:
> >
> >  1) The old documentation required that the CCS modifier only be used
> >     with 8888 formats.  While i915 currently only supports CCS scanout
> >     with 8888 formats (and advertises such through the blob format), CCS
> >     can be used with many other formats.  Mesa, in particular, can
> >     handle CCS on the full range of formats supported by the hardware.
> >     For image sharing entirely within userspace, we don't want this
> >     restriction.
> >
> >  2) The old documentation specified the scaling factor in terms of
> >     pixels which breaks down when you start using formats which are not
> >     32-bit.  By specifying it in terms of cache lines and tiles, we can
> >     properly specify the scale-down relationship with no format size
> >     assumptions.
> >
> >  3) The new comment provides more detail about the "real" layout of CCS
> >     on Sky Lake and also points out that the reason why Y tiling is
> >     important is because it affects row pitch calculations.
> >
> >  4) We shouldn't be documenting the Yf CCS modifier yet.  Userspace is
> >     incapable of generating it right now and we don't fully know how it
> >     works yet.  Trying to fully describe it is premature.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  include/uapi/drm/drm_fourcc.h | 35 ++++++++++++++++++++++-------------
> >  1 file changed, 22 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > index 3ad838d..9670da4 100644
> > --- a/include/uapi/drm/drm_fourcc.h
> > +++ b/include/uapi/drm/drm_fourcc.h
> > @@ -266,21 +266,30 @@ extern "C" {
> >  /*
> >   * Intel color control surface (CCS) for render compression
> >   *
> > - * The framebuffer format must be one of the 8:8:8:8 RGB formats.
> > - * The main surface will be plane index 0 and must be Y/Yf-tiled,
> > - * the CCS will be plane index 1.
> > - *
> > - * Each CCS tile matches a 1024x512 pixel area of the main surface.
> > - * To match certain aspects of the 3D hardware the CCS is
> > - * considered to be made up of normal 128Bx32 Y tiles, Thus
> > - * the CCS pitch must be specified in multiples of 128 bytes.
> > - *
> > - * In reality the CCS tile appears to be a 64Bx64 Y tile, composed
> > - * of QWORD (8 bytes) chunks instead of OWORD (16 bytes) chunks.
> > - * But that fact is not relevant unless the memory is accessed
> > - * directly.
> > + * The image format must be compatible with CCS_E (lossless render
> > + * compression) as specified in the PRM for the relevant hardware.
> > + * The main surface will be plane index 0 and must be Y-tiled,
> > + * the CCS will be plane index 1 and is also Y-tiled.
> > + *
> > + * Each 64B cache line in the CCS (a region of 16B x 4 rows when
> > + * Y-tiled) corresponds to a region of 32x16 cache lines in the main
> > + * surface.  (As a corollary, each CCS tile corresponds to an area of
> > + * 32x16 tiles in the main surface.)  This relationship holds regardless
> > + * of the size of the number of bits per pixel of the image format.
> > + *
> > + * In reality, the cache lines in the CCS tile are proportioned in an
> > + * 8B x 8 row configuration with each byte being 2x2 2-bit CCS entries.
> > + * However, CCS is documented as Y-tiled with the scaling relationship
> > + * described in terms of cache lines as above so we consider it to be
> > + * Y-tiled for the purpose of specifying this modifier.  This means that
> > + * the row pitch for the CCS assumes 128B/tile.
> >   */
> >  #define I915_FORMAT_MOD_Y_TILED_CCS  fourcc_mod_code(INTEL, 4)
> > +
> > +/* Reserved for the Yf version of the TILED_CCS modifier.
> > + *
> > + * Exact definition TBD.
> > + */

> IIRC the same explanation can be used for both Y and Yf. The CCS
> surface is still using the wonky Y tile layout even if the main
> surface is Yf.

My reading of the docs indicates that the cache line correlation above
holds even for Yf and Ys.  However, I'm reluctant to base too much off
your R-E work because it was, as far as I know, entirely done with
32-bit formats.  For 32-bit formats, Yf and Y have the same tile size.
The only difference is that the cache lines are swizzled about a bit
more with Yf.  I'd like to confirm with some 64 and 128-bit formats
before trying to spec it.

--Jason
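
For reference, the tile relationship in the proposed comment can be sketched in C. This is only an illustration and not part of the patch: the helper names (div_round_up, ccs_pitch_bytes, ccs_height_rows) are hypothetical, and the sketch assumes a Y tile of 128B x 32 rows (as in the old comment) and ignores any extra alignment a real driver might apply.

```c
#include <stdint.h>

/* Assumed Y-tile geometry: 128 bytes wide by 32 rows. */
#define Y_TILE_WIDTH_B  128u
#define Y_TILE_HEIGHT   32u

/* Per the proposed comment: one CCS tile covers 32x16 main-surface tiles. */
#define CCS_TILES_X     32u
#define CCS_TILES_Y     16u

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* Hypothetical helper: CCS row pitch in bytes, given the main pitch in bytes. */
static uint32_t ccs_pitch_bytes(uint32_t main_pitch_bytes)
{
	uint32_t main_tiles_x = div_round_up(main_pitch_bytes, Y_TILE_WIDTH_B);

	/* The CCS pitch is counted in whole 128B-wide tiles. */
	return div_round_up(main_tiles_x, CCS_TILES_X) * Y_TILE_WIDTH_B;
}

/* Hypothetical helper: CCS height in rows, given the main height in rows. */
static uint32_t ccs_height_rows(uint32_t main_height_rows)
{
	uint32_t main_tiles_y = div_round_up(main_height_rows, Y_TILE_HEIGHT);

	return div_round_up(main_tiles_y, CCS_TILES_Y) * Y_TILE_HEIGHT;
}
```

Nothing here depends on bits per pixel, which is the point of stating the relationship in tiles. For a 1920x1080 surface with a 7680-byte pitch, the sketch gives a 256-byte CCS pitch and 96 CCS rows.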