From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jason Ekstrand <jason@jlekstrand.net>
Subject: Re: [PATCH] i915,
	drm/fourcc: Improve the CCS modifier documentation
Date: Fri, 25 Aug 2017 15:50:04 -0700
Message-ID: <CAOFGe97i=o9FwY1oJ+MOgNnT8uL2FHXNbdENOh47_yD138HarA@mail.gmail.com>
References: <1503081280-1813-1-git-send-email-jason.ekstrand@intel.com>
 <20170825151004.GU4914@intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0541483782=="
Return-path: <intel-gfx-bounces@lists.freedesktop.org>
In-Reply-To: <20170825151004.GU4914@intel.com>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
To: =?UTF-8?B?VmlsbGUgU3lyasOkbMOk?= <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>, Intel GFX <intel-gfx@lists.freedesktop.org>, Ben Widawsky <ben@bwidawsk.net>, Mailing list - DRI developers <dri-devel@lists.freedesktop.org>
List-Id: dri-devel@lists.freedesktop.org

--===============0541483782==
Content-Type: multipart/alternative; boundary="001a11443e2e20430205579bc4c7"

--001a11443e2e20430205579bc4c7
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Fri, Aug 25, 2017 at 8:10 AM, Ville Syrjälä <
ville.syrjala@linux.intel.com> wrote:

> On Fri, Aug 18, 2017 at 11:34:40AM -0700, Jason Ekstrand wrote:
> > This updates the documentation on the intel CCS modifiers for a couple
> > of reasons:
> >
> >  1) The old documentation required that the CCS modifier only be used
> >     with 8888 formats.  While i915 currently only supports CCS scanout
> >     with 8888 formats (and advertises such through the blob format), CCS
> >     can be used with many other formats.  Mesa, in particular, can
> >     handle CCS on the full range of formats supported by the hardware.
> >     For image sharing entirely within userspace, we don't want this
> >     restriction.
> >
> >  2) The old documentation specified the scaling factor in terms of
> >     pixels which breaks down when you start using formats which are not
> >     32-bit.  By specifying it in terms of cache lines and tiles, we can
> >     properly specify the scale-down relationship with no format size
> >     assumptions.
> >
> >  3) The new comment provides more detail about the "real" layout of CCS
> >     on Sky Lake and also points out that the reason why Y tiling is
> >     important is because it affects row pitch calculations.
> >
> >  4) We shouldn't be documenting the Yf CCS modifier yet.  Userspace is
> >     incapable of generating it right now and we don't fully know how it
> >     works yet.  Trying to fully describe it is premature.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  include/uapi/drm/drm_fourcc.h | 35 ++++++++++++++++++++++-------------
> >  1 file changed, 22 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > index 3ad838d..9670da4 100644
> > --- a/include/uapi/drm/drm_fourcc.h
> > +++ b/include/uapi/drm/drm_fourcc.h
> > @@ -266,21 +266,30 @@ extern "C" {
> >  /*
> >   * Intel color control surface (CCS) for render compression
> >   *
> > - * The framebuffer format must be one of the 8:8:8:8 RGB formats.
> > - * The main surface will be plane index 0 and must be Y/Yf-tiled,
> > - * the CCS will be plane index 1.
> > - *
> > - * Each CCS tile matches a 1024x512 pixel area of the main surface.
> > - * To match certain aspects of the 3D hardware the CCS is
> > - * considered to be made up of normal 128Bx32 Y tiles, Thus
> > - * the CCS pitch must be specified in multiples of 128 bytes.
> > - *
> > - * In reality the CCS tile appears to be a 64Bx64 Y tile, composed
> > - * of QWORD (8 bytes) chunks instead of OWORD (16 bytes) chunks.
> > - * But that fact is not relevant unless the memory is accessed
> > - * directly.
> > + * The image format must be compatible with CCS_E (lossless render
> > + * compression) as specified in the PRM for the relevant hardware.
> > + * The main surface will be plane index 0 and must be Y-tiled,
> > + * the CCS will be plane index 1 and is also Y-tiled.
> > + *
> > + * Each 64B cache line in the CCS (a region of 16B x 4 rows when
> > + * Y-tiled) corresponds to a region of 32x16 cache lines in the main
> > + * surface.  (As a corollary, each CCS tile corresponds to an area of
> > + * 32x16 tiles in the main surface.)  This relationship holds regardless
> > + * of the size of the number of bits per pixel of the image format.
> > + *
> > + * In reality, the cache lines in the CCS tile are proportioned in an
> > + * 8B x 8 row configuration with each byte being 2x2 2-bit CCS entries.
> > + * However, CCS is documented as Y-tiled with the scaling relationship
> > + * described in terms of cache lines as above so we consider it to be
> > + * Y-tiled for the purpose of specifying this modifier.  This means that
> > + * the row pitch for the CCS assumes 128B/tile.
> >   */
> >  #define I915_FORMAT_MOD_Y_TILED_CCS  fourcc_mod_code(INTEL, 4)
> > +
> > +/* Reserved for the Yf version of the TILED_CCS modifier.
> > + *
> > + * Exact definition TBD.
> > + */
>
> IIRC the same explanation can be used for both Y and Yf. The CCS
> surface is still using the wonky Y tile layout even if the main
> surface is Yf.
>

My reading of the docs indicates that the cache line correlation above
holds even for Yf and Ys.  However, I'm reluctant to base too much on your
reverse-engineering work because, as far as I know, it was done entirely
with 32-bit formats.  For 32-bit formats, Yf and Y have the same tile size;
the only difference is that the cache lines are swizzled about a bit more
with Yf.  I'd like to confirm with some 64-bit and 128-bit formats before
trying to spec it.
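
To make the tile relationship in the quoted comment concrete for the
Y-tiled case, here is a minimal sketch of how a CCS pitch and height could
be derived from the main surface without assuming a pixel size.  It uses
only the numbers from the comment (128B x 32 row Y tiles, one CCS tile per
32x16 main-surface tiles).  The struct, the function names, and the choice
to round everything up to whole tiles are assumptions made for
illustration; this is not the i915 or Mesa implementation.

```c
#include <stdint.h>

/* Y-tile geometry as described in the comment: 128 bytes x 32 rows. */
#define Y_TILE_WIDTH_B  128u
#define Y_TILE_HEIGHT   32u

/* One CCS tile covers a 32x16 block of main-surface tiles. */
#define CCS_TILES_X     32u
#define CCS_TILES_Y     16u

/* Hypothetical result type, used only in this example. */
struct ccs_layout {
	uint32_t main_tiles_x;	/* main surface width in Y tiles */
	uint32_t main_tiles_y;	/* main surface height in Y tiles */
	uint32_t ccs_pitch;	/* CCS row pitch in bytes */
	uint32_t ccs_rows;	/* CCS height in rows */
};

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/*
 * cpp is bytes per pixel.  It only affects how many main-surface tiles
 * a row of pixels spans; the tile-to-tile scale factor stays 32x16.
 * Assumption: both surfaces are padded up to whole tiles.
 */
static struct ccs_layout ccs_layout_for(uint32_t width_px,
					uint32_t height_px,
					uint32_t cpp)
{
	struct ccs_layout l;

	l.main_tiles_x = div_round_up(width_px * cpp, Y_TILE_WIDTH_B);
	l.main_tiles_y = div_round_up(height_px, Y_TILE_HEIGHT);

	/* The CCS row pitch is counted in 128B Y tiles, per the comment. */
	l.ccs_pitch = div_round_up(l.main_tiles_x, CCS_TILES_X) * Y_TILE_WIDTH_B;
	l.ccs_rows = div_round_up(l.main_tiles_y, CCS_TILES_Y) * Y_TILE_HEIGHT;

	return l;
}
```

Under these assumptions a 1920x1080 surface gets a 256-byte CCS pitch and
96 CCS rows with a 32-bit format, and a 512-byte CCS pitch with a 64-bit
format.  Only the main-surface tile count changes with the pixel size,
which is the point of specifying the scale in tiles rather than pixels.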

--Jason

--001a11443e2e20430205579bc4c7--

--===============0541483782==--