From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com>
Cc: "intel-gfx@lists.freedesktop.org" <intel-gfx@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org" <dri-devel@lists.freedesktop.org>,
	Swati Sharma <swati2.sharma@intel.com>
Subject: Re: [PATCH 01/12] drm: Inline drm_color_lut_extract()
Date: Thu, 7 Nov 2019 17:43:04 +0200
Message-ID: <20191107154304.GD1208@intel.com>
In-Reply-To: <79b85116-1a88-5a2d-a1ea-46836a67bfa2@amd.com>

On Thu, Nov 07, 2019 at 03:31:28PM +0000, Kazlauskas, Nicholas wrote:
> On 2019-11-07 10:17 a.m., Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > This thing can get called several thousand times per LUT
> > so seems like we want to inline it to:
> > - avoid the function call overhead
> > - allow constant folding
> >
> > A quick synthetic test (w/o any hardware interaction) with
> > a ridiculously large LUT size shows about 50% reduction in
> > runtime on my HSW and BSW boxes. Slightly less with more
> > reasonable LUT size but still easily measurable in tens
> > of microseconds.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
>
> Seems reasonable to me. It would probably make sense to even split this
> further into two functions, one for high precision and one for low
> precision so it's purely a calculation and not hitting any branches.

Constant folding gets rid of it.

>
> Nicholas Kazlauskas
>
> > ---
> >  drivers/gpu/drm/drm_color_mgmt.c | 24 ------------------------
> >  include/drm/drm_color_mgmt.h     | 23 ++++++++++++++++++++++-
> >  2 files changed, 22 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
> > index 4ce5c6d8de99..19c5f635992a 100644
> > --- a/drivers/gpu/drm/drm_color_mgmt.c
> > +++ b/drivers/gpu/drm/drm_color_mgmt.c
> > @@ -108,30 +108,6 @@
> >   * standard enum values supported by the DRM plane.
> >   */
> >
> > -/**
> > - * drm_color_lut_extract - clamp and round LUT entries
> > - * @user_input: input value
> > - * @bit_precision: number of bits the hw LUT supports
> > - *
> > - * Extract a degamma/gamma LUT value provided by user (in the form of
> > - * &drm_color_lut entries) and round it to the precision supported by the
> > - * hardware.
> > - */
> > -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision)
> > -{
> > -	uint32_t val = user_input;
> > -	uint32_t max = 0xffff >> (16 - bit_precision);
> > -
> > -	/* Round only if we're not using full precision. */
> > -	if (bit_precision < 16) {
> > -		val += 1UL << (16 - bit_precision - 1);
> > -		val >>= 16 - bit_precision;
> > -	}
> > -
> > -	return clamp_val(val, 0, max);
> > -}
> > -EXPORT_SYMBOL(drm_color_lut_extract);
> > -
> >  /**
> >   * drm_crtc_enable_color_mgmt - enable color management properties
> >   * @crtc: DRM CRTC
> > diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
> > index d1c662d92ab7..069b21d61871 100644
> > --- a/include/drm/drm_color_mgmt.h
> > +++ b/include/drm/drm_color_mgmt.h
> > @@ -29,7 +29,28 @@
> >  struct drm_crtc;
> >  struct drm_plane;
> >
> > -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision);
> > +/**
> > + * drm_color_lut_extract - clamp and round LUT entries
> > + * @user_input: input value
> > + * @bit_precision: number of bits the hw LUT supports
> > + *
> > + * Extract a degamma/gamma LUT value provided by user (in the form of
> > + * &drm_color_lut entries) and round it to the precision supported by the
> > + * hardware.
> > + */
> > +static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision)
> > +{
> > +	u32 val = user_input;
> > +	u32 max = 0xffff >> (16 - bit_precision);
> > +
> > +	/* Round only if we're not using full precision. */
> > +	if (bit_precision < 16) {
> > +		val += 1UL << (16 - bit_precision - 1);
> > +		val >>= 16 - bit_precision;
> > +	}
> > +
> > +	return clamp_val(val, 0, max);
> > +}
> >
> >  void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc,
> >  				uint degamma_lut_size,
> >
>

-- 
Ville Syrjälä
Intel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
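
The rounding and the constant-folding point above can be tried outside the kernel. The following is only a rough userspace sketch of the quoted helper, not the kernel code itself: the names lut_extract() and lut_extract_10bit() are hypothetical, the kernel's u32 and clamp_val() are replaced with uint32_t and a plain comparison, and the 10-bit wrapper merely stands in for a driver that always passes a fixed precision.

```c
#include <stdint.h>

/*
 * Userspace approximation of the inlined helper from the patch.
 * Hypothetical name; uint32_t and a ternary stand in for the
 * kernel's u32 and clamp_val().
 */
static inline uint32_t lut_extract(uint32_t user_input, int bit_precision)
{
	uint32_t val = user_input;
	uint32_t max = 0xffff >> (16 - bit_precision);

	/* Round only if we're not using full precision. */
	if (bit_precision < 16) {
		/* Add half of the discarded range, then drop the low bits. */
		val += UINT32_C(1) << (16 - bit_precision - 1);
		val >>= 16 - bit_precision;
	}

	/* Rounding up from near 0xffff can overshoot max, so clamp. */
	return val > max ? max : val;
}

/*
 * Hypothetical caller with a compile-time constant precision. Once
 * lut_extract() is inlined here, an optimizing compiler is free to
 * fold the shift amounts and drop the bit_precision < 16 branch.
 */
static inline uint32_t lut_extract_10bit(uint32_t user_input)
{
	return lut_extract(user_input, 10);
}
```

For example, lut_extract(0xffff, 10) rounds up to 0x400 and is then clamped to 0x3ff, while lut_extract(0x1234, 16) returns its input unchanged. Whether the branch is actually removed depends on the compiler and optimization level, so this illustrates the argument rather than proving it.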