All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Alan Previn <alan.previn.teres.alexis@intel.com>,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
	John Harrison <John.C.Harrison@Intel.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.19 16/38] drm/i915/guc: Dont update engine busyness stats too frequently
Date: Wed, 21 Sep 2022 17:46:00 +0200	[thread overview]
Message-ID: <20220921153646.796421323@linuxfoundation.org> (raw)
In-Reply-To: <20220921153646.298361220@linuxfoundation.org>

From: Alan Previn <alan.previn.teres.alexis@intel.com>

[ Upstream commit 59bcdb564b3bac3e86cc274e5dec05d4647ce47f ]

Using two different types of workoads, it was observed that
guc_update_engine_gt_clks was being called too frequently and/or
causing a CPU-to-lmem bandwidth hit over PCIE. Details on
the workloads and numbers are in the notes below.

Background: At the moment, guc_update_engine_gt_clks can be invoked
via one of 3 ways. #1 and #2 are infrequent under normal operating
conditions:
     1.When a predefined "ping_delay" timer expires so that GuC-
       busyness can sample the GTPM clock counter to ensure it
       doesn't miss a wrap-around of the 32-bits of the HW counter.
       (The ping_delay is calculated based on 1/8th the time taken
       for the counter go from 0x0 to 0xffffffff based on the
       GT frequency. This comes to about once every 28 seconds at a
       GT frequency of 19.2Mhz).
     2.In preparation for a gt reset.
     3.In response to __gt_park events (as the gt power management
       puts the gt into a lower power state when there is no work
       being done).

Root-cause: For both the workloads described farther below, it was
observed that when user space calls IOCTLs that unparks the
gt momentarily and repeats such calls many times in quick succession,
it triggers calling guc_update_engine_gt_clks as many times. However,
the primary purpose of guc_update_engine_gt_clks is to ensure we don't
miss the wraparound while the counter is ticking. Thus, the solution
is to ensure we skip that check if gt_park is calling this function
earlier than necessary.

Solution: Snapshot jiffies when we do actually update the busyness
stats. Then get the new jiffies every time intel_guc_busyness_park
is called and bail if we are being called too soon. Use half of the
ping_delay as a safe threshold.

NOTE1: Workload1: IGTs' gem_create was modified to create a file handle,
allocate memory with sizes that range from a min of 4K to the max supported
(in power of two step-sizes). Its maps, modifies and reads back the
memory. Allocations and modification is repeated until total memory
allocation reaches the max. Then the file handle is closed. With this
workload, guc_update_engine_gt_clks was called over 188 thousand times
in the span of 15 seconds while this test ran three times. With this patch,
the number of calls reduced to 14.

NOTE2: Workload2: 30 transcode sessions are created in quick succession.
While these sessions are created, pcm-iio tool was used to measure I/O
read operation bandwidth consumption sampled at 100 milisecond intervals
over the course of 20 seconds. The total bandwidth consumed over 20 seconds
without this patch was measured at average at 311KBps per sample. With this
patch, the number went down to about 175Kbps which is about a 43% savings.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623023157.211650-2-alan.previn.teres.alexis@intel.com
Stable-dep-of: aee5ae7c8492 ("drm/i915/guc: Cancel GuC engine busyness worker synchronously")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h            |  8 ++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 9feda105f913..a7acffbf15d1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -235,6 +235,14 @@ struct intel_guc {
 		 * @shift: Right shift value for the gpm timestamp
 		 */
 		u32 shift;
+
+		/**
+		 * @last_stat_jiffies: jiffies at last actual stats collection time
+		 * We use this timestamp to ensure we don't oversample the
+		 * stats because runtime power management events can trigger
+		 * stats collection at much higher rates than required.
+		 */
+		unsigned long last_stat_jiffies;
 	} timestamp;
 
 #ifdef CONFIG_DRM_I915_SELFTEST
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 26a051ef119d..96022f49f9b5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1365,6 +1365,8 @@ static void __update_guc_busyness_stats(struct intel_guc *guc)
 	unsigned long flags;
 	ktime_t unused;
 
+	guc->timestamp.last_stat_jiffies = jiffies;
+
 	spin_lock_irqsave(&guc->timestamp.lock, flags);
 
 	guc_update_pm_timestamp(guc, &unused);
@@ -1437,6 +1439,17 @@ void intel_guc_busyness_park(struct intel_gt *gt)
 		return;
 
 	cancel_delayed_work(&guc->timestamp.work);
+
+	/*
+	 * Before parking, we should sample engine busyness stats if we need to.
+	 * We can skip it if we are less than half a ping from the last time we
+	 * sampled the busyness stats.
+	 */
+	if (guc->timestamp.last_stat_jiffies &&
+	    !time_after(jiffies, guc->timestamp.last_stat_jiffies +
+			(guc->timestamp.ping_delay / 2)))
+		return;
+
 	__update_guc_busyness_stats(guc);
 }
 
-- 
2.35.1




  parent reply	other threads:[~2022-09-21 15:47 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21 15:45 [PATCH 5.19 00/38] 5.19.11-rc1 review Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 01/38] of: fdt: fix off-by-one error in unflatten_dt_nodes() Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 02/38] pinctrl: qcom: sc8180x: Fix gpio_wakeirq_map Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 03/38] pinctrl: qcom: sc8180x: Fix wrong pin numbers Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 04/38] pinctrl: rockchip: Enhance support for IRQ_TYPE_EDGE_BOTH Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 05/38] pinctrl: sunxi: Fix name for A100 R_PIO Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 06/38] SUNRPC: Fix call completion races with call_decode() Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 07/38] NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 08/38] gpio: mpc8xxx: Fix support for IRQ_TYPE_LEVEL_LOW flow_type in mpc85xx Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 09/38] NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 10/38] Revert "SUNRPC: Remove unreachable error condition" Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 11/38] drm/panel-edp: Fix delays for Innolux N116BCA-EA1 Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 12/38] drm/meson: Correct OSD1 global alpha value Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 13/38] drm/meson: Fix OSD1 RGB to YCbCr coefficient Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 14/38] drm/rockchip: vop2: Fix eDP/HDMI sync polarities Greg Kroah-Hartman
2022-09-21 15:45 ` [PATCH 5.19 15/38] drm/i915/vdsc: Set VDSC PIC_HEIGHT before using for DP DSC Greg Kroah-Hartman
2022-09-21 15:46 ` Greg Kroah-Hartman [this message]
2022-09-21 15:46 ` [PATCH 5.19 17/38] drm/i915/guc: Cancel GuC engine busyness worker synchronously Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 18/38] block: blk_queue_enter() / __bio_queue_enter() must return -EAGAIN for nowait Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 19/38] parisc: ccio-dma: Add missing iounmap in error path in ccio_probe() Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 20/38] of/device: Fix up of_dma_configure_id() stub Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 21/38] io_uring/msg_ring: check file type before putting Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 22/38] cifs: revalidate mapping when doing direct writes Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 23/38] cifs: dont send down the destination address to sendmsg for a SOCK_STREAM Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 24/38] cifs: always initialize struct msghdr smb_msg completely Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 25/38] blk-lib: fix blkdev_issue_secure_erase Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 26/38] parisc: Allow CONFIG_64BIT with ARCH=parisc Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 27/38] tools/include/uapi: Fix <asm/errno.h> for parisc and xtensa Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 28/38] drm/i915/gt: Fix perf limit reasons bit positions Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 29/38] drm/i915: Set correct domains values at _i915_vma_move_to_active Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 30/38] drm/amdgpu: make sure to init common IP before gmc Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 31/38] drm/amdgpu: Dont enable LTR if not supported Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 32/38] drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 33/38] drm/amdgpu: move nbio sdma_doorbell_range() into sdma " Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 34/38] net: Find dst with sks xfrm policy not ctl_sk Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 35/38] dt-bindings: apple,aic: Fix required item "apple,fiq-index" in affinity description Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 36/38] cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all() Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 37/38] ALSA: hda/sigmatel: Keep power up while beep is enabled Greg Kroah-Hartman
2022-09-21 15:46 ` [PATCH 5.19 38/38] ALSA: hda/sigmatel: Fix unused variable warning for beep power change Greg Kroah-Hartman
2022-09-22  7:11 ` [PATCH 5.19 00/38] 5.19.11-rc1 review Bagas Sanjaya
2022-09-22  7:14   ` Bagas Sanjaya
2022-09-22 16:42 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220921153646.796421323@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=John.C.Harrison@Intel.com \
    --cc=alan.previn.teres.alexis@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tvrtko.ursulin@intel.com \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.