From: Jocelyn Falempe <jfalempe@redhat.com>
To: dri-devel@lists.freedesktop.org, airlied@redhat.com,
	tzimmermann@suse.de, daniel@ffwll.ch
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Subject: [PATCH v2] drm/mgag200: Add a workaround for low-latency
Date: Thu,  8 Feb 2024 10:51:10 +0100
Message-ID: <20240208095125.377908-1-jfalempe@redhat.com>

We found a regression in v5.10 on a real-time server using the
rt-kernel and the mgag200 driver. It is a highly specialized
workload, with a <10us latency expectation on an isolated core.
Since v5.10, the real-time tasks miss their <10us latency target
whenever something is printed on the screen (fbcon or printk).

The regression has been bisected to 2 commits:
commit 0b34d58b6c32 ("drm/mgag200: Enable caching for SHMEM pages")
commit 4862ffaec523 ("drm/mgag200: Move vmap out of commit tail")

The first one changed the system memory framebuffer from Write-Combine
to the default caching.
Before the second commit, the mgag200 driver used to unmap the
framebuffer after each frame, which implicitly did a cache flush.
Both regressions are fixed by this commit, which restores the WC
mapping for the framebuffer in system memory and adds a cache flush.
This is only needed on x86_64, for low-latency workloads,
so the new Kconfig option DRM_MGAG200_IOBURST_WORKAROUND depends on
PREEMPT_RT and X86.

For more context, the whole thread can be found here [1].

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://lore.kernel.org/dri-devel/20231019135655.313759-1-jfalempe@redhat.com/ # 1
---
 drivers/gpu/drm/mgag200/Kconfig        | 12 ++++++++++++
 drivers/gpu/drm/mgag200/mgag200_drv.c  | 17 +++++++++++++++++
 drivers/gpu/drm/mgag200/mgag200_mode.c |  8 ++++++++
 3 files changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/mgag200/Kconfig b/drivers/gpu/drm/mgag200/Kconfig
index b28c5e4828f4..5e4d48df4854 100644
--- a/drivers/gpu/drm/mgag200/Kconfig
+++ b/drivers/gpu/drm/mgag200/Kconfig
@@ -11,3 +11,15 @@ config DRM_MGAG200
 	 MGA G200 desktop chips and the server variants. It requires 0.3.0
 	 of the modesetting userspace driver, and a version of mga driver
 	 that will fail on KMS enabled devices.
+
+config DRM_MGAG200_IOBURST_WORKAROUND
+	bool "Disable buffer caching"
+	depends on DRM_MGAG200 && PREEMPT_RT && X86
+	help
+	  Enable a workaround to avoid I/O bursts within the mgag200 driver at
+	  the expense of overall display performance.
+	  It restores the <v5.10 behavior, by mapping the framebuffer in system
+	  RAM as Write-Combining, and flushing the cache after each write.
+	  This is only useful on x86_64 if you want to run processes with
+	  deterministic latency.
+	  If unsure, say N.
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c
index 54fce00e2136..573dbe256aa8 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -84,6 +84,20 @@ resource_size_t mgag200_probe_vram(void __iomem *mem, resource_size_t size)
 	return offset - 65536;
 }
 
+#if defined(CONFIG_DRM_MGAG200_IOBURST_WORKAROUND)
+static struct drm_gem_object *mgag200_create_object(struct drm_device *dev, size_t size)
+{
+	struct drm_gem_shmem_object *shmem;
+
+	shmem = kzalloc(sizeof(*shmem), GFP_KERNEL);
+	if (!shmem)
+		return ERR_PTR(-ENOMEM);
+
+	shmem->map_wc = true;
+	return &shmem->base;
+}
+#endif
+
 /*
  * DRM driver
  */
@@ -99,6 +113,9 @@ static const struct drm_driver mgag200_driver = {
 	.major = DRIVER_MAJOR,
 	.minor = DRIVER_MINOR,
 	.patchlevel = DRIVER_PATCHLEVEL,
+#if defined(CONFIG_DRM_MGAG200_IOBURST_WORKAROUND)
+	.gem_create_object = mgag200_create_object,
+#endif
 	DRM_GEM_SHMEM_DRIVER_OPS,
 };
 
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index 0eb769dd76ce..e17cb4c5f774 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -13,6 +13,7 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_cache.h>
 #include <drm/drm_damage_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_format_helper.h>
@@ -436,6 +437,13 @@ static void mgag200_handle_damage(struct mga_device *mdev, const struct iosys_ma
 
 	iosys_map_incr(&dst, drm_fb_clip_offset(fb->pitches[0], fb->format, clip));
 	drm_fb_memcpy(&dst, fb->pitches, vmap, fb, clip);
+
+	/* Flushing the cache greatly improves latency on x86_64 */
+#if defined(CONFIG_DRM_MGAG200_IOBURST_WORKAROUND)
+	if (!vmap->is_iomem)
+		drm_clflush_virt_range(vmap->vaddr + clip->y1 * fb->pitches[0],
+				       drm_rect_height(clip) * fb->pitches[0]);
+#endif
 }
 
 /*

base-commit: 1f36d634670d8001a45fe2f2dcae546819f9c7d8
-- 
2.43.0

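As an illustration of the flush added in mgag200_handle_damage(), here is a minimal userspace sketch of how the flushed byte range is derived from the damage clip. It only mirrors the arithmetic in the patch (clip->y1 * fb->pitches[0] for the start, drm_rect_height(clip) * fb->pitches[0] for the length). The names clip_rect, clip_height, flush_offset and flush_length are hypothetical stand-ins, not kernel APIs, and no cache flush is actually performed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct drm_rect; x2 and y2 are exclusive. */
struct clip_rect {
	int x1, y1, x2, y2;
};

/* Number of rows in the clip, as drm_rect_height() would return. */
static int clip_height(const struct clip_rect *clip)
{
	return clip->y2 - clip->y1;
}

/* Byte offset of the first damaged row from the start of the buffer. */
static size_t flush_offset(const struct clip_rect *clip, uint32_t pitch)
{
	return (size_t)clip->y1 * pitch;
}

/* Byte count covering every damaged row over the full row width. */
static size_t flush_length(const struct clip_rect *clip, uint32_t pitch)
{
	return (size_t)clip_height(clip) * pitch;
}
```

For example, with a pitch of 4096 bytes and a clip spanning rows 10 to 30 (exclusive), the offset is 40960 bytes and the length is 81920 bytes. Note that the range covers whole rows: the x1/x2 bounds of the clip do not narrow it.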

Thread overview: 5+ messages
2024-02-08  9:51 Jocelyn Falempe [this message]
2024-02-08 11:49 ` [PATCH v2] drm/mgag200: Add a workaround for low-latency Thomas Zimmermann
2024-02-26 15:53   ` Jocelyn Falempe
2024-03-12 12:56 ` [v2] " Sui Jingfeng
2024-03-12 14:25   ` Jocelyn Falempe