From: David Laight <David.Laight@ACULAB.COM>
To: "'maarten.lankhorst@linux.intel.com'" 
	<maarten.lankhorst@linux.intel.com>,
	"'mripard@kernel.org'" <mripard@kernel.org>,
	"'sean@poorly.run'" <sean@poorly.run>,
	"'airlied@linux.ie'" <airlied@linux.ie>,
	"'daniel@ffwll.ch'" <daniel@ffwll.ch>,
	"'dri-devel@lists.freedesktop.org'"
	<dri-devel@lists.freedesktop.org>,
	"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>
Subject: drm_cflush_sg() loops for over 3ms
Date: Mon, 13 Jan 2020 14:34:42 +0000
Message-ID: <e2498e2794ab421bb27982b4c863e87f@AcuMS.aculab.com>

I've been looking at why some RT processes don't get scheduled promptly.
In my test the RT process's affinity ties it to a single cpu (this may not be as
good an idea as it seems).

What I've found is that the Intel i915 graphics driver uses the 'events_unbound'
kernel workqueue to periodically execute drm_clflush_sg().
(see https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/drm_cache.c)

I'm guessing this is to ensure that any writes to graphics memory become
visible in a semi-timely manner.

This loop takes about 1us per iteration, split fairly evenly between whatever is in
for_each_sg_page() and drm_clflush_page().
With a 2560x1440 display the loop count is 3600 (4 bytes/pixel) and the whole
function takes around 3.3ms.
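
As a quick check of the numbers above, here is a minimal sketch of the arithmetic.
It assumes a 4096-byte page size (not stated in this mail, but usual on x86); the
function names are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Pages needed to hold a width x height framebuffer (rounded up).
 * page_size is an assumption supplied by the caller, e.g. 4096 on x86. */
size_t framebuffer_pages(size_t width, size_t height,
                         size_t bytes_per_pixel, size_t page_size)
{
    assert(page_size > 0);
    size_t bytes = width * height * bytes_per_pixel;
    return (bytes + page_size - 1) / page_size;
}

/* Estimated time for the whole loop, in microseconds, given a fixed
 * per-page cost. This is a simple linear model, nothing more. */
double flush_time_us(size_t pages, double us_per_page)
{
    return (double)pages * us_per_page;
}
```

framebuffer_pages(2560, 1440, 4, 4096) gives 3600, matching the loop count, and at
exactly 1us per page that model gives 3.6ms. The measured 3.3ms works out to
a little under 1us per iteration.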

Since the kernel isn't pre-emptive (I thought that wasn't much harder than SMP)
nothing else can run on that cpu until the loop finishes.

Adding a cond_resched() to the loop (maybe every 64 iterations) will
allow higher priority processes to run.
But really the code needs to be a lot faster.
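
To show the shape of the batching I mean, here is a userspace sketch only, not a
patch. sched_yield() stands in for cond_resched(), and the callback type and
function name are hypothetical placeholders for the per-page flush; none of this
is the actual drm_cache.c code.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical per-page operation, standing in for the cache flush. */
typedef void (*page_fn)(size_t page_index, void *ctx);

/* Visit n_pages pages, offering to give up the cpu after every 'batch'
 * pages (but not after the final page). Returns how many times it yielded.
 * In the kernel the yield point would be cond_resched(); sched_yield()
 * is only a userspace analogue. */
size_t for_each_page_yielding(size_t n_pages, size_t batch,
                              page_fn fn, void *ctx)
{
    size_t yields = 0;

    assert(batch > 0);
    for (size_t i = 0; i < n_pages; i++) {
        if (fn)
            fn(i, ctx);
        if ((i + 1) % batch == 0 && i + 1 < n_pages) {
            sched_yield();
            yields++;
        }
    }
    return yields;
}
```

With 3600 pages and a batch of 64 this yields 56 times, so at roughly 1us per page
the longest stretch without a reschedule point would be about 64us instead of over 3ms.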

I actually suspect that the (I assume IPI based) wbinvd_on_all_cpus() would be
a lot faster - especially done by a per-cpu work queue?

I had moderate difficulty getting from the process (kworker/u8:3) to the
name of the worker thread pool, never mind the actual work.
Fortunately it runs so long that some of the output from 'echo t >/proc/sysrq-trigger'
still linked the pid (which I knew from ftrace scheduler events (and schedviz))
to the actual work item name.
(Oh, after I'd written a program to tidy up the raw ftrace output so schedviz
didn't barf on a trace that had wrapped.)

Is there anything in /proc (etc) that shows all the work queues and their current
work?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


