linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Price <steven.price@arm.com>
To: Rob Herring <robh@kernel.org>, Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>,
	Maxime Ripard <maxime.ripard@bootlin.com>,
	Sean Paul <sean@poorly.run>, Will Deacon <will.deacon@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	David Airlie <airlied@linux.ie>,
	Linux IOMMU <iommu@lists.linux-foundation.org>,
	Alyssa Rosenzweig <alyssa@rosenzweig.io>,
	"Marty E . Plummer" <hanetzer@startmail.com>,
	Robin Murphy <robin.murphy@arm.com>,
	"moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE" 
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver
Date: Wed, 10 Apr 2019 11:28:27 +0100	[thread overview]
Message-ID: <9036b1e7-bb1b-d681-829b-10088bc4c227@arm.com> (raw)
In-Reply-To: <CAL_Jsq+qcLRFX0xyKuFEDW3+n6oukfmgemtO4MumCFp_-qqcUg@mail.gmail.com>

On 09/04/2019 17:15, Rob Herring wrote:
> On Tue, Apr 9, 2019 at 10:56 AM Tomeu Vizoso <tomeu.vizoso@collabora.com> wrote:
>>
>> On Mon, 8 Apr 2019 at 23:04, Rob Herring <robh@kernel.org> wrote:
>>>
>>> On Fri, Apr 5, 2019 at 7:30 AM Steven Price <steven.price@arm.com> wrote:
>>>>
>>>> On 01/04/2019 08:47, Rob Herring wrote:
>>>>> This adds the initial driver for panfrost which supports Arm Mali
>>>>> Midgard and Bifrost family of GPUs. Currently, only the T860 and
>>>>> T760 Midgard GPUs have been tested.
>>>
>>> [...]
>>>>> +
>>>>> +             if (status & JOB_INT_MASK_ERR(j)) {
>>>>> +                     job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
>>>>> +                     job_write(pfdev, JS_COMMAND(j), JS_COMMAND_HARD_STOP_0);
>>>>
>>>> Hard-stopping an already completed job isn't likely to do very much :)
>>>> Also you are using the "_0" version which is only valid when "job chain
>>>> disambiguation" is present.
>>
>> Yeah, guess that can be removed.
> 
> Steven, TBC, are you suggesting removing both lines or leaving
> JS_COMMAND_NOP? I don't think we can ever have a pending job at this
> point as there's no queuing. So the NOP probably isn't needed, but
> doesn't hurt to have it either.

Both lines are redundant and can be removed. But equally neither will
cause any problems.

Writing NOP into the next register is basically only needed if you know
there's a job there which you no longer want to execute.

kbase does this in certain situations. The main one is on a GPU without
command chain disambiguation if you want to kill a particular job
there's a potential race. For example:

 * Submit job A, followed by job B to slot 0. Job A is currently
executing, job B is waiting in the _NEXT registers.

 * Kernel decides it wants to kill job A (it's taking too long, or the
application has quit).

 * Simply executing a JS_COMMAND_HARD_STOP is racy. If job A finishes
just before doing the register write, then it's actually job B that gets
killed (and it's not always safe to just re-execute a killed job).

 * Instead write NOP to JS_COMMAND_NEXT, then check (again) whether the
job currently running is the one you want. When you then HARD_STOP you
either hit the correct job, or 'miss' and do nothing.

Job chain disambiguation solves this problem by allowing the kernel to
tag each job with a flag, the hard-stop can then be targetted at the job
with the correct flag. Writing NOP into JS_COMMAND_NEXT is also useful
if in the above situation you want to kill job B. In that case you can't
hard-stop it (it hasn't start), so you simply want to remove it from the
_NEXT registers.

>>>> I suspect in this case you should also be signalling the fence? At the
>>>> moment you rely on the GPU timeout recovering from the fault.
>>>
>>> I'll defer to Tomeu who wrote this (IIRC).
>>
>> Yes, that would be an improvement.
> 
> Actually, I think that would break recovery because the job timeout
> will bail out if the done fence is signaled already. Perhaps we want
> to timeout immediately if that is possible with the scheduler.

Ideally you would signal the fence with an error code (which is
presumably what recovery does). There's no actual need to trigger a
timeout. I'm not sure quite how the DRM infrastructure handles this though.

Steve

  reply	other threads:[~2019-04-10 10:28 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-01  7:47 [PATCH v2 0/3] Initial Panfrost driver Rob Herring
2019-04-01  7:47 ` [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format Rob Herring
2019-04-01 19:11   ` Robin Murphy
2019-04-05 10:02     ` Robin Murphy
2019-04-11 13:15     ` Joerg Roedel
2019-04-05  9:42   ` Steven Price
2019-04-05  9:51     ` Robin Murphy
2019-04-05 10:36       ` Steven Price
2019-04-08  8:56         ` Steven Price
2019-04-01  7:47 ` [PATCH v2 2/3] drm: Add a drm_gem_objects_lookup helper Rob Herring
2019-04-01 13:06   ` Daniel Vetter
2019-04-01 13:48     ` Chris Wilson
2019-04-01 15:43       ` Eric Anholt
2019-04-08 20:09         ` Rob Herring
2019-04-09 16:55           ` Eric Anholt
2019-04-01 16:59     ` Rob Herring
2019-04-01 18:22       ` Eric Anholt
2019-04-01  7:47 ` [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver Rob Herring
2019-04-01  8:24   ` Neil Armstrong
2019-04-01 19:17     ` Robin Murphy
2019-04-01 16:02   ` Eric Anholt
2019-04-01 19:12   ` Robin Murphy
2019-04-02  0:33     ` Alyssa Rosenzweig
2019-04-02 11:23       ` Robin Murphy
2019-04-03  4:57     ` Rob Herring
2019-04-05 12:57       ` Robin Murphy
2019-04-05 12:30   ` Steven Price
2019-04-05 16:16     ` Alyssa Rosenzweig
2019-04-05 16:42       ` Steven Price
2019-04-05 16:53         ` Alyssa Rosenzweig
2019-04-15  9:18         ` Daniel Vetter
2019-04-15  9:30           ` Steven Price
2019-04-16  7:51             ` Daniel Vetter
2019-04-08 21:04     ` Rob Herring
2019-04-09 15:56       ` Tomeu Vizoso
2019-04-09 16:15         ` Rob Herring
2019-04-10 10:28           ` Steven Price [this message]
2019-04-10 10:19       ` Steven Price
2019-04-10 11:50         ` Tomeu Vizoso
2019-04-01 15:05 ` [PATCH v2 0/3] Initial Panfrost driver Alyssa Rosenzweig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9036b1e7-bb1b-d681-829b-10088bc4c227@arm.com \
    --to=steven.price@arm.com \
    --cc=airlied@linux.ie \
    --cc=alyssa@rosenzweig.io \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hanetzer@startmail.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.ripard@bootlin.com \
    --cc=narmstrong@baylibre.com \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sean@poorly.run \
    --cc=tomeu.vizoso@collabora.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).