From: Tomeu Vizoso <tomeu.vizoso@collabora.com>
To: Rob Herring <robh@kernel.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>,
Maxime Ripard <maxime.ripard@bootlin.com>,
Sean Paul <sean@poorly.run>, Will Deacon <will.deacon@arm.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
Steven Price <steven.price@arm.com>,
David Airlie <airlied@linux.ie>,
Linux IOMMU <iommu@lists.linux-foundation.org>,
Alyssa Rosenzweig <alyssa@rosenzweig.io>,
"Marty E . Plummer" <hanetzer@startmail.com>,
Robin Murphy <robin.murphy@arm.com>,
"moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver
Date: Tue, 9 Apr 2019 17:56:05 +0200 [thread overview]
Message-ID: <CAAObsKCnycxxKWA+bUeU9vOu53eT9WZVkzdZ0KCtW9Xj9cV8Cw@mail.gmail.com> (raw)
In-Reply-To: <CAL_Jsq+wAUS=gukDLoY6DWP40w4Wtyd_Y4B2S3KMCx2ekL97qg@mail.gmail.com>
On Mon, 8 Apr 2019 at 23:04, Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Apr 5, 2019 at 7:30 AM Steven Price <steven.price@arm.com> wrote:
> >
> > On 01/04/2019 08:47, Rob Herring wrote:
> > > This adds the initial driver for panfrost which supports Arm Mali
> > > Midgard and Bifrost family of GPUs. Currently, only the T860 and
> > > T760 Midgard GPUs have been tested.
>
> [...]
> > > +
> > > + if (status & JOB_INT_MASK_ERR(j)) {
> > > + job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
> > > + job_write(pfdev, JS_COMMAND(j), JS_COMMAND_HARD_STOP_0);
> >
> > Hard-stopping an already completed job isn't likely to do very much :)
> > Also you are using the "_0" version which is only valid when "job chain
> > disambiguation" is present.
Yeah, guess that can be removed.
> > I suspect in this case you should also be signalling the fence? At the
> > moment you rely on the GPU timeout recovering from the fault.
>
> I'll defer to Tomeu who wrote this (IIRC).
Yes, that would be an improvement.
> > One issue that I haven't got to the bottom of is that I can trigger a
> > lockdep splat:
> >
> > -----8<------
> > panfrost ffa30000.gpu: js fault, js=1, status=JOB_CONFIG_FAULT,
> > head=0x0, tail=0x0
> > root@debian:~/ddk_panfrost# panfrost ffa30000.gpu: gpu sched timeout,
> > js=1, status=0x40, head=0x0, tail=0x0, sched_job=12a94ba6
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.0.0+ #32 Not tainted
> > ------------------------------------------------------
> > kworker/1:0/608 is trying to acquire lock:
> > 89b1e2d8 (&(&js->job_lock)->rlock){-.-.}, at:
> > dma_fence_remove_callback+0x14/0x50
> >
> > but task is already holding lock:
> > a887e4b2 (&(&sched->job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (&(&sched->job_list_lock)->rlock){-.-.}:
> > drm_sched_process_job+0x60/0x208
> > dma_fence_signal+0x1dc/0x1fc
> > panfrost_job_irq_handler+0x160/0x194
> > __handle_irq_event_percpu+0x80/0x388
> > handle_irq_event_percpu+0x24/0x78
> > handle_irq_event+0x38/0x5c
> > handle_fasteoi_irq+0xb4/0x128
> > generic_handle_irq+0x18/0x28
> > __handle_domain_irq+0xa0/0xb4
> > gic_handle_irq+0x4c/0x78
> > __irq_svc+0x70/0x98
> > arch_cpu_idle+0x20/0x3c
> > arch_cpu_idle+0x20/0x3c
> > do_idle+0x11c/0x22c
> > cpu_startup_entry+0x18/0x20
> > start_kernel+0x398/0x420
> >
> > -> #0 (&(&js->job_lock)->rlock){-.-.}:
> > _raw_spin_lock_irqsave+0x50/0x64
> > dma_fence_remove_callback+0x14/0x50
> > drm_sched_stop+0x5c/0x10c
> > panfrost_job_timedout+0xd0/0x180
> > drm_sched_job_timedout+0x34/0x5c
> > process_one_work+0x2ac/0x6a4
> > worker_thread+0x28c/0x3fc
> > kthread+0x13c/0x158
> > ret_from_fork+0x14/0x20
> > (null)
> >
> > other info that might help us debug this:
> >
> > Possible unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&(&sched->job_list_lock)->rlock);
> > lock(&(&js->job_lock)->rlock);
> > lock(&(&sched->job_list_lock)->rlock);
> > lock(&(&js->job_lock)->rlock);
> >
> > *** DEADLOCK ***
> >
> > 3 locks held by kworker/1:0/608:
> > #0: 9b350627 ((wq_completion)"events"){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> > #1: a802aa2d ((work_completion)(&(&sched->work_tdr)->work)){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> > #2: a887e4b2 (&(&sched->job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > stack backtrace:
> > CPU: 1 PID: 608 Comm: kworker/1:0 Not tainted 5.0.0+ #32
> > Hardware name: Rockchip (Device Tree)
> > Workqueue: events drm_sched_job_timedout
> > [<c0111088>] (unwind_backtrace) from [<c010c9a8>] (show_stack+0x10/0x14)
> > [<c010c9a8>] (show_stack) from [<c0773df4>] (dump_stack+0x9c/0xd4)
> > [<c0773df4>] (dump_stack) from [<c016d034>]
> > (print_circular_bug.constprop.15+0x1fc/0x2cc)
> > [<c016d034>] (print_circular_bug.constprop.15) from [<c016f6c0>]
> > (__lock_acquire+0xe5c/0x167c)
> > [<c016f6c0>] (__lock_acquire) from [<c0170828>] (lock_acquire+0xc4/0x210)
> > [<c0170828>] (lock_acquire) from [<c07920e0>]
> > (_raw_spin_lock_irqsave+0x50/0x64)
> > [<c07920e0>] (_raw_spin_lock_irqsave) from [<c0516784>]
> > (dma_fence_remove_callback+0x14/0x50)
> > [<c0516784>] (dma_fence_remove_callback) from [<c04def38>]
> > (drm_sched_stop+0x5c/0x10c)
> > [<c04def38>] (drm_sched_stop) from [<c04ec80c>]
> > (panfrost_job_timedout+0xd0/0x180)
> > [<c04ec80c>] (panfrost_job_timedout) from [<c04df340>]
> > (drm_sched_job_timedout+0x34/0x5c)
> > [<c04df340>] (drm_sched_job_timedout) from [<c013ec70>]
> > (process_one_work+0x2ac/0x6a4)
> > [<c013ec70>] (process_one_work) from [<c013fe48>]
> > (worker_thread+0x28c/0x3fc)
> > [<c013fe48>] (worker_thread) from [<c01452a0>] (kthread+0x13c/0x158)
> > [<c01452a0>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > Exception stack(0xeebd7fb0 to 0xeebd7ff8)
> > 7fa0: 00000000 00000000 00000000
> > 00000000
> > 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > 00000000
> > 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > ----8<----
> >
> > This is with the below simple reproducer:
> >
> > ----8<----
> > #include <sys/ioctl.h>
> > #include <fcntl.h>
> > #include <stdio.h>
> >
> > #include <libdrm/drm.h>
> > #include "panfrost_drm.h"
> >
> > int main(int argc, char **argv)
> > {
> > int fd;
> >
> > if (argc == 2)
> > fd = open(argv[1], O_RDWR);
> > else
> > fd = open("/dev/dri/renderD128", O_RDWR);
> > if (fd == -1) {
> > perror("Failed to open");
> > return 0;
> > }
> >
> > struct drm_panfrost_submit submit = {
> > .jc = 0,
> > };
> > return ioctl(fd, DRM_IOCTL_PANFROST_SUBMIT, &submit);
> > }
> > ----8<----
> >
> > Any ideas? I'm not an expert on DRM, so I got somewhat lost!
>
> Tomeu?
Ran out of time today, but will be able to look at it tomorrow.
Thanks!
Tomeu
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2019-04-09 15:56 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-01 7:47 [PATCH v2 0/3] Initial Panfrost driver Rob Herring
2019-04-01 7:47 ` [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format Rob Herring
2019-04-01 19:11 ` Robin Murphy
2019-04-05 10:02 ` Robin Murphy
2019-04-11 13:15 ` Joerg Roedel
2019-04-05 9:42 ` Steven Price
2019-04-05 9:51 ` Robin Murphy
2019-04-05 10:36 ` Steven Price
2019-04-08 8:56 ` Steven Price
2019-04-01 7:47 ` [PATCH v2 2/3] drm: Add a drm_gem_objects_lookup helper Rob Herring
2019-04-01 13:06 ` Daniel Vetter
2019-04-01 13:48 ` Chris Wilson
2019-04-01 15:43 ` Eric Anholt
2019-04-08 20:09 ` Rob Herring
2019-04-09 16:55 ` Eric Anholt
2019-04-01 16:59 ` Rob Herring
2019-04-01 18:22 ` Eric Anholt
2019-04-01 15:05 ` [PATCH v2 0/3] Initial Panfrost driver Alyssa Rosenzweig
[not found] ` <20190401074730.12241-4-robh@kernel.org>
2019-04-01 8:24 ` [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver Neil Armstrong
2019-04-01 19:17 ` Robin Murphy
2019-04-01 16:02 ` Eric Anholt
2019-04-01 19:12 ` Robin Murphy
2019-04-02 0:33 ` Alyssa Rosenzweig
2019-04-02 11:23 ` Robin Murphy
2019-04-03 4:57 ` Rob Herring
2019-04-05 12:57 ` Robin Murphy
[not found] ` <5efdc3cb-7367-65e1-d1bf-14051db5da10@arm.com>
2019-04-05 16:16 ` Alyssa Rosenzweig
2019-04-05 16:42 ` Steven Price
2019-04-05 16:53 ` Alyssa Rosenzweig
2019-04-15 9:18 ` Daniel Vetter
2019-04-15 9:30 ` Steven Price
2019-04-16 7:51 ` Daniel Vetter
2019-04-08 21:04 ` Rob Herring
2019-04-09 15:56 ` Tomeu Vizoso [this message]
2019-04-09 16:15 ` Rob Herring
2019-04-10 10:28 ` Steven Price
2019-04-10 10:19 ` Steven Price
2019-04-10 11:50 ` Tomeu Vizoso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAAObsKCnycxxKWA+bUeU9vOu53eT9WZVkzdZ0KCtW9Xj9cV8Cw@mail.gmail.com \
--to=tomeu.vizoso@collabora.com \
--cc=airlied@linux.ie \
--cc=alyssa@rosenzweig.io \
--cc=dri-devel@lists.freedesktop.org \
--cc=hanetzer@startmail.com \
--cc=iommu@lists.linux-foundation.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maxime.ripard@bootlin.com \
--cc=narmstrong@baylibre.com \
--cc=robh@kernel.org \
--cc=robin.murphy@arm.com \
--cc=sean@poorly.run \
--cc=steven.price@arm.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).