LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Tomeu Vizoso <tomeu.vizoso@collabora.com>
To: Rob Herring <robh@kernel.org>
Cc: Steven Price <steven.price@arm.com>,
	Neil Armstrong <narmstrong@baylibre.com>,
	Maxime Ripard <maxime.ripard@bootlin.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	David Airlie <airlied@linux.ie>,
	Linux IOMMU <iommu@lists.linux-foundation.org>,
	"moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE" 
	<linux-arm-kernel@lists.infradead.org>,
	"Marty E . Plummer" <hanetzer@startmail.com>,
	Sean Paul <sean@poorly.run>,
	Alyssa Rosenzweig <alyssa@rosenzweig.io>
Subject: Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver
Date: Tue, 9 Apr 2019 17:56:05 +0200
Message-ID: <CAAObsKCnycxxKWA+bUeU9vOu53eT9WZVkzdZ0KCtW9Xj9cV8Cw@mail.gmail.com> (raw)
In-Reply-To: <CAL_Jsq+wAUS=gukDLoY6DWP40w4Wtyd_Y4B2S3KMCx2ekL97qg@mail.gmail.com>

On Mon, 8 Apr 2019 at 23:04, Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Apr 5, 2019 at 7:30 AM Steven Price <steven.price@arm.com> wrote:
> >
> > On 01/04/2019 08:47, Rob Herring wrote:
> > > This adds the initial driver for panfrost which supports Arm Mali
> > > Midgard and Bifrost family of GPUs. Currently, only the T860 and
> > > T760 Midgard GPUs have been tested.
>
> [...]
> > > +
> > > +             if (status & JOB_INT_MASK_ERR(j)) {
> > > +                     job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
> > > +                     job_write(pfdev, JS_COMMAND(j), JS_COMMAND_HARD_STOP_0);
> >
> > Hard-stopping an already completed job isn't likely to do very much :)
> > Also you are using the "_0" version which is only valid when "job chain
> > disambiguation" is present.

Yeah, guess that can be removed.

> > I suspect in this case you should also be signalling the fence? At the
> > moment you rely on the GPU timeout recovering from the fault.
>
> I'll defer to Tomeu who wrote this (IIRC).

Yes, that would be an improvement.

> > One issue that I haven't got to the bottom of is that I can trigger a
> > lockdep splat:
> >
> > -----8<------
> > panfrost ffa30000.gpu: js fault, js=1, status=JOB_CONFIG_FAULT,
> > head=0x0, tail=0x0
> > root@debian:~/ddk_panfrost# panfrost ffa30000.gpu: gpu sched timeout,
> > js=1, status=0x40, head=0x0, tail=0x0, sched_job=12a94ba6
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.0.0+ #32 Not tainted
> > ------------------------------------------------------
> > kworker/1:0/608 is trying to acquire lock:
> > 89b1e2d8 (&(&js->job_lock)->rlock){-.-.}, at:
> > dma_fence_remove_callback+0x14/0x50
> >
> > but task is already holding lock:
> > a887e4b2 (&(&sched->job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (&(&sched->job_list_lock)->rlock){-.-.}:
> >        drm_sched_process_job+0x60/0x208
> >        dma_fence_signal+0x1dc/0x1fc
> >        panfrost_job_irq_handler+0x160/0x194
> >        __handle_irq_event_percpu+0x80/0x388
> >        handle_irq_event_percpu+0x24/0x78
> >        handle_irq_event+0x38/0x5c
> >        handle_fasteoi_irq+0xb4/0x128
> >        generic_handle_irq+0x18/0x28
> >        __handle_domain_irq+0xa0/0xb4
> >        gic_handle_irq+0x4c/0x78
> >        __irq_svc+0x70/0x98
> >        arch_cpu_idle+0x20/0x3c
> >        arch_cpu_idle+0x20/0x3c
> >        do_idle+0x11c/0x22c
> >        cpu_startup_entry+0x18/0x20
> >        start_kernel+0x398/0x420
> >
> > -> #0 (&(&js->job_lock)->rlock){-.-.}:
> >        _raw_spin_lock_irqsave+0x50/0x64
> >        dma_fence_remove_callback+0x14/0x50
> >        drm_sched_stop+0x5c/0x10c
> >        panfrost_job_timedout+0xd0/0x180
> >        drm_sched_job_timedout+0x34/0x5c
> >        process_one_work+0x2ac/0x6a4
> >        worker_thread+0x28c/0x3fc
> >        kthread+0x13c/0x158
> >        ret_from_fork+0x14/0x20
> >          (null)
> >
> > other info that might help us debug this:
> >
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&(&sched->job_list_lock)->rlock);
> >                                lock(&(&js->job_lock)->rlock);
> >                                lock(&(&sched->job_list_lock)->rlock);
> >   lock(&(&js->job_lock)->rlock);
> >
> >  *** DEADLOCK ***
> >
> > 3 locks held by kworker/1:0/608:
> >  #0: 9b350627 ((wq_completion)"events"){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> >  #1: a802aa2d ((work_completion)(&(&sched->work_tdr)->work)){+.+.}, at:
> > process_one_work+0x1f8/0x6a4
> >  #2: a887e4b2 (&(&sched->job_list_lock)->rlock){-.-.}, at:
> > drm_sched_stop+0x24/0x10c
> >
> > stack backtrace:
> > CPU: 1 PID: 608 Comm: kworker/1:0 Not tainted 5.0.0+ #32
> > Hardware name: Rockchip (Device Tree)
> > Workqueue: events drm_sched_job_timedout
> > [<c0111088>] (unwind_backtrace) from [<c010c9a8>] (show_stack+0x10/0x14)
> > [<c010c9a8>] (show_stack) from [<c0773df4>] (dump_stack+0x9c/0xd4)
> > [<c0773df4>] (dump_stack) from [<c016d034>]
> > (print_circular_bug.constprop.15+0x1fc/0x2cc)
> > [<c016d034>] (print_circular_bug.constprop.15) from [<c016f6c0>]
> > (__lock_acquire+0xe5c/0x167c)
> > [<c016f6c0>] (__lock_acquire) from [<c0170828>] (lock_acquire+0xc4/0x210)
> > [<c0170828>] (lock_acquire) from [<c07920e0>]
> > (_raw_spin_lock_irqsave+0x50/0x64)
> > [<c07920e0>] (_raw_spin_lock_irqsave) from [<c0516784>]
> > (dma_fence_remove_callback+0x14/0x50)
> > [<c0516784>] (dma_fence_remove_callback) from [<c04def38>]
> > (drm_sched_stop+0x5c/0x10c)
> > [<c04def38>] (drm_sched_stop) from [<c04ec80c>]
> > (panfrost_job_timedout+0xd0/0x180)
> > [<c04ec80c>] (panfrost_job_timedout) from [<c04df340>]
> > (drm_sched_job_timedout+0x34/0x5c)
> > [<c04df340>] (drm_sched_job_timedout) from [<c013ec70>]
> > (process_one_work+0x2ac/0x6a4)
> > [<c013ec70>] (process_one_work) from [<c013fe48>]
> > (worker_thread+0x28c/0x3fc)
> > [<c013fe48>] (worker_thread) from [<c01452a0>] (kthread+0x13c/0x158)
> > [<c01452a0>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > Exception stack(0xeebd7fb0 to 0xeebd7ff8)
> > 7fa0:                                     00000000 00000000 00000000
> > 00000000
> > 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > 00000000
> > 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > ----8<----
> >
> > This is with the below simple reproducer:
> >
> > ----8<----
> > #include <sys/ioctl.h>
> > #include <fcntl.h>
> > #include <stdio.h>
> >
> > #include <libdrm/drm.h>
> > #include "panfrost_drm.h"
> >
> > int main(int argc, char **argv)
> > {
> >         int fd;
> >
> >         if (argc == 2)
> >                 fd = open(argv[1], O_RDWR);
> >         else
> >                 fd = open("/dev/dri/renderD128", O_RDWR);
> >         if (fd == -1) {
> >                 perror("Failed to open");
> >                 return 0;
> >         }
> >
> >         struct drm_panfrost_submit submit = {
> >                 .jc = 0,
> >         };
> >         return ioctl(fd, DRM_IOCTL_PANFROST_SUBMIT, &submit);
> > }
> > ----8<----
> >
> > Any ideas? I'm not an expert on DRM, so I got somewhat lost!
>
> Tomeu?

Ran out of time today, but will be able to look at it tomorrow.

Thanks!

Tomeu

  reply index

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-01  7:47 [PATCH v2 0/3] Initial Panfrost driver Rob Herring
2019-04-01  7:47 ` [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format Rob Herring
2019-04-01 19:11   ` Robin Murphy
2019-04-05 10:02     ` Robin Murphy
2019-04-11 13:15     ` Joerg Roedel
2019-04-05  9:42   ` Steven Price
2019-04-05  9:51     ` Robin Murphy
2019-04-05 10:36       ` Steven Price
2019-04-08  8:56         ` Steven Price
2019-04-01  7:47 ` [PATCH v2 2/3] drm: Add a drm_gem_objects_lookup helper Rob Herring
2019-04-01 13:06   ` Daniel Vetter
2019-04-01 13:48     ` Chris Wilson
2019-04-01 15:43       ` Eric Anholt
2019-04-08 20:09         ` Rob Herring
2019-04-09 16:55           ` Eric Anholt
2019-04-01 16:59     ` Rob Herring
2019-04-01 18:22       ` Eric Anholt
2019-04-01  7:47 ` [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver Rob Herring
2019-04-01  8:24   ` Neil Armstrong
2019-04-01 19:17     ` Robin Murphy
2019-04-01 16:02   ` Eric Anholt
2019-04-01 19:12   ` Robin Murphy
2019-04-02  0:33     ` Alyssa Rosenzweig
2019-04-02 11:23       ` Robin Murphy
2019-04-03  4:57     ` Rob Herring
2019-04-05 12:57       ` Robin Murphy
2019-04-05 12:30   ` Steven Price
2019-04-05 16:16     ` Alyssa Rosenzweig
2019-04-05 16:42       ` Steven Price
2019-04-05 16:53         ` Alyssa Rosenzweig
2019-04-15  9:18         ` Daniel Vetter
2019-04-15  9:30           ` Steven Price
2019-04-16  7:51             ` Daniel Vetter
2019-04-08 21:04     ` Rob Herring
2019-04-09 15:56       ` Tomeu Vizoso [this message]
2019-04-09 16:15         ` Rob Herring
2019-04-10 10:28           ` Steven Price
2019-04-10 10:19       ` Steven Price
2019-04-10 11:50         ` Tomeu Vizoso
2019-04-01 15:05 ` [PATCH v2 0/3] Initial Panfrost driver Alyssa Rosenzweig

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAObsKCnycxxKWA+bUeU9vOu53eT9WZVkzdZ0KCtW9Xj9cV8Cw@mail.gmail.com \
    --to=tomeu.vizoso@collabora.com \
    --cc=airlied@linux.ie \
    --cc=alyssa@rosenzweig.io \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hanetzer@startmail.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.ripard@bootlin.com \
    --cc=narmstrong@baylibre.com \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sean@poorly.run \
    --cc=steven.price@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox