From: Rob Clark <robdclark@gmail.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Rob Clark <robdclark@chromium.org>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU"
	<freedreno@lists.freedesktop.org>, "Menon, Nishanth" <nm@ti.com>,
	David Airlie <airlied@linux.ie>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU"
	<linux-arm-msm@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Sean Paul <sean@poorly.run>
Subject: Re: [PATCH v2 07/22] drm/msm: Do rpm get sooner in the submit path
Date: Tue, 17 Nov 2020 09:02:41 -0800
Message-ID: <CAF6AEGt_wbWuQA7gBw4yn4f2x0SVbfub4eRDX59PCvnd_0uFxg@mail.gmail.com>
In-Reply-To: <20201106071621.j732gt4nqifjrccd@vireshk-i7>

On Thu, Nov 5, 2020 at 11:16 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 11:24, Rob Clark wrote:
> > On Tue, Nov 3, 2020 at 7:04 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 03-11-20, 08:50, Rob Clark wrote:
> > > > sorry, it didn't apply cleanly (which I guess is due to some other
> > > > dependencies that need to be picked back to the v5.4 product kernel),
> > > > and due to some other things I'm in the middle of debugging I haven't
> > > > had time yet to switch to v5.10-rc or look at what else needs to be
> > > > cherry-picked..
> > > >
> > > > If you could, pushing a branch with this patch somewhere would be a
> > > > bit easier to work with (i.e. fetch && cherry-pick is easier to deal
> > > > with than picking things from the list)
> > >
> > > It has been in linux-next for a few days. Here is the HEAD to pick
> > > from. There are a few patches there since rc1.
> > >
> > > commit 203e29749cc0 ("opp: Allocate the OPP table outside of opp_table_lock")
> > >
> >
> > Sorry for the delay; with that cherry-picked, I'm getting a whole lot of:
>
> Ahh, sorry about that and thanks for reporting it. Here is the fix:
>
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index c718092757d9..6b7f0066942d 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -112,8 +112,6 @@ static struct opp_table *_find_table_of_opp_np(struct device_node *opp_np)
>         struct opp_table *opp_table;
>         struct device_node *opp_table_np;
>
> -       lockdep_assert_held(&opp_table_lock);
> -
>         opp_table_np = of_get_parent(opp_np);
>         if (!opp_table_np)
>                 goto err;
> @@ -121,12 +119,15 @@ static struct opp_table *_find_table_of_opp_np(struct device_node *opp_np)
>         /* It is safe to put the node now as all we need now is its address */
>         of_node_put(opp_table_np);
>
> +       mutex_lock(&opp_table_lock);
>         list_for_each_entry(opp_table, &opp_tables, node) {
>                 if (opp_table_np == opp_table->np) {
>                         _get_opp_table_kref(opp_table);
> +                       mutex_unlock(&opp_table_lock);
>                         return opp_table;
>                 }
>         }
> +       mutex_unlock(&opp_table_lock);
>
>  err:
>         return ERR_PTR(-ENODEV);
>
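
For reference, with those two hunks applied _find_table_of_opp_np() ends up
looking roughly like this (reconstructed from the diff context above, other
details elided); opp_table_lock is now taken only around the list walk
instead of being required from the caller:

static struct opp_table *_find_table_of_opp_np(struct device_node *opp_np)
{
	struct opp_table *opp_table;
	struct device_node *opp_table_np;

	opp_table_np = of_get_parent(opp_np);
	if (!opp_table_np)
		goto err;

	/* It is safe to put the node now as all we need now is its address */
	of_node_put(opp_table_np);

	/* opp_table_lock now only protects the walk over opp_tables */
	mutex_lock(&opp_table_lock);
	list_for_each_entry(opp_table, &opp_tables, node) {
		if (opp_table_np == opp_table->np) {
			_get_opp_table_kref(opp_table);
			mutex_unlock(&opp_table_lock);
			return opp_table;
		}
	}
	mutex_unlock(&opp_table_lock);

err:
	return ERR_PTR(-ENODEV);
}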

With that on top of the previous patch, I get:

[   26.378245] ======================================================
[   26.384595] WARNING: possible circular locking dependency detected
[   26.390947] 5.10.0-rc2+ #6 Not tainted
[   26.394804] ------------------------------------------------------
[   26.401155] chrome/1886 is trying to acquire lock:
[   26.406087] ffffffe5e264aa88 (opp_table_lock){+.+.}-{3:3}, at: _find_opp_table+0x38/0x78
[   26.414436]
[   26.414436] but task is already holding lock:
[   26.420423] ffffffb0283935b0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: submit_lock_objects+0x70/0x1ec
[   26.430268]
[   26.430268] which lock already depends on the new lock.
[   26.430268]
[   26.438661]
[   26.438661] the existing dependency chain (in reverse order) is:
[   26.446343]
[   26.446343] -> #3 (reservation_ww_class_mutex){+.+.}-{3:3}:
[   26.453603]        lock_acquire+0x23c/0x30c
[   26.457910]        __mutex_lock_common+0xdc/0xbc4
[   26.462743]        ww_mutex_lock_interruptible+0x84/0xec
[   26.468203]        msm_gem_fault+0x30/0x138
[   26.472507]        __do_fault+0x44/0x184
[   26.476541]        handle_mm_fault+0x754/0xc50
[   26.481117]        do_page_fault+0x230/0x354
[   26.485507]        do_translation_fault+0x40/0x54
[   26.490338]        do_mem_abort+0x44/0xac
[   26.494469]        el0_sync_compat_handler+0x15c/0x190
[   26.499756]        el0_sync_compat+0x144/0x180
[   26.504328]
[   26.504328] -> #2 (&mm->mmap_lock){++++}-{3:3}:
[   26.510511]        lock_acquire+0x23c/0x30c
[   26.514813]        __might_fault+0x60/0x80
[   26.519034]        compat_filldir+0x118/0x4d0
[   26.523519]        dcache_readdir+0x74/0x1e0
[   26.527907]        iterate_dir+0xd4/0x198
[   26.532037]        __arm64_compat_sys_getdents+0x6c/0x168
[   26.537583]        el0_svc_common+0xa4/0x174
[   26.541970]        do_el0_svc_compat+0x20/0x30
[   26.546543]        el0_sync_compat_handler+0x124/0x190
[   26.551828]        el0_sync_compat+0x144/0x180
[   26.556399]
[   26.556399] -> #1 (&sb->s_type->i_mutex_key#2){++++}-{3:3}:
[   26.563660]        lock_acquire+0x23c/0x30c
[   26.567963]        down_write+0x80/0x1dc
[   26.571997]        simple_recursive_removal+0x48/0x238
[   26.577288]        debugfs_remove+0x5c/0x78
[   26.581592]        opp_debug_unregister+0x34/0x118
[   26.586521]        dev_pm_opp_put_opp_table+0xd0/0x14c
[   26.591802]        dev_pm_opp_put_clkname+0x40/0x54
[   26.596820]        msm_dsi_host_destroy+0xe0/0x108
[   26.601748]        dsi_destroy+0x40/0x58
[   26.605789]        dsi_bind+0x8c/0x16c
[   26.609648]        component_bind_all+0xf4/0x20c
[   26.614399]        msm_drm_init+0x180/0x588
[   26.618696]        msm_drm_bind+0x1c/0x24
[   26.622822]        try_to_bring_up_master+0x160/0x1a8
[   26.628014]        component_master_add_with_match+0xc4/0x108
[   26.633918]        msm_pdev_probe+0x214/0x2a4
[   26.638405]        platform_drv_probe+0x94/0xb4
[   26.643066]        really_probe+0x138/0x348
[   26.647365]        driver_probe_device+0x80/0xb8
[   26.652111]        device_driver_attach+0x50/0x70
[   26.656951]        __driver_attach+0xb4/0xc8
[   26.661340]        bus_for_each_dev+0x80/0xc8
[   26.665825]        driver_attach+0x28/0x30
[   26.670038]        bus_add_driver+0x100/0x1d4
[   26.674524]        driver_register+0x68/0xfc
[   26.678910]        __platform_driver_register+0x48/0x50
[   26.684284]        msm_drm_register+0x64/0x68
[   26.688766]        do_one_initcall+0x1ac/0x3e4
[   26.693341]        do_initcall_level+0xa0/0xb8
[   26.697910]        do_initcalls+0x58/0x94
[   26.702039]        do_basic_setup+0x28/0x30
[   26.706338]        kernel_init_freeable+0x190/0x1d0
[   26.711352]        kernel_init+0x18/0x10c
[   26.715481]        ret_from_fork+0x10/0x18
[   26.719692]
[   26.719692] -> #0 (opp_table_lock){+.+.}-{3:3}:
[   26.725883]        check_noncircular+0x12c/0x134
[   26.730627]        __lock_acquire+0x2288/0x2b2c
[   26.735284]        lock_acquire+0x23c/0x30c
[   26.739584]        __mutex_lock_common+0xdc/0xbc4
[   26.744426]        mutex_lock_nested+0x50/0x58
[   26.748998]        _find_opp_table+0x38/0x78
[   26.753395]        dev_pm_opp_find_freq_exact+0x2c/0xdc
[   26.758771]        a6xx_gmu_resume+0xcc/0xed0
[   26.763255]        a6xx_pm_resume+0x140/0x174
[   26.767741]        adreno_resume+0x24/0x2c
[   26.771956]        pm_generic_runtime_resume+0x2c/0x3c
[   26.777244]        __rpm_callback+0x74/0x114
[   26.781633]        rpm_callback+0x30/0x84
[   26.785759]        rpm_resume+0x3c8/0x4f0
[   26.789884]        __pm_runtime_resume+0x80/0xa4
[   26.794631]        msm_gpu_submit+0x60/0x228
[   26.799019]        msm_ioctl_gem_submit+0xba0/0xc1c
[   26.804038]        drm_ioctl_kernel+0xa0/0x11c
[   26.808608]        drm_ioctl+0x240/0x3dc
[   26.812653]        drm_compat_ioctl+0xd4/0xe4
[   26.817141]        __arm64_compat_sys_ioctl+0xc4/0xf8
[   26.822331]        el0_svc_common+0xa4/0x174
[   26.826718]        do_el0_svc_compat+0x20/0x30
[   26.831291]        el0_sync_compat_handler+0x124/0x190
[   26.836577]        el0_sync_compat+0x144/0x180
[   26.841148]
[   26.841148] other info that might help us debug this:
[   26.841148]
[   26.849361] Chain exists of:
[   26.849361]   opp_table_lock --> &mm->mmap_lock --> reservation_ww_class_mutex
[   26.849361]
[   26.861249]  Possible unsafe locking scenario:
[   26.861249]
[   26.867334]        CPU0                    CPU1
[   26.871990]        ----                    ----
[   26.876647]   lock(reservation_ww_class_mutex);
[   26.881309]                                lock(&mm->mmap_lock);
[   26.887487]                                lock(reservation_ww_class_mutex);
[   26.894730]   lock(opp_table_lock);
[   26.898327]
[   26.898327]  *** DEADLOCK ***
[   26.898327]
[   26.904410] 3 locks held by chrome/1886:
[   26.908447]  #0: ffffffb005bd9138 (&dev->struct_mutex){+.+.}-{3:3}, at: msm_ioctl_gem_submit+0x238/0xc1c
[   26.918199]  #1: ffffffb02251fa70 (reservation_ww_class_acquire){+.+.}-{0:0}, at: msm_ioctl_gem_submit+0x978/0xc1c
[   26.928843]  #2: ffffffb0283935b0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: submit_lock_objects+0x70/0x1ec
[   26.939126]
[   26.939126] stack backtrace:
[   26.943612] CPU: 5 PID: 1886 Comm: chrome Not tainted 5.10.0-rc2+ #6
[   26.950137] Hardware name: Google Lazor (rev1+) with LTE (DT)
[   26.956039] Call trace:
[   26.958566]  dump_backtrace+0x0/0x1b4
[   26.962333]  show_stack+0x1c/0x24
[   26.965754]  dump_stack+0xdc/0x158
[   26.969251]  print_circular_bug+0x308/0x338
[   26.973550]  check_noncircular+0x12c/0x134
[   26.977762]  __lock_acquire+0x2288/0x2b2c
[   26.981889]  lock_acquire+0x23c/0x30c
[   26.985658]  __mutex_lock_common+0xdc/0xbc4
[   26.989961]  mutex_lock_nested+0x50/0x58
[   26.993994]  _find_opp_table+0x38/0x78
[   26.997852]  dev_pm_opp_find_freq_exact+0x2c/0xdc
[   27.002690]  a6xx_gmu_resume+0xcc/0xed0
[   27.006635]  a6xx_pm_resume+0x140/0x174
[   27.010580]  adreno_resume+0x24/0x2c
[   27.014259]  pm_generic_runtime_resume+0x2c/0x3c
[   27.019008]  __rpm_callback+0x74/0x114
[   27.022868]  rpm_callback+0x30/0x84
[   27.026455]  rpm_resume+0x3c8/0x4f0
[   27.030042]  __pm_runtime_resume+0x80/0xa4
[   27.034259]  msm_gpu_submit+0x60/0x228
[   27.038117]  msm_ioctl_gem_submit+0xba0/0xc1c
[   27.042604]  drm_ioctl_kernel+0xa0/0x11c
[   27.046637]  drm_ioctl+0x240/0x3dc
[   27.050139]  drm_compat_ioctl+0xd4/0xe4
[   27.054088]  __arm64_compat_sys_ioctl+0xc4/0xf8
[   27.058748]  el0_svc_common+0xa4/0x174
[   27.062606]  do_el0_svc_compat+0x20/0x30
[   27.066647]  el0_sync_compat_handler+0x124/0x190
[   27.071393]  el0_sync_compat+0x144/0x180
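
In other words: the submit path resumes the GPU (and via a6xx_gmu_resume() ->
dev_pm_opp_find_freq_exact() takes opp_table_lock) while already holding a
BO's reservation_ww_class_mutex, whereas lockdep has already seen the reverse
ordering through the debugfs/mmap_lock chain in the splat above. Stripped of
the intermediate locks it is the usual ABBA pattern; a minimal illustrative
sketch (the two functions are hypothetical, the locks just stand in for the
real ones):

#include <linux/mutex.h>

static DEFINE_MUTEX(lock_a);	/* stands in for opp_table_lock */
static DEFINE_MUTEX(lock_b);	/* stands in for reservation_ww_class_mutex */

/* hypothetical: the submit/runtime-resume direction from the trace above */
static void path_one(void)
{
	mutex_lock(&lock_b);	/* BO reservation taken for the submit ... */
	mutex_lock(&lock_a);	/* ... then OPP lookup during rpm resume */
	mutex_unlock(&lock_a);
	mutex_unlock(&lock_b);
}

/* hypothetical: the opp debugfs -> mmap_lock -> dma_resv direction */
static void path_two(void)
{
	mutex_lock(&lock_a);	/* OPP code holds opp_table_lock ... */
	mutex_lock(&lock_b);	/* ... and transitively reaches the resv lock */
	mutex_unlock(&lock_b);
	mutex_unlock(&lock_a);
}

If the two paths race, each can end up holding its first lock while waiting
forever for the other's, which is what lockdep is warning about, and is what
the patch in the subject line tries to avoid by doing the rpm get in the
submit path before any of the BO reservation locks are taken.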

BR,
-R