linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rob Clark <robdclark@gmail.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Daniel Vetter <daniel@ffwll.ch>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Rob Clark <robdclark@chromium.org>, Sean Paul <sean@poorly.run>,
	David Airlie <airlied@linux.ie>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU" 
	<linux-arm-msm@vger.kernel.org>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU" 
	<freedreno@lists.freedesktop.org>,
	open list <linux-kernel@vger.kernel.org>,
	"Menon, Nishanth" <nm@ti.com>
Subject: Re: [PATCH v2 07/22] drm/msm: Do rpm get sooner in the submit path
Date: Wed, 18 Nov 2020 08:53:57 -0800	[thread overview]
Message-ID: <CAF6AEGv=-h7GFj5LR97FkeBBn+gk6TNS5hZkwBwufpE4yO7GyA@mail.gmail.com> (raw)
In-Reply-To: <20201118052829.ugt7i7ac6eqsj4l6@vireshk-i7>

On Tue, Nov 17, 2020 at 9:28 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 17-11-20, 09:02, Rob Clark wrote:
> > With that on top of the previous patch,
>
> Don't you still have this ? Which fixed the lockdep in the remove path.
>
> https://lore.kernel.org/lkml/20201022080644.2ck4okrxygmkuatn@vireshk-i7/
>
> To make it clear you need these patches to fix the OPP stuff:
>
> //From 5.10-rc3 (the one from the above link).
> commit e0df59de670b ("opp: Reduce the size of critical section in _opp_table_kref_release()")
>
> //Below two from linux-next
> commit ef43f01ac069 ("opp: Always add entries in dev_list with opp_table->lock held")
> commit 27c09484dd3d ("opp: Allocate the OPP table outside of opp_table_lock")
>
> This matches the diff I gave you earlier.
>

no, I did not have all three, only "opp: Allocate the OPP table
outside of opp_table_lock" plus the fixup.  But with all three:

[   27.072188] ======================================================
[   27.078542] WARNING: possible circular locking dependency detected
[   27.084897] 5.10.0-rc2+ #1 Not tainted
[   27.088750] ------------------------------------------------------
[   27.095103] chrome/1897 is trying to acquire lock:
[   27.100031] ffffffdb14e4aa88 (opp_table_lock){+.+.}-{3:3}, at:
_find_opp_table+0x38/0x78
[   27.108379]
[   27.108379] but task is already holding lock:
[   27.114373] ffffff8e2c8f91b0
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
submit_lock_objects+0x70/0x1ec
[   27.124212]
[   27.124212] which lock already depends on the new lock.
[   27.124212]
[   27.132604]
[   27.132604] the existing dependency chain (in reverse order) is:
[   27.140290]
[   27.140290] -> #4 (reservation_ww_class_mutex){+.+.}-{3:3}:
[   27.147544]        lock_acquire+0x23c/0x30c
[   27.151848]        __mutex_lock_common+0xdc/0xbc4
[   27.156685]        ww_mutex_lock_interruptible+0x84/0xec
[   27.162142]        msm_gem_fault+0x30/0x138
[   27.166443]        __do_fault+0x44/0x184
[   27.170479]        handle_mm_fault+0x754/0xc50
[   27.175053]        do_page_fault+0x230/0x354
[   27.179444]        do_translation_fault+0x40/0x54
[   27.184277]        do_mem_abort+0x44/0xac
[   27.188402]        el0_sync_compat_handler+0x15c/0x190
[   27.193680]        el0_sync_compat+0x144/0x180
[   27.198244]
[   27.198244] -> #3 (&mm->mmap_lock){++++}-{3:3}:
[   27.204435]        lock_acquire+0x23c/0x30c
[   27.208738]        __might_fault+0x60/0x80
[   27.212951]        compat_filldir+0x118/0x4d0
[   27.217434]        dcache_readdir+0x74/0x1e0
[   27.221825]        iterate_dir+0xd4/0x198
[   27.225947]        __arm64_compat_sys_getdents+0x6c/0x168
[   27.231495]        el0_svc_common+0xa4/0x174
[   27.235886]        do_el0_svc_compat+0x20/0x30
[   27.240461]        el0_sync_compat_handler+0x124/0x190
[   27.245746]        el0_sync_compat+0x144/0x180
[   27.250310]
[   27.250310] -> #2 (&sb->s_type->i_mutex_key#2){++++}-{3:3}:
[   27.257569]        lock_acquire+0x23c/0x30c
[   27.261877]        down_write+0x80/0x1dc
[   27.265912]        simple_recursive_removal+0x48/0x238
[   27.271193]        debugfs_remove+0x5c/0x78
[   27.275502]        opp_debug_remove_one+0x18/0x20
[   27.280343]        _opp_kref_release+0x40/0x74
[   27.284917]        dev_pm_opp_put_unlocked+0x44/0x64
[   27.290015]        _opp_remove_all_static+0x5c/0x90
[   27.295029]        dev_pm_opp_remove_table+0x70/0x90
[   27.300129]        dev_pm_opp_of_remove_table+0x14/0x1c
[   27.305504]        msm_dsi_host_destroy+0xd8/0x108
[   27.310434]        dsi_destroy+0x40/0x58
[   27.314469]        dsi_bind+0x8c/0x16c
[   27.318329]        component_bind_all+0xf4/0x20c
[   27.323081]        msm_drm_init+0x180/0x588
[   27.327382]        msm_drm_bind+0x1c/0x24
[   27.331503]        try_to_bring_up_master+0x160/0x1a8
[   27.336696]        component_master_add_with_match+0xc4/0x108
[   27.342597]        msm_pdev_probe+0x214/0x2a4
[   27.347076]        platform_drv_probe+0x94/0xb4
[   27.351739]        really_probe+0x138/0x348
[   27.356041]        driver_probe_device+0x80/0xb8
[   27.360788]        device_driver_attach+0x50/0x70
[   27.365621]        __driver_attach+0xb4/0xc8
[   27.370012]        bus_for_each_dev+0x80/0xc8
[   27.374495]        driver_attach+0x28/0x30
[   27.378712]        bus_add_driver+0x100/0x1d4
[   27.383188]        driver_register+0x68/0xfc
[   27.387579]        __platform_driver_register+0x48/0x50
[   27.392957]        msm_drm_register+0x64/0x68
[   27.397434]        do_one_initcall+0x1ac/0x3e4
[   27.402011]        do_initcall_level+0xa0/0xb8
[   27.406583]        do_initcalls+0x58/0x94
[   27.410704]        do_basic_setup+0x28/0x30
[   27.415008]        kernel_init_freeable+0x190/0x1d0
[   27.420024]        kernel_init+0x18/0x10c
[   27.424146]        ret_from_fork+0x10/0x18
[   27.428362]
[   27.428362] -> #1 (&opp_table->lock){+.+.}-{3:3}:
[   27.434725]        lock_acquire+0x23c/0x30c
[   27.439028]        __mutex_lock_common+0xdc/0xbc4
[   27.443862]        mutex_lock_nested+0x50/0x58
[   27.448436]        _find_opp_table_unlocked+0x44/0xb4
[   27.453626]        _opp_get_opp_table+0x3c/0x280
[   27.458375]        dev_pm_opp_get_opp_table_indexed+0x14/0x1c
[   27.464281]        of_genpd_add_provider_onecell+0xd8/0x1c0
[   27.470019]        rpmhpd_probe+0x244/0x26c
[   27.474323]        platform_drv_probe+0x94/0xb4
[   27.478985]        really_probe+0x138/0x348
[   27.483287]        driver_probe_device+0x80/0xb8
[   27.488033]        __device_attach_driver+0x90/0xa8
[   27.493047]        bus_for_each_drv+0x84/0xcc
[   27.497524]        __device_attach+0xc0/0x148
[   27.502007]        device_initial_probe+0x18/0x20
[   27.506840]        bus_probe_device+0x38/0x98
[   27.511317]        device_add+0x214/0x3c8
[   27.515443]        of_device_add+0x3c/0x48
[   27.519654]        of_platform_device_create_pdata+0xac/0xec
[   27.525473]        of_platform_bus_create+0x1cc/0x348
[   27.530664]        of_platform_populate+0x78/0xc8
[   27.535496]        devm_of_platform_populate+0x5c/0xa4
[   27.540779]        rpmh_rsc_probe+0x370/0x3d0
[   27.545253]        platform_drv_probe+0x94/0xb4
[   27.549916]        really_probe+0x138/0x348
[   27.554223]        driver_probe_device+0x80/0xb8
[   27.558971]        __device_attach_driver+0x90/0xa8
[   27.563988]        bus_for_each_drv+0x84/0xcc
[   27.568465]        __device_attach+0xc0/0x148
[   27.572942]        device_initial_probe+0x18/0x20
[   27.577778]        bus_probe_device+0x38/0x98
[   27.582263]        fw_devlink_resume+0xdc/0x110
[   27.586930]        of_platform_default_populate_init+0xb8/0xd0
[   27.592923]        do_one_initcall+0x1ac/0x3e4
[   27.597489]        do_initcall_level+0xa0/0xb8
[   27.602051]        do_initcalls+0x58/0x94
[   27.606175]        do_basic_setup+0x28/0x30
[   27.610472]        kernel_init_freeable+0x190/0x1d0
[   27.615493]        kernel_init+0x18/0x10c
[   27.619616]        ret_from_fork+0x10/0x18
[   27.623823]
[   27.623823] -> #0 (opp_table_lock){+.+.}-{3:3}:
[   27.630006]        check_noncircular+0x12c/0x134
[   27.634757]        __lock_acquire+0x2288/0x2b2c
[   27.639419]        lock_acquire+0x23c/0x30c
[   27.643727]        __mutex_lock_common+0xdc/0xbc4
[   27.648566]        mutex_lock_nested+0x50/0x58
[   27.653133]        _find_opp_table+0x38/0x78
[   27.657520]        dev_pm_opp_find_freq_exact+0x2c/0xdc
[   27.662890]        a6xx_gmu_resume+0xcc/0xed0
[   27.667372]        a6xx_pm_resume+0x140/0x174
[   27.671849]        adreno_resume+0x24/0x2c
[   27.676070]        pm_generic_runtime_resume+0x2c/0x3c
[   27.681351]        __rpm_callback+0x74/0x114
[   27.685741]        rpm_callback+0x30/0x84
[   27.689865]        rpm_resume+0x3c8/0x4f0
[   27.693989]        __pm_runtime_resume+0x80/0xa4
[   27.698742]        msm_gpu_submit+0x60/0x228
[   27.703136]        msm_ioctl_gem_submit+0xba0/0xc1c
[   27.708158]        drm_ioctl_kernel+0xa0/0x11c
[   27.712724]        drm_ioctl+0x240/0x3dc
[   27.716762]        drm_compat_ioctl+0xd4/0xe4
[   27.721244]        __arm64_compat_sys_ioctl+0xc4/0xf8
[   27.726435]        el0_svc_common+0xa4/0x174
[   27.730827]        do_el0_svc_compat+0x20/0x30
[   27.735395]        el0_sync_compat_handler+0x124/0x190
[   27.740675]        el0_sync_compat+0x144/0x180
[   27.745240]
[   27.745240] other info that might help us debug this:
[   27.745240]
[   27.753459] Chain exists of:
[   27.753459]   opp_table_lock --> &mm->mmap_lock -->
reservation_ww_class_mutex
[   27.753459]
[   27.765342]  Possible unsafe locking scenario:
[   27.765342]
[   27.771422]        CPU0                    CPU1
[   27.776085]        ----                    ----
[   27.780747]   lock(reservation_ww_class_mutex);
[   27.785413]                                lock(&mm->mmap_lock);
[   27.791591]                                lock(reservation_ww_class_mutex);
[   27.798833]   lock(opp_table_lock);
[   27.802428]
[   27.802428]  *** DEADLOCK ***
[   27.802428]
[   27.808506] 3 locks held by chrome/1897:
[   27.812540]  #0: ffffff8e05f91138 (&dev->struct_mutex){+.+.}-{3:3},
at: msm_ioctl_gem_submit+0x238/0xc1c
[   27.822295]  #1: ffffff8e1ebd2670
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
msm_ioctl_gem_submit+0x978/0xc1c
[   27.832930]  #2: ffffff8e2c8f91b0
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
submit_lock_objects+0x70/0x1ec
[   27.843216]
[   27.843216] stack backtrace:
[   27.847702] CPU: 5 PID: 1897 Comm: chrome Not tainted 5.10.0-rc2+ #1
[   27.854235] Hardware name: Google Lazor (rev1+) with LTE (DT)
[   27.860142] Call trace:
[   27.862662]  dump_backtrace+0x0/0x1b4
[   27.866426]  show_stack+0x1c/0x24
[   27.869847]  dump_stack+0xdc/0x158
[   27.873349]  print_circular_bug+0x308/0x338
[   27.877647]  check_noncircular+0x12c/0x134
[   27.881858]  __lock_acquire+0x2288/0x2b2c
[   27.885984]  lock_acquire+0x23c/0x30c
[   27.889753]  __mutex_lock_common+0xdc/0xbc4
[   27.894054]  mutex_lock_nested+0x50/0x58
[   27.898086]  _find_opp_table+0x38/0x78
[   27.901946]  dev_pm_opp_find_freq_exact+0x2c/0xdc
[   27.906784]  a6xx_gmu_resume+0xcc/0xed0
[   27.910734]  a6xx_pm_resume+0x140/0x174
[   27.914684]  adreno_resume+0x24/0x2c
[   27.918363]  pm_generic_runtime_resume+0x2c/0x3c
[   27.923113]  __rpm_callback+0x74/0x114
[   27.926975]  rpm_callback+0x30/0x84
[   27.930565]  rpm_resume+0x3c8/0x4f0
[   27.934154]  __pm_runtime_resume+0x80/0xa4
[   27.938373]  msm_gpu_submit+0x60/0x228
[   27.942233]  msm_ioctl_gem_submit+0xba0/0xc1c
[   27.946713]  drm_ioctl_kernel+0xa0/0x11c
[   27.950749]  drm_ioctl+0x240/0x3dc
[   27.954256]  drm_compat_ioctl+0xd4/0xe4
[   27.958207]  __arm64_compat_sys_ioctl+0xc4/0xf8
[   27.962871]  el0_svc_common+0xa4/0x174
[   27.966731]  do_el0_svc_compat+0x20/0x30
[   27.970766]  el0_sync_compat_handler+0x124/0x190
[   27.975516]  el0_sync_compat+0x144/0x180

  reply	other threads:[~2020-11-18 16:54 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-12  2:09 [PATCH 00/14] drm/msm: de-struct_mutex-ification Rob Clark
2020-10-12  2:09 ` [PATCH v2 01/22] drm/msm/gem: Add obj->lock wrappers Rob Clark
2020-10-12  2:09 ` [PATCH v2 02/22] drm/msm/gem: Rename internal get_iova_locked helper Rob Clark
2020-10-12  2:09 ` [PATCH v2 03/22] drm/msm/gem: Move prototypes to msm_gem.h Rob Clark
2020-10-12  2:09 ` [PATCH v2 04/22] drm/msm/gem: Add some _locked() helpers Rob Clark
2020-10-12  2:09 ` [PATCH v2 05/22] drm/msm/gem: Move locking in shrinker path Rob Clark
2020-10-12  2:09 ` [PATCH v2 06/22] drm/msm/submit: Move copy_from_user ahead of locking bos Rob Clark
2020-10-12  2:09 ` [PATCH v2 07/22] drm/msm: Do rpm get sooner in the submit path Rob Clark
2020-10-12 14:35   ` Daniel Vetter
2020-10-12 15:43     ` Rob Clark
2020-10-20  9:07       ` Viresh Kumar
2020-10-20 10:56         ` Daniel Vetter
2020-10-20 11:24           ` Viresh Kumar
2020-10-20 11:42             ` Daniel Vetter
2020-10-20 14:13             ` Rob Clark
2020-10-22  8:06               ` Viresh Kumar
2020-10-25 17:39                 ` Rob Clark
2020-10-27 11:35                   ` Viresh Kumar
2020-11-03  5:47                     ` Viresh Kumar
2020-11-03 16:50                       ` Rob Clark
2020-11-04  3:03                         ` Viresh Kumar
2020-11-05 19:24                           ` Rob Clark
2020-11-06  7:16                             ` Viresh Kumar
2020-11-17 10:03                               ` Viresh Kumar
2020-11-17 17:02                               ` Rob Clark
2020-11-18  5:28                                 ` Viresh Kumar
2020-11-18 16:53                                   ` Rob Clark [this message]
2020-11-19  6:05                                     ` Viresh Kumar
2020-12-07  6:16                                       ` Viresh Kumar
2020-12-16  5:22                                         ` Viresh Kumar
2020-10-12  2:09 ` [PATCH v2 08/22] drm/msm/gem: Switch over to obj->resv for locking Rob Clark
2020-10-12  2:09 ` [PATCH v2 09/22] drm/msm: Use correct drm_gem_object_put() in fail case Rob Clark
2020-10-12  2:09 ` [PATCH v2 10/22] drm/msm: Drop chatty trace Rob Clark
2020-10-12  2:09 ` [PATCH v2 11/22] drm/msm: Move update_fences() Rob Clark
2020-10-12  2:09 ` [PATCH v2 12/22] drm/msm: Add priv->mm_lock to protect active/inactive lists Rob Clark
2020-10-12  2:09 ` [PATCH v2 13/22] drm/msm: Document and rename preempt_lock Rob Clark
2020-10-12  2:09 ` [PATCH v2 14/22] drm/msm: Protect ring->submits with it's own lock Rob Clark
2020-10-12  2:09 ` [PATCH v2 15/22] drm/msm: Refcount submits Rob Clark
2020-10-12  2:09 ` [PATCH v2 16/22] drm/msm: Remove obj->gpu Rob Clark
2020-10-12  2:09 ` [PATCH v2 17/22] drm/msm: Drop struct_mutex from the retire path Rob Clark
2020-10-12  2:09 ` [PATCH v2 18/22] drm/msm: Drop struct_mutex in free_object() path Rob Clark
2020-10-12  2:09 ` [PATCH v2 19/22] drm/msm: remove msm_gem_free_work Rob Clark
2020-10-12  2:09 ` [PATCH v2 20/22] drm/msm: drop struct_mutex in madvise path Rob Clark
2020-10-12  2:09 ` [PATCH v2 21/22] drm/msm: Drop struct_mutex in shrinker path Rob Clark
2020-10-12  2:09 ` [PATCH v2 22/22] drm/msm: Don't implicit-sync if only a single ring Rob Clark
2020-10-12 14:40   ` Daniel Vetter
2020-10-12 15:07     ` Rob Clark
2020-10-13 11:08       ` Daniel Vetter
2020-10-13 16:15         ` [Freedreno] " Rob Clark
2020-10-15  8:22           ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAF6AEGv=-h7GFj5LR97FkeBBn+gk6TNS5hZkwBwufpE4yO7GyA@mail.gmail.com' \
    --to=robdclark@gmail.com \
    --cc=airlied@linux.ie \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nm@ti.com \
    --cc=robdclark@chromium.org \
    --cc=sean@poorly.run \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).