All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: <intel-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	"Jani Nikula" <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Tvrtko Ursulin <tursulin@ursulin.net>,
	Anshuman Gupta <anshuman.gupta@intel.com>,
	Badal Nilawar <badal.nilawar@intel.com>,
	"Guenter\ Roeck" <linux@roeck-us.net>,
	Dale B Stimson <dale.b.stimson@intel.com>,
	"Andi\ Shyti" <andi.shyti@linux.intel.com>,
	Jonathan Cavitt <jonathan.cavitt@intel.com>,
	Nirmoy Das <nirmoy.das@intel.com>
Subject: Re: [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter
Date: Tue, 12 Mar 2024 14:11:02 -0700	[thread overview]
Message-ID: <85y1an5f6h.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <1841305.FMhQkTaH9n@jkrzyszt-mobl2.ger.corp.intel.com>

On Tue, 12 Mar 2024 13:34:25 -0700, Janusz Krzysztofik wrote:
>

Hi Janusz,

> On Tuesday, 12 March 2024 17:25:14 CET Dixit, Ashutosh wrote:
> > On Mon, 11 Mar 2024 13:34:58 -0700, Janusz Krzysztofik wrote:
> > >
> > > In i915 hwmon sysfs getter path we now take a hwmon_lock, then acquire an
> > > rpm wakeref.  That results in lock inversion:
> > >
> > > <4> [197.079335] ======================================================
> > > <4> [197.085473] WARNING: possible circular locking dependency detected
> > > <4> [197.091611] 6.8.0-rc7-Patchwork_129026v7-gc4dc92fb1152+ #1 Not tainted
> > > <4> [197.098096] ------------------------------------------------------
> > > <4> [197.104231] prometheus-node/839 is trying to acquire lock:
> > > <4> [197.109680] ffffffff82764d80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x9a/0x350
> > > <4> [197.116939]
> > > but task is already holding lock:
> > > <4> [197.122730] ffff88811b772a40 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
> > > <4> [197.131543]
> > > which lock already depends on the new lock.
> > > ...
> > > <4> [197.507922] Chain exists of:
> > >   fs_reclaim --> &gt->reset.mutex --> &hwmon->hwmon_lock
> > > <4> [197.518528]  Possible unsafe locking scenario:
> > > <4> [197.524411]        CPU0                    CPU1
> > > <4> [197.528916]        ----                    ----
> > > <4> [197.533418]   lock(&hwmon->hwmon_lock);
> > > <4> [197.537237]                                lock(&gt->reset.mutex);
> > > <4> [197.543376]                                lock(&hwmon->hwmon_lock);
> > > <4> [197.549682]   lock(fs_reclaim);
> > > ...
> > > <4> [197.632548] Call Trace:
> > > <4> [197.634990]  <TASK>
> > > <4> [197.637088]  dump_stack_lvl+0x64/0xb0
> > > <4> [197.640738]  check_noncircular+0x15e/0x180
> > > <4> [197.652968]  check_prev_add+0xe9/0xce0
> > > <4> [197.656705]  __lock_acquire+0x179f/0x2300
> > > <4> [197.660694]  lock_acquire+0xd8/0x2d0
> > > <4> [197.673009]  fs_reclaim_acquire+0xa1/0xd0
> > > <4> [197.680478]  __kmalloc+0x9a/0x350
> > > <4> [197.689063]  acpi_ns_internalize_name.part.0+0x4a/0xb0
> > > <4> [197.694170]  acpi_ns_get_node_unlocked+0x60/0xf0
> > > <4> [197.720608]  acpi_ns_get_node+0x3b/0x60
> > > <4> [197.724428]  acpi_get_handle+0x57/0xb0
> > > <4> [197.728164]  acpi_has_method+0x20/0x50
> > > <4> [197.731896]  acpi_pci_set_power_state+0x43/0x120
> > > <4> [197.736485]  pci_power_up+0x24/0x1c0
> > > <4> [197.740047]  pci_pm_default_resume_early+0x9/0x30
> > > <4> [197.744725]  pci_pm_runtime_resume+0x2d/0x90
> > > <4> [197.753911]  __rpm_callback+0x3c/0x110
> > > <4> [197.762586]  rpm_callback+0x58/0x70
> > > <4> [197.766064]  rpm_resume+0x51e/0x730
> > > <4> [197.769542]  rpm_resume+0x267/0x730
> > > <4> [197.773020]  rpm_resume+0x267/0x730
> > > <4> [197.776498]  rpm_resume+0x267/0x730
> > > <4> [197.779974]  __pm_runtime_resume+0x49/0x90
> > > <4> [197.784055]  __intel_runtime_pm_get+0x19/0xa0 [i915]
> > > <4> [197.789070]  hwm_energy+0x55/0x100 [i915]
> > > <4> [197.793183]  hwm_read+0x9a/0x310 [i915]
> > > <4> [197.797124]  hwmon_attr_show+0x36/0x120
> > > <4> [197.800946]  dev_attr_show+0x15/0x60
> > > <4> [197.804509]  sysfs_kf_seq_show+0xb5/0x100
> > >
> > > Acquire the wakeref before the lock and hold it as long as the lock is
> > > also held.  Follow that pattern across the whole source file where similar
> > > lock inversion can happen.
> > >
> > > v2: Keep hardware read under the lock so the whole operation of updating
> > >     energy from hardware is still atomic (Guenter),
> > >   - instead, acquire the rpm wakeref before the lock and hold it as long
> > >     as the lock is held,
> > >   - use the same aproach for other similar places across the i915_hwmon.c
> > >     source file (Rodrigo).
> > >
> > > Fixes: c41b8bdcc297 ("drm/i915/hwmon: Show device level energy usage")
> >
> > I would think that the lock inversion issue was introduced here:
> >
> > 1b44019a93e2 ("drm/i915/guc: Disable PL1 power limit when loading GuC firmware")
> >
> > This is the commit which introduced this sequence:
> >	lock(&gt->reset.mutex);
> >	lock(&hwmon->hwmon_lock);
> >
> > Before this, everything was fine. So perhaps the Fixes tag should reference
> > this commit?
>
> OK, thanks for pointing that out.
>
> > Otherwise the patch LGTM:
> >
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

Thanks for fixing this. Somehow I didn't see it when I did
1b44019a93e2. Maybe just didn't have lockdep enabled in the kernel.

Thanks.
--
Ashutosh

  reply	other threads:[~2024-03-12 21:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-11 20:34 [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter Janusz Krzysztofik
2024-03-12  2:43 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/hwmon: Fix locking inversion in sysfs getter (rev2) Patchwork
2024-03-12  2:55 ` ✓ Fi.CI.BAT: success " Patchwork
2024-03-12 10:02 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-03-12 15:14   ` Janusz Krzysztofik
2024-03-13  7:45     ` Illipilli, TejasreeX
2024-03-12 16:25 ` [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter Dixit, Ashutosh
2024-03-12 20:34   ` Janusz Krzysztofik
2024-03-12 21:11     ` Dixit, Ashutosh [this message]
2024-03-12 17:09 ` Andi Shyti
2024-03-12 20:35   ` Janusz Krzysztofik
2024-03-13  6:08 ` ✗ Fi.CI.IGT: failure for drm/i915/hwmon: Fix locking inversion in sysfs getter (rev2) Patchwork
2024-03-13  6:51 ` ✓ Fi.CI.IGT: success " Patchwork
2024-03-13 14:30 ` [PATCH v2] drm/i915/hwmon: Fix locking inversion in sysfs getter Andi Shyti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=85y1an5f6h.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=andi.shyti@linux.intel.com \
    --cc=anshuman.gupta@intel.com \
    --cc=badal.nilawar@intel.com \
    --cc=dale.b.stimson@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=janusz.krzysztofik@linux.intel.com \
    --cc=jonathan.cavitt@intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=linux@roeck-us.net \
    --cc=nirmoy.das@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=tursulin@ursulin.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.