alsa-devel.alsa-project.org archive mirror
* [PATCH v2] component: do not leave master devres group open after bind
@ 2021-09-22  8:54 Kai Vehmanen
  2021-09-28 10:22 ` Takashi Iwai
  2021-10-05 14:35 ` Greg KH
  0 siblings, 2 replies; 6+ messages in thread
From: Kai Vehmanen @ 2021-09-22  8:54 UTC
  To: dri-devel, gregkh, tiwai
  Cc: alsa-devel, kai.vehmanen, Rafael J . Wysocki, jani.nikula,
	Imre Deak, Russell King, Russell King, intel-gfx

In current code, the devres group for aggregate master is left open
after call to component_master_add_*(). This leads to problems when the
master does further managed allocations on its own. When any
participating driver calls component_del(), this leads to immediate
release of resources.

This came up when investigating a page fault occurring with i915 DRM
driver unbind with 5.15-rc1 kernel. The following sequence occurs:

 i915_pci_remove()
   -> intel_display_driver_unregister()
     -> i915_audio_component_cleanup()
       -> component_del()
         -> component.c:take_down_master()
           -> hdac_component_master_unbind() [via master->ops->unbind()]
           -> devres_release_group(master->parent, NULL)

With older kernels this has not caused issues, but with audio driver
moving to use managed interfaces for more of its allocations, this no
longer works. Devres log shows following to occur:

component_master_add_with_match()
[  126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
[  126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
[  126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)

audio driver completes its PCI probe()
[  126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)

component_del() called at DRM/i915 unbind()
[  137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
[  137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
[  137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)

So the "devres_release_group(master->parent, NULL)" ends up freeing the
pcim_iomap allocation. Upon next runtime resume, the audio driver will
cause a page fault as the iomap alloc was released without the driver
knowing about it.

Fix this issue by using the "struct master" pointer as identifier for
the devres group, and by closing the devres group after
the master->ops->bind() call is done. This allows devres allocations
done by the driver acting as master to be isolated from the binding state
of the aggregate driver. This modifies the logic originally introduced in
commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/base/component.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

V2 changes:
 - after review from Imre and Russell, removing RFC tag
 - rebased on top of 5.15-rc2 (V1 was on drm-tip)
 - CI test results for V1 show that this patch fixes multiple
   failures in i915 unbind and module reload tests:
   https://patchwork.freedesktop.org/series/94889/

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 5e79299f6c3f..870485cbbb87 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -246,7 +246,7 @@ static int try_to_bring_up_master(struct master *master,
 		return 0;
 	}
 
-	if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
+	if (!devres_open_group(master->parent, master, GFP_KERNEL))
 		return -ENOMEM;
 
 	/* Found all components */
@@ -258,6 +258,7 @@ static int try_to_bring_up_master(struct master *master,
 		return ret;
 	}
 
+	devres_close_group(master->parent, NULL);
 	master->bound = true;
 	return 1;
 }
@@ -282,7 +283,7 @@ static void take_down_master(struct master *master)
 {
 	if (master->bound) {
 		master->ops->unbind(master->parent);
-		devres_release_group(master->parent, NULL);
+		devres_release_group(master->parent, master);
 		master->bound = false;
 	}
 }

base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
-- 
2.32.0



* Re: [PATCH v2] component: do not leave master devres group open after bind
  2021-09-22  8:54 [PATCH v2] component: do not leave master devres group open after bind Kai Vehmanen
@ 2021-09-28 10:22 ` Takashi Iwai
  2021-09-28 10:45   ` Kai Vehmanen
  2021-10-05 14:35 ` Greg KH
  1 sibling, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2021-09-28 10:22 UTC
  To: Kai Vehmanen
  Cc: alsa-devel, Rafael J . Wysocki, jani.nikula, gregkh, Imre Deak,
	dri-devel, Russell King, Russell King, intel-gfx

On Wed, 22 Sep 2021 10:54:32 +0200,
Kai Vehmanen wrote:
(snip)
> --- a/drivers/base/component.c
> +++ b/drivers/base/component.c
> @@ -246,7 +246,7 @@ static int try_to_bring_up_master(struct master *master,
>  		return 0;
>  	}
>  
> -	if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
> +	if (!devres_open_group(master->parent, master, GFP_KERNEL))
>  		return -ENOMEM;
>  
>  	/* Found all components */
> @@ -258,6 +258,7 @@ static int try_to_bring_up_master(struct master *master,
>  		return ret;
>  	}
>  
> +	devres_close_group(master->parent, NULL);

Just wondering whether we should pass master here instead of NULL,
too?


thanks,

Takashi


* Re: [PATCH v2] component: do not leave master devres group open after bind
  2021-09-28 10:22 ` Takashi Iwai
@ 2021-09-28 10:45   ` Kai Vehmanen
  0 siblings, 0 replies; 6+ messages in thread
From: Kai Vehmanen @ 2021-09-28 10:45 UTC
  To: Takashi Iwai
  Cc: alsa-devel, Kai Vehmanen, Rafael J . Wysocki, jani.nikula,
	gregkh, Imre Deak, dri-devel, Russell King, Russell King,
	intel-gfx

Hey,

On Tue, 28 Sep 2021, Takashi Iwai wrote:

> On Wed, 22 Sep 2021 10:54:32 +0200, Kai Vehmanen wrote:
> > --- a/drivers/base/component.c
> > +++ b/drivers/base/component.c
> > @@ -246,7 +246,7 @@ static int try_to_bring_up_master(struct master *master,
> >  		return 0;
> >  	}
> >  
> > -	if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
> > +	if (!devres_open_group(master->parent, master, GFP_KERNEL))
> >  		return -ENOMEM;
> >  
> >  	/* Found all components */
> > @@ -258,6 +258,7 @@ static int try_to_bring_up_master(struct master *master,
> >  		return ret;
> >  	}
> >  
> > +	devres_close_group(master->parent, NULL);
> 
> Just wondering whether we should pass master here instead of NULL,
> too?

I wondered about this as well. Functionally it should be equivalent, as
passing NULL will apply the operation to the latest added group. I noted
that the practice of passing NULL has been followed in the existing code
when referring to groups created within the same function. E.g.

	if (!devres_open_group(component->dev, component, GFP_KERNEL)) {
[...]
	ret = component->ops->bind(component->dev, master->parent, data);
	if (!ret) {
		component->bound = true;

		/*
		 * Close the component device's group so that resources
		 * allocated in the binding are encapsulated for removal
		 * at unbind.  Remove the group on the DRM device as we
		 * can clean those resources up independently.
		 */
		devres_close_group(component->dev, NULL);

... so I followed this existing practice. I can change this and send a V3
if the explicit parameter is preferred.
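
To make the open-versus-closed group behaviour discussed above concrete,
here is a minimal userspace sketch. It is a toy model, not the kernel
implementation: every toy_* name, the fixed-size array, and the flat
(non-nested) release logic are assumptions made for illustration. It
models only two rules: a NULL id selects the latest group that is still
open, and releasing a group that was never closed drops everything added
after its open marker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a per-device devres list. Hypothetical, userspace only. */
enum toy_kind { TOY_RES, TOY_GRP_OPEN, TOY_GRP_CLOSE };

struct toy_entry {
	enum toy_kind kind;
	const void *id;   /* group id for open/close markers */
	const char *name; /* label for plain resources */
};

struct toy_dev {
	struct toy_entry list[32];
	size_t n;
};

static void toy_add(struct toy_dev *d, enum toy_kind k, const void *id,
		    const char *name)
{
	assert(d->n < 32);
	d->list[d->n++] = (struct toy_entry){ k, id, name };
}

/* Returns 1 and stores the close marker index if the group is closed. */
static int toy_is_closed(const struct toy_dev *d, size_t open_idx,
			 size_t *close_idx)
{
	for (size_t i = open_idx + 1; i < d->n; i++) {
		if (d->list[i].kind == TOY_GRP_CLOSE &&
		    d->list[i].id == d->list[open_idx].id) {
			if (close_idx)
				*close_idx = i;
			return 1;
		}
	}
	return 0;
}

/* Search newest-first. NULL id: latest group that is still open. */
static long toy_find_group(const struct toy_dev *d, const void *id)
{
	for (size_t i = d->n; i-- > 0; ) {
		if (d->list[i].kind != TOY_GRP_OPEN)
			continue;
		if (id ? d->list[i].id == id : !toy_is_closed(d, i, NULL))
			return (long)i;
	}
	return -1;
}

static int toy_close_group(struct toy_dev *d, const void *id)
{
	long g = toy_find_group(d, id);

	if (g < 0)
		return -1;
	toy_add(d, TOY_GRP_CLOSE, d->list[g].id, NULL);
	return 0;
}

/*
 * Drop everything from the open marker up to the close marker, or up to
 * the end of the list if the group is still open. Returns the number of
 * plain resources released, or -1 if no group matched.
 */
static int toy_release_group(struct toy_dev *d, const void *id)
{
	long g = toy_find_group(d, id);
	size_t start, end;
	int released = 0;

	if (g < 0)
		return -1;
	start = (size_t)g;
	end = toy_is_closed(d, start, &end) ? end + 1 : d->n;
	for (size_t i = start; i < end; i++)
		released += d->list[i].kind == TOY_RES;
	memmove(&d->list[start], &d->list[end],
		(d->n - end) * sizeof(d->list[0]));
	d->n -= end - start;
	return released;
}

/* Replay bind, a later probe-time allocation, then unbind. */
static int toy_replay(int close_after_bind)
{
	struct toy_dev dev = { .n = 0 };
	int master; /* stands in for struct master; only its address is used */

	toy_add(&dev, TOY_GRP_OPEN, &master, NULL);
	toy_add(&dev, TOY_RES, NULL, "bind-time resource");
	if (close_after_bind)
		toy_close_group(&dev, NULL); /* NULL: latest open group */
	toy_add(&dev, TOY_RES, NULL, "pcim_iomap_release"); /* after bind */
	return toy_release_group(&dev, &master);
}
```

In this model, toy_replay(0) releases both resources, including the one
added after bind, while toy_replay(1) releases only the bind-time
resource. Closing with NULL and closing with the explicit id pick the same
group here because it is the only open group at that point.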

Br, Kai


* Re: [PATCH v2] component: do not leave master devres group open after bind
  2021-09-22  8:54 [PATCH v2] component: do not leave master devres group open after bind Kai Vehmanen
  2021-09-28 10:22 ` Takashi Iwai
@ 2021-10-05 14:35 ` Greg KH
  2021-10-06 13:47   ` Kai Vehmanen
  1 sibling, 1 reply; 6+ messages in thread
From: Greg KH @ 2021-10-05 14:35 UTC
  To: Kai Vehmanen
  Cc: alsa-devel, Rafael J . Wysocki, tiwai, Imre Deak, dri-devel,
	jani.nikula, Russell King, Russell King, intel-gfx

On Wed, Sep 22, 2021 at 11:54:32AM +0300, Kai Vehmanen wrote:
> In current code, the devres group for aggregate master is left open
> after call to component_master_add_*(). This leads to problems when the
> master does further managed allocations on its own. When any
> participating driver calls component_del(), this leads to immediate
> release of resources.
> 
> This came up when investigating a page fault occurring with i915 DRM
> driver unbind with 5.15-rc1 kernel. The following sequence occurs:
> 
>  i915_pci_remove()
>    -> intel_display_driver_unregister()
>      -> i915_audio_component_cleanup()
>        -> component_del()
>          -> component.c:take_down_master()
>            -> hdac_component_master_unbind() [via master->ops->unbind()]
>            -> devres_release_group(master->parent, NULL)
> 
> With older kernels this has not caused issues, but with audio driver
> moving to use managed interfaces for more of its allocations, this no
> longer works. Devres log shows following to occur:
> 
> component_master_add_with_match()
> [  126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
> [  126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
> [  126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)
> 
> audio driver completes its PCI probe()
> [  126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)
> 
> component_del() called at DRM/i915 unbind()
> [  137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
> [  137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
> [  137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)
> 
> So the "devres_release_group(master->parent, NULL)" ends up freeing the
> pcim_iomap allocation. Upon next runtime resume, the audio driver will
> cause a page fault as the iomap alloc was released without the driver
> knowing about it.
> 
> Fix this issue by using the "struct master" pointer as identifier for
> the devres group, and by closing the devres group after
> the master->ops->bind() call is done. This allows devres allocations
> done by the driver acting as master to be isolated from the binding state
> of the aggregate driver. This modifies the logic originally introduced in
> commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
> 
> BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> Acked-by: Imre Deak <imre.deak@intel.com>
> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/base/component.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

What commit does this "fix:"?  And does it need to go to stable
kernel(s)?

thanks,

greg k-h


* Re: [PATCH v2] component: do not leave master devres group open after bind
  2021-10-05 14:35 ` Greg KH
@ 2021-10-06 13:47   ` Kai Vehmanen
  2021-10-13 13:09     ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Kai Vehmanen @ 2021-10-06 13:47 UTC
  To: Greg KH
  Cc: alsa-devel, Kai Vehmanen, Rafael J . Wysocki, Takashi Iwai,
	Imre Deak, dri-devel, jani.nikula, Russell King, Russell King,
	intel-gfx

Hi,

On Tue, 5 Oct 2021, Greg KH wrote:

> On Wed, Sep 22, 2021 at 11:54:32AM +0300, Kai Vehmanen wrote:
> > In current code, the devres group for aggregate master is left open
> > after call to component_master_add_*(). This leads to problems when the
> > master does further managed allocations on its own. When any
> > participating driver calls component_del(), this leads to immediate
> > release of resources.
[...]
> > the devres group, and by closing the devres group after
> > the master->ops->bind() call is done. This allows devres allocations
> > done by the driver acting as master to be isolated from the binding state
> > of the aggregate driver. This modifies the logic originally introduced in
> > commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
> > 
> > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
> > Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > Acked-by: Imre Deak <imre.deak@intel.com>
> > Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> 
> What commit does this "fix:"?  And does it need to go to stable
> kernel(s)?

I didn't put a "Fixes" on the original commit 9e1ccb4a7700 
("drivers/base: fix devres handling for master device") as it alone
didn't cause problems. It did open the door for possible devres issues
for anybody calling component_master_add_().

On audio side, this surfaced with the more recent commit 3fcaf24e5dce 
("ALSA: hda: Allocate resources with device-managed APIs"). In theory one 
could have hit issues already before, but this made it very easy to hit
on actual systems.

If I'd have to pick one, it would be 9e1ccb4a7700 ("drivers/base: fix 
devres handling for master device"). And yes, given comments on this 
thread, I'd say this needs to go to stable kernels.

Br, Kai


* Re: [PATCH v2] component: do not leave master devres group open after bind
  2021-10-06 13:47   ` Kai Vehmanen
@ 2021-10-13 13:09     ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2021-10-13 13:09 UTC
  To: Kai Vehmanen
  Cc: alsa-devel, Rafael J . Wysocki, Takashi Iwai, Imre Deak,
	dri-devel, jani.nikula, Russell King, Russell King, intel-gfx

On Wed, Oct 06, 2021 at 04:47:57PM +0300, Kai Vehmanen wrote:
> Hi,
> 
> On Tue, 5 Oct 2021, Greg KH wrote:
> 
> > On Wed, Sep 22, 2021 at 11:54:32AM +0300, Kai Vehmanen wrote:
> > > In current code, the devres group for aggregate master is left open
> > > after call to component_master_add_*(). This leads to problems when the
> > > master does further managed allocations on its own. When any
> > > participating driver calls component_del(), this leads to immediate
> > > release of resources.
> [...]
> > > the devres group, and by closing the devres group after
> > > the master->ops->bind() call is done. This allows devres allocations
> > > done by the driver acting as master to be isolated from the binding state
> > > of the aggregate driver. This modifies the logic originally introduced in
> > > commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
> > > 
> > > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
> > > Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > > Acked-by: Imre Deak <imre.deak@intel.com>
> > > Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > 
> > What commit does this "fix:"?  And does it need to go to stable
> > kernel(s)?
> 
> I didn't put a "Fixes" on the original commit 9e1ccb4a7700 
> ("drivers/base: fix devres handling for master device") as it alone
> didn't cause problems. It did open the door for possible devres issues
> for anybody calling component_master_add_().
> 
> On audio side, this surfaced with the more recent commit 3fcaf24e5dce 
> ("ALSA: hda: Allocate resources with device-managed APIs"). In theory one 
> could have hit issues already before, but this made it very easy to hit
> on actual systems.
> 
> If I'd have to pick one, it would be 9e1ccb4a7700 ("drivers/base: fix 
> devres handling for master device"). And yes, given comments on this 
> thread, I'd say this needs to go to stable kernels.

Then please add a fixes: line and a cc: stable line and resend.

thanks,

greg k-h
