linux-input.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] HID: google: Don't use devm for hid_hw_stop()
@ 2023-05-05 23:24 Stephen Boyd
  2023-05-06  0:06 ` Dmitry Torokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Boyd @ 2023-05-05 23:24 UTC (permalink / raw)
  To: Jiri Kosina, Benjamin Tissoires
  Cc: linux-kernel, patches, linux-input, Dmitry Torokhov

We (ChromeOS) got a syzkaller report of a KASAN use after free read in
hidinput_find_key(). The callstack is from evdev_ioctl() calling
hidinput_setkeycode():

 __asan_report_load4_noabort+0x44/0x50
 hidinput_find_key+0x25c/0x340
 hidinput_locate_usage+0x31c/0x400
 hidinput_setkeycode+0x70/0x460
 input_set_keycode+0xd4/0x3f8
 evdev_do_ioctl+0x2508/0x6678
 evdev_ioctl_handler+0x12c/0x180
 evdev_ioctl+0x40/0x54

The memory being read was allocated during hammer_probe() by
hid_open_report():

 Allocated by task 19025:
 kasan_save_stack+0x38/0x68
 __kasan_kmalloc+0x90/0xac
 __kmalloc+0x27c/0x45c
 hid_add_field+0x4b0/0x125c
 hid_parser_main+0x214/0x994
 hid_open_report+0x388/0x7a8
 hammer_probe+0x80/0x698 [hid_google_hammer]

and the memory was freed by hid_close_report() called from
hid_destroy_device().

 Freed by task 19025:
 kasan_save_stack+0x38/0x68
 kasan_set_track+0x28/0x3c
 kasan_set_free_info+0x28/0x4c
 ____kasan_slab_free+0x110/0x164
 __kasan_slab_free+0x18/0x28
 kfree+0x208/0x950
 hid_close_report+0xd0/0x29c
 hid_device_remove+0x104/0x198
 device_release_driver_internal+0x204/0x400
 device_release_driver+0x30/0x40
 bus_remove_device+0x2a0/0x390
 device_del+0x49c/0x858
 hid_destroy_device+0x78/0x11c
 usbhid_disconnect+0xb4/0x100
 usb_unbind_interface+0x178/0x6f4
 device_release_driver_internal+0x240/0x400
 device_release_driver+0x30/0x40
 bus_remove_device+0x2a0/0x390

The memory that's being read by the ioctl is an HID report that's been
freed when the HID device is destroyed because the usb interface is
unbound. In hid_device_remove() we assume that the hid report can be
closed with hid_close_report() after the hid_driver is unbound, which is
generally safe because the driver should have stopped the hardware with
hid_hw_stop() when it was unbound. In fact, hid_device_remove() falls
back to calling hid_hw_stop() directly if the hid driver doesn't have a
remove() function, so the assumption is that hid_hw_stop() has been
called once the hid_driver::remove() function returns. hid_hw_stop()
will eventually call hidinput_disconnect() which will unregister the
hidinput device; ensuring that userspace can't call ioctls on the
hidinput device when hid_hw_stop() returns.

Unfortunately, the hid google hammer driver hand rolls a devm function
to call hid_hw_stop() when the driver is unbound and implements an
hid_driver::remove() function. The driver core doesn't call the devm
release functions until _after_ the bus unbinds the driver, so the order
of operations is like this:

  __device_release_driver()
   ...
   device_remove(dev)
    hid_device_remove(hdev)
      hdrv->remove(hdev);
      hid_close_report(hdev)  <---- Frees the report
   device_unbind_cleanup(dev)
    devres_release_all(dev)
      ...
      hid_hw_stop(hdev) <--- Removes the hid_input device

We want the order of operations to be hid_hw_stop() and then
hid_close_report() so that the report can be freed without the hid_input
device hanging around attempting to deref the report. Remove the hand
rolled devm function and call hid_hw_stop() from the hammer_remove()
function to fix the ordering.

Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Fixes: d950db3f80a8 ("HID: google: switch to devm when registering keyboard backlight LED")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/hid/hid-google-hammer.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/hid/hid-google-hammer.c b/drivers/hid/hid-google-hammer.c
index 7ae5f27df54d..e7f7c3c68747 100644
--- a/drivers/hid/hid-google-hammer.c
+++ b/drivers/hid/hid-google-hammer.c
@@ -495,11 +495,6 @@ static void hammer_get_folded_state(struct hid_device *hdev)
 	kfree(buf);
 }
 
-static void hammer_stop(void *hdev)
-{
-	hid_hw_stop(hdev);
-}
-
 static int hammer_probe(struct hid_device *hdev,
 			const struct hid_device_id *id)
 {
@@ -520,10 +515,6 @@ static int hammer_probe(struct hid_device *hdev,
 	if (error)
 		return error;
 
-	error = devm_add_action(&hdev->dev, hammer_stop, hdev);
-	if (error)
-		return error;
-
 	/*
 	 * We always want to poll for, and handle tablet mode events from
 	 * devices that have folded usage, even when nobody has opened the input
@@ -533,8 +524,10 @@ static int hammer_probe(struct hid_device *hdev,
 	if (hammer_has_folded_event(hdev)) {
 		hdev->quirks |= HID_QUIRK_ALWAYS_POLL;
 		error = hid_hw_open(hdev);
-		if (error)
+		if (error) {
+			hid_hw_stop(hdev);
 			return error;
+		}
 
 		hammer_get_folded_state(hdev);
 	}
@@ -576,7 +569,8 @@ static void hammer_remove(struct hid_device *hdev)
 		spin_unlock_irqrestore(&cbas_ec_lock, flags);
 	}
 
-	/* Unregistering LEDs and stopping the hardware is done via devm */
+	/* Unregistering LEDs is done via devm */
+	hid_hw_stop(hdev);
 }
 
 static const struct hid_device_id hammer_devices[] = {

base-commit: 457391b0380335d5e9a5babdec90ac53928b23b4
-- 
https://chromeos.dev


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()
  2023-05-05 23:24 [PATCH] HID: google: Don't use devm for hid_hw_stop() Stephen Boyd
@ 2023-05-06  0:06 ` Dmitry Torokhov
  2023-05-10 18:51   ` Stephen Boyd
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Torokhov @ 2023-05-06  0:06 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Jiri Kosina, Benjamin Tissoires, linux-kernel, patches, linux-input

On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> 
...
> Unfortunately, the hid google hammer driver hand rolls a devm function
> to call hid_hw_stop() when the driver is unbound and implements an
> hid_driver::remove() function. The driver core doesn't call the devm
> release functions until _after_ the bus unbinds the driver, so the order
> of operations is like this:

Excellent analysis, but the problem is not limited to the hammer driver
(potentially) and shalt be dealt with appropriately, at the HID bus
level.

Actually, it is not even limited to HID, but exists in most buses with
non-trivial ->remove() implementation. For example I fixed similar issue
in I2C in 5b5475826c52 ("i2c: ensure timely release of driver-allocated
resources"). I tried fixing it in SPI but Mark has some objections, and
wanted to fix it in the driver core, so I was thinking about it and then
dropped the ball. At this time I do not think fixing it at driver core
makes logic any clearer, so I think we just need to fix a handful of
buses.

Anyway, I'll CC you on an alternative patch.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()
  2023-05-06  0:06 ` Dmitry Torokhov
@ 2023-05-10 18:51   ` Stephen Boyd
  2023-05-10 20:24     ` Dmitry Torokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Boyd @ 2023-05-10 18:51 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Jiri Kosina, Benjamin Tissoires, linux-kernel, patches, linux-input

Quoting Dmitry Torokhov (2023-05-05 17:06:07)
> On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> >
> ...
> > Unfortunately, the hid google hammer driver hand rolls a devm function
> > to call hid_hw_stop() when the driver is unbound and implements an
> > hid_driver::remove() function. The driver core doesn't call the devm
> > release functions until _after_ the bus unbinds the driver, so the order
> > of operations is like this:
>
> Excellent analysis, but the problem is not limited to the hammer driver
> (potentially) and shalt be dealt with appropriately, at the HID bus
> level.

Thanks. I thought of the bus level approach as well, but I was trying to
keep the fix isolated to the driver that had the problem. I'd like to
get the fix into the stable kernel, as this fixes a regression
introduced by commit d950db3f80a8 ("HID: google: switch to devm when
registering keyboard backlight LED") in v5.18.

Is the bus level approach going to be acceptable as a stable backport?

Is it a problem to call hid_hw_stop() directly? I suppose for the
hid-google-hammer driver we don't want to leave the led sysfs node
hanging around after the hid_hw_stop() function has been called either,
so some sort of forced ejection of the devm led device is needed and the
bus level approach helps there.

I was curious to see if anything else had this problem so I did this
poor grep to find code that isn't calling hid_hw_stop() from probe or
remove:

  git grep -W hid_hw_stop | grep .c= | grep -v probe | grep -v remove

and I got this list (minus hid core which doesn't matter):

 drivers/hid/hid-google-hammer.c=static void hammer_stop(void *hdev)
 drivers/hid/hid-mcp2221.c=static void mcp2221_hid_unregister(void *ptr)
 drivers/hid/hid-wiimote-core.c=static void wiimote_destroy(struct
wiimote_data *wdata)
 drivers/hid/wacom_sys.c=static int wacom_parse_and_register(struct
wacom *wacom, bool wireless)
 drivers/hid/wacom_sys.c=static void wacom_wireless_work(struct
work_struct *work)
 drivers/hid/wacom_sys.c=static void wacom_mode_change_work(struct
work_struct *work)

The wacom_sys.c ones look OK because they're during workqueues that are
probably flushed, and wiimote_destroy() is called from an error path or
driver remove, so it is also OK. But mcp2221_hid_unregister() has the
same problem.

If you look at drivers/hid/hid-mcp2221.c you'll see this comment above
mcp2221_remove() too:

 /* This is needed to be sure hid_hw_stop() isn't called twice by the
subsystem */
 static void mcp2221_remove(struct hid_device *hdev)

which is kinda weird. Why can't we have a devm_hid_hw_start() API that
tells the hid bus to not call hid_hw_stop() at all in
hid_device_remove()? That would allow us to avoid this pitfall where
everything is moved to devm and the driver has no remove function at all
and we forget to populate an empty one. Instead, the bus layer can know
that hardware will be stopped with devm later.

>
> Actually, it is not even limited to HID, but exists in most buses with
> non-trivial ->remove() implementation. For example I fixed similar issue
> in I2C in 5b5475826c52 ("i2c: ensure timely release of driver-allocated
> resources"). I tried fixing it in SPI but Mark has some objections, and
> wanted to fix it in the driver core, so I was thinking about it and then
> dropped the ball. At this time I do not think fixing it at driver core
> makes logic any clearer, so I think we just need to fix a handful of
> buses.

Do you have a link to that discussion?

-------

This got me thinking that maybe both of these approaches are wrong.
Maybe the call to hid_close_report() should be removed from
hid_device_remove() instead.

The device is being removed from the bus when hid_device_remove() is
called, but it hasn't been released yet. Other devices like the hidinput
device are referencing the hdev device because they set the hdev as
their parent. Basically, child devices are still bound to some sort of
driver or subsystem when the parent hdev is unbound from its driver,
leading to a state where the child drivers could still access the hdev
while it is being destroyed. If we remove the hid_close_report() call
from this function it will eventually be called by hid_device_release()
when the last reference to the device is dropped, i.e. when the child
devices all get destroyed. In the case of hid-google-hammer, that would
be when hid_hw_stop() is called from the devm release function by driver
core.

The benefit of this approach is that we don't allocate a devres group
for all the hid devices when only two drivers need it. The possible
downside is that we keep the report around while the device exists but
has no driver bound to it.

Here's a totally untested patch for that.

---8<----
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 22623eb4f72f..93905e200cae 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device)
 		hid_parser_reserved
 	};

-	if (WARN_ON(device->status & HID_STAT_PARSED))
-		return -EBUSY;
+	if (device->status & HID_STAT_PARSED)
+		hid_close_report(device);

 	start = device->dev_rdesc;
 	if (WARN_ON(!start))
@@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev)
 			hdrv->remove(hdev);
 		else /* default remove */
 			hid_hw_stop(hdev);
-		hid_close_report(hdev);
 		hdev->driver = NULL;
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()
  2023-05-10 18:51   ` Stephen Boyd
@ 2023-05-10 20:24     ` Dmitry Torokhov
  2023-05-10 20:50       ` Stephen Boyd
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Torokhov @ 2023-05-10 20:24 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Jiri Kosina, Benjamin Tissoires, linux-kernel, patches, linux-input

On Wed, May 10, 2023 at 11:51:31AM -0700, Stephen Boyd wrote:
> Quoting Dmitry Torokhov (2023-05-05 17:06:07)
> > On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> > >
> > ...
> > > Unfortunately, the hid google hammer driver hand rolls a devm function
> > > to call hid_hw_stop() when the driver is unbound and implements an
> > > hid_driver::remove() function. The driver core doesn't call the devm
> > > release functions until _after_ the bus unbinds the driver, so the order
> > > of operations is like this:
> >
> > Excellent analysis, but the problem is not limited to the hammer driver
> > (potentially) and shalt be dealt with appropriately, at the HID bus
> > level.
> 
> Thanks. I thought of the bus level approach as well, but I was trying to
> keep the fix isolated to the driver that had the problem. I'd like to
> get the fix into the stable kernel, as this fixes a regression
> introduced by commit d950db3f80a8 ("HID: google: switch to devm when
> registering keyboard backlight LED") in v5.18.
> 
> Is the bus level approach going to be acceptable as a stable backport?

Sure, why not given the kind of stuff flowing into stable kernels. At
least this would be fixing real issue that can be triggered with a real
device.

> 
> Is it a problem to call hid_hw_stop() directly? I suppose for the
> hid-google-hammer driver we don't want to leave the led sysfs node
> hanging around after the hid_hw_stop() function has been called either,
> so some sort of forced ejection of the devm led device is needed and the
> bus level approach helps there.
> 
> I was curious to see if anything else had this problem so I did this
> poor grep to find code that isn't calling hid_hw_stop() from probe or
> remove:
> 
>   git grep -W hid_hw_stop | grep .c= | grep -v probe | grep -v remove
> 
> and I got this list (minus hid core which doesn't matter):
> 
>  drivers/hid/hid-google-hammer.c=static void hammer_stop(void *hdev)
>  drivers/hid/hid-mcp2221.c=static void mcp2221_hid_unregister(void *ptr)
>  drivers/hid/hid-wiimote-core.c=static void wiimote_destroy(struct
> wiimote_data *wdata)
>  drivers/hid/wacom_sys.c=static int wacom_parse_and_register(struct
> wacom *wacom, bool wireless)
>  drivers/hid/wacom_sys.c=static void wacom_wireless_work(struct
> work_struct *work)
>  drivers/hid/wacom_sys.c=static void wacom_mode_change_work(struct
> work_struct *work)
> 
> The wacom_sys.c ones look OK because they're during workqueues that are
> probably flushed, and wiimote_destroy() is called from an error path or
> driver remove, so it is also OK. But mcp2221_hid_unregister() has the
> same problem.
> 
> If you look at drivers/hid/hid-mcp2221.c you'll see this comment above
> mcp2221_remove() too:
> 
>  /* This is needed to be sure hid_hw_stop() isn't called twice by the
> subsystem */
>  static void mcp2221_remove(struct hid_device *hdev)
> 
> which is kinda weird. Why can't we have a devm_hid_hw_start() API that
> tells the hid bus to not call hid_hw_stop() at all in
> hid_device_remove()? That would allow us to avoid this pitfall where
> everything is moved to devm and the driver has no remove function at all
> and we forget to populate an empty one. Instead, the bus layer can know
> that hardware will be stopped with devm later.

So yes, this is another option: all bus code should exclusively use
devm* API and can not use non-managed resources. This for HID includes
disconnecting hiddev, hidraw and hidinput handlers/drivers.

FTR, I think having devm_hid_hw_start() would be nice.

> 
> >
> > Actually, it is not even limited to HID, but exists in most buses with
> > non-trivial ->remove() implementation. For example I fixed similar issue
> > in I2C in 5b5475826c52 ("i2c: ensure timely release of driver-allocated
> > resources"). I tried fixing it in SPI but Mark has some objections, and
> > wanted to fix it in the driver core, so I was thinking about it and then
> > dropped the ball. At this time I do not think fixing it at driver core
> > makes logic any clearer, so I think we just need to fix a handful of
> > buses.
> 
> Do you have a link to that discussion?

https://lore.kernel.org/lkml/YFf2RD931nq3RudJ@google.com/

> 
> -------
> 
> This got me thinking that maybe both of these approaches are wrong.
> Maybe the call to hid_close_report() should be removed from
> hid_device_remove() instead.
> 
> The device is being removed from the bus when hid_device_remove() is
> called, but it hasn't been released yet. Other devices like the hidinput
> device are referencing the hdev device because they set the hdev as
> their parent. Basically, child devices are still bound to some sort of
> driver or subsystem when the parent hdev is unbound from its driver,
> leading to a state where the child drivers could still access the hdev
> while it is being destroyed. If we remove the hid_close_report() call
> from this function it will eventually be called by hid_device_release()
> when the last reference to the device is dropped, i.e. when the child
> devices all get destroyed. In the case of hid-google-hammer, that would
> be when hid_hw_stop() is called from the devm release function by driver
> core.
> 
> The benefit of this approach is that we don't allocate a devres group
> for all the hid devices when only two drivers need it. The possible
> downside is that we keep the report around while the device exists but
> has no driver bound to it.
> 
> Here's a totally untested patch for that.
> 
> ---8<----
> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index 22623eb4f72f..93905e200cae 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device)
>  		hid_parser_reserved
>  	};
> 
> -	if (WARN_ON(device->status & HID_STAT_PARSED))
> -		return -EBUSY;
> +	if (device->status & HID_STAT_PARSED)
> +		hid_close_report(device);
> 
>  	start = device->dev_rdesc;
>  	if (WARN_ON(!start))
> @@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev)
>  			hdrv->remove(hdev);
>  		else /* default remove */
>  			hid_hw_stop(hdev);
> -		hid_close_report(hdev);
>  		hdev->driver = NULL;
>  	}

This will probably work, but it I consider this still being fragile as
at some point we might want to add some more unwinding, and we'll run
into this issue again. I would feel much safer if the order of release
followed (inversely) order of allocations more closely.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()
  2023-05-10 20:24     ` Dmitry Torokhov
@ 2023-05-10 20:50       ` Stephen Boyd
  2023-05-11  0:25         ` Dmitry Torokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Boyd @ 2023-05-10 20:50 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Jiri Kosina, Benjamin Tissoires, linux-kernel, patches, linux-input

Quoting Dmitry Torokhov (2023-05-10 13:24:08)
> On Wed, May 10, 2023 at 11:51:31AM -0700, Stephen Boyd wrote:
> > Quoting Dmitry Torokhov (2023-05-05 17:06:07)
> > > On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> > > >
> > > ...
> > > > Unfortunately, the hid google hammer driver hand rolls a devm function
> > > > to call hid_hw_stop() when the driver is unbound and implements an
> > > > hid_driver::remove() function. The driver core doesn't call the devm
> > > > release functions until _after_ the bus unbinds the driver, so the order
> > > > of operations is like this:
> > >
> > > Excellent analysis, but the problem is not limited to the hammer driver
> > > (potentially) and shalt be dealt with appropriately, at the HID bus
> > > level.
> >
> > Thanks. I thought of the bus level approach as well, but I was trying to
> > keep the fix isolated to the driver that had the problem. I'd like to
> > get the fix into the stable kernel, as this fixes a regression
> > introduced by commit d950db3f80a8 ("HID: google: switch to devm when
> > registering keyboard backlight LED") in v5.18.
> >
> > Is the bus level approach going to be acceptable as a stable backport?
>
> Sure, why not given the kind of stuff flowing into stable kernels. At
> least this would be fixing real issue that can be triggered with a real
> device.

Hmm, ok. I was worried it would be too much "new code" vs. fixing
something.

> >
> > This got me thinking that maybe both of these approaches are wrong.
> > Maybe the call to hid_close_report() should be removed from
> > hid_device_remove() instead.
> >
> > The device is being removed from the bus when hid_device_remove() is
> > called, but it hasn't been released yet. Other devices like the hidinput
> > device are referencing the hdev device because they set the hdev as
> > their parent. Basically, child devices are still bound to some sort of
> > driver or subsystem when the parent hdev is unbound from its driver,
> > leading to a state where the child drivers could still access the hdev
> > while it is being destroyed. If we remove the hid_close_report() call
> > from this function it will eventually be called by hid_device_release()
> > when the last reference to the device is dropped, i.e. when the child
> > devices all get destroyed. In the case of hid-google-hammer, that would
> > be when hid_hw_stop() is called from the devm release function by driver
> > core.
> >
> > The benefit of this approach is that we don't allocate a devres group
> > for all the hid devices when only two drivers need it. The possible
> > downside is that we keep the report around while the device exists but
> > has no driver bound to it.
> >
> > Here's a totally untested patch for that.
> >
> > ---8<----
> > diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> > index 22623eb4f72f..93905e200cae 100644
> > --- a/drivers/hid/hid-core.c
> > +++ b/drivers/hid/hid-core.c
> > @@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device)
> >               hid_parser_reserved
> >       };
> >
> > -     if (WARN_ON(device->status & HID_STAT_PARSED))
> > -             return -EBUSY;
> > +     if (device->status & HID_STAT_PARSED)
> > +             hid_close_report(device);
> >
> >       start = device->dev_rdesc;
> >       if (WARN_ON(!start))
> > @@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev)
> >                       hdrv->remove(hdev);
> >               else /* default remove */
> >                       hid_hw_stop(hdev);
> > -             hid_close_report(hdev);
> >               hdev->driver = NULL;
> >       }
>
> This will probably work, but it I consider this still being fragile as
> at some point we might want to add some more unwinding, and we'll run
> into this issue again. I would feel much safer if the order of release
> followed (inversely) order of allocations more closely.
>

Sorry, I'm not following here. How is it fragile? Are you saying that if
we want to add devm calls into the bus layer itself the order of release
won't be inverse of allocation/creation?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()
  2023-05-10 20:50       ` Stephen Boyd
@ 2023-05-11  0:25         ` Dmitry Torokhov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Torokhov @ 2023-05-11  0:25 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Jiri Kosina, Benjamin Tissoires, linux-kernel, patches, linux-input

On Wed, May 10, 2023 at 01:50:01PM -0700, Stephen Boyd wrote:
> Quoting Dmitry Torokhov (2023-05-10 13:24:08)
> > On Wed, May 10, 2023 at 11:51:31AM -0700, Stephen Boyd wrote:
> > > Quoting Dmitry Torokhov (2023-05-05 17:06:07)
> > > > On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> > > > >
> > > > ...
> > > > > Unfortunately, the hid google hammer driver hand rolls a devm function
> > > > > to call hid_hw_stop() when the driver is unbound and implements an
> > > > > hid_driver::remove() function. The driver core doesn't call the devm
> > > > > release functions until _after_ the bus unbinds the driver, so the order
> > > > > of operations is like this:
> > > >
> > > > Excellent analysis, but the problem is not limited to the hammer driver
> > > > (potentially) and shalt be dealt with appropriately, at the HID bus
> > > > level.
> > >
> > > Thanks. I thought of the bus level approach as well, but I was trying to
> > > keep the fix isolated to the driver that had the problem. I'd like to
> > > get the fix into the stable kernel, as this fixes a regression
> > > introduced by commit d950db3f80a8 ("HID: google: switch to devm when
> > > registering keyboard backlight LED") in v5.18.
> > >
> > > Is the bus level approach going to be acceptable as a stable backport?
> >
> > Sure, why not given the kind of stuff flowing into stable kernels. At
> > least this would be fixing real issue that can be triggered with a real
> > device.
> 
> Hmm, ok. I was worried it would be too much "new code" vs. fixing
> something.
> 
> > >
> > > This got me thinking that maybe both of these approaches are wrong.
> > > Maybe the call to hid_close_report() should be removed from
> > > hid_device_remove() instead.
> > >
> > > The device is being removed from the bus when hid_device_remove() is
> > > called, but it hasn't been released yet. Other devices like the hidinput
> > > device are referencing the hdev device because they set the hdev as
> > > their parent. Basically, child devices are still bound to some sort of
> > > driver or subsystem when the parent hdev is unbound from its driver,
> > > leading to a state where the child drivers could still access the hdev
> > > while it is being destroyed. If we remove the hid_close_report() call
> > > from this function it will eventually be called by hid_device_release()
> > > when the last reference to the device is dropped, i.e. when the child
> > > devices all get destroyed. In the case of hid-google-hammer, that would
> > > be when hid_hw_stop() is called from the devm release function by driver
> > > core.
> > >
> > > The benefit of this approach is that we don't allocate a devres group
> > > for all the hid devices when only two drivers need it. The possible
> > > downside is that we keep the report around while the device exists but
> > > has no driver bound to it.
> > >
> > > Here's a totally untested patch for that.
> > >
> > > ---8<----
> > > diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> > > index 22623eb4f72f..93905e200cae 100644
> > > --- a/drivers/hid/hid-core.c
> > > +++ b/drivers/hid/hid-core.c
> > > @@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device)
> > >               hid_parser_reserved
> > >       };
> > >
> > > -     if (WARN_ON(device->status & HID_STAT_PARSED))
> > > -             return -EBUSY;
> > > +     if (device->status & HID_STAT_PARSED)
> > > +             hid_close_report(device);
> > >
> > >       start = device->dev_rdesc;
> > >       if (WARN_ON(!start))
> > > @@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev)
> > >                       hdrv->remove(hdev);
> > >               else /* default remove */
> > >                       hid_hw_stop(hdev);
> > > -             hid_close_report(hdev);
> > >               hdev->driver = NULL;
> > >       }
> >
> > This will probably work, but it I consider this still being fragile as
> > at some point we might want to add some more unwinding, and we'll run
> > into this issue again. I would feel much safer if the order of release
> > followed (inversely) order of allocations more closely.
> >
> 
> Sorry, I'm not following here. How is it fragile? Are you saying that if
> we want to add devm calls into the bus layer itself the order of release
> won't be inverse of allocation/creation?

What I was trying to say is that later someone else might be tempted to
add more traditional-style resources and non-devm-unwinding for them.
Having an explicit devres groups gives exact point when driver-allocated resources
are released, and makes patch authors take it into consideration.

If everything is devm-controlled then we do not need a separate devres
group.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-05-11  0:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-05 23:24 [PATCH] HID: google: Don't use devm for hid_hw_stop() Stephen Boyd
2023-05-06  0:06 ` Dmitry Torokhov
2023-05-10 18:51   ` Stephen Boyd
2023-05-10 20:24     ` Dmitry Torokhov
2023-05-10 20:50       ` Stephen Boyd
2023-05-11  0:25         ` Dmitry Torokhov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).