linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND][PATCH] Bluetooth: Make request workqueue freezable
@ 2015-05-12  0:52 Laura Abbott
  2015-05-12  1:07 ` Marcel Holtmann
  0 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2015-05-12  0:52 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo Padovan, Johan Hedberg
  Cc: Laura Abbott, David S. Miller, linux-bluetooth, netdev,
	linux-kernel, Ming Lei, Takashi Iwai, Rafael J. Wysocki

We've received a number of reports of warnings when coming
out of suspend with certain bluetooth firmware configurations:

WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
_request_firmware+0x558/0x810()
Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_security ip6table_raw ip6table_filter
ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
 serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
video
CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
Workqueue: hci0 hci_power_on [bluetooth]
 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
Call Trace:
 [<ffffffff8176e215>] dump_stack+0x45/0x57
 [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
 [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff814dbe78>] _request_firmware+0x558/0x810
 [<ffffffff814dc165>] request_firmware+0x35/0x50
 [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
 [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
 [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
 [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
 [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
 [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
 [<ffffffff810b52f3>] worker_thread+0x53/0x470
 [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
 [<ffffffff810ba548>] kthread+0xd8/0xf0
 [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
 [<ffffffff81774958>] ret_from_fork+0x58/0x90
 [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0

This occurs after every resume.

When resuming, the bluetooth stack calls hci_register_dev,
allocates a new workqueue, and immediately schedules the
power_on on the newly created workqueue. Since the new
workqueue is not freezable, the work runs immediately and
triggers the warning since resume is still happening and
usermodehelper has not yet been re-enabled. Fix this by
making the request workqueue freezable. This ensures
the work will not run until unfreezing occurs and usermodehelper
is re-enabled.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
Resend because I think this got lost in the thread.
This should be fixing the actual root cause of the warnings.
---
 net/bluetooth/hci_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 476709b..87f2e48 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3131,7 +3131,8 @@ int hci_register_dev(struct hci_dev *hdev)
 	}
 
 	hdev->req_workqueue = alloc_workqueue("%s", WQ_HIGHPRI | WQ_UNBOUND |
-					      WQ_MEM_RECLAIM, 1, hdev->name);
+					      WQ_MEM_RECLAIM | WQ_FREEZABLE,
+					      1, hdev->name);
 	if (!hdev->req_workqueue) {
 		destroy_workqueue(hdev->workqueue);
 		error = -ENOMEM;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-12  0:52 [RESEND][PATCH] Bluetooth: Make request workqueue freezable Laura Abbott
@ 2015-05-12  1:07 ` Marcel Holtmann
  2015-05-12  1:46   ` Laura Abbott
  0 siblings, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-12  1:07 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, linux-kernel, Ming Lei, Takashi Iwai, Rafael J. Wysocki

Hi Laura,

> We've received a number of reports of warnings when coming
> out of suspend with certain bluetooth firmware configurations:
> 
> WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
> _request_firmware+0x558/0x810()
> Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
> binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
> snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
> iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
> snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
> snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
> snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
> serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
> videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
> media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
> shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
> i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
> video
> CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
> Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
> Workqueue: hci0 hci_power_on [bluetooth]
> 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
> 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
> 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
> Call Trace:
> [<ffffffff8176e215>] dump_stack+0x45/0x57
> [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
> [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
> [<ffffffff814dbe78>] _request_firmware+0x558/0x810
> [<ffffffff814dc165>] request_firmware+0x35/0x50
> [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
> [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
> [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
> [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
> [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
> [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
> [<ffffffff810b52f3>] worker_thread+0x53/0x470
> [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
> [<ffffffff810ba548>] kthread+0xd8/0xf0
> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
> [<ffffffff81774958>] ret_from_fork+0x58/0x90
> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
> 
> This occurs after every resume.
> 
> When resuming, the bluetooth stack calls hci_register_dev,
> allocates a new workqueue, and immediately schedules the
> power_on on the newly created workqueue. Since the new
> workqueue is not freezable, the work runs immediately and
> triggers the warning since resume is still happening and
> usermodehelper has not yet been re-enabled. Fix this by
> making the request workqueue freezable. This ensures
> the work will not run until unfreezing occurs and usermodehelper
> is re-enabled.
> 
> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
> ---
> Resend because I think this got lost in the thread.
> This should be fixing the actual root cause of the warnings.

so I am not convinced that it actually fixes the root cause. This is just papering over it.

The problem is pretty clear, the firmware for some of the Bluetooth controllers is optional and that means during the original hotplug event it will not be found and the controller keeps operating. However for some reason instead of actually suspending and resuming the Bluetooth controller, we see a unplug + replug (since we are going through probe) and that is causing this funny behaviour.

So how does making one of the core workqueues freezable fixes this the right way. I do not even know how many other side effects that might have. That hdev->req_workqueue is a Bluetooth core internal workqueue that we are using for multiple tasks.

Rather tell me on why we are probing the USB devices that might need firmware without having userspace ready. It sounds to me that the USB driver probe callback should be delayed if we can not guarantee that it can request firmware. As I explained many times, the call path that causes this is going through probe callback of the driver itself.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-12  1:07 ` Marcel Holtmann
@ 2015-05-12  1:46   ` Laura Abbott
  2015-05-12 15:14     ` Marcel Holtmann
  0 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2015-05-12  1:46 UTC (permalink / raw)
  To: Marcel Holtmann, Laura Abbott
  Cc: Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, linux-kernel, Ming Lei, Takashi Iwai, Rafael J. Wysocki

On 05/11/2015 06:07 PM, Marcel Holtmann wrote:
> Hi Laura,
>
>> We've received a number of reports of warnings when coming
>> out of suspend with certain bluetooth firmware configurations:
>>
>> WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
>> _request_firmware+0x558/0x810()
>> Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
>> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
>> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
>> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
>> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
>> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
>> binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
>> snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
>> iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
>> snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
>> snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
>> snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
>> serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
>> videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
>> media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
>> shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
>> i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
>> video
>> CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
>> Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
>> Workqueue: hci0 hci_power_on [bluetooth]
>> 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
>> 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
>> 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
>> Call Trace:
>> [<ffffffff8176e215>] dump_stack+0x45/0x57
>> [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
>> [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
>> [<ffffffff814dbe78>] _request_firmware+0x558/0x810
>> [<ffffffff814dc165>] request_firmware+0x35/0x50
>> [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
>> [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
>> [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
>> [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
>> [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
>> [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
>> [<ffffffff810b52f3>] worker_thread+0x53/0x470
>> [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
>> [<ffffffff810ba548>] kthread+0xd8/0xf0
>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>> [<ffffffff81774958>] ret_from_fork+0x58/0x90
>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>>
>> This occurs after every resume.
>>
>> When resuming, the bluetooth stack calls hci_register_dev,
>> allocates a new workqueue, and immediately schedules the
>> power_on on the newly created workqueue. Since the new
>> workqueue is not freezable, the work runs immediately and
>> triggers the warning since resume is still happening and
>> usermodehelper has not yet been re-enabled. Fix this by
>> making the request workqueue freezable. This ensures
>> the work will not run until unfreezing occurs and usermodehelper
>> is re-enabled.
>>
>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>> ---
>> Resend because I think this got lost in the thread.
>> This should be fixing the actual root cause of the warnings.
>
> so I am not convinced that it actually fixes the root cause. This is just papering over it.
>
> The problem is pretty clear, the firmware for some of the Bluetooth controllers is optional and that means during the original hotplug event it will not be found and the controller keeps operating. However for some reason instead of actually suspending and resuming the Bluetooth controller, we see a unplug + replug (since we are going through probe) and that is causing this funny behaviour.
>

Fundamentally the issue is the request_firmware is being called at the
wrong time. From Documentation/workqueue.txt:

   WQ_FREEZABLE

         A freezable wq participates in the freeze phase of the system
         suspend operations.  Work items on the wq are drained and no
         new work item starts execution until thawed.


By making the request workqueue freezable, any work that gets scheduled
will not run until the time for tasks to unthaw.
4320f6b1d9db4ca912c5eb6ecb328b2e090e1586
("PM / sleep: Fix request_firmware() error at resume") fixed the resume
path such that before all tasks are unthawed, calls to
usermodehelper_read_trylock will block until usermodehelper is fully
resumed. This means that any task which is frozen and then woken up
again should have the right sequencing for usermodehelper. The workqueue
which handled the bluetooth power on was never being frozen properly so
there was never any guarantee of when it would run. This patch gives
it the necessary sequence.

> So how does making one of the core workqueues freezable fixes this the right way. I do not even know how many other side effects that might have. That hdev->req_workqueue is a Bluetooth core internal workqueue that we are using for multiple tasks.
>
> Rather tell me on why we are probing the USB devices that might need firmware without having userspace ready. It sounds to me that the USB driver probe callback should be delayed if we can not guarantee that it can request firmware. As I explained many times, the call path that causes this is going through probe callback of the driver itself.
>

I agree that if the driver probe function was requesting firmware
directly there would be a problem. The power_on function is already
  being called asynchronously on a workqueue. Making that workqueue 
freezable does exactly the delay you describe.

The side effects should be limited to the change in behavior of
draining all work items at freeze time and then not running
again until task thaw. Do you know of any limitations where
draining the workqueue at freeze time could not happen or would
block indefinitely?

> Regards
>
> Marcel
>

Thanks,
Laura


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-12  1:46   ` Laura Abbott
@ 2015-05-12 15:14     ` Marcel Holtmann
  2015-05-13  1:18       ` Laura Abbott
  0 siblings, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-12 15:14 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Laura Abbott, Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Takashi Iwai,
	Rafael J. Wysocki

Hi Laura,

>>> We've received a number of reports of warnings when coming
>>> out of suspend with certain bluetooth firmware configurations:
>>> 
>>> WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
>>> _request_firmware+0x558/0x810()
>>> Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
>>> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
>>> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
>>> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
>>> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
>>> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
>>> binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
>>> snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
>>> iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
>>> snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
>>> snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
>>> snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
>>> serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
>>> videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
>>> media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
>>> shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
>>> i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
>>> video
>>> CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
>>> Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
>>> Workqueue: hci0 hci_power_on [bluetooth]
>>> 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
>>> 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
>>> 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
>>> Call Trace:
>>> [<ffffffff8176e215>] dump_stack+0x45/0x57
>>> [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
>>> [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
>>> [<ffffffff814dbe78>] _request_firmware+0x558/0x810
>>> [<ffffffff814dc165>] request_firmware+0x35/0x50
>>> [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
>>> [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
>>> [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
>>> [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
>>> [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
>>> [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
>>> [<ffffffff810b52f3>] worker_thread+0x53/0x470
>>> [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
>>> [<ffffffff810ba548>] kthread+0xd8/0xf0
>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>>> [<ffffffff81774958>] ret_from_fork+0x58/0x90
>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>>> 
>>> This occurs after every resume.
>>> 
>>> When resuming, the bluetooth stack calls hci_register_dev,
>>> allocates a new workqueue, and immediately schedules the
>>> power_on on the newly created workqueue. Since the new
>>> workqueue is not freezable, the work runs immediately and
>>> triggers the warning since resume is still happening and
>>> usermodehelper has not yet been re-enabled. Fix this by
>>> making the request workqueue freezable. This ensures
>>> the work will not run until unfreezing occurs and usermodehelper
>>> is re-enabled.
>>> 
>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>> ---
>>> Resend because I think this got lost in the thread.
>>> This should be fixing the actual root cause of the warnings.
>> 
>> so I am not convinced that it actually fixes the root cause. This is just papering over it.
>> 
>> The problem is pretty clear, the firmware for some of the Bluetooth controllers is optional and that means during the original hotplug event it will not be found and the controller keeps operating. However for some reason instead of actually suspending and resuming the Bluetooth controller, we see a unplug + replug (since we are going through probe) and that is causing this funny behaviour.
>> 
> 
> Fundamentally the issue is the request_firmware is being called at the
> wrong time. From Documentation/workqueue.txt:
> 
>  WQ_FREEZABLE
> 
>        A freezable wq participates in the freeze phase of the system
>        suspend operations.  Work items on the wq are drained and no
>        new work item starts execution until thawed.
> 
> 
> By making the request workqueue freezable, any work that gets scheduled
> will not run until the time for tasks to unthaw.
> 4320f6b1d9db4ca912c5eb6ecb328b2e090e1586
> ("PM / sleep: Fix request_firmware() error at resume") fixed the resume
> path such that before all tasks are unthawed, calls to
> usermodehelper_read_trylock will block until usermodehelper is fully
> resumed. This means that any task which is frozen and then woken up
> again should have the right sequencing for usermodehelper. The workqueue
> which handled the bluetooth power on was never being frozen properly so
> there was never any guarantee of when it would run. This patch gives
> it the necessary sequence.
> 
>> So how does making one of the core workqueues freezable fixes this the right way. I do not even know how many other side effects that might have. That hdev->req_workqueue is a Bluetooth core internal workqueue that we are using for multiple tasks.
>> 
>> Rather tell me on why we are probing the USB devices that might need firmware without having userspace ready. It sounds to me that the USB driver probe callback should be delayed if we can not guarantee that it can request firmware. As I explained many times, the call path that causes this is going through probe callback of the driver itself.
>> 
> 
> I agree that if the driver probe function was requesting firmware
> directly there would be a problem. The power_on function is already
> being called asynchronously on a workqueue. Making that workqueue freezable does exactly the delay you describe.

I am not convinced. Now we are hacking the Bluetooth core layer (which has nothing to do with the drivers suspend/resume or probe) to do something different so that we do not see this warning.

I can not do anything about the platform in question choosing a unplug/replug for suspend/resume instead of having a proper USB suspend and resume handling. That is pretty much out of our control. I would rather have the USB subsystem delay the probe() callback if we tell it to. Of just have request_firmware() actually sleep until userspace is ready. Seriously, why is request_firmware not just sleeping for us.

> The side effects should be limited to the change in behavior of
> draining all work items at freeze time and then not running
> again until task thaw. Do you know of any limitations where
> draining the workqueue at freeze time could not happen or would
> block indefinitely?

Has anybody actually looked at the hdev->req_workqueue usage? You are touching core code now and unless you convince me that this has no impact, then I am cautious first of all.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-12 15:14     ` Marcel Holtmann
@ 2015-05-13  1:18       ` Laura Abbott
  2015-05-19  9:46         ` Takashi Iwai
  0 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2015-05-13  1:18 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Laura Abbott, Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Takashi Iwai,
	Rafael J. Wysocki, linux-usb

(cc-ing linux-usb as well)

On 05/12/2015 08:14 AM, Marcel Holtmann wrote:
> Hi Laura,
>
>>>> We've received a number of reports of warnings when coming
>>>> out of suspend with certain bluetooth firmware configurations:
>>>>
>>>> WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
>>>> _request_firmware+0x558/0x810()
>>>> Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
>>>> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
>>>> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
>>>> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
>>>> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
>>>> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
>>>> binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
>>>> snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
>>>> iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
>>>> snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
>>>> snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
>>>> snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
>>>> serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
>>>> videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
>>>> media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
>>>> shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
>>>> i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
>>>> video
>>>> CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
>>>> Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
>>>> Workqueue: hci0 hci_power_on [bluetooth]
>>>> 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
>>>> 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
>>>> 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
>>>> Call Trace:
>>>> [<ffffffff8176e215>] dump_stack+0x45/0x57
>>>> [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
>>>> [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
>>>> [<ffffffff814dbe78>] _request_firmware+0x558/0x810
>>>> [<ffffffff814dc165>] request_firmware+0x35/0x50
>>>> [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
>>>> [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
>>>> [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
>>>> [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
>>>> [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
>>>> [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
>>>> [<ffffffff810b52f3>] worker_thread+0x53/0x470
>>>> [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
>>>> [<ffffffff810ba548>] kthread+0xd8/0xf0
>>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>>>> [<ffffffff81774958>] ret_from_fork+0x58/0x90
>>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
>>>>
>>>> This occurs after every resume.
>>>>
>>>> When resuming, the bluetooth stack calls hci_register_dev,
>>>> allocates a new workqueue, and immediately schedules the
>>>> power_on on the newly created workqueue. Since the new
>>>> workqueue is not freezable, the work runs immediately and
>>>> triggers the warning since resume is still happening and
>>>> usermodehelper has not yet been re-enabled. Fix this by
>>>> making the request workqueue freezable. This ensures
>>>> the work will not run until unfreezing occurs and usermodehelper
>>>> is re-enabled.
>>>>
>>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>>> ---
>>>> Resend because I think this got lost in the thread.
>>>> This should be fixing the actual root cause of the warnings.
>>>
>>> so I am not convinced that it actually fixes the root cause. This is just papering over it.
>>>
>>> The problem is pretty clear, the firmware for some of the Bluetooth controllers is optional and that means during the original hotplug event it will not be found and the controller keeps operating. However for some reason instead of actually suspending and resuming the Bluetooth controller, we see a unplug + replug (since we are going through probe) and that is causing this funny behaviour.
>>>
>>
>> Fundamentally the issue is the request_firmware is being called at the
>> wrong time. From Documentation/workqueue.txt:
>>
>>   WQ_FREEZABLE
>>
>>         A freezable wq participates in the freeze phase of the system
>>         suspend operations.  Work items on the wq are drained and no
>>         new work item starts execution until thawed.
>>
>>
>> By making the request workqueue freezable, any work that gets scheduled
>> will not run until the time for tasks to unthaw.
>> 4320f6b1d9db4ca912c5eb6ecb328b2e090e1586
>> ("PM / sleep: Fix request_firmware() error at resume") fixed the resume
>> path such that before all tasks are unthawed, calls to
>> usermodehelper_read_trylock will block until usermodehelper is fully
>> resumed. This means that any task which is frozen and then woken up
>> again should have the right sequencing for usermodehelper. The workqueue
>> which handled the bluetooth power on was never being frozen properly so
>> there was never any guarantee of when it would run. This patch gives
>> it the necessary sequence.
>>
>>> So how does making one of the core workqueues freezable fixes this the right way. I do not even know how many other side effects that might have. That hdev->req_workqueue is a Bluetooth core internal workqueue that we are using for multiple tasks.
>>>
>>> Rather tell me on why we are probing the USB devices that might need firmware without having userspace ready. It sounds to me that the USB driver probe callback should be delayed if we can not guarantee that it can request firmware. As I explained many times, the call path that causes this is going through probe callback of the driver itself.
>>>
>>
>> I agree that if the driver probe function was requesting firmware
>> directly there would be a problem. The power_on function is already
>> being called asynchronously on a workqueue. Making that workqueue freezable does exactly the delay you describe.
>
> I am not convinced. Now we are hacking the Bluetooth core layer (which has nothing to do with the drivers suspend/resume or probe) to do something different so that we do not see this warning.
>
> I can not do anything about the platform in question choosing a unplug/replug for suspend/resume instead of having a proper USB suspend and resume handling. That is pretty much out of our control. I would rather have the USB subsystem delay the probe() callback if we tell it to. Of just have request_firmware() actually sleep until userspace is ready. Seriously, why is request_firmware not just sleeping for us.
>

The closest thing to blocking is usermodehelper_read_lock_wait which
waits for a limited amount of time. Takashi Iwai proposed switching
to that unconditionally for all request_firmware but I never saw a
response from the firmware maintainers. I suspect that may not be
acceptable because if the firmware actually needs to block it should
be an asynchronous call. The firmware maintainers can correct me
if I'm incorrect in my understanding.

I'll let the USB maintainers chime in about how feasible adding
a delay of probe would be.


>> The side effects should be limited to the change in behavior of
>> draining all work items at freeze time and then not running
>> again until task thaw. Do you know of any limitations where
>> draining the workqueue at freeze time could not happen or would
>> block indefinitely?
>
> Has anybody actually looked at the hdev->req_workqueue usage? You are touching core code now and unless you convince me that this has no impact, then I am cautious first of all.
>

Yes, I understand this is core code. I was hoping for a review to
see if this would make sense. Perhaps I should have just marked
this patch RFC to indicate this.

> Regards
>
> Marcel
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-13  1:18       ` Laura Abbott
@ 2015-05-19  9:46         ` Takashi Iwai
  2015-05-19 14:26           ` Alan Stern
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-19  9:46 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Marcel Holtmann, Laura Abbott, Gustavo F. Padovan, Johan Hedberg,
	David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

At Tue, 12 May 2015 18:18:13 -0700,
Laura Abbott wrote:
> 
> (cc-ing linux-usb as well)
> 
> On 05/12/2015 08:14 AM, Marcel Holtmann wrote:
> > Hi Laura,
> >
> >>>> We've received a number of reports of warnings when coming
> >>>> out of suspend with certain bluetooth firmware configurations:
> >>>>
> >>>> WARNING: CPU: 3 PID: 3280 at drivers/base/firmware_class.c:1126
> >>>> _request_firmware+0x558/0x810()
> >>>> Modules linked in: ccm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
> >>>> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
> >>>> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> >>>> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
> >>>> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
> >>>> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
> >>>> binfmt_misc bnep intel_rapl iosf_mbi arc4 x86_pkg_temp_thermal
> >>>> snd_hda_codec_hdmi coretemp kvm_intel joydev snd_hda_codec_realtek
> >>>> iwldvm snd_hda_codec_generic kvm iTCO_wdt mac80211 iTCO_vendor_support
> >>>> snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul
> >>>> snd_hwdep crc32_pclmul snd_seq crc32c_intel ghash_clmulni_intel uvcvideo
> >>>> snd_seq_device iwlwifi btusb videobuf2_vmalloc snd_pcm videobuf2_core
> >>>> serio_raw bluetooth cfg80211 videobuf2_memops sdhci_pci v4l2_common
> >>>> videodev thinkpad_acpi sdhci i2c_i801 lpc_ich mfd_core wacom mmc_core
> >>>> media snd_timer tpm_tis hid_logitech_hidpp wmi tpm rfkill snd mei_me mei
> >>>> shpchp soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
> >>>> i2c_algo_bit drm_kms_helper e1000e drm hid_logitech_dj ptp pps_core
> >>>> video
> >>>> CPU: 3 PID: 3280 Comm: kworker/u17:0 Not tainted 3.19.3-200.fc21.x86_64
> >>>> Hardware name: LENOVO 343522U/343522U, BIOS GCET96WW (2.56 ) 10/22/2013
> >>>> Workqueue: hci0 hci_power_on [bluetooth]
> >>>> 0000000000000000 0000000089944328 ffff88040acffb78 ffffffff8176e215
> >>>> 0000000000000000 0000000000000000 ffff88040acffbb8 ffffffff8109bc1a
> >>>> 0000000000000000 ffff88040acffcd0 00000000fffffff5 ffff8804076bac40
> >>>> Call Trace:
> >>>> [<ffffffff8176e215>] dump_stack+0x45/0x57
> >>>> [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
> >>>> [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
> >>>> [<ffffffff814dbe78>] _request_firmware+0x558/0x810
> >>>> [<ffffffff814dc165>] request_firmware+0x35/0x50
> >>>> [<ffffffffa03a7886>] btusb_setup_bcm_patchram+0x86/0x590 [btusb]
> >>>> [<ffffffff814d40e6>] ? rpm_idle+0xd6/0x230
> >>>> [<ffffffffa04d4801>] hci_dev_do_open+0xe1/0xa90 [bluetooth]
> >>>> [<ffffffff810c51dd>] ? ttwu_do_activate.constprop.90+0x5d/0x70
> >>>> [<ffffffffa04d5980>] hci_power_on+0x40/0x200 [bluetooth]
> >>>> [<ffffffff810b487c>] process_one_work+0x14c/0x3f0
> >>>> [<ffffffff810b52f3>] worker_thread+0x53/0x470
> >>>> [<ffffffff810b52a0>] ? rescuer_thread+0x300/0x300
> >>>> [<ffffffff810ba548>] kthread+0xd8/0xf0
> >>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
> >>>> [<ffffffff81774958>] ret_from_fork+0x58/0x90
> >>>> [<ffffffff810ba470>] ? kthread_create_on_node+0x1b0/0x1b0
> >>>>
> >>>> This occurs after every resume.
> >>>>
> >>>> When resuming, the bluetooth stack calls hci_register_dev,
> >>>> allocates a new workqueue, and immediately schedules the
> >>>> power_on on the newly created workqueue. Since the new
> >>>> workqueue is not freezable, the work runs immediately and
> >>>> triggers the warning since resume is still happening and
> >>>> usermodehelper has not yet been re-enabled. Fix this by
> >>>> making the request workqueue freezable. This ensures
> >>>> the work will not run until unfreezing occurs and usermodehelper
> >>>> is re-enabled.
> >>>>
> >>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
> >>>> ---
> >>>> Resend because I think this got lost in the thread.
> >>>> This should be fixing the actual root cause of the warnings.
> >>>
> >>> so I am not convinced that it actually fixes the root cause. This is just papering over it.
> >>>
> >>> The problem is pretty clear, the firmware for some of the Bluetooth controllers is optional and that means during the original hotplug event it will not be found and the controller keeps operating. However for some reason instead of actually suspending and resuming the Bluetooth controller, we see a unplug + replug (since we are going through probe) and that is causing this funny behaviour.
> >>>
> >>
> >> Fundamentally the issue is the request_firmware is being called at the
> >> wrong time. From Documentation/workqueue.txt:
> >>
> >>   WQ_FREEZABLE
> >>
> >>         A freezable wq participates in the freeze phase of the system
> >>         suspend operations.  Work items on the wq are drained and no
> >>         new work item starts execution until thawed.
> >>
> >>
> >> By making the request workqueue freezable, any work that gets scheduled
> >> will not run until the time for tasks to unthaw.
> >> 4320f6b1d9db4ca912c5eb6ecb328b2e090e1586
> >> ("PM / sleep: Fix request_firmware() error at resume") fixed the resume
> >> path such that before all tasks are unthawed, calls to
> >> usermodehelper_read_trylock will block until usermodehelper is fully
> >> resumed. This means that any task which is frozen and then woken up
> >> again should have the right sequencing for usermodehelper. The workqueue
> >> which handled the bluetooth power on was never being frozen properly so
> >> there was never any guarantee of when it would run. This patch gives
> >> it the necessary sequence.
> >>
> >>> So how does making one of the core workqueues freezable fixes this the right way. I do not even know how many other side effects that might have. That hdev->req_workqueue is a Bluetooth core internal workqueue that we are using for multiple tasks.
> >>>
> >>> Rather tell me on why we are probing the USB devices that might need firmware without having userspace ready. It sounds to me that the USB driver probe callback should be delayed if we can not guarantee that it can request firmware. As I explained many times, the call path that causes this is going through probe callback of the driver itself.
> >>>
> >>
> >> I agree that if the driver probe function was requesting firmware
> >> directly there would be a problem. The power_on function is already
> >> being called asynchronously on a workqueue. Making that workqueue freezable does exactly the delay you describe.
> >
> > I am not convinced. Now we are hacking the Bluetooth core layer (which has nothing to do with the drivers suspend/resume or probe) to do something different so that we do not see this warning.
> >
> > I can not do anything about the platform in question choosing a unplug/replug for suspend/resume instead of having a proper USB suspend and resume handling. That is pretty much out of our control. I would rather have the USB subsystem delay the probe() callback if we tell it to. Of just have request_firmware() actually sleep until userspace is ready. Seriously, why is request_firmware not just sleeping for us.
> >
> 
> The closest thing to blocking is usermodehelper_read_lock_wait which
> waits for a limited amount of time. Takashi Iwai proposed switching
> to that unconditionally for all request_firmware but I never saw a
> response from the firmware maintainers. I suspect that may not be
> acceptable because if the firmware actually needs to block it should
> be an asynchronous call. The firmware maintainers can correct me
> if I'm incorrect in my understanding.

IIRC, the reason of using usermodehelper_read_trylock() for the normal
request_firmware() (not the nowait one) is to check the call of
request_firmware() in the resume callback.  If the firmware hasn't
been cached, it should fail.

So, using _trylock() there isn't wrong, per se.  It's indeed safer.
But, the problem is that _trylock() is used unconditionally for all
request_firmware() calls even if it's never from the resume path.

Maybe we should allow the f/w loader caller to specify whether to use
UMH trylock or wait.  The patch below exposes _request_firmware() and
FW_OPT_ flags.  Then BT driver can call like

	_request_firmware(&fw, name, dev,
		FW_OPT_UEVENT | FW_OPT_NO_WARN | FW_OPT_UMH_LOCK_WAIT)

Note that the patch is totally untested!

Or doesn't this look intuitive enough?


Takashi

---
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 171841ad1008..47eb5551c119 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -97,21 +97,6 @@ static inline long firmware_loading_timeout(void)
 	return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
 }
 
-/* firmware behavior options */
-#define FW_OPT_UEVENT	(1U << 0)
-#define FW_OPT_NOWAIT	(1U << 1)
-#ifdef CONFIG_FW_LOADER_USER_HELPER
-#define FW_OPT_USERHELPER	(1U << 2)
-#else
-#define FW_OPT_USERHELPER	0
-#endif
-#ifdef CONFIG_FW_LOADER_USER_HELPER_FALLBACK
-#define FW_OPT_FALLBACK		FW_OPT_USERHELPER
-#else
-#define FW_OPT_FALLBACK		0
-#endif
-#define FW_OPT_NO_WARN	(1U << 3)
-
 struct firmware_cache {
 	/* firmware_buf instance will be added into the below list */
 	spinlock_t lock;
@@ -1085,7 +1070,7 @@ static int assign_firmware_buf(struct firmware *fw, struct device *device,
 }
 
 /* called from request_firmware() and request_firmware_work_func() */
-static int
+int
 _request_firmware(const struct firmware **firmware_p, const char *name,
 		  struct device *device, unsigned int opt_flags)
 {
@@ -1099,13 +1084,16 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 	if (!name || name[0] == '\0')
 		return -EINVAL;
 
+	/* Need to pin this module until return */
+	__module_get(THIS_MODULE);
+
 	ret = _request_firmware_prepare(&fw, name, device);
 	if (ret <= 0) /* error or already assigned */
 		goto out;
 
 	ret = 0;
 	timeout = firmware_loading_timeout();
-	if (opt_flags & FW_OPT_NOWAIT) {
+	if (opt_flags & FW_OPT_UMH_LOCK_WAIT) {
 		timeout = usermodehelper_read_lock_wait(timeout);
 		if (!timeout) {
 			dev_dbg(device, "firmware: %s loading timed out\n",
@@ -1147,67 +1135,10 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 	}
 
 	*firmware_p = fw;
-	return ret;
-}
-
-/**
- * request_firmware: - send firmware request and wait for it
- * @firmware_p: pointer to firmware image
- * @name: name of firmware file
- * @device: device for which firmware is being loaded
- *
- *      @firmware_p will be used to return a firmware image by the name
- *      of @name for device @device.
- *
- *      Should be called from user context where sleeping is allowed.
- *
- *      @name will be used as $FIRMWARE in the uevent environment and
- *      should be distinctive enough not to be confused with any other
- *      firmware image for this or any other device.
- *
- *	Caller must hold the reference count of @device.
- *
- *	The function can be called safely inside device's suspend and
- *	resume callback.
- **/
-int
-request_firmware(const struct firmware **firmware_p, const char *name,
-		 struct device *device)
-{
-	int ret;
-
-	/* Need to pin this module until return */
-	__module_get(THIS_MODULE);
-	ret = _request_firmware(firmware_p, name, device,
-				FW_OPT_UEVENT | FW_OPT_FALLBACK);
-	module_put(THIS_MODULE);
-	return ret;
-}
-EXPORT_SYMBOL(request_firmware);
-
-/**
- * request_firmware_direct: - load firmware directly without usermode helper
- * @firmware_p: pointer to firmware image
- * @name: name of firmware file
- * @device: device for which firmware is being loaded
- *
- * This function works pretty much like request_firmware(), but this doesn't
- * fall back to usermode helper even if the firmware couldn't be loaded
- * directly from fs.  Hence it's useful for loading optional firmwares, which
- * aren't always present, without extra long timeouts of udev.
- **/
-int request_firmware_direct(const struct firmware **firmware_p,
-			    const char *name, struct device *device)
-{
-	int ret;
-
-	__module_get(THIS_MODULE);
-	ret = _request_firmware(firmware_p, name, device,
-				FW_OPT_UEVENT | FW_OPT_NO_WARN);
 	module_put(THIS_MODULE);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(request_firmware_direct);
+EXPORT_SYMBOL_GPL(_request_firmware);
 
 /**
  * release_firmware: - release the resource associated with a firmware image
@@ -1291,6 +1222,7 @@ request_firmware_nowait(
 	fw_work->context = context;
 	fw_work->cont = cont;
 	fw_work->opt_flags = FW_OPT_NOWAIT | FW_OPT_FALLBACK |
+		FW_OPT_UMH_LOCK_WAIT |
 		(uevent ? FW_OPT_UEVENT : FW_OPT_USERHELPER);
 
 	if (!try_module_get(module)) {
diff --git a/include/linux/firmware.h b/include/linux/firmware.h
index 5c41c5e75b5c..460aa30965cf 100644
--- a/include/linux/firmware.h
+++ b/include/linux/firmware.h
@@ -39,23 +39,21 @@ struct builtin_fw {
 	__used __section(.builtin_fw) = { name, blob, size }
 
 #if defined(CONFIG_FW_LOADER) || (defined(CONFIG_FW_LOADER_MODULE) && defined(MODULE))
-int request_firmware(const struct firmware **fw, const char *name,
-		     struct device *device);
+int _request_firmware(const struct firmware **firmware_p, const char *name,
+		      struct device *device, unsigned int opt_flags);
 int request_firmware_nowait(
 	struct module *module, bool uevent,
 	const char *name, struct device *device, gfp_t gfp, void *context,
 	void (*cont)(const struct firmware *fw, void *context));
-int request_firmware_direct(const struct firmware **fw, const char *name,
-			    struct device *device);
-
 void release_firmware(const struct firmware *fw);
 #else
-static inline int request_firmware(const struct firmware **fw,
-				   const char *name,
-				   struct device *device)
+static inline int
+_request_firmware(const struct firmware **firmware_p, const char *name,
+		  struct device *device, unsigned int opt_flags)
 {
 	return -EINVAL;
 }
+
 static inline int request_firmware_nowait(
 	struct module *module, bool uevent,
 	const char *name, struct device *device, gfp_t gfp, void *context,
@@ -67,13 +65,69 @@ static inline int request_firmware_nowait(
 static inline void release_firmware(const struct firmware *fw)
 {
 }
+#endif
 
-static inline int request_firmware_direct(const struct firmware **fw,
-					  const char *name,
-					  struct device *device)
+/* firmware behavior options */
+#define FW_OPT_UEVENT		(1U << 0)
+#define FW_OPT_NOWAIT		(1U << 1)
+#ifdef CONFIG_FW_LOADER_USER_HELPER
+#define FW_OPT_USERHELPER	(1U << 2)
+#else
+#define FW_OPT_USERHELPER	0
+#endif
+#ifdef CONFIG_FW_LOADER_USER_HELPER_FALLBACK
+#define FW_OPT_FALLBACK		FW_OPT_USERHELPER
+#else
+#define FW_OPT_FALLBACK		0
+#endif
+#define FW_OPT_NO_WARN		(1U << 3)
+#define FW_OPT_UMH_LOCK_WAIT	(1U << 4)
+
+/**
+ * request_firmware: - send firmware request and wait for it
+ * @firmware_p: pointer to firmware image
+ * @name: name of firmware file
+ * @device: device for which firmware is being loaded
+ *
+ *      @firmware_p will be used to return a firmware image by the name
+ *      of @name for device @device.
+ *
+ *      Should be called from user context where sleeping is allowed.
+ *
+ *      @name will be used as $FIRMWARE in the uevent environment and
+ *      should be distinctive enough not to be confused with any other
+ *      firmware image for this or any other device.
+ *
+ *	Caller must hold the reference count of @device.
+ *
+ *	The function can be called safely inside device's suspend and
+ *	resume callback.
+ **/
+static inline int
+request_firmware(const struct firmware **firmware_p, const char *name,
+		 struct device *device)
 {
-	return -EINVAL;
+	return _request_firmware(firmware_p, name, device,
+				 FW_OPT_UEVENT | FW_OPT_FALLBACK);
+}
+
+/**
+ * request_firmware_direct: - load firmware directly without usermode helper
+ * @firmware_p: pointer to firmware image
+ * @name: name of firmware file
+ * @device: device for which firmware is being loaded
+ *
+ * This function works pretty much like request_firmware(), but this doesn't
+ * fall back to usermode helper even if the firmware couldn't be loaded
+ * directly from fs.  Hence it's useful for loading optional firmwares, which
+ * aren't always present, without extra long timeouts of udev.
+ **/
+static inline int
+request_firmware_direct(const struct firmware **firmware_p,
+			const char *name, struct device *device)
+{
+	return _request_firmware(firmware_p, name, device,
+				 FW_OPT_UEVENT | FW_OPT_NO_WARN);
 }
 
-#endif
 #endif

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19  9:46         ` Takashi Iwai
@ 2015-05-19 14:26           ` Alan Stern
  2015-05-19 14:52             ` Oliver Neukum
                               ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Alan Stern @ 2015-05-19 14:26 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Laura Abbott, Marcel Holtmann, Laura Abbott, Gustavo F. Padovan,
	Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

On Tue, 19 May 2015, Takashi Iwai wrote:

> > > I am not convinced. Now we are hacking the Bluetooth core layer
> > > (which has nothing to do with the drivers suspend/resume or
> > > probe) to do something different so that we do not see this
> > > warning.
> > >
> > > I can not do anything about the platform in question choosing a
> > > unplug/replug for suspend/resume instead of having a proper USB
> > > suspend and resume handling. That is pretty much out of our
> > > control.

Actually one can do something about this.  I mean, one _can_ implement
proper USB suspend and resume handling in the Bluetooth driver.  At
this point the details aren't clear to me, but perhaps if the driver in
question had a reset_resume callback then it might work better.

> > >  I would rather have the USB subsystem delay the probe()
> > > callback if we tell it to.

This is possible.  I am not sure it would be the right thing to do,
though.  What happens if the probe routine gets called early on during
the boot-up procedure, before userspace is up and running?  The same
thing should happen here.

> > >  Of just have request_firmware()
> > > actually sleep until userspace is ready. Seriously, why is
> > > request_firmware not just sleeping for us.

It won't work.  The request_firmware call is part of the probe 
sequence, which in turn is part of the resume sequence.  Userspace 
doesn't start running again until the resume sequence is finished.  If 
request_firmware waited for userspace, it would hang.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 14:26           ` Alan Stern
@ 2015-05-19 14:52             ` Oliver Neukum
  2015-05-19 15:22             ` Marcel Holtmann
  2015-05-19 17:13             ` Takashi Iwai
  2 siblings, 0 replies; 45+ messages in thread
From: Oliver Neukum @ 2015-05-19 14:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: Takashi Iwai, Laura Abbott, Marcel Holtmann, Laura Abbott,
	Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

On Tue, 2015-05-19 at 10:26 -0400, Alan Stern wrote:
> On Tue, 19 May 2015, Takashi Iwai wrote:
> 
> > > > I am not convinced. Now we are hacking the Bluetooth core layer
> > > > (which has nothing to do with the drivers suspend/resume or
> > > > probe) to do something different so that we do not see this
> > > > warning.
> > > >
> > > > I can not do anything about the platform in question choosing a
> > > > unplug/replug for suspend/resume instead of having a proper USB
> > > > suspend and resume handling. That is pretty much out of our
> > > > control.
> 
> Actually one can do something about this.  I mean, one _can_ implement
> proper USB suspend and resume handling in the Bluetooth driver.  At
> this point the details aren't clear to me, but perhaps if the driver in
> question had a reset_resume callback then it might work better.

I doubt this would work. By losing power the BT controller is thrown
out of its cell. It looks to me like fundamentally BT needs to
fully reestablish the network from scratch after a loss of power.

> > > >  I would rather have the USB subsystem delay the probe()
> > > > callback if we tell it to.
> 
> This is possible.  I am not sure it would be the right thing to do,
> though.  What happens if the probe routine gets called early on during
> the boot-up procedure, before userspace is up and running?  The same
> thing should happen here.

Yes. Basically if you want firmware during probe the firmware
infrastructure has to be there. That is if you build such a module
statically the firmware must be included in the kernel image.

> > > >  Of just have request_firmware()
> > > > actually sleep until userspace is ready. Seriously, why is
> > > > request_firmware not just sleeping for us.
> 
> It won't work.  The request_firmware call is part of the probe 
> sequence, which in turn is part of the resume sequence.  Userspace 
> doesn't start running again until the resume sequence is finished.  If 
> request_firmware waited for userspace, it would hang.

I'd recommend the sledge hammer. Never free the firmware while the
hardware is connected or the system sleeping. If you must do this
there is a notifier chain.

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 14:26           ` Alan Stern
  2015-05-19 14:52             ` Oliver Neukum
@ 2015-05-19 15:22             ` Marcel Holtmann
  2015-05-19 17:17               ` Alan Stern
  2015-05-19 17:13             ` Takashi Iwai
  2 siblings, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-19 15:22 UTC (permalink / raw)
  To: Alan Stern
  Cc: Takashi Iwai, Laura Abbott, Laura Abbott, Gustavo F. Padovan,
	Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

Hi Alan,

>>>> I am not convinced. Now we are hacking the Bluetooth core layer
>>>> (which has nothing to do with the drivers suspend/resume or
>>>> probe) to do something different so that we do not see this
>>>> warning.
>>>> 
>>>> I can not do anything about the platform in question choosing a
>>>> unplug/replug for suspend/resume instead of having a proper USB
>>>> suspend and resume handling. That is pretty much out of our
>>>> control.
> 
> Actually one can do something about this.  I mean, one _can_ implement
> proper USB suspend and resume handling in the Bluetooth driver.  At
> this point the details aren't clear to me, but perhaps if the driver in
> question had a reset_resume callback then it might work better.

the btusb.ko driver has suspend/resume support. Are you saying we also need reset_resume support?

>>>> I would rather have the USB subsystem delay the probe()
>>>> callback if we tell it to.
> 
> This is possible.  I am not sure it would be the right thing to do,
> though.  What happens if the probe routine gets called early on during
> the boot-up procedure, before userspace is up and running?  The same
> thing should happen here.

For modules this will be hard. Since you need userspace before being able to load the modules. If built-in code, then in theory this might be possible. Depending on the order of the init sections.

>>>> Of just have request_firmware()
>>>> actually sleep until userspace is ready. Seriously, why is
>>>> request_firmware not just sleeping for us.
> 
> It won't work.  The request_firmware call is part of the probe 
> sequence, which in turn is part of the resume sequence.  Userspace 
> doesn't start running again until the resume sequence is finished.  If 
> request_firmware waited for userspace, it would hang.

Then I really have no idea on how to solve this unless we silence the warning from request_firmware. From a driver perspective we go back trough probe(). So the driver has to treat this as a new device.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 14:26           ` Alan Stern
  2015-05-19 14:52             ` Oliver Neukum
  2015-05-19 15:22             ` Marcel Holtmann
@ 2015-05-19 17:13             ` Takashi Iwai
  2015-05-19 17:42               ` Oliver Neukum
  2 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-19 17:13 UTC (permalink / raw)
  To: Alan Stern
  Cc: Laura Abbott, Marcel Holtmann, Laura Abbott, Gustavo F. Padovan,
	Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

At Tue, 19 May 2015 10:26:46 -0400 (EDT),
Alan Stern wrote:
> 
> > > >  Of just have request_firmware()
> > > > actually sleep until userspace is ready. Seriously, why is
> > > > request_firmware not just sleeping for us.
> 
> It won't work.  The request_firmware call is part of the probe 
> sequence, which in turn is part of the resume sequence.  Userspace 
> doesn't start running again until the resume sequence is finished.  If 
> request_firmware waited for userspace, it would hang.

Note that the recent request_firmware() doesn't need the user-space
invocation (unless the fallback is explicitly enabled) but loads the
file directly.  And, request_firmware() for the cached data is valid
to be called in the resume path.


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 15:22             ` Marcel Holtmann
@ 2015-05-19 17:17               ` Alan Stern
  0 siblings, 0 replies; 45+ messages in thread
From: Alan Stern @ 2015-05-19 17:17 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Takashi Iwai, Laura Abbott, Laura Abbott, Gustavo F. Padovan,
	Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

On Tue, 19 May 2015, Marcel Holtmann wrote:

> Hi Alan,
> 
> >>>> I am not convinced. Now we are hacking the Bluetooth core layer
> >>>> (which has nothing to do with the drivers suspend/resume or
> >>>> probe) to do something different so that we do not see this
> >>>> warning.
> >>>> 
> >>>> I can not do anything about the platform in question choosing a
> >>>> unplug/replug for suspend/resume instead of having a proper USB
> >>>> suspend and resume handling. That is pretty much out of our
> >>>> control.
> > 
> > Actually one can do something about this.  I mean, one _can_ implement
> > proper USB suspend and resume handling in the Bluetooth driver.  At
> > this point the details aren't clear to me, but perhaps if the driver in
> > question had a reset_resume callback then it might work better.
> 
> the btusb.ko driver has suspend/resume support. Are you saying we
> also need reset_resume support?

I don't know; I'm not familiar enough with how Bluetooth works.  If the 
device loses power and requires its firmware to be loaded again, then a 
reset_resume would end up doing much the same thing as probe anyway.  
So implementing reset_resume might not make much difference.

> >>>> I would rather have the USB subsystem delay the probe()
> >>>> callback if we tell it to.
> > 
> > This is possible.  I am not sure it would be the right thing to do,
> > though.  What happens if the probe routine gets called early on during
> > the boot-up procedure, before userspace is up and running?  The same
> > thing should happen here.
> 
> For modules this will be hard. Since you need userspace before being
> able to load the modules. If built-in code, then in theory this might
> be possible. Depending on the order of the init sections.

Yes, I meant built-in.

> >>>> Of just have request_firmware()
> >>>> actually sleep until userspace is ready. Seriously, why is
> >>>> request_firmware not just sleeping for us.
> > 
> > It won't work.  The request_firmware call is part of the probe 
> > sequence, which in turn is part of the resume sequence.  Userspace 
> > doesn't start running again until the resume sequence is finished.  If 
> > request_firmware waited for userspace, it would hang.
> 
> Then I really have no idea on how to solve this unless we silence the
> warning from request_firmware. From a driver perspective we go back
> trough probe(). So the driver has to treat this as a new device.

Oliver's suggestion to keep the firmware in memory may be the only
reasonable solution.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 17:13             ` Takashi Iwai
@ 2015-05-19 17:42               ` Oliver Neukum
  2015-05-20  6:29                 ` Takashi Iwai
  0 siblings, 1 reply; 45+ messages in thread
From: Oliver Neukum @ 2015-05-19 17:42 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Alan Stern, Laura Abbott, Marcel Holtmann, Laura Abbott,
	Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

On Tue, 2015-05-19 at 19:13 +0200, Takashi Iwai wrote:
> At Tue, 19 May 2015 10:26:46 -0400 (EDT),
> Alan Stern wrote:
> > 
> > > > >  Of just have request_firmware()
> > > > > actually sleep until userspace is ready. Seriously, why is
> > > > > request_firmware not just sleeping for us.
> > 
> > It won't work.  The request_firmware call is part of the probe 
> > sequence, which in turn is part of the resume sequence.  Userspace 
> > doesn't start running again until the resume sequence is finished.  If 
> > request_firmware waited for userspace, it would hang.
> 
> Note that the recent request_firmware() doesn't need the user-space
> invocation (unless the fallback is explicitly enabled) but loads the

That is a dangerous approach. You cannot be sure you can do file IO.
It depends on the exact shape of the device tree.

> file directly.  And, request_firmware() for the cached data is valid
> to be called in the resume path.

Well, yes, if your data is cached in RAM, all is well. But that leads
to the same problem one step further. What must be cached?

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-19 17:42               ` Oliver Neukum
@ 2015-05-20  6:29                 ` Takashi Iwai
  2015-05-20  8:40                   ` Oliver Neukum
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-20  6:29 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Alan Stern, Laura Abbott, Marcel Holtmann, Laura Abbott,
	Gustavo F. Padovan, Johan Hedberg, David S. Miller,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	netdev, Linux Kernel Mailing List, Ming Lei, Rafael J. Wysocki,
	linux-usb

At Tue, 19 May 2015 19:42:55 +0200,
Oliver Neukum wrote:
> 
> On Tue, 2015-05-19 at 19:13 +0200, Takashi Iwai wrote:
> > At Tue, 19 May 2015 10:26:46 -0400 (EDT),
> > Alan Stern wrote:
> > > 
> > > > > >  Of just have request_firmware()
> > > > > > actually sleep until userspace is ready. Seriously, why is
> > > > > > request_firmware not just sleeping for us.
> > > 
> > > It won't work.  The request_firmware call is part of the probe 
> > > sequence, which in turn is part of the resume sequence.  Userspace 
> > > doesn't start running again until the resume sequence is finished.  If 
> > > request_firmware waited for userspace, it would hang.
> > 
> > Note that the recent request_firmware() doesn't need the user-space
> > invocation (unless the fallback is explicitly enabled) but loads the
> 
> That is a dangerous approach. You cannot be sure you can do file IO.
> It depends on the exact shape of the device tree.
 
It's the reason why firmware loader still takes UMH lock (thus we're
seeing this very problem).

> > file directly.  And, request_firmware() for the cached data is valid
> > to be called in the resume path.
> 
> Well, yes, if your data is cached in RAM, all is well. But that leads
> to the same problem one step further. What must be cached?

The data is cached in RAM.  More specifically, the former loaded
firmware files are reloaded and saved at suspend for each device
object.  See fw_pm_notify() in firmware_class.c.

The question is then why the cached data isn't used.  I have no
concrete answer to it for now, need more investigation, but my wild
guess is that it's because the device itself is being renewed.
Or, something wrong in firmware_class.c.


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20  6:29                 ` Takashi Iwai
@ 2015-05-20  8:40                   ` Oliver Neukum
  2015-05-20  9:46                     ` Marcel Holtmann
  2015-05-20 10:02                     ` Ming Lei
  0 siblings, 2 replies; 45+ messages in thread
From: Oliver Neukum @ 2015-05-20  8:40 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Marcel Holtmann, Rafael J. Wysocki, Gustavo F. Padovan,
	Laura Abbott, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, linux-usb, netdev

On Wed, 2015-05-20 at 08:29 +0200, Takashi Iwai wrote:
> The data is cached in RAM.  More specifically, the former loaded
> firmware files are reloaded and saved at suspend for each device
> object.  See fw_pm_notify() in firmware_class.c.

OK, this may be a stupid idea, but do we know the firmware
was successfully loaded in the first place?
Also btusb is in the habit of falling back to a generic
firmware in some places. It seems to me that caching
firmware is conceptually not enough, but we'd also need
to record the absence of firmware images.

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20  8:40                   ` Oliver Neukum
@ 2015-05-20  9:46                     ` Marcel Holtmann
  2015-05-20 12:44                       ` Takashi Iwai
  2015-05-20 10:02                     ` Ming Lei
  1 sibling, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-20  9:46 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Takashi Iwai, Ming Lei, David S. Miller, Laura Abbott,
	Johan Hedberg, Rafael J. Wysocki, Gustavo F. Padovan,
	Laura Abbott, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

Hi Oliver,

>> The data is cached in RAM.  More specifically, the former loaded
>> firmware files are reloaded and saved at suspend for each device
>> object.  See fw_pm_notify() in firmware_class.c.
> 
> OK, this may be a stupid idea, but do we know the firmware
> was successfully loaded in the first place?
> Also btusb is in the habit of falling back to a generic
> firmware in some places. It seems to me that caching
> firmware is conceptually not enough, but we'd also need
> to record the absence of firmware images.

in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.

It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.

As long as the device in question gets disconnected and we run through the USB driver probe() callback again, the btusb.ko driver can not do anything smart in this case. It has to be done in request_firmware().

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20  8:40                   ` Oliver Neukum
  2015-05-20  9:46                     ` Marcel Holtmann
@ 2015-05-20 10:02                     ` Ming Lei
  1 sibling, 0 replies; 45+ messages in thread
From: Ming Lei @ 2015-05-20 10:02 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Takashi Iwai, David S. Miller, Laura Abbott, Johan Hedberg,
	Marcel Holtmann, Rafael J. Wysocki, Gustavo F. Padovan,
	Laura Abbott, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, linux-usb, netdev

On Wed, May 20, 2015 at 4:40 PM, Oliver Neukum <oneukum@suse.com> wrote:
> On Wed, 2015-05-20 at 08:29 +0200, Takashi Iwai wrote:
>> The data is cached in RAM.  More specifically, the former loaded
>> firmware files are reloaded and saved at suspend for each device
>> object.  See fw_pm_notify() in firmware_class.c.
>
> OK, this may be a stupid idea, but do we know the firmware
> was successfully loaded in the first place?

Yes, the firmware loader records that as device resource.
In reality, there won't be lots of devices requiring firmware
in one running system, so the idea of caching for every
successful loading is workable.

> Also btusb is in the habit of falling back to a generic
> firmware in some places. It seems to me that caching
> firmware is conceptually not enough, but we'd also need
> to record the absence of firmware images.

The caching can't cover the case which starts to load fw
during resume in the 1st time.

>
>         Regards
>                 Oliver
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20  9:46                     ` Marcel Holtmann
@ 2015-05-20 12:44                       ` Takashi Iwai
  2015-05-20 23:42                         ` Laura Abbott
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-20 12:44 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Oliver Neukum, Ming Lei, David S. Miller, Laura Abbott,
	Johan Hedberg, Rafael J. Wysocki, Gustavo F. Padovan,
	Laura Abbott, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Wed, 20 May 2015 11:46:31 +0200,
Marcel Holtmann wrote:
> 
> Hi Oliver,
> 
> >> The data is cached in RAM.  More specifically, the former loaded
> >> firmware files are reloaded and saved at suspend for each device
> >> object.  See fw_pm_notify() in firmware_class.c.
> > 
> > OK, this may be a stupid idea, but do we know the firmware
> > was successfully loaded in the first place?
> > Also btusb is in the habit of falling back to a generic
> > firmware in some places. It seems to me that caching
> > firmware is conceptually not enough, but we'd also need
> > to record the absence of firmware images.
> 
> in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.
> 
> It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.

OK, below is a quick hack to record the failed f/w files, too.
Not sure whether this helps, though.  Proper tests are appreciated.


Takashi

---
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] firmware: cache failed firmwares, too

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 drivers/base/firmware_class.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 171841ad1008..a15af7289c94 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -1035,6 +1035,8 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name,
 	firmware->priv = buf;
 
 	if (ret > 0) {
+		if (buf->size == -1UL)
+			return -ENOENT; /* already recorded as failure */
 		ret = sync_cached_firmware_buf(buf);
 		if (!ret) {
 			fw_set_page_data(buf, firmware);
@@ -1047,17 +1049,12 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name,
 	return 1; /* need to load */
 }
 
-static int assign_firmware_buf(struct firmware *fw, struct device *device,
+static void assign_firmware_buf(struct firmware *fw, struct device *device,
 			       unsigned int opt_flags)
 {
 	struct firmware_buf *buf = fw->priv;
 
 	mutex_lock(&fw_lock);
-	if (!buf->size || is_fw_load_aborted(buf)) {
-		mutex_unlock(&fw_lock);
-		return -ENOENT;
-	}
-
 	/*
 	 * add firmware name into devres list so that we can auto cache
 	 * and uncache firmware for device.
@@ -1079,9 +1076,9 @@ static int assign_firmware_buf(struct firmware *fw, struct device *device,
 	}
 
 	/* pass the pages buffer to driver at the last minute */
-	fw_set_page_data(buf, fw);
+	if (buf->size != -1UL)
+		fw_set_page_data(buf, fw);
 	mutex_unlock(&fw_lock);
-	return 0;
 }
 
 /* called from request_firmware() and request_firmware_work_func() */
@@ -1124,6 +1121,9 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 
 	ret = fw_get_filesystem_firmware(device, fw->priv);
 	if (ret) {
+		struct firmware_buf *buf = fw->priv;
+
+		buf->size = -1UL; /* failed */
 		if (!(opt_flags & FW_OPT_NO_WARN))
 			dev_warn(device,
 				 "Direct firmware load for %s failed with error %d\n",
@@ -1132,12 +1132,12 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 			dev_warn(device, "Falling back to user helper\n");
 			ret = fw_load_from_user_helper(fw, name, device,
 						       opt_flags, timeout);
+			if (ret)
+				buf->size = -1UL; /* failed */
 		}
 	}
 
-	if (!ret)
-		ret = assign_firmware_buf(fw, device, opt_flags);
-
+	assign_firmware_buf(fw, device, opt_flags);
 	usermodehelper_read_unlock();
 
  out:
@@ -1435,17 +1435,8 @@ static void __async_dev_cache_fw_image(void *fw_entry,
 				       async_cookie_t cookie)
 {
 	struct fw_cache_entry *fce = fw_entry;
-	struct firmware_cache *fwc = &fw_cache;
-	int ret;
-
-	ret = cache_firmware(fce->name);
-	if (ret) {
-		spin_lock(&fwc->name_lock);
-		list_del(&fce->list);
-		spin_unlock(&fwc->name_lock);
 
-		free_fw_cache_entry(fce);
-	}
+	cache_firmware(fce->name);
 }
 
 /* called with dev->devres_lock held */
-- 
2.4.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20 12:44                       ` Takashi Iwai
@ 2015-05-20 23:42                         ` Laura Abbott
  2015-05-21  4:21                           ` Takashi Iwai
  0 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2015-05-20 23:42 UTC (permalink / raw)
  To: Takashi Iwai, Marcel Holtmann
  Cc: Oliver Neukum, Ming Lei, David S. Miller, Laura Abbott,
	Johan Hedberg, Rafael J. Wysocki, Gustavo F. Padovan, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/20/2015 05:44 AM, Takashi Iwai wrote:
> At Wed, 20 May 2015 11:46:31 +0200,
> Marcel Holtmann wrote:
>>
>> Hi Oliver,
>>
>>>> The data is cached in RAM.  More specifically, the former loaded
>>>> firmware files are reloaded and saved at suspend for each device
>>>> object.  See fw_pm_notify() in firmware_class.c.
>>>
>>> OK, this may be a stupid idea, but do we know the firmware
>>> was successfully loaded in the first place?
>>> Also btusb is in the habit of falling back to a generic
>>> firmware in some places. It seems to me that caching
>>> firmware is conceptually not enough, but we'd also need
>>> to record the absence of firmware images.
>>
>> in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.
>>
>> It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.
>
> OK, below is a quick hack to record the failed f/w files, too.
> Not sure whether this helps, though.  Proper tests are appreciated.
>
>

This doesn't quite work. We end up with the name on fw_names but
the firmware isn't actually on the firmware cache list.

If request_firmware fails to get the firmware from the filesystem,
release firmware will be called which is going to free the
firmware_buf which has been marked as failed anyway. The only
way to make this work would be to always piggy back and increase
the ref so it always stays around. But this also marks the firmware
as a permanent failure. There would need to be a hook somewhere
to force a cache drop, else there would be no way to add new
firmware to a running system without a reboot.

Perhaps we split the difference: keep a list of firmware images
that failed to load in the past and if one is requested during
a time when usermodehelper isn't available, silently return an
error? This way, if correct firmware is loaded at a regular time
the item can be removed from the list.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-20 23:42                         ` Laura Abbott
@ 2015-05-21  4:21                           ` Takashi Iwai
  2015-05-21 12:07                             ` Marcel Holtmann
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21  4:21 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Marcel Holtmann, Oliver Neukum, Ming Lei, David S. Miller,
	Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Wed, 20 May 2015 16:42:44 -0700,
Laura Abbott wrote:
> 
> On 05/20/2015 05:44 AM, Takashi Iwai wrote:
> > At Wed, 20 May 2015 11:46:31 +0200,
> > Marcel Holtmann wrote:
> >>
> >> Hi Oliver,
> >>
> >>>> The data is cached in RAM.  More specifically, the former loaded
> >>>> firmware files are reloaded and saved at suspend for each device
> >>>> object.  See fw_pm_notify() in firmware_class.c.
> >>>
> >>> OK, this may be a stupid idea, but do we know the firmware
> >>> was successfully loaded in the first place?
> >>> Also btusb is in the habit of falling back to a generic
> >>> firmware in some places. It seems to me that caching
> >>> firmware is conceptually not enough, but we'd also need
> >>> to record the absence of firmware images.
> >>
> >> in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.
> >>
> >> It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.
> >
> > OK, below is a quick hack to record the failed f/w files, too.
> > Not sure whether this helps, though.  Proper tests are appreciated.
> >
> >
> 
> This doesn't quite work. We end up with the name on fw_names but
> the firmware isn't actually on the firmware cache list.
> 
> If request_firmware fails to get the firmware from the filesystem,
> release firmware will be called which is going to free the
> firmware_buf which has been marked as failed anyway. The only
> way to make this work would be to always piggy back and increase
> the ref so it always stays around. But this also marks the firmware
> as a permanent failure. There would need to be a hook somewhere
> to force a cache drop, else there would be no way to add new
> firmware to a running system without a reboot.
> 
> Perhaps we split the difference: keep a list of firmware images
> that failed to load in the past and if one is requested during
> a time when usermodehelper isn't available, silently return an
> error? This way, if correct firmware is loaded at a regular time
> the item can be removed from the list.

Well, IMO, it's way too much expectation for the generic f/w loader.
The driver itself must know already which should be really loaded.
The fact is that it's the driver who calls the function that might not
work in the resume path.  So the driver can deal with such exceptions
at best.

This can be either delaying the f/w loading via proper UMH lock (like
my former patch or your patch) or avoiding the f/w request of
non-existing files that the driver already knows of.


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21  4:21                           ` Takashi Iwai
@ 2015-05-21 12:07                             ` Marcel Holtmann
  2015-05-21 12:36                               ` Takashi Iwai
  0 siblings, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-21 12:07 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Laura Abbott, Oliver Neukum, Ming Lei, David S. Miller,
	Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

Hi Takashi,

>>>>>> The data is cached in RAM.  More specifically, the former loaded
>>>>>> firmware files are reloaded and saved at suspend for each device
>>>>>> object.  See fw_pm_notify() in firmware_class.c.
>>>>> 
>>>>> OK, this may be a stupid idea, but do we know the firmware
>>>>> was successfully loaded in the first place?
>>>>> Also btusb is in the habit of falling back to a generic
>>>>> firmware in some places. It seems to me that caching
>>>>> firmware is conceptually not enough, but we'd also need
>>>>> to record the absence of firmware images.
>>>> 
>>>> in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.
>>>> 
>>>> It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.
>>> 
>>> OK, below is a quick hack to record the failed f/w files, too.
>>> Not sure whether this helps, though.  Proper tests are appreciated.
>>> 
>>> 
>> 
>> This doesn't quite work. We end up with the name on fw_names but
>> the firmware isn't actually on the firmware cache list.
>> 
>> If request_firmware fails to get the firmware from the filesystem,
>> release firmware will be called which is going to free the
>> firmware_buf which has been marked as failed anyway. The only
>> way to make this work would be to always piggy back and increase
>> the ref so it always stays around. But this also marks the firmware
>> as a permanent failure. There would need to be a hook somewhere
>> to force a cache drop, else there would be no way to add new
>> firmware to a running system without a reboot.
>> 
>> Perhaps we split the difference: keep a list of firmware images
>> that failed to load in the past and if one is requested during
>> a time when usermodehelper isn't available, silently return an
>> error? This way, if correct firmware is loaded at a regular time
>> the item can be removed from the list.
> 
> Well, IMO, it's way too much expectation for the generic f/w loader.
> The driver itself must know already which should be really loaded.
> The fact is that it's the driver who calls the function that might not
> work in the resume path.  So the driver can deal with such exceptions
> at best.

I keep repeating myself here. From the driver point of view it goes via probe() callback of the USB driver. So the driver does not know. For the driver it looks like a brand new device. There are platforms that might decide to just kill the power to the USB bus where the Bluetooth controller sits on. It gets the power back on resume. However this means it is a brand new device at that point. So the driver should not have to remember everything.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 12:07                             ` Marcel Holtmann
@ 2015-05-21 12:36                               ` Takashi Iwai
  2015-05-21 14:18                                 ` Alan Stern
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21 12:36 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Laura Abbott, Oliver Neukum, Ming Lei, David S. Miller,
	Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Thu, 21 May 2015 14:07:11 +0200,
Marcel Holtmann wrote:
> 
> Hi Takashi,
> 
> >>>>>> The data is cached in RAM.  More specifically, the former loaded
> >>>>>> firmware files are reloaded and saved at suspend for each device
> >>>>>> object.  See fw_pm_notify() in firmware_class.c.
> >>>>> 
> >>>>> OK, this may be a stupid idea, but do we know the firmware
> >>>>> was successfully loaded in the first place?
> >>>>> Also btusb is in the habit of falling back to a generic
> >>>>> firmware in some places. It seems to me that caching
> >>>>> firmware is conceptually not enough, but we'd also need
> >>>>> to record the absence of firmware images.
> >>>> 
> >>>> in a lot of cases the firmware is optional. The device will operate fine without the firmware. There are a few devices where the firmware is required, but for many it just contains patches.
> >>>> 
> >>>> It would be nice if we could tell request_firmware() if it is optional or mandatory firmware. Or if it should just cache the status of a missing firmware as well.
> >>> 
> >>> OK, below is a quick hack to record the failed f/w files, too.
> >>> Not sure whether this helps, though.  Proper tests are appreciated.
> >>> 
> >>> 
> >> 
> >> This doesn't quite work. We end up with the name on fw_names but
> >> the firmware isn't actually on the firmware cache list.
> >> 
> >> If request_firmware fails to get the firmware from the filesystem,
> >> release firmware will be called which is going to free the
> >> firmware_buf which has been marked as failed anyway. The only
> >> way to make this work would be to always piggy back and increase
> >> the ref so it always stays around. But this also marks the firmware
> >> as a permanent failure. There would need to be a hook somewhere
> >> to force a cache drop, else there would be no way to add new
> >> firmware to a running system without a reboot.
> >> 
> >> Perhaps we split the difference: keep a list of firmware images
> >> that failed to load in the past and if one is requested during
> >> a time when usermodehelper isn't available, silently return an
> >> error? This way, if correct firmware is loaded at a regular time
> >> the item can be removed from the list.
> > 
> > Well, IMO, it's way too much expectation for the generic f/w loader.
> > The driver itself must know already which should be really loaded.
> > The fact is that it's the driver who calls the function that might not
> > work in the resume path.  So the driver can deal with such exceptions
> > at best.
> 
> I keep repeating myself here. From the driver point of view it goes
> via probe() callback of the USB driver. So the driver does not
> know. For the driver it looks like a brand new device. There are
> platforms that might decide to just kill the power to the USB bus
> where the Bluetooth controller sits on. It gets the power back on
> resume. However this means it is a brand new device at that
> point. So the driver should not have to remember everything. 

Then avoiding the failed firmware is no solution, indeed.
If it's a new probe, it should be never executed during resume.
That is, either freeze the work like Laura's patch or explicitly allow
the UMH lock wait like my patch.  Laura's patch has a merit that it's
much simpler.  OTOH, if you want to keep the changes only in
request_firmware() call, you can think of changes like my patch; a
revised version is attached below.


Takashi

---
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 171841ad1008..87157f557263 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -97,21 +97,6 @@ static inline long firmware_loading_timeout(void)
 	return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
 }
 
-/* firmware behavior options */
-#define FW_OPT_UEVENT	(1U << 0)
-#define FW_OPT_NOWAIT	(1U << 1)
-#ifdef CONFIG_FW_LOADER_USER_HELPER
-#define FW_OPT_USERHELPER	(1U << 2)
-#else
-#define FW_OPT_USERHELPER	0
-#endif
-#ifdef CONFIG_FW_LOADER_USER_HELPER_FALLBACK
-#define FW_OPT_FALLBACK		FW_OPT_USERHELPER
-#else
-#define FW_OPT_FALLBACK		0
-#endif
-#define FW_OPT_NO_WARN	(1U << 3)
-
 struct firmware_cache {
 	/* firmware_buf instance will be added into the below list */
 	spinlock_t lock;
@@ -1085,8 +1070,7 @@ static int assign_firmware_buf(struct firmware *fw, struct device *device,
 }
 
 /* called from request_firmware() and request_firmware_work_func() */
-static int
-_request_firmware(const struct firmware **firmware_p, const char *name,
+int _request_firmware(const struct firmware **firmware_p, const char *name,
 		  struct device *device, unsigned int opt_flags)
 {
 	struct firmware *fw;
@@ -1099,13 +1083,15 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 	if (!name || name[0] == '\0')
 		return -EINVAL;
 
+	/* Need to pin this module until return */
+	__module_get(THIS_MODULE);
 	ret = _request_firmware_prepare(&fw, name, device);
 	if (ret <= 0) /* error or already assigned */
 		goto out;
 
 	ret = 0;
 	timeout = firmware_loading_timeout();
-	if (opt_flags & FW_OPT_NOWAIT) {
+	if (opt_flags & FW_OPT_UMH_LOCK_WAIT) {
 		timeout = usermodehelper_read_lock_wait(timeout);
 		if (!timeout) {
 			dev_dbg(device, "firmware: %s loading timed out\n",
@@ -1147,8 +1133,10 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
 	}
 
 	*firmware_p = fw;
+	module_put(THIS_MODULE);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(_request_firmware);
 
 /**
  * request_firmware: - send firmware request and wait for it
@@ -1174,14 +1162,8 @@ int
 request_firmware(const struct firmware **firmware_p, const char *name,
 		 struct device *device)
 {
-	int ret;
-
-	/* Need to pin this module until return */
-	__module_get(THIS_MODULE);
-	ret = _request_firmware(firmware_p, name, device,
+	return _request_firmware(firmware_p, name, device,
 				FW_OPT_UEVENT | FW_OPT_FALLBACK);
-	module_put(THIS_MODULE);
-	return ret;
 }
 EXPORT_SYMBOL(request_firmware);
 
@@ -1199,13 +1181,8 @@ EXPORT_SYMBOL(request_firmware);
 int request_firmware_direct(const struct firmware **firmware_p,
 			    const char *name, struct device *device)
 {
-	int ret;
-
-	__module_get(THIS_MODULE);
-	ret = _request_firmware(firmware_p, name, device,
+	return _request_firmware(firmware_p, name, device,
 				FW_OPT_UEVENT | FW_OPT_NO_WARN);
-	module_put(THIS_MODULE);
-	return ret;
 }
 EXPORT_SYMBOL_GPL(request_firmware_direct);
 
@@ -1291,6 +1268,7 @@ request_firmware_nowait(
 	fw_work->context = context;
 	fw_work->cont = cont;
 	fw_work->opt_flags = FW_OPT_NOWAIT | FW_OPT_FALLBACK |
+		FW_OPT_UMH_LOCK_WAIT |
 		(uevent ? FW_OPT_UEVENT : FW_OPT_USERHELPER);
 
 	if (!try_module_get(module)) {
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index d21f3b4176d3..3465f1e4030e 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -1633,6 +1633,11 @@ out:
 	return ret;
 }
 
+#define bt_request_firmware(fw, name, dev) \
+	_request_firmware(fw, name, dev, \
+			  FW_OPT_UEVENT | FW_OPT_FALLBACK | \
+			  FW_OPT_UMH_LOCK_WAIT)
+
 static int btusb_setup_rtl8723a(struct hci_dev *hdev)
 {
 	struct btusb_data *data = dev_get_drvdata(&hdev->dev);
@@ -1641,7 +1646,7 @@ static int btusb_setup_rtl8723a(struct hci_dev *hdev)
 	int ret;
 
 	BT_INFO("%s: rtl: loading rtl_bt/rtl8723a_fw.bin", hdev->name);
-	ret = request_firmware(&fw, "rtl_bt/rtl8723a_fw.bin", &udev->dev);
+	ret = bt_request_firmware(&fw, "rtl_bt/rtl8723a_fw.bin", &udev->dev);
 	if (ret < 0) {
 		BT_ERR("%s: Failed to load rtl_bt/rtl8723a_fw.bin", hdev->name);
 		return ret;
@@ -1678,7 +1683,7 @@ static int btusb_setup_rtl8723b(struct hci_dev *hdev, u16 lmp_subver,
 	int ret;
 
 	BT_INFO("%s: rtl: loading %s", hdev->name, fw_name);
-	ret = request_firmware(&fw, fw_name, &udev->dev);
+	ret = bt_request_firmware(&fw, fw_name, &udev->dev);
 	if (ret < 0) {
 		BT_ERR("%s: Failed to load %s", hdev->name, fw_name);
 		return ret;
@@ -1754,7 +1759,7 @@ static const struct firmware *btusb_setup_intel_get_fw(struct hci_dev *hdev,
 		 ver->fw_variant,  ver->fw_revision, ver->fw_build_num,
 		 ver->fw_build_ww, ver->fw_build_yy);
 
-	ret = request_firmware(&fw, fwname, &hdev->dev);
+	ret = bt_request_firmware(&fw, fwname, &hdev->dev);
 	if (ret < 0) {
 		if (ret == -EINVAL) {
 			BT_ERR("%s Intel firmware file request failed (%d)",
@@ -1770,7 +1775,7 @@ static const struct firmware *btusb_setup_intel_get_fw(struct hci_dev *hdev,
 		 */
 		snprintf(fwname, sizeof(fwname), "intel/ibt-hw-%x.%x.bseq",
 			 ver->hw_platform, ver->hw_variant);
-		if (request_firmware(&fw, fwname, &hdev->dev) < 0) {
+		if (bt_request_firmware(&fw, fwname, &hdev->dev) < 0) {
 			BT_ERR("%s failed to open default Intel fw file: %s",
 			       hdev->name, fwname);
 			return NULL;
@@ -2482,7 +2487,7 @@ static int btusb_setup_intel_new(struct hci_dev *hdev)
 	snprintf(fwname, sizeof(fwname), "intel/ibt-11-%u.sfi",
 		 le16_to_cpu(params->dev_revid));
 
-	err = request_firmware(&fw, fwname, &hdev->dev);
+	err = bt_request_firmware(&fw, fwname, &hdev->dev);
 	if (err < 0) {
 		BT_ERR("%s: Failed to load Intel firmware file (%d)",
 		       hdev->name, err);
@@ -2905,7 +2910,7 @@ static int btusb_setup_qca_load_rampatch(struct hci_dev *hdev,
 
 	snprintf(fwname, sizeof(fwname), "qca/rampatch_usb_%08x.bin", ver_rom);
 
-	err = request_firmware(&fw, fwname, &hdev->dev);
+	err = bt_request_firmware(&fw, fwname, &hdev->dev);
 	if (err) {
 		BT_ERR("%s: failed to request rampatch file: %s (%d)",
 		       hdev->name, fwname, err);
@@ -2948,7 +2953,7 @@ static int btusb_setup_qca_load_nvm(struct hci_dev *hdev,
 	snprintf(fwname, sizeof(fwname), "qca/nvm_usb_%08x.bin",
 		 le32_to_cpu(ver->rom_version));
 
-	err = request_firmware(&fw, fwname, &hdev->dev);
+	err = bt_request_firmware(&fw, fwname, &hdev->dev);
 	if (err) {
 		BT_ERR("%s: failed to request NVM file: %s (%d)",
 		       hdev->name, fwname, err);
diff --git a/include/linux/firmware.h b/include/linux/firmware.h
index 5c41c5e75b5c..68859bc365eb 100644
--- a/include/linux/firmware.h
+++ b/include/linux/firmware.h
@@ -26,6 +26,22 @@ struct builtin_fw {
 	unsigned long size;
 };
 
+/* firmware behavior options */
+#define FW_OPT_UEVENT		(1U << 0)	/* enable uevent */
+#define FW_OPT_NOWAIT		(1U << 1)	/* handle in background wq */
+#ifdef CONFIG_FW_LOADER_USER_HELPER
+#define FW_OPT_USERHELPER	(1U << 2)
+#else
+#define FW_OPT_USERHELPER	0
+#endif
+#ifdef CONFIG_FW_LOADER_USER_HELPER_FALLBACK
+#define FW_OPT_FALLBACK		FW_OPT_USERHELPER /* fallback via userhelper */
+#else
+#define FW_OPT_FALLBACK		0
+#endif
+#define FW_OPT_NO_WARN		(1U << 3)	/* no warning for failure */
+#define FW_OPT_UMH_LOCK_WAIT	(1U << 4)	/* wait usermodehelper lock */
+
 /* We have to play tricks here much like stringify() to get the
    __COUNTER__ macro to be expanded as we want it */
 #define __fw_concat1(x, y) x##y
@@ -39,6 +55,8 @@ struct builtin_fw {
 	__used __section(.builtin_fw) = { name, blob, size }
 
 #if defined(CONFIG_FW_LOADER) || (defined(CONFIG_FW_LOADER_MODULE) && defined(MODULE))
+int _request_firmware(const struct firmware **fw, const char *name,
+		      struct device *device, int opt_flags);
 int request_firmware(const struct firmware **fw, const char *name,
 		     struct device *device);
 int request_firmware_nowait(
@@ -50,6 +68,12 @@ int request_firmware_direct(const struct firmware **fw, const char *name,
 
 void release_firmware(const struct firmware *fw);
 #else
+static inline int _request_firmware(const struct firmware **fw,
+				    const char *name, struct device *device,
+				    int opt_flags)
+{
+	return -EINVAL;
+}
 static inline int request_firmware(const struct firmware **fw,
 				   const char *name,
 				   struct device *device)

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 12:36                               ` Takashi Iwai
@ 2015-05-21 14:18                                 ` Alan Stern
  2015-05-21 14:39                                   ` Marcel Holtmann
  2015-05-21 15:04                                   ` Takashi Iwai
  0 siblings, 2 replies; 45+ messages in thread
From: Alan Stern @ 2015-05-21 14:18 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Marcel Holtmann, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On Thu, 21 May 2015, Takashi Iwai wrote:

> Then avoiding the failed firmware is no solution, indeed.
> If it's a new probe, it should be never executed during resume.

Can you expand this comment?  What's wrong with probing during resume?

The USB stack does carry out probes during resume under certain
circumstances.  A driver lacking a reset_resume callback is one of
those circumstances.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 14:18                                 ` Alan Stern
@ 2015-05-21 14:39                                   ` Marcel Holtmann
  2015-05-21 15:26                                     ` Alan Stern
  2015-05-21 15:04                                   ` Takashi Iwai
  1 sibling, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-21 14:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: Takashi Iwai, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

Hi Alan,

>> Then avoiding the failed firmware is no solution, indeed.
>> If it's a new probe, it should be never executed during resume.
> 
> Can you expand this comment?  What's wrong with probing during resume?
> 
> The USB stack does carry out probes during resume under certain
> circumstances.  A driver lacking a reset_resume callback is one of
> those circumstances.

in case the platform kills the power to the USB lines, we can never do anything about this. I do not want to hack around this in the driver.

What are the cases where we should implement reset_resume and would it really help here. Since the btusb.ko driver implements suspend/resume support, would reset_resume ever be called?

However I get the feeling someone needs to go back and see if the device is the same one and just gets probed again or if it is a new one from the USB host stack perspective.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 14:18                                 ` Alan Stern
  2015-05-21 14:39                                   ` Marcel Holtmann
@ 2015-05-21 15:04                                   ` Takashi Iwai
  1 sibling, 0 replies; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21 15:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: Marcel Holtmann, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Thu, 21 May 2015 10:18:08 -0400 (EDT),
Alan Stern wrote:
> 
> On Thu, 21 May 2015, Takashi Iwai wrote:
> 
> > Then avoiding the failed firmware is no solution, indeed.
> > If it's a new probe, it should be never executed during resume.
> 
> Can you expand this comment?  What's wrong with probing during resume?

Well, if the probe requires the access to a user-space file, it can't
be done during resume.  That's the very problem we're seeing now.
The firmware loader can't help much alone if it's a new device
object.

> The USB stack does carry out probes during resume under certain
> circumstances.  A driver lacking a reset_resume callback is one of
> those circumstances.

So, having a proper reset_resume in btusb would help in the end?


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 14:39                                   ` Marcel Holtmann
@ 2015-05-21 15:26                                     ` Alan Stern
  2015-05-21 15:35                                       ` Takashi Iwai
  2015-05-22  0:21                                       ` Laura Abbott
  0 siblings, 2 replies; 45+ messages in thread
From: Alan Stern @ 2015-05-21 15:26 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Takashi Iwai, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On Thu, 21 May 2015, Marcel Holtmann wrote:

> Hi Alan,
> 
> >> Then avoiding the failed firmware is no solution, indeed.
> >> If it's a new probe, it should be never executed during resume.
> > 
> > Can you expand this comment?  What's wrong with probing during resume?
> > 
> > The USB stack does carry out probes during resume under certain
> > circumstances.  A driver lacking a reset_resume callback is one of
> > those circumstances.
> 
> in case the platform kills the power to the USB lines, we can never
> do anything about this. I do not want to hack around this in the
> driver.
> 
> What are the cases where we should implement reset_resume and would
> it really help here. Since the btusb.ko driver implements
> suspend/resume support, would reset_resume ever be called?

One of those cases is exactly what you have been talking about: when
the platform kills power to the USB lines during suspend.  The driver's
reset_resume routine will be called during resume, as opposed to the
probe routine being called.  Therefore the driver will be able to tell
that this is not a new device instance.

The other cases are less likely to occur: a device is unable to resume 
normally and requires a reset before it will start working again, or 
something else goes wrong along those lines.

> However I get the feeling someone needs to go back and see if the
> device is the same one and just gets probed again or if it is a new
> one from the USB host stack perspective.

That can be done easily enough by enabling usbcore debugging before 
carrying out the system suspend:

	echo 'module usbcore =p' >/debug/dynamic_debug/control

The debugging information in the kernel log will tell just what 
happened.


On Thu, 21 May 2015, Takashi Iwai wrote:

> At Thu, 21 May 2015 10:18:08 -0400 (EDT),
> Alan Stern wrote:
> > 
> > On Thu, 21 May 2015, Takashi Iwai wrote:
> > 
> > > Then avoiding the failed firmware is no solution, indeed.
> > > If it's a new probe, it should be never executed during resume.
> > 
> > Can you expand this comment?  What's wrong with probing during resume?
> 
> Well, if the probe requires the access to a user-space file, it can't
> be done during resume.  That's the very problem we're seeing now.
> The firmware loader can't help much alone if it's a new device
> object.

But the same thing happens during early boot, if the driver is built 
into the kernel.  When the probe occurs, userspace isn't up and running 
yet, so the firmware loader can't do anything.

Why should probe during resume be any worse than probe during early 
boot?

> > The USB stack does carry out probes during resume under certain
> > circumstances.  A driver lacking a reset_resume callback is one of
> > those circumstances.
> 
> So, having a proper reset_resume in btusb would help in the end?

It might, depending on how the driver is written.  I don't know enough 
about the details of btusb to say.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 15:26                                     ` Alan Stern
@ 2015-05-21 15:35                                       ` Takashi Iwai
  2015-05-21 17:27                                         ` Arend van Spriel
  2015-05-21 17:37                                         ` Alan Stern
  2015-05-22  0:21                                       ` Laura Abbott
  1 sibling, 2 replies; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21 15:35 UTC (permalink / raw)
  To: Alan Stern
  Cc: Marcel Holtmann, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Thu, 21 May 2015 11:26:17 -0400 (EDT),
Alan Stern wrote:
> 
> On Thu, 21 May 2015, Takashi Iwai wrote:
> 
> > At Thu, 21 May 2015 10:18:08 -0400 (EDT),
> > Alan Stern wrote:
> > > 
> > > On Thu, 21 May 2015, Takashi Iwai wrote:
> > > 
> > > > Then avoiding the failed firmware is no solution, indeed.
> > > > If it's a new probe, it should be never executed during resume.
> > > 
> > > Can you expand this comment?  What's wrong with probing during resume?
> > 
> > Well, if the probe requires the access to a user-space file, it can't
> > be done during resume.  That's the very problem we're seeing now.
> > The firmware loader can't help much alone if it's a new device
> > object.
> 
> But the same thing happens during early boot, if the driver is built 
> into the kernel.  When the probe occurs, userspace isn't up and running 
> yet, so the firmware loader can't do anything.
> 
> Why should probe during resume be any worse than probe during early 
> boot?

The early boot has initrd, so the files can be there.  But the resume
has no way to fetch the file except for cached data.


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 15:35                                       ` Takashi Iwai
@ 2015-05-21 17:27                                         ` Arend van Spriel
  2015-05-21 17:32                                           ` Takashi Iwai
  2015-05-21 17:37                                         ` Alan Stern
  1 sibling, 1 reply; 45+ messages in thread
From: Arend van Spriel @ 2015-05-21 17:27 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Alan Stern, Marcel Holtmann, Laura Abbott, Oliver Neukum,
	Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/21/15 17:35, Takashi Iwai wrote:
> At Thu, 21 May 2015 11:26:17 -0400 (EDT),
> Alan Stern wrote:
>>
>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>
>>> At Thu, 21 May 2015 10:18:08 -0400 (EDT),
>>> Alan Stern wrote:
>>>>
>>>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>>>
>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>> If it's a new probe, it should be never executed during resume.
>>>>
>>>> Can you expand this comment?  What's wrong with probing during resume?
>>>
>>> Well, if the probe requires the access to a user-space file, it can't
>>> be done during resume.  That's the very problem we're seeing now.
>>> The firmware loader can't help much alone if it's a new device
>>> object.
>>
>> But the same thing happens during early boot, if the driver is built
>> into the kernel.  When the probe occurs, userspace isn't up and running
>> yet, so the firmware loader can't do anything.
>>
>> Why should probe during resume be any worse than probe during early
>> boot?
>
> The early boot has initrd, so the files can be there.  But the resume
> has no way to fetch the file except for cached data.

but initrd is optional so without initrd it is pretty much the same.

Regards,
Arend

> Takashi
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 17:27                                         ` Arend van Spriel
@ 2015-05-21 17:32                                           ` Takashi Iwai
  2015-05-21 20:46                                             ` Arend van Spriel
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21 17:32 UTC (permalink / raw)
  To: Arend van Spriel
  Cc: Alan Stern, Marcel Holtmann, Laura Abbott, Oliver Neukum,
	Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Thu, 21 May 2015 19:27:41 +0200,
Arend van Spriel wrote:
> 
> On 05/21/15 17:35, Takashi Iwai wrote:
> > At Thu, 21 May 2015 11:26:17 -0400 (EDT),
> > Alan Stern wrote:
> >>
> >> On Thu, 21 May 2015, Takashi Iwai wrote:
> >>
> >>> At Thu, 21 May 2015 10:18:08 -0400 (EDT),
> >>> Alan Stern wrote:
> >>>>
> >>>> On Thu, 21 May 2015, Takashi Iwai wrote:
> >>>>
> >>>>> Then avoiding the failed firmware is no solution, indeed.
> >>>>> If it's a new probe, it should be never executed during resume.
> >>>>
> >>>> Can you expand this comment?  What's wrong with probing during resume?
> >>>
> >>> Well, if the probe requires the access to a user-space file, it can't
> >>> be done during resume.  That's the very problem we're seeing now.
> >>> The firmware loader can't help much alone if it's a new device
> >>> object.
> >>
> >> But the same thing happens during early boot, if the driver is built
> >> into the kernel.  When the probe occurs, userspace isn't up and running
> >> yet, so the firmware loader can't do anything.
> >>
> >> Why should probe during resume be any worse than probe during early
> >> boot?
> >
> > The early boot has initrd, so the files can be there.  But the resume
> > has no way to fetch the file except for cached data.
> 
> but initrd is optional so without initrd it is pretty much the same.

User can build the firmware into the kernel.


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 15:35                                       ` Takashi Iwai
  2015-05-21 17:27                                         ` Arend van Spriel
@ 2015-05-21 17:37                                         ` Alan Stern
  2015-05-21 18:11                                           ` Takashi Iwai
  1 sibling, 1 reply; 45+ messages in thread
From: Alan Stern @ 2015-05-21 17:37 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Marcel Holtmann, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On Thu, 21 May 2015, Takashi Iwai wrote:

> At Thu, 21 May 2015 11:26:17 -0400 (EDT),
> Alan Stern wrote:
> > 
> > On Thu, 21 May 2015, Takashi Iwai wrote:
> > 
> > > At Thu, 21 May 2015 10:18:08 -0400 (EDT),
> > > Alan Stern wrote:
> > > > 
> > > > On Thu, 21 May 2015, Takashi Iwai wrote:
> > > > 
> > > > > Then avoiding the failed firmware is no solution, indeed.
> > > > > If it's a new probe, it should be never executed during resume.
> > > > 
> > > > Can you expand this comment?  What's wrong with probing during resume?
> > > 
> > > Well, if the probe requires the access to a user-space file, it can't
> > > be done during resume.  That's the very problem we're seeing now.
> > > The firmware loader can't help much alone if it's a new device
> > > object.
> > 
> > But the same thing happens during early boot, if the driver is built 
> > into the kernel.  When the probe occurs, userspace isn't up and running 
> > yet, so the firmware loader can't do anything.
> > 
> > Why should probe during resume be any worse than probe during early 
> > boot?
> 
> The early boot has initrd, so the files can be there.  But the resume
> has no way to fetch the file except for cached data.

I suppose USB could delay re-probing until userspace is running again,
if we knew when that was.  But it would be awkward and prone to races.  
It also would leave a user-visible window of time during which the 
device does not exist, which we want to avoid.  (This may not matter 
for bluetooth, but it does matter for other kinds of devices.)

I would prefer to solve this problem in a different way, if possible.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 17:37                                         ` Alan Stern
@ 2015-05-21 18:11                                           ` Takashi Iwai
  2015-05-21 18:17                                             ` Laura Abbott
  0 siblings, 1 reply; 45+ messages in thread
From: Takashi Iwai @ 2015-05-21 18:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Marcel Holtmann, Laura Abbott, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

At Thu, 21 May 2015 13:37:56 -0400 (EDT),
Alan Stern wrote:
> 
> On Thu, 21 May 2015, Takashi Iwai wrote:
> 
> > At Thu, 21 May 2015 11:26:17 -0400 (EDT),
> > Alan Stern wrote:
> > > 
> > > On Thu, 21 May 2015, Takashi Iwai wrote:
> > > 
> > > > At Thu, 21 May 2015 10:18:08 -0400 (EDT),
> > > > Alan Stern wrote:
> > > > > 
> > > > > On Thu, 21 May 2015, Takashi Iwai wrote:
> > > > > 
> > > > > > Then avoiding the failed firmware is no solution, indeed.
> > > > > > If it's a new probe, it should be never executed during resume.
> > > > > 
> > > > > Can you expand this comment?  What's wrong with probing during resume?
> > > > 
> > > > Well, if the probe requires the access to a user-space file, it can't
> > > > be done during resume.  That's the very problem we're seeing now.
> > > > The firmware loader can't help much alone if it's a new device
> > > > object.
> > > 
> > > But the same thing happens during early boot, if the driver is built 
> > > into the kernel.  When the probe occurs, userspace isn't up and running 
> > > yet, so the firmware loader can't do anything.
> > > 
> > > Why should probe during resume be any worse than probe during early 
> > > boot?
> > 
> > The early boot has initrd, so the files can be there.  But the resume
> > has no way to fetch the file except for cached data.
> 
> I suppose USB could delay re-probing until userspace is running again,
> if we knew when that was.  But it would be awkward and prone to races.  
> It also would leave a user-visible window of time during which the 
> device does not exist, which we want to avoid.  (This may not matter 
> for bluetooth, but it does matter for other kinds of devices.)

Right.

> I would prefer to solve this problem in a different way, if possible.

Well, we're back in square again :)

But, before going further the discussion in loop again, I'd like to
know which firmware file actually hits.  Is it a non-existing
firmware?  Or is it a firmware that should have been cached?  In the
latter case, why it isn't used?


Takashi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 18:11                                           ` Takashi Iwai
@ 2015-05-21 18:17                                             ` Laura Abbott
  0 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2015-05-21 18:17 UTC (permalink / raw)
  To: Takashi Iwai, Alan Stern
  Cc: Marcel Holtmann, Oliver Neukum, Ming Lei, David S. Miller,
	Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/21/2015 11:11 AM, Takashi Iwai wrote:
> At Thu, 21 May 2015 13:37:56 -0400 (EDT),
> Alan Stern wrote:
>>
>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>
>>> At Thu, 21 May 2015 11:26:17 -0400 (EDT),
>>> Alan Stern wrote:
>>>>
>>>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>>>
>>>>> At Thu, 21 May 2015 10:18:08 -0400 (EDT),
>>>>> Alan Stern wrote:
>>>>>>
>>>>>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>>>>>
>>>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>>>> If it's a new probe, it should be never executed during resume.
>>>>>>
>>>>>> Can you expand this comment?  What's wrong with probing during resume?
>>>>>
>>>>> Well, if the probe requires the access to a user-space file, it can't
>>>>> be done during resume.  That's the very problem we're seeing now.
>>>>> The firmware loader can't help much alone if it's a new device
>>>>> object.
>>>>
>>>> But the same thing happens during early boot, if the driver is built
>>>> into the kernel.  When the probe occurs, userspace isn't up and running
>>>> yet, so the firmware loader can't do anything.
>>>>
>>>> Why should probe during resume be any worse than probe during early
>>>> boot?
>>>
>>> The early boot has initrd, so the files can be there.  But the resume
>>> has no way to fetch the file except for cached data.
>>
>> I suppose USB could delay re-probing until userspace is running again,
>> if we knew when that was.  But it would be awkward and prone to races.
>> It also would leave a user-visible window of time during which the
>> device does not exist, which we want to avoid.  (This may not matter
>> for bluetooth, but it does matter for other kinds of devices.)
>
> Right.
>
>> I would prefer to solve this problem in a different way, if possible.
>
> Well, we're back in square again :)
>
> But, before going further the discussion in loop again, I'd like to
> know which firmware file actually hits.  Is it a non-existing
> firmware?  Or is it a firmware that should have been cached?  In the
> latter case, why it isn't used?
>

Non-existent firmware. The firmware was never present in the system and
was never loaded at all.

>
> Takashi
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 17:32                                           ` Takashi Iwai
@ 2015-05-21 20:46                                             ` Arend van Spriel
  2015-05-22 11:30                                               ` Oliver Neukum
  0 siblings, 1 reply; 45+ messages in thread
From: Arend van Spriel @ 2015-05-21 20:46 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Alan Stern, Marcel Holtmann, Laura Abbott, Oliver Neukum,
	Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/21/15 19:32, Takashi Iwai wrote:
> At Thu, 21 May 2015 19:27:41 +0200,
> Arend van Spriel wrote:
>>
>> On 05/21/15 17:35, Takashi Iwai wrote:
>>> At Thu, 21 May 2015 11:26:17 -0400 (EDT),
>>> Alan Stern wrote:
>>>>
>>>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>>>
>>>>> At Thu, 21 May 2015 10:18:08 -0400 (EDT),
>>>>> Alan Stern wrote:
>>>>>>
>>>>>> On Thu, 21 May 2015, Takashi Iwai wrote:
>>>>>>
>>>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>>>> If it's a new probe, it should be never executed during resume.
>>>>>>
>>>>>> Can you expand this comment?  What's wrong with probing during resume?
>>>>>
>>>>> Well, if the probe requires the access to a user-space file, it can't
>>>>> be done during resume.  That's the very problem we're seeing now.
>>>>> The firmware loader can't help much alone if it's a new device
>>>>> object.

So you are saying each device driver should come up with some retry 
mechanism. Would make more sense to come up with something like that 
behind the scenes in the firmware loader so all device drivers can rely 
on one and the same solution.

Regards,
Arend

>>>> But the same thing happens during early boot, if the driver is built
>>>> into the kernel.  When the probe occurs, userspace isn't up and running
>>>> yet, so the firmware loader can't do anything.
>>>>
>>>> Why should probe during resume be any worse than probe during early
>>>> boot?
>>>
>>> The early boot has initrd, so the files can be there.  But the resume
>>> has no way to fetch the file except for cached data.
>>
>> but initrd is optional so without initrd it is pretty much the same.
>
> User can build the firmware into the kernel.
>
>
> Takashi
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 15:26                                     ` Alan Stern
  2015-05-21 15:35                                       ` Takashi Iwai
@ 2015-05-22  0:21                                       ` Laura Abbott
  2015-05-22  3:13                                         ` Marcel Holtmann
  2015-05-22  7:37                                         ` [RESEND][PATCH] Bluetooth: Make request workqueue freezable Arend van Spriel
  1 sibling, 2 replies; 45+ messages in thread
From: Laura Abbott @ 2015-05-22  0:21 UTC (permalink / raw)
  To: Alan Stern, Marcel Holtmann
  Cc: Takashi Iwai, Oliver Neukum, Ming Lei, David S. Miller,
	Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/21/2015 08:26 AM, Alan Stern wrote:
> On Thu, 21 May 2015, Marcel Holtmann wrote:
>
>> Hi Alan,
>>
>>>> Then avoiding the failed firmware is no solution, indeed.
>>>> If it's a new probe, it should be never executed during resume.
>>>
>>> Can you expand this comment?  What's wrong with probing during resume?
>>>
>>> The USB stack does carry out probes during resume under certain
>>> circumstances.  A driver lacking a reset_resume callback is one of
>>> those circumstances.
>>
>> in case the platform kills the power to the USB lines, we can never
>> do anything about this. I do not want to hack around this in the
>> driver.
>>
>> What are the cases where we should implement reset_resume and would
>> it really help here. Since the btusb.ko driver implements
>> suspend/resume support, would reset_resume ever be called?
>
> One of those cases is exactly what you have been talking about: when
> the platform kills power to the USB lines during suspend.  The driver's
> reset_resume routine will be called during resume, as opposed to the
> probe routine being called.  Therefore the driver will be able to tell
> that this is not a new device instance.
>
> The other cases are less likely to occur: a device is unable to resume
> normally and requires a reset before it will start working again, or
> something else goes wrong along those lines.
>
>> However I get the feeling someone needs to go back and see if the
>> device is the same one and just gets probed again or if it is a new
>> one from the USB host stack perspective.
>
> That can be done easily enough by enabling usbcore debugging before
> carrying out the system suspend:
>
> 	echo 'module usbcore =p' >/debug/dynamic_debug/control
>
> The debugging information in the kernel log will tell just what
> happened.
>
>

Playing around in my test setup as a baseline

[   41.991035] usb usb1-port11: not reset yet, waiting 50ms
[   42.092902] usb 1-11: reset full-speed USB device number 4 using xhci_hcd
[   42.143575] usb usb1-port11: not reset yet, waiting 50ms
[   42.257822] btusb 1-11:1.0: no reset_resume for driver btusb?
[   42.257823] btusb 1-11:1.1: no reset_resume for driver btusb?
[   42.257825] btusb 1-11:1.0: forced unbind
[   42.258305] kworker/dying (826) used greatest stack depth: 10680 bytes left
[   42.331342] usb 1-9.2: reset full-speed USB device number 7 using xhci_hcd
[   42.416631] usb 1-9.2: ep0 maxpacket = 8
[   42.681288] usb 1-9.1: reset low-speed USB device number 5 using xhci_hcd
[   42.968138] usb 1-9.1: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
[   42.968157] usb 1-9.1: ep 0x82 - rounding interval to 64 microframes, ep desc says 80 microframes
[   43.036290] usb 1-9.4: reset high-speed USB device number 8 using xhci_hcd
[   43.123126] hub 1-9.4:1.0: hub_reset_resume
[   43.123581] hub 1-9.4:1.0: enabling power on all ports
[   43.224853] PM: resume of devices complete after 2456.587 msecs
[   43.225038] btusb 1-11:1.0: usb_probe_interface
[   43.225040] btusb 1-11:1.0: usb_probe_interface - got id
[   43.225802] ------------[ cut here ]------------
[   43.225807] WARNING: CPU: 7 PID: 2844 at drivers/base/firmware_class.c:1118 _request_firmware+0x5ee/0x890()


so it is trying to call the reset resume. If I try a 'dummy reset resume'

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index a7bdac0..cda8137 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -3401,6 +3401,7 @@ static struct usb_driver btusb_driver = {
  #ifdef CONFIG_PM
         .suspend        = btusb_suspend,
         .resume         = btusb_resume,
+       .reset_resume   = btusb_resume,
  #endif
         .id_table       = btusb_table,
         .supports_autosuspend = 1,


I no longer see the warning which means that probe is no longer being called.

Marcel, does implementing a proper reset_resume callback seem like the right
approach or do you need more information?

Thanks,
Laura

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-22  0:21                                       ` Laura Abbott
@ 2015-05-22  3:13                                         ` Marcel Holtmann
  2015-05-28  0:47                                           ` Laura Abbott
                                                             ` (2 more replies)
  2015-05-22  7:37                                         ` [RESEND][PATCH] Bluetooth: Make request workqueue freezable Arend van Spriel
  1 sibling, 3 replies; 45+ messages in thread
From: Marcel Holtmann @ 2015-05-22  3:13 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

Hi Laura,

>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>> If it's a new probe, it should be never executed during resume.
>>>> 
>>>> Can you expand this comment?  What's wrong with probing during resume?
>>>> 
>>>> The USB stack does carry out probes during resume under certain
>>>> circumstances.  A driver lacking a reset_resume callback is one of
>>>> those circumstances.
>>> 
>>> in case the platform kills the power to the USB lines, we can never
>>> do anything about this. I do not want to hack around this in the
>>> driver.
>>> 
>>> What are the cases where we should implement reset_resume and would
>>> it really help here. Since the btusb.ko driver implements
>>> suspend/resume support, would reset_resume ever be called?
>> 
>> One of those cases is exactly what you have been talking about: when
>> the platform kills power to the USB lines during suspend.  The driver's
>> reset_resume routine will be called during resume, as opposed to the
>> probe routine being called.  Therefore the driver will be able to tell
>> that this is not a new device instance.
>> 
>> The other cases are less likely to occur: a device is unable to resume
>> normally and requires a reset before it will start working again, or
>> something else goes wrong along those lines.
>> 
>>> However I get the feeling someone needs to go back and see if the
>>> device is the same one and just gets probed again or if it is a new
>>> one from the USB host stack perspective.
>> 
>> That can be done easily enough by enabling usbcore debugging before
>> carrying out the system suspend:
>> 
>> 	echo 'module usbcore =p' >/debug/dynamic_debug/control
>> 
>> The debugging information in the kernel log will tell just what
>> happened.
>> 
>> 
> 
> Playing around in my test setup as a baseline
> 
> [   41.991035] usb usb1-port11: not reset yet, waiting 50ms
> [   42.092902] usb 1-11: reset full-speed USB device number 4 using xhci_hcd
> [   42.143575] usb usb1-port11: not reset yet, waiting 50ms
> [   42.257822] btusb 1-11:1.0: no reset_resume for driver btusb?
> [   42.257823] btusb 1-11:1.1: no reset_resume for driver btusb?
> [   42.257825] btusb 1-11:1.0: forced unbind
> [   42.258305] kworker/dying (826) used greatest stack depth: 10680 bytes left
> [   42.331342] usb 1-9.2: reset full-speed USB device number 7 using xhci_hcd
> [   42.416631] usb 1-9.2: ep0 maxpacket = 8
> [   42.681288] usb 1-9.1: reset low-speed USB device number 5 using xhci_hcd
> [   42.968138] usb 1-9.1: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
> [   42.968157] usb 1-9.1: ep 0x82 - rounding interval to 64 microframes, ep desc says 80 microframes
> [   43.036290] usb 1-9.4: reset high-speed USB device number 8 using xhci_hcd
> [   43.123126] hub 1-9.4:1.0: hub_reset_resume
> [   43.123581] hub 1-9.4:1.0: enabling power on all ports
> [   43.224853] PM: resume of devices complete after 2456.587 msecs
> [   43.225038] btusb 1-11:1.0: usb_probe_interface
> [   43.225040] btusb 1-11:1.0: usb_probe_interface - got id
> [   43.225802] ------------[ cut here ]------------
> [   43.225807] WARNING: CPU: 7 PID: 2844 at drivers/base/firmware_class.c:1118 _request_firmware+0x5ee/0x890()
> 
> 
> so it is trying to call the reset resume. If I try a 'dummy reset resume'
> 
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index a7bdac0..cda8137 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -3401,6 +3401,7 @@ static struct usb_driver btusb_driver = {
> #ifdef CONFIG_PM
>        .suspend        = btusb_suspend,
>        .resume         = btusb_resume,
> +       .reset_resume   = btusb_resume,
> #endif
>        .id_table       = btusb_table,
>        .supports_autosuspend = 1,
> 
> 
> I no longer see the warning which means that probe is no longer being called.
> 
> Marcel, does implementing a proper reset_resume callback seem like the right
> approach or do you need more information?

I wonder what is the right work that needs to be done in the reset_resume callback. I am curious if devices are forgetting their firmware or not. There is a patch to make the Realtek devices forcefully reset_resume since they forget their firmware. So there is at least one kind of devices where the firmware does not survive normal suspend/resume behaviour.

If the devices forget the firmware, then this means that we actually need to tell the Bluetooth core that this device has been reset and it has to run through hdev->setup() again. If it does that, then we have the same problem that the firmware will not be found since userspace is not yet ready. However we could note the fact that we tried to lookup the firmware and just know that it was not found. So that might help already.

Devices that always require the firmware, we can assume that the firmware will have been cached since we successfully loaded it in the first place. Which will most likely make the Realtek devices function just fine. It has the advantage that we do not have to go through the disconnect() and probe() cycle which will in turn unregister and re-register the HCI device.

The question really is what a btusb_reset_resume() function should be doing. We have to assume that the device lost all its state and a reset of the Bluetooth stack is needed. The question if firmware is persistent or not might be a per device quirk that we have to deal with later.

This means we need to do at least the following transactions in reset_resume case:

	Clear all suspend flags and flush all data
	Start the interrupt URB if it was running before
	Start the bulk URBs if they were running before
	Call hci_reset_dev

Now the problem with hci_reset_dev at the moment is that it just injects a hardware error event. And based on that we restart the Bluetooth stack for that device. This needs extra work to restart the stack without needing to inject an event. The injection of the event will still be needed in case of HCI user channel operation, but that is yet another story.

Do we have an easy way to simulate this behaviour and trigger this specific case? We really need to know for which devices the firmware stays around.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-22  0:21                                       ` Laura Abbott
  2015-05-22  3:13                                         ` Marcel Holtmann
@ 2015-05-22  7:37                                         ` Arend van Spriel
  2015-05-22  7:41                                           ` Arend van Spriel
  1 sibling, 1 reply; 45+ messages in thread
From: Arend van Spriel @ 2015-05-22  7:37 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alan Stern, Marcel Holtmann, Takashi Iwai, Oliver Neukum,
	Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/22/15 02:21, Laura Abbott wrote:
> On 05/21/2015 08:26 AM, Alan Stern wrote:
>> On Thu, 21 May 2015, Marcel Holtmann wrote:
>>
>>> Hi Alan,
>>>
>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>> If it's a new probe, it should be never executed during resume.
>>>>
>>>> Can you expand this comment? What's wrong with probing during resume?
>>>>
>>>> The USB stack does carry out probes during resume under certain
>>>> circumstances. A driver lacking a reset_resume callback is one of
>>>> those circumstances.
>>>
>>> in case the platform kills the power to the USB lines, we can never
>>> do anything about this. I do not want to hack around this in the
>>> driver.
>>>
>>> What are the cases where we should implement reset_resume and would
>>> it really help here. Since the btusb.ko driver implements
>>> suspend/resume support, would reset_resume ever be called?
>>
>> One of those cases is exactly what you have been talking about: when
>> the platform kills power to the USB lines during suspend. The driver's
>> reset_resume routine will be called during resume, as opposed to the
>> probe routine being called. Therefore the driver will be able to tell
>> that this is not a new device instance.
>>
>> The other cases are less likely to occur: a device is unable to resume
>> normally and requires a reset before it will start working again, or
>> something else goes wrong along those lines.
>>
>>> However I get the feeling someone needs to go back and see if the
>>> device is the same one and just gets probed again or if it is a new
>>> one from the USB host stack perspective.
>>
>> That can be done easily enough by enabling usbcore debugging before
>> carrying out the system suspend:
>>
>> echo 'module usbcore =p' >/debug/dynamic_debug/control
>>
>> The debugging information in the kernel log will tell just what
>> happened.
>>
>>
>
> Playing around in my test setup as a baseline
>
> [ 41.991035] usb usb1-port11: not reset yet, waiting 50ms
> [ 42.092902] usb 1-11: reset full-speed USB device number 4 using xhci_hcd
> [ 42.143575] usb usb1-port11: not reset yet, waiting 50ms
> [ 42.257822] btusb 1-11:1.0: no reset_resume for driver btusb?
> [ 42.257823] btusb 1-11:1.1: no reset_resume for driver btusb?
> [ 42.257825] btusb 1-11:1.0: forced unbind
> [ 42.258305] kworker/dying (826) used greatest stack depth: 10680 bytes
> left
> [ 42.331342] usb 1-9.2: reset full-speed USB device number 7 using xhci_hcd
> [ 42.416631] usb 1-9.2: ep0 maxpacket = 8
> [ 42.681288] usb 1-9.1: reset low-speed USB device number 5 using xhci_hcd
> [ 42.968138] usb 1-9.1: ep 0x81 - rounding interval to 64 microframes,
> ep desc says 80 microframes
> [ 42.968157] usb 1-9.1: ep 0x82 - rounding interval to 64 microframes,
> ep desc says 80 microframes
> [ 43.036290] usb 1-9.4: reset high-speed USB device number 8 using xhci_hcd
> [ 43.123126] hub 1-9.4:1.0: hub_reset_resume
> [ 43.123581] hub 1-9.4:1.0: enabling power on all ports
> [ 43.224853] PM: resume of devices complete after 2456.587 msecs
> [ 43.225038] btusb 1-11:1.0: usb_probe_interface
> [ 43.225040] btusb 1-11:1.0: usb_probe_interface - got id
> [ 43.225802] ------------[ cut here ]------------
> [ 43.225807] WARNING: CPU: 7 PID: 2844 at
> drivers/base/firmware_class.c:1118 _request_firmware+0x5ee/0x890()
>
>
> so it is trying to call the reset resume. If I try a 'dummy reset resume'
>
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index a7bdac0..cda8137 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -3401,6 +3401,7 @@ static struct usb_driver btusb_driver = {
> #ifdef CONFIG_PM
> .suspend = btusb_suspend,
> .resume = btusb_resume,
> + .reset_resume = btusb_resume,
> #endif
> .id_table = btusb_table,
> .supports_autosuspend = 1,
>
>
> I no longer see the warning which means that probe is no longer being
> called.
>
> Marcel, does implementing a proper reset_resume callback seem like the
> right
> approach or do you need more information?

Hi, Laura

I believe that some devices supported by btusb would need to do a 
request_firmware() in the reset_resume() callback and thus end up with 
the same issue. btusb could store the firmware obtained during the probe 
in it driver private structure and use that in reset_resume() callback, 
but it means the memory for the firmware blobs will not be released 
until the driver is unloaded.

Regards,
Arend

> Thanks,
> Laura
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-22  7:37                                         ` [RESEND][PATCH] Bluetooth: Make request workqueue freezable Arend van Spriel
@ 2015-05-22  7:41                                           ` Arend van Spriel
  0 siblings, 0 replies; 45+ messages in thread
From: Arend van Spriel @ 2015-05-22  7:41 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alan Stern, Marcel Holtmann, Takashi Iwai, Oliver Neukum,
	Ming Lei, David S. Miller, Laura Abbott, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/22/15 09:37, Arend van Spriel wrote:
> On 05/22/15 02:21, Laura Abbott wrote:
>> On 05/21/2015 08:26 AM, Alan Stern wrote:
>>> On Thu, 21 May 2015, Marcel Holtmann wrote:
>>>
>>>> Hi Alan,
>>>>
>>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>>> If it's a new probe, it should be never executed during resume.
>>>>>
>>>>> Can you expand this comment? What's wrong with probing during resume?
>>>>>
>>>>> The USB stack does carry out probes during resume under certain
>>>>> circumstances. A driver lacking a reset_resume callback is one of
>>>>> those circumstances.
>>>>
>>>> in case the platform kills the power to the USB lines, we can never
>>>> do anything about this. I do not want to hack around this in the
>>>> driver.
>>>>
>>>> What are the cases where we should implement reset_resume and would
>>>> it really help here. Since the btusb.ko driver implements
>>>> suspend/resume support, would reset_resume ever be called?
>>>
>>> One of those cases is exactly what you have been talking about: when
>>> the platform kills power to the USB lines during suspend. The driver's
>>> reset_resume routine will be called during resume, as opposed to the
>>> probe routine being called. Therefore the driver will be able to tell
>>> that this is not a new device instance.
>>>
>>> The other cases are less likely to occur: a device is unable to resume
>>> normally and requires a reset before it will start working again, or
>>> something else goes wrong along those lines.
>>>
>>>> However I get the feeling someone needs to go back and see if the
>>>> device is the same one and just gets probed again or if it is a new
>>>> one from the USB host stack perspective.
>>>
>>> That can be done easily enough by enabling usbcore debugging before
>>> carrying out the system suspend:
>>>
>>> echo 'module usbcore =p' >/debug/dynamic_debug/control
>>>
>>> The debugging information in the kernel log will tell just what
>>> happened.
>>>
>>>
>>
>> Playing around in my test setup as a baseline
>>
>> [ 41.991035] usb usb1-port11: not reset yet, waiting 50ms
>> [ 42.092902] usb 1-11: reset full-speed USB device number 4 using
>> xhci_hcd
>> [ 42.143575] usb usb1-port11: not reset yet, waiting 50ms
>> [ 42.257822] btusb 1-11:1.0: no reset_resume for driver btusb?
>> [ 42.257823] btusb 1-11:1.1: no reset_resume for driver btusb?
>> [ 42.257825] btusb 1-11:1.0: forced unbind
>> [ 42.258305] kworker/dying (826) used greatest stack depth: 10680 bytes
>> left
>> [ 42.331342] usb 1-9.2: reset full-speed USB device number 7 using
>> xhci_hcd
>> [ 42.416631] usb 1-9.2: ep0 maxpacket = 8
>> [ 42.681288] usb 1-9.1: reset low-speed USB device number 5 using
>> xhci_hcd
>> [ 42.968138] usb 1-9.1: ep 0x81 - rounding interval to 64 microframes,
>> ep desc says 80 microframes
>> [ 42.968157] usb 1-9.1: ep 0x82 - rounding interval to 64 microframes,
>> ep desc says 80 microframes
>> [ 43.036290] usb 1-9.4: reset high-speed USB device number 8 using
>> xhci_hcd
>> [ 43.123126] hub 1-9.4:1.0: hub_reset_resume
>> [ 43.123581] hub 1-9.4:1.0: enabling power on all ports
>> [ 43.224853] PM: resume of devices complete after 2456.587 msecs
>> [ 43.225038] btusb 1-11:1.0: usb_probe_interface
>> [ 43.225040] btusb 1-11:1.0: usb_probe_interface - got id
>> [ 43.225802] ------------[ cut here ]------------
>> [ 43.225807] WARNING: CPU: 7 PID: 2844 at
>> drivers/base/firmware_class.c:1118 _request_firmware+0x5ee/0x890()
>>
>>
>> so it is trying to call the reset resume. If I try a 'dummy reset resume'
>>
>> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
>> index a7bdac0..cda8137 100644
>> --- a/drivers/bluetooth/btusb.c
>> +++ b/drivers/bluetooth/btusb.c
>> @@ -3401,6 +3401,7 @@ static struct usb_driver btusb_driver = {
>> #ifdef CONFIG_PM
>> .suspend = btusb_suspend,
>> .resume = btusb_resume,
>> + .reset_resume = btusb_resume,
>> #endif
>> .id_table = btusb_table,
>> .supports_autosuspend = 1,
>>
>>
>> I no longer see the warning which means that probe is no longer being
>> called.
>>
>> Marcel, does implementing a proper reset_resume callback seem like the
>> right
>> approach or do you need more information?
>
> Hi, Laura
>
> I believe that some devices supported by btusb would need to do a
> request_firmware() in the reset_resume() callback and thus end up with
> the same issue. btusb could store the firmware obtained during the probe
> in it driver private structure and use that in reset_resume() callback,
> but it means the memory for the firmware blobs will not be released
> until the driver is unloaded.

Same is true if caching is done in firmware_loader so it may not be a 
big deal.

Regards,
Arend

> Regards,
> Arend
>
>> Thanks,
>> Laura
>> --
>> To unsubscribe from this list: send the line "unsubscribe
>> linux-bluetooth" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-21 20:46                                             ` Arend van Spriel
@ 2015-05-22 11:30                                               ` Oliver Neukum
  0 siblings, 0 replies; 45+ messages in thread
From: Oliver Neukum @ 2015-05-22 11:30 UTC (permalink / raw)
  To: Arend van Spriel
  Cc: Takashi Iwai, Ming Lei, David S. Miller, Laura Abbott,
	JohanHedberg, Marcel Holtmann, Rafael J. Wysocki,
	Gustavo F. Padovan, Laura Abbott, Alan Stern,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On Thu, 2015-05-21 at 22:46 +0200, Arend van Spriel wrote:
> On 05/21/15 19:32, Takashi Iwai wrote:


> >>>>> Well, if the probe requires the access to a user-space file, it can't
> >>>>> be done during resume.  That's the very problem we're seeing now.
> >>>>> The firmware loader can't help much alone if it's a new device
> >>>>> object.
> 
> So you are saying each device driver should come up with some retry 
> mechanism. Would make more sense to come up with something like that 
> behind the scenes in the firmware loader so all device drivers can rely 
> on one and the same solution.

There is already a notifier for this. I don't see why the firmware
layer couldn't retrigger a match for all unbound devices, just like
we do when a new driver is loaded.

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RESEND][PATCH] Bluetooth: Make request workqueue freezable
  2015-05-22  3:13                                         ` Marcel Holtmann
@ 2015-05-28  0:47                                           ` Laura Abbott
  2015-06-02  1:14                                           ` [PATCH 1/2] Bluetooth: Add reset_resume function Laura Abbott
  2015-06-02  1:14                                           ` [PATCH 2/2] Bluetooth: btusb: " Laura Abbott
  2 siblings, 0 replies; 45+ messages in thread
From: Laura Abbott @ 2015-05-28  0:47 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Laura Abbott, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan,
	bluez mailin list (linux-bluetooth@vger.kernel.org),
	Linux Kernel Mailing List, USB list, netdev

On 05/21/2015 08:13 PM, Marcel Holtmann wrote:
> Hi Laura,
>
>>>>>> Then avoiding the failed firmware is no solution, indeed.
>>>>>> If it's a new probe, it should be never executed during resume.
>>>>>
>>>>> Can you expand this comment?  What's wrong with probing during resume?
>>>>>
>>>>> The USB stack does carry out probes during resume under certain
>>>>> circumstances.  A driver lacking a reset_resume callback is one of
>>>>> those circumstances.
>>>>
>>>> in case the platform kills the power to the USB lines, we can never
>>>> do anything about this. I do not want to hack around this in the
>>>> driver.
>>>>
>>>> What are the cases where we should implement reset_resume and would
>>>> it really help here. Since the btusb.ko driver implements
>>>> suspend/resume support, would reset_resume ever be called?
>>>
>>> One of those cases is exactly what you have been talking about: when
>>> the platform kills power to the USB lines during suspend.  The driver's
>>> reset_resume routine will be called during resume, as opposed to the
>>> probe routine being called.  Therefore the driver will be able to tell
>>> that this is not a new device instance.
>>>
>>> The other cases are less likely to occur: a device is unable to resume
>>> normally and requires a reset before it will start working again, or
>>> something else goes wrong along those lines.
>>>
>>>> However I get the feeling someone needs to go back and see if the
>>>> device is the same one and just gets probed again or if it is a new
>>>> one from the USB host stack perspective.
>>>
>>> That can be done easily enough by enabling usbcore debugging before
>>> carrying out the system suspend:
>>>
>>> 	echo 'module usbcore =p' >/debug/dynamic_debug/control
>>>
>>> The debugging information in the kernel log will tell just what
>>> happened.
>>>
>>>
>>
>> Playing around in my test setup as a baseline
>>
>> [   41.991035] usb usb1-port11: not reset yet, waiting 50ms
>> [   42.092902] usb 1-11: reset full-speed USB device number 4 using xhci_hcd
>> [   42.143575] usb usb1-port11: not reset yet, waiting 50ms
>> [   42.257822] btusb 1-11:1.0: no reset_resume for driver btusb?
>> [   42.257823] btusb 1-11:1.1: no reset_resume for driver btusb?
>> [   42.257825] btusb 1-11:1.0: forced unbind
>> [   42.258305] kworker/dying (826) used greatest stack depth: 10680 bytes left
>> [   42.331342] usb 1-9.2: reset full-speed USB device number 7 using xhci_hcd
>> [   42.416631] usb 1-9.2: ep0 maxpacket = 8
>> [   42.681288] usb 1-9.1: reset low-speed USB device number 5 using xhci_hcd
>> [   42.968138] usb 1-9.1: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
>> [   42.968157] usb 1-9.1: ep 0x82 - rounding interval to 64 microframes, ep desc says 80 microframes
>> [   43.036290] usb 1-9.4: reset high-speed USB device number 8 using xhci_hcd
>> [   43.123126] hub 1-9.4:1.0: hub_reset_resume
>> [   43.123581] hub 1-9.4:1.0: enabling power on all ports
>> [   43.224853] PM: resume of devices complete after 2456.587 msecs
>> [   43.225038] btusb 1-11:1.0: usb_probe_interface
>> [   43.225040] btusb 1-11:1.0: usb_probe_interface - got id
>> [   43.225802] ------------[ cut here ]------------
>> [   43.225807] WARNING: CPU: 7 PID: 2844 at drivers/base/firmware_class.c:1118 _request_firmware+0x5ee/0x890()
>>
>>
>> so it is trying to call the reset resume. If I try a 'dummy reset resume'
>>
>> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
>> index a7bdac0..cda8137 100644
>> --- a/drivers/bluetooth/btusb.c
>> +++ b/drivers/bluetooth/btusb.c
>> @@ -3401,6 +3401,7 @@ static struct usb_driver btusb_driver = {
>> #ifdef CONFIG_PM
>>         .suspend        = btusb_suspend,
>>         .resume         = btusb_resume,
>> +       .reset_resume   = btusb_resume,
>> #endif
>>         .id_table       = btusb_table,
>>         .supports_autosuspend = 1,
>>
>>
>> I no longer see the warning which means that probe is no longer being called.
>>
>> Marcel, does implementing a proper reset_resume callback seem like the right
>> approach or do you need more information?
>
> I wonder what is the right work that needs to be done in the reset_resume callback. I am curious if devices are forgetting their firmware or not. There is a patch to make the Realtek devices forcefully reset_resume since they forget their firmware. So there is at least one kind of devices where the firmware does not survive normal suspend/resume behaviour.
>
> If the devices forget the firmware, then this means that we actually need to tell the Bluetooth core that this device has been reset and it has to run through hdev->setup() again. If it does that, then we have the same problem that the firmware will not be found since userspace is not yet ready. However we could note the fact that we tried to lookup the firmware and just know that it was not found. So that might help already.
>
> Devices that always require the firmware, we can assume that the firmware will have been cached since we successfully loaded it in the first place. Which will most likely make the Realtek devices function just fine. It has the advantage that we do not have to go through the disconnect() and probe() cycle which will in turn unregister and re-register the HCI device.
>
> The question really is what a btusb_reset_resume() function should be doing. We have to assume that the device lost all its state and a reset of the Bluetooth stack is needed. The question if firmware is persistent or not might be a per device quirk that we have to deal with later.
>
> This means we need to do at least the following transactions in reset_resume case:
>
> 	Clear all suspend flags and flush all data
> 	Start the interrupt URB if it was running before
> 	Start the bulk URBs if they were running before
> 	Call hci_reset_dev
>
> Now the problem with hci_reset_dev at the moment is that it just injects a hardware error event. And based on that we restart the Bluetooth stack for that device. This needs extra work to restart the stack without needing to inject an event. The injection of the event will still be needed in case of HCI user channel operation, but that is yet another story.
>
> Do we have an easy way to simulate this behaviour and trigger this specific case? We really need to know for which devices the firmware stays around.
>

Which behavior are you looking to simulate? At least on the chipset I have,
the bluetooth always gets reset with a simple suspend to ram. For testing
I've been forcing the firmware warning by requesting non-existent firmware.

> Regards
>
> Marcel
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/2] Bluetooth: Add reset_resume function
  2015-05-22  3:13                                         ` Marcel Holtmann
  2015-05-28  0:47                                           ` Laura Abbott
@ 2015-06-02  1:14                                           ` Laura Abbott
  2015-06-02  1:28                                             ` Marcel Holtmann
  2015-06-02  7:47                                             ` Oliver Neukum
  2015-06-02  1:14                                           ` [PATCH 2/2] Bluetooth: btusb: " Laura Abbott
  2 siblings, 2 replies; 45+ messages in thread
From: Laura Abbott @ 2015-06-02  1:14 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Laura Abbott, Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, linux-bluetooth, Linux Kernel Mailing List,
	USB list, netdev

Bluetooth devices off of some buses such as USB may lose power across
suspend/resume. When this happens, drivers may need to have the setup
function called again and behave differently than a cold power on.
Add a reset_resume function for drivers to call. During the
reset_resume case, the flag HCI_RESET_RESUME will be set to allow
drivers to differentate.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
This matches with what hci_reset_dev does and also ensures
the setup function gets called again.
---
 include/net/bluetooth/hci.h      |  1 +
 include/net/bluetooth/hci_core.h |  1 +
 net/bluetooth/hci_core.c         | 16 ++++++++++++++++
 3 files changed, 18 insertions(+)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index d95da83..6285410 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -185,6 +185,7 @@ enum {
 	HCI_RAW,
 
 	HCI_RESET,
+	HCI_RESET_RESUME,
 };
 
 /* HCI socket flags */
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index a056c2b..14f9c72 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -941,6 +941,7 @@ int hci_register_dev(struct hci_dev *hdev);
 void hci_unregister_dev(struct hci_dev *hdev);
 int hci_suspend_dev(struct hci_dev *hdev);
 int hci_resume_dev(struct hci_dev *hdev);
+int hci_reset_resume_dev(struct hci_dev *hdev);
 int hci_reset_dev(struct hci_dev *hdev);
 int hci_dev_open(__u16 dev);
 int hci_dev_close(__u16 dev);
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index c4802f3..090762b 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -1558,6 +1558,7 @@ static int hci_dev_do_close(struct hci_dev *hdev)
 	BT_DBG("%s %p", hdev->name, hdev);
 
 	if (!hci_dev_test_flag(hdev, HCI_UNREGISTER) &&
+	    !test_bit(HCI_RESET_RESUME, &hdev->flags) &&
 	    test_bit(HCI_UP, &hdev->flags)) {
 		/* Execute vendor specific shutdown routine */
 		if (hdev->shutdown)
@@ -2110,6 +2111,7 @@ static void hci_power_on(struct work_struct *work)
 		 */
 		mgmt_index_added(hdev);
 	}
+	hci_dev_test_and_clear_flag(hdev, HCI_RESET_RESUME);
 }
 
 static void hci_power_off(struct work_struct *work)
@@ -3298,6 +3300,20 @@ int hci_reset_dev(struct hci_dev *hdev)
 }
 EXPORT_SYMBOL(hci_reset_dev);
 
+/*
+ * For USB reset_resume callbacks
+ */
+int hci_reset_resume_dev(struct hci_dev *hdev)
+{
+	set_bit(HCI_RESET_RESUME, &hdev->flags);
+	hci_dev_do_close(hdev);
+	hci_dev_set_flag(hdev, HCI_SETUP);
+
+	queue_work(hdev->req_workqueue, &hdev->power_on);
+	return 0;
+}
+EXPORT_SYMBOL(hci_reset_resume_dev);
+
 /* Receive frame from HCI drivers */
 int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb)
 {
-- 
2.4.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/2] Bluetooth: btusb: Add reset_resume function
  2015-05-22  3:13                                         ` Marcel Holtmann
  2015-05-28  0:47                                           ` Laura Abbott
  2015-06-02  1:14                                           ` [PATCH 1/2] Bluetooth: Add reset_resume function Laura Abbott
@ 2015-06-02  1:14                                           ` Laura Abbott
  2015-06-02  1:32                                             ` Marcel Holtmann
  2 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2015-06-02  1:14 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Laura Abbott, Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, linux-bluetooth, Linux Kernel Mailing List,
	USB list, netdev


Some USB hubs may lose power across suspend/resume.
Add a reset_resume callback to properly reset those bluetoot devices.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
Now the setup function is called again with the HCI_RESET_RESUME
flag set. The various functions could then use that RESET_RESUME
flag to determine if loading the firmware is appropriate or not.
---
 drivers/bluetooth/btusb.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 3c10d4d..34884cf 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -3382,6 +3382,21 @@ done:
 
 	return err;
 }
+
+static int btusb_reset_resume(struct usb_interface *intf)
+{
+	struct btusb_data *data = usb_get_intfdata(intf);
+	struct hci_dev *hdev = data->hdev;
+	int ret;
+
+	BT_DBG("intf %p", intf);
+
+	ret = btusb_resume(intf);
+	if (ret)
+		return ret;
+
+	return hci_reset_resume_dev(hdev);
+}
 #endif
 
 static struct usb_driver btusb_driver = {
@@ -3391,6 +3406,7 @@ static struct usb_driver btusb_driver = {
 #ifdef CONFIG_PM
 	.suspend	= btusb_suspend,
 	.resume		= btusb_resume,
+	.reset_resume	= btusb_reset_resume,
 #endif
 	.id_table	= btusb_table,
 	.supports_autosuspend = 1,
-- 
2.4.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/2] Bluetooth: Add reset_resume function
  2015-06-02  1:14                                           ` [PATCH 1/2] Bluetooth: Add reset_resume function Laura Abbott
@ 2015-06-02  1:28                                             ` Marcel Holtmann
  2015-06-02 14:17                                               ` Josh Boyer
  2015-06-02  7:47                                             ` Oliver Neukum
  1 sibling, 1 reply; 45+ messages in thread
From: Marcel Holtmann @ 2015-06-02  1:28 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, BlueZ development, Linux Kernel Mailing List,
	USB list, netdev

Hi Laura,

> Bluetooth devices off of some buses such as USB may lose power across
> suspend/resume. When this happens, drivers may need to have the setup
> function called again and behave differently than a cold power on.
> Add a reset_resume function for drivers to call. During the
> reset_resume case, the flag HCI_RESET_RESUME will be set to allow
> drivers to differentate.
> 
> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
> ---
> This matches with what hci_reset_dev does and also ensures
> the setup function gets called again.
> ---
> include/net/bluetooth/hci.h      |  1 +
> include/net/bluetooth/hci_core.h |  1 +
> net/bluetooth/hci_core.c         | 16 ++++++++++++++++
> 3 files changed, 18 insertions(+)
> 
> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> index d95da83..6285410 100644
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@ -185,6 +185,7 @@ enum {
> 	HCI_RAW,
> 
> 	HCI_RESET,
> +	HCI_RESET_RESUME,
> };

no more addition to this list of flags please. These are userspace exposed flags and with that ABI that we are never ever touching again. If you need flags on a per device basis, then use the second list.

> /* HCI socket flags */
> diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> index a056c2b..14f9c72 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -941,6 +941,7 @@ int hci_register_dev(struct hci_dev *hdev);
> void hci_unregister_dev(struct hci_dev *hdev);
> int hci_suspend_dev(struct hci_dev *hdev);
> int hci_resume_dev(struct hci_dev *hdev);
> +int hci_reset_resume_dev(struct hci_dev *hdev);
> int hci_reset_dev(struct hci_dev *hdev);
> int hci_dev_open(__u16 dev);
> int hci_dev_close(__u16 dev);
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index c4802f3..090762b 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -1558,6 +1558,7 @@ static int hci_dev_do_close(struct hci_dev *hdev)
> 	BT_DBG("%s %p", hdev->name, hdev);
> 
> 	if (!hci_dev_test_flag(hdev, HCI_UNREGISTER) &&
> +	    !test_bit(HCI_RESET_RESUME, &hdev->flags) &&
> 	    test_bit(HCI_UP, &hdev->flags)) {
> 		/* Execute vendor specific shutdown routine */
> 		if (hdev->shutdown)
> @@ -2110,6 +2111,7 @@ static void hci_power_on(struct work_struct *work)
> 		 */
> 		mgmt_index_added(hdev);
> 	}
> +	hci_dev_test_and_clear_flag(hdev, HCI_RESET_RESUME);

If you do not use the result of the test, why bother testing at all. Also you realize that you are a now clearing the flag on hdev->dev_flags and not hdev->flags.

It also means that this code is not tested when you actually have had a reset resume and then get a clean power down. Not running the shutdown procedure would be actually wrong in that case.

> }
> 
> static void hci_power_off(struct work_struct *work)
> @@ -3298,6 +3300,20 @@ int hci_reset_dev(struct hci_dev *hdev)
> }
> EXPORT_SYMBOL(hci_reset_dev);
> 
> +/*
> + * For USB reset_resume callbacks
> + */
> +int hci_reset_resume_dev(struct hci_dev *hdev)
> +{
> +	set_bit(HCI_RESET_RESUME, &hdev->flags);
> +	hci_dev_do_close(hdev);
> +	hci_dev_set_flag(hdev, HCI_SETUP);
> +
> +	queue_work(hdev->req_workqueue, &hdev->power_on);
> +	return 0;
> +}
> +EXPORT_SYMBOL(hci_reset_resume_dev);
> +

When we are reacting to a hardware error, we do hci_dev_do_close followed hci_dev_do_open. Why would you queue the power on work here. It sounds more like that this should be actually similar to hci_error_reset that gets queued.

And this is where I said the really tricky part comes in. Is the device keeping the firmware or not. We really need to know that one first. If it keeps its firmware, then we do not need to run through hdev->setup again.

>From a driver point of view, the current guarantee is that hdev->setup is only executed once. And this means really only once. It does not need to protect itself against being run again. So it should only be run again if the device looses all its states.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] Bluetooth: btusb: Add reset_resume function
  2015-06-02  1:14                                           ` [PATCH 2/2] Bluetooth: btusb: " Laura Abbott
@ 2015-06-02  1:32                                             ` Marcel Holtmann
  0 siblings, 0 replies; 45+ messages in thread
From: Marcel Holtmann @ 2015-06-02  1:32 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, linux-bluetooth, Linux Kernel Mailing List,
	USB list, netdev

Hi Laura,

> Some USB hubs may lose power across suspend/resume.
> Add a reset_resume callback to properly reset those bluetoot devices.
> 
> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
> ---
> Now the setup function is called again with the HCI_RESET_RESUME
> flag set. The various functions could then use that RESET_RESUME
> flag to determine if loading the firmware is appropriate or not.
> ---
> drivers/bluetooth/btusb.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index 3c10d4d..34884cf 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -3382,6 +3382,21 @@ done:
> 
> 	return err;
> }
> +
> +static int btusb_reset_resume(struct usb_interface *intf)
> +{
> +	struct btusb_data *data = usb_get_intfdata(intf);
> +	struct hci_dev *hdev = data->hdev;
> +	int ret;
> +
> +	BT_DBG("intf %p", intf);
> +
> +	ret = btusb_resume(intf);
> +	if (ret)
> +		return ret;
> +
> +	return hci_reset_resume_dev(hdev);
> +}

it seems convenient to call btusb_resume, but I would really prefer if we didn’t. From what I know is that when reset_resume callback is called, then the device has been reset. So that means any prior transfer we have remembered is null and void. So even trying to replay any of it is just a lost cause.

Instead we should clear any pending transfers and clear everything and instead pretend that we bring the transport back to its virgin state. It also means that isochronous transfers should be all killed since we will have no SCO connections after this. Remember that we are telling the Bluetooth core to reset this device.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/2] Bluetooth: Add reset_resume function
  2015-06-02  1:14                                           ` [PATCH 1/2] Bluetooth: Add reset_resume function Laura Abbott
  2015-06-02  1:28                                             ` Marcel Holtmann
@ 2015-06-02  7:47                                             ` Oliver Neukum
  1 sibling, 0 replies; 45+ messages in thread
From: Oliver Neukum @ 2015-06-02  7:47 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Marcel Holtmann, Ming Lei, David S. Miller, Johan Hedberg,
	Rafael J. Wysocki, Gustavo F. Padovan, Alan Stern, Takashi Iwai,
	linux-bluetooth, Linux Kernel Mailing List, USB list, netdev

On Mon, 2015-06-01 at 18:14 -0700, Laura Abbott wrote:
> Bluetooth devices off of some buses such as USB may lose power across
> suspend/resume. When this happens, drivers may need to have the setup
> function called again and behave differently than a cold power on.

Yes, but what is the point? We use reset_resume() to retain
some features of a device across a loss of power.
If power is lost, all settings are gone and all connections
are broken. So what is the difference compared to a plug out/in
cycle?

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/2] Bluetooth: Add reset_resume function
  2015-06-02  1:28                                             ` Marcel Holtmann
@ 2015-06-02 14:17                                               ` Josh Boyer
  2015-06-02 15:07                                                 ` Marcel Holtmann
  0 siblings, 1 reply; 45+ messages in thread
From: Josh Boyer @ 2015-06-02 14:17 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Laura Abbott, Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, BlueZ development, Linux Kernel Mailing List,
	USB list, netdev

On Mon, Jun 1, 2015 at 9:28 PM, Marcel Holtmann <marcel@holtmann.org> wrote:
> Hi Laura,
>
>> Bluetooth devices off of some buses such as USB may lose power across
>> suspend/resume. When this happens, drivers may need to have the setup
>> function called again and behave differently than a cold power on.
>> Add a reset_resume function for drivers to call. During the
>> reset_resume case, the flag HCI_RESET_RESUME will be set to allow
>> drivers to differentate.
>>
>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>> ---
>> This matches with what hci_reset_dev does and also ensures
>> the setup function gets called again.
>> ---
>> include/net/bluetooth/hci.h      |  1 +
>> include/net/bluetooth/hci_core.h |  1 +
>> net/bluetooth/hci_core.c         | 16 ++++++++++++++++
>> 3 files changed, 18 insertions(+)
>>
>> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
>> index d95da83..6285410 100644
>> --- a/include/net/bluetooth/hci.h
>> +++ b/include/net/bluetooth/hci.h
>> @@ -185,6 +185,7 @@ enum {
>>       HCI_RAW,
>>
>>       HCI_RESET,
>> +     HCI_RESET_RESUME,
>> };
>
> no more addition to this list of flags please. These are userspace exposed flags and with that ABI that we are never ever touching again. If you need flags on a per device basis, then use the second list.

It would be helpful for other developers if you added a comment to
that effect above the enum definition.  Otherwise you're going to wind
up repeating yourself over time.

Also, if they're exposed to userspace, should this file be using the
uapi mechanism?  I'm confused how they're exposed today, given that
they aren't installed via 'make headers_install'.  Is this manually
synced with some other .h file in a userspace package?

josh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/2] Bluetooth: Add reset_resume function
  2015-06-02 14:17                                               ` Josh Boyer
@ 2015-06-02 15:07                                                 ` Marcel Holtmann
  0 siblings, 0 replies; 45+ messages in thread
From: Marcel Holtmann @ 2015-06-02 15:07 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Laura Abbott, Alan Stern, Takashi Iwai, Oliver Neukum, Ming Lei,
	David S. Miller, Johan Hedberg, Rafael J. Wysocki,
	Gustavo F. Padovan, BlueZ development, Linux Kernel Mailing List,
	USB list, netdev

Hi Josh,

>>> Bluetooth devices off of some buses such as USB may lose power across
>>> suspend/resume. When this happens, drivers may need to have the setup
>>> function called again and behave differently than a cold power on.
>>> Add a reset_resume function for drivers to call. During the
>>> reset_resume case, the flag HCI_RESET_RESUME will be set to allow
>>> drivers to differentate.
>>> 
>>> Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
>>> ---
>>> This matches with what hci_reset_dev does and also ensures
>>> the setup function gets called again.
>>> ---
>>> include/net/bluetooth/hci.h      |  1 +
>>> include/net/bluetooth/hci_core.h |  1 +
>>> net/bluetooth/hci_core.c         | 16 ++++++++++++++++
>>> 3 files changed, 18 insertions(+)
>>> 
>>> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
>>> index d95da83..6285410 100644
>>> --- a/include/net/bluetooth/hci.h
>>> +++ b/include/net/bluetooth/hci.h
>>> @@ -185,6 +185,7 @@ enum {
>>>      HCI_RAW,
>>> 
>>>      HCI_RESET,
>>> +     HCI_RESET_RESUME,
>>> };
>> 
>> no more addition to this list of flags please. These are userspace exposed flags and with that ABI that we are never ever touching again. If you need flags on a per device basis, then use the second list.
> 
> It would be helpful for other developers if you added a comment to
> that effect above the enum definition.  Otherwise you're going to wind
> up repeating yourself over time.

nobody has done that so far ;)

> Also, if they're exposed to userspace, should this file be using the
> uapi mechanism?  I'm confused how they're exposed today, given that
> they aren't installed via 'make headers_install'.  Is this manually
> synced with some other .h file in a userspace package?

This code is from 2.4.6 and with that pretty much ancient and predates UAPI. The BlueZ userspace library provides userspace versions of these defines etc.

Regards

Marcel


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2015-06-02 15:07 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-12  0:52 [RESEND][PATCH] Bluetooth: Make request workqueue freezable Laura Abbott
2015-05-12  1:07 ` Marcel Holtmann
2015-05-12  1:46   ` Laura Abbott
2015-05-12 15:14     ` Marcel Holtmann
2015-05-13  1:18       ` Laura Abbott
2015-05-19  9:46         ` Takashi Iwai
2015-05-19 14:26           ` Alan Stern
2015-05-19 14:52             ` Oliver Neukum
2015-05-19 15:22             ` Marcel Holtmann
2015-05-19 17:17               ` Alan Stern
2015-05-19 17:13             ` Takashi Iwai
2015-05-19 17:42               ` Oliver Neukum
2015-05-20  6:29                 ` Takashi Iwai
2015-05-20  8:40                   ` Oliver Neukum
2015-05-20  9:46                     ` Marcel Holtmann
2015-05-20 12:44                       ` Takashi Iwai
2015-05-20 23:42                         ` Laura Abbott
2015-05-21  4:21                           ` Takashi Iwai
2015-05-21 12:07                             ` Marcel Holtmann
2015-05-21 12:36                               ` Takashi Iwai
2015-05-21 14:18                                 ` Alan Stern
2015-05-21 14:39                                   ` Marcel Holtmann
2015-05-21 15:26                                     ` Alan Stern
2015-05-21 15:35                                       ` Takashi Iwai
2015-05-21 17:27                                         ` Arend van Spriel
2015-05-21 17:32                                           ` Takashi Iwai
2015-05-21 20:46                                             ` Arend van Spriel
2015-05-22 11:30                                               ` Oliver Neukum
2015-05-21 17:37                                         ` Alan Stern
2015-05-21 18:11                                           ` Takashi Iwai
2015-05-21 18:17                                             ` Laura Abbott
2015-05-22  0:21                                       ` Laura Abbott
2015-05-22  3:13                                         ` Marcel Holtmann
2015-05-28  0:47                                           ` Laura Abbott
2015-06-02  1:14                                           ` [PATCH 1/2] Bluetooth: Add reset_resume function Laura Abbott
2015-06-02  1:28                                             ` Marcel Holtmann
2015-06-02 14:17                                               ` Josh Boyer
2015-06-02 15:07                                                 ` Marcel Holtmann
2015-06-02  7:47                                             ` Oliver Neukum
2015-06-02  1:14                                           ` [PATCH 2/2] Bluetooth: btusb: " Laura Abbott
2015-06-02  1:32                                             ` Marcel Holtmann
2015-05-22  7:37                                         ` [RESEND][PATCH] Bluetooth: Make request workqueue freezable Arend van Spriel
2015-05-22  7:41                                           ` Arend van Spriel
2015-05-21 15:04                                   ` Takashi Iwai
2015-05-20 10:02                     ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).