* Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
@ 2018-05-15 12:43 Oliver Neukum
2018-05-16 3:37 ` Hayes Wang
0 siblings, 1 reply; 9+ messages in thread
From: Oliver Neukum @ 2018-05-15 12:43 UTC (permalink / raw)
To: hayeswang; +Cc: netdev
Hi,
I got reports about hangs with this trace:
May 13 01:36:55 neroon kernel: INFO: task kworker/0:0:4 blocked for more than 60 seconds.
May 13 01:36:55 neroon kernel: Tainted: G U 4.17.0-rc4-1.g8257a00-vanilla #1
May 13 01:36:55 neroon kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 01:36:55 neroon kernel: kworker/0:0 D 0 4 2 0x80000000
May 13 01:36:55 neroon kernel: Workqueue: events rtl_work_func_t [r8152]
May 13 01:36:55 neroon kernel: Call Trace:
May 13 01:36:55 neroon kernel: ? __schedule+0x289/0x880
May 13 01:36:55 neroon kernel: schedule+0x2f/0x90
May 13 01:36:55 neroon kernel: rpm_resume+0xf9/0x7a0
May 13 01:36:55 neroon kernel: ? wait_woken+0x80/0x80
May 13 01:36:55 neroon kernel: rpm_resume+0x547/0x7a0
May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
May 13 01:36:55 neroon kernel: __pm_runtime_resume+0x3a/0x50
May 13 01:36:55 neroon kernel: usb_autopm_get_interface+0x1d/0x50 [usbcore]
May 13 01:36:55 neroon kernel: rtl_work_func_t+0x3c/0x27c [r8152]
May 13 01:36:55 neroon kernel: process_one_work+0x1d4/0x3f0
May 13 01:36:55 neroon kernel: worker_thread+0x2b/0x3d0
May 13 01:36:55 neroon kernel: ? process_one_work+0x3f0/0x3f0
May 13 01:36:55 neroon kernel: kthread+0x113/0x130
May 13 01:36:55 neroon kernel: ? kthread_create_worker_on_cpu+0x50/0x50
May 13 01:36:55 neroon kernel: ret_from_fork+0x3a/0x50
Any idea?
Regards
Oliver
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-15 12:43 Hangs in r8152 connected to power management in kernels at least up v4.17-rc4 Oliver Neukum
@ 2018-05-16 3:37 ` Hayes Wang
2018-05-16 8:26 ` Oliver Neukum
0 siblings, 1 reply; 9+ messages in thread
From: Hayes Wang @ 2018-05-16 3:37 UTC (permalink / raw)
To: Oliver Neukum; +Cc: netdev
Oliver Neukum [mailto:oneukum@suse.com]
>
> Hi,
>
> I got reports about hangs with this trace:
>
> May 13 01:36:55 neroon kernel: INFO: task kworker/0:0:4 blocked for more
> than 60 seconds.
> May 13 01:36:55 neroon kernel: Tainted: G U
> 4.17.0-rc4-1.g8257a00-vanilla #1
> May 13 01:36:55 neroon kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> May 13 01:36:55 neroon kernel: kworker/0:0 D 0 4 2
> 0x80000000
> May 13 01:36:55 neroon kernel: Workqueue: events rtl_work_func_t [r8152]
> May 13 01:36:55 neroon kernel: Call Trace:
> May 13 01:36:55 neroon kernel: ? __schedule+0x289/0x880
> May 13 01:36:55 neroon kernel: schedule+0x2f/0x90
> May 13 01:36:55 neroon kernel: rpm_resume+0xf9/0x7a0
> May 13 01:36:55 neroon kernel: ? wait_woken+0x80/0x80
> May 13 01:36:55 neroon kernel: rpm_resume+0x547/0x7a0
> May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
> May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
> May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> May 13 01:36:55 neroon kernel: __pm_runtime_resume+0x3a/0x50
> May 13 01:36:55 neroon kernel: usb_autopm_get_interface+0x1d/0x50 [usbcore]
Would usb_autopm_get_interface() take a long time?
The driver would wake the device if it has suspended.
I have no idea about how usb_autopm_get_interface() works, so I don't know how to help.
> May 13 01:36:55 neroon kernel: rtl_work_func_t+0x3c/0x27c [r8152]
> May 13 01:36:55 neroon kernel: process_one_work+0x1d4/0x3f0
> May 13 01:36:55 neroon kernel: worker_thread+0x2b/0x3d0
> May 13 01:36:55 neroon kernel: ? process_one_work+0x3f0/0x3f0
> May 13 01:36:55 neroon kernel: kthread+0x113/0x130
> May 13 01:36:55 neroon kernel: ?
> kthread_create_worker_on_cpu+0x50/0x50
> May 13 01:36:55 neroon kernel: ret_from_fork+0x3a/0x50
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 3:37 ` Hayes Wang
@ 2018-05-16 8:26 ` Oliver Neukum
2018-05-16 10:00 ` Hayes Wang
0 siblings, 1 reply; 9+ messages in thread
From: Oliver Neukum @ 2018-05-16 8:26 UTC (permalink / raw)
To: Hayes Wang; +Cc: netdev
Am Mittwoch, den 16.05.2018, 03:37 +0000 schrieb Hayes Wang:
> Oliver Neukum [mailto:oneukum@suse.com]
> >
> > Hi,
> >
> > I got reports about hangs with this trace:
> >
> > May 13 01:36:55 neroon kernel: INFO: task kworker/0:0:4 blocked for more
> > than 60 seconds.
> > May 13 01:36:55 neroon kernel: Tainted: G U
> > 4.17.0-rc4-1.g8257a00-vanilla #1
> > May 13 01:36:55 neroon kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > May 13 01:36:55 neroon kernel: kworker/0:0 D 0 4 2
> > 0x80000000
> > May 13 01:36:55 neroon kernel: Workqueue: events rtl_work_func_t [r8152]
> > May 13 01:36:55 neroon kernel: Call Trace:
> > May 13 01:36:55 neroon kernel: ? __schedule+0x289/0x880
> > May 13 01:36:55 neroon kernel: schedule+0x2f/0x90
> > May 13 01:36:55 neroon kernel: rpm_resume+0xf9/0x7a0
> > May 13 01:36:55 neroon kernel: ? wait_woken+0x80/0x80
> > May 13 01:36:55 neroon kernel: rpm_resume+0x547/0x7a0
> > May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> > May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
> > May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> > May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x34/0x70
> > May 13 01:36:55 neroon kernel: ? __switch_to_asm+0x40/0x70
> > May 13 01:36:55 neroon kernel: __pm_runtime_resume+0x3a/0x50
> > May 13 01:36:55 neroon kernel: usb_autopm_get_interface+0x1d/0x50 [usbcore]
>
> Would usb_autopm_get_interface() take a long time?
> The driver would wake the device if it has suspended.
> I have no idea about how usb_autopm_get_interface() works, so I don't know how to help.
Hi,
it basically calls r8152_resume() and makes a control request to the
hub. I think we are spinning in rtl8152_runtime_resume(), but where?
It has a lot of NAPI stuff. Any suggestions on how to instrument or
trace this?
Regards
Oliver
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 8:26 ` Oliver Neukum
@ 2018-05-16 10:00 ` Hayes Wang
2018-05-16 10:09 ` Oliver Neukum
0 siblings, 1 reply; 9+ messages in thread
From: Hayes Wang @ 2018-05-16 10:00 UTC (permalink / raw)
To: Oliver Neukum; +Cc: netdev
Oliver Neukum [mailto:oneukum@suse.com]
> Sent: Wednesday, May 16, 2018 4:27 PM
[...]
> >
> > Would usb_autopm_get_interface() take a long time?
> > The driver would wake the device if it has suspended.
> > I have no idea about how usb_autopm_get_interface() works, so I don't know
> how to help.
>
> Hi,
>
> it basically calls r8152_resume() and makes a control request to the
> hub. I think we are spinning in rtl8152_runtime_resume(), but where?
> It has a lot of NAPI stuff. Any suggestions on how to instrument or
> trace this?
Is rtl8152_runtime_resume() called? I don't see the name in the trace.
I guess the relative API in rtl8152_runtime_resume() are
ops->disable = rtl8153_disable;
ops->autosuspend_en = rtl8153_runtime_enable;
And I don't find any possible dead lock in rtl8152_runtime_resume().
Besides, I find a similar issue as following.
https://www.spinics.net/lists/netdev/msg493512.html
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 10:00 ` Hayes Wang
@ 2018-05-16 10:09 ` Oliver Neukum
2018-05-16 12:07 ` Hayes Wang
0 siblings, 1 reply; 9+ messages in thread
From: Oliver Neukum @ 2018-05-16 10:09 UTC (permalink / raw)
To: Hayes Wang; +Cc: netdev, jslaby
Am Mittwoch, den 16.05.2018, 10:00 +0000 schrieb Hayes Wang:
> Oliver Neukum [mailto:oneukum@suse.com]
> > Sent: Wednesday, May 16, 2018 4:27 PM
>
> [...]
> > >
> > > Would usb_autopm_get_interface() take a long time?
> > > The driver would wake the device if it has suspended.
> > > I have no idea about how usb_autopm_get_interface() works, so I don't know
> >
> > how to help.
> >
> > Hi,
> >
> > it basically calls r8152_resume() and makes a control request to the
> > hub. I think we are spinning in rtl8152_runtime_resume(), but where?
> > It has a lot of NAPI stuff. Any suggestions on how to instrument or
> > trace this?
>
> Is rtl8152_runtime_resume() called? I don't see the name in the trace.
Good question. I see nothing else that could produce a live lock.
>
> I guess the relative API in rtl8152_runtime_resume() are
> ops->disable = rtl8153_disable;
> ops->autosuspend_en = rtl8153_runtime_enable;
>
> And I don't find any possible dead lock in rtl8152_runtime_resume().
>
> Besides, I find a similar issue as following.
> https://www.spinics.net/lists/netdev/msg493512.html
Well, if we have an imbalance in NAPI it should strike whereever
it is used, not just in suspend(). Is there debugging for NAPI
we could activate?
Regards
Oliver
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 10:09 ` Oliver Neukum
@ 2018-05-16 12:07 ` Hayes Wang
2018-05-16 13:29 ` Jiri Slaby
0 siblings, 1 reply; 9+ messages in thread
From: Hayes Wang @ 2018-05-16 12:07 UTC (permalink / raw)
To: Oliver Neukum; +Cc: netdev, jslaby
Oliver Neukum [mailto:oneukum@suse.com]
> Sent: Wednesday, May 16, 2018 6:10 PM
[...]
> > Besides, I find a similar issue as following.
> > https://www.spinics.net/lists/netdev/msg493512.html
>
> Well, if we have an imbalance in NAPI it should strike whereever
> it is used, not just in suspend(). Is there debugging for NAPI
> we could activate?
No. The driver doesn't embed such debugging about it.
And I don't find an imbalance between napi_disable() and napi_enable().
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 12:07 ` Hayes Wang
@ 2018-05-16 13:29 ` Jiri Slaby
2018-05-16 13:36 ` Jiri Slaby
0 siblings, 1 reply; 9+ messages in thread
From: Jiri Slaby @ 2018-05-16 13:29 UTC (permalink / raw)
To: Hayes Wang, Oliver Neukum; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 1143 bytes --]
On 05/16/2018, 02:07 PM, Hayes Wang wrote:
> Oliver Neukum [mailto:oneukum@suse.com]
>> Sent: Wednesday, May 16, 2018 6:10 PM
> [...]
>>> Besides, I find a similar issue as following.
>>> https://www.spinics.net/lists/netdev/msg493512.html
>>
>> Well, if we have an imbalance in NAPI it should strike whereever
>> it is used, not just in suspend(). Is there debugging for NAPI
>> we could activate?
>
> No. The driver doesn't embed such debugging about it.
Despite of that, Oliver, I have a kernel with a debug patch (attached)
at (suse-only link):
https://build.suse.de/project/show/home:jirislaby:stable-drm
> And I don't find an imbalance between napi_disable() and napi_enable().
There is none, apparently (the warns never triggered). BUt still the
driver sucks wrt both power mgmt and dock plug/unplug. Since I am using
the patch (it upper-bounds the napi_disable loop count) and the udev
rule below, I can really use the nic.
$ cat /etc/udev/rules.d/10-disable-r8152-pm.rules
ACTION=="add", SUBSYSTEM=="usb", ATTR{idProduct}=="8153",
ATTR{idVendor}=="0bda", TEST=="power/control", ATTR{power/control}="on"
thanks,
--
js
suse labs
[-- Attachment #2: r8152.patch --]
[-- Type: text/x-patch, Size: 5467 bytes --]
---
drivers/net/usb/r8152.c | 62 +++++++++++++++++++++++++++++++-----------------
net/core/dev.c | 14 +++++++++-
2 files changed, 53 insertions(+), 23 deletions(-)
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -704,6 +704,8 @@ struct r8152 {
unsigned long flags;
struct usb_device *udev;
struct napi_struct napi;
+ int napi_stat;
+ void *napi_last_en, *napi_last_dis;
struct usb_interface *intf;
struct net_device *netdev;
struct urb *intr_urb;
@@ -775,6 +777,31 @@ static unsigned int agg_buf_sz = 16384;
#define RTL_LIMITED_TSO_SIZE (agg_buf_sz - sizeof(struct tx_desc) - \
VLAN_ETH_HLEN - ETH_FCS_LEN)
+static void my_napi_enable(struct r8152 *tp)
+{
+ if (tp->napi_stat == 0) {
+ napi_enable(&tp->napi);
+ tp->napi_stat++;
+ tp->napi_last_en = (void *)_RET_IP_;
+ return;
+ }
+
+ WARN(1, "napi_stat=%d\n", tp->napi_stat);
+}
+
+static void my_napi_disable(struct r8152 *tp)
+{
+ if (tp->napi_stat == 1) {
+ napi_disable(&tp->napi);
+ tp->napi_stat--;
+ tp->napi_last_dis = (void *)_RET_IP_;
+ return;
+ }
+
+ WARN(1, "napi_stat=%d last_dis=%pF last_en=%pF\n",
+ tp->napi_stat, tp->napi_last_en, tp->napi_last_dis);
+}
+
static
int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data)
{
@@ -3787,7 +3814,6 @@ static bool rtl8153_in_nway(struct r8152
static void set_carrier(struct r8152 *tp)
{
struct net_device *netdev = tp->netdev;
- struct napi_struct *napi = &tp->napi;
u8 speed;
speed = rtl8152_get_speed(tp);
@@ -3796,12 +3822,12 @@ static void set_carrier(struct r8152 *tp
if (!netif_carrier_ok(netdev)) {
tp->rtl_ops.enable(tp);
netif_stop_queue(netdev);
- napi_disable(napi);
+ my_napi_disable(tp);
netif_carrier_on(netdev);
rtl_start_rx(tp);
clear_bit(RTL8152_SET_RX_MODE, &tp->flags);
_rtl8152_set_rx_mode(netdev);
- napi_enable(&tp->napi);
+ my_napi_enable(tp);
netif_wake_queue(netdev);
netif_info(tp, link, netdev, "carrier on\n");
} else if (netif_queue_stopped(netdev) &&
@@ -3811,9 +3837,9 @@ static void set_carrier(struct r8152 *tp
} else {
if (netif_carrier_ok(netdev)) {
netif_carrier_off(netdev);
- napi_disable(napi);
+ my_napi_disable(tp);
tp->rtl_ops.disable(tp);
- napi_enable(napi);
+ my_napi_enable(tp);
netif_info(tp, link, netdev, "carrier off\n");
}
}
@@ -3934,7 +3960,7 @@ static int rtl8152_open(struct net_devic
res);
goto out_unlock;
}
- napi_enable(&tp->napi);
+ my_napi_enable(tp);
mutex_unlock(&tp->control);
@@ -3962,7 +3988,7 @@ static int rtl8152_close(struct net_devi
#ifdef CONFIG_PM_SLEEP
unregister_pm_notifier(&tp->pm_notifier);
#endif
- napi_disable(&tp->napi);
+ my_napi_disable(tp);
clear_bit(WORK_ENABLE, &tp->flags);
usb_kill_urb(tp->intr_urb);
cancel_delayed_work_sync(&tp->schedule);
@@ -4230,7 +4256,7 @@ static int rtl8152_pre_reset(struct usb_
return 0;
netif_stop_queue(netdev);
- napi_disable(&tp->napi);
+ my_napi_disable(tp);
clear_bit(WORK_ENABLE, &tp->flags);
usb_kill_urb(tp->intr_urb);
cancel_delayed_work_sync(&tp->schedule);
@@ -4264,7 +4290,7 @@ static int rtl8152_post_reset(struct usb
mutex_unlock(&tp->control);
}
- napi_enable(&tp->napi);
+ my_napi_enable(tp);
netif_wake_queue(netdev);
usb_submit_urb(tp->intr_urb, GFP_KERNEL);
@@ -4302,10 +4328,8 @@ static int rtl8152_runtime_resume(struct
struct net_device *netdev = tp->netdev;
if (netif_running(netdev) && netdev->flags & IFF_UP) {
- struct napi_struct *napi = &tp->napi;
-
tp->rtl_ops.autosuspend_en(tp, false);
- napi_disable(napi);
+ my_napi_disable(tp);
set_bit(WORK_ENABLE, &tp->flags);
if (netif_carrier_ok(netdev)) {
@@ -4318,7 +4342,7 @@ static int rtl8152_runtime_resume(struct
}
}
- napi_enable(napi);
+ my_napi_enable(tp);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
smp_mb__after_atomic();
@@ -4388,13 +4412,11 @@ static int rtl8152_runtime_suspend(struc
tp->rtl_ops.autosuspend_en(tp, true);
if (netif_carrier_ok(netdev)) {
- struct napi_struct *napi = &tp->napi;
-
- napi_disable(napi);
+ my_napi_disable(tp);
rtl_stop_rx(tp);
rxdy_gated_en(tp, false);
ocp_write_dword(tp, MCU_TYPE_PLA, PLA_RCR, rcr);
- napi_enable(napi);
+ my_napi_enable(tp);
}
if (delay_autosuspend(tp)) {
@@ -4415,14 +4437,12 @@ static int rtl8152_system_suspend(struct
netif_device_detach(netdev);
if (netif_running(netdev) && test_bit(WORK_ENABLE, &tp->flags)) {
- struct napi_struct *napi = &tp->napi;
-
clear_bit(WORK_ENABLE, &tp->flags);
usb_kill_urb(tp->intr_urb);
- napi_disable(napi);
+ my_napi_disable(tp);
cancel_delayed_work_sync(&tp->schedule);
tp->rtl_ops.down(tp);
- napi_enable(napi);
+ my_napi_enable(tp);
}
return ret;
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5654,13 +5654,23 @@ EXPORT_SYMBOL(netif_napi_add);
void napi_disable(struct napi_struct *n)
{
+ int a;
+
might_sleep();
set_bit(NAPI_STATE_DISABLE, &n->state);
- while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
+ a = 0;
+ while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) {
msleep(1);
- while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
+ if (WARN(a++ > 20 * 1000, "NAPI_STATE_SCHED never cleared"))
+ break;
+ }
+ a = 0;
+ while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) {
msleep(1);
+ if (WARN(a++ > 20 * 1000, "NAPI_STATE_NPSVC never cleared"))
+ break;
+ }
hrtimer_cancel(&n->timer);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 13:29 ` Jiri Slaby
@ 2018-05-16 13:36 ` Jiri Slaby
2018-05-25 8:41 ` Jiri Slaby
0 siblings, 1 reply; 9+ messages in thread
From: Jiri Slaby @ 2018-05-16 13:36 UTC (permalink / raw)
To: Hayes Wang, Oliver Neukum; +Cc: netdev
On 05/16/2018, 03:29 PM, Jiri Slaby wrote:
> On 05/16/2018, 02:07 PM, Hayes Wang wrote:
>> Oliver Neukum [mailto:oneukum@suse.com]
>>> Sent: Wednesday, May 16, 2018 6:10 PM
>> [...]
>>>> Besides, I find a similar issue as following.
>>>> https://www.spinics.net/lists/netdev/msg493512.html
>>>
>>> Well, if we have an imbalance in NAPI it should strike whereever
>>> it is used, not just in suspend(). Is there debugging for NAPI
>>> we could activate?
>>
>> No. The driver doesn't embed such debugging about it.
>
> Despite of that, Oliver, I have a kernel with a debug patch (attached)
> at (suse-only link):
> https://build.suse.de/project/show/home:jirislaby:stable-drm
>
>> And I don't find an imbalance between napi_disable() and napi_enable().
>
> There is none, apparently (the warns never triggered). BUt still the
> driver sucks wrt both power mgmt and dock plug/unplug. Since I am using
> the patch (it upper-bounds the napi_disable loop count) and the udev
> rule below, I can really use the nic.
BTW the added warning to napi_disable indeed triggers:
> xzgrep -a -B 2 kernel:.*WARNING.*napi messages-20180*
> messages-20180503.xz:2018-04-27T09:57:00.048922+02:00 anemoi2 kernel: [158616.363052] ------------[ cut here ]------------
> messages-20180503.xz:2018-04-27T09:57:00.048979+02:00 anemoi2 kernel: [158616.363070] NAPI_STATE_SCHED never cleared
> messages-20180503.xz:2018-04-27T09:57:00.048988+02:00 anemoi2 kernel: [158616.363120] WARNING: CPU: 1 PID: 14365 at ../net/core/dev.c:5665 napi_disable+0x3d/0x80
And since I do 'ip l set dev ethX down' before unplugging the dock with
the NIC, I have not seen a single occurrence.
So I assume it must be a problem of making usb->disconnect without prior
ndo->close (or alike).
thanks,
--
js
suse labs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hangs in r8152 connected to power management in kernels at least up v4.17-rc4
2018-05-16 13:36 ` Jiri Slaby
@ 2018-05-25 8:41 ` Jiri Slaby
0 siblings, 0 replies; 9+ messages in thread
From: Jiri Slaby @ 2018-05-25 8:41 UTC (permalink / raw)
To: Hayes Wang, Oliver Neukum; +Cc: netdev
On 05/16/2018, 03:36 PM, Jiri Slaby wrote:
> So I assume it must be a problem of making usb->disconnect without prior
> ndo->close (or alike).
So according to my debug messages, I think this should workaround the
problem:
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3962,7 +3962,8 @@ static int rtl8152_close(struct net_device *netdev)
#ifdef CONFIG_PM_SLEEP
unregister_pm_notifier(&tp->pm_notifier);
#endif
- napi_disable(&tp->napi);
+ if (!test_bit(RTL8152_UNPLUG, &tp->flags))
+ napi_disable(&tp->napi);
clear_bit(WORK_ENABLE, &tp->flags);
usb_kill_urb(tp->intr_urb);
cancel_delayed_work_sync(&tp->schedule);
napi is deleted in usb->disconnect, then unregister_netdev is called
which invokes netdev->ndo_stop, i.e. rtl8152_close above. And
rtl8152_close tries to napi_disable, but that is already deleted.
The patch does not solve the race between disconnect and ndo_stop
AFAICS. It needs locking, IMO. I am not familiar enough with the code,
but it looks like ->disconnect can happen any time while ->stop is in
progress.
thanks,
--
js
suse labs
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-05-25 8:41 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-15 12:43 Hangs in r8152 connected to power management in kernels at least up v4.17-rc4 Oliver Neukum
2018-05-16 3:37 ` Hayes Wang
2018-05-16 8:26 ` Oliver Neukum
2018-05-16 10:00 ` Hayes Wang
2018-05-16 10:09 ` Oliver Neukum
2018-05-16 12:07 ` Hayes Wang
2018-05-16 13:29 ` Jiri Slaby
2018-05-16 13:36 ` Jiri Slaby
2018-05-25 8:41 ` Jiri Slaby
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.