* Kernel oops in __netif_schedule() for at76c50x-usb
@ 2012-07-02 15:21 Larry Finger
  2012-07-02 15:31 ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Larry Finger @ 2012-07-02 15:21 UTC
  To: Johannes Berg; +Cc: wireless

Regarding the oops that I reported on the PPC architecture, with the message
"Unable to handle kernel paging request for data at address 0x000004c", I have
now reproduced it on x86_64, where the objdump tool is better. The error occurs
in the line in __netif_schedule() that says

          if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state))

Debug printouts have shown that q is not NULL, and it appears to be in the 
correct address range. I think q->state is zero; however, q->state cannot be 
written.
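
For readers unfamiliar with the operation named above: a test-and-set of a bit is a read-modify-write, so it must store to the word as well as load from it. The following is a minimal userspace sketch of that behaviour using C11 atomics. It is not the kernel's implementation; `struct toy_qdisc`, `TOY_STATE_SCHED`, `toy_test_and_set_bit()` and `toy_schedule()` are hypothetical names invented for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queueing discipline; only a state word is modeled. */
struct toy_qdisc {
    atomic_ulong state;
};

/* Hypothetical bit number, playing the role of a "scheduled" flag. */
enum { TOY_STATE_SCHED = 0 };

/* Atomically set bit nr in *word and report whether it was already set.
 * The fetch-or always writes to *word, even if the bit was set before. */
static bool toy_test_and_set_bit(unsigned int nr, atomic_ulong *word)
{
    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_or(word, mask);
    return (old & mask) != 0;
}

/* Returns true only for the caller that flipped the flag from 0 to 1,
 * mirroring the shape of the "if (!test_and_set_bit(...))" check. */
static bool toy_schedule(struct toy_qdisc *q)
{
    return !toy_test_and_set_bit(TOY_STATE_SCHED, &q->state);
}
```

With a zero-initialised state word, the first `toy_schedule()` call returns true and the second returns false. Because the operation stores to the word, it needs the target memory to be writable, which is the property the report above says does not hold for q->state.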

Additional testing shows this problem to be another side effect of commit 
3a25a8c ("mac80211: add improved HW queue control") for a device with only a 
single HW queue.

Any suggestions for additional debugging printouts will be greatly appreciated.

Thanks,

Larry

* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-02 15:21 Kernel oops in __netif_schedule() for at76c50x-usb Larry Finger
@ 2012-07-02 15:31 ` Johannes Berg
  2012-07-02 16:12   ` Larry Finger
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2012-07-02 15:31 UTC
  To: Larry Finger; +Cc: wireless

Hi Larry,

Sorry! I had your other email still marked unread but hadn't gotten
around to it :-(

> Regarding the oops that I reported for PPC architecture that reported "Unable to 
> handle kernel paging request for data at address 0x000004c", I have now repeated 
> it on x86_64 architecture, where the objdump tool is better. The error occurs in 
> the line in __netif_schedule() that says
> 
>           if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state))
> 
> Debug printouts have shown that q is not NULL, and it appears to be in the 
> correct address range. I think q->state is zero; however, q->state cannot be 
> written.
> 
> Additional testing shows this problem to be another side effect of commit 
> 3a25a8c ("mac80211: add improved HW queue control") for a device with only a 
> single HW queue.

Looking at the code again, it seems pretty obviously wrong ... OUCH!

I'm not sure which fix is correct though. Should we have software QoS
queues for these drivers, but we'll never use them? Then this would
work:
http://p.sipsolutions.net/e015bf7db9a05887.txt

Or we could change the enable code path. Hmm.

johannes


* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-02 15:31 ` Johannes Berg
@ 2012-07-02 16:12   ` Larry Finger
  2012-07-02 17:38     ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Larry Finger @ 2012-07-02 16:12 UTC
  To: Johannes Berg; +Cc: wireless

On 07/02/2012 10:31 AM, Johannes Berg wrote:
> Hi Larry,
>
> Sorry! I had your other email still marked unread but hadn't gotten
> around to it :-(
>
>> Regarding the oops that I reported for PPC architecture that reported "Unable to
>> handle kernel paging request for data at address 0x000004c", I have now repeated
>> it on x86_64 architecture, where the objdump tool is better. The error occurs in
>> the line in __netif_schedule() that says
>>
>>            if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state))
>>
>> Debug printouts have shown that q is not NULL, and it appears to be in the
>> correct address range. I think q->state is zero; however, q->state cannot be
>> written.
>>
>> Additional testing shows this problem to be another side effect of commit
>> 3a25a8c ("mac80211: add improved HW queue control") for a device with only a
>> single HW queue.
>
> Looking at the code again, it seems pretty obviously wrong ... OUCH!
>
> I'm not sure which fix is correct though. Should we have software QoS
> queues for these drivers, but we'll never use them? Then this would
> work:
> http://p.sipsolutions.net/e015bf7db9a05887.txt
>
> Or we could change the enable code path. Hmm.

That patch does prevent the oops. I was not able to make a connection with the 
device, but I just acquired it, and I'm not sure of its quality, or that of the 
driver. It does scan OK, and I think the patch is OK. I'll do more tests with 
b43legacy later as the machine with that iface is busy. I will also test b43 on 
the PPC using the open-source firmware.

Although you may want to change the enable code path, some patch will be needed 
to prevent a regression in 3.5. If this is the one, you may add a "Tested-by" 
for me.

Larry



* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-02 16:12   ` Larry Finger
@ 2012-07-02 17:38     ` Johannes Berg
  2012-07-02 22:50       ` Larry Finger
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2012-07-02 17:38 UTC
  To: Larry Finger; +Cc: wireless


> > I'm not sure which fix is correct though. Should we have software QoS
> > queues for these drivers, but we'll never use them? Then this would
> > work:
> > http://p.sipsolutions.net/e015bf7db9a05887.txt
> >
> > Or we could change the enable code path. Hmm.
> 
> That patch does prevent the oops. I was not able to make a connection with the 
> device, but I just acquired it, and I'm not sure of its quality, or that of the 
> driver.

I don't think that device works today -- IIRC it requires the BSSID
before authentication and that wasn't possible before the auth redesign.

> It does scan OK, and I think the patch is OK. I'll do more tests with 
> b43legacy later as the machine with that iface is busy. I will also test b43 on 
> the PPC using the open-source firmware.
> 
> Although you may want to change the enable code path, some patch will be needed 
> to prevent a regression in 3.5. If this is the one, you may add a "Tested-by" 
> for me.

Thanks. Could you try this patch instead? I think it makes more sense.

http://p.sipsolutions.net/c3e9b814a409ca11.txt

johannes


* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-02 17:38     ` Johannes Berg
@ 2012-07-02 22:50       ` Larry Finger
  2012-07-04  9:46         ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Larry Finger @ 2012-07-02 22:50 UTC
  To: Johannes Berg; +Cc: wireless

On 07/02/2012 12:38 PM, Johannes Berg wrote:
>
>>> I'm not sure which fix is correct though. Should we have software QoS
>>> queues for these drivers, but we'll never use them? Then this would
>>> work:
>>> http://p.sipsolutions.net/e015bf7db9a05887.txt
>>>
>>> Or we could change the enable code path. Hmm.
>>
>> That patch does prevent the oops. I was not able to make a connection with the
>> device, but I just acquired it, and I'm not sure of its quality, or that of the
>> driver.
>
> I don't think that device works today -- IIRC it requires the BSSID
> before authentication and that wasn't possible before the auth redesign.
>
>> It does scan OK, and I think the patch is OK. I'll do more tests with
>> b43legacy later as the machine with that iface is busy. I will also test b43 on
>> the PPC using the open-source firmware.
>>
>> Although you may want to change the enable code path, some patch will be needed
>> to prevent a regression in 3.5. If this is the one, you may add a "Tested-by"
>> for me.
>
> Thanks. Could you try this patch instead? I think it makes more sense.
>
> http://p.sipsolutions.net/c3e9b814a409ca11.txt

That one fails and gives the oops in __netif_schedule.

Larry


* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-02 22:50       ` Larry Finger
@ 2012-07-04  9:46         ` Johannes Berg
  2012-07-04 10:49           ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2012-07-04  9:46 UTC
  To: Larry Finger; +Cc: wireless

On Mon, 2012-07-02 at 17:50 -0500, Larry Finger wrote:
> On 07/02/2012 12:38 PM, Johannes Berg wrote:
> >
> >>> I'm not sure which fix is correct though. Should we have software QoS
> >>> queues for these drivers, but we'll never use them? Then this would
> >>> work:
> >>> http://p.sipsolutions.net/e015bf7db9a05887.txt
> >>>
> >>> Or we could change the enable code path. Hmm.
> >>
> >> That patch does prevent the oops. I was not able to make a connection with the
> >> device, but I just acquired it, and I'm not sure of its quality, or that of the
> >> driver.
> >
> > I don't think that device works today -- IIRC it requires the BSSID
> > before authentication and that wasn't possible before the auth redesign.
> >
> >> It does scan OK, and I think the patch is OK. I'll do more tests with
> >> b43legacy later as the machine with that iface is busy. I will also test b43 on
> >> the PPC using the open-source firmware.
> >>
> >> Although you may want to change the enable code path, some patch will be needed
> >> to prevent a regression in 3.5. If this is the one, you may add a "Tested-by"
> >> for me.
> >
> > Thanks. Could you try this patch instead? I think it makes more sense.
> >
> > http://p.sipsolutions.net/c3e9b814a409ca11.txt
> 
> That one fails and gives the oops in __netif_schedule.

Hmmm, that's odd. I'll try to reproduce this to be able to track it
better.

johannes


* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-04  9:46         ` Johannes Berg
@ 2012-07-04 10:49           ` Johannes Berg
  2012-07-04 10:54             ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2012-07-04 10:49 UTC
  To: Larry Finger; +Cc: wireless

On Wed, 2012-07-04 at 11:46 +0200, Johannes Berg wrote:

> > >> Although you may want to change the enable code path, some patch will be needed
> > >> to prevent a regression in 3.5. If this is the one, you may add a "Tested-by"
> > >> for me.
> > >
> > > Thanks. Could you try this patch instead? I think it makes more sense.
> > >
> > > http://p.sipsolutions.net/c3e9b814a409ca11.txt
> > 
> > That one fails and gives the oops in __netif_schedule.
> 
> Hmmm, that's odd. I'll try to reproduce this to be able to track it
> better.

Ok, strange. I can reproduce the original problem easily with hwsim, but
with this new patch, which should be equivalent to the old, it's fixed:

http://p.sipsolutions.net/d78d8740ad2d15b4.txt

Can you try just this patch?

johannes


* Re: Kernel oops in __netif_schedule() for at76c50x-usb
  2012-07-04 10:49           ` Johannes Berg
@ 2012-07-04 10:54             ` Johannes Berg
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Berg @ 2012-07-04 10:54 UTC
  To: Larry Finger; +Cc: wireless

On Wed, 2012-07-04 at 12:49 +0200, Johannes Berg wrote:
> On Wed, 2012-07-04 at 11:46 +0200, Johannes Berg wrote:
> 
> > > >> Although you may want to change the enable code path, some patch will be needed
> > > >> to prevent a regression in 3.5. If this is the one, you may add a "Tested-by"
> > > >> for me.
> > > >
> > > > Thanks. Could you try this patch instead? I think it makes more sense.
> > > >
> > > > http://p.sipsolutions.net/c3e9b814a409ca11.txt
> > > 
> > > That one fails and gives the oops in __netif_schedule.
> > 
> > Hmmm, that's odd. I'll try to reproduce this to be able to track it
> > better.
> 
> Ok, strange. I can reproduce the original problem easily with hwsim, but
> with this new patch, which should be equivalent to the old, it's fixed:
> 
> http://p.sipsolutions.net/d78d8740ad2d15b4.txt

Ok, actually, the same bug is present in the stop path, but for some reason
that doesn't crash for me.

I've posted this patch now:
http://p.sipsolutions.net/55032a5ae0520dd8.txt

johannes

