From: Kirill Buksha <kirbuk200@gmail.com>
To: Ajay.Kathat@microchip.com
Cc: Claudiu.Beznea@microchip.com, linux-wireless@vger.kernel.org,
kvalo@kernel.org, mwalle@kernel.org
Subject: Re: wilc1000 kernel crash
Date: Tue, 4 Apr 2023 18:20:21 +0200
Message-ID: <ce8001cb-af43-ade4-3f68-36fe7eb0d46f@gmail.com>
In-Reply-To: <f69b432d-f7c0-a03f-870e-c8fc0038feda@microchip.com>
On 4.4.23. 03:30, Ajay.Kathat@microchip.com wrote:
> On 4/3/23 07:24, Kirill Buksha wrote:
>> On 16.12.22. 11:18, Michael Walle wrote:
>>> Hi,
>>>
>>> On 22/12/09 02:14, Ajay.Kathat@microchip.com wrote:
>>>> No progress yet. I tried to simulate the condition a few times but was
>>>> unable to see the exact failure in my setup so I need to try more.
>>> Shouldn't it also be possible to see the issue by code reading? I've
>>> provided the call tree in my previous mail and my concerns regarding
>>> the locking. Either I'm missing something there or there is no
>>> locking between these threads which could cause this issue.
>>>
>>>> For the other "FW not responding" continuous logs, I got some clue.
>>>> Probably, will try to send that patch first.
>>> Ok, let me know if you have some patches, I'm happy to test them.
>>>
>>> -michael
>>>
>>>
>> Hello,
>>
>> I faced the same kernel oops issue. After analyzing my logs and brief
>> debugging, I agree with Michael: the problem seems to be accessing the
>> scan_result pointer after it has been nulled.
> I have submitted a patch [1] that fixes the scan_result NULL pointer
> exception issue. The patch handles the synchronization between
> mac_close() and asynchronous interrupts from firmware. Basically, it
> blocks the execution of mac_close() until all pending work items have
> completed, and no new work can be added afterward because the close is
> in progress. It is worth trying that patch once and checking its
> behavior.
>
> [1] https://lore.kernel.org/linux-wireless/20230404012010.15261-1-ajay.kathat@microchip.com/T/#u

Thank you for the patch. I will review and test it when I have time.

>> Regarding the solution: if there is a race between two threads (as
>> Michael described earlier), then I think that the locking mechanism will
>> be the most reliable solution. We ran into problems during
>> deinitialization, but the driver contains two more places
>> (handle_scan_done() and wilc_disconnect() functions in wilc1000/hif.c),
>> where scan_result is set to NULL.
>>
>> I use NetworkManager to manage networks and I have experienced the same
>> failure multiple times when switching from one WiFi network to another.
>> Keep in mind that switching between networks calls wilc_disconnect() and
>> wilc_deinit() functions and it is not yet clear which one is causing a
>> core dump. I think it's worth at least taking a look at these areas of
>> the code. What do you think?
> If possible, please share the sequence (commands) for the WiFi network
> switching scenario. It looks like both functions (mac_close() and
> wilc_disconnect()) are getting called from user context. mac_close() is
> a netdevice callback whereas wilc_disconnect() is a cfg80211 callback.
> Generally, wilc_disconnect() should be enough to disconnect from the
> current WiFi network without bringing the complete interface down. Is
> NetworkManager closing the interface (mac_close()) before switching the
> WiFi network?
>
>
> Regards,
> Ajay

The commands are as follows:

    while true; do nmcli c up wlan0-client; nmcli c up wlan0-client-2; done

It takes about 5 minutes until I see the core dump.
I see the following message after every command:

    ...
    wilc1000_sdio mmc0:0001:1 wlan0: Deinitializing wilc1000...
    ...

The message above comes from the wilc_wlan_deinitialize() function, which
is called from wilc_mac_close(). It seems that the interface is closed
between connections.

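A bounded variant of the switching loop can make the reproduction easier to log. The following is only a sketch: the helper names (count_deinit, has_oops, switch_loop) are made up for illustration, the oops patterns are assumptions that may need adjusting for a given architecture and kernel, and reading dmesg may require root.

```shell
#!/bin/sh

# Count "Deinitializing wilc1000" lines in kernel log text read from stdin.
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_deinit() {
    grep -c 'Deinitializing wilc1000' || true
}

# Succeed if the kernel log text on stdin looks like it contains an oops.
# The patterns below are assumptions, not an exhaustive list.
has_oops() {
    grep -Eq 'Oops|NULL pointer dereference|Unable to handle kernel'
}

# Switch between the two connections up to $1 times, stopping early
# when the kernel log shows an oops.
switch_loop() {
    iterations=$1
    i=0
    while [ "$i" -lt "$iterations" ]; do
        for conn in wlan0-client wlan0-client-2; do
            nmcli c up "$conn"
            if dmesg | has_oops; then
                echo "oops detected after $i full iterations"
                return 1
            fi
        done
        i=$((i + 1))
    done
    echo "deinit messages in kernel log: $(dmesg | count_deinit)"
}
```

For example, `switch_loop 100` would run up to 100 switch cycles and then report how many deinit messages the kernel log holds. A count that grows with every `nmcli c up` would suggest that the interface is closed between connections.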
Best regards,
Kirill Buksha.