All of lore.kernel.org
 help / color / mirror / Atom feed
* mwifiex crash when removing interface while scanning
@ 2013-02-19 18:02 Daniel Drake
  2013-02-19 23:31 ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-02-19 18:02 UTC (permalink / raw)
  To: linux-wireless, Bing Zhao

Hi,

Testing mwifiex_sdio on XO-4 (8787) on Linux 3.8. In OLPC's use case
we quite regularly will power down the wireless card when it is not
being actively used. This is currently proving problematic.

Here is a test case reproducing one of the failures we are seeing:

First, run this script:

insmod mwifiex_sdio.ko
sleep 1
ifconfig eth0 up
iwlist eth0 scan &
sleep 0.5
rmmod mwifiex_sdio

It normally runs fine the first time, no errors, but run it again and
it terminates with these errors:

    mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
    eth0      Failed to read scan data : No such device

>From this point, things are pretty screwed up. Try to load the driver again...

insmod mwifiex_sdio.ko
mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
mwifiex_sdio mmc0:0001:1: WLAN FW is active
mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func: Timeout cmd id
(331.391420) = 0xa9, act = 0x0
mwifiex_sdio mmc0:0001:1: num_data_h2c_failure = 0
mwifiex_sdio mmc0:0001:1: num_cmd_h2c_failure = 0
mwifiex_sdio mmc0:0001:1: num_cmd_timeout = 1
mwifiex_sdio mmc0:0001:1: num_tx_timeout = 0
mwifiex_sdio mmc0:0001:1: last_cmd_index = 1
mwifiex_sdio mmc0:0001:1: last_cmd_id: 00 00 a9 00 00 00 00 00 00 00
mwifiex_sdio mmc0:0001:1: last_cmd_act: 00 00 00 00 00 00 00 00 00 00
mwifiex_sdio mmc0:0001:1: last_cmd_resp_index = 0
mwifiex_sdio mmc0:0001:1: last_cmd_resp_id: 00 00 00 00 00 00 00 00 00 00
mwifiex_sdio mmc0:0001:1: last_event_index = 0
mwifiex_sdio mmc0:0001:1: last_event: 00 00 00 00 00 00 00 00 00 00
mwifiex_sdio mmc0:0001:1: data_sent=1 cmd_sent=1
mwifiex_sdio mmc0:0001:1: ps_mode=0 ps_state=0
mwifiex_sdio mmc0:0001:1: cmd timeout
mwifiex_sdio: Resetting card...
WARNING: driver mwifiex_sdio did not remove its interrupt handler!
mmc0: card 0001 removed

and it doesn't come back.

Let me know how we can help further debugging.

Thanks
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-02-19 18:02 mwifiex crash when removing interface while scanning Daniel Drake
@ 2013-02-19 23:31 ` Bing Zhao
  2013-02-20  5:06   ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-02-19 23:31 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless

Hi Daniel,

> Hi,
> 
> Testing mwifiex_sdio on XO-4 (8787) on Linux 3.8. In OLPC's use case
> we quite regularly will power down the wireless card when it is not
> being actively used. This is currently proving problematic.

I have an XO-4. Perhaps I can update to kernel 3.8?

> 
> Here is a test case reproducing one of the failures we are seeing:
> 
> First, run this script:
> 
> insmod mwifiex_sdio.ko
> sleep 1
> ifconfig eth0 up
> iwlist eth0 scan &
> sleep 0.5
> rmmod mwifiex_sdio
> 
> It normally runs fine the first time, no errors, but run it again and
> it terminates with these errors:
> 
>     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
>     eth0      Failed to read scan data : No such device
> 
> From this point, things are pretty screwed up. Try to load the driver again...
> 
> insmod mwifiex_sdio.ko
> mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
> mwifiex_sdio mmc0:0001:1: WLAN FW is active
> mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func: Timeout cmd id
> (331.391420) = 0xa9, act = 0x0
> mwifiex_sdio mmc0:0001:1: num_data_h2c_failure = 0
> mwifiex_sdio mmc0:0001:1: num_cmd_h2c_failure = 0
> mwifiex_sdio mmc0:0001:1: num_cmd_timeout = 1
> mwifiex_sdio mmc0:0001:1: num_tx_timeout = 0
> mwifiex_sdio mmc0:0001:1: last_cmd_index = 1
> mwifiex_sdio mmc0:0001:1: last_cmd_id: 00 00 a9 00 00 00 00 00 00 00
> mwifiex_sdio mmc0:0001:1: last_cmd_act: 00 00 00 00 00 00 00 00 00 00
> mwifiex_sdio mmc0:0001:1: last_cmd_resp_index = 0
> mwifiex_sdio mmc0:0001:1: last_cmd_resp_id: 00 00 00 00 00 00 00 00 00 00
> mwifiex_sdio mmc0:0001:1: last_event_index = 0
> mwifiex_sdio mmc0:0001:1: last_event: 00 00 00 00 00 00 00 00 00 00
> mwifiex_sdio mmc0:0001:1: data_sent=1 cmd_sent=1
> mwifiex_sdio mmc0:0001:1: ps_mode=0 ps_state=0
> mwifiex_sdio mmc0:0001:1: cmd timeout
> mwifiex_sdio: Resetting card...
> WARNING: driver mwifiex_sdio did not remove its interrupt handler!
> mmc0: card 0001 removed
> 
> and it doesn't come back.
> 
> Let me know how we can help further debugging.

I will try this test with my SD8787 card and get back to you later.

Thanks,
Bing

> 
> Thanks
> Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-02-19 23:31 ` Bing Zhao
@ 2013-02-20  5:06   ` Bing Zhao
  2013-02-20 17:47     ` Daniel Drake
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-02-20  5:06 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless

Hi Daniel,

> > insmod mwifiex_sdio.ko
> > sleep 1
> > ifconfig eth0 up
> > iwlist eth0 scan &
> > sleep 0.5
> > rmmod mwifiex_sdio
> >
> > It normally runs fine the first time, no errors, but run it again and
> > it terminates with these errors:
> >
> >     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
> >     eth0      Failed to read scan data : No such device

After this error message, do you see the sd8787 card is removed from the bus?

"mmc0: card 0001 removed"

> >
> > From this point, things are pretty screwed up. Try to load the driver again...
> >
> > insmod mwifiex_sdio.ko
> > mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
> > mwifiex_sdio mmc0:0001:1: WLAN FW is active
> > mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func: Timeout cmd id
> > (331.391420) = 0xa9, act = 0x0
> > mwifiex_sdio mmc0:0001:1: num_data_h2c_failure = 0
> > mwifiex_sdio mmc0:0001:1: num_cmd_h2c_failure = 0
> > mwifiex_sdio mmc0:0001:1: num_cmd_timeout = 1
> > mwifiex_sdio mmc0:0001:1: num_tx_timeout = 0
> > mwifiex_sdio mmc0:0001:1: last_cmd_index = 1
> > mwifiex_sdio mmc0:0001:1: last_cmd_id: 00 00 a9 00 00 00 00 00 00 00
> > mwifiex_sdio mmc0:0001:1: last_cmd_act: 00 00 00 00 00 00 00 00 00 00
> > mwifiex_sdio mmc0:0001:1: last_cmd_resp_index = 0
> > mwifiex_sdio mmc0:0001:1: last_cmd_resp_id: 00 00 00 00 00 00 00 00 00 00
> > mwifiex_sdio mmc0:0001:1: last_event_index = 0
> > mwifiex_sdio mmc0:0001:1: last_event: 00 00 00 00 00 00 00 00 00 00
> > mwifiex_sdio mmc0:0001:1: data_sent=1 cmd_sent=1
> > mwifiex_sdio mmc0:0001:1: ps_mode=0 ps_state=0
> > mwifiex_sdio mmc0:0001:1: cmd timeout
> > mwifiex_sdio: Resetting card...
> > WARNING: driver mwifiex_sdio did not remove its interrupt handler!
> > mmc0: card 0001 removed
> >
> > and it doesn't come back.
> >
> > Let me know how we can help further debugging.
> 
> I will try this test with my SD8787 card and get back to you later.

My card works fine with a Ricoh controller on x86_64 machine. Of course the power is not cut when I remove the mwifiex_sdio module.

I can upgrade my XO-4 to kernel 3.8 if feasible. Please let me know if a HOW-TO doc is available.

Thanks,
Bing


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-02-20  5:06   ` Bing Zhao
@ 2013-02-20 17:47     ` Daniel Drake
  2013-02-21  0:00       ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-02-20 17:47 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless

On Tue, Feb 19, 2013 at 11:06 PM, Bing Zhao <bzhao@marvell.com> wrote:
>> >     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
>> >     eth0      Failed to read scan data : No such device
>
> After this error message, do you see the sd8787 card is removed from the bus?
>
> "mmc0: card 0001 removed"

No.

> My card works fine with a Ricoh controller on x86_64 machine. Of course the power is not cut when I remove the mwifiex_sdio module.

Under this kernel, I'm not sure if the power is or isn't being cut at
this point. I don't think this is relevant though; regular
insmod/rmmod cycles work fine, the problem only appears when tearing
down the interface while a scan is in progress.

> I can upgrade my XO-4 to kernel 3.8 if feasible. Please let me know if a HOW-TO doc is available.

You will need a serial console connected, as there is no output on the
LCD at this time when running the upstream kernel.

http://wiki.laptop.org/go/User:DanielDrake/Run_upstream_kernel_on_XO-4

Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-02-20 17:47     ` Daniel Drake
@ 2013-02-21  0:00       ` Bing Zhao
  2013-02-21 14:20         ` Daniel Drake
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-02-21  0:00 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless

Hi Daniel,

> >> >     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
> >> >     eth0      Failed to read scan data : No such device
> >
> > After this error message, do you see the sd8787 card is removed from the bus?
> >
> > "mmc0: card 0001 removed"
> 
> No.

That's strange.
"mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1" is from 
mwifiex_remove_card().

> 
> > My card works fine with a Ricoh controller on x86_64 machine. Of course the power is not cut when I
> remove the mwifiex_sdio module.
> 
> Under this kernel, I'm not sure if the power is or isn't being cut at
> this point. I don't think this is relevant though; regular

"we quite regularly will power down the wireless card when it is not
being actively used..."

I interpreted this statement as "the power to the card is being cut while inactive".
mwifiex_remove_card() is called seems agrees with it. Perhaps I'm missing something here.

> insmod/rmmod cycles work fine, the problem only appears when tearing
> down the interface while a scan is in progress.

Could you add some debug logs in the tear-down path to show where exactly it gets stuck?

> 
> > I can upgrade my XO-4 to kernel 3.8 if feasible. Please let me know if a HOW-TO doc is available.
> 
> You will need a serial console connected, as there is no output on the
> LCD at this time when running the upstream kernel.
> 
> http://wiki.laptop.org/go/User:DanielDrake/Run_upstream_kernel_on_XO-4

Will do.

Thanks,
Bing

> 
> Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-02-21  0:00       ` Bing Zhao
@ 2013-02-21 14:20         ` Daniel Drake
  2013-02-21 19:57           ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-02-21 14:20 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless

On Wed, Feb 20, 2013 at 6:00 PM, Bing Zhao <bzhao@marvell.com> wrote:
> Hi Daniel,
>
>> >> >     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
>> >> >     eth0      Failed to read scan data : No such device
>> >
>> > After this error message, do you see the sd8787 card is removed from the bus?
>> >
>> > "mmc0: card 0001 removed"
>>
>> No.
>
> That's strange.
> "mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1" is from
> mwifiex_remove_card().

I haven't dug into the code, but it seems quite normal to me. It makes
sense for mwifiex_remove_card to be called in the rmmod path. And it
makes sense for the kernel not to report that the card was removed,
since it wasn't actually removed. Only the module was unloaded.

>> Under this kernel, I'm not sure if the power is or isn't being cut at
>> this point. I don't think this is relevant though; regular
>
> "we quite regularly will power down the wireless card when it is not
> being actively used..."
>
> I interpreted this statement as "the power to the card is being cut while inactive".
> mwifiex_remove_card() is called seems agrees with it. Perhaps I'm missing something here.

Yes, we will want to make sure that the power is physically cut. I
just haven't verified what the exact behaviour is under 3.8.
Nevertheless, I think this detail is irrelevant, given that
rmmod/insmod cycles work fine, but the problem only occurs when
scanning is involved.

>> insmod/rmmod cycles work fine, the problem only appears when tearing
>> down the interface while a scan is in progress.
>
> Could you add some debug logs in the tear-down path to show where exactly it gets stuck?

I mis-spoke there, sorry. The problem happens on the next modprobe
*after* a teardown when a scan was in progress. The teardown itself
seems to have completed without getting stuck.

If you have any suggestions for where to add debug messages, that
would be appreciated. Otherwise I will look into sprinkling a few
throughout the probe sequences to see if any communication succeeds
before the first error is printed.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-02-21 14:20         ` Daniel Drake
@ 2013-02-21 19:57           ` Bing Zhao
  2013-03-08  5:00             ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-02-21 19:57 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless

Hi Daniel,

> > That's strange.
> > "mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1" is from
> > mwifiex_remove_card().
> 
> I haven't dug into the code, but it seems quite normal to me. It makes
> sense for mwifiex_remove_card to be called in the rmmod path. And it

Ah, you are right. I overlooked that rmmod will lead to mwifiex_remove_card.

> makes sense for the kernel not to report that the card was removed,
> since it wasn't actually removed. Only the module was unloaded.
> 
> >> Under this kernel, I'm not sure if the power is or isn't being cut at
> >> this point. I don't think this is relevant though; regular
> >
> > "we quite regularly will power down the wireless card when it is not
> > being actively used..."
> >
> > I interpreted this statement as "the power to the card is being cut while inactive".
> > mwifiex_remove_card() is called seems agrees with it. Perhaps I'm missing something here.
> 
> Yes, we will want to make sure that the power is physically cut. I
> just haven't verified what the exact behaviour is under 3.8.

MMC core will report "mmc0: card 0001 removed" when the power is cut.

> Nevertheless, I think this detail is irrelevant, given that
> rmmod/insmod cycles work fine, but the problem only occurs when
> scanning is involved.
> 
> >> insmod/rmmod cycles work fine, the problem only appears when tearing
> >> down the interface while a scan is in progress.
> >
> > Could you add some debug logs in the tear-down path to show where exactly it gets stuck?
> 
> I mis-spoke there, sorry. The problem happens on the next modprobe
> *after* a teardown when a scan was in progress. The teardown itself
> seems to have completed without getting stuck.
> 
> If you have any suggestions for where to add debug messages, that
> would be appreciated. Otherwise I will look into sprinkling a few
> throughout the probe sequences to see if any communication succeeds
> before the first error is printed.

For now, let's use dynamic_debug to collect some logs.

CONFIG_DYNAMIC_DEBUG=y

# run these commands before running the test script:
echo "module mwifiex +p" > /sys/kernel/debug/dynamic_debug/control
echo "module mwifiex_sdio +p" > /sys/kernel/debug/dynamic_debug/control

Thanks,
Bing

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-02-21 19:57           ` Bing Zhao
@ 2013-03-08  5:00             ` Bing Zhao
  2013-03-11 20:12               ` Daniel Drake
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-03-08  5:00 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]

Hi Daniel,

> > If you have any suggestions for where to add debug messages, that
> > would be appreciated. Otherwise I will look into sprinkling a few
> > throughout the probe sequences to see if any communication succeeds
> > before the first error is printed.
> 
> For now, let's use dynamic_debug to collect some logs.
> 
> CONFIG_DYNAMIC_DEBUG=y
> 
> # run these commands before running the test script:
> echo "module mwifiex +p" > /sys/kernel/debug/dynamic_debug/control
> echo "module mwifiex_sdio +p" > /sys/kernel/debug/dynamic_debug/control

Running your test script on my XO-4 with 3.8 kernel does reproduce command timeout.
Although the timeout happens at different path than yours they could have the same cause.

Attached please find two patches that seems fix the FUNC_SHUTDOWN timeout on my XO-4.
Hope it helps.

Thanks,
Bing


[-- Attachment #2: 0002-mwifiex-skip-pending-commands-after-function-shutdow.patch --]
[-- Type: application/octet-stream, Size: 1829 bytes --]

From e92cd542ac757493db1182bd81160e19c1116052 Mon Sep 17 00:00:00 2001
From: Bing Zhao <bzhao@marvell.com>
Date: Thu, 7 Mar 2013 20:16:03 -0800
Subject: [PATCH 2/2] mwifiex: skip pending commands after function shutdown

During rmmod mwifiex_sdio processing FUNC_SHUTDOWN command is
sent to firmware. Firmware expcets only FUNC_INIT once WLAN
function is shut down.

Any command pending in the command queue should be ignored and
freed.

Signed-off-by: Bing Zhao <bzhao@marvell.com>
---
 drivers/net/wireless/mwifiex/cmdevt.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/cmdevt.c b/drivers/net/wireless/mwifiex/cmdevt.c
index f47ad3a..9a1302b 100644
--- a/drivers/net/wireless/mwifiex/cmdevt.c
+++ b/drivers/net/wireless/mwifiex/cmdevt.c
@@ -157,6 +157,20 @@ static int mwifiex_dnld_cmd_to_fw(struct mwifiex_private *priv,
 		return -1;
 	}
 
+	cmd_code = le16_to_cpu(host_cmd->command);
+	cmd_size = le16_to_cpu(host_cmd->size);
+
+	if (adapter->hw_status == MWIFIEX_HW_STATUS_RESET &&
+	    cmd_code != HostCmd_CMD_FUNC_SHUTDOWN &&
+	    cmd_code != HostCmd_CMD_FUNC_INIT) {
+		dev_err(adapter->dev,
+			"DNLD_CMD: FW in reset state, ignore cmd %#x\n",
+			cmd_code);
+		mwifiex_complete_cmd(adapter, cmd_node);
+		mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+		return -1;
+	}
+
 	/* Set command sequence number */
 	adapter->seq_num++;
 	host_cmd->seq_num = cpu_to_le16(HostCmd_SET_SEQ_NO_BSS_INFO
@@ -168,9 +182,6 @@ static int mwifiex_dnld_cmd_to_fw(struct mwifiex_private *priv,
 	adapter->curr_cmd = cmd_node;
 	spin_unlock_irqrestore(&adapter->mwifiex_cmd_lock, flags);
 
-	cmd_code = le16_to_cpu(host_cmd->command);
-	cmd_size = le16_to_cpu(host_cmd->size);
-
 	/* Adjust skb length */
 	if (cmd_node->cmd_skb->len > cmd_size)
 		/*
-- 
1.8.0


[-- Attachment #3: 0001-mwifiex-fix-race-when-queuing-commands.patch --]
[-- Type: application/octet-stream, Size: 5052 bytes --]

From fedbf160a732b017da386613b508a8306f5fe4ca Mon Sep 17 00:00:00 2001
From: Amitkumar Karwar <akarwar@marvell.com>
Date: Wed, 27 Feb 2013 18:26:32 -0800
Subject: [PATCH 1/2] mwifiex: fix race when queuing commands

mwifiex_send_cmd_async() is called for sync as well as async
commands. (mwifiex_send_cmd_sync() internally calls it for
sync command.)

"adapter->cmd_queued" gets filled inside mwifiex_send_cmd_async()
routine for both types of commands. But it is used only for sync
commands in mwifiex_wait_queue_complete(). This could lead to a
race when two threads try to queue a sync command with another
sync/async command simultaneously.

Get rid of global variable and pass command node as a parameter
to mwifiex_wait_queue_complete() to fix the problem.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
---
 drivers/net/wireless/mwifiex/cmdevt.c    |  5 ++---
 drivers/net/wireless/mwifiex/main.h      |  4 ++--
 drivers/net/wireless/mwifiex/scan.c      |  8 ++++----
 drivers/net/wireless/mwifiex/sta_ioctl.c | 10 ++--------
 4 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/cmdevt.c b/drivers/net/wireless/mwifiex/cmdevt.c
index d19a88c..f47ad3a 100644
--- a/drivers/net/wireless/mwifiex/cmdevt.c
+++ b/drivers/net/wireless/mwifiex/cmdevt.c
@@ -484,8 +484,6 @@ int mwifiex_send_cmd_sync(struct mwifiex_private *priv, uint16_t cmd_no,
 
 	ret = mwifiex_send_cmd_async(priv, cmd_no, cmd_action, cmd_oid,
 				     data_buf);
-	if (!ret)
-		ret = mwifiex_wait_queue_complete(adapter);
 
 	return ret;
 }
@@ -588,9 +586,10 @@ int mwifiex_send_cmd_async(struct mwifiex_private *priv, uint16_t cmd_no,
 	if (cmd_no == HostCmd_CMD_802_11_SCAN) {
 		mwifiex_queue_scan_cmd(priv, cmd_node);
 	} else {
-		adapter->cmd_queued = cmd_node;
 		mwifiex_insert_cmd_to_pending_q(adapter, cmd_node, true);
 		queue_work(adapter->workqueue, &adapter->main_work);
+		if (cmd_node->wait_q_enabled)
+			ret = mwifiex_wait_queue_complete(adapter, cmd_node);
 	}
 
 	return ret;
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 9206575..7255289 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -728,7 +728,6 @@ struct mwifiex_adapter {
 	u16 cmd_wait_q_required;
 	struct mwifiex_wait_queue cmd_wait_q;
 	u8 scan_wait_q_woken;
-	struct cmd_ctrl_node *cmd_queued;
 	spinlock_t queue_lock;		/* lock for tx queues */
 	struct completion fw_load;
 	u8 country_code[IEEE80211_COUNTRY_STRING_LEN];
@@ -1023,7 +1022,8 @@ int mwifiex_request_set_multicast_list(struct mwifiex_private *priv,
 			struct mwifiex_multicast_list *mcast_list);
 int mwifiex_copy_mcast_addr(struct mwifiex_multicast_list *mlist,
 			    struct net_device *dev);
-int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter);
+int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter,
+				struct cmd_ctrl_node *cmd_queued);
 int mwifiex_bss_start(struct mwifiex_private *priv, struct cfg80211_bss *bss,
 		      struct cfg80211_ssid *req_ssid);
 int mwifiex_cancel_hs(struct mwifiex_private *priv, int cmd_type);
diff --git a/drivers/net/wireless/mwifiex/scan.c b/drivers/net/wireless/mwifiex/scan.c
index bb60c27..d215b4d 100644
--- a/drivers/net/wireless/mwifiex/scan.c
+++ b/drivers/net/wireless/mwifiex/scan.c
@@ -1388,10 +1388,13 @@ int mwifiex_scan_networks(struct mwifiex_private *priv,
 			list_del(&cmd_node->list);
 			spin_unlock_irqrestore(&adapter->scan_pending_q_lock,
 					       flags);
-			adapter->cmd_queued = cmd_node;
 			mwifiex_insert_cmd_to_pending_q(adapter, cmd_node,
 							true);
 			queue_work(adapter->workqueue, &adapter->main_work);
+
+			/* Perform internal scan synchronously */
+			if (!priv->scan_request)
+				mwifiex_wait_queue_complete(adapter, cmd_node);
 		} else {
 			spin_unlock_irqrestore(&adapter->scan_pending_q_lock,
 					       flags);
@@ -1946,9 +1949,6 @@ int mwifiex_request_scan(struct mwifiex_private *priv,
 		/* Normal scan */
 		ret = mwifiex_scan_networks(priv, NULL);
 
-	if (!ret)
-		ret = mwifiex_wait_queue_complete(priv->adapter);
-
 	up(&priv->async_sem);
 
 	return ret;
diff --git a/drivers/net/wireless/mwifiex/sta_ioctl.c b/drivers/net/wireless/mwifiex/sta_ioctl.c
index 76d31de..8c943b6 100644
--- a/drivers/net/wireless/mwifiex/sta_ioctl.c
+++ b/drivers/net/wireless/mwifiex/sta_ioctl.c
@@ -54,16 +54,10 @@ int mwifiex_copy_mcast_addr(struct mwifiex_multicast_list *mlist,
  * This function waits on a cmd wait queue. It also cancels the pending
  * request after waking up, in case of errors.
  */
-int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter)
+int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter,
+				struct cmd_ctrl_node *cmd_queued)
 {
 	int status;
-	struct cmd_ctrl_node *cmd_queued;
-
-	if (!adapter->cmd_queued)
-		return 0;
-
-	cmd_queued = adapter->cmd_queued;
-	adapter->cmd_queued = NULL;
 
 	dev_dbg(adapter->dev, "cmd pending\n");
 	atomic_inc(&adapter->cmd_pending);
-- 
1.8.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-03-08  5:00             ` Bing Zhao
@ 2013-03-11 20:12               ` Daniel Drake
  2013-03-13  3:04                 ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-03-11 20:12 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

Hi,

On Thu, Mar 7, 2013 at 11:00 PM, Bing Zhao <bzhao@marvell.com> wrote:
> Running your test script on my XO-4 with 3.8 kernel does reproduce command timeout.
> Although the timeout happens at different path than yours they could have the same cause.
>
> Attached please find two patches that seems fix the FUNC_SHUTDOWN timeout on my XO-4.
> Hope it helps.

Thanks, this seems like an improvement.
With NetworkManager disabled, I can now run the original test script
in a loop, and after a few minutes it is still running.

I still see this message:
    mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
(cmd_pending is sometimes -2 as well).
Does this suggest a leak?

Even after these patches it still seems easy to confuse this driver
with just a trivial setup. Here's one example:

NetworkManager enabled again, clean reboot, using test.sh from the
first mail, the following failure seems 100% reproducible:

# bash test.sh
[   17.129827] lis3lv02d_i2c 4-0019: Failed to get supply 'Vdd': -517
[   17.136003] i2c 4-0019: Driver lis3lv02d_i2c requests probe deferral
[   17.915844] mwifiex_sdio mmc0:0001:1: WLAN FW is active
eth0: ERROR while getting interface flags: No such device
eth0      Interface doesn't support scanning.
[   18.407286] mwifiex_sdio mmc0:0001:1: ignoring F/W country code US
[   18.430167] mwifiex_sdio mmc0:0001:1: driver_version = mwifiex 1.0
(14.66.9.p96)
[   18.510658] systemd-udevd[235]: renamed network interface mlan0 to eth0

Oops - seems like my script doesn't wait long enough for the network
interface rename to happen, so that wasn't a complete test. This seems
to happen frequently on first run, but subsequent runs work better.
Lets run it again:

# bash test.sh
[   21.653722] mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
[   21.662303] lis3lv02d_i2c 4-0019: Failed to get supply 'Vdd': -517
[   21.673173] i2c 4-0019: Driver lis3lv02d_i2c requests probe deferral
[   21.681210] mwifiex_sdio mmc0:0001:1: WLAN FW is active
[   21.794164] mwifiex_sdio mmc0:0001:1: ignoring F/W country code US
[   21.807990] mwifiex_sdio mmc0:0001:1: driver_version = mwifiex 1.0
(14.66.9.p96)
[   21.893378] systemd-udevd[235]: renamed network interface mlan0 to eth0
[   21.984014] ieee80211 phy1: uap0: changing to 2 not supported
[   22.054609] ieee80211 phy1: uap0: changing to 2 not supported
[   22.095906] ieee80211 phy1: uap0: changing to 2 not supported
[   22.115854] ieee80211 phy1: uap0: changing to 2 not supported
[   22.146579] ieee80211 phy1: uap0: changing to 2 not supported
eth0      Interface doesn't support scanning : Device or resource busy
[   24.092789] mwifiex_sdio mmc0:0001:1: DNLD_CMD: FW in reset state,
ignore cmd 0x6
[   24.101801] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
cmd_pending=-2

I guess the scan did not succeed there because NetworkManager was
scanning. No problem, things look OK (apart from the last 2 messages
which seem a little concerning).

Now lets load the module manually and do a scan:

# insmod mwifiex_sdio.ko
[   26.639096] lis3lv02d_i2c 4-0019: Failed to get supply 'Vdd': -517
[   26.645260] i2c 4-0019: Driver lis3lv02d_i2c requests probe deferral
[   26.656082] mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
[   26.666128] mwifiex_sdio mmc0:0001:1: WLAN FW is active
[   26.777635] mwifiex_sdio mmc0:0001:1: ignoring F/W country code US
[   26.798631] mwifiex_sdio mmc0:0001:1: driver_version = mwifiex 1.0
(14.66.9.p96)
[   26.862487] systemd-udevd[235]: renamed network interface mlan0 to eth0
[   27.129054] ieee80211 phy2: uap0: changing to 2 not supported
[   27.168360] ieee80211 phy2: uap0: changing to 2 not supported
[   27.196832] ieee80211 phy2: uap0: changing to 2 not supported
[   27.216715] ieee80211 phy2: uap0: changing to 2 not supported
[   27.239068] ieee80211 phy2: uap0: changing to 2 not supported

# i\bwlist eth0 scan
eth0      Interface doesn't support scanning : Device or resource busy

OK, NetworkManager is scanning. I keep trying this every few seconds:

# iwlist eth0 scan
eth0      Interface doesn't support scanning : Device or resource busy
# iwlist eth0 scan
eth0      Interface doesn't support scanning : Device or resource busy
# iwlist eth0 scan
eth0      Interface doesn't support scanning : Device or resource busy
# iwlist eth0 scan
[   49.567711] mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func:
Timeout cmd id (49.555599) = 0x6, act = 0x3
[   49.577420] mwifiex_sdio mmc0:0001:1: num_data_h2c_failure = 0
[   49.583240] mwifiex_sdio mmc0:0001:1: num_cmd_h2c_failure = 0
[   49.588986] mwifiex_sdio mmc0:0001:1: num_cmd_timeout = 1
[   49.588986] mwifiex_sdio mmc0:0001:1: num_tx_timeout = 0
[   49.599612] mwifiex_sdio mmc0:0001:1: last_cmd_index = 0
[   49.604892] mwifiex_sdio mmc0:0001:1: last_cmd_id: 06 00 71 00 06
00 06 00 5e 00
[   49.612243] mwifiex_sdio mmc0:0001:1: last_cmd_act: 03 00 00 00 03
00 03 00 01 00
[   49.619683] mwifiex_sdio mmc0:0001:1: last_cmd_resp_index = 4
[   49.619693] mwifiex_sdio mmc0:0001:1: last_cmd_resp_id: 12 80 71 80
06 80 06 80 5e 80
[   49.633176] mwifiex_sdio mmc0:0001:1: last_event_index = 3
[   49.638627] mwifiex_sdio mmc0:0001:1: last_event: 17 00 0a 00 08 00
17 00 17 00
[   49.645883] mwifiex_sdio mmc0:0001:1: data_sent=0 cmd_sent=1
[   49.651499] mwifiex_sdio mmc0:0001:1: ps_mode=1 ps_state=0
[   49.656950] mwifiex_sdio mmc0:0001:1: cmd timeout
[   49.661697] mwifiex_sdio: Resetting card...
[   49.676122] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
cmd_pending=-4
[   49.683632] ieee80211 phy2: crypto keys added
[   49.689716] mwifiex_sdio mmc0:0001:1: PREP_CMD: card is removed
eth0      Interface doesn't support scanning : No such device
[   49.717762] mwifiex_sdio mmc0:0001:1: PREP_CMD: card is removed
[   49.808245] mmc0: card 0001 removed

...and the device doesn't come back.

Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-11 20:12               ` Daniel Drake
@ 2013-03-13  3:04                 ` Bing Zhao
  2013-03-14 16:26                   ` Daniel Drake
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-03-13  3:04 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

Hi Daniel,

> Hi,
> 
> On Thu, Mar 7, 2013 at 11:00 PM, Bing Zhao <bzhao@marvell.com> wrote:
> > Running your test script on my XO-4 with 3.8 kernel does reproduce command timeout.
> > Although the timeout happens at different path than yours they could have the same cause.
> >
> > Attached please find two patches that seems fix the FUNC_SHUTDOWN timeout on my XO-4.
> > Hope it helps.
> 
> Thanks, this seems like an improvement.
> With NetworkManager disabled, I can now run the original test script
> in a loop, and after a few minutes it is still running.

Good to know it improves.

> 
> I still see this message:
>     mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0, cmd_pending=-1
> (cmd_pending is sometimes -2 as well).
> Does this suggest a leak?

Yeah, I also noticed this. I will check.

> 
> Even after these patches it still seems easy to confuse this driver
> with just a trivial setup. Here's one example:
> 
> NetworkManager enabled again, clean reboot, using test.sh from the
> first mail, the following failure seems 100% reproducible:
> 

[...]

> [   24.092789] mwifiex_sdio mmc0:0001:1: DNLD_CMD: FW in reset state,
> ignore cmd 0x6
> [   24.101801] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
> cmd_pending=-2
> 
> I guess the scan did not succeed there because NetworkManager was
> scanning. No problem, things look OK (apart from the last 2 messages
> which seem a little concerning).

The "FW in reset state, ignore cmd 0x06" message is correct. During driver/firmware shutdown process we must skip these commands.

> 
> Now lets load the module manually and do a scan:
> 
> # insmod mwifiex_sdio.ko
> [   26.639096] lis3lv02d_i2c 4-0019: Failed to get supply 'Vdd': -517
> [   26.645260] i2c 4-0019: Driver lis3lv02d_i2c requests probe deferral
> [   26.656082] mwifiex_sdio mmc0:0001:1: WLAN FW already running! Skip FW dnld
> [   26.666128] mwifiex_sdio mmc0:0001:1: WLAN FW is active
> [   26.777635] mwifiex_sdio mmc0:0001:1: ignoring F/W country code US
> [   26.798631] mwifiex_sdio mmc0:0001:1: driver_version = mwifiex 1.0
> (14.66.9.p96)
> [   26.862487] systemd-udevd[235]: renamed network interface mlan0 to eth0
> [   27.129054] ieee80211 phy2: uap0: changing to 2 not supported
> [   27.168360] ieee80211 phy2: uap0: changing to 2 not supported
> [   27.196832] ieee80211 phy2: uap0: changing to 2 not supported
> [   27.216715] ieee80211 phy2: uap0: changing to 2 not supported
> [   27.239068] ieee80211 phy2: uap0: changing to 2 not supported
> 
> # i\bwlist eth0 scan
> eth0      Interface doesn't support scanning : Device or resource busy
> 
> OK, NetworkManager is scanning. I keep trying this every few seconds:
> 
> # iwlist eth0 scan
> eth0      Interface doesn't support scanning : Device or resource busy
> # iwlist eth0 scan
> eth0      Interface doesn't support scanning : Device or resource busy
> # iwlist eth0 scan
> eth0      Interface doesn't support scanning : Device or resource busy
> # iwlist eth0 scan
> [   49.567711] mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func:
> Timeout cmd id (49.555599) = 0x6, act = 0x3

Tim Shepard reported a scan timeout issue back in kernel 3.5 when NetworkManager is enabled, and he had a HACK workaround for that. I wonder if this is the same issue in 3.8.
Could you please try that workaround again?

diff --git a/drivers/net/wireless/mwifiex/main.c b/drivers/net/wireless/mwifie
index 121443a..a451d18 100644
--- a/drivers/net/wireless/mwifiex/main.c
+++ b/drivers/net/wireless/mwifiex/main.c
@@ -364,6 +364,7 @@ static void mwifiex_fw_dpc(const struct firmware *firmware
                goto err_add_intf;
        }

+#if 0
        /* Create AP interface by default */
        if (!mwifiex_add_virtual_intf(adapter->wiphy, "uap%d",
                                      NL80211_IFTYPE_AP, NULL, NULL)) {
@@ -377,6 +378,7 @@ static void mwifiex_fw_dpc(const struct firmware *firmware
                dev_err(adapter->dev, "cannot create default P2P interface\n")
                goto err_add_intf;
        }
+#endif
        rtnl_unlock();

        mwifiex_drv_get_driver_version(adapter, fmt, sizeof(fmt) - 1);

Thanks,
Bing


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-03-13  3:04                 ` Bing Zhao
@ 2013-03-14 16:26                   ` Daniel Drake
  2013-03-15  3:38                     ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-03-14 16:26 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

On Tue, Mar 12, 2013 at 9:04 PM, Bing Zhao <bzhao@marvell.com> wrote:
> Tim Shepard reported a scan timeout issue back in kernel 3.5 when NetworkManager is enabled, and he had a HACK workaround for that. I wonder if this is the same issue in 3.8.
> Could you please try that workaround again?

That seems to avoid the issue. I'm worried that it suggests some more
significant lower-level bug, but I guess for now it would make sense
to finish off those 2 patches you already posted, and investigate the
possible cmd_pending leak, before revisiting this.

Thanks.
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-14 16:26                   ` Daniel Drake
@ 2013-03-15  3:38                     ` Bing Zhao
  2013-03-19  6:19                       ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-03-15  3:38 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

Hi Daniel,

> > Tim Shepard reported a scan timeout issue back in kernel 3.5 when NetworkManager is enabled, and he
> had a HACK workaround for that. I wonder if this is the same issue in 3.8.
> > Could you please try that workaround again?
> 
> That seems to avoid the issue. I'm worried that it suggests some more
> significant lower-level bug, but I guess for now it would make sense
> to finish off those 2 patches you already posted, and investigate the
> possible cmd_pending leak, before revisiting this.

Thanks for testing. I will fine tune the 2 patches and resend the updated version to the list.
The cmd_pending leak will be investigated too.

Thanks,
Bing

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-15  3:38                     ` Bing Zhao
@ 2013-03-19  6:19                       ` Bing Zhao
  2013-03-25 16:38                         ` Daniel Drake
  0 siblings, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-03-19  6:19 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

Hi Daniel,

> > That seems to avoid the issue. I'm worried that it suggests some more
> > significant lower-level bug, but I guess for now it would make sense
> > to finish off those 2 patches you already posted, and investigate the
> > possible cmd_pending leak, before revisiting this.
> 
> Thanks for testing. I will fine tune the 2 patches and resend the updated version to the list.
> The cmd_pending leak will be investigated too.

Please try attached patch on top of the 3 patches I sent last Friday.

Thanks,
Bing


[-- Attachment #2: 0001-mwifiex-fix-negative-cmd_pending-count.patch --]
[-- Type: application/octet-stream, Size: 3417 bytes --]

From d4514470a1897b9a6ffda53cecd8fc2280cebd1e Mon Sep 17 00:00:00 2001
From: Bing Zhao <bzhao@marvell.com>
Date: Mon, 18 Mar 2013 20:32:38 -0700
Subject: [PATCH] mwifiex: fix negative cmd_pending count

cmd_pending is increased in mwifiex_wait_queue_complete() and
decreased in mwifiex_complete_cmd() currently.
If there are two or more commands in the cmd_pending_q the main
worker thread will pick up next command from cmd_pending_q
automatically after finishing current command. As a result
mwifiex_wait_queue_complete() will not be called because
the command is alreay completed. This leads to a negative
number in cmd_pending count.

Fix it by increasing cmd_pending when a cmd is queued and
decreasing when that cmd is freed. This covers both synchronous
and asynchronous commands.

Cc: <stable@vger.kernel.org> # 3.8
Reported-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
---
 drivers/net/wireless/mwifiex/cmdevt.c    |   15 ++++++++++++++-
 drivers/net/wireless/mwifiex/sta_ioctl.c |    3 ---
 drivers/net/wireless/mwifiex/util.c      |    1 -
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/cmdevt.c b/drivers/net/wireless/mwifiex/cmdevt.c
index 9a1302b..b6454de 100644
--- a/drivers/net/wireless/mwifiex/cmdevt.c
+++ b/drivers/net/wireless/mwifiex/cmdevt.c
@@ -630,6 +630,17 @@ mwifiex_insert_cmd_to_free_q(struct mwifiex_adapter *adapter,
 	spin_lock_irqsave(&adapter->cmd_free_q_lock, flags);
 	list_add_tail(&cmd_node->list, &adapter->cmd_free_q);
 	spin_unlock_irqrestore(&adapter->cmd_free_q_lock, flags);
+
+	if (cmd_node->cmd_skb) {
+		struct host_cmd_ds_command *host_cmd =
+					(void *)cmd_node->cmd_skb->data;
+
+		atomic_dec(&adapter->cmd_pending);
+		dev_dbg(adapter->dev,
+			"cmd: FREE_CMD: cmd=%#x, cmd_pending=%d\n",
+			le16_to_cpu(host_cmd->command),
+			atomic_read(&adapter->cmd_pending));
+	}
 }
 
 /*
@@ -673,7 +684,9 @@ mwifiex_insert_cmd_to_pending_q(struct mwifiex_adapter *adapter,
 		list_add(&cmd_node->list, &adapter->cmd_pending_q);
 	spin_unlock_irqrestore(&adapter->cmd_pending_q_lock, flags);
 
-	dev_dbg(adapter->dev, "cmd: QUEUE_CMD: cmd=%#x is queued\n", command);
+	atomic_inc(&adapter->cmd_pending);
+	dev_dbg(adapter->dev, "cmd: QUEUE_CMD: cmd=%#x, cmd_pending=%d\n",
+		command, atomic_read(&adapter->cmd_pending));
 }
 
 /*
diff --git a/drivers/net/wireless/mwifiex/sta_ioctl.c b/drivers/net/wireless/mwifiex/sta_ioctl.c
index 8c943b6..e6c9b2a 100644
--- a/drivers/net/wireless/mwifiex/sta_ioctl.c
+++ b/drivers/net/wireless/mwifiex/sta_ioctl.c
@@ -59,9 +59,6 @@ int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter,
 {
 	int status;
 
-	dev_dbg(adapter->dev, "cmd pending\n");
-	atomic_inc(&adapter->cmd_pending);
-
 	/* Wait for completion */
 	status = wait_event_interruptible(adapter->cmd_wait_q.wait,
 					  *(cmd_queued->condition));
diff --git a/drivers/net/wireless/mwifiex/util.c b/drivers/net/wireless/mwifiex/util.c
index 54667e6..e57ac0d 100644
--- a/drivers/net/wireless/mwifiex/util.c
+++ b/drivers/net/wireless/mwifiex/util.c
@@ -239,7 +239,6 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb)
 int mwifiex_complete_cmd(struct mwifiex_adapter *adapter,
 			 struct cmd_ctrl_node *cmd_node)
 {
-	atomic_dec(&adapter->cmd_pending);
 	dev_dbg(adapter->dev, "cmd completed: status=%d\n",
 		adapter->cmd_wait_q.status);
 
-- 
1.7.0.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-03-19  6:19                       ` Bing Zhao
@ 2013-03-25 16:38                         ` Daniel Drake
  2013-03-25 21:27                           ` John Rhodes
  2013-03-28  2:24                           ` Bing Zhao
  0 siblings, 2 replies; 18+ messages in thread
From: Daniel Drake @ 2013-03-25 16:38 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

Hi,

On Tue, Mar 19, 2013 at 12:19 AM, Bing Zhao <bzhao@marvell.com> wrote:
> Hi Daniel,
>
>> > That seems to avoid the issue. I'm worried that it suggests some more
>> > significant lower-level bug, but I guess for now it would make sense
>> > to finish off those 2 patches you already posted, and investigate the
>> > possible cmd_pending leak, before revisiting this.
>>
>> Thanks for testing. I will fine tune the 2 patches and resend the updated version to the list.
>> The cmd_pending leak will be investigated too.
>
> Please try attached patch on top of the 3 patches I sent last Friday.

Thanks for continuing to look at this.

Unfortuntaely the cmd_pending warning is still there and the numbers
are now -4 and -5 e.g.
[  128.269912] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
cmd_pending=-5

This is using the same test script used in the opening message on this
thread, with uap0 and p2p0 auto-creation disabled, with or without
NetworkManager running at the same time.

Thanks
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-25 16:38                         ` Daniel Drake
@ 2013-03-25 21:27                           ` John Rhodes
  2013-03-28  2:24                           ` Bing Zhao
  1 sibling, 0 replies; 18+ messages in thread
From: John Rhodes @ 2013-03-25 21:27 UTC (permalink / raw)
  To: Daniel Drake, Bing Zhao; +Cc: linux-wireless, Amitkumar Karwar

Hi Daniel,

We are now also seeing cmd_pending warnings in testing at Marvell in Santa Clara.  Bing is working on a revised patch for this issue.  We'll keep you posted on the progress.  I'll also update the tracking system ticket #479039.

John

-----Original Message-----
From: dsdrake@gmail.com [mailto:dsdrake@gmail.com] On Behalf Of Daniel Drake
Sent: Monday, March 25, 2013 12:39 PM
To: Bing Zhao
Cc: linux-wireless@vger.kernel.org; John Rhodes; Amitkumar Karwar
Subject: Re: mwifiex crash when removing interface while scanning

Hi,

On Tue, Mar 19, 2013 at 12:19 AM, Bing Zhao <bzhao@marvell.com> wrote:
> Hi Daniel,
>
>> > That seems to avoid the issue. I'm worried that it suggests some 
>> > more significant lower-level bug, but I guess for now it would make 
>> > sense to finish off those 2 patches you already posted, and 
>> > investigate the possible cmd_pending leak, before revisiting this.
>>
>> Thanks for testing. I will fine tune the 2 patches and resend the updated version to the list.
>> The cmd_pending leak will be investigated too.
>
> Please try attached patch on top of the 3 patches I sent last Friday.

Thanks for continuing to look at this.

Unfortuntaely the cmd_pending warning is still there and the numbers are now -4 and -5 e.g.
[  128.269912] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
cmd_pending=-5

This is using the same test script used in the opening message on this thread, with uap0 and p2p0 auto-creation disabled, with or without NetworkManager running at the same time.

Thanks
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-25 16:38                         ` Daniel Drake
  2013-03-25 21:27                           ` John Rhodes
@ 2013-03-28  2:24                           ` Bing Zhao
  2013-03-29 13:46                             ` Daniel Drake
  1 sibling, 1 reply; 18+ messages in thread
From: Bing Zhao @ 2013-03-28  2:24 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

[-- Attachment #1: Type: text/plain, Size: 771 bytes --]

Hi Daniel,

> > Please try attached patch on top of the 3 patches I sent last Friday.
> 
> Thanks for continuing to look at this.
> 
> Unfortuntaely the cmd_pending warning is still there and the numbers
> are now -4 and -5 e.g.
> [  128.269912] mwifiex_sdio mmc0:0001:1: rx_pending=0, tx_pending=0,
> cmd_pending=-5

This is caused by decreasing cmd_pending while cleanup scan commands in scan_pending_q.
cmd_pending only counts the commands queued in cmd_pending_q.

Attached v2 patch should fix it. Please try it out.

Thanks,
Bing

> 
> This is using the same test script used in the opening message on this
> thread, with uap0 and p2p0 auto-creation disabled, with or without
> NetworkManager running at the same time.
> 
> Thanks
> Daniel

[-- Attachment #2: 0001-v2-mwifiex-fix-negative-cmd_pending-count.patch --]
[-- Type: application/octet-stream, Size: 8797 bytes --]

From cc7b51871ee66f211d3adcade16d811903235957 Mon Sep 17 00:00:00 2001
From: Bing Zhao <bzhao@marvell.com>
Date: Wed, 27 Mar 2013 18:36:47 -0700
Subject: [PATCH v2 1/2] mwifiex: fix negative cmd_pending count

cmd_pending is increased in mwifiex_wait_queue_complete() and
decreased in mwifiex_complete_cmd() currently.
If there are two or more commands in the cmd_pending_q the main
worker thread will pick up next command from cmd_pending_q
automatically after finishing current command. As a result
mwifiex_wait_queue_complete() will not be called because
the command is alreay completed. This leads to a negative
number in cmd_pending count.

Fix it by increasing cmd_pending when a cmd is queued into
cmd_pending_q and decreasing when that cmd is recycled. For scan
commands we don't perform inc/dec operations until it's moved
from scan_pending_q to cmd_pending_q. This covers both
synchronous and asynchronous commands.

Cc: <stable@vger.kernel.org> # 3.8
Reported-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
---
 drivers/net/wireless/mwifiex/cmdevt.c      |   35 ++++++++++++++++++++--------
 drivers/net/wireless/mwifiex/init.c        |    2 +-
 drivers/net/wireless/mwifiex/main.h        |    2 +
 drivers/net/wireless/mwifiex/sta_cmdresp.c |    2 +-
 drivers/net/wireless/mwifiex/sta_ioctl.c   |    3 --
 drivers/net/wireless/mwifiex/util.c        |    1 -
 6 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/cmdevt.c b/drivers/net/wireless/mwifiex/cmdevt.c
index 9a1302b..da469c3 100644
--- a/drivers/net/wireless/mwifiex/cmdevt.c
+++ b/drivers/net/wireless/mwifiex/cmdevt.c
@@ -153,7 +153,7 @@ static int mwifiex_dnld_cmd_to_fw(struct mwifiex_private *priv,
 			" or cmd size is 0, not sending\n");
 		if (cmd_node->wait_q_enabled)
 			adapter->cmd_wait_q.status = -1;
-		mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+		mwifiex_recycle_cmd_node(adapter, cmd_node);
 		return -1;
 	}
 
@@ -167,7 +167,7 @@ static int mwifiex_dnld_cmd_to_fw(struct mwifiex_private *priv,
 			"DNLD_CMD: FW in reset state, ignore cmd %#x\n",
 			cmd_code);
 		mwifiex_complete_cmd(adapter, cmd_node);
-		mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+		mwifiex_recycle_cmd_node(adapter, cmd_node);
 		return -1;
 	}
 
@@ -228,7 +228,7 @@ static int mwifiex_dnld_cmd_to_fw(struct mwifiex_private *priv,
 			adapter->cmd_sent = false;
 		if (cmd_node->wait_q_enabled)
 			adapter->cmd_wait_q.status = -1;
-		mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+		mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 
 		spin_lock_irqsave(&adapter->mwifiex_cmd_lock, flags);
 		adapter->curr_cmd = NULL;
@@ -632,6 +632,20 @@ mwifiex_insert_cmd_to_free_q(struct mwifiex_adapter *adapter,
 	spin_unlock_irqrestore(&adapter->cmd_free_q_lock, flags);
 }
 
+/* This function reuses a command node. */
+void mwifiex_recycle_cmd_node(struct mwifiex_adapter *adapter,
+			      struct cmd_ctrl_node *cmd_node)
+{
+	struct host_cmd_ds_command *host_cmd = (void *)cmd_node->cmd_skb->data;
+
+	mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+
+	atomic_dec(&adapter->cmd_pending);
+	dev_dbg(adapter->dev, "cmd: FREE_CMD: cmd=%#x, cmd_pending=%d\n",
+		le16_to_cpu(host_cmd->command),
+		atomic_read(&adapter->cmd_pending));
+}
+
 /*
  * This function queues a command to the command pending queue.
  *
@@ -673,7 +687,9 @@ mwifiex_insert_cmd_to_pending_q(struct mwifiex_adapter *adapter,
 		list_add(&cmd_node->list, &adapter->cmd_pending_q);
 	spin_unlock_irqrestore(&adapter->cmd_pending_q_lock, flags);
 
-	dev_dbg(adapter->dev, "cmd: QUEUE_CMD: cmd=%#x is queued\n", command);
+	atomic_inc(&adapter->cmd_pending);
+	dev_dbg(adapter->dev, "cmd: QUEUE_CMD: cmd=%#x, cmd_pending=%d\n",
+		command, atomic_read(&adapter->cmd_pending));
 }
 
 /*
@@ -783,7 +799,7 @@ int mwifiex_process_cmdresp(struct mwifiex_adapter *adapter)
 	if (adapter->curr_cmd->cmd_flag & CMD_F_CANCELED) {
 		dev_err(adapter->dev, "CMD_RESP: %#x been canceled\n",
 			le16_to_cpu(resp->command));
-		mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+		mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 		spin_lock_irqsave(&adapter->mwifiex_cmd_lock, flags);
 		adapter->curr_cmd = NULL;
 		spin_unlock_irqrestore(&adapter->mwifiex_cmd_lock, flags);
@@ -833,7 +849,7 @@ int mwifiex_process_cmdresp(struct mwifiex_adapter *adapter)
 		if (adapter->curr_cmd->wait_q_enabled)
 			adapter->cmd_wait_q.status = -1;
 
-		mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+		mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 		spin_lock_irqsave(&adapter->mwifiex_cmd_lock, flags);
 		adapter->curr_cmd = NULL;
 		spin_unlock_irqrestore(&adapter->mwifiex_cmd_lock, flags);
@@ -865,8 +881,7 @@ int mwifiex_process_cmdresp(struct mwifiex_adapter *adapter)
 		if (adapter->curr_cmd->wait_q_enabled)
 			adapter->cmd_wait_q.status = ret;
 
-		/* Clean up and put current command back to cmd_free_q */
-		mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+		mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 
 		spin_lock_irqsave(&adapter->mwifiex_cmd_lock, flags);
 		adapter->curr_cmd = NULL;
@@ -993,7 +1008,7 @@ mwifiex_cancel_all_pending_cmd(struct mwifiex_adapter *adapter)
 			mwifiex_complete_cmd(adapter, cmd_node);
 			cmd_node->wait_q_enabled = false;
 		}
-		mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+		mwifiex_recycle_cmd_node(adapter, cmd_node);
 		spin_lock_irqsave(&adapter->cmd_pending_q_lock, flags);
 	}
 	spin_unlock_irqrestore(&adapter->cmd_pending_q_lock, flags);
@@ -1040,7 +1055,7 @@ mwifiex_cancel_pending_ioctl(struct mwifiex_adapter *adapter)
 		cmd_node = adapter->curr_cmd;
 		cmd_node->wait_q_enabled = false;
 		cmd_node->cmd_flag |= CMD_F_CANCELED;
-		mwifiex_insert_cmd_to_free_q(adapter, cmd_node);
+		mwifiex_recycle_cmd_node(adapter, cmd_node);
 		mwifiex_complete_cmd(adapter, adapter->curr_cmd);
 		adapter->curr_cmd = NULL;
 		spin_unlock_irqrestore(&adapter->mwifiex_cmd_lock, cmd_flags);
diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index daf8801..003c014 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -713,7 +713,7 @@ mwifiex_shutdown_drv(struct mwifiex_adapter *adapter)
 	if (adapter->curr_cmd) {
 		dev_warn(adapter->dev, "curr_cmd is still in processing\n");
 		del_timer(&adapter->cmd_timer);
-		mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+		mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 		adapter->curr_cmd = NULL;
 	}
 
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index cab8a85..fef89fd 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -798,6 +798,8 @@ void mwifiex_cancel_pending_ioctl(struct mwifiex_adapter *adapter);
 
 void mwifiex_insert_cmd_to_free_q(struct mwifiex_adapter *adapter,
 				  struct cmd_ctrl_node *cmd_node);
+void mwifiex_recycle_cmd_node(struct mwifiex_adapter *adapter,
+			      struct cmd_ctrl_node *cmd_node);
 
 void mwifiex_insert_cmd_to_pending_q(struct mwifiex_adapter *adapter,
 				     struct cmd_ctrl_node *cmd_node,
diff --git a/drivers/net/wireless/mwifiex/sta_cmdresp.c b/drivers/net/wireless/mwifiex/sta_cmdresp.c
index c7dc450..9f990e1 100644
--- a/drivers/net/wireless/mwifiex/sta_cmdresp.c
+++ b/drivers/net/wireless/mwifiex/sta_cmdresp.c
@@ -95,7 +95,7 @@ mwifiex_process_cmdresp_error(struct mwifiex_private *priv,
 		break;
 	}
 	/* Handling errors here */
-	mwifiex_insert_cmd_to_free_q(adapter, adapter->curr_cmd);
+	mwifiex_recycle_cmd_node(adapter, adapter->curr_cmd);
 
 	spin_lock_irqsave(&adapter->mwifiex_cmd_lock, flags);
 	adapter->curr_cmd = NULL;
diff --git a/drivers/net/wireless/mwifiex/sta_ioctl.c b/drivers/net/wireless/mwifiex/sta_ioctl.c
index 8c943b6..e6c9b2a 100644
--- a/drivers/net/wireless/mwifiex/sta_ioctl.c
+++ b/drivers/net/wireless/mwifiex/sta_ioctl.c
@@ -59,9 +59,6 @@ int mwifiex_wait_queue_complete(struct mwifiex_adapter *adapter,
 {
 	int status;
 
-	dev_dbg(adapter->dev, "cmd pending\n");
-	atomic_inc(&adapter->cmd_pending);
-
 	/* Wait for completion */
 	status = wait_event_interruptible(adapter->cmd_wait_q.wait,
 					  *(cmd_queued->condition));
diff --git a/drivers/net/wireless/mwifiex/util.c b/drivers/net/wireless/mwifiex/util.c
index 54667e6..e57ac0d 100644
--- a/drivers/net/wireless/mwifiex/util.c
+++ b/drivers/net/wireless/mwifiex/util.c
@@ -239,7 +239,6 @@ int mwifiex_recv_packet(struct mwifiex_private *priv, struct sk_buff *skb)
 int mwifiex_complete_cmd(struct mwifiex_adapter *adapter,
 			 struct cmd_ctrl_node *cmd_node)
 {
-	atomic_dec(&adapter->cmd_pending);
 	dev_dbg(adapter->dev, "cmd completed: status=%d\n",
 		adapter->cmd_wait_q.status);
 
-- 
1.7.0.2


[-- Attachment #3: 0002-v2-mwifiex-complete-last-internal-scan.patch --]
[-- Type: application/octet-stream, Size: 1714 bytes --]

From 4719f6ca80abb87a01714fd41f370613795d9b04 Mon Sep 17 00:00:00 2001
From: Bing Zhao <bzhao@marvell.com>
Date: Tue, 19 Mar 2013 12:10:11 -0700
Subject: [PATCH v2 2/2] mwifiex: complete last internal scan

We are waiting on first scan command of internal scan request
before association, so we should complete on last internal scan
command response.

Cc: <stable@vger.kernel.org> # 3.8
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
---
 drivers/net/wireless/mwifiex/scan.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/scan.c b/drivers/net/wireless/mwifiex/scan.c
index d215b4d..e7f6dea 100644
--- a/drivers/net/wireless/mwifiex/scan.c
+++ b/drivers/net/wireless/mwifiex/scan.c
@@ -1393,8 +1393,10 @@ int mwifiex_scan_networks(struct mwifiex_private *priv,
 			queue_work(adapter->workqueue, &adapter->main_work);
 
 			/* Perform internal scan synchronously */
-			if (!priv->scan_request)
+			if (!priv->scan_request) {
+				dev_dbg(adapter->dev, "wait internal scan\n");
 				mwifiex_wait_queue_complete(adapter, cmd_node);
+			}
 		} else {
 			spin_unlock_irqrestore(&adapter->scan_pending_q_lock,
 					       flags);
@@ -1793,7 +1795,12 @@ check_next_scan:
 		/* Need to indicate IOCTL complete */
 		if (adapter->curr_cmd->wait_q_enabled) {
 			adapter->cmd_wait_q.status = 0;
-			mwifiex_complete_cmd(adapter, adapter->curr_cmd);
+			if (!priv->scan_request) {
+				dev_dbg(adapter->dev,
+					"complete internal scan\n");
+				mwifiex_complete_cmd(adapter,
+						     adapter->curr_cmd);
+			}
 		}
 		if (priv->report_scan_result)
 			priv->report_scan_result = false;
-- 
1.7.0.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: mwifiex crash when removing interface while scanning
  2013-03-28  2:24                           ` Bing Zhao
@ 2013-03-29 13:46                             ` Daniel Drake
  2013-03-29 18:44                               ` Bing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Drake @ 2013-03-29 13:46 UTC (permalink / raw)
  To: Bing Zhao; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

On Wed, Mar 27, 2013 at 8:24 PM, Bing Zhao <bzhao@marvell.com> wrote:
> This is caused by decreasing cmd_pending while cleanup scan commands in scan_pending_q.
> cmd_pending only counts the commands queued in cmd_pending_q.
>
> Attached v2 patch should fix it. Please try it out.

Thanks, applied both patches, that seems to be working now.

Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: mwifiex crash when removing interface while scanning
  2013-03-29 13:46                             ` Daniel Drake
@ 2013-03-29 18:44                               ` Bing Zhao
  0 siblings, 0 replies; 18+ messages in thread
From: Bing Zhao @ 2013-03-29 18:44 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-wireless, John Rhodes, Amitkumar Karwar

Hi Daniel,

> On Wed, Mar 27, 2013 at 8:24 PM, Bing Zhao <bzhao@marvell.com> wrote:
> > This is caused by decreasing cmd_pending while cleanup scan commands in scan_pending_q.
> > cmd_pending only counts the commands queued in cmd_pending_q.
> >
> > Attached v2 patch should fix it. Please try it out.
> 
> Thanks, applied both patches, that seems to be working now.

Thanks for testing. I will resend the patches to the list.

Thanks,
Bing

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-03-29 18:45 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-19 18:02 mwifiex crash when removing interface while scanning Daniel Drake
2013-02-19 23:31 ` Bing Zhao
2013-02-20  5:06   ` Bing Zhao
2013-02-20 17:47     ` Daniel Drake
2013-02-21  0:00       ` Bing Zhao
2013-02-21 14:20         ` Daniel Drake
2013-02-21 19:57           ` Bing Zhao
2013-03-08  5:00             ` Bing Zhao
2013-03-11 20:12               ` Daniel Drake
2013-03-13  3:04                 ` Bing Zhao
2013-03-14 16:26                   ` Daniel Drake
2013-03-15  3:38                     ` Bing Zhao
2013-03-19  6:19                       ` Bing Zhao
2013-03-25 16:38                         ` Daniel Drake
2013-03-25 21:27                           ` John Rhodes
2013-03-28  2:24                           ` Bing Zhao
2013-03-29 13:46                             ` Daniel Drake
2013-03-29 18:44                               ` Bing Zhao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.