* Incomplete scan results after rf(un)kill
@ 2017-01-27  6:50 Daniel J Blueman
  2017-01-27 15:44 ` Valo, Kalle
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel J Blueman @ 2017-01-27  6:50 UTC (permalink / raw)
  To: ath10k

On 4.9.5 and earlier, I've noticed that 20% of the time after an rfkill
unblock, AP scan results contain only 1-2 of the previous 10 APs, and
the situation persists until the ath10k_pci module is removed and
reinserted.
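
A minimal sketch of the trigger and the workaround described above,
assuming the rfkill and modprobe tools are installed and the commands
are run as root. The run() wrapper and the DRY_RUN variable are
hypothetical helpers added for illustration; they are not part of any
tool mentioned in this thread.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Block and unblock wifi via rfkill, the trigger described above.
toggle_rfkill() {
    run rfkill block wifi
    run rfkill unblock wifi
}

# Remove and reinsert the driver module, the workaround described above.
reload_ath10k() {
    run modprobe -r ath10k_pci
    run modprobe ath10k_pci
}
```

With DRY_RUN=1 set, both functions only print the commands they would
run, which makes it easy to check the sequence before executing it.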

This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
Ubuntu (ElementaryOS) 16.04 with updates.

Could this have any relation to the missing firmware [1]?

What state can I capture to help diagnose this?

Thanks!

Dan

-- [1]

ath10k_pci 0000:3a:00.0: enabling device (0000 -> 0002)
ath10k_pci 0000:3a:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:3a:00.0.bin failed with error -2
ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/cal-pci-0000:3a:00.0.bin failed with error -2
ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/QCA6174/hw3.0/firmware-5.bin failed with error -2
ath10k_pci 0000:3a:00.0: could not fetch firmware file 'ath10k/QCA6174/hw3.0/firmware-5.bin': -2
ath10k_pci 0000:3a:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
ath10k_pci 0000:3a:00.0: kconfig debug 0 debugfs 1 tracing 1 dfs 0 testmode 0
ath10k_pci 0000:3a:00.0: firmware ver WLAN.RM.2.0-00180-QCARMSWPZ-1 api 4 features wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
ath10k_pci 0000:3a:00.0: board_file api 2 bmi_id N/A crc32 6fc88fe7
ath10k_pci 0000:3a:00.0: htt-ver 3.26 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
-- 
Daniel J Blueman

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


* Re: Incomplete scan results after rf(un)kill
  2017-01-27  6:50 Incomplete scan results after rf(un)kill Daniel J Blueman
@ 2017-01-27 15:44 ` Valo, Kalle
  2017-01-29  2:38   ` Daniel J Blueman
  0 siblings, 1 reply; 6+ messages in thread
From: Valo, Kalle @ 2017-01-27 15:44 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: ath10k

Daniel J Blueman <daniel@quora.org> writes:

> On 4.9.5 and previous, I've noticed that 20% of the time after
> rfunkilling, AP scan results would often have only 1-2 out of the
> previous 10 APs, and the situation would persist until removing and
> reinserting the ath10k_pci module.
>
> This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
> Ubuntu (ElementaryOS) 16.04 with updates.
>
> Could this have any relation to the missing firmware [1]?
>
> What state can I capture to help diagnose this?

Do you see any pattern in which APs are visible when the bug happens?
For example, are those 1-2 APs always the same ones? And are they the
ones with the strongest signal, or maybe related to certain channels?

After the bug happens, how does the device work otherwise? Have you
tested performance (iperf) or signal strength? Is there any packet
loss, etc.?
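
One way to look for such a pattern is to reduce each scan to one line
per AP and compare runs. The sketch below assumes the usual layout of
'iw dev <ifname> scan' output ("BSS <mac>(on <ifname>)" followed by
indented "freq:", "signal:" and "SSID:" lines). iw itself warns that
its output is not a stable interface, so treat this as a debugging aid
only. The function name summarize_scan is made up for illustration.

```shell
# Read `iw dev <ifname> scan` output on stdin and print one line per AP:
#   <BSSID> <freq MHz> <signal dBm> <SSID>
# Assumption: SSIDs containing spaces are cut at the first space.
summarize_scan() {
    awk '
        /^BSS / {
            # Flush the previous AP before starting a new one.
            if (bssid != "") print bssid, freq, signal, ssid
            bssid = substr($2, 1, 17)   # keep only the MAC address
            freq = signal = ssid = ""
        }
        $1 == "freq:"   { freq = $2 }
        $1 == "signal:" { signal = $2 }
        $1 == "SSID:"   { ssid = $2 }
        END { if (bssid != "") print bssid, freq, signal, ssid }
    '
}

# Usage (needs root for an active scan; the interface name is an example):
#   iw dev wlan0 scan | summarize_scan | sort > scan-bad.txt
```

Saving one summary while the bug is present and one after reloading
the module, then running diff on the two files, shows which APs and
channels drop out.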

> ath10k_pci 0000:3a:00.0: enabling device (0000 -> 0002)
> ath10k_pci 0000:3a:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:3a:00.0.bin failed with error -2
> ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/cal-pci-0000:3a:00.0.bin failed with error -2
> ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/QCA6174/hw3.0/firmware-5.bin failed with error -2
> ath10k_pci 0000:3a:00.0: could not fetch firmware file 'ath10k/QCA6174/hw3.0/firmware-5.bin': -2

You can ignore these "failed with error -2" messages; they just mean
that ath10k is trying to find the correct firmware image and
calibration data. We know they confuse users and have a patch pending:

https://patchwork.kernel.org/patch/9237095/
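
Error -2 is -ENOENT (no such file or directory): the driver probed for
a file that is not installed and moved on. Until the messages are
silenced in the driver, they can be filtered out when reading the log.
This is a minimal sketch; the function name filter_fw_probe_noise is
made up, and the patterns are taken from the messages quoted above and
assume each message is on a single line, as dmesg prints it.

```shell
# Read kernel log text on stdin and drop the harmless firmware-probe
# messages that end in -2 (-ENOENT), keeping everything else.
filter_fw_probe_noise() {
    grep -v \
        -e 'Direct firmware load for .* failed with error -2$' \
        -e 'could not fetch firmware file .*: -2$'
}

# Usage:
#   dmesg | grep ath10k | filter_fw_probe_noise
```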

-- 
Kalle Valo

* Re: Incomplete scan results after rf(un)kill
  2017-01-27 15:44 ` Valo, Kalle
@ 2017-01-29  2:38   ` Daniel J Blueman
  2017-01-30  8:04     ` Valo, Kalle
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel J Blueman @ 2017-01-29  2:38 UTC (permalink / raw)
  To: Valo, Kalle; +Cc: ath10k

On 27 January 2017 at 23:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
> Daniel J Blueman <daniel@quora.org> writes:
>
>> On 4.9.5 and previous, I've noticed that 20% of the time after
>> rfunkilling, AP scan results would often have only 1-2 out of the
>> previous 10 APs, and the situation would persist until removing and
>> reinserting the ath10k_pci module.
>>
>> This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
>> Ubuntu (ElementaryOS) 16.04 with updates.
>>
>> Could this have any relation to the missing firmware [1]?
>>
>> What state can I capture to help diagnose this?
>
> Do you see any pattern what APs are visible when the bug happens? For
> example, are those 1-2 APs always the same one? And are they ones with
> strongest signal strength or maybe related to certain channels?
>
> After the bug happens how does the device work otherwise? Have you
> tested performance (iperf) or signal strength? Is there any packet loss
> etc?

Many thanks for following up, Kalle! Interestingly, I do see the same
subset of APs (but not the strongest) in the NetworkManager GUI;
however, when I scan from the CLI, nothing is listed:

# nmcli dev wifi
*  SSID  MODE  CHAN  RATE  SIGNAL  BARS  SECURITY
#

The issue also occurs when coming out of suspend. I didn't check dmesg
other times, but this may or may not be related:

[  204.454815] WARNING: CPU: 3 PID: 0 at /home/kernel/COD/linux/net/core/dev.c:5161 net_rx_action+0x26e/0x380
[  204.454818] Modules linked in: rfcomm ccm xt_CHECKSUM
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 bridge
stp llc ebtable_filter ebtables bnep binfmt_misc f2fs arc4
nls_iso8859_1 hid_multitouch snd_hda_codec_hdmi dell_led
snd_hda_codec_realtek snd_hda_codec_generic snd_soc_skl
snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core
snd_soc_sst_match snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine
dell_laptop dell_wmi dell_smbios snd_hda_intel snd_hda_codec
i2c_designware_platform i2c_designware_core dcdbas snd_hda_core
snd_hwdep snd_pcm snd_seq_midi intel_rapl x86_pkg_temp_thermal
snd_seq_midi_event intel_powerclamp coretemp snd_rawmidi
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_seq uvcvideo
aesni_intel ath10k_pci videobuf2_vmalloc ath10k_core videobuf2_memops
[  204.454847]  videobuf2_v4l2 aes_x86_64 ath lrw snd_seq_device
glue_helper ablk_helper mac80211 input_leds cryptd snd_timer
videobuf2_core joydev videodev serio_raw media snd cfg80211
rtsx_pci_ms memstick soundcore btusb btrtl hci_uart btbcm btqca
soc_button_array btintel intel_vbtn bluetooth int3400_thermal acpi_pad
acpi_thermal_rel intel_lpss_acpi intel_hid int3403_thermal
sparse_keymap mac_hid idma64 acpi_als virt_dma
processor_thermal_device int340x_thermal_zone mei_me kfifo_buf
industrialio intel_pch_thermal mei intel_lpss_pci shpchp
intel_soc_dts_iosf intel_lpss kvm_intel kvm irqbypass ipt_REJECT
nf_reject_ipv4 xt_DSCP iptable_mangle xt_limit xt_tcpudp xt_addrtype
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6_tables
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp sch_fq_codel
nf_nat
[  204.454883]  nf_conntrack_ftp tcp_bbr nf_conntrack iptable_filter
parport_pc ip_tables x_tables ppdev lp parport autofs4 mmc_block
rtsx_pci_sdmmc i915 psmouse i2c_algo_bit nvme drm_kms_helper nvme_core
syscopyarea sysfillrect rtsx_pci sysimgblt fb_sys_fops drm i2c_hid wmi
hid pinctrl_sunrisepoint video pinctrl_intel fjes
[  204.454904] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G     U 4.9.6-040906-generic #201701260330
[  204.454906] Hardware name: Dell Inc. XPS 13 9360/0T3FTF, BIOS 1.3.2 01/18/2017
[  204.454908]  ffff911b3e583e68 ffffffff81c1bf92 0000000000000000 0000000000000000
[  204.454910]  ffff911b3e583ea8 ffffffff81883e7b 0000142900000040 ffff911b2f687c80
[  204.454913]  0000000000000040 0000000000000000 000000000000012c 0000000000000047
[  204.454916] Call Trace:
[  204.454917]  <IRQ>
[  204.454922]  [<ffffffff81c1bf92>] dump_stack+0x63/0x81
[  204.454925]  [<ffffffff81883e7b>] __warn+0xcb/0xf0
[  204.454927]  [<ffffffff81883fad>] warn_slowpath_null+0x1d/0x20
[  204.454928]  [<ffffffff81f73eae>] net_rx_action+0x26e/0x380
[  204.454933]  [<ffffffffc06f3e24>] ? ath10k_pci_interrupt_handler+0x74/0xd0 [ath10k_pci]
[  204.454935]  [<ffffffff82094654>] __do_softirq+0x104/0x28c
[  204.454937]  [<ffffffff8188a336>] irq_exit+0xb6/0xc0
[  204.454938]  [<ffffffff820943a4>] do_IRQ+0x54/0xd0
[  204.454940]  [<ffffffff82092482>] common_interrupt+0x82/0x82
[  204.454940]  <EOI>
[  204.454943]  [<ffffffff81f13dd2>] ? cpuidle_enter_state+0x122/0x2c0
[  204.454944]  [<ffffffff81f13fa7>] cpuidle_enter+0x17/0x20
[  204.454946]  [<ffffffff818c9f53>] call_cpuidle+0x23/0x40
[  204.454947]  [<ffffffff818ca1cb>] cpu_startup_entry+0x15b/0x240
[  204.454950]  [<ffffffff81851b74>] start_secondary+0x154/0x190

Daniel
-- 
Daniel J Blueman


* Re: Incomplete scan results after rf(un)kill
  2017-01-29  2:38   ` Daniel J Blueman
@ 2017-01-30  8:04     ` Valo, Kalle
  2017-01-30 12:56       ` Daniel J Blueman
  0 siblings, 1 reply; 6+ messages in thread
From: Valo, Kalle @ 2017-01-30  8:04 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: ath10k

Daniel J Blueman <daniel@quora.org> writes:

> On 27 January 2017 at 23:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>> Daniel J Blueman <daniel@quora.org> writes:
>>
>>> On 4.9.5 and previous, I've noticed that 20% of the time after
>>> rfunkilling, AP scan results would often have only 1-2 out of the
>>> previous 10 APs, and the situation would persist until removing and
>>> reinserting the ath10k_pci module.
>>>
>>> This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
>>> Ubuntu (ElementaryOS) 16.04 with updates.
>>>
>>> Could this have any relation to the missing firmware [1]?
>>>
>>> What state can I capture to help diagnose this?
>>
>> Do you see any pattern what APs are visible when the bug happens? For
>> example, are those 1-2 APs always the same one? And are they ones with
>> strongest signal strength or maybe related to certain channels?
>>
>> After the bug happens how does the device work otherwise? Have you
>> tested performance (iperf) or signal strength? Is there any packet loss
>> etc?
>
> Many thanks for following up Kalle! Interestingly, I do see the same
> subset of APs (but not the strongest) in the Networkmanager GUI,
> however when I scan from the CLI, nothing:
>
> # nmcli dev wifi
> *  SSID  MODE  CHAN  RATE  SIGNAL  BARS  SECURITY
> #

It's better to use 'sudo iw wlan0 scan' (or whatever the ath10k network
interface name is in your setup; systemd has made guessing the name
difficult). That provides more information and communicates directly
with the kernel.
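
Since the interface name varies between systems, it can be read from
'iw dev' rather than guessed. This is a minimal sketch that assumes
the usual "Interface <name>" lines in 'iw dev' output; the function
name wifi_ifaces is made up for illustration, and it lists every
wireless interface, not only ath10k ones.

```shell
# Read `iw dev` output on stdin and print one wireless interface
# name per line.
wifi_ifaces() {
    awk '$1 == "Interface" { print $2 }'
}

# Usage (the scan itself needs root):
#   for i in $(iw dev | wifi_ifaces); do
#       iw dev "$i" scan
#   done
```

On a machine with several wireless devices, the driver behind an
interface can be checked with
'readlink /sys/class/net/<ifname>/device/driver'.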

> The issue also occurs when coming out of suspend. I didn't check dmesg
> other times, but this may or may not be related:
>
> [  204.454815] WARNING: CPU: 3 PID: 0 at
> /home/kernel/COD/linux/net/core/dev.c:5161 net_rx_action+0x26e/0x380

It may very well be. Do you know exactly which warning line 5161
points to in your kernel version? With the latest ath.git master
branch, line 5182 in dev.c is this warning in napi_poll():

	WARN_ON_ONCE(work > weight);

But I do not know if you are seeing that warning or something else
because my kernel sources don't match what you use.
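
To see what a file:line pair from a WARNING refers to, print that line
from a source tree matching the running kernel. This is a minimal
sketch; the function name print_line is made up, and it assumes a
kernel tree checked out at the version shown in the trace.

```shell
# Print line number $2 of file $1.
print_line() {
    sed -n "${2}p" "$1"
}

# Usage, from the top of a kernel tree checked out at the running version:
#   print_line net/core/dev.c 5161
```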

-- 
Kalle Valo

* Re: Incomplete scan results after rf(un)kill
  2017-01-30  8:04     ` Valo, Kalle
@ 2017-01-30 12:56       ` Daniel J Blueman
  2017-02-07  6:59         ` Daniel J Blueman
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel J Blueman @ 2017-01-30 12:56 UTC (permalink / raw)
  To: Valo, Kalle; +Cc: ath10k

On 30 January 2017 at 16:04, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
> Daniel J Blueman <daniel@quora.org> writes:
>
>> On 27 January 2017 at 23:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>>> Daniel J Blueman <daniel@quora.org> writes:
>>>
>>>> On 4.9.5 and previous, I've noticed that 20% of the time after
>>>> rfunkilling, AP scan results would often have only 1-2 out of the
>>>> previous 10 APs, and the situation would persist until removing and
>>>> reinserting the ath10k_pci module.
>>>>
>>>> This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
>>>> Ubuntu (ElementaryOS) 16.04 with updates.
>>>>
>>>> Could this have any relation to the missing firmware [1]?
>>>>
>>>> What state can I capture to help diagnose this?
>>>
>>> Do you see any pattern what APs are visible when the bug happens? For
>>> example, are those 1-2 APs always the same one? And are they ones with
>>> strongest signal strength or maybe related to certain channels?
>>>
>>> After the bug happens how does the device work otherwise? Have you
>>> tested performance (iperf) or signal strength? Is there any packet loss
>>> etc?
>>
>> Many thanks for following up Kalle! Interestingly, I do see the same
>> subset of APs (but not the strongest) in the Networkmanager GUI,
>> however when I scan from the CLI, nothing:
>>
>> # nmcli dev wifi
>> *  SSID  MODE  CHAN  RATE  SIGNAL  BARS  SECURITY
>> #
>
> Better to use 'sudo iw wlan0 scan' (or whatever is the ath10k network
> interface name in your setup, systemd has made guessing the name
> difficult). That provides more information and communicates directly
> with kernel.

Good tip; I'll harvest this info at next occurrence.

>> The issue also occurs when coming out of suspend. I didn't check dmesg
>> other times, but this may or may not be related:
>>
>> [  204.454815] WARNING: CPU: 3 PID: 0 at
>> /home/kernel/COD/linux/net/core/dev.c:5161 net_rx_action+0x26e/0x380
>
> It may very well be. Do you know exactly to what warning that line 5161
> points to in your kernel version? With latest ath.git master branch line
> 5182 in dev.c is this warning napi_poll():
>
>         WARN_ON_ONCE(work > weight);
>
> But I do not know if you are seeing that warning or something else
> because my kernel sources doesn't match what you use.

I'm using the Ubuntu mainline builds; WARN_ON_ONCE(work > weight) is
at line 5161 in dev.c in 4.9.6.

Thanks,
  Daniel
-- 
Daniel J Blueman


* Re: Incomplete scan results after rf(un)kill
  2017-01-30 12:56       ` Daniel J Blueman
@ 2017-02-07  6:59         ` Daniel J Blueman
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel J Blueman @ 2017-02-07  6:59 UTC (permalink / raw)
  To: Valo, Kalle; +Cc: ath10k

On 30 January 2017 at 20:56, Daniel J Blueman <daniel@quora.org> wrote:
> On 30 January 2017 at 16:04, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>> Daniel J Blueman <daniel@quora.org> writes:
>>
>>> On 27 January 2017 at 23:44, Valo, Kalle <kvalo@qca.qualcomm.com> wrote:
>>>> Daniel J Blueman <daniel@quora.org> writes:
>>>>
>>>>> On 4.9.5 and previous, I've noticed that 20% of the time after
>>>>> rfunkilling, AP scan results would often have only 1-2 out of the
>>>>> previous 10 APs, and the situation would persist until removing and
>>>>> reinserting the ath10k_pci module.
>>>>>
>>>>> This is on my Dell XPS 13 9360 (Kaby Lake) with current BIOS 1.2.3 on
>>>>> Ubuntu (ElementaryOS) 16.04 with updates.
>>>>>
>>>>> Could this have any relation to the missing firmware [1]?
>>>>>
>>>>> What state can I capture to help diagnose this?
>>>>
>>>> Do you see any pattern what APs are visible when the bug happens? For
>>>> example, are those 1-2 APs always the same one? And are they ones with
>>>> strongest signal strength or maybe related to certain channels?
>>>>
>>>> After the bug happens how does the device work otherwise? Have you
>>>> tested performance (iperf) or signal strength? Is there any packet loss
>>>> etc?
>>>
>>> Many thanks for following up Kalle! Interestingly, I do see the same
>>> subset of APs (but not the strongest) in the Networkmanager GUI,
>>> however when I scan from the CLI, nothing:
>>>
>>> # nmcli dev wifi
>>> *  SSID  MODE  CHAN  RATE  SIGNAL  BARS  SECURITY
>>> #
>>
>> Better to use 'sudo iw wlan0 scan' (or whatever is the ath10k network
>> interface name in your setup, systemd has made guessing the name
>> difficult). That provides more information and communicates directly
>> with kernel.
>
> Good tip; I'll harvest this info at next occurrence.
>
>>> The issue also occurs when coming out of suspend. I didn't check dmesg
>>> other times, but this may or may not be related:
>>>
>>> [  204.454815] WARNING: CPU: 3 PID: 0 at
>>> /home/kernel/COD/linux/net/core/dev.c:5161 net_rx_action+0x26e/0x380
>>
>> It may very well be. Do you know exactly to what warning that line 5161
>> points to in your kernel version? With latest ath.git master branch line
>> 5182 in dev.c is this warning napi_poll():
>>
>>         WARN_ON_ONCE(work > weight);
>>
>> But I do not know if you are seeing that warning or something else
>> because my kernel sources doesn't match what you use.
>
> I'm using the ubuntu mainline builds; WARN_ON_ONCE(work > weight) is
> at line 5161 in dev.c in 4.9.6.

I have confirmed this behaviour multiple times now on 4.9.8.

Initially, the NetworkManager UI always shows just 1 AP, but the CLI
scan list is empty:
$ sudo nmcli dev wifi
*  SSID  MODE  CHAN  RATE  SIGNAL  BARS  SECURITY
$

However, after executing a scan with iw, all the expected APs are listed:
$ sudo iw wlp58s0 scan
..lots of APs
$

NetworkManager then connects to the expected AP, and 'sudo nmcli
dev wifi' shows all the APs in the iw scan.

Is this more likely a NetworkManager issue or a driver issue?
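
To put numbers on the difference, the two views can be counted side by
side before and after the manual iw scan. This is a minimal sketch;
the function names are made up, and it assumes that iw prints one
"BSS ..." line per AP and that 'nmcli -t -f BSSID dev wifi list'
prints one line per AP.

```shell
# Count APs in `iw dev <ifname> scan` output read on stdin.
count_iw_bss() {
    grep -c '^BSS ' || true     # grep exits non-zero when the count is 0
}

# Count non-empty lines on stdin, e.g. terse nmcli output.
count_lines() {
    grep -c . || true
}

# Usage (interface name as in the message above; the iw scan needs root):
#   nmcli -t -f BSSID dev wifi list | count_lines
#   iw dev wlp58s0 scan | count_iw_bss
#   nmcli -t -f BSSID dev wifi list | count_lines
```

If the nmcli count only recovers after the iw scan, that would suggest
the scan path works when triggered directly, but it would not by
itself tell NetworkManager and the driver apart.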

Thanks Kalle!

Dan
-- 
Daniel J Blueman

