All of lore.kernel.org
 help / color / mirror / Atom feed
* iwlwifi warnings in 5.5-rc1
@ 2019-12-10 20:46 Jens Axboe
  2019-12-11  8:36 ` Johannes Berg
  2019-12-11 13:45 ` Johannes Berg
  0 siblings, 2 replies; 25+ messages in thread
From: Jens Axboe @ 2019-12-10 20:46 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Luca Coelho; +Cc: linux-wireless, Networking

Hi,

Since the GRO issue got fixed, iwlwifi has worked fine for me.
However, on every boot, I get some warnings:

------------[ cut here ]------------
STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
WARNING: CPU: 0 PID: 606 at net/mac80211/sta_info.c:1931 ieee80211_sta_update_pending_airtime+0x11c/0x130 [mac80211]
Modules linked in: rfcomm ccm cmac msr bnep iwlmvm joydev x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mac80211 kvm binfmt_misc irqbypass nls_iso8859_1 snd_soc_skl libarc4 iwlwifi snd_soc_hdac_hda snd_hda_ext_core snd_hda_codec_hdmi snd_soc_acpi_intel_match hid_multitouch snd_soc_acpi snd_soc_sst_ipc snd_soc_sst_dsp wmi_bmof intel_wmi_thunderbolt snd_soc_core uvcvideo btusb snd_hda_codec_realtek crct10dif_pclmul btrtl crc32_pclmul ghash_clmulni_intel btbcm snd_hda_codec_generic btintel bluetooth videobuf2_vmalloc aesni_intel cfg80211 videobuf2_memops glue_helper videobuf2_v4l2 crypto_simd videobuf2_common cryptd intel_cstate videodev snd_hda_intel intel_rapl_perf snd_intel_dspcfg snd_hda_codec serio_raw input_leds snd_hwdep mc snd_hda_core mei_me ecdh_generic ecc snd_pcm mei snd_seq thinkpad_acpi intel_lpss_pci nvram processor_thermal_device ledtrig_audio ucsi_acpi intel_lpss intel_soc_dts_iosf typec_ucsi snd_timer idma64 intel_rapl_common virt_dma snd_seq_device
 typec intel_pch_thermal snd wmi int3403_thermal soundcore int340x_thermal_zone int3400_thermal acpi_pad acpi_thermal_rel sch_fq_codel ip_tables x_tables hid_generic usbhid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm e1000e nvme i2c_hid ptp nvme_core hid pps_core video
CPU: 0 PID: 606 Comm: irq/139-iwlwifi Tainted: G     U            5.5.0-rc1+ #4143
Hardware name: LENOVO 20QD001XUS/20QD001XUS, BIOS N2HET42W (1.25 ) 11/26/2019
RIP: 0010:ieee80211_sta_update_pending_airtime+0x11c/0x130 [mac80211]
Code: 99 a6 96 c4 0f 0b 8b 45 d4 eb a1 48 83 c6 40 45 89 e0 89 c1 89 45 d4 48 c7 c7 88 ec f7 c0 c6 05 6a af 09 00 01 e8 73 a6 96 c4 <0f> 0b 8b 45 d4 eb 91 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
RSP: 0018:ffffa6cfc092fb30 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 00000000ffffff30 RCX: 0000000000000000
RDX: 0000000000000049 RSI: ffffffff86d89729 RDI: ffffffff86d872ac
RBP: ffffa6cfc092fb60 R08: ffffffff86d896e0 R09: 0000000000000049
R10: abcc77118461cefd R11: ffffffff86d89729 R12: 00000000000000d0
R13: ffff8afb89fd07c0 R14: 0000000000000002 R15: ffff8afb8fcc05c8
FS:  0000000000000000(0000) GS:ffff8afbae400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f355c35aea0 CR3: 000000042c609005 CR4: 00000000001606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __ieee80211_tx_status+0x69b/0x8c0 [mac80211]
 ? sta_info_get_by_addrs+0x12e/0x1e0 [mac80211]
 ieee80211_tx_status+0x75/0xa0 [mac80211]
 iwl_mvm_tx_reclaim+0x2b5/0x3d0 [iwlmvm]
 iwl_mvm_rx_ba_notif+0x285/0x330 [iwlmvm]
 iwl_mvm_rx_common+0xde/0x2a0 [iwlmvm]
 iwl_mvm_rx_mq+0x71/0xb0 [iwlmvm]
 iwl_pcie_rx_handle+0x3b5/0xa70 [iwlwifi]
 ? irq_forced_thread_fn+0x80/0x80
 iwl_pcie_irq_rx_msix_handler+0x58/0x120 [iwlwifi]
 irq_thread_fn+0x23/0x60
 irq_thread+0xd8/0x170
 ? wake_threads_waitq+0x30/0x30
 kthread+0x103/0x140
 ? irq_thread_dtor+0xa0/0xa0
 ? kthread_park+0x90/0x90
 ret_from_fork+0x1f/0x30
---[ end trace 0c7c7c0cf1fc73ce ]---
------------[ cut here ]------------
Device phy0 AC 2 pending airtime underflow: 4294967088, 208
WARNING: CPU: 0 PID: 606 at net/mac80211/sta_info.c:1940 ieee80211_sta_update_pending_airtime+0xf6/0x130 [mac80211]
Modules linked in: rfcomm ccm cmac msr bnep iwlmvm joydev x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mac80211 kvm binfmt_misc irqbypass nls_iso8859_1 snd_soc_skl libarc4 iwlwifi snd_soc_hdac_hda snd_hda_ext_core snd_hda_codec_hdmi snd_soc_acpi_intel_match hid_multitouch snd_soc_acpi snd_soc_sst_ipc snd_soc_sst_dsp wmi_bmof intel_wmi_thunderbolt snd_soc_core uvcvideo btusb snd_hda_codec_realtek crct10dif_pclmul btrtl crc32_pclmul ghash_clmulni_intel btbcm snd_hda_codec_generic btintel bluetooth videobuf2_vmalloc aesni_intel cfg80211 videobuf2_memops glue_helper videobuf2_v4l2 crypto_simd videobuf2_common cryptd intel_cstate videodev snd_hda_intel intel_rapl_perf snd_intel_dspcfg snd_hda_codec serio_raw input_leds snd_hwdep mc snd_hda_core mei_me ecdh_generic ecc snd_pcm mei snd_seq thinkpad_acpi intel_lpss_pci nvram processor_thermal_device ledtrig_audio ucsi_acpi intel_lpss intel_soc_dts_iosf typec_ucsi snd_timer idma64 intel_rapl_common virt_dma snd_seq_device
 typec intel_pch_thermal snd wmi int3403_thermal soundcore int340x_thermal_zone int3400_thermal acpi_pad acpi_thermal_rel sch_fq_codel ip_tables x_tables hid_generic usbhid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm e1000e nvme i2c_hid ptp nvme_core hid pps_core video
CPU: 0 PID: 606 Comm: irq/139-iwlwifi Tainted: G     U  W         5.5.0-rc1+ #4143
Hardware name: LENOVO 20QD001XUS/20QD001XUS, BIOS N2HET42W (1.25 ) 11/26/2019
RIP: 0010:ieee80211_sta_update_pending_airtime+0xf6/0x130 [mac80211]
Code: 48 8b b2 90 01 00 00 48 85 f6 75 07 48 8b b2 40 01 00 00 45 89 e0 89 c1 44 89 f2 89 45 d4 48 c7 c7 c0 ec f7 c0 e8 99 a6 96 c4 <0f> 0b 8b 45 d4 eb a1 48 83 c6 40 45 89 e0 89 c1 89 45 d4 48 c7 c7
RSP: 0018:ffffa6cfc092fb30 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 00000000ffffff30 RCX: 0000000000000000
RDX: 000000000000003b RSI: ffffffff86d8971b RDI: ffffffff86d872ac
RBP: ffffa6cfc092fb60 R08: ffffffff86d896e0 R09: 000000000000003b
R10: abcc77118461cefd R11: ffffffff86d8971b R12: 00000000000000d0
R13: ffff8afb89fd07c0 R14: 0000000000000002 R15: ffff8afb8fcc05c8
FS:  0000000000000000(0000) GS:ffff8afbae400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f355c35aea0 CR3: 000000042c609005 CR4: 00000000001606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __ieee80211_tx_status+0x69b/0x8c0 [mac80211]
 ? sta_info_get_by_addrs+0x12e/0x1e0 [mac80211]
 ieee80211_tx_status+0x75/0xa0 [mac80211]
 iwl_mvm_tx_reclaim+0x2b5/0x3d0 [iwlmvm]
 iwl_mvm_rx_ba_notif+0x285/0x330 [iwlmvm]
 iwl_mvm_rx_common+0xde/0x2a0 [iwlmvm]
 iwl_mvm_rx_mq+0x71/0xb0 [iwlmvm]
 iwl_pcie_rx_handle+0x3b5/0xa70 [iwlwifi]
 ? irq_forced_thread_fn+0x80/0x80
 iwl_pcie_irq_rx_msix_handler+0x58/0x120 [iwlwifi]
 irq_thread_fn+0x23/0x60
 irq_thread+0xd8/0x170
 ? wake_threads_waitq+0x30/0x30
 kthread+0x103/0x140
 ? irq_thread_dtor+0xa0/0xa0
 ? kthread_park+0x90/0x90
 ret_from_fork+0x1f/0x30
---[ end trace 0c7c7c0cf1fc73cf ]---

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-10 20:46 iwlwifi warnings in 5.5-rc1 Jens Axboe
@ 2019-12-11  8:36 ` Johannes Berg
  2019-12-11  8:53   ` Toke Høiland-Jørgensen
  2019-12-21  0:55   ` Jens Axboe
  2019-12-11 13:45 ` Johannes Berg
  1 sibling, 2 replies; 25+ messages in thread
From: Johannes Berg @ 2019-12-11  8:36 UTC (permalink / raw)
  To: Jens Axboe, Emmanuel Grumbach, Luca Coelho; +Cc: linux-wireless, Networking

On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
> Hi,
> 
> Since the GRO issue got fixed, iwlwifi has worked fine for me.
> However, on every boot, I get some warnings:
> 
> ------------[ cut here ]------------
> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208

Yeah, we've seen a few reports of this.

I guess I kinda feel responsible for this since I merged the AQL work,
I'll take a look as soon as I can.

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11  8:36 ` Johannes Berg
@ 2019-12-11  8:53   ` Toke Høiland-Jørgensen
  2019-12-11 10:11     ` Johannes Berg
  2019-12-21  0:55   ` Jens Axboe
  1 sibling, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11  8:53 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
>> Hi,
>> 
>> Since the GRO issue got fixed, iwlwifi has worked fine for me.
>> However, on every boot, I get some warnings:
>> 
>> ------------[ cut here ]------------
>> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
>
> Yeah, we've seen a few reports of this.

FWIW I've tried reproducing but I don't get the error with the 8265 /
8275 chip in my laptop. I've thought about sending a patch for mac80211
to just clear the tx_time_est field after calling
ieee80211_sta_update_pending_airtime() - that should prevent any errors
from double-reporting of skbs (which is what I'm guessing is going on
here). However, it kinda feels like a band-aid so I'd much rather figure
out why this particular driver/device combo cause this :)

> I guess I kinda feel responsible for this since I merged the AQL work,
> I'll take a look as soon as I can.

Sounds good, thanks!

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11  8:53   ` Toke Høiland-Jørgensen
@ 2019-12-11 10:11     ` Johannes Berg
  2019-12-11 10:23       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 10:11 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 09:53 +0100, Toke Høiland-Jørgensen wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
> > > Hi,
> > > 
> > > Since the GRO issue got fixed, iwlwifi has worked fine for me.
> > > However, on every boot, I get some warnings:
> > > 
> > > ------------[ cut here ]------------
> > > STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
> > 
> > Yeah, we've seen a few reports of this.
> 
> FWIW I've tried reproducing but I don't get the error with the 8265 /
> 8275 chip in my laptop. I've thought about sending a patch for mac80211
> to just clear the tx_time_est field after calling
> ieee80211_sta_update_pending_airtime() - that should prevent any errors
> from double-reporting of skbs (which is what I'm guessing is going on
> here).

It does feel like it, but I'm not sure how that'd be possible?

OK, I talked with Emmanuel and I think it's the GSO path - it'll end up
with skb_clone() and then report both of them back.

Regardless, I think I'll probably have to disable AQL and make it more
opt-in for the driver - I found a bunch of other issues ...

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 10:11     ` Johannes Berg
@ 2019-12-11 10:23       ` Toke Høiland-Jørgensen
  2019-12-11 11:51         ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 10:23 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 09:53 +0100, Toke Høiland-Jørgensen wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
>> > > Hi,
>> > > 
>> > > Since the GRO issue got fixed, iwlwifi has worked fine for me.
>> > > However, on every boot, I get some warnings:
>> > > 
>> > > ------------[ cut here ]------------
>> > > STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
>> > 
>> > Yeah, we've seen a few reports of this.
>> 
>> FWIW I've tried reproducing but I don't get the error with the 8265 /
>> 8275 chip in my laptop. I've thought about sending a patch for mac80211
>> to just clear the tx_time_est field after calling
>> ieee80211_sta_update_pending_airtime() - that should prevent any errors
>> from double-reporting of skbs (which is what I'm guessing is going on
>> here).
>
> It does feel like it, but I'm not sure how that'd be possible?
>
> OK, I talked with Emmanuel and I think it's the GSO path - it'll end up
> with skb_clone() and then report both of them back.

Right, figured it was something like that; just couldn't find the place
in the driver where it did that from my cursory browsing.

> Regardless, I think I'll probably have to disable AQL and make it more
> opt-in for the driver - I found a bunch of other issues ...

Issues like what? Making it opt-in was going to be my backup plan; I was
kinda hoping we could work out any issues so it would be a "no harm"
kind of thing that could be left as always-on. Maybe that was a bit too
optimistic; but it's also a pain having to keep track of which drivers
have which features/fixes...

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 10:23       ` Toke Høiland-Jørgensen
@ 2019-12-11 11:51         ` Johannes Berg
  2019-12-11 13:42           ` Johannes Berg
  2019-12-11 14:02           ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 11:51 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

Hi Toke,

> > OK, I talked with Emmanuel and I think it's the GSO path - it'll end up
> > with skb_clone() and then report both of them back.
> 
> Right, figured it was something like that; just couldn't find the place
> in the driver where it did that from my cursory browsing.

Yeah, deeply hidden in the guts :)

> > Regardless, I think I'll probably have to disable AQL and make it more
> > opt-in for the driver - I found a bunch of other issues ...
> 
> Issues like what? Making it opt-in was going to be my backup plan; I was
> kinda hoping we could work out any issues so it would be a "no harm"
> kind of thing that could be left as always-on. Maybe that was a bit too
> optimistic; but it's also a pain having to keep track of which drivers
> have which features/fixes...

Sorry to keep you in suspense, had to run when I sent that email and
didn't have time to elaborate.

1) Hardware building A-MPDU will probably make the airtime estimate
   quite a bit wrong. Maybe this doesn't matter? But I wasn't sure how
   this works now with ath10k where (most of?) the testing was.

2) GSO/TSO like what we have - it's not really clear how to handle it.
   The airtime estimate will shoot *way* up (64kB frame) once that frame
   enters, and then ... should it "trickle back down" as the generated 
   parts are transmitted? But then the driver needs to split up the
   airtime estimate? Or should it just go back down entirely? On the
   first frame? That might overshoot. On the last frame? Opposite
   problem ...

3) I'm not quite convinced that all drivers report the TX rate
   completely correctly in the status, some don't even use this path
   but the ieee80211_tx_status_ext() which doesn't update the rate.

4) Probably most importantly, this is completely broken with HE because
   there's currently no way to report HE rates in the TX status at all!
   I just worked around that in our driver for 'iw' reporting purposes
   by implementing the rate reporting in the sta_statistics callback,
   but that data won't be used by the airtime estimates.


Now, (1) probably doesn't matter, the estimates don't need to be that
accurate. (2) I'm not sure how to solve; (3) and (4) could both be
solved by having some mechanism of the rate scaling to tell us what the
current rate is whenever it updates, rather than relying on the
last_rate. Really we should do that much more, and even phase out
last_rate entirely, it's a stupid concept.


There's an additional wrinkle here - what about HE scheduled mode, where
the AP decided when and at what rate you're allowed to send? We don't
report that at all, not even as part of rate scaling, since rate scaling
only affects *our* decision, not when we send as a response to a trigger
frame. This is _still_ relevant for AQL, but there we can only see what
the AP used last (**), but we don't track that now nor do we track the
proportion of packets that we sent triggered vs. normal medium access...


(**) like it does with our rate scaling today (***), but in fact we can
do better since that's local information.

(***) There's an additional problem here - if you _just_ transmitted a
management frame, you'll have a last_rate of say 6Mbps severely over-
estimating the airtime for the next data frame ...



Overall, at least for devices capable of HE we currently _have_ to
disable this, and we need more infrastructure that we cannot build in
the short term to fix that.

I'm also not sure that the last_rate is reliable enough in general, both
because of the management frame issue and because of HW offloaded rate
scaling issues that I'm not convinced reports this correctly for all
drivers.

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 11:51         ` Johannes Berg
@ 2019-12-11 13:42           ` Johannes Berg
  2019-12-11 14:04             ` Toke Høiland-Jørgensen
  2019-12-11 14:02           ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 13:42 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

Btw, there's *another* issue. You said in the commit log:

    This patch does *not* include any mechanism to wake a throttled TXQ again,
    on the assumption that this will happen anyway as a side effect of whatever
    freed the skb (most commonly a TX completion).

Thinking about this some more, I'm not convinced that this assumption
holds. You could have been stopped due to the global limit, and now you
wake some queue but the TXQ is empty - now you should reschedule some
*other* TXQ since the global limit had kicked in, not the per-TXQ limit,
and prevented dequeuing, no?

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-10 20:46 iwlwifi warnings in 5.5-rc1 Jens Axboe
  2019-12-11  8:36 ` Johannes Berg
@ 2019-12-11 13:45 ` Johannes Berg
  2019-12-11 14:09   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 13:45 UTC (permalink / raw)
  To: Jens Axboe, Emmanuel Grumbach, Luca Coelho, Jason A. Donenfeld,
	Steve French
  Cc: linux-wireless, Networking, Toke Høiland-Jørgensen

++ others who reported this

On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:

> ------------[ cut here ]------------
> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208

We think this is due to TSO, the change below will disable the AQL again
for now until we can figure out how to really fix it. I think I'll do
the equivalent for 5.5 and maybe leave it enabled only for ath10k, or
something like that ...

johannes


diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 6cca0853f183..4c2b5ba3ac09 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -672,9 +672,7 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 			IEEE80211_DEFAULT_AQL_TXQ_LIMIT_H;
 	}
 
-	local->airtime_flags = AIRTIME_USE_TX |
-			       AIRTIME_USE_RX |
-			       AIRTIME_USE_AQL;
+	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
 	local->aql_threshold = IEEE80211_AQL_THRESHOLD;
 	atomic_set(&local->aql_total_pending_airtime, 0);
 
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 8eafd81e97b4..a14d0dac52e8 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1916,6 +1916,9 @@ void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
 {
 	int tx_pending;
 
+	if (!(local->airtime_flags & AIRTIME_USE_AQL))
+		return;
+
 	if (!tx_completed) {
 		if (sta)
 			atomic_add(tx_airtime,



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 11:51         ` Johannes Berg
  2019-12-11 13:42           ` Johannes Berg
@ 2019-12-11 14:02           ` Toke Høiland-Jørgensen
  2019-12-11 21:17             ` Johannes Berg
  1 sibling, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 14:02 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> Hi Toke,
>
>> > OK, I talked with Emmanuel and I think it's the GSO path - it'll end up
>> > with skb_clone() and then report both of them back.
>> 
>> Right, figured it was something like that; just couldn't find the place
>> in the driver where it did that from my cursory browsing.
>
> Yeah, deeply hidden in the guts :)
>
>> > Regardless, I think I'll probably have to disable AQL and make it more
>> > opt-in for the driver - I found a bunch of other issues ...
>> 
>> Issues like what? Making it opt-in was going to be my backup plan; I was
>> kinda hoping we could work out any issues so it would be a "no harm"
>> kind of thing that could be left as always-on. Maybe that was a bit too
>> optimistic; but it's also a pain having to keep track of which drivers
>> have which features/fixes...
>
> Sorry to keep you in suspense, had to run when I sent that email and
> didn't have time to elaborate.
>
> 1) Hardware building A-MPDU will probably make the airtime estimate
>    quite a bit wrong. Maybe this doesn't matter? But I wasn't sure how
>    this works now with ath10k where (most of?) the testing was.

Yeah, not too worried about this...

> 2) GSO/TSO like what we have - it's not really clear how to handle it.
>    The airtime estimate will shoot *way* up (64kB frame) once that frame
>    enters, and then ... should it "trickle back down" as the generated 
>    parts are transmitted? But then the driver needs to split up the
>    airtime estimate? Or should it just go back down entirely? On the
>    first frame? That might overshoot. On the last frame? Opposite
>    problem ...

Well, ideally it would be divided out over the component packets; but
yeah, who is going to do that? I think reporting it on the first packet
would be the safest if we had to choose. Also, ideally we would want the
GSO/TSO mechanism to lower the size of the superpackets at lower rates
(does it?). At higher rates this matters less...

> 3) I'm not quite convinced that all drivers report the TX rate
>    completely correctly in the status, some don't even use this path
>    but the ieee80211_tx_status_ext() which doesn't update the rate.
>
> 4) Probably most importantly, this is completely broken with HE because
>    there's currently no way to report HE rates in the TX status at all!
>    I just worked around that in our driver for 'iw' reporting purposes
>    by implementing the rate reporting in the sta_statistics callback,
>    but that data won't be used by the airtime estimates.

Hmm, yeah, both of those are good points. I guess I just kinda assumed
that the drivers were already doing the right thing there... :)

> Now, (1) probably doesn't matter, the estimates don't need to be that
> accurate. (2) I'm not sure how to solve; (3) and (4) could both be
> solved by having some mechanism of the rate scaling to tell us what the
> current rate is whenever it updates, rather than relying on the
> last_rate. Really we should do that much more, and even phase out
> last_rate entirely, it's a stupid concept.

Yes, that last bit would be good!

> There's an additional wrinkle here - what about HE scheduled mode, where
> the AP decided when and at what rate you're allowed to send? We don't
> report that at all, not even as part of rate scaling, since rate scaling
> only affects *our* decision, not when we send as a response to a trigger
> frame. This is _still_ relevant for AQL, but there we can only see what
> the AP used last (**), but we don't track that now nor do we track the
> proportion of packets that we sent triggered vs. normal medium
> access...

Huh, wasn't aware that was a thing in HE; that's cool! And yeah, could
have interesting interactions with AQL...

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 13:42           ` Johannes Berg
@ 2019-12-11 14:04             ` Toke Høiland-Jørgensen
  2019-12-11 14:12               ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 14:04 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> Btw, there's *another* issue. You said in the commit log:
>
>     This patch does *not* include any mechanism to wake a throttled TXQ again,
>     on the assumption that this will happen anyway as a side effect of whatever
>     freed the skb (most commonly a TX completion).
>
> Thinking about this some more, I'm not convinced that this assumption
> holds. You could have been stopped due to the global limit, and now you
> wake some queue but the TXQ is empty - now you should reschedule some
> *other* TXQ since the global limit had kicked in, not the per-TXQ limit,
> and prevented dequeuing, no?

Well if you hit the global limit that means you have 24ms worth of data
queued in the hardware; those should be completed in turn, and enable
more to be dequeued, no?

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 13:45 ` Johannes Berg
@ 2019-12-11 14:09   ` Toke Høiland-Jørgensen
  2019-12-11 14:13     ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 14:09 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho,
	Jason A. Donenfeld, Steve French
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> ++ others who reported this
>
> On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
>
>> ------------[ cut here ]------------
>> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
>
> We think this is due to TSO, the change below will disable the AQL again
> for now until we can figure out how to really fix it. I think I'll do
> the equivalent for 5.5 and maybe leave it enabled only for ath10k, or
> something like that ...

If we're doing this on a per-driver basis, let's make it a proper
NL80211_EXT_FEATURE and expose it to userspace; that way users can at
least discover if it's supported on their device. I can send a patch
adding that...

Maybe we should untangle this from airtime_flags completely, since if we
just use the flags people could conceivably end up disabling it by
mistake, couldn't they?

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:04             ` Toke Høiland-Jørgensen
@ 2019-12-11 14:12               ` Johannes Berg
  2019-12-11 14:47                 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 14:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 15:04 +0100, Toke Høiland-Jørgensen wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > Btw, there's *another* issue. You said in the commit log:
> > 
> >     This patch does *not* include any mechanism to wake a throttled TXQ again,
> >     on the assumption that this will happen anyway as a side effect of whatever
> >     freed the skb (most commonly a TX completion).
> > 
> > Thinking about this some more, I'm not convinced that this assumption
> > holds. You could have been stopped due to the global limit, and now you
> > wake some queue but the TXQ is empty - now you should reschedule some
> > *other* TXQ since the global limit had kicked in, not the per-TXQ limit,
> > and prevented dequeuing, no?
> 
> Well if you hit the global limit that means you have 24ms worth of data
> queued in the hardware; those should be completed in turn, and enable
> more to be dequeued, no?

Yes, but on which queues?

Say you have some queues - some (Q1-Qn) got a LOT of traffic, and
another (Q0) just has some interactive traffic.

You could then end up in a situation where you have 24ms queued up on
Q1-Qn (with n high enough to not have hit the per-queue AQL limit),
right?

Say also the last frame on Q0 was dequeued by the hardware, but the
tx_dequeue() got NULL because of the AQL limit having been eaten up by
all the packets on Q1-Qn.

Now you'll no longer get a new dequeue attempt on Q0 (it was already
empty last time, so no hardware reclaim to trigger new dequeues), and a
new dequeue on the *other* queues will not do anything for this queue.

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:09   ` Toke Høiland-Jørgensen
@ 2019-12-11 14:13     ` Johannes Berg
  2019-12-11 14:55       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 14:13 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho, Jason A. Donenfeld, Steve French
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 15:09 +0100, Toke Høiland-Jørgensen wrote:
> 
> If we're doing this on a per-driver basis, let's make it a proper
> NL80211_EXT_FEATURE and expose it to userspace; that way users can at
> least discover if it's supported on their device. I can send a patch
> adding that...

Sure. Just didn't get to that yet, but if you want to send a patch
that's very welcome. I have to run out now, will be back in the evening
at most.

> Maybe we should untangle this from airtime_flags completely, since if we
> just use the flags people could conceivably end up disabling it by
> mistake, couldn't they?

Yes, that sounds like a good plan, now I was wondering why it's there
anyway.

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:12               ` Johannes Berg
@ 2019-12-11 14:47                 ` Toke Høiland-Jørgensen
  2019-12-11 21:18                   ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 14:47 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 15:04 +0100, Toke Høiland-Jørgensen wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > Btw, there's *another* issue. You said in the commit log:
>> > 
>> >     This patch does *not* include any mechanism to wake a throttled TXQ again,
>> >     on the assumption that this will happen anyway as a side effect of whatever
>> >     freed the skb (most commonly a TX completion).
>> > 
>> > Thinking about this some more, I'm not convinced that this assumption
>> > holds. You could have been stopped due to the global limit, and now you
>> > wake some queue but the TXQ is empty - now you should reschedule some
>> > *other* TXQ since the global limit had kicked in, not the per-TXQ limit,
>> > and prevented dequeuing, no?
>> 
>> Well if you hit the global limit that means you have 24ms worth of data
>> queued in the hardware; those should be completed in turn, and enable
>> more to be dequeued, no?
>
> Yes, but on which queues?
>
> Say you have some queues - some (Q1-Qn) got a LOT of traffic, and
> another (Q0) just has some interactive traffic.
>
> You could then end up in a situation where you have 24ms queued up on
> Q1-Qn (with n high enough to not have hit the per-queue AQL limit),
> right?
>
> Say also the last frame on Q0 was dequeued by the hardware, but the
> tx_dequeue() got NULL because of the AQL limit having been eaten up by
> all the packets on Q1-Qn.
>
> Now you'll no longer get a new dequeue attempt on Q0 (it was already
> empty last time, so no hardware reclaim to trigger new dequeues), and a
> new dequeue on the *other* queues will not do anything for this queue.

Oh, right, I see; yeah, that could probably happen. I guess we could
either kick all available queues whenever the global limit goes from
"above" to "below"; or we could remove the "return NULL" logic from
tx_dequeue() and rely on next_txq() to throttle. I think the latter is
probably simpler, but I'm a little worried that the throttling will
become too lax (because the driver can keep dequeueing in the same
scheduling round)...

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:13     ` Johannes Berg
@ 2019-12-11 14:55       ` Toke Høiland-Jørgensen
  2019-12-11 21:19         ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-11 14:55 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho,
	Jason A. Donenfeld, Steve French
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 15:09 +0100, Toke Høiland-Jørgensen wrote:
>> 
>> If we're doing this on a per-driver basis, let's make it a proper
>> NL80211_EXT_FEATURE and expose it to userspace; that way users can at
>> least discover if it's supported on their device. I can send a patch
>> adding that...
>
> Sure. Just didn't get to that yet, but if you want to send a patch
> that's very welcome. I have to run out now, will be back in the evening
> at most.

Patch here (for those not following linux-wireless):
https://patchwork.kernel.org/project/linux-wireless/list/?series=215107

>> Maybe we should untangle this from airtime_flags completely, since if we
>> just use the flags people could conceivably end up disabling it by
>> mistake, couldn't they?
>
> Yes, that sounds like a good plan, now I was wondering why it's there
> anyway.

Convenience, I think :)

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:02           ` Toke Høiland-Jørgensen
@ 2019-12-11 21:17             ` Johannes Berg
  2019-12-12 10:55               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 21:17 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 15:02 +0100, Toke Høiland-Jørgensen wrote:

> > 2) GSO/TSO like what we have - it's not really clear how to handle it.
> >    The airtime estimate will shoot *way* up (64kB frame) once that frame
> >    enters, and then ... should it "trickle back down" as the generated 
> >    parts are transmitted? But then the driver needs to split up the
> >    airtime estimate? Or should it just go back down entirely? On the
> >    first frame? That might overshoot. On the last frame? Opposite
> >    problem ...
> 
> Well, ideally it would be divided out over the component packets; but
> yeah, who is going to do that?

I'm not even sure we *can* do this easily - do we know up-front how many
packets this will expand to? We should know, but it might not be so easy
given the abstraction layers. We could guess and if it's wrong just set
it to 0 on any remaining ones.

> I think reporting it on the first packet
> would be the safest if we had to choose. 

Agree.

> Also, ideally we would want the
> GSO/TSO mechanism to lower the size of the superpackets at lower rates
> (does it?). At higher rates this matters less...

Well TCP does limit (pacing shift) the amount of outstanding data, so if
it's _really_ slow I guess it will also limit the size of the
superpackets?

It's really just an artifact of our software implementation that we
report the SKBs back as used with partial content. Maybe we shouldn't
even do that, since they weren't generated by mac80211 in the first
place, and only report the original skb or something.

> > 3) I'm not quite convinced that all drivers report the TX rate
> >    completely correctly in the status, some don't even use this path
> >    but the ieee80211_tx_status_ext() which doesn't update the rate.
> > 
> > 4) Probably most importantly, this is completely broken with HE because
> >    there's currently no way to report HE rates in the TX status at all!
> >    I just worked around that in our driver for 'iw' reporting purposes
> >    by implementing the rate reporting in the sta_statistics callback,
> >    but that data won't be used by the airtime estimates.
> 
> Hmm, yeah, both of those are good points. I guess I just kinda assumed
> that the drivers were already doing the right thing there... :)

I'm not really sure I want to rely on this - this was never really
needed *functionally*, just from a *statistics* point of view (e.g. "iw
link" or such).

> > Now, (1) probably doesn't matter, the estimates don't need to be that
> > accurate. (2) I'm not sure how to solve; (3) and (4) could both be
> > solved by having some mechanism of the rate scaling to tell us what the
> > current rate is whenever it updates, rather than relying on the
> > last_rate. Really we should do that much more, and even phase out
> > last_rate entirely, it's a stupid concept.
> 
> Yes, that last bit would be good!

We already partially have this, we have a 'get best rate' or so callback
in the rate scaling, we'd just have to extend it to the driver ops for
offloaded rate scaling.

Ideally, it'd be a function call from the rate scaling to mac80211 so we
don't have to call a function every time we need the value, but the rate
scaling just calls us whenever it updates. This would even work with
iwlwifi's offloaded algorithm - it notifies the host on all changes.

> > There's an additional wrinkle here - what about HE scheduled mode, where
> > the AP decided when and at what rate you're allowed to send? We don't
> > report that at all, not even as part of rate scaling, since rate scaling
> > only affects *our* decision, not when we send as a response to a trigger
> > frame. This is _still_ relevant for AQL, but there we can only see what
> > the AP used last (**), but we don't track that now nor do we track the
> > proportion of packets that we sent triggered vs. normal medium
> > access...
> 
> Huh, wasn't aware that was a thing in HE; that's cool! And yeah, could
> have interesting interactions with AQL...

Yeah ...

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:47                 ` Toke Høiland-Jørgensen
@ 2019-12-11 21:18                   ` Johannes Berg
  2019-12-12 10:45                     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 21:18 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 15:47 +0100, Toke Høiland-Jørgensen wrote:

> > Say you have some queues - some (Q1-Qn) got a LOT of traffic, and
> > another (Q0) just has some interactive traffic.
> > 
> > You could then end up in a situation where you have 24ms queued up on
> > Q1-Qn (with n high enough to not have hit the per-queue AQL limit),
> > right?
> > 
> > Say also the last frame on Q0 was dequeued by the hardware, but the
> > tx_dequeue() got NULL because of the AQL limit having been eaten up by
> > all the packets on Q1-Qn.
> > 
> > Now you'll no longer get a new dequeue attempt on Q0 (it was already
> > empty last time, so no hardware reclaim to trigger new dequeues), and a
> > new dequeue on the *other* queues will not do anything for this queue.
> 
> Oh, right, I see; yeah, that could probably happen. I guess we could
> either kick all available queues whenever the global limit goes from
> "above" to "below"; or we could remove the "return NULL" logic from
> tx_dequeue() and rely on next_txq() to throttle. I think the latter is
> probably simpler, but I'm a little worried that the throttling will
> become too lax (because the driver can keep dequeueing in the same
> scheduling round)...

I honestly have no idea what's better ... :) You're the expert, I'm just
poking holes into it ;-)

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 14:55       ` Toke Høiland-Jørgensen
@ 2019-12-11 21:19         ` Johannes Berg
  2019-12-12 10:55           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-11 21:19 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho, Jason A. Donenfeld, Steve French
  Cc: linux-wireless, Networking

On Wed, 2019-12-11 at 15:55 +0100, Toke Høiland-Jørgensen wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > On Wed, 2019-12-11 at 15:09 +0100, Toke Høiland-Jørgensen wrote:
> > > If we're doing this on a per-driver basis, let's make it a proper
> > > NL80211_EXT_FEATURE and expose it to userspace; that way users can at
> > > least discover if it's supported on their device. I can send a patch
> > > adding that...
> > 
> > Sure. Just didn't get to that yet, but if you want to send a patch
> > that's very welcome. I have to run out now, will be back in the evening
> > at most.
> 
> Patch here (for those not following linux-wireless):
> https://patchwork.kernel.org/project/linux-wireless/list/?series=215107

Thanks!

Maybe I should roll that into a single patch so it's actually easier to
apply as a bugfix while keeping ath10k on AQL for 5.5, otherwise it
could be argued that the ath10k patch is a feature for -next ...

> > > Maybe we should untangle this from airtime_flags completely, since if we
> > > just use the flags people could conceivably end up disabling it by
> > > mistake, couldn't they?
> > 
> > Yes, that sounds like a good plan, now I was wondering why it's there
> > anyway.
> 
> Convenience, I think :)

:)

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 21:18                   ` Johannes Berg
@ 2019-12-12 10:45                     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-12 10:45 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 15:47 +0100, Toke Høiland-Jørgensen wrote:
>
>> > Say you have some queues - some (Q1-Qn) got a LOT of traffic, and
>> > another (Q0) just has some interactive traffic.
>> > 
>> > You could then end up in a situation where you have 24ms queued up on
>> > Q1-Qn (with n high enough to not have hit the per-queue AQL limit),
>> > right?
>> > 
>> > Say also the last frame on Q0 was dequeued by the hardware, but the
>> > tx_dequeue() got NULL because of the AQL limit having been eaten up by
>> > all the packets on Q1-Qn.
>> > 
>> > Now you'll no longer get a new dequeue attempt on Q0 (it was already
>> > empty last time, so no hardware reclaim to trigger new dequeues), and a
>> > new dequeue on the *other* queues will not do anything for this queue.
>> 
>> Oh, right, I see; yeah, that could probably happen. I guess we could
>> either kick all available queues whenever the global limit goes from
>> "above" to "below"; or we could remove the "return NULL" logic from
>> tx_dequeue() and rely on next_txq() to throttle. I think the latter is
>> probably simpler, but I'm a little worried that the throttling will
>> become too lax (because the driver can keep dequeueing in the same
>> scheduling round)...
>
> I honestly have no idea what's better ... :)

Right, I guess we'll have to go and measure. Let's leave it as-is for
now, then, and we can adjust in a separate patch.

> You're the expert, I'm just poking holes into it ;-)

And you're doing that very well, as it turns out; thanks! ;)

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 21:17             ` Johannes Berg
@ 2019-12-12 10:55               ` Toke Høiland-Jørgensen
  2019-12-12 11:00                 ` Johannes Berg
  0 siblings, 1 reply; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-12 10:55 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 15:02 +0100, Toke Høiland-Jørgensen wrote:
>
>> > 2) GSO/TSO like what we have - it's not really clear how to handle it.
>> >    The airtime estimate will shoot *way* up (64kB frame) once that frame
>> >    enters, and then ... should it "trickle back down" as the generated 
>> >    parts are transmitted? But then the driver needs to split up the
>> >    airtime estimate? Or should it just go back down entirely? On the
>> >    first frame? That might overshoot. On the last frame? Opposite
>> >    problem ...
>> 
>> Well, ideally it would be divided out over the component packets; but
>> yeah, who is going to do that?
>
> I'm not even sure we *can* do this easily - do we know up-front how many
> packets this will expand to? We should know, but it might not be so easy
> given the abstraction layers. We could guess and if it's wrong just set
> it to 0 on any remaining ones.

I was thinking about a scheme where we re-defined the value in the cb to
be a "time per byte" value, that we could just multiply by the packet
length; that would make it trivial to do partial reporting. Not sure
it's quite workable in practice, though; it would be hard to avoid
rounding errors, and there's also the additional headers when splitting
a packet, so the lengths don't necessarily add up.

>> I think reporting it on the first packet
>> would be the safest if we had to choose. 
>
> Agree.
>
>> Also, ideally we would want the GSO/TSO mechanism to lower the size
>> of the superpackets at lower rates (does it?). At higher rates this
>> matters less...
>
> Well TCP does limit (pacing shift) the amount of outstanding data, so if
> it's _really_ slow I guess it will also limit the size of the
> superpackets?

Yeah, I *think* it does... :)

> It's really just an artifact of our software implementation that we
> report the SKBs back as used with partial content. Maybe we shouldn't
> even do that, since they weren't generated by mac80211 in the first
> place, and only report the original skb or something.

Hmm, yeah, was wondering how that works, actually. I assumed you send
the whole thing to the hardware as one superpacket? But if so how do you
get the completion events back? Or are you splitting it in the driver
just before you send it to the hardware?

>> > 3) I'm not quite convinced that all drivers report the TX rate
>> >    completely correctly in the status, some don't even use this path
>> >    but the ieee80211_tx_status_ext() which doesn't update the rate.
>> > 
>> > 4) Probably most importantly, this is completely broken with HE because
>> >    there's currently no way to report HE rates in the TX status at all!
>> >    I just worked around that in our driver for 'iw' reporting purposes
>> >    by implementing the rate reporting in the sta_statistics callback,
>> >    but that data won't be used by the airtime estimates.
>> 
>> Hmm, yeah, both of those are good points. I guess I just kinda assumed
>> that the drivers were already doing the right thing there... :)
>
> I'm not really sure I want to rely on this - this was never really
> needed *functionally*, just from a *statistics* point of view (e.g. "iw
> link" or such).

Right, I see. Well I guess now that we're turning this on one driver at
a time, we can ensure that the driver provides sufficiently accurate
rate information as part of that.

BTW, since we're discussing this in the context of iwlwifi: do you have
any data as to how much benefit AQL would be for that? I.e., do the
Intel devices tend to buffer a lot of data in hardware/firmware?

>> > Now, (1) probably doesn't matter, the estimates don't need to be that
>> > accurate. (2) I'm not sure how to solve; (3) and (4) could both be
>> > solved by having some mechanism of the rate scaling to tell us what the
>> > current rate is whenever it updates, rather than relying on the
>> > last_rate. Really we should do that much more, and even phase out
>> > last_rate entirely, it's a stupid concept.
>> 
>> Yes, that last bit would be good!
>
> We already partially have this, we have a 'get best rate' or so callback
> in the rate scaling, we'd just have to extend it to the driver ops for
> offloaded rate scaling.
>
> Ideally, it'd be a function call from the rate scaling to mac80211 so we
> don't have to call a function every time we need the value, but the rate
> scaling just calls us whenever it updates. This would even work with
> iwlwifi's offloaded algorithm - it notifies the host on all changes.

Yup, this makes sense, and would be easy to integrate with Minstrel as
well, I think. We already have ieee80211_sta_set_expected_throughput(),
so maybe expanding that? It just provides a single number now, but we
could change it to set the full rate info instead?

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11 21:19         ` Johannes Berg
@ 2019-12-12 10:55           ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-12 10:55 UTC (permalink / raw)
  To: Johannes Berg, Jens Axboe, Emmanuel Grumbach, Luca Coelho,
	Jason A. Donenfeld, Steve French
  Cc: linux-wireless, Networking

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-12-11 at 15:55 +0100, Toke Høiland-Jørgensen wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > On Wed, 2019-12-11 at 15:09 +0100, Toke Høiland-Jørgensen wrote:
>> > > If we're doing this on a per-driver basis, let's make it a proper
>> > > NL80211_EXT_FEATURE and expose it to userspace; that way users can at
>> > > least discover if it's supported on their device. I can send a patch
>> > > adding that...
>> > 
>> > Sure. Just didn't get to that yet, but if you want to send a patch
>> > that's very welcome. I have to run out now, will be back in the evening
>> > at most.
>> 
>> Patch here (for those not following linux-wireless):
>> https://patchwork.kernel.org/project/linux-wireless/list/?series=215107
>
> Thanks!
>
> Maybe I should roll that into a single patch so it's actually easier to
> apply as a bugfix while keeping ath10k on AQL for 5.5, otherwise it
> could be argued that the ath10k patch is a feature for -next ...

Yeah, good point. Since it seems I'm sending a v2 anyway, I'll combine
the two for that...

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-12 10:55               ` Toke Høiland-Jørgensen
@ 2019-12-12 11:00                 ` Johannes Berg
  0 siblings, 0 replies; 25+ messages in thread
From: Johannes Berg @ 2019-12-12 11:00 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jens Axboe, Emmanuel Grumbach,
	Luca Coelho
  Cc: linux-wireless, Networking

On Thu, 2019-12-12 at 11:55 +0100, Toke Høiland-Jørgensen wrote:
> 
> > I'm not even sure we *can* do this easily - do we know up-front how many
> > packets this will expand to? We should know, but it might not be so easy
> > given the abstraction layers. We could guess and if it's wrong just set
> > it to 0 on any remaining ones.
> 
> I was thinking about a scheme where we re-defined the value in the cb to
> be a "time per byte" value, that we could just multiply by the packet
> length; that would make it trivial to do partial reporting. Not sure
> it's quite workable in practice, though; it would be hard to avoid
> rounding errors, and there's also the additional headers when splitting
> a packet, so the lengths don't necessarily add up.

Yeah, that won't really work. We could only estimate the # of pieces and
split up the value across them.

> > It's really just an artifact of our software implementation that we
> > report the SKBs back as used with partial content. Maybe we shouldn't
> > even do that, since they weren't generated by mac80211 in the first
> > place, and only report the original skb or something.
> 
> Hmm, yeah, was wondering how that works, actually. I assumed you send
> the whole thing to the hardware as one superpacket? But if so how do you
> get the completion events back? Or are you splitting it in the driver
> just before you send it to the hardware?

If we get say a 64k superpacket, we'll split it first into SKBs of max
A-MSDU size, and then rejigger the pieces inside each A-MSDU thing using
the DMA engine so we get A-MSDUs of MTU-sized packets out at the end.
The hardware is dumb here, it only takes care of TCP checksum.

> > I'm not really sure I want to rely on this - this was never really
> > needed *functionally*, just from a *statistics* point of view (e.g. "iw
> > link" or such).
> 
> Right, I see. Well I guess now that we're turning this on one driver at
> a time, we can ensure that the driver provides sufficiently accurate
> rate information as part of that.

Right.

> BTW, since we're discussing this in the context of iwlwifi: do you have
> any data as to how much benefit AQL would be for that? I.e., do the
> Intel devices tend to buffer a lot of data in hardware/firmware?

Hardware we have queues up to ~240 frames or so, otherwise no real
buffering. Per station/TID.

> > Ideally, it'd be a function call from the rate scaling to mac80211 so we
> > don't have to call a function every time we need the value, but the rate
> > scaling just calls us whenever it updates. This would even work with
> > iwlwifi's offloaded algorithm - it notifies the host on all changes.
> 
> Yup, this makes sense, and would be easy to integrate with Minstrel as
> well, I think. We already have ieee80211_sta_set_expected_throughput(),
> so maybe expanding that? It just provides a single number now, but we
> could change it to set the full rate info instead?

Right, was thinking around that area too. Not sure about the details
really though.

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-11  8:36 ` Johannes Berg
  2019-12-11  8:53   ` Toke Høiland-Jørgensen
@ 2019-12-21  0:55   ` Jens Axboe
  2019-12-21  9:17     ` Johannes Berg
  1 sibling, 1 reply; 25+ messages in thread
From: Jens Axboe @ 2019-12-21  0:55 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Luca Coelho; +Cc: linux-wireless, Networking

On 12/11/19 1:36 AM, Johannes Berg wrote:
> On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
>> Hi,
>>
>> Since the GRO issue got fixed, iwlwifi has worked fine for me.
>> However, on every boot, I get some warnings:
>>
>> ------------[ cut here ]------------
>> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
> 
> Yeah, we've seen a few reports of this.
> 
> I guess I kinda feel responsible for this since I merged the AQL work,
> I'll take a look as soon as I can.

Still the case in -git, as of right now. Just following up on this to
ensure that a patch is merged to fix this up.

I'd be happy to test, if you need something tested.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-21  0:55   ` Jens Axboe
@ 2019-12-21  9:17     ` Johannes Berg
  2019-12-21 13:45       ` Jens Axboe
  0 siblings, 1 reply; 25+ messages in thread
From: Johannes Berg @ 2019-12-21  9:17 UTC (permalink / raw)
  To: Jens Axboe, Emmanuel Grumbach, Luca Coelho; +Cc: linux-wireless, Networking

On Fri, 2019-12-20 at 17:55 -0700, Jens Axboe wrote:
> On 12/11/19 1:36 AM, Johannes Berg wrote:
> > On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
> > > Hi,
> > > 
> > > Since the GRO issue got fixed, iwlwifi has worked fine for me.
> > > However, on every boot, I get some warnings:
> > > 
> > > ------------[ cut here ]------------
> > > STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
> > 
> > Yeah, we've seen a few reports of this.
> > 
> > I guess I kinda feel responsible for this since I merged the AQL work,
> > I'll take a look as soon as I can.
> 
> Still the case in -git, as of right now. Just following up on this to
> ensure that a patch is merged to fix this up.

It's in net/master, so from my POV it's on the way :)

johannes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: iwlwifi warnings in 5.5-rc1
  2019-12-21  9:17     ` Johannes Berg
@ 2019-12-21 13:45       ` Jens Axboe
  0 siblings, 0 replies; 25+ messages in thread
From: Jens Axboe @ 2019-12-21 13:45 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Luca Coelho; +Cc: linux-wireless, Networking

On 12/21/19 2:17 AM, Johannes Berg wrote:
> On Fri, 2019-12-20 at 17:55 -0700, Jens Axboe wrote:
>> On 12/11/19 1:36 AM, Johannes Berg wrote:
>>> On Tue, 2019-12-10 at 13:46 -0700, Jens Axboe wrote:
>>>> Hi,
>>>>
>>>> Since the GRO issue got fixed, iwlwifi has worked fine for me.
>>>> However, on every boot, I get some warnings:
>>>>
>>>> ------------[ cut here ]------------
>>>> STA b4:75:0e:99:1f:e0 AC 2 txq pending airtime underflow: 4294967088, 208
>>>
>>> Yeah, we've seen a few reports of this.
>>>
>>> I guess I kinda feel responsible for this since I merged the AQL work,
>>> I'll take a look as soon as I can.
>>
>> Still the case in -git, as of right now. Just following up on this to
>> ensure that a patch is merged to fix this up.
> 
> It's in net/master, so from my POV it's on the way :)

Great! Just wanted to follow up, as I've seen various draft patches
being tossed around, but nothing posted that looked like a final
submission. Guess I just wasn't CC'ed on those.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-12-21 13:45 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-10 20:46 iwlwifi warnings in 5.5-rc1 Jens Axboe
2019-12-11  8:36 ` Johannes Berg
2019-12-11  8:53   ` Toke Høiland-Jørgensen
2019-12-11 10:11     ` Johannes Berg
2019-12-11 10:23       ` Toke Høiland-Jørgensen
2019-12-11 11:51         ` Johannes Berg
2019-12-11 13:42           ` Johannes Berg
2019-12-11 14:04             ` Toke Høiland-Jørgensen
2019-12-11 14:12               ` Johannes Berg
2019-12-11 14:47                 ` Toke Høiland-Jørgensen
2019-12-11 21:18                   ` Johannes Berg
2019-12-12 10:45                     ` Toke Høiland-Jørgensen
2019-12-11 14:02           ` Toke Høiland-Jørgensen
2019-12-11 21:17             ` Johannes Berg
2019-12-12 10:55               ` Toke Høiland-Jørgensen
2019-12-12 11:00                 ` Johannes Berg
2019-12-21  0:55   ` Jens Axboe
2019-12-21  9:17     ` Johannes Berg
2019-12-21 13:45       ` Jens Axboe
2019-12-11 13:45 ` Johannes Berg
2019-12-11 14:09   ` Toke Høiland-Jørgensen
2019-12-11 14:13     ` Johannes Berg
2019-12-11 14:55       ` Toke Høiland-Jørgensen
2019-12-11 21:19         ` Johannes Berg
2019-12-12 10:55           ` Toke Høiland-Jørgensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.