All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Jozsef Kadlecsik <kadlec@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	"Ahmed S. Darwish" <a.darwish@linutronix.de>,
	Eric Dumazet <edumazet@google.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH net-next 9/9] net: sched: Remove Qdisc::running sequence counter
Date: Mon, 18 Oct 2021 10:23:19 -0700	[thread overview]
Message-ID: <1cdc197a-f9c8-34e4-b19c-132dbbbcafb5@gmail.com> (raw)
In-Reply-To: <20211016084910.4029084-10-bigeasy@linutronix.de>



On 10/16/21 1:49 AM, Sebastian Andrzej Siewior wrote:
> From: "Ahmed S. Darwish" <a.darwish@linutronix.de>
> 
> The Qdisc::running sequence counter has two uses:
> 
>   1. Reliably reading qdisc's tc statistics while the qdisc is running
>      (a seqcount read/retry loop at gnet_stats_add_basic()).
> 
>   2. As a flag, indicating whether the qdisc in question is running
>      (without any retry loops).
> 
> For the first usage, the Qdisc::running sequence counter write section,
> qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what
> is actually needed: the raw qdisc's bstats update. A u64_stats sync
> point was thus introduced (in previous commits) inside the bstats
> structure itself. A local u64_stats write section is then started and
> stopped for the bstats updates.
> 
> Use that u64_stats sync point mechanism for the bstats read/retry loop
> at gnet_stats_add_basic().
> 
> For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag,
> accessed with atomic bitops, is sufficient. Using a bit flag instead of
> a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads
> to the SMP barriers implicitly added through raw_read_seqcount() and
> write_seqcount_begin/end() getting removed. All call sites have been
> surveyed though, and no required ordering was identified.
> 
> Now that the qdisc->running sequence counter is no longer used, remove
> it.
> 
> Note, using u64_stats implies no sequence counter protection for 64-bit
> architectures. This can lead to the qdisc tc statistics "packets" vs.
> "bytes" values getting out of sync on rare occasions. The individual
> values will still be valid.
> 
> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>


I see this has been merged this week end before we could test this thing during work days :/

Just add a rate estimator on a qdisc:

tc qd add dev lo root est 1sec 4sec pfifo

then :

[  140.824352] ------------[ cut here ]------------
[  140.824361] WARNING: CPU: 15 PID: 0 at net/core/gen_stats.c:157 gnet_stats_add_basic+0x97/0xc0
[  140.824378] Modules linked in: ipvlan bonding vfat fat w1_therm i2c_mux_pca954x i2c_mux ds2482 wire cdc_acm ehci_pci ehci_hcd bnx2x mdio xt_TCPMSS ip6table_mangle ip6_tables ipv6
[  140.824413] CPU: 15 PID: 0 Comm: swapper/15 Not tainted 5.15.0-smp-DEV #73
[  140.824415] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 2.48.0 10/02/2019
[  140.824417] RIP: 0010:gnet_stats_add_basic+0x97/0xc0
[  140.824420] Code: 2c 38 4a 03 5c 38 08 48 c7 c6 68 15 51 a4 e8 60 00 c7 ff 44 39 e0 72 db 89 d8 eb 05 31 c0 45 31 ed 4d 01 2e 49 01 46 08 eb 17 <0f> 0b 4d 85 ff 75 96 48 8b 02 48 8b 4a 08 49 01 06 89 c8 49 01 46
[  140.824432] RSP: 0018:ffff99415fbc5e08 EFLAGS: 00010206
[  140.824434] RAX: 0000000080000100 RBX: ffff9939812f41d0 RCX: 0000000000000001
[  140.824436] RDX: ffff99399705e0b0 RSI: 0000000000000000 RDI: ffff99415fbc5e40
[  140.824438] RBP: ffff99415fbc5e30 R08: 0000000000000000 R09: 0000000000000000
[  140.824440] R10: 0000000000000000 R11: ffffffffffffffff R12: ffff99415fbd7740
[  140.824441] R13: dead000000000122 R14: ffff99415fbc5e40 R15: 0000000000000000
[  140.824443] FS:  0000000000000000(0000) GS:ffff99415fbc0000(0000) knlGS:0000000000000000
[  140.824445] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  140.824447] CR2: 000000000087fff0 CR3: 0000000f11610006 CR4: 00000000000606e0
[  140.824449] Call Trace:
[  140.824450]  <IRQ>
[  140.824453]  ? local_bh_enable+0x20/0x20
[  140.824457]  est_timer+0x5e/0x130
[  140.824460]  call_timer_fn+0x2c/0x110
[  140.824464]  expire_timers+0x4c/0xf0
[  140.824467]  __run_timers+0x16f/0x1b0
[  140.824470]  run_timer_softirq+0x1d/0x40
[  140.824473]  __do_softirq+0x142/0x2a1
[  140.824477]  irq_exit_rcu+0x6b/0xb0
[  140.824480]  sysvec_apic_timer_interrupt+0x79/0x90
[  140.824483]  </IRQ>
[  140.824493]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  140.824497] RIP: 0010:cpuidle_enter_state+0x19b/0x300
[  140.824502] Code: ff 45 84 e4 74 20 48 c7 45 c8 00 00 00 00 9c 8f 45 c8 f7 45 c8 00 02 00 00 0f 85 e4 00 00 00 31 ff e8 c9 0d 88 ff fb 8b 45 bc <85> c0 78 52 48 89 de 89 c3 48 6b d3 68 48 8b 4c 16 48 4c 2b 6d b0
[  140.824503] RSP: 0018:ffff99398089be60 EFLAGS: 00000246
[  140.824505] RAX: 0000000000000004 RBX: ffffffffa446cb28 RCX: 000000000000001f
[  140.824506] RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000000
[  140.824507] RBP: ffff99398089beb0 R08: 0000000000000002 R09: 00000020cf9326e4
[  140.824508] R10: 0000000000638824 R11: 0000000000000000 R12: 0000000000000000
[  140.824509] R13: 00000020c9c8d180 R14: ffffc4733fbe1c50 R15: 0000000000000004
[  140.824511]  cpuidle_enter+0x2e/0x40
[  140.824514]  do_idle+0x19f/0x240
[  140.824517]  cpu_startup_entry+0x25/0x30
[  140.824519]  start_secondary+0x7c/0x80
[  140.824521]  secondary_startup_64_no_verify+0xc3/0xcb
[  140.824525] ---[ end trace d64fa4b3dc94b292 ]---



  reply	other threads:[~2021-10-18 17:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-16  8:49 [PATCH net-next 0/9] Try to simplify the gnet_stats and remove qdisc->running sequence counter Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 1/9] gen_stats: Add instead Set the value in __gnet_stats_copy_basic() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 2/9] gen_stats: Add gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 3/9] mq, mqprio: Use gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 4/9] gen_stats: Move remaining users to gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 5/9] u64_stats: Introduce u64_stats_set() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 6/9] net: sched: Protect Qdisc::bstats with u64_stats Sebastian Andrzej Siewior
2021-10-26 23:47   ` kernel test robot
2021-10-16  8:49 ` [PATCH net-next 7/9] net: sched: Use _bstats_update/set() instead of raw writes Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 8/9] net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 9/9] net: sched: Remove Qdisc::running sequence counter Sebastian Andrzej Siewior
2021-10-18 17:23   ` Eric Dumazet [this message]
2021-10-18 18:30     ` Eric Dumazet
2021-10-18 19:24       ` Eric Dumazet
2021-10-18 23:53       ` Eric Dumazet
2021-10-19 10:12     ` [PATCH net-next] net: sched: Allow statistics reads from softirq Sebastian Andrzej Siewior
2021-10-19 12:10       ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1cdc197a-f9c8-34e4-b19c-132dbbbcafb5@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=a.darwish@linutronix.de \
    --cc=bigeasy@linutronix.de \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=kadlec@netfilter.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=tglx@linutronix.de \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.