linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Walle <michael@walle.cc>
To: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Xiaoliang Yang <xiaoliang.yang_1@nxp.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	UNGLinuxDriver@microchip.com, Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vinicius Costa Gomes <vinicius.gomes@intel.com>,
	Maxim Kochetkov <fido_max@inbox.ru>,
	Colin Foster <colin.foster@in-advantage.com>,
	Richie Pearn <richard.pearn@nxp.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 net 1/3] net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet
Date: Tue, 06 Sep 2022 00:53:20 +0200	[thread overview]
Message-ID: <d50be0e224c70453e1a4a7d690cfdf1b@walle.cc> (raw)
In-Reply-To: <20220905170125.1269498-2-vladimir.oltean@nxp.com>

Am 2022-09-05 19:01, schrieb Vladimir Oltean:
> The blamed commit broke tc-taprio schedules such as this one:
> 
> tc qdisc replace dev $swp1 root taprio \
>         num_tc 8 \
>         map 0 1 2 3 4 5 6 7 \
>         queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
>         base-time 0 \
>         sched-entry S 0x7f 990000 \
>         sched-entry S 0x80  10000 \
>         flags 0x2
> 
> because the gate entry for TC 7 (S 0x80 10000 ns) now has a static 
> guard
> band added earlier than its 'gate close' event, such that packet
> overruns won't occur in the worst case of the largest packet possible.
> 
> Since guard bands are statically determined based on the per-tc
> QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU,
> we need to discuss what happens with TC 7 depending on kernel version,
> since the driver, prior to commit 55a515b1f5a9 ("net: dsa: felix: drop
> oversized frames with tc-taprio instead of hanging the port"), did not
> touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU.
> 
> 1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults 
> to
>   1518, and at gigabit this introduces a static guard band (independent
>   of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit
>   time of 20 octets => 160 ns). But this is larger than the time window
>   itself, of 10000 ns. So, the queue system never considers a frame 
> with
>   TC 7 as eligible for transmission, since the gate practically never
>   opens, and these frames are forever stuck in the TX queues and hang
>   the port.
> 
> 2 (after vsc9959_tas_guard_bands_update): Under the sole goal of
>   enabling oversized frame dropping, we make an effort to set
>   QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays
>   one more role, which we did not take into account: per-tc static 
> guard
>   band, expressed in L2 byte time (auto-adjusted for FCS and L1 
> overhead).
>   There is a discrepancy between what the driver thinks (that there is
>   no guard band, and 100% of min_gate_len[tc] is available for egress
>   scheduling) and what the hardware actually does (crops the equivalent
>   of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). In practice, this
>   means that the hardware thinks it has exactly 0 ns for scheduling tc 
> 7.
> 
> In both cases, even minimum sized Ethernet frames are stuck on egress
> rather than being considered for scheduling on TC 7, even if they would
> fit given a proper configuration. Considering the current situation,
> with vsc9959_tas_guard_bands_update(), frames between 60 octets and 
> 1230
> octets in size are not eligible for oversized dropping (because they 
> are
> smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible
> for scheduling either, because the min_gate_len[7] (10000 ns) minus the
> guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per
> octet == 9840 ns) minus the guard band auto-added for L1 overhead by
> QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 octets)
> leaves 0 ns for scheduling in the queue system proper.
> 
> Investigating the hardware behavior, it becomes apparent that the queue
> system needs precisely 33 ns of 'gate open' time in order to consider a
> frame as eligible for scheduling to a tc. So the solution to this
> problem is to amend vsc9959_tas_guard_bands_update(), by giving the
> per-tc guard bands less space by exactly 33 ns, just enough for one
> frame to be scheduled in that interval. This allows the queue system to
> make forward progress for that port-tc, and prevents it from hanging.
> 
> Fixes: 297c4de6f780 ("net: dsa: felix: re-enable TAS guard band mode")
> Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

I haven't looked at the overall code, but the solution described
above sounds good.

FWIW, I don't think such a schedule, where exactly one frame
can be sent, is very likely in the wild though. Imagine a piece
of software is generating one frame per cycle. It might happen
that during one (hardware) cycle there is no frame ready (because
it is software and it jitters), but then in the next cycle, there
are now two frames ready. In that case you'll always lag one frame
behind and you'll never recover from it.

Either I'd make sure I can send at two frames in one cycle, or
my software would only send a frame every other cycle.

Thanks for taking care of this!

-michael

  reply	other threads:[~2022-09-05 22:53 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-05 17:01 [PATCH v2 net 0/3] Fixes for Felix DSA driver calculation of tc-taprio guard bands Vladimir Oltean
2022-09-05 17:01 ` [PATCH v2 net 1/3] net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet Vladimir Oltean
2022-09-05 22:53   ` Michael Walle [this message]
2022-09-06  0:11     ` Vladimir Oltean
2022-09-06  7:17       ` Michael Walle
2022-09-05 17:01 ` [PATCH v2 net 2/3] net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio Vladimir Oltean
2022-09-05 17:01 ` [PATCH v2 net 3/3] net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set Vladimir Oltean
2022-09-07 13:00 ` [PATCH v2 net 0/3] Fixes for Felix DSA driver calculation of tc-taprio guard bands patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d50be0e224c70453e1a4a7d690cfdf1b@walle.cc \
    --to=michael@walle.cc \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=alexandre.belloni@bootlin.com \
    --cc=andrew@lunn.ch \
    --cc=claudiu.manoil@nxp.com \
    --cc=colin.foster@in-advantage.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=f.fainelli@gmail.com \
    --cc=fido_max@inbox.ru \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=richard.pearn@nxp.com \
    --cc=vinicius.gomes@intel.com \
    --cc=vivien.didelot@gmail.com \
    --cc=vladimir.oltean@nxp.com \
    --cc=xiaoliang.yang_1@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).