netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: "Daniel Borkmann" <daniel@iogearbox.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: brouer@redhat.com, Alexander Lobakin <alexandr.lobakin@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Michal Swiatkowski <michal.swiatkowski@linux.intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Shay Agroskin <shayagr@amazon.com>,
	Arthur Kiyanovski <akiyano@amazon.com>,
	David Arinzon <darinzon@amazon.com>,
	Noam Dagan <ndagan@amazon.com>, Saeed Bishara <saeedb@amazon.com>,
	Ioana Ciornei <ioana.ciornei@nxp.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
	Marcin Wojtas <mw@semihalf.com>,
	Russell King <linux@armlinux.org.uk>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Edward Cree <ecree.xilinx@gmail.com>,
	Martin Habets <habetsm.xilinx@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Yajun Deng <yajun.deng@linux.dev>,
	Sergey Ryazanov <ryazanov.s.a@gmail.com>,
	David Ahern <dsahern@kernel.org>, Andrei Vagin <avagin@gmail.com>,
	Johannes Berg <johannes.berg@intel.com>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	Cong Wang <cong.wang@bytedance.com>,
	netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	bpf@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH v2 net-next 21/26] ice: add XDP and XSK generic per-channel statistics
Date: Mon, 29 Nov 2021 14:59:53 +0100	[thread overview]
Message-ID: <37732c0b-a1f5-5e1d-d34e-16ef07fab597@redhat.com> (raw)
In-Reply-To: <3c2fd51e-96c4-d500-bb4c-1972bb0fa3d6@iogearbox.net>



On 27/11/2021 00.01, Daniel Borkmann wrote:
> On 11/26/21 11:27 PM, Daniel Borkmann wrote:
>> On 11/26/21 7:06 PM, Jakub Kicinski wrote:
> [...]
>>> The information required by the admin is higher level. As you say the
>>> primary concern there is "how many packets did XDP eat".
>>
>> Agree. Above said, for XDP_DROP I would see one use case where you 
>> compare
>> different drivers or bond vs no bond as we did in the past in [0] when
>> testing against a packet generator (although I don't see bond driver 
>> covered
>> in this series here yet where it aggregates the XDP stats from all 
>> bond slave devs).
>>
>> On a higher-level wrt "how many packets did XDP eat", it would make sense
>> to have the stats for successful XDP_{TX,REDIRECT} given these are out
>> of reach from a BPF prog PoV - we can only count there how many times we
>> returned with XDP_TX but not whether the pkt /successfully made it/.
>>

Exactly.

>> In terms of error cases, could we just standardize all drivers on the 
>> behavior
>> of e.g. mlx5e_xdp_handle(), meaning, a failure from XDP_{TX,REDIRECT} will
>> hit the trace_xdp_exception() and then fallthrough to bump a drop counter
>> (same as we bump in XDP_DROP then). So the drop counter will account for
>> program drops but also driver-related drops.
>>

Hmm... I don't agree here.  IMHO the BPF-program's *choice* to drop (via 
XDP_DROP) should NOT share the counter with the driver-related drops.

The driver-related drops must be accounted separate.

For the record, I think mlx5e_xdp_handle() does the wrong thing, of 
accounting everything as XDP_DROP in (rq->stats->xdp_drop++).

Current mlx5 driver stats are highly problematic actually.
Please don't model stats behavior after this driver.

E.g. if BPF-prog takes the *choice* XDP_TX or XDP_REDIRECT or XDP_DROP, 
then the packet is invisible to "ifconfig" stats.  It is like the driver 
never received these packets (which is wrong IMHO). (The stats are only 
avail via ethtool -S).


>> At some later point the trace_xdp_exception() could be extended with 
>> an error
>> code that the driver would propagate (given some of them look quite 
>> similar
>> across drivers, fwiw), and then whoever wants to do further processing 
>> with them can do so via bpftrace or other tooling.

I do like trace_xdp_exception() is invoked in mlx5e_xdp_handle(), but do 
notice that xdp_do_redirect() also have a tracepoint that can be used 
for troubleshooting. (I usually use xdp_monitor for troubleshooting 
which catch both).

I like the stats XDP handling better in mvneta_run_xdp().

> Just thinking out loud, one straight forward example we could start out 
> with that is also related to Paolo's series [1] ...
> 
> enum xdp_error {
>      XDP_UNKNOWN,
>      XDP_ACTION_INVALID,
>      XDP_ACTION_UNSUPPORTED,
> };
> 
> ... and then bpf_warn_invalid_xdp_action() returns one of the latter two
> which we pass to trace_xdp_exception(). Later there could be XDP_DRIVER_*
> cases e.g. propagated from XDP_TX error exceptions.
> 
>          [...]
>          default:
>                  err = bpf_warn_invalid_xdp_action(act);
>                  fallthrough;
>          case XDP_ABORTED:
> xdp_abort:
>                  trace_xdp_exception(rq->netdev, prog, act, err);
>                  fallthrough;
>          case XDP_DROP:
>                  lrstats->xdp_drop++;
>                  break;
>          }
>          [...]
> 
>    [1] 
> https://lore.kernel.org/netdev/cover.1637924200.git.pabeni@redhat.com/
> 
>> So overall wrt this series: from the lrstats we'd be /dropping/ the pass,
>> tx_errors, redirect_errors, invalid, aborted counters. And we'd be 
>> /keeping/
>> bytes & packets counters that XDP sees, (driver-)successful tx & redirect
>> counters as well as drop counter. Also, XDP bytes & packets counters 
>> should
>> not be counted twice wrt ethtool stats.
>>
>>    [0] 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9e2ee5c7e7c35d195e2aa0692a7241d47a433d1e 
>>

Concrete example with mlx5:

For most other hardware (than mlx5) I experience that XDP_TX creates a 
push-back on NIC RX-handing speed.  Thus, the XDP_TX stats recorded by 
BPF-prog is usually correct.

With mlx5 hardware (tested on ConnectX-5 Ex MT28800) the RX 
packets-per-sec (pps) stats can easily be faster than actually XDP_TX 
transmitted frames.

$ sudo ./xdp_rxq_info --dev mlx5p1 --action XDP_TX
  [...]
  Running XDP on dev:mlx5p1 (ifindex:10) action:XDP_TX options:swapmac
  XDP stats       CPU     pps         issue-pps
  XDP-RX CPU      1       13,922,430  0
  XDP-RX CPU      total   13,922,430

  RXQ stats       RXQ:CPU pps         issue-pps
  rx_queue_index    1:1   13,922,430  0
  rx_queue_index    1:sum 13,922,430

The real xmit speed is (from below ethtool_stats.pl) is
  9,391,314 pps <= rx1_xdp_tx_xmit /sec

The dropped packets are double accounted as:
  4,552,033 <= rx_xdp_drop /sec
  4,552,033 <= rx_xdp_tx_full /sec


Show adapter(s) (mlx5p1) statistics (ONLY that changed!)
Ethtool(mlx5p1  ) stat:       217865 (        217,865) <= ch1_poll /sec
Ethtool(mlx5p1  ) stat:       217864 (        217,864) <= ch_poll /sec
Ethtool(mlx5p1  ) stat:     13943371 (     13,943,371) <= 
rx1_cache_reuse /sec
Ethtool(mlx5p1  ) stat:      4552033 (      4,552,033) <= rx1_xdp_drop /sec
Ethtool(mlx5p1  ) stat:       146740 (        146,740) <= 
rx1_xdp_tx_cqes /sec
Ethtool(mlx5p1  ) stat:      4552033 (      4,552,033) <= 
rx1_xdp_tx_full /sec
Ethtool(mlx5p1  ) stat:      9391314 (      9,391,314) <= 
rx1_xdp_tx_inlnw /sec
Ethtool(mlx5p1  ) stat:       880436 (        880,436) <= 
rx1_xdp_tx_mpwqe /sec
Ethtool(mlx5p1  ) stat:       997833 (        997,833) <= 
rx1_xdp_tx_nops /sec
Ethtool(mlx5p1  ) stat:      9391314 (      9,391,314) <= 
rx1_xdp_tx_xmit /sec
Ethtool(mlx5p1  ) stat:     45095173 (     45,095,173) <= 
rx_64_bytes_phy /sec
Ethtool(mlx5p1  ) stat:   2886090490 (  2,886,090,490) <= rx_bytes_phy /sec
Ethtool(mlx5p1  ) stat:     13943293 (     13,943,293) <= rx_cache_reuse 
/sec
Ethtool(mlx5p1  ) stat:     31151957 (     31,151,957) <= 
rx_out_of_buffer /sec
Ethtool(mlx5p1  ) stat:     45095158 (     45,095,158) <= rx_packets_phy 
/sec
Ethtool(mlx5p1  ) stat:   2886072350 (  2,886,072,350) <= rx_prio0_bytes 
/sec
Ethtool(mlx5p1  ) stat:     45094878 (     45,094,878) <= 
rx_prio0_packets /sec
Ethtool(mlx5p1  ) stat:   2705707938 (  2,705,707,938) <= 
rx_vport_unicast_bytes /sec
Ethtool(mlx5p1  ) stat:     45095129 (     45,095,129) <= 
rx_vport_unicast_packets /sec
Ethtool(mlx5p1  ) stat:      4552033 (      4,552,033) <= rx_xdp_drop /sec
Ethtool(mlx5p1  ) stat:       146739 (        146,739) <= rx_xdp_tx_cqe /sec
Ethtool(mlx5p1  ) stat:      4552033 (      4,552,033) <= rx_xdp_tx_full 
/sec
Ethtool(mlx5p1  ) stat:      9391319 (      9,391,319) <= 
rx_xdp_tx_inlnw /sec
Ethtool(mlx5p1  ) stat:       880436 (        880,436) <= 
rx_xdp_tx_mpwqe /sec
Ethtool(mlx5p1  ) stat:       997831 (        997,831) <= rx_xdp_tx_nops 
/sec
Ethtool(mlx5p1  ) stat:      9391319 (      9,391,319) <= rx_xdp_tx_xmit 
/sec
Ethtool(mlx5p1  ) stat:    601044221 (    601,044,221) <= tx_bytes_phy /sec
Ethtool(mlx5p1  ) stat:      9391316 (      9,391,316) <= tx_packets_phy 
/sec
Ethtool(mlx5p1  ) stat:    601040871 (    601,040,871) <= tx_prio0_bytes 
/sec
Ethtool(mlx5p1  ) stat:      9391264 (      9,391,264) <= 
tx_prio0_packets /sec
Ethtool(mlx5p1  ) stat:    563478483 (    563,478,483) <= 
tx_vport_unicast_bytes /sec
Ethtool(mlx5p1  ) stat:      9391316 (      9,391,316) <= 
tx_vport_unicast_packets /sec



[1] 
https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl


The net_devices stats says the NIC is processing zero packets:

  $ sar -n DEV 2 1000
  [...]
  Average:        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s 
rxcmp/s   txcmp/s  rxmcst/s   %ifutil
  [...]
  Average:       mlx5p1      0,00      0,00      0,00      0,00 
0,00      0,00      0,00      0,00
  Average:       mlx5p2      0,00      0,00      0,00      0,00 
0,00      0,00      0,00      0,00


  reply	other threads:[~2021-11-29 14:03 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-23 16:39 [PATCH v2 net-next 00/26] net: introduce and use generic XDP stats Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 01/26] rtnetlink: introduce generic XDP statistics Alexander Lobakin
2021-11-30  2:36   ` David Ahern
2021-11-23 16:39 ` [PATCH v2 net-next 02/26] xdp: provide common driver helpers for implementing XDP stats Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 03/26] ena: implement generic XDP statistics callbacks Alexander Lobakin
2021-11-29 13:34   ` Shay Agroskin
2021-11-30 19:14     ` Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 04/26] dpaa2: implement generic XDP stats callbacks Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 05/26] enetc: " Alexander Lobakin
2021-11-23 17:09   ` Vladimir Oltean
2021-11-24 11:37     ` Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 06/26] mvneta: reformat mvneta_netdev_ops Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 07/26] mvneta: add .ndo_get_xdp_stats() callback Alexander Lobakin
2021-11-24 11:39   ` Russell King (Oracle)
2021-11-25 17:16     ` Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 08/26] mvpp2: provide " Alexander Lobakin
2021-11-24 11:33   ` Russell King (Oracle)
2021-11-24 11:36   ` Russell King (Oracle)
2021-11-23 16:39 ` [PATCH v2 net-next 09/26] mlx5: don't mix XDP_DROP and Rx XDP error cases Alexander Lobakin
2021-11-24 18:15   ` kernel test robot
2021-11-25 16:40     ` Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 10/26] mlx5: provide generic XDP stats callbacks Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 11/26] sf100, sfx: implement " Alexander Lobakin
2021-11-24  9:59   ` Edward Cree
2021-11-23 16:39 ` [PATCH v2 net-next 12/26] veth: don't mix XDP_DROP counter with Rx XDP errors Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 13/26] veth: drop 'xdp_' suffix from packets and bytes stats Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 14/26] veth: reformat veth_netdev_ops Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 15/26] veth: add generic XDP stats callbacks Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 16/26] virtio_net: don't mix XDP_DROP counter with Rx XDP errors Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 17/26] virtio_net: rename xdp_tx{,_drops} SQ stats to xdp_xmit{,_errors} Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 18/26] virtio_net: reformat virtnet_netdev Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 19/26] virtio_net: add callbacks for generic XDP stats Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 20/26] i40e: add XDP and XSK generic per-channel statistics Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 21/26] ice: " Alexander Lobakin
2021-11-24  0:52   ` Daniel Borkmann
2021-11-24 16:34     ` Lorenz Bauer
2021-11-25 11:56     ` Toke Høiland-Jørgensen
2021-11-25 17:07       ` Alexander Lobakin
2021-11-25 17:44         ` Jakub Kicinski
2021-11-25 20:40           ` Alexander Lobakin
2021-11-26 12:30             ` Toke Høiland-Jørgensen
2021-11-26 18:06               ` Jakub Kicinski
2021-11-26 18:47                 ` Toke Høiland-Jørgensen
2021-11-26 19:14                   ` Jakub Kicinski
2021-11-28 17:54                     ` Ido Schimmel
2021-11-29 14:47                       ` Jakub Kicinski
2021-11-29 15:51                         ` Petr Machata
2021-11-29 15:54                           ` Petr Machata
2021-11-29 16:05                           ` Jakub Kicinski
2021-11-29 17:08                             ` Petr Machata
2021-11-29 17:17                               ` Jakub Kicinski
2021-11-30 11:55                                 ` Petr Machata
2021-11-30 15:07                                   ` Jakub Kicinski
2021-11-26 22:27                 ` Daniel Borkmann
2021-11-26 23:01                   ` Daniel Borkmann
2021-11-29 13:59                     ` Jesper Dangaard Brouer [this message]
2021-11-29 15:03                       ` Jakub Kicinski
2021-11-29 11:51                   ` Toke Høiland-Jørgensen
2021-11-23 16:39 ` [PATCH v2 net-next 22/26] igb: add XDP " Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 23/26] igc: bail out early on XSK xmit if no descs are available Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 24/26] igc: add XDP and XSK generic per-channel statistics Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 25/26] ixgbe: " Alexander Lobakin
2021-11-23 16:39 ` [PATCH v2 net-next 26/26] Documentation: reflect generic XDP statistics Alexander Lobakin
2021-11-28 22:23 ` [PATCH v2 net-next 00/26] net: introduce and use generic XDP stats David Ahern
2021-11-30 15:56 ` Alexander Lobakin
2021-11-30 16:12   ` Jakub Kicinski
2021-11-30 16:34     ` Alexander Lobakin
2021-11-30 17:04       ` Jakub Kicinski
2021-11-30 17:38         ` David Ahern
2021-11-30 19:46           ` Jakub Kicinski
2021-12-01 15:21           ` Jamal Hadi Salim
2021-11-30 16:17   ` Toke Høiland-Jørgensen
2021-11-30 17:07     ` Jakub Kicinski
2021-11-30 17:56       ` David Ahern
2021-11-30 19:53         ` Jakub Kicinski
2021-11-30 17:45   ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37732c0b-a1f5-5e1d-d34e-16ef07fab597@redhat.com \
    --to=jbrouer@redhat.com \
    --cc=akiyano@amazon.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=andrii@kernel.org \
    --cc=anthony.l.nguyen@intel.com \
    --cc=ast@kernel.org \
    --cc=avagin@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=claudiu.manoil@nxp.com \
    --cc=cong.wang@bytedance.com \
    --cc=corbet@lwn.net \
    --cc=daniel@iogearbox.net \
    --cc=darinzon@amazon.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=ecree.xilinx@gmail.com \
    --cc=habetsm.xilinx@gmail.com \
    --cc=hawk@kernel.org \
    --cc=ioana.ciornei@nxp.com \
    --cc=jasowang@redhat.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=johannes.berg@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lorenzo@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=michal.swiatkowski@linux.intel.com \
    --cc=mst@redhat.com \
    --cc=mw@semihalf.com \
    --cc=ndagan@amazon.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=ryazanov.s.a@gmail.com \
    --cc=saeedb@amazon.com \
    --cc=saeedm@nvidia.com \
    --cc=shayagr@amazon.com \
    --cc=songliubraving@fb.com \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=toke@redhat.com \
    --cc=vladimir.oltean@nxp.com \
    --cc=yajun.deng@linux.dev \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).