From: Ido Schimmel <idosch@idosch.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Song Liu" <songliubraving@fb.com>,
	"Sergey Ryazanov" <ryazanov.s.a@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Vladimir Oltean" <vladimir.oltean@nxp.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Russell King" <linux@armlinux.org.uk>,
	"Andrei Vagin" <avagin@gmail.com>,
	"Tony Nguyen" <anthony.l.nguyen@intel.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Ioana Ciornei" <ioana.ciornei@nxp.com>,
	"Arthur Kiyanovski" <akiyano@amazon.com>,
	"Leon Romanovsky" <leon@kernel.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	linux-rdma@vger.kernel.org, linux-doc@vger.kernel.org,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Noam Dagan" <ndagan@amazon.com>,
	nikolay@nvidia.com, "Cong Wang" <cong.wang@bytedance.com>,
	"Martin Habets" <habetsm.xilinx@gmail.com>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Johannes Berg" <johannes.berg@intel.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Claudiu Manoil" <claudiu.manoil@nxp.com>,
	"Alexander Lobakin" <alexandr.lobakin@intel.com>,
	"Yonghong Song" <yhs@fb.com>,
	"Shay Agroskin" <shayagr@amazon.com>,
	"Marcin Wojtas" <mw@semihalf.com>,
	petrm@nvidia.com, "Daniel Borkmann" <daniel@iogearbox.net>,
	"David Arinzon" <darinzon@amazon.com>,
	"David Ahern" <dsahern@kernel.org>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, "Martin KaFai Lau" <kafai@fb.com>,
	"Edward Cree" <ecree.xilinx@gmail.com>,
	"Yajun Deng" <yajun.deng@linux.dev>,
	netdev@vger.kernel.org, "Saeed Bishara" <saeedb@amazon.com>,
	"Michal Swiatkowski" <michal.swiatkowski@linux.intel.com>,
	bpf@vger.kernel.org, "Saeed Mahameed" <saeedm@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v2 net-next 21/26] ice: add XDP and XSK generic per-channel statistics
Date: Sun, 28 Nov 2021 19:54:53 +0200
Message-ID: <YaPCbaMVaVlxXcHC@shredder>
In-Reply-To: <20211126111431.4a2ed007@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

+Petr, Nik

On Fri, Nov 26, 2021 at 11:14:31AM -0800, Jakub Kicinski wrote:
> On Fri, 26 Nov 2021 19:47:17 +0100 Toke Høiland-Jørgensen wrote:
> > > Fair. In all honesty I said that hoping to push for a more flexible
> > > approach hidden entirely in BPF, and not involving driver changes.
> > > Assuming the XDP program has more fine grained stats we should be able
> > > to extract those instead of double-counting. Hence my vague "let's work
> > > with apps" comment.
> > >
> > > For example to a person familiar with the workload it'd be useful to
> > > know if program returned XDP_DROP because of configured policy or
> > > failure to parse a packet. I don't think that sort of distinction is
> > > achievable at the level of standard stats.
> > >
> > > The information required by the admin is higher level. As you say the
> > > primary concern there is "how many packets did XDP eat".  
> > 
> > Right, sure, I am also totally fine with having only a somewhat
> > restricted subset of stats available at the interface level and make
> > everything else be BPF-based. I'm hoping we can converge on a common
> > understanding of what this "minimal set" should be :)
> > 
> > > Speaking of which, one thing that badly needs clarification is our
> > > expectation around XDP packets getting counted towards the interface
> > > stats.  
> > 
> > Agreed. My immediate thought is that "XDP packets are interface packets"
> > but that is certainly not what we do today, so not sure if changing it
> > at this point would break things?
> 
> I'd vote for taking the risk and trying to align all the drivers.

I agree. I think IFLA_STATS64 in RTM_NEWLINK should contain statistics
of all the packets seen by the netdev. The breakdown into software /
hardware / XDP should be reported via RTM_NEWSTATS.

Currently, for soft devices such as VLANs, bridges and GRE, user space
only sees statistics of packets forwarded by software, which is quite
useless when forwarding is offloaded from the kernel to hardware.

Petr is working on exposing hardware statistics for such devices via
rtnetlink. Unlike XDP (?), we need to be able to let user space enable /
disable hardware statistics as we have a limited number of hardware
counters and they can also reduce the bandwidth when enabled. We are
thinking of adding a new RTM_SETSTATS for that:

# ip stats set dev swp1 hw_stats on

For query, something like (under discussion):

# ip stats show dev swp1 // all groups
# ip stats show dev swp1 group link
# ip stats show dev swp1 group offload // all sub-groups
# ip stats show dev swp1 group offload sub-group cpu
# ip stats show dev swp1 group offload sub-group hw

Like other iproute2 commands, these follow the nesting of the
RTM_{NEW,GET}STATS uAPI.
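
To illustrate the nesting mentioned above, here is a minimal sketch of how a
query for selected groups could be encoded as an RTM_GETSTATS request. The
numeric constants and the struct if_stats_msg layout come from the existing
uAPI headers; the mapping from the proposed CLI group names ("link",
"offload") to top-level attributes is an assumption made for illustration,
and the helper names are hypothetical. The sketch only builds the message
bytes and does not send them.

```python
import struct

# Values from include/uapi/linux/rtnetlink.h, netlink.h and if_link.h.
RTM_GETSTATS = 94
NLM_F_REQUEST = 0x01
AF_UNSPEC = 0
IFLA_STATS_LINK_64 = 1
IFLA_STATS_LINK_OFFLOAD_XSTATS = 4

# ASSUMPTION: how the proposed CLI group names map to top-level attributes.
GROUP_TO_ATTR = {
    "link": IFLA_STATS_LINK_64,
    "offload": IFLA_STATS_LINK_OFFLOAD_XSTATS,
}


def stats_filter_bit(attr):
    """Mirror of the kernel's IFLA_STATS_FILTER_BIT() macro."""
    return 1 << (attr - 1)


def build_getstats_request(ifindex, groups, seq=1):
    """Return the raw bytes of an RTM_GETSTATS request (hypothetical helper)."""
    filter_mask = 0
    for group in groups:
        filter_mask |= stats_filter_bit(GROUP_TO_ATTR[group])
    # struct if_stats_msg: family, pad1, pad2, ifindex, filter_mask
    body = struct.pack("=BBHII", AF_UNSPEC, 0, 0, ifindex, filter_mask)
    # struct nlmsghdr: len, type, flags, seq, pid
    header = struct.pack("=IHHII", 16 + len(body), RTM_GETSTATS,
                         NLM_F_REQUEST, seq, 0)
    return header + body
```

Each requested group sets one bit in filter_mask, and the kernel's reply
carries one nested attribute per set bit. Sub-groups would then be nested
attributes inside the corresponding top-level one.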

Looking at patch #1 [1], I think that whatever you decide to expose for
XDP can be queried via:

# ip stats show dev swp1 group xdp
# ip stats show dev swp1 group xdp sub-group regular
# ip stats show dev swp1 group xdp sub-group xsk

Regardless, the following command should show statistics of all the
packets seen by the netdev:

# ip -s link show dev swp1

There is a PR [2] for node_exporter to use rtnetlink to fetch netdev
statistics instead of the old proc interface. It should be possible to
extend it to use RTM_*STATS for more fine-grained statistics.
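
For contrast, a minimal sketch of what reading the old proc interface looks
like, assuming it refers to /proc/net/dev. The sample text and its counter
values are made up for illustration, and parse_proc_net_dev() is a
hypothetical helper, not code from node_exporter. The point is that this
format only offers one flat set of totals per netdev, with no room for a
software / hardware / XDP breakdown.

```python
# Illustrative sample in the /proc/net/dev layout; the numbers are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  swp1:    5000      20    0    1    0     0          0         0     7000      30    0    0    0     0       0          0
"""


def parse_proc_net_dev(text):
    """Parse /proc/net/dev style text into {ifname: {counter: value}}."""
    stats = {}
    # The first two lines are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        fields = [int(f) for f in rest.split()]
        # Eight receive columns are followed by eight transmit columns.
        stats[name.strip()] = {
            "rx_bytes": fields[0],
            "rx_packets": fields[1],
            "rx_dropped": fields[3],
            "tx_bytes": fields[8],
            "tx_packets": fields[9],
        }
    return stats
```

On a Linux system the same function could be fed the contents of
/proc/net/dev directly.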

[1] https://lore.kernel.org/netdev/20211123163955.154512-2-alexandr.lobakin@intel.com/
[2] https://github.com/prometheus/node_exporter/pull/2074