linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* traceability of wifi packet drops
@ 2023-03-27 14:19 Johannes Berg
  2023-03-27 14:31 ` Johannes Berg
  2023-03-28  1:09 ` Jakub Kicinski
  0 siblings, 2 replies; 12+ messages in thread
From: Johannes Berg @ 2023-03-27 14:19 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless

Hi,

So I just ran into this problem with a colleague again; we don't have
good visibility into why in the wifi stack a packet is dropped.

In the network stack we have skb drop reasons for (part of?) this, but
we don't really use this in wifi/mac80211 yet.

Unfortunately we have probably >100 distinct drop reasons in the wifi
stack, so annotating those is not only tedious, it would also double the
list of SKB drop reasons from currently ~75.

Any good ideas? I even thought about just encoding the line number
wherever we use RX_DROP_UNUSABLE / RX_DROP_MONITOR, but that's kind of
awkward too. Obviously we could change the internal API to completely
get rid of enum ieee80211_rx_result and use enum skb_drop_reason
instead, but then we'd probably need to carve out some space to also
differentiate DROP_MONITOR and DROP_UNUSABLE, perhaps something like


	SKB_DROP_REASON_MAC80211_MASK		0x03ff0000
	SKB_DROP_REASON_MAC80211_TYPE_MASK	0x03000000
	SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE	0x01000000
	SKB_DROP_REASON_MAC80211_TYPE_MONITOR	0x02000000

	SKB_DROP_REASON_MAC80211_DUP		(SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE | 1)
	SKB_DROP_REASON_MAC80211_BAD_BIP_KEYIDX	(SKB_DROP_REASON_MAC80211_TYPE_MONITOR | 1)


etc.


That'd be a LOT of annotations (and thus work) though, and a lot of new
IDs/names, for something that's not really used all that much, i.e. a
file number / line number within mac80211 would be completely
sufficient, so the alternative could be to just have a separate
tracepoint inside mac80211 with a line number or so?

Anyone have any great ideas?

Thanks,
johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-27 14:19 traceability of wifi packet drops Johannes Berg
@ 2023-03-27 14:31 ` Johannes Berg
  2023-03-28  1:09 ` Jakub Kicinski
  1 sibling, 0 replies; 12+ messages in thread
From: Johannes Berg @ 2023-03-27 14:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless

On Mon, 2023-03-27 at 16:19 +0200, Johannes Berg wrote:
> 
> 	SKB_DROP_REASON_MAC80211_DUP		(SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE | 1)
> 	SKB_DROP_REASON_MAC80211_BAD_BIP_KEYIDX	(SKB_DROP_REASON_MAC80211_TYPE_MONITOR | 1)
> 

Ah, this would lose the ability to immediately see monitor/unusable, so
we'd have to make the names even longer :(

Maybe some creative macro such as

DROP_UNUSABLE(DUP)
DROP_MONITOR(BAD_BIP_KEYIDX)

could be done, but that hurts ctags/elixir and the likes ...

johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-27 14:19 traceability of wifi packet drops Johannes Berg
  2023-03-27 14:31 ` Johannes Berg
@ 2023-03-28  1:09 ` Jakub Kicinski
  2023-03-28  2:27   ` Dave Taht
  2023-03-28  7:37   ` Johannes Berg
  1 sibling, 2 replies; 12+ messages in thread
From: Jakub Kicinski @ 2023-03-28  1:09 UTC (permalink / raw)
  To: Johannes Berg, Eric Dumazet; +Cc: netdev, linux-wireless

On Mon, 27 Mar 2023 16:19:34 +0200 Johannes Berg wrote:
> So I just ran into this problem with a colleague again; we don't have
> good visibility into why in the wifi stack a packet is dropped.
> 
> In the network stack we have skb drop reasons for (part of?) this, but
> we don't really use this in wifi/mac80211 yet.
> 
> Unfortunately we have probably >100 distinct drop reasons in the wifi
> stack, so annotating those is not only tedious, it would also double the
> list of SKB drop reasons from currently ~75.
> 
> Any good ideas? I even thought about just encoding the line number
> wherever we use RX_DROP_UNUSABLE / RX_DROP_MONITOR, but that's kind of
> awkward too. Obviously we could change the internal API to completely
> get rid of enum ieee80211_rx_result and use enum skb_drop_reason
> instead, but then we'd probably need to carve out some space to also
> differentiate DROP_MONITOR and DROP_UNUSABLE, perhaps something like
> 
> 
> 	SKB_DROP_REASON_MAC80211_MASK		0x03ff0000
> 	SKB_DROP_REASON_MAC80211_TYPE_MASK	0x03000000
> 	SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE	0x01000000
> 	SKB_DROP_REASON_MAC80211_TYPE_MONITOR	0x02000000
> 
> 	SKB_DROP_REASON_MAC80211_DUP		(SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE | 1)
> 	SKB_DROP_REASON_MAC80211_BAD_BIP_KEYIDX	(SKB_DROP_REASON_MAC80211_TYPE_MONITOR | 1)
> 
> 
> etc.
> 
> 
> That'd be a LOT of annotations (and thus work) though, and a lot of new
> IDs/names, for something that's not really used all that much, i.e. a
> file number / line number within mac80211 would be completely
> sufficient, so the alternative could be to just have a separate
> tracepoint inside mac80211 with a line number or so?
> 
> Anyone have any great ideas?

We need something that'd scale to more subsystems, so I don't think
having all the definitions in enum skb_drop_reason directly is an
option.

My knee jerk idea would be to either use the top 8 bits of the
skb reason enum to denote the space. And then we'd say 0 is core
1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space 
you can use whatever encoding you like.

On a quick look nothing is indexed by the reason directly, so no
problems with using the high bits.

Option #2 is to add one main drop reason called SKB_DROP_REASON_MAC80211
and have a separate tracepoint which exposes the detailed wifi
reason and any necessary context. mac80211 would then have its own
wrapper around kfree_skb_reason() which triggers the tracepoint.

Those are perhaps fairly obvious and unimaginative. Adding Eric,
since he has been filling in a lot of the drop reasons lately.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28  1:09 ` Jakub Kicinski
@ 2023-03-28  2:27   ` Dave Taht
  2023-03-28  7:37   ` Johannes Berg
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Taht @ 2023-03-28  2:27 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Johannes Berg, Eric Dumazet, netdev, linux-wireless

On Mon, Mar 27, 2023 at 6:32 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 27 Mar 2023 16:19:34 +0200 Johannes Berg wrote:
> > So I just ran into this problem with a colleague again; we don't have
> > good visibility into why in the wifi stack a packet is dropped.
> >
> > In the network stack we have skb drop reasons for (part of?) this, but
> > we don't really use this in wifi/mac80211 yet.
> >
> > Unfortunately we have probably >100 distinct drop reasons in the wifi
> > stack, so annotating those is not only tedious, it would also double the
> > list of SKB drop reasons from currently ~75.
> >
> > Any good ideas? I even thought about just encoding the line number
> > wherever we use RX_DROP_UNUSABLE / RX_DROP_MONITOR, but that's kind of
> > awkward too. Obviously we could change the internal API to completely
> > get rid of enum ieee80211_rx_result and use enum skb_drop_reason
> > instead, but then we'd probably need to carve out some space to also
> > differentiate DROP_MONITOR and DROP_UNUSABLE, perhaps something like
> >
> >
> >       SKB_DROP_REASON_MAC80211_MASK           0x03ff0000
> >       SKB_DROP_REASON_MAC80211_TYPE_MASK      0x03000000
> >       SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE  0x01000000
> >       SKB_DROP_REASON_MAC80211_TYPE_MONITOR   0x02000000
> >
> >       SKB_DROP_REASON_MAC80211_DUP            (SKB_DROP_REASON_MAC80211_TYPE_UNUSABLE | 1)
> >       SKB_DROP_REASON_MAC80211_BAD_BIP_KEYIDX (SKB_DROP_REASON_MAC80211_TYPE_MONITOR | 1)
> >
> >
> > etc.
> >
> >
> > That'd be a LOT of annotations (and thus work) though, and a lot of new
> > IDs/names, for something that's not really used all that much, i.e. a
> > file number / line number within mac80211 would be completely
> > sufficient, so the alternative could be to just have a separate
> > tracepoint inside mac80211 with a line number or so?
> >
> > Anyone have any great ideas?
>
> We need something that'd scale to more subsystems, so I don't think
> having all the definitions in enum skb_drop_reason directly is an
> option.

I am not going to comment on this idea directly, but I did want to
note that long on my round-to-it list was adding something like this
to the qdiscs generally:

QDISC_DROP_REASON_CONGESTIVE
QDISC_DROP_REASON_OVERFLOW
QDISC_DROP_REASON_GSO_SPLIT

since the wifi stack also has both of the former and for all I know,
the latter, whatever you come up with I applaud, with a better name
(QUEUE?)

>
> My knee jerk idea would be to either use the top 8 bits of the
> skb reason enum to denote the space. And then we'd say 0 is core
> 1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space
> you can use whatever encoding you like.
>
> On a quick look nothing is indexed by the reason directly, so no
> problems with using the high bits.
>
> Option #2 is to add one main drop reason called SKB_DROP_REASON_MAC80211
> and have a separate tracepoint which exposes the detailed wifi
> reason and any necessary context. mac80211 would then have its own
> wrapper around kfree_skb_reason() which triggers the tracepoint.
>
> Those are perhaps fairly obvious and unimaginative. Adding Eric,
> since he has been filling in a lot of the drop reasons lately.



-- 
AMA March 31: https://www.broadband.io/c/broadband-grant-events/dave-taht
Dave Täht CEO, TekLibre, LLC

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28  1:09 ` Jakub Kicinski
  2023-03-28  2:27   ` Dave Taht
@ 2023-03-28  7:37   ` Johannes Berg
  2023-03-28  8:16     ` Eric Dumazet
  2023-03-28 22:58     ` Jakub Kicinski
  1 sibling, 2 replies; 12+ messages in thread
From: Johannes Berg @ 2023-03-28  7:37 UTC (permalink / raw)
  To: Jakub Kicinski, Eric Dumazet; +Cc: netdev, linux-wireless

On Mon, 2023-03-27 at 18:09 -0700, Jakub Kicinski wrote:
> > Anyone have any great ideas?
> 
> We need something that'd scale to more subsystems, so I don't think
> having all the definitions in enum skb_drop_reason directly is an
> option.

Fair enough.

> My knee jerk idea would be to either use the top 8 bits of the
> skb reason enum to denote the space. And then we'd say 0 is core
> 1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space 
> you can use whatever encoding you like.

Right. That's not _that_ far from what I proposed above, except you pull
the core out 

> On a quick look nothing is indexed by the reason directly, so no
> problems with using the high bits.

I think you missed he drop_reasons[] array in skbuff.c, but I guess we
could just not add these to the DEFINE_DROP_REASON() macro (and couldn't
really add them anyway).

The only user seems to be drop_monitor, which anyway checks the array
bounds (in the trace hit function.)

Or we change the design of this to actually have each subsystem provide
an array/a callback for their namespace, if the strings are important?
Some registration/unregistration might be needed for modules, but that
could be done.

> Option #2 is to add one main drop reason called SKB_DROP_REASON_MAC80211
> and have a separate tracepoint which exposes the detailed wifi
> reason and any necessary context. mac80211 would then have its own
> wrapper around kfree_skb_reason() which triggers the tracepoint.

Yeah, I considered doing that with just the line number, but who knows
what people might want to use this for in the end, so that's not a great
idea either I guess :-)

I would prefer the version with the drop reasons, since then also you
only have to worry about one tracepoint.

johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28  7:37   ` Johannes Berg
@ 2023-03-28  8:16     ` Eric Dumazet
  2023-03-28  8:18       ` Johannes Berg
  2023-03-28 22:58     ` Jakub Kicinski
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-03-28  8:16 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Jakub Kicinski, netdev, linux-wireless

On Tue, Mar 28, 2023 at 12:37 AM Johannes Berg
<johannes@sipsolutions.net> wrote:
>
> On Mon, 2023-03-27 at 18:09 -0700, Jakub Kicinski wrote:
> > > Anyone have any great ideas?
> >
> > We need something that'd scale to more subsystems, so I don't think
> > having all the definitions in enum skb_drop_reason directly is an
> > option.
>
> Fair enough.
>
> > My knee jerk idea would be to either use the top 8 bits of the
> > skb reason enum to denote the space. And then we'd say 0 is core
> > 1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space
> > you can use whatever encoding you like.
>
> Right. That's not _that_ far from what I proposed above, except you pull
> the core out
>
> > On a quick look nothing is indexed by the reason directly, so no
> > problems with using the high bits.
>
> I think you missed he drop_reasons[] array in skbuff.c, but I guess we
> could just not add these to the DEFINE_DROP_REASON() macro (and couldn't
> really add them anyway).
>
> The only user seems to be drop_monitor, which anyway checks the array
> bounds (in the trace hit function.)
>
> Or we change the design of this to actually have each subsystem provide
> an array/a callback for their namespace, if the strings are important?
> Some registration/unregistration might be needed for modules, but that
> could be done.
>
> > Option #2 is to add one main drop reason called SKB_DROP_REASON_MAC80211
> > and have a separate tracepoint which exposes the detailed wifi
> > reason and any necessary context. mac80211 would then have its own
> > wrapper around kfree_skb_reason() which triggers the tracepoint.
>
> Yeah, I considered doing that with just the line number, but who knows
> what people might want to use this for in the end, so that's not a great
> idea either I guess :-)
>
> I would prefer the version with the drop reasons, since then also you
> only have to worry about one tracepoint.
>

About visibility: Even before 'drop reasons' developers can always use
the call graph .

perf record -a -g -e skb:kfree_skb ...

Really drop reasons are nice when you want filtering and convenience.
But they are a lot of work to add/maintain.
This all comes at a cost (both maintenance but also run time cost
because we need to propagate reasons )

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28  8:16     ` Eric Dumazet
@ 2023-03-28  8:18       ` Johannes Berg
  0 siblings, 0 replies; 12+ messages in thread
From: Johannes Berg @ 2023-03-28  8:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jakub Kicinski, netdev, linux-wireless

On Tue, 2023-03-28 at 01:16 -0700, Eric Dumazet wrote:
> 
> About visibility: Even before 'drop reasons' developers can always use
> the call graph .
> 
> perf record -a -g -e skb:kfree_skb ...

Yeah, except right now it doesn't work in mac80211 because we have these
"RX handlers" that just return a status, and then a single place drops
the skb.

> Really drop reasons are nice when you want filtering and convenience.
> But they are a lot of work to add/maintain.
> This all comes at a cost (both maintenance but also run time cost
> because we need to propagate reasons )

Right ... I don't think the runtime cost would be much of an issue here
(we just get them as a return value from one function to pass to
kfree_skb), but the maintenance is hard.

I guess we could also restructure the code to call kfree_skb() in all
those places that want drop it and then we have the call trace, but I
doubt that would actually be cheaper both in the maintenance sense
(always need to pay attention to it) or in the runtime sense.

johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28  7:37   ` Johannes Berg
  2023-03-28  8:16     ` Eric Dumazet
@ 2023-03-28 22:58     ` Jakub Kicinski
  2023-03-29  8:35       ` Johannes Berg
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2023-03-28 22:58 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, netdev, linux-wireless

On Tue, 28 Mar 2023 09:37:43 +0200 Johannes Berg wrote:
> > My knee jerk idea would be to either use the top 8 bits of the
> > skb reason enum to denote the space. And then we'd say 0 is core
> > 1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space 
> > you can use whatever encoding you like.  
> 
> Right. That's not _that_ far from what I proposed above, except you pull
> the core out 

Thinking about it again, maybe yours is actually cleaner.
Having the subsystem reason on the top bits, I mean.
That way after masking the specific bits out the lower bits
can still provide a valid "global" drop reason.

The UNUSABLE vs MONITOR bits I'd be tempted to put in the "global"
reason, but maybe that's not a great idea given Eric's concern :)

> > On a quick look nothing is indexed by the reason directly, so no
> > problems with using the high bits.  
> 
> I think you missed he drop_reasons[] array in skbuff.c, but I guess we
> could just not add these to the DEFINE_DROP_REASON() macro (and couldn't
> really add them anyway).
> 
> The only user seems to be drop_monitor, which anyway checks the array
> bounds (in the trace hit function.)
> 
> Or we change the design of this to actually have each subsystem provide
> an array/a callback for their namespace, if the strings are important?
> Some registration/unregistration might be needed for modules, but that
> could be done.

Right, drop monitor is good ol' kernel code, we can make it do whatever
we want. I was worried that tracing / BPF may tie our hands but they
support sparse enums just fine.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-28 22:58     ` Jakub Kicinski
@ 2023-03-29  8:35       ` Johannes Berg
  2023-03-29 18:02         ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Johannes Berg @ 2023-03-29  8:35 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Eric Dumazet, netdev, linux-wireless

On Tue, 2023-03-28 at 15:58 -0700, Jakub Kicinski wrote:
> On Tue, 28 Mar 2023 09:37:43 +0200 Johannes Berg wrote:
> > > My knee jerk idea would be to either use the top 8 bits of the
> > > skb reason enum to denote the space. And then we'd say 0 is core
> > > 1 is wifi (enum ieee80211_rx_result) etc. Within the WiFi space 
> > > you can use whatever encoding you like.  
> > 
> > Right. That's not _that_ far from what I proposed above, except you pull
> > the core out 
> 
> Thinking about it again, maybe yours is actually cleaner.
> Having the subsystem reason on the top bits, I mean.
> That way after masking the specific bits out the lower bits
> can still provide a valid "global" drop reason.

Ah, that's not even what I was thinking, but that would work.

What I was thinking was basically the same as you had, just hadn't
thought about more subsystems yet and was trying to carve out some bits
for wifi specifically :-)

I don't think we'll really end up with a case where we really want to
use the low bits for a global reason and a wifi specific reason in
higher bits together - there are basically no applicable reasons that we
have today ...

I mean, it basically doesn't make sense to have any of the current
reasons (such as _IP_CSUM or _IP_INHDR) with a wifi specific reason
since we wouldn't even go look at the IP header when wifi drops
something.

The only one that _might_ be relevant would be possibly _PKT_TOO_SMALL
where wifi has various "reasons" for it to be too small (depending on
the packet format), but I'm not sure that it would even be helpful to
have drop monitor report "packet too small" from different layers; today
it's used from L3/L4, not L2.


> The UNUSABLE vs MONITOR bits I'd be tempted to put in the "global"
> reason, but maybe that's not a great idea given Eric's concern :)

This bit is actually functionally important for us, but to me it doesn't
matter much where it is.

> Right, drop monitor is good ol' kernel code, we can make it do whatever
> we want. I was worried that tracing / BPF may tie our hands but they
> support sparse enums just fine.

Ah, OK, right.



So after all the discussion, I would revise my proposal to be more like
yours. Let's say like the below. This will not put the reason string
into tracing, but I think we can live with that.

johannes



diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index c0a3ea806cd5..f884eb0a1d2d 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -343,6 +343,26 @@ enum skb_drop_reason {
 	 * used as a real 'reason'
 	 */
 	SKB_DROP_REASON_MAX,
+
+	/** @SKB_DROP_REASON_SUBSYS_MASK: subsystem mask in drop reasons,
+	 * see &enum skb_drop_reason_subsys
+	 */
+	SKB_DROP_REASON_SUBSYS_MASK = 0xff000000,
+};
+
+/**
+ * enum skb_drop_reason_subsys - subsystem tag for (extended) drop reasons
+ */
+enum skb_drop_reason_subsys {
+	/** @SKB_DROP_REASON_SUBSYS_CORE: core drop reasons defined above */
+	SKB_DROP_REASON_SUBSYS_CORE,
+	/** @SKB_DROP_REASON_SUBSYS_MAC80211: mac80211 drop reasons,
+	 * see net/mac80211/dropreason.h
+	 */
+	SKB_DROP_REASON_SUBSYS_MAC80211,
+
+	/** @SKB_DROP_REASON_SUBSYS_NUM: number of subsystems defined */
+	SKB_DROP_REASON_SUBSYS_NUM
 };
 
 #define SKB_DR_INIT(name, reason)				\
@@ -358,6 +378,17 @@ enum skb_drop_reason {
 			SKB_DR_SET(name, reason);		\
 	} while (0)
 
-extern const char * const drop_reasons[];
+struct drop_reason_list {
+	const char * const *reasons;
+	size_t n_reasons;
+};
+
+/* Note: due to dynamic registrations, access must be under RCU */
+extern const struct drop_reason_list __rcu *
+drop_reasons_by_subsys[SKB_DROP_REASON_SUBSYS_NUM];
+
+void drop_reasons_register_subsys(enum skb_drop_reason_subsys subsys,
+				  const struct drop_reason_list *list);
+void drop_reasons_unregister_subsys(enum skb_drop_reason_subsys subsys);
 
 #endif
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 5a782d1d8fd3..c6c60dc75b2d 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -21,6 +21,7 @@
 #include <linux/workqueue.h>
 #include <linux/netlink.h>
 #include <linux/net_dropmon.h>
+#include <linux/bitfield.h>
 #include <linux/percpu.h>
 #include <linux/timer.h>
 #include <linux/bitops.h>
@@ -504,8 +505,6 @@ static void net_dm_packet_trace_kfree_skb_hit(void *ignore,
 	if (!nskb)
 		return;
 
-	if (unlikely(reason >= SKB_DROP_REASON_MAX || reason <= 0))
-		reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	cb = NET_DM_SKB_CB(nskb);
 	cb->reason = reason;
 	cb->pc = location;
@@ -552,9 +551,9 @@ static size_t net_dm_in_port_size(void)
 }
 
 #define NET_DM_MAX_SYMBOL_LEN 40
+#define NET_DM_MAX_REASON_LEN 50
 
-static size_t net_dm_packet_report_size(size_t payload_len,
-					enum skb_drop_reason reason)
+static size_t net_dm_packet_report_size(size_t payload_len)
 {
 	size_t size;
 
@@ -576,7 +575,7 @@ static size_t net_dm_packet_report_size(size_t payload_len,
 	       /* NET_DM_ATTR_PROTO */
 	       nla_total_size(sizeof(u16)) +
 	       /* NET_DM_ATTR_REASON */
-	       nla_total_size(strlen(drop_reasons[reason]) + 1) +
+	       nla_total_size(NET_DM_MAX_REASON_LEN + 1) +
 	       /* NET_DM_ATTR_PAYLOAD */
 	       nla_total_size(payload_len);
 }
@@ -610,6 +609,8 @@ static int net_dm_packet_report_fill(struct sk_buff *msg, struct sk_buff *skb,
 				     size_t payload_len)
 {
 	struct net_dm_skb_cb *cb = NET_DM_SKB_CB(skb);
+	const struct drop_reason_list *list = NULL;
+	unsigned int subsys, subsys_reason;
 	char buf[NET_DM_MAX_SYMBOL_LEN];
 	struct nlattr *attr;
 	void *hdr;
@@ -627,9 +628,24 @@ static int net_dm_packet_report_fill(struct sk_buff *msg, struct sk_buff *skb,
 			      NET_DM_ATTR_PAD))
 		goto nla_put_failure;
 
+	rcu_read_lock();
+	subsys = u32_get_bits(cb->reason, SKB_DROP_REASON_SUBSYS_MASK);
+	if (subsys < SKB_DROP_REASON_SUBSYS_NUM)
+		list = rcu_dereference(drop_reasons_by_subsys[subsys]);
+	subsys_reason = cb->reason & ~SKB_DROP_REASON_SUBSYS_MASK;
+	if (!list ||
+	    subsys_reason >= list->n_reasons ||
+	    !list->reasons[subsys_reason] ||
+	    strlen(list->reasons[subsys_reason]) > NET_DM_MAX_REASON_LEN) {
+		list = rcu_dereference(drop_reasons_by_subsys[SKB_DROP_REASON_SUBSYS_CORE]);
+		subsys_reason = SKB_DROP_REASON_NOT_SPECIFIED;
+	}
 	if (nla_put_string(msg, NET_DM_ATTR_REASON,
-			   drop_reasons[cb->reason]))
+			   list->reasons[subsys_reason])) {
+		rcu_read_unlock();
 		goto nla_put_failure;
+	}
+	rcu_read_unlock();
 
 	snprintf(buf, sizeof(buf), "%pS", cb->pc);
 	if (nla_put_string(msg, NET_DM_ATTR_SYMBOL, buf))
@@ -687,9 +703,7 @@ static void net_dm_packet_report(struct sk_buff *skb)
 	if (net_dm_trunc_len)
 		payload_len = min_t(size_t, net_dm_trunc_len, payload_len);
 
-	msg = nlmsg_new(net_dm_packet_report_size(payload_len,
-						  NET_DM_SKB_CB(skb)->reason),
-			GFP_KERNEL);
+	msg = nlmsg_new(net_dm_packet_report_size(payload_len), GFP_KERNEL);
 	if (!msg)
 		goto out;
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 050a875d09c5..3449f9acbc6c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -122,11 +122,59 @@ EXPORT_SYMBOL(sysctl_max_skb_frags);
 
 #undef FN
 #define FN(reason) [SKB_DROP_REASON_##reason] = #reason,
-const char * const drop_reasons[] = {
+static const char * const drop_reasons[] = {
 	[SKB_CONSUMED] = "CONSUMED",
 	DEFINE_DROP_REASON(FN, FN)
 };
-EXPORT_SYMBOL(drop_reasons);
+
+static const struct drop_reason_list drop_reasons_core = {
+	.reasons = drop_reasons,
+	.n_reasons = ARRAY_SIZE(drop_reasons),
+};
+
+const struct drop_reason_list __rcu *
+drop_reasons_by_subsys[SKB_DROP_REASON_SUBSYS_NUM] = {
+	[SKB_DROP_REASON_SUBSYS_CORE] = &drop_reasons_core,
+};
+EXPORT_SYMBOL(drop_reasons_by_subsys);
+
+/**
+ * drop_reasons_register_subsys - register another drop reason subsystem
+ * @subsys: the subsystem to register, must not be the core
+ * @list: the list of drop reasons within the subsystem, must point to
+ *	a statically initialized list
+ */
+void drop_reasons_register_subsys(enum skb_drop_reason_subsys subsys,
+				  const struct drop_reason_list *list)
+{
+	if (WARN(subsys <= SKB_DROP_REASON_SUBSYS_CORE ||
+		 subsys >= ARRAY_SIZE(drop_reasons_by_subsys),
+		 "invalid subsystem %d\n", subsys))
+		return;
+
+	/* must point to statically allocated memory, so INIT is OK */
+	RCU_INIT_POINTER(drop_reasons_by_subsys[subsys], list);
+}
+EXPORT_SYMBOL_GPL(drop_reasons_register_subsys);
+
+/**
+ * drop_reasons_unregister_subsys - unregister a drop reason subsystem
+ * @subsys: the subsystem to remove, must not be the core
+ *
+ * Note: This will synchronize_rcu() to ensure no users when it returns.
+ */
+void drop_reasons_unregister_subsys(enum skb_drop_reason_subsys subsys)
+{
+	if (WARN(subsys <= SKB_DROP_REASON_SUBSYS_CORE ||
+		 subsys >= ARRAY_SIZE(drop_reasons_by_subsys),
+		 "invalid subsystem %d\n", subsys))
+		return;
+
+	RCU_INIT_POINTER(drop_reasons_by_subsys[subsys], NULL);
+
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(drop_reasons_unregister_subsys);
 
 /**
  *	skb_panic - private function for out-of-line support


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-29  8:35       ` Johannes Berg
@ 2023-03-29 18:02         ` Jakub Kicinski
  2023-03-29 18:57           ` Johannes Berg
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2023-03-29 18:02 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, netdev, linux-wireless

On Wed, 29 Mar 2023 10:35:08 +0200 Johannes Berg wrote:
> > Thinking about it again, maybe yours is actually cleaner.
> > Having the subsystem reason on the top bits, I mean.
> > That way after masking the specific bits out the lower bits
> > can still provide a valid "global" drop reason.  
> 
> Ah, that's not even what I was thinking, but that would work.
> 
> What I was thinking was basically the same as you had, just hadn't
> thought about more subsystems yet and was trying to carve out some bits
> for wifi specifically :-)
> 
> I don't think we'll really end up with a case where we really want to
> use the low bits for a global reason and a wifi specific reason in
> higher bits together - there are basically no applicable reasons that we
> have today ...
> 
> I mean, it basically doesn't make sense to have any of the current
> reasons (such as _IP_CSUM or _IP_INHDR) with a wifi specific reason
> since we wouldn't even go look at the IP header when wifi drops
> something.
> 
> The only one that _might_ be relevant would be possibly _PKT_TOO_SMALL
> where wifi has various "reasons" for it to be too small (depending on
> the packet format), but I'm not sure that it would even be helpful to
> have drop monitor report "packet too small" from different layers; today
> it's used from L3/L4, not L2.

No, no what I was trying to say is that instead of using the upper bits
to identify the space (with 0 being the current enum skb_drop_reason)
we could use entries in enum skb_drop_reason. In hope that it'd make
the fine grained subsystem reason seem more like additional information
than a completely parallel system.

But it's just a thought, all of the approaches seem acceptable.

Quick code change perhaps illustrates it best:

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index c0a3ea806cd5..048402ffa6ad 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -338,11 +338,21 @@ enum skb_drop_reason {
 	 * for another host.
 	 */
 	SKB_DROP_REASON_IPV6_NDISC_NS_OTHERHOST,
+	/* packet was unusable? IDK */
+	SKB_DROP_REASON_MAC80211_UNUSABLE,
+	/* also no idea :) */
+	SKB_DROP_REASON_MAC80211_MONITOR,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
 	 */
 	SKB_DROP_REASON_MAX,
+
+	/**
+	 * @SKB_DROP_SUBSYS_REASON_MASK: fine grained reason from a particular
+	 * subsystem
+	 */
+	SKB_DROP_SUBSYS_REASON_MASK = 0xffff0000,
 };
 
 #define SKB_DR_INIT(name, reason)				\
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 5a782d1d8fd3..a06ba912c793 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -504,6 +504,7 @@ static void net_dm_packet_trace_kfree_skb_hit(void *ignore,
 	if (!nskb)
 		return;
 
+	reason &= ~SKB_DROP_SUBSYS_REASON_MASK;
 	if (unlikely(reason >= SKB_DROP_REASON_MAX || reason <= 0))
 		reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	cb = NET_DM_SKB_CB(nskb);
@@ -611,6 +612,7 @@ static int net_dm_packet_report_fill(struct sk_buff *msg, struct sk_buff *skb,
 {
 	struct net_dm_skb_cb *cb = NET_DM_SKB_CB(skb);
 	char buf[NET_DM_MAX_SYMBOL_LEN];
+	unsigned int reason, subreason;
 	struct nlattr *attr;
 	void *hdr;
 	int rc;
@@ -627,10 +629,19 @@ static int net_dm_packet_report_fill(struct sk_buff *msg, struct sk_buff *skb,
 			      NET_DM_ATTR_PAD))
 		goto nla_put_failure;
 
+	subreason = FIELD_GET(SKB_DROP_SUBSYS_REASON_MASK, cb->reason);
+	reason = cb->reason & ~SKB_DROP_SUBSYS_REASON_MASK;
 	if (nla_put_string(msg, NET_DM_ATTR_REASON,
-			   drop_reasons[cb->reason]))
+			   drop_reasons[reason]))
 		goto nla_put_failure;
 
+	if (subreason) {
+		/* additionally search the per-subsys table,
+		 * table is found based on @reason
+		 * and indexed with @subreason
+		 */
+	}
+
 	snprintf(buf, sizeof(buf), "%pS", cb->pc);
 	if (nla_put_string(msg, NET_DM_ATTR_SYMBOL, buf))
 		goto nla_put_failure;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-29 18:02         ` Jakub Kicinski
@ 2023-03-29 18:57           ` Johannes Berg
  2023-03-29 19:09             ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Johannes Berg @ 2023-03-29 18:57 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Eric Dumazet, netdev, linux-wireless

On Wed, 2023-03-29 at 11:02 -0700, Jakub Kicinski wrote:
> 
> No, no what I was trying to say is that instead of using the upper bits
> to identify the space (with 0 being the current enum skb_drop_reason)
> we could use entries in enum skb_drop_reason. In hope that it'd make
> the fine grained subsystem reason seem more like additional information
> than a completely parallel system.

Ah! Looking at your code example ... right, so you'd see "mac80211 drop
unusable" or "mac80211 drop to monitor", and fine-grained in the higher
bits.

> But it's just a thought, all of the approaches seem acceptable.

I _think_ I like the one I prototyped this morning better, I'm not sure
I like the subsystem == existing reason part _that_ much. It ultimately
doesn't matter much, it just feels odd that you'd be allowed to have a,
I don't know picking a random example, SKB_DROP_REASON_DUP_FRAG with a
fine-grained higher bits value?

Not that we'll ever be starved for space ...

> Quick code change perhaps illustrates it best:
> 

Yeah, that ends up really looking very similar :-)

Then again thinking about the implementation, we'd not be able to use a
simple array for the sub-reasons, or at least that'd waste a bunch of
space, since there are already quite a few 'main' reasons and we'd
want/need to add the mac80211 ones (with sub-reason) at the end. So that
makes a big array for the sub-reasons that's very sparsely populated (*)
Extending with a high 'subsystem' like I did this morning is more
compact here.

(*) or put the sub-reasons pointer/num with the 'main' reasons into the
drop_reasons[] array but that would take the same additional space


So ... which one do _you_ like better? I think I somewhat prefer the one
with adding a high bits subsystem, but I can relatively easily rejigger
my changes from this morning to implement the semantics you had here
too.

Anyone else have an opinion? :)

johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: traceability of wifi packet drops
  2023-03-29 18:57           ` Johannes Berg
@ 2023-03-29 19:09             ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2023-03-29 19:09 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, netdev, linux-wireless

On Wed, 29 Mar 2023 20:57:26 +0200 Johannes Berg wrote:
> On Wed, 2023-03-29 at 11:02 -0700, Jakub Kicinski wrote:
> > But it's just a thought, all of the approaches seem acceptable.  
> 
> I _think_ I like the one I prototyped this morning better, I'm not sure
> I like the subsystem == existing reason part _that_ much. It ultimately
> doesn't matter much, it just feels odd that you'd be allowed to have a,
> I don't know picking a random example, SKB_DROP_REASON_DUP_FRAG with a
> fine-grained higher bits value?
> 
> Not that we'll ever be starved for space ...

Ack, for most drop_reasons having higher order bits would make no sense.

> > Quick code change perhaps illustrates it best:
> >   
> 
> Yeah, that ends up really looking very similar :-)
> 
> Then again thinking about the implementation, we'd not be able to use a
> simple array for the sub-reasons, or at least that'd waste a bunch of
> space, since there are already quite a few 'main' reasons and we'd
> want/need to add the mac80211 ones (with sub-reason) at the end. So that
> makes a big array for the sub-reasons that's very sparsely populated (*)
> Extending with a high 'subsystem' like I did this morning is more
> compact here.
> 
> (*) or put the sub-reasons pointer/num with the 'main' reasons into the
> drop_reasons[] array but that would take the same additional space

Yup, the only difference is that the collector side is simpler if the
subsystem is a valid drop reason. For those who don't expect to care
about subsystem drop details the aggregate stats are still (bpftrace
notation):

	@stats[reason & 0xffff] = count();

With the higher bits we have to add a layer of stats to the collection?

	$grp = reason >> 24;
	if ($grp != 0)
		@groups[$grp] = count();
	else
		@stats[reason] = count();

That said, I'm probably over-thinking because most will do:

	@stats[reason] = count();

... which works the same regardless.

> So ... which one do _you_ like better? I think I somewhat prefer the one
> with adding a high bits subsystem, but I can relatively easily rejigger
> my changes from this morning to implement the semantics you had here
> too.

No preference. You're coding it up so you're in the best position 
to pick :)

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-03-29 19:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-27 14:19 traceability of wifi packet drops Johannes Berg
2023-03-27 14:31 ` Johannes Berg
2023-03-28  1:09 ` Jakub Kicinski
2023-03-28  2:27   ` Dave Taht
2023-03-28  7:37   ` Johannes Berg
2023-03-28  8:16     ` Eric Dumazet
2023-03-28  8:18       ` Johannes Berg
2023-03-28 22:58     ` Jakub Kicinski
2023-03-29  8:35       ` Johannes Berg
2023-03-29 18:02         ` Jakub Kicinski
2023-03-29 18:57           ` Johannes Berg
2023-03-29 19:09             ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).