All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH bpf-next v4 0/2] Replace mono_delivery_time with tstamp_type
@ 2024-04-18  0:43 Abhishek Chauhan
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty Abhishek Chauhan
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type Abhishek Chauhan
  0 siblings, 2 replies; 16+ messages in thread
From: Abhishek Chauhan @ 2024-04-18  0:43 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Willem de Bruijn,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel

Patch 1 :- This patch takes care of only renaming the mono delivery
timestamp to tstamp_type with no change in functionality of 
existing available code in kernel also  
Starts assigning tstamp_type with either mono or real and 
introduces a new enum in the skbuff.h, again no change in functionality 
of the existing available code in kernel , just making the code scalable.

Patch 2 :- Additional bit was added to support tai timestamp type to 
avoid tstamp drops in the forwarding path when testing TC-ETF. 
With this patch i am updating bpf filter.c and not sure
what impacts it has towards BPF code. 
I need upstream BPF community to help me in adding the necessary BPF 
changes to avoid any BPF test case failures. 
I have added some BPF changes as part of this commit to handle 
cases of both tai and mono bit being set at the same time. 

Abhishek Chauhan (2):
  net: Rename mono_delivery_time to tstamp_type for scalabilty
  net: Add additional bit to support clockid_t timestamp type

 include/linux/skbuff.h                        | 61 +++++++++++++++----
 include/net/inet_frag.h                       |  4 +-
 include/uapi/linux/bpf.h                      |  1 +
 net/bridge/netfilter/nf_conntrack_bridge.c    |  6 +-
 net/core/dev.c                                |  2 +-
 net/core/filter.c                             | 50 ++++++++++++---
 net/ieee802154/6lowpan/reassembly.c           |  2 +-
 net/ipv4/inet_fragment.c                      |  2 +-
 net/ipv4/ip_fragment.c                        |  2 +-
 net/ipv4/ip_output.c                          | 11 ++--
 net/ipv4/raw.c                                |  2 +-
 net/ipv4/tcp_output.c                         | 14 ++---
 net/ipv6/ip6_output.c                         |  8 +--
 net/ipv6/netfilter.c                          |  6 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c       |  2 +-
 net/ipv6/raw.c                                |  2 +-
 net/ipv6/reassembly.c                         |  2 +-
 net/ipv6/tcp_ipv6.c                           |  2 +-
 net/packet/af_packet.c                        |  7 +--
 net/sched/act_bpf.c                           |  4 +-
 net/sched/cls_bpf.c                           |  4 +-
 .../selftests/bpf/prog_tests/ctx_rewrite.c    | 13 ++--
 22 files changed, 139 insertions(+), 68 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18  0:43 [RFC PATCH bpf-next v4 0/2] Replace mono_delivery_time with tstamp_type Abhishek Chauhan
@ 2024-04-18  0:43 ` Abhishek Chauhan
  2024-04-18 18:47   ` Willem de Bruijn
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type Abhishek Chauhan
  1 sibling, 1 reply; 16+ messages in thread
From: Abhishek Chauhan @ 2024-04-18  0:43 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Willem de Bruijn,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel

mono_delivery_time was added to check if skb->tstamp has delivery
time in mono clock base (i.e. EDT) otherwise skb->tstamp has
timestamp in ingress and delivery_time at egress.

Renaming the bitfield from mono_delivery_time to tstamp_type is for
extensibilty for other timestamps such as userspace timestamp
(i.e. SO_TXTIME) set via sock opts.

As we are renaming the mono_delivery_time to tstamp_type, it makes
sense to start assigning tstamp_type based on enum defined
in this commit.

Earlier we used bool arg flag to check if the tstamp is mono in
function skb_set_delivery_time, Now the signature of the functions
accepts tstamp_type to distinguish between mono and real time.

In future tstamp_type:1 can be extended to support userspace timestamp
by increasing the bitfield.

Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
---
Changes since v3
- Fixed inconsistent capitalization in skbuff.h
- remove reference to MONO_DELIVERY_TIME_MASK in skbuff.h
  and point it to skb_tstamp_type now.
- Explicitely setting SKB_CLOCK_MONO if valid transmit_time
  ip_send_unicast_reply 
- Keeping skb_tstamp inline with skb_clear_tstamp. 
- skb_set_delivery_time checks if timstamp is 0 and 
  sets the tstamp_type to SKB_CLOCK_REAL.
- Above comments are given by Willem 
- Found out that skbuff.h has access to uapi/linux/time.h
  So now instead of using  CLOCK_REAL/CLOCK_MONO 
  i am checking actual clockid_t directly to set tstamp_type 
  example:- CLOCK_REALTIME/CLOCK_MONOTONIC 
- Compilation error fixed in 
  net/ieee802154/6lowpan/reassembly.c

Changes since v2
- Minor changes to commit subject

Changes since v1
- Squashed the two commits into one as mentioned by Willem.
- Introduced switch in skb_set_delivery_time.
- Renamed and removed directionality aspects w.r.t tstamp_type 
  as mentioned by Willem.

 include/linux/skbuff.h                     | 44 +++++++++++++++++-----
 include/net/inet_frag.h                    |  4 +-
 net/bridge/netfilter/nf_conntrack_bridge.c |  6 +--
 net/core/dev.c                             |  2 +-
 net/core/filter.c                          | 10 ++---
 net/ieee802154/6lowpan/reassembly.c        |  2 +-
 net/ipv4/inet_fragment.c                   |  2 +-
 net/ipv4/ip_fragment.c                     |  2 +-
 net/ipv4/ip_output.c                       |  9 +++--
 net/ipv4/tcp_output.c                      | 14 +++----
 net/ipv6/ip6_output.c                      |  6 +--
 net/ipv6/netfilter.c                       |  6 +--
 net/ipv6/netfilter/nf_conntrack_reasm.c    |  2 +-
 net/ipv6/reassembly.c                      |  2 +-
 net/ipv6/tcp_ipv6.c                        |  2 +-
 net/sched/act_bpf.c                        |  4 +-
 net/sched/cls_bpf.c                        |  4 +-
 17 files changed, 73 insertions(+), 48 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index aded4949ea79..2e7811c7e4db 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -702,6 +702,17 @@ typedef unsigned int sk_buff_data_t;
 typedef unsigned char *sk_buff_data_t;
 #endif
 
+/**
+ * tstamp_type:1 can take 2 values each
+ * represented by time base in skb
+ * 0x0 => real timestamp_type
+ * 0x1 => mono timestamp_type
+ */
+enum skb_tstamp_type {
+	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
+	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
+};
+
 /**
  * DOC: Basic sk_buff geometry
  *
@@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@dst_pending_confirm: need to confirm neighbour
  *	@decrypted: Decrypted SKB
  *	@slow_gro: state present at GRO time, slower prepare step required
- *	@mono_delivery_time: When set, skb->tstamp has the
+ *	@tstamp_type: When set, skb->tstamp has the
  *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
  *		skb->tstamp has the (rcv) timestamp at ingress and
  *		delivery_time at egress.
@@ -950,7 +961,7 @@ struct sk_buff {
 	/* private: */
 	__u8			__mono_tc_offset[0];
 	/* public: */
-	__u8			mono_delivery_time:1;	/* See SKB_MONO_DELIVERY_TIME_MASK */
+	__u8			tstamp_type:1;	/* See SKB_CLOCK_*_MASK */
 #ifdef CONFIG_NET_XGRESS
 	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
 	__u8			tc_skip_classify:1;
@@ -4170,7 +4181,7 @@ static inline void skb_get_new_timestampns(const struct sk_buff *skb,
 static inline void __net_timestamp(struct sk_buff *skb)
 {
 	skb->tstamp = ktime_get_real();
-	skb->mono_delivery_time = 0;
+	skb->tstamp_type = SKB_CLOCK_REAL;
 }
 
 static inline ktime_t net_timedelta(ktime_t t)
@@ -4179,10 +4190,23 @@ static inline ktime_t net_timedelta(ktime_t t)
 }
 
 static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
-					 bool mono)
+					  u8 tstamp_type)
 {
 	skb->tstamp = kt;
-	skb->mono_delivery_time = kt && mono;
+
+	if (!kt) {
+		skb->tstamp_type = SKB_CLOCK_REAL;
+		return;
+	}
+
+	switch (tstamp_type) {
+	case CLOCK_REALTIME:
+		skb->tstamp_type = SKB_CLOCK_REAL;
+		break;
+	case CLOCK_MONOTONIC:
+		skb->tstamp_type = SKB_CLOCK_MONO;
+		break;
+	}
 }
 
 DECLARE_STATIC_KEY_FALSE(netstamp_needed_key);
@@ -4192,8 +4216,8 @@ DECLARE_STATIC_KEY_FALSE(netstamp_needed_key);
  */
 static inline void skb_clear_delivery_time(struct sk_buff *skb)
 {
-	if (skb->mono_delivery_time) {
-		skb->mono_delivery_time = 0;
+	if (skb->tstamp_type) {
+		skb->tstamp_type = SKB_CLOCK_REAL;
 		if (static_branch_unlikely(&netstamp_needed_key))
 			skb->tstamp = ktime_get_real();
 		else
@@ -4203,7 +4227,7 @@ static inline void skb_clear_delivery_time(struct sk_buff *skb)
 
 static inline void skb_clear_tstamp(struct sk_buff *skb)
 {
-	if (skb->mono_delivery_time)
+	if (skb->tstamp_type)
 		return;
 
 	skb->tstamp = 0;
@@ -4211,7 +4235,7 @@ static inline void skb_clear_tstamp(struct sk_buff *skb)
 
 static inline ktime_t skb_tstamp(const struct sk_buff *skb)
 {
-	if (skb->mono_delivery_time)
+	if (skb->tstamp_type)
 		return 0;
 
 	return skb->tstamp;
@@ -4219,7 +4243,7 @@ static inline ktime_t skb_tstamp(const struct sk_buff *skb)
 
 static inline ktime_t skb_tstamp_cond(const struct sk_buff *skb, bool cond)
 {
-	if (!skb->mono_delivery_time && skb->tstamp)
+	if (skb->tstamp_type != SKB_CLOCK_MONO && skb->tstamp)
 		return skb->tstamp;
 
 	if (static_branch_unlikely(&netstamp_needed_key) || cond)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 153960663ce4..5af6eb14c5db 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -76,7 +76,7 @@ struct frag_v6_compare_key {
  * @stamp: timestamp of the last received fragment
  * @len: total length of the original datagram
  * @meat: length of received fragments so far
- * @mono_delivery_time: stamp has a mono delivery time (EDT)
+ * @tstamp_type: stamp has a mono delivery time (EDT)
  * @flags: fragment queue flags
  * @max_size: maximum received fragment size
  * @fqdir: pointer to struct fqdir
@@ -97,7 +97,7 @@ struct inet_frag_queue {
 	ktime_t			stamp;
 	int			len;
 	int			meat;
-	u8			mono_delivery_time;
+	u8			tstamp_type;
 	__u8			flags;
 	u16			max_size;
 	struct fqdir		*fqdir;
diff --git a/net/bridge/netfilter/nf_conntrack_bridge.c b/net/bridge/netfilter/nf_conntrack_bridge.c
index c3c51b9a6826..816bb0fde718 100644
--- a/net/bridge/netfilter/nf_conntrack_bridge.c
+++ b/net/bridge/netfilter/nf_conntrack_bridge.c
@@ -32,7 +32,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
 					   struct sk_buff *))
 {
 	int frag_max_size = BR_INPUT_SKB_CB(skb)->frag_max_size;
-	bool mono_delivery_time = skb->mono_delivery_time;
+	u8 tstamp_type = skb->tstamp_type;
 	unsigned int hlen, ll_rs, mtu;
 	ktime_t tstamp = skb->tstamp;
 	struct ip_frag_state state;
@@ -82,7 +82,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
 			if (iter.frag)
 				ip_fraglist_prepare(skb, &iter);
 
-			skb_set_delivery_time(skb, tstamp, mono_delivery_time);
+			skb_set_delivery_time(skb, tstamp, tstamp_type);
 			err = output(net, sk, data, skb);
 			if (err || !iter.frag)
 				break;
@@ -113,7 +113,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
 			goto blackhole;
 		}
 
-		skb_set_delivery_time(skb2, tstamp, mono_delivery_time);
+		skb_set_delivery_time(skb2, tstamp, tstamp_type);
 		err = output(net, sk, data, skb2);
 		if (err)
 			goto blackhole;
diff --git a/net/core/dev.c b/net/core/dev.c
index 854a3a28a8d8..4e7dfefbb824 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2146,7 +2146,7 @@ EXPORT_SYMBOL(net_disable_timestamp);
 static inline void net_timestamp_set(struct sk_buff *skb)
 {
 	skb->tstamp = 0;
-	skb->mono_delivery_time = 0;
+	skb->tstamp_type = SKB_CLOCK_REAL;
 	if (static_branch_unlikely(&netstamp_needed_key))
 		skb->tstamp = ktime_get_real();
 }
diff --git a/net/core/filter.c b/net/core/filter.c
index 8d185d99a643..b1192e672cbb 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7709,13 +7709,13 @@ BPF_CALL_3(bpf_skb_set_tstamp, struct sk_buff *, skb,
 		if (!tstamp)
 			return -EINVAL;
 		skb->tstamp = tstamp;
-		skb->mono_delivery_time = 1;
+		skb->tstamp_type = SKB_CLOCK_MONO;
 		break;
 	case BPF_SKB_TSTAMP_UNSPEC:
 		if (tstamp)
 			return -EINVAL;
 		skb->tstamp = 0;
-		skb->mono_delivery_time = 0;
+		skb->tstamp_type = SKB_CLOCK_REAL;
 		break;
 	default:
 		return -EINVAL;
@@ -9422,7 +9422,7 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
 					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
 		*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
 					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
-		/* skb->tc_at_ingress && skb->mono_delivery_time,
+		/* skb->tc_at_ingress && skb->tstamp_type:1,
 		 * read 0 as the (rcv) timestamp.
 		 */
 		*insn++ = BPF_MOV64_IMM(value_reg, 0);
@@ -9447,7 +9447,7 @@ static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
 	 * the bpf prog is aware the tstamp could have delivery time.
 	 * Thus, write skb->tstamp as is if tstamp_type_access is true.
 	 * Otherwise, writing at ingress will have to clear the
-	 * mono_delivery_time bit also.
+	 * mono_delivery_time (skb->tstamp_type:1)bit also.
 	 */
 	if (!prog->tstamp_type_access) {
 		__u8 tmp_reg = BPF_REG_AX;
@@ -9457,7 +9457,7 @@ static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
 		*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
 		/* goto <store> */
 		*insn++ = BPF_JMP_A(2);
-		/* <clear>: mono_delivery_time */
+		/* <clear>: mono_delivery_time or (skb->tstamp_type:1) */
 		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_MONO_DELIVERY_TIME_MASK);
 		*insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg, SKB_BF_MONO_TC_OFFSET);
 	}
diff --git a/net/ieee802154/6lowpan/reassembly.c b/net/ieee802154/6lowpan/reassembly.c
index 6dd960ec558c..ba0455ad7701 100644
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -130,7 +130,7 @@ static int lowpan_frag_queue(struct lowpan_frag_queue *fq,
 		goto err;
 
 	fq->q.stamp = skb->tstamp;
-	fq->q.mono_delivery_time = skb->mono_delivery_time;
+	fq->q.tstamp_type = skb->tstamp_type;
 	if (frag_type == LOWPAN_DISPATCH_FRAG1)
 		fq->q.flags |= INET_FRAG_FIRST_IN;
 
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index faaec92a46ac..d179a2c84222 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -619,7 +619,7 @@ void inet_frag_reasm_finish(struct inet_frag_queue *q, struct sk_buff *head,
 	skb_mark_not_on_list(head);
 	head->prev = NULL;
 	head->tstamp = q->stamp;
-	head->mono_delivery_time = q->mono_delivery_time;
+	head->tstamp_type = q->tstamp_type;
 
 	if (sk)
 		refcount_add(sum_truesize - head_truesize, &sk->sk_wmem_alloc);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index fb947d1613fe..787aa86800f5 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -355,7 +355,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 		qp->iif = dev->ifindex;
 
 	qp->q.stamp = skb->tstamp;
-	qp->q.mono_delivery_time = skb->mono_delivery_time;
+	qp->q.tstamp_type = skb->tstamp_type;
 	qp->q.meat += skb->len;
 	qp->ecn |= ecn;
 	add_frag_mem_limit(qp->q.fqdir, skb->truesize);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 1fe794967211..f29349151203 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -764,7 +764,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 {
 	struct iphdr *iph;
 	struct sk_buff *skb2;
-	bool mono_delivery_time = skb->mono_delivery_time;
+	u8 tstamp_type = skb->tstamp_type;
 	struct rtable *rt = skb_rtable(skb);
 	unsigned int mtu, hlen, ll_rs;
 	struct ip_fraglist_iter iter;
@@ -856,7 +856,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 				}
 			}
 
-			skb_set_delivery_time(skb, tstamp, mono_delivery_time);
+			skb_set_delivery_time(skb, tstamp, tstamp_type);
 			err = output(net, sk, skb);
 
 			if (!err)
@@ -912,7 +912,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 		/*
 		 *	Put this fragment into the sending queue.
 		 */
-		skb_set_delivery_time(skb2, tstamp, mono_delivery_time);
+		skb_set_delivery_time(skb2, tstamp, tstamp_type);
 		err = output(net, sk, skb2);
 		if (err)
 			goto fail;
@@ -1649,7 +1649,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 			  arg->csumoffset) = csum_fold(csum_add(nskb->csum,
 								arg->csum));
 		nskb->ip_summed = CHECKSUM_NONE;
-		nskb->mono_delivery_time = !!transmit_time;
+		if (transmit_time)
+			nskb->tstamp_type = SKB_CLOCK_MONO;
 		if (txhash)
 			skb_set_hash(nskb, txhash, PKT_HASH_TYPE_L4);
 		ip_push_pending_frames(sk, &fl4);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 61119d42b0fd..a062f88c47c3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 	tp = tcp_sk(sk);
 	prior_wstamp = tp->tcp_wstamp_ns;
 	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
-	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
+	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
 	if (clone_it) {
 		oskb = skb;
 
@@ -1650,7 +1650,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 
 	skb_split(skb, buff, len);
 
-	skb_set_delivery_time(buff, skb->tstamp, true);
+	skb_set_delivery_time(buff, skb->tstamp, CLOCK_MONOTONIC);
 	tcp_fragment_tstamp(skb, buff);
 
 	old_factor = tcp_skb_pcount(skb);
@@ -2731,7 +2731,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
 			/* "skb_mstamp_ns" is used as a start point for the retransmit timer */
 			tp->tcp_wstamp_ns = tp->tcp_clock_cache;
-			skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
+			skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
 			list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue);
 			tcp_init_tso_segs(skb, mss_now);
 			goto repair; /* Skip network transmission */
@@ -3714,11 +3714,11 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 #ifdef CONFIG_SYN_COOKIES
 	if (unlikely(synack_type == TCP_SYNACK_COOKIE && ireq->tstamp_ok))
 		skb_set_delivery_time(skb, cookie_init_timestamp(req, now),
-				      true);
+				      CLOCK_MONOTONIC);
 	else
 #endif
 	{
-		skb_set_delivery_time(skb, now, true);
+		skb_set_delivery_time(skb, now, CLOCK_MONOTONIC);
 		if (!tcp_rsk(req)->snt_synack) /* Timestamp first SYNACK */
 			tcp_rsk(req)->snt_synack = tcp_skb_timestamp_us(skb);
 	}
@@ -3805,7 +3805,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	bpf_skops_write_hdr_opt((struct sock *)sk, skb, req, syn_skb,
 				synack_type, &opts);
 
-	skb_set_delivery_time(skb, now, true);
+	skb_set_delivery_time(skb, now, CLOCK_MONOTONIC);
 	tcp_add_tx_delay(skb, tp);
 
 	return skb;
@@ -3989,7 +3989,7 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
 
 	err = tcp_transmit_skb(sk, syn_data, 1, sk->sk_allocation);
 
-	skb_set_delivery_time(syn, syn_data->skb_mstamp_ns, true);
+	skb_set_delivery_time(syn, syn_data->skb_mstamp_ns, CLOCK_MONOTONIC);
 
 	/* Now full SYN+DATA was cloned and sent (or not),
 	 * remove the SYN from the original skb (syn_data)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index b9dd3a66e423..a9e819115622 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -859,7 +859,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 	struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
 	struct ipv6_pinfo *np = skb->sk && !dev_recursion_level() ?
 				inet6_sk(skb->sk) : NULL;
-	bool mono_delivery_time = skb->mono_delivery_time;
+	u8 tstamp_type = skb->tstamp_type;
 	struct ip6_frag_state state;
 	unsigned int mtu, hlen, nexthdr_offset;
 	ktime_t tstamp = skb->tstamp;
@@ -955,7 +955,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 			if (iter.frag)
 				ip6_fraglist_prepare(skb, &iter);
 
-			skb_set_delivery_time(skb, tstamp, mono_delivery_time);
+			skb_set_delivery_time(skb, tstamp, tstamp_type);
 			err = output(net, sk, skb);
 			if (!err)
 				IP6_INC_STATS(net, ip6_dst_idev(&rt->dst),
@@ -1016,7 +1016,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 		/*
 		 *	Put this fragment into the sending queue.
 		 */
-		skb_set_delivery_time(frag, tstamp, mono_delivery_time);
+		skb_set_delivery_time(frag, tstamp, tstamp_type);
 		err = output(net, sk, frag);
 		if (err)
 			goto fail;
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 53d255838e6a..e0c2347b4dc6 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -126,7 +126,7 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 				  struct sk_buff *))
 {
 	int frag_max_size = BR_INPUT_SKB_CB(skb)->frag_max_size;
-	bool mono_delivery_time = skb->mono_delivery_time;
+	u8 tstamp_type = skb->tstamp_type;
 	ktime_t tstamp = skb->tstamp;
 	struct ip6_frag_state state;
 	u8 *prevhdr, nexthdr = 0;
@@ -192,7 +192,7 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 			if (iter.frag)
 				ip6_fraglist_prepare(skb, &iter);
 
-			skb_set_delivery_time(skb, tstamp, mono_delivery_time);
+			skb_set_delivery_time(skb, tstamp, tstamp_type);
 			err = output(net, sk, data, skb);
 			if (err || !iter.frag)
 				break;
@@ -225,7 +225,7 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 			goto blackhole;
 		}
 
-		skb_set_delivery_time(skb2, tstamp, mono_delivery_time);
+		skb_set_delivery_time(skb2, tstamp, tstamp_type);
 		err = output(net, sk, data, skb2);
 		if (err)
 			goto blackhole;
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index d0dcbaca1994..5cc5d823d33f 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -264,7 +264,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
 		fq->iif = dev->ifindex;
 
 	fq->q.stamp = skb->tstamp;
-	fq->q.mono_delivery_time = skb->mono_delivery_time;
+	fq->q.tstamp_type = skb->tstamp_type;
 	fq->q.meat += skb->len;
 	fq->ecn |= ecn;
 	if (payload_len > fq->q.max_size)
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index acb4f119e11f..ea724ff558b4 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -198,7 +198,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 		fq->iif = dev->ifindex;
 
 	fq->q.stamp = skb->tstamp;
-	fq->q.mono_delivery_time = skb->mono_delivery_time;
+	fq->q.tstamp_type = skb->tstamp_type;
 	fq->q.meat += skb->len;
 	fq->ecn |= ecn;
 	add_frag_mem_limit(fq->q.fqdir, skb->truesize);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index bb7c3caf4f85..7f8a8bd77547 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -975,7 +975,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 			mark = inet_twsk(sk)->tw_mark;
 		else
 			mark = READ_ONCE(sk->sk_mark);
-		skb_set_delivery_time(buff, tcp_transmit_time(sk), true);
+		skb_set_delivery_time(buff, tcp_transmit_time(sk), CLOCK_MONOTONIC);
 	}
 	if (txhash) {
 		/* autoflowlabel/skb_get_hash_flowi6 rely on buff->hash */
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 0e3cf11ae5fc..9cd12d61818a 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -54,8 +54,8 @@ TC_INDIRECT_SCOPE int tcf_bpf_act(struct sk_buff *skb,
 		bpf_compute_data_pointers(skb);
 		filter_res = bpf_prog_run(filter, skb);
 	}
-	if (unlikely(!skb->tstamp && skb->mono_delivery_time))
-		skb->mono_delivery_time = 0;
+	if (unlikely(!skb->tstamp && skb->tstamp_type))
+		skb->tstamp_type = SKB_CLOCK_REAL;
 	if (skb_sk_is_prefetched(skb) && filter_res != TC_ACT_OK)
 		skb_orphan(skb);
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 5e83e890f6a4..506516be5287 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -104,8 +104,8 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
 			bpf_compute_data_pointers(skb);
 			filter_res = bpf_prog_run(prog->filter, skb);
 		}
-		if (unlikely(!skb->tstamp && skb->mono_delivery_time))
-			skb->mono_delivery_time = 0;
+		if (unlikely(!skb->tstamp && skb->tstamp_type))
+			skb->tstamp_type = SKB_CLOCK_REAL;
 
 		if (prog->exts_integrated) {
 			res->class   = 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-18  0:43 [RFC PATCH bpf-next v4 0/2] Replace mono_delivery_time with tstamp_type Abhishek Chauhan
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty Abhishek Chauhan
@ 2024-04-18  0:43 ` Abhishek Chauhan
  2024-04-18 19:06   ` Willem de Bruijn
  1 sibling, 1 reply; 16+ messages in thread
From: Abhishek Chauhan @ 2024-04-18  0:43 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Willem de Bruijn,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel

tstamp_type is now set based on actual clockid_t compressed
into 2 bits.

To make the design scalable for future needs this commit bring in
the change to extend the tstamp_type:1 to tstamp_type:2 to support
other clockid_t timestamp.

We now support CLOCK_TAI as part of tstamp_type as part of this
commit with exisiting support CLOCK_MONOTONIC and CLOCK_REALTIME.

Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
---
Changes since v3
- Carefully reviewed BPF APIs and made changes in 
  BPF code as well. 
- Re-used actual clockid_t values since skbuff.h 
  indirectly includes uapi/linux/time.h
- Added CLOCK_TAI as part of the skb_set_delivery_time
  handling instead of CLOCK_USER
- Added default in switch for unsupported and invalid 
  timestamp with an WARN_ONCE.
- All of the above comments were given by Willem  
- Made changes in filter.c as per Martin's comments
  to handle invalid cases in bpf code with addition of
  SKB_TAI_DELIVERY_TIME_MASK

Changes since v2
- Minor changes to commit subject

Changes since v1 
- identified additional changes in BPF framework.
- Bit shift in SKB_MONO_DELIVERY_TIME_MASK and TC_AT_INGRESS_MASK.
- Made changes in skb_set_delivery_time to keep changes similar to 
  previous code for mono_delivery_time and just setting tstamp_type
  bit 1 for userspace timestamp.

 include/linux/skbuff.h                        | 21 +++++++---
 include/uapi/linux/bpf.h                      |  1 +
 net/core/filter.c                             | 40 ++++++++++++++++---
 net/ipv4/ip_output.c                          |  2 +-
 net/ipv4/raw.c                                |  2 +-
 net/ipv6/ip6_output.c                         |  2 +-
 net/ipv6/raw.c                                |  2 +-
 net/packet/af_packet.c                        |  7 ++--
 .../selftests/bpf/prog_tests/ctx_rewrite.c    | 13 ++++--
 9 files changed, 68 insertions(+), 22 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2e7811c7e4db..647522ef41cd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -703,14 +703,17 @@ typedef unsigned char *sk_buff_data_t;
 #endif
 
 /**
- * tstamp_type:1 can take 2 values each
+ * tstamp_type:2 can take 4 values each
  * represented by time base in skb
  * 0x0 => real timestamp_type
  * 0x1 => mono timestamp_type
+ * 0x2 => tai timestamp_type
+ * 0x3 => undefined timestamp_type
  */
 enum skb_tstamp_type {
 	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
 	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
+	SKB_CLOCK_TAI,	/* Time base in skb is TAI */
 };
 
 /**
@@ -833,7 +836,8 @@ enum skb_tstamp_type {
  *	@tstamp_type: When set, skb->tstamp has the
  *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
  *		skb->tstamp has the (rcv) timestamp at ingress and
- *		delivery_time at egress.
+ *		delivery_time at egress or skb->tstamp defined by skb->sk->sk_clockid
+ *		coming from userspace
  *	@napi_id: id of the NAPI struct this skb came from
  *	@sender_cpu: (aka @napi_id) source CPU in XPS
  *	@alloc_cpu: CPU which did the skb allocation.
@@ -961,7 +965,7 @@ struct sk_buff {
 	/* private: */
 	__u8			__mono_tc_offset[0];
 	/* public: */
-	__u8			tstamp_type:1;	/* See SKB_CLOCK_*_MASK */
+	__u8			tstamp_type:2;	/* See skb_tstamp_type enum */
 #ifdef CONFIG_NET_XGRESS
 	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
 	__u8			tc_skip_classify:1;
@@ -1096,10 +1100,12 @@ struct sk_buff {
  */
 #ifdef __BIG_ENDIAN_BITFIELD
 #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 7)
-#define TC_AT_INGRESS_MASK		(1 << 6)
+#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 6)
+#define TC_AT_INGRESS_MASK		(1 << 5)
 #else
 #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 0)
-#define TC_AT_INGRESS_MASK		(1 << 1)
+#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 1)
+#define TC_AT_INGRESS_MASK		(1 << 2)
 #endif
 #define SKB_BF_MONO_TC_OFFSET		offsetof(struct sk_buff, __mono_tc_offset)
 
@@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
 	case CLOCK_MONOTONIC:
 		skb->tstamp_type = SKB_CLOCK_MONO;
 		break;
+	case CLOCK_TAI:
+		skb->tstamp_type = SKB_CLOCK_TAI;
+		break;
+	default:
+		WARN_ONCE(true, "clockid %d not supported", tstamp_type);
 	}
 }
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index cee0a7915c08..1376ed5ece10 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6209,6 +6209,7 @@ union {					\
 enum {
 	BPF_SKB_TSTAMP_UNSPEC,
 	BPF_SKB_TSTAMP_DELIVERY_MONO,	/* tstamp has mono delivery time */
+	BPF_SKB_TSTAMP_DELIVERY_TAI,	/* tstamp has tai delivery time */
 	/* For any BPF_SKB_TSTAMP_* that the bpf prog cannot handle,
 	 * the bpf prog should handle it like BPF_SKB_TSTAMP_UNSPEC
 	 * and try to deduce it by ingress, egress or skb->sk->sk_clockid.
diff --git a/net/core/filter.c b/net/core/filter.c
index b1192e672cbb..de86c42b5859 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7711,6 +7711,12 @@ BPF_CALL_3(bpf_skb_set_tstamp, struct sk_buff *, skb,
 		skb->tstamp = tstamp;
 		skb->tstamp_type = SKB_CLOCK_MONO;
 		break;
+	case BPF_SKB_TSTAMP_DELIVERY_TAI:
+		if (!tstamp)
+			return -EINVAL;
+		skb->tstamp = tstamp;
+		skb->tstamp_type = SKB_CLOCK_TAI;
+		break;
 	case BPF_SKB_TSTAMP_UNSPEC:
 		if (tstamp)
 			return -EINVAL;
@@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
 	*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
 			      SKB_BF_MONO_TC_OFFSET);
 	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
-				SKB_MONO_DELIVERY_TIME_MASK, 2);
+				SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
+	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
+				SKB_MONO_DELIVERY_TIME_MASK, 3);
+	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
+				SKB_TAI_DELIVERY_TIME_MASK, 4);
 	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
 	*insn++ = BPF_JMP_A(1);
 	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
+	*insn++ = BPF_JMP_A(1);
+	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
 
 	return insn;
 }
@@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
 		__u8 tmp_reg = BPF_REG_AX;
 
 		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
+		/*check if all three bits are set*/
 		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
-					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
-		*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
-					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
+					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
+					SKB_TAI_DELIVERY_TIME_MASK);
+		/*if all 3 bits are set jump 3 instructions and clear the register */
+		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
+					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
+					SKB_TAI_DELIVERY_TIME_MASK, 4);
+		/*Now check Mono is set with ingress mask if so clear */
+		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
+					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
+		/*Now Check tai is set with ingress mask if so clear */
+		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
+					TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
+		/*Now Check tai and mono are set if so clear */
+		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
+					SKB_MONO_DELIVERY_TIME_MASK |
+					SKB_TAI_DELIVERY_TIME_MASK, 1);
+		/* goto <store> */
+		*insn++ = BPF_JMP_A(2);
 		/* skb->tc_at_ingress && skb->tstamp_type:1,
 		 * read 0 as the (rcv) timestamp.
 		 */
@@ -9456,9 +9484,11 @@ static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
 		/* Writing __sk_buff->tstamp as ingress, goto <clear> */
 		*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
 		/* goto <store> */
-		*insn++ = BPF_JMP_A(2);
+		*insn++ = BPF_JMP_A(3);
 		/* <clear>: mono_delivery_time or (skb->tstamp_type:1) */
 		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_MONO_DELIVERY_TIME_MASK);
+		/* <clear>: tai delivery_time or (skb->tstamp_type:2) */
+		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_TAI_DELIVERY_TIME_MASK);
 		*insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg, SKB_BF_MONO_TC_OFFSET);
 	}
 #endif
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index f29349151203..91e80a3d411f 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1457,7 +1457,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
 
 	skb->priority = (cork->tos != -1) ? cork->priority: READ_ONCE(sk->sk_priority);
 	skb->mark = cork->mark;
-	skb->tstamp = cork->transmit_time;
+	skb_set_delivery_time(skb, cork->transmit_time, sk->sk_clockid);
 	/*
 	 * Steal rt from cork.dst to avoid a pair of atomic_inc/atomic_dec
 	 * on dst refcount
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index dcb11f22cbf2..bbc46a40c8b6 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -360,7 +360,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
 	skb->protocol = htons(ETH_P_IP);
 	skb->priority = READ_ONCE(sk->sk_priority);
 	skb->mark = sockc->mark;
-	skb->tstamp = sockc->transmit_time;
+	skb_set_delivery_time(skb, sockc->transmit_time, sk->sk_clockid);
 	skb_dst_set(skb, &rt->dst);
 	*rtp = NULL;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a9e819115622..0b8193bdd98f 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1924,7 +1924,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 
 	skb->priority = READ_ONCE(sk->sk_priority);
 	skb->mark = cork->base.mark;
-	skb->tstamp = cork->base.transmit_time;
+	skb_set_delivery_time(skb, cork->base.transmit_time, sk->sk_clockid);
 
 	ip6_cork_steal_dst(skb, cork);
 	IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTREQUESTS);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 0d896ca7b589..625f3a917e50 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -621,7 +621,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
 	skb->protocol = htons(ETH_P_IPV6);
 	skb->priority = READ_ONCE(sk->sk_priority);
 	skb->mark = sockc->mark;
-	skb->tstamp = sockc->transmit_time;
+	skb_set_delivery_time(skb, sockc->transmit_time, sk->sk_clockid);
 
 	skb_put(skb, length);
 	skb_reset_network_header(skb);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8c6d3fbb4ed8..356c96f23370 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2056,8 +2056,7 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
 	skb->dev = dev;
 	skb->priority = READ_ONCE(sk->sk_priority);
 	skb->mark = READ_ONCE(sk->sk_mark);
-	skb->tstamp = sockc.transmit_time;
-
+	skb_set_delivery_time(skb, sockc.transmit_time, sk->sk_clockid);
 	skb_setup_tx_timestamp(skb, sockc.tsflags);
 
 	if (unlikely(extra_len == 4))
@@ -2585,7 +2584,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 	skb->dev = dev;
 	skb->priority = READ_ONCE(po->sk.sk_priority);
 	skb->mark = READ_ONCE(po->sk.sk_mark);
-	skb->tstamp = sockc->transmit_time;
+	skb_set_delivery_time(skb, sockc->transmit_time, po->sk.sk_clockid);
 	skb_setup_tx_timestamp(skb, sockc->tsflags);
 	skb_zcopy_set_nouarg(skb, ph.raw);
 
@@ -3063,7 +3062,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	skb->dev = dev;
 	skb->priority = READ_ONCE(sk->sk_priority);
 	skb->mark = sockc.mark;
-	skb->tstamp = sockc.transmit_time;
+	skb_set_delivery_time(skb, sockc.transmit_time, sk->sk_clockid);
 
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
diff --git a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
index 3b7c57fe55a5..9d0c33d7f52e 100644
--- a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
+++ b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
@@ -69,15 +69,20 @@ static struct test_case test_cases[] = {
 	{
 		N(SCHED_CLS, struct __sk_buff, tstamp),
 		.read  = "r11 = *(u8 *)($ctx + sk_buff::__mono_tc_offset);"
-			 "w11 &= 3;"
-			 "if w11 != 0x3 goto pc+2;"
+			 "w11 &= 7;"
+			 "if w11 == 0x7 goto pc+4;"
+			 "if w11 == 0x5 goto pc+3;"
+			 "if w11 == 0x6 goto pc+2;"
+			 "if w11 == 0x3 goto pc+1;"
+			 "goto pc+2"
 			 "$dst = 0;"
 			 "goto pc+1;"
 			 "$dst = *(u64 *)($ctx + sk_buff::tstamp);",
 		.write = "r11 = *(u8 *)($ctx + sk_buff::__mono_tc_offset);"
-			 "if w11 & 0x2 goto pc+1;"
-			 "goto pc+2;"
+			 "if w11 & 0x4 goto pc+1;"
+			 "goto pc+3;"
 			 "w11 &= -2;"
+			 "w11 &= -3;"
 			 "*(u8 *)($ctx + sk_buff::__mono_tc_offset) = r11;"
 			 "*(u64 *)($ctx + sk_buff::tstamp) = $src;",
 	},
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty Abhishek Chauhan
@ 2024-04-18 18:47   ` Willem de Bruijn
  2024-04-18 19:58     ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2024-04-18 18:47 UTC (permalink / raw)
  To: Abhishek Chauhan, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Willem de Bruijn, Martin KaFai Lau, Martin KaFai Lau,
	Daniel Borkmann, bpf
  Cc: kernel

Abhishek Chauhan wrote:
> mono_delivery_time was added to check if skb->tstamp has delivery
> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
> timestamp in ingress and delivery_time at egress.
> 
> Renaming the bitfield from mono_delivery_time to tstamp_type is for
> extensibilty for other timestamps such as userspace timestamp
> (i.e. SO_TXTIME) set via sock opts.
> 
> As we are renaming the mono_delivery_time to tstamp_type, it makes
> sense to start assigning tstamp_type based on enum defined
> in this commit.
> 
> Earlier we used bool arg flag to check if the tstamp is mono in
> function skb_set_delivery_time, Now the signature of the functions
> accepts tstamp_type to distinguish between mono and real time.
> 
> In future tstamp_type:1 can be extended to support userspace timestamp
> by increasing the bitfield.
> 
> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>

> +/**
> + * tstamp_type:1 can take 2 values each
> + * represented by time base in skb
> + * 0x0 => real timestamp_type
> + * 0x1 => mono timestamp_type
> + */
> +enum skb_tstamp_type {
> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
> +};
> +

Can drop the comments. These names are self documenting.

>  /**
>   * DOC: Basic sk_buff geometry
>   *
> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
>   *	@dst_pending_confirm: need to confirm neighbour
>   *	@decrypted: Decrypted SKB
>   *	@slow_gro: state present at GRO time, slower prepare step required
> - *	@mono_delivery_time: When set, skb->tstamp has the
> + *	@tstamp_type: When set, skb->tstamp has the
>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>   *		skb->tstamp has the (rcv) timestamp at ingress and
>   *		delivery_time at egress.

Is this still correct? I think all egress does now annotate correctly
as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.

> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 61119d42b0fd..a062f88c47c3 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>  	tp = tcp_sk(sk);
>  	prior_wstamp = tp->tcp_wstamp_ns;
>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);

Multiple references to CLOCK_MONOTONIC left




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-18  0:43 ` [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type Abhishek Chauhan
@ 2024-04-18 19:06   ` Willem de Bruijn
  2024-04-18 20:10     ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2024-04-18 19:06 UTC (permalink / raw)
  To: Abhishek Chauhan, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Willem de Bruijn, Martin KaFai Lau, Martin KaFai Lau,
	Daniel Borkmann, bpf
  Cc: kernel

Abhishek Chauhan wrote:
> tstamp_type is now set based on actual clockid_t compressed
> into 2 bits.
> 
> To make the design scalable for future needs this commit bring in
> the change to extend the tstamp_type:1 to tstamp_type:2 to support
> other clockid_t timestamp.
> 
> We now support CLOCK_TAI as part of tstamp_type as part of this
> commit with exisiting support CLOCK_MONOTONIC and CLOCK_REALTIME.
> 
> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
>  
>  /**
> - * tstamp_type:1 can take 2 values each
> + * tstamp_type:2 can take 4 values each
>   * represented by time base in skb
>   * 0x0 => real timestamp_type
>   * 0x1 => mono timestamp_type
> + * 0x2 => tai timestamp_type
> + * 0x3 => undefined timestamp_type

Same point as previous patch about comment that repeats name.

> @@ -833,7 +836,8 @@ enum skb_tstamp_type {
>   *	@tstamp_type: When set, skb->tstamp has the
>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>   *		skb->tstamp has the (rcv) timestamp at ingress and
> - *		delivery_time at egress.
> + *		delivery_time at egress or skb->tstamp defined by skb->sk->sk_clockid
> + *		coming from userspace

I would simplify the comment: clock base of skb->tstamp.
Already in the first patch.

>   *	@napi_id: id of the NAPI struct this skb came from
>   *	@sender_cpu: (aka @napi_id) source CPU in XPS
>   *	@alloc_cpu: CPU which did the skb allocation.
> @@ -961,7 +965,7 @@ struct sk_buff {
>  	/* private: */
>  	__u8			__mono_tc_offset[0];
>  	/* public: */
> -	__u8			tstamp_type:1;	/* See SKB_CLOCK_*_MASK */
> +	__u8			tstamp_type:2;	/* See skb_tstamp_type enum */

Probably good to call out that according to pahole this fills a hole.

>  #ifdef CONFIG_NET_XGRESS
>  	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
>  	__u8			tc_skip_classify:1;
> @@ -1096,10 +1100,12 @@ struct sk_buff {
>   */
>  #ifdef __BIG_ENDIAN_BITFIELD
>  #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 7)
> -#define TC_AT_INGRESS_MASK		(1 << 6)
> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 6)

SKB_TSTAMP_TYPE_BIT2_MASK?

> +#define TC_AT_INGRESS_MASK		(1 << 5)
>  #else
>  #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 0)
> -#define TC_AT_INGRESS_MASK		(1 << 1)
> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 1)
> +#define TC_AT_INGRESS_MASK		(1 << 2)
>  #endif
>  #define SKB_BF_MONO_TC_OFFSET		offsetof(struct sk_buff, __mono_tc_offset)
>  
> @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>  	case CLOCK_MONOTONIC:
>  		skb->tstamp_type = SKB_CLOCK_MONO;
>  		break;
> +	case CLOCK_TAI:
> +		skb->tstamp_type = SKB_CLOCK_TAI;
> +		break;
> +	default:
> +		WARN_ONCE(true, "clockid %d not supported", tstamp_type);

and set to 0 and default tstamp_type?

>  	}
>  }

>  >
 @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>  	*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>  			      SKB_BF_MONO_TC_OFFSET);
>  	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
> -				SKB_MONO_DELIVERY_TIME_MASK, 2);
> +				SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
> +				SKB_MONO_DELIVERY_TIME_MASK, 3);
> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
> +				SKB_TAI_DELIVERY_TIME_MASK, 4);
>  	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>  	*insn++ = BPF_JMP_A(1);
>  	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
> +	*insn++ = BPF_JMP_A(1);
> +	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
>  
>  	return insn;
>  }
> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>  		__u8 tmp_reg = BPF_REG_AX;
>  
>  		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
> +		/*check if all three bits are set*/
>  		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
> -		*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
> +					SKB_TAI_DELIVERY_TIME_MASK);
> +		/*if all 3 bits are set jump 3 instructions and clear the register */
> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
> +					SKB_TAI_DELIVERY_TIME_MASK, 4);
> +		/*Now check Mono is set with ingress mask if so clear */
> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
> +		/*Now Check tai is set with ingress mask if so clear */
> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
> +					TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
> +		/*Now Check tai and mono are set if so clear */
> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
> +					SKB_MONO_DELIVERY_TIME_MASK |
> +					SKB_TAI_DELIVERY_TIME_MASK, 1);

This looks as if all JEQ result in "if so clear"?

Is the goal to only do something different for the two bits being 0x1,
can we have a single test with a two-bit mask, rather than four tests?



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18 18:47   ` Willem de Bruijn
@ 2024-04-18 19:58     ` Abhishek Chauhan (ABC)
  2024-04-18 20:11       ` Willem de Bruijn
  0 siblings, 1 reply; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-18 19:58 UTC (permalink / raw)
  To: Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel



On 4/18/2024 11:47 AM, Willem de Bruijn wrote:
> Abhishek Chauhan wrote:
>> mono_delivery_time was added to check if skb->tstamp has delivery
>> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
>> timestamp in ingress and delivery_time at egress.
>>
>> Renaming the bitfield from mono_delivery_time to tstamp_type is for
>> extensibilty for other timestamps such as userspace timestamp
>> (i.e. SO_TXTIME) set via sock opts.
>>
>> As we are renaming the mono_delivery_time to tstamp_type, it makes
>> sense to start assigning tstamp_type based on enum defined
>> in this commit.
>>
>> Earlier we used bool arg flag to check if the tstamp is mono in
>> function skb_set_delivery_time, Now the signature of the functions
>> accepts tstamp_type to distinguish between mono and real time.
>>
>> In future tstamp_type:1 can be extended to support userspace timestamp
>> by increasing the bitfield.
>>
>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
>> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
> 
>> +/**
>> + * tstamp_type:1 can take 2 values each
>> + * represented by time base in skb
>> + * 0x0 => real timestamp_type
>> + * 0x1 => mono timestamp_type
>> + */
>> +enum skb_tstamp_type {
>> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
>> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
>> +};
>> +
> 
> Can drop the comments. These names are self documenting.

Noted! . I will take care of this
> 
>>  /**
>>   * DOC: Basic sk_buff geometry
>>   *
>> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
>>   *	@dst_pending_confirm: need to confirm neighbour
>>   *	@decrypted: Decrypted SKB
>>   *	@slow_gro: state present at GRO time, slower prepare step required
>> - *	@mono_delivery_time: When set, skb->tstamp has the
>> + *	@tstamp_type: When set, skb->tstamp has the
>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>>   *		skb->tstamp has the (rcv) timestamp at ingress and
>>   *		delivery_time at egress.
> 
> Is this still correct? I think all egress does now annotate correctly
> as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.
> 
That is correct. 

>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index 61119d42b0fd..a062f88c47c3 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>>  	tp = tcp_sk(sk);
>>  	prior_wstamp = tp->tcp_wstamp_ns;
>>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
>> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
>> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
> 
> Multiple references to CLOCK_MONOTONIC left
> 
I think i took care of all the references. Apologies if i didn't understand your comment here. 


> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-18 19:06   ` Willem de Bruijn
@ 2024-04-18 20:10     ` Abhishek Chauhan (ABC)
  2024-04-18 21:57       ` Martin KaFai Lau
  0 siblings, 1 reply; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-18 20:10 UTC (permalink / raw)
  To: Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel



On 4/18/2024 12:06 PM, Willem de Bruijn wrote:
> Abhishek Chauhan wrote:
>> tstamp_type is now set based on actual clockid_t compressed
>> into 2 bits.
>>
>> To make the design scalable for future needs this commit bring in
>> the change to extend the tstamp_type:1 to tstamp_type:2 to support
>> other clockid_t timestamp.
>>
>> We now support CLOCK_TAI as part of tstamp_type as part of this
>> commit with exisiting support CLOCK_MONOTONIC and CLOCK_REALTIME.
>>
>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
>> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
>>  
>>  /**
>> - * tstamp_type:1 can take 2 values each
>> + * tstamp_type:2 can take 4 values each
>>   * represented by time base in skb
>>   * 0x0 => real timestamp_type
>>   * 0x1 => mono timestamp_type
>> + * 0x2 => tai timestamp_type
>> + * 0x3 => undefined timestamp_type
> 
> Same point as previous patch about comment that repeats name.
> 
Will take care, Noted!
>> @@ -833,7 +836,8 @@ enum skb_tstamp_type {
>>   *	@tstamp_type: When set, skb->tstamp has the
>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>>   *		skb->tstamp has the (rcv) timestamp at ingress and
>> - *		delivery_time at egress.
>> + *		delivery_time at egress or skb->tstamp defined by skb->sk->sk_clockid
>> + *		coming from userspace
> 
> I would simplify the comment: clock base of skb->tstamp.
> Already in the first patch.
> 
Will take care, Noted!
>>   *	@napi_id: id of the NAPI struct this skb came from
>>   *	@sender_cpu: (aka @napi_id) source CPU in XPS
>>   *	@alloc_cpu: CPU which did the skb allocation.
>> @@ -961,7 +965,7 @@ struct sk_buff {
>>  	/* private: */
>>  	__u8			__mono_tc_offset[0];
>>  	/* public: */
>> -	__u8			tstamp_type:1;	/* See SKB_CLOCK_*_MASK */
>> +	__u8			tstamp_type:2;	/* See skb_tstamp_type enum */
> 
> Probably good to call out that according to pahole this fills a hole.
> 
I will do that . 
>>  #ifdef CONFIG_NET_XGRESS
>>  	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
>>  	__u8			tc_skip_classify:1;
>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>   */
>>  #ifdef __BIG_ENDIAN_BITFIELD
>>  #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 7)
>> -#define TC_AT_INGRESS_MASK		(1 << 6)
>> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 6)
> 
> SKB_TSTAMP_TYPE_BIT2_MASK?
> 
I was thinking to keep it as TAI because it will confuse developers. I hope thats okay. 
>> +#define TC_AT_INGRESS_MASK		(1 << 5)
>>  #else
>>  #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 0)
>> -#define TC_AT_INGRESS_MASK		(1 << 1)
>> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 1)
>> +#define TC_AT_INGRESS_MASK		(1 << 2)
>>  #endif
>>  #define SKB_BF_MONO_TC_OFFSET		offsetof(struct sk_buff, __mono_tc_offset)
>>  
>> @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>  	case CLOCK_MONOTONIC:
>>  		skb->tstamp_type = SKB_CLOCK_MONO;
>>  		break;
>> +	case CLOCK_TAI:
>> +		skb->tstamp_type = SKB_CLOCK_TAI;
>> +		break;
>> +	default:
>> +		WARN_ONCE(true, "clockid %d not supported", tstamp_type);
> 
> and set to 0 and default tstamp_type?
> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this. 
>>  	}
>>  }
> 
>>  >
>  @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>  	*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>  			      SKB_BF_MONO_TC_OFFSET);
>>  	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>> -				SKB_MONO_DELIVERY_TIME_MASK, 2);
>> +				SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>> +				SKB_MONO_DELIVERY_TIME_MASK, 3);
>> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>> +				SKB_TAI_DELIVERY_TIME_MASK, 4);
>>  	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>  	*insn++ = BPF_JMP_A(1);
>>  	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>> +	*insn++ = BPF_JMP_A(1);
>> +	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
>>  
>>  	return insn;
>>  }
>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>  		__u8 tmp_reg = BPF_REG_AX;
>>  
>>  		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>> +		/*check if all three bits are set*/
>>  		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>> -		*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>> +					SKB_TAI_DELIVERY_TIME_MASK);
>> +		/*if all 3 bits are set jump 3 instructions and clear the register */
>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>> +					SKB_TAI_DELIVERY_TIME_MASK, 4);
>> +		/*Now check Mono is set with ingress mask if so clear */
>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>> +		/*Now Check tai is set with ingress mask if so clear */
>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>> +					TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>> +		/*Now Check tai and mono are set if so clear */
>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>> +					SKB_MONO_DELIVERY_TIME_MASK |
>> +					SKB_TAI_DELIVERY_TIME_MASK, 1);
> 
> This looks as if all JEQ result in "if so clear"?
> 
> Is the goal to only do something different for the two bits being 0x1,
> can we have a single test with a two-bit mask, rather than four tests?
> 
I think Martin wanted to take care of TAI as well. I will wait for his comment here 

My Goal was to take care of invalid combos which does not hold valid
1. If all 3 bits are set => invalid combo (Test case written is Insane)
2. If 2 bits are set (tai+mono)(Test case written is Insane) => this cannot happen (because clock base can only be one in skb)
3. If 2 bit are set (ingress + tai/mono) => This is existing logic + tai being added (clear tstamp in ingress)
4. For all other cases go ahead and fill in the tstamp in the dest register. 

> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18 19:58     ` Abhishek Chauhan (ABC)
@ 2024-04-18 20:11       ` Willem de Bruijn
  2024-04-18 20:38         ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2024-04-18 20:11 UTC (permalink / raw)
  To: Abhishek Chauhan (ABC),
	Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel

Abhishek Chauhan (ABC) wrote:
> 
> 
> On 4/18/2024 11:47 AM, Willem de Bruijn wrote:
> > Abhishek Chauhan wrote:
> >> mono_delivery_time was added to check if skb->tstamp has delivery
> >> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
> >> timestamp in ingress and delivery_time at egress.
> >>
> >> Renaming the bitfield from mono_delivery_time to tstamp_type is for
> >> extensibilty for other timestamps such as userspace timestamp
> >> (i.e. SO_TXTIME) set via sock opts.
> >>
> >> As we are renaming the mono_delivery_time to tstamp_type, it makes
> >> sense to start assigning tstamp_type based on enum defined
> >> in this commit.
> >>
> >> Earlier we used bool arg flag to check if the tstamp is mono in
> >> function skb_set_delivery_time, Now the signature of the functions
> >> accepts tstamp_type to distinguish between mono and real time.
> >>
> >> In future tstamp_type:1 can be extended to support userspace timestamp
> >> by increasing the bitfield.
> >>
> >> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
> >> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
> > 
> >> +/**
> >> + * tstamp_type:1 can take 2 values each
> >> + * represented by time base in skb
> >> + * 0x0 => real timestamp_type
> >> + * 0x1 => mono timestamp_type
> >> + */
> >> +enum skb_tstamp_type {
> >> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
> >> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
> >> +};
> >> +
> > 
> > Can drop the comments. These names are self documenting.
> 
> Noted! . I will take care of this
> > 
> >>  /**
> >>   * DOC: Basic sk_buff geometry
> >>   *
> >> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
> >>   *	@dst_pending_confirm: need to confirm neighbour
> >>   *	@decrypted: Decrypted SKB
> >>   *	@slow_gro: state present at GRO time, slower prepare step required
> >> - *	@mono_delivery_time: When set, skb->tstamp has the
> >> + *	@tstamp_type: When set, skb->tstamp has the
> >>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
> >>   *		skb->tstamp has the (rcv) timestamp at ingress and
> >>   *		delivery_time at egress.
> > 
> > Is this still correct? I think all egress does now annotate correctly
> > as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.
> > 
> That is correct. 
> 
> >> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> >> index 61119d42b0fd..a062f88c47c3 100644
> >> --- a/net/ipv4/tcp_output.c
> >> +++ b/net/ipv4/tcp_output.c
> >> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
> >>  	tp = tcp_sk(sk);
> >>  	prior_wstamp = tp->tcp_wstamp_ns;
> >>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
> >> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
> >> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
> > 
> > Multiple references to CLOCK_MONOTONIC left
> > 
> I think i took care of all the references. Apologies if i didn't understand your comment here. 

On closer read, there is a type issue here.

skb_set_delivery_time takes a u8 tstamp_type. But it is often passed
a clockid_t, and that is also what the switch expects.

But it does also get called with a tstamp_type in code like the
following:

+       u8 tstamp_type = skb->tstamp_type;
        unsigned int hlen, ll_rs, mtu;
        ktime_t tstamp = skb->tstamp;
        struct ip_frag_state state;
@@ -82,7 +82,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
                        if (iter.frag)
                                ip_fraglist_prepare(skb, &iter);
  
-                       skb_set_delivery_time(skb, tstamp, mono_delivery_time);
+                       skb_set_delivery_time(skb, tstamp, tstamp_type);

So maybe we need two variants, one that takes a tstamp_type and one
that tames a clockid_t?

The first can be simple, not switch needed. Just apply the two stores.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18 20:11       ` Willem de Bruijn
@ 2024-04-18 20:38         ` Abhishek Chauhan (ABC)
  2024-04-18 20:49           ` Willem de Bruijn
  0 siblings, 1 reply; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-18 20:38 UTC (permalink / raw)
  To: Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel



On 4/18/2024 1:11 PM, Willem de Bruijn wrote:
> Abhishek Chauhan (ABC) wrote:
>>
>>
>> On 4/18/2024 11:47 AM, Willem de Bruijn wrote:
>>> Abhishek Chauhan wrote:
>>>> mono_delivery_time was added to check if skb->tstamp has delivery
>>>> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
>>>> timestamp in ingress and delivery_time at egress.
>>>>
>>>> Renaming the bitfield from mono_delivery_time to tstamp_type is for
>>>> extensibilty for other timestamps such as userspace timestamp
>>>> (i.e. SO_TXTIME) set via sock opts.
>>>>
>>>> As we are renaming the mono_delivery_time to tstamp_type, it makes
>>>> sense to start assigning tstamp_type based on enum defined
>>>> in this commit.
>>>>
>>>> Earlier we used bool arg flag to check if the tstamp is mono in
>>>> function skb_set_delivery_time, Now the signature of the functions
>>>> accepts tstamp_type to distinguish between mono and real time.
>>>>
>>>> In future tstamp_type:1 can be extended to support userspace timestamp
>>>> by increasing the bitfield.
>>>>
>>>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
>>>> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
>>>
>>>> +/**
>>>> + * tstamp_type:1 can take 2 values each
>>>> + * represented by time base in skb
>>>> + * 0x0 => real timestamp_type
>>>> + * 0x1 => mono timestamp_type
>>>> + */
>>>> +enum skb_tstamp_type {
>>>> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
>>>> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
>>>> +};
>>>> +
>>>
>>> Can drop the comments. These names are self documenting.
>>
>> Noted! . I will take care of this
>>>
>>>>  /**
>>>>   * DOC: Basic sk_buff geometry
>>>>   *
>>>> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
>>>>   *	@dst_pending_confirm: need to confirm neighbour
>>>>   *	@decrypted: Decrypted SKB
>>>>   *	@slow_gro: state present at GRO time, slower prepare step required
>>>> - *	@mono_delivery_time: When set, skb->tstamp has the
>>>> + *	@tstamp_type: When set, skb->tstamp has the
>>>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>>>>   *		skb->tstamp has the (rcv) timestamp at ingress and
>>>>   *		delivery_time at egress.
>>>
>>> Is this still correct? I think all egress does now annotate correctly
>>> as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.
>>>
>> That is correct. 
>>
>>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>>>> index 61119d42b0fd..a062f88c47c3 100644
>>>> --- a/net/ipv4/tcp_output.c
>>>> +++ b/net/ipv4/tcp_output.c
>>>> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>>>>  	tp = tcp_sk(sk);
>>>>  	prior_wstamp = tp->tcp_wstamp_ns;
>>>>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
>>>> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
>>>> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
>>>
>>> Multiple references to CLOCK_MONOTONIC left
>>>
>> I think i took care of all the references. Apologies if i didn't understand your comment here. 
> 
> On closer read, there is a type issue here.
> 
> skb_set_delivery_time takes a u8 tstamp_type. But it is often passed
> a clockid_t, and that is also what the switch expects.
> 
> But it does also get called with a tstamp_type in code like the
> following:
> 
> +       u8 tstamp_type = skb->tstamp_type;
>         unsigned int hlen, ll_rs, mtu;
>         ktime_t tstamp = skb->tstamp;
>         struct ip_frag_state state;
> @@ -82,7 +82,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
>                         if (iter.frag)
>                                 ip_fraglist_prepare(skb, &iter);
>   
> -                       skb_set_delivery_time(skb, tstamp, mono_delivery_time);
> +                       skb_set_delivery_time(skb, tstamp, tstamp_type);
> 
> So maybe we need two variants, one that takes a tstamp_type and one
> that tames a clockid_t?
> 
> The first can be simple, not switch needed. Just apply the two stores.
I agree to what you are saying but clockid_t => points to int itself. 

For example :- 
		void qdisc_watchdog_init_clockid(struct qdisc_watchdog *wd, struct Qdisc *qdisc,
				 clockid_t clockid)

		qdisc_watchdog_init_clockid(wd, qdisc, CLOCK_MONOTONIC); => sch_api.c
	       qdisc_watchdog_init_clockid(&q->watchdog, sch, q->clockid); =>sch_etf.c (q->clockid is int)
	
But i can change it to two new APIs one which accepts only clock_id (with switch) and other accepts u8 to directly store whatever is given. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18 20:38         ` Abhishek Chauhan (ABC)
@ 2024-04-18 20:49           ` Willem de Bruijn
  2024-04-18 20:51             ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2024-04-18 20:49 UTC (permalink / raw)
  To: Abhishek Chauhan (ABC),
	Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel

Abhishek Chauhan (ABC) wrote:
> 
> 
> On 4/18/2024 1:11 PM, Willem de Bruijn wrote:
> > Abhishek Chauhan (ABC) wrote:
> >>
> >>
> >> On 4/18/2024 11:47 AM, Willem de Bruijn wrote:
> >>> Abhishek Chauhan wrote:
> >>>> mono_delivery_time was added to check if skb->tstamp has delivery
> >>>> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
> >>>> timestamp in ingress and delivery_time at egress.
> >>>>
> >>>> Renaming the bitfield from mono_delivery_time to tstamp_type is for
> >>>> extensibilty for other timestamps such as userspace timestamp
> >>>> (i.e. SO_TXTIME) set via sock opts.
> >>>>
> >>>> As we are renaming the mono_delivery_time to tstamp_type, it makes
> >>>> sense to start assigning tstamp_type based on enum defined
> >>>> in this commit.
> >>>>
> >>>> Earlier we used bool arg flag to check if the tstamp is mono in
> >>>> function skb_set_delivery_time, Now the signature of the functions
> >>>> accepts tstamp_type to distinguish between mono and real time.
> >>>>
> >>>> In future tstamp_type:1 can be extended to support userspace timestamp
> >>>> by increasing the bitfield.
> >>>>
> >>>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
> >>>> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
> >>>
> >>>> +/**
> >>>> + * tstamp_type:1 can take 2 values each
> >>>> + * represented by time base in skb
> >>>> + * 0x0 => real timestamp_type
> >>>> + * 0x1 => mono timestamp_type
> >>>> + */
> >>>> +enum skb_tstamp_type {
> >>>> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
> >>>> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
> >>>> +};
> >>>> +
> >>>
> >>> Can drop the comments. These names are self documenting.
> >>
> >> Noted! . I will take care of this
> >>>
> >>>>  /**
> >>>>   * DOC: Basic sk_buff geometry
> >>>>   *
> >>>> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
> >>>>   *	@dst_pending_confirm: need to confirm neighbour
> >>>>   *	@decrypted: Decrypted SKB
> >>>>   *	@slow_gro: state present at GRO time, slower prepare step required
> >>>> - *	@mono_delivery_time: When set, skb->tstamp has the
> >>>> + *	@tstamp_type: When set, skb->tstamp has the
> >>>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
> >>>>   *		skb->tstamp has the (rcv) timestamp at ingress and
> >>>>   *		delivery_time at egress.
> >>>
> >>> Is this still correct? I think all egress does now annotate correctly
> >>> as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.
> >>>
> >> That is correct. 
> >>
> >>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> >>>> index 61119d42b0fd..a062f88c47c3 100644
> >>>> --- a/net/ipv4/tcp_output.c
> >>>> +++ b/net/ipv4/tcp_output.c
> >>>> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
> >>>>  	tp = tcp_sk(sk);
> >>>>  	prior_wstamp = tp->tcp_wstamp_ns;
> >>>>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
> >>>> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
> >>>> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
> >>>
> >>> Multiple references to CLOCK_MONOTONIC left
> >>>
> >> I think i took care of all the references. Apologies if i didn't understand your comment here. 
> > 
> > On closer read, there is a type issue here.
> > 
> > skb_set_delivery_time takes a u8 tstamp_type. But it is often passed
> > a clockid_t, and that is also what the switch expects.
> > 
> > But it does also get called with a tstamp_type in code like the
> > following:
> > 
> > +       u8 tstamp_type = skb->tstamp_type;
> >         unsigned int hlen, ll_rs, mtu;
> >         ktime_t tstamp = skb->tstamp;
> >         struct ip_frag_state state;
> > @@ -82,7 +82,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
> >                         if (iter.frag)
> >                                 ip_fraglist_prepare(skb, &iter);
> >   
> > -                       skb_set_delivery_time(skb, tstamp, mono_delivery_time);
> > +                       skb_set_delivery_time(skb, tstamp, tstamp_type);
> > 
> > So maybe we need two variants, one that takes a tstamp_type and one
> > that tames a clockid_t?
> > 
> > The first can be simple, not switch needed. Just apply the two stores.
> I agree to what you are saying but clockid_t => points to int itself. 
> 
> For example :- 
> 		void qdisc_watchdog_init_clockid(struct qdisc_watchdog *wd, struct Qdisc *qdisc,
> 				 clockid_t clockid)
> 
> 		qdisc_watchdog_init_clockid(wd, qdisc, CLOCK_MONOTONIC); => sch_api.c
> 	       qdisc_watchdog_init_clockid(&q->watchdog, sch, q->clockid); =>sch_etf.c (q->clockid is int)

My concern is more that we use CLOCK_MONOTONIC and SKB_CLOCK_MONO
(and other clocks) interchangeably, without invariant checks to make
sure that they map onto the same integer value.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty
  2024-04-18 20:49           ` Willem de Bruijn
@ 2024-04-18 20:51             ` Abhishek Chauhan (ABC)
  0 siblings, 0 replies; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-18 20:51 UTC (permalink / raw)
  To: Willem de Bruijn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel, Andrew Halaney,
	Martin KaFai Lau, Martin KaFai Lau, Daniel Borkmann, bpf
  Cc: kernel



On 4/18/2024 1:49 PM, Willem de Bruijn wrote:
> Abhishek Chauhan (ABC) wrote:
>>
>>
>> On 4/18/2024 1:11 PM, Willem de Bruijn wrote:
>>> Abhishek Chauhan (ABC) wrote:
>>>>
>>>>
>>>> On 4/18/2024 11:47 AM, Willem de Bruijn wrote:
>>>>> Abhishek Chauhan wrote:
>>>>>> mono_delivery_time was added to check if skb->tstamp has delivery
>>>>>> time in mono clock base (i.e. EDT) otherwise skb->tstamp has
>>>>>> timestamp in ingress and delivery_time at egress.
>>>>>>
>>>>>> Renaming the bitfield from mono_delivery_time to tstamp_type is for
>>>>>> extensibilty for other timestamps such as userspace timestamp
>>>>>> (i.e. SO_TXTIME) set via sock opts.
>>>>>>
>>>>>> As we are renaming the mono_delivery_time to tstamp_type, it makes
>>>>>> sense to start assigning tstamp_type based on enum defined
>>>>>> in this commit.
>>>>>>
>>>>>> Earlier we used bool arg flag to check if the tstamp is mono in
>>>>>> function skb_set_delivery_time, Now the signature of the functions
>>>>>> accepts tstamp_type to distinguish between mono and real time.
>>>>>>
>>>>>> In future tstamp_type:1 can be extended to support userspace timestamp
>>>>>> by increasing the bitfield.
>>>>>>
>>>>>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
>>>>>> Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
>>>>>
>>>>>> +/**
>>>>>> + * tstamp_type:1 can take 2 values each
>>>>>> + * represented by time base in skb
>>>>>> + * 0x0 => real timestamp_type
>>>>>> + * 0x1 => mono timestamp_type
>>>>>> + */
>>>>>> +enum skb_tstamp_type {
>>>>>> +	SKB_CLOCK_REAL,	/* Time base is skb is REALTIME */
>>>>>> +	SKB_CLOCK_MONO,	/* Time base is skb is MONOTONIC */
>>>>>> +};
>>>>>> +
>>>>>
>>>>> Can drop the comments. These names are self documenting.
>>>>
>>>> Noted! . I will take care of this
>>>>>
>>>>>>  /**
>>>>>>   * DOC: Basic sk_buff geometry
>>>>>>   *
>>>>>> @@ -819,7 +830,7 @@ typedef unsigned char *sk_buff_data_t;
>>>>>>   *	@dst_pending_confirm: need to confirm neighbour
>>>>>>   *	@decrypted: Decrypted SKB
>>>>>>   *	@slow_gro: state present at GRO time, slower prepare step required
>>>>>> - *	@mono_delivery_time: When set, skb->tstamp has the
>>>>>> + *	@tstamp_type: When set, skb->tstamp has the
>>>>>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>>>>>>   *		skb->tstamp has the (rcv) timestamp at ingress and
>>>>>>   *		delivery_time at egress.
>>>>>
>>>>> Is this still correct? I think all egress does now annotate correctly
>>>>> as SKB_CLOCK_MONO. So when not set it always is SKB_CLOCK_REAL.
>>>>>
>>>> That is correct. 
>>>>
>>>>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>>>>>> index 61119d42b0fd..a062f88c47c3 100644
>>>>>> --- a/net/ipv4/tcp_output.c
>>>>>> +++ b/net/ipv4/tcp_output.c
>>>>>> @@ -1300,7 +1300,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>>>>>>  	tp = tcp_sk(sk);
>>>>>>  	prior_wstamp = tp->tcp_wstamp_ns;
>>>>>>  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
>>>>>> -	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, true);
>>>>>> +	skb_set_delivery_time(skb, tp->tcp_wstamp_ns, CLOCK_MONOTONIC);
>>>>>
>>>>> Multiple references to CLOCK_MONOTONIC left
>>>>>
>>>> I think i took care of all the references. Apologies if i didn't understand your comment here. 
>>>
>>> On closer read, there is a type issue here.
>>>
>>> skb_set_delivery_time takes a u8 tstamp_type. But it is often passed
>>> a clockid_t, and that is also what the switch expects.
>>>
>>> But it does also get called with a tstamp_type in code like the
>>> following:
>>>
>>> +       u8 tstamp_type = skb->tstamp_type;
>>>         unsigned int hlen, ll_rs, mtu;
>>>         ktime_t tstamp = skb->tstamp;
>>>         struct ip_frag_state state;
>>> @@ -82,7 +82,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
>>>                         if (iter.frag)
>>>                                 ip_fraglist_prepare(skb, &iter);
>>>   
>>> -                       skb_set_delivery_time(skb, tstamp, mono_delivery_time);
>>> +                       skb_set_delivery_time(skb, tstamp, tstamp_type);
>>>
>>> So maybe we need two variants, one that takes a tstamp_type and one
>>> that tames a clockid_t?
>>>
>>> The first can be simple, not switch needed. Just apply the two stores.
>> I agree to what you are saying but clockid_t => points to int itself. 
>>
>> For example :- 
>> 		void qdisc_watchdog_init_clockid(struct qdisc_watchdog *wd, struct Qdisc *qdisc,
>> 				 clockid_t clockid)
>>
>> 		qdisc_watchdog_init_clockid(wd, qdisc, CLOCK_MONOTONIC); => sch_api.c
>> 	       qdisc_watchdog_init_clockid(&q->watchdog, sch, q->clockid); =>sch_etf.c (q->clockid is int)
> 
> My concern is more that we use CLOCK_MONOTONIC and SKB_CLOCK_MONO
> (and other clocks) interchangeably, without invariant checks to make
> sure that they map onto the same integer value.
Ah i see. I got it . I will make two APIs . Makes sense. 
1. One can check for clockid => switch => set 
2. One can set it directly. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-18 20:10     ` Abhishek Chauhan (ABC)
@ 2024-04-18 21:57       ` Martin KaFai Lau
  2024-04-19  0:30         ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Martin KaFai Lau @ 2024-04-18 21:57 UTC (permalink / raw)
  To: Abhishek Chauhan (ABC), Willem de Bruijn
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Martin KaFai Lau,
	Daniel Borkmann, bpf, kernel

On 4/18/24 1:10 PM, Abhishek Chauhan (ABC) wrote:
>>>   #ifdef CONFIG_NET_XGRESS
>>>   	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
>>>   	__u8			tc_skip_classify:1;
>>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>>    */
>>>   #ifdef __BIG_ENDIAN_BITFIELD
>>>   #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 7)
>>> -#define TC_AT_INGRESS_MASK		(1 << 6)
>>> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 6)
>>
>> SKB_TSTAMP_TYPE_BIT2_MASK?

nit. Shorten it to just SKB_TSTAMP_TYPE_MASK?

#ifdef __BIG_ENDIAN_BITFIELD
#define SKB_TSTAMP_TYPE_MASK	(3 << 6)
#define SKB_TSTAMP_TYPE_RSH	(6)	/* more on this later */
#else
#define SKB_TSTAMP_TYPE_MASK	(3)
#endif

>>
> I was thinking to keep it as TAI because it will confuse developers. I hope thats okay.

I think it is not very useful to distinguish each bit since it is an enum value 
now. It becomes more like the "pkt_type:3" and its PKT_TYPE_MAX.

>>> +#define TC_AT_INGRESS_MASK		(1 << 5)
>>>   #else
>>>   #define SKB_MONO_DELIVERY_TIME_MASK	(1 << 0)
>>> -#define TC_AT_INGRESS_MASK		(1 << 1)
>>> +#define SKB_TAI_DELIVERY_TIME_MASK	(1 << 1)
>>> +#define TC_AT_INGRESS_MASK		(1 << 2)
>>>   #endif
>>>   #define SKB_BF_MONO_TC_OFFSET		offsetof(struct sk_buff, __mono_tc_offset)
>>>   
>>> @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>>   	case CLOCK_MONOTONIC:
>>>   		skb->tstamp_type = SKB_CLOCK_MONO;
>>>   		break;
>>> +	case CLOCK_TAI:
>>> +		skb->tstamp_type = SKB_CLOCK_TAI;
>>> +		break;
>>> +	default:
>>> +		WARN_ONCE(true, "clockid %d not supported", tstamp_type);
>>
>> and set to 0 and default tstamp_type?
>> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this.
>>>   	}
>>>   }
>>
>>>   >
>>   @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>   	*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>>   			      SKB_BF_MONO_TC_OFFSET);
>>>   	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>> -				SKB_MONO_DELIVERY_TIME_MASK, 2);
>>> +				SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>> +				SKB_MONO_DELIVERY_TIME_MASK, 3);
>>> +	*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>> +				SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>   	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>>   	*insn++ = BPF_JMP_A(1);
>>>   	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>>> +	*insn++ = BPF_JMP_A(1);
>>> +	*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);

With SKB_TSTAMP_TYPE_MASK defined like above, this could be simplified like this 
(untested):

static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
                                                      struct bpf_insn *insn)
{
	__u8 value_reg = si->dst_reg;
	__u8 skb_reg = si->src_reg;

	BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
	*insn++ = BPF_LDX_MEM(BPF_B, value_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
	*insn++ = BPF_ALU32_IMM(BPF_AND, value_reg, SKB_TSTAMP_TYPE_MASK);
#ifdef __BIG_ENDIAN_BITFIELD
	*insn++ = BPF_ALU32_IMM(BPF_RSH, value_reg, SKB_TSTAMP_TYPE_RSH);
#else
	BUILD_BUG_ON(!(SKB_TSTAMP_TYPE_MASK & 0x1));
#endif

	return insn;
}

>>>   
>>>   	return insn;
>>>   }
>>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>>   		__u8 tmp_reg = BPF_REG_AX;
>>>   
>>>   		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>> +		/*check if all three bits are set*/
>>>   		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>>> -		*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>>> -					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>> +					SKB_TAI_DELIVERY_TIME_MASK);
>>> +		/*if all 3 bits are set jump 3 instructions and clear the register */
>>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>> +					SKB_TAI_DELIVERY_TIME_MASK, 4);
>>> +		/*Now check Mono is set with ingress mask if so clear */
>>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>> +					TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>>> +		/*Now Check tai is set with ingress mask if so clear */
>>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>> +					TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>> +		/*Now Check tai and mono are set if so clear */
>>> +		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>> +					SKB_MONO_DELIVERY_TIME_MASK |
>>> +					SKB_TAI_DELIVERY_TIME_MASK, 1);

Same as the bpf_convert_tstamp_type_read, this could be simplified with 
SKB_TSTAMP_TYPE_MASK.

>>
>> This looks as if all JEQ result in "if so clear"?
>>
>> Is the goal to only do something different for the two bits being 0x1,
>> can we have a single test with a two-bit mask, rather than four tests?
>>
> I think Martin wanted to take care of TAI as well. I will wait for his comment here
> 
> My Goal was to take care of invalid combos which does not hold valid
> 1. If all 3 bits are set => invalid combo (Test case written is Insane)
> 2. If 2 bits are set (tai+mono)(Test case written is Insane) => this cannot happen (because clock base can only be one in skb)
> 3. If 2 bit are set (ingress + tai/mono) => This is existing logic + tai being added (clear tstamp in ingress)
> 4. For all other cases go ahead and fill in the tstamp in the dest register.

If it is to ensure no new type is added without adding 
BPF_SKB_TSTAMP_DELIVERY_XYZ, I would simplify this runtime bpf insns here and 
use a BUILD_BUG_ON to catch it at compile time. Something like,

enum skb_tstamp_type {
         SKB_CLOCK_REAL, /* Time base is skb is REALTIME */
         SKB_CLOCK_MONO, /* Time base is skb is MONOTONIC */
  	SKB_CLOCK_TAI,  /* Time base in skb is TAI */
         __SKB_CLOCK_MAX = SKB_CLOCK_TAI,
};

/* Same one used in the bpf_convert_tstamp_type_read() above */
BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);

Another thing is, the UDP test in test_tc_dtime.c probably needs to be adjusted, 
the userspace is using the CLOCK_TAI in SO_TXTIME and it is getting forwarded now.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-18 21:57       ` Martin KaFai Lau
@ 2024-04-19  0:30         ` Abhishek Chauhan (ABC)
  2024-04-20  1:13           ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-19  0:30 UTC (permalink / raw)
  To: Martin KaFai Lau, Willem de Bruijn
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Martin KaFai Lau,
	Daniel Borkmann, bpf, kernel



On 4/18/2024 2:57 PM, Martin KaFai Lau wrote:
> On 4/18/24 1:10 PM, Abhishek Chauhan (ABC) wrote:
>>>>   #ifdef CONFIG_NET_XGRESS
>>>>       __u8            tc_at_ingress:1;    /* See TC_AT_INGRESS_MASK */
>>>>       __u8            tc_skip_classify:1;
>>>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>>>    */
>>>>   #ifdef __BIG_ENDIAN_BITFIELD
>>>>   #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 7)
>>>> -#define TC_AT_INGRESS_MASK        (1 << 6)
>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 6)
>>>
>>> SKB_TSTAMP_TYPE_BIT2_MASK?
> 
> nit. Shorten it to just SKB_TSTAMP_TYPE_MASK?
> 
Okay i will do the same. Noted!
> #ifdef __BIG_ENDIAN_BITFIELD
> #define SKB_TSTAMP_TYPE_MASK    (3 << 6)
> #define SKB_TSTAMP_TYPE_RSH    (6)    /* more on this later */
> #else
> #define SKB_TSTAMP_TYPE_MASK    (3)
> #endif
> 
>>>
>> I was thinking to keep it as TAI because it will confuse developers. I hope thats okay.
> 
> I think it is not very useful to distinguish each bit since it is an enum value now. It becomes more like the "pkt_type:3" and its PKT_TYPE_MAX.
>I see what you are saying.
>>>> +#define TC_AT_INGRESS_MASK        (1 << 5)
>>>>   #else
>>>>   #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 0)
>>>> -#define TC_AT_INGRESS_MASK        (1 << 1)
>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 1)
>>>> +#define TC_AT_INGRESS_MASK        (1 << 2)
>>>>   #endif
>>>>   #define SKB_BF_MONO_TC_OFFSET        offsetof(struct sk_buff, __mono_tc_offset)
>>>>   @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>>>       case CLOCK_MONOTONIC:
>>>>           skb->tstamp_type = SKB_CLOCK_MONO;
>>>>           break;
>>>> +    case CLOCK_TAI:
>>>> +        skb->tstamp_type = SKB_CLOCK_TAI;
>>>> +        break;
>>>> +    default:
>>>> +        WARN_ONCE(true, "clockid %d not supported", tstamp_type);
>>>
>>> and set to 0 and default tstamp_type?
>>> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this.
>>>>       }
>>>>   }
>>>
>>>>   >
>>>   @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>>       *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>>>                     SKB_BF_MONO_TC_OFFSET);
>>>>       *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>> -                SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>> +                SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>> +                SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>> +                SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>       *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>>>       *insn++ = BPF_JMP_A(1);
>>>>       *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>>>> +    *insn++ = BPF_JMP_A(1);
>>>> +    *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
> 
> With SKB_TSTAMP_TYPE_MASK defined like above, this could be simplified like this (untested):
> 
Let me think this through and raise it as part of the next rfc patch. 
> static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>                                                      struct bpf_insn *insn)
> {
>     __u8 value_reg = si->dst_reg;
>     __u8 skb_reg = si->src_reg;
> 
>     BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>     *insn++ = BPF_LDX_MEM(BPF_B, value_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>     *insn++ = BPF_ALU32_IMM(BPF_AND, value_reg, SKB_TSTAMP_TYPE_MASK);
> #ifdef __BIG_ENDIAN_BITFIELD
>     *insn++ = BPF_ALU32_IMM(BPF_RSH, value_reg, SKB_TSTAMP_TYPE_RSH);
> #else
>     BUILD_BUG_ON(!(SKB_TSTAMP_TYPE_MASK & 0x1));
> #endif
> 
>     return insn;
> }
> 
>>>>         return insn;
>>>>   }
>>>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>>>           __u8 tmp_reg = BPF_REG_AX;
>>>>             *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>> +        /*check if all three bits are set*/
>>>>           *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>>>> -        *insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>> +                    SKB_TAI_DELIVERY_TIME_MASK);
>>>> +        /*if all 3 bits are set jump 3 instructions and clear the register */
>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>> +        /*Now check Mono is set with ingress mask if so clear */
>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>> +        /*Now Check tai is set with ingress mask if so clear */
>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>> +                    TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>> +        /*Now Check tai and mono are set if so clear */
>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>> +                    SKB_MONO_DELIVERY_TIME_MASK |
>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 1);
> 
> Same as the bpf_convert_tstamp_type_read, this could be simplified with SKB_TSTAMP_TYPE_MASK.
> 
>>>
>>> This looks as if all JEQ result in "if so clear"?
>>>
>>> Is the goal to only do something different for the two bits being 0x1,
>>> can we have a single test with a two-bit mask, rather than four tests?
>>>
>> I think Martin wanted to take care of TAI as well. I will wait for his comment here
>>
>> My Goal was to take care of invalid combos which does not hold valid
>> 1. If all 3 bits are set => invalid combo (Test case written is Insane)
>> 2. If 2 bits are set (tai+mono)(Test case written is Insane) => this cannot happen (because clock base can only be one in skb)
>> 3. If 2 bit are set (ingress + tai/mono) => This is existing logic + tai being added (clear tstamp in ingress)
>> 4. For all other cases go ahead and fill in the tstamp in the dest register.
> 
> If it is to ensure no new type is added without adding BPF_SKB_TSTAMP_DELIVERY_XYZ, I would simplify this runtime bpf insns here and use a BUILD_BUG_ON to catch it at compile time. Something like,
> 
> enum skb_tstamp_type {
>         SKB_CLOCK_REAL, /* Time base is skb is REALTIME */
>         SKB_CLOCK_MONO, /* Time base is skb is MONOTONIC */
>      SKB_CLOCK_TAI,  /* Time base in skb is TAI */
>         __SKB_CLOCK_MAX = SKB_CLOCK_TAI,
> };
> 
> /* Same one used in the bpf_convert_tstamp_type_read() above */
> BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
> 
> Another thing is, the UDP test in test_tc_dtime.c probably needs to be adjusted, the userspace is using the CLOCK_TAI in SO_TXTIME and it is getting forwarded now.
Noted ! Let me check and evalute this as well. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-19  0:30         ` Abhishek Chauhan (ABC)
@ 2024-04-20  1:13           ` Abhishek Chauhan (ABC)
  2024-04-22 18:46             ` Martin KaFai Lau
  0 siblings, 1 reply; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-20  1:13 UTC (permalink / raw)
  To: Martin KaFai Lau, Willem de Bruijn
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Martin KaFai Lau,
	Daniel Borkmann, bpf, kernel



On 4/18/2024 5:30 PM, Abhishek Chauhan (ABC) wrote:
> 
> 
> On 4/18/2024 2:57 PM, Martin KaFai Lau wrote:
>> On 4/18/24 1:10 PM, Abhishek Chauhan (ABC) wrote:
>>>>>   #ifdef CONFIG_NET_XGRESS
>>>>>       __u8            tc_at_ingress:1;    /* See TC_AT_INGRESS_MASK */
>>>>>       __u8            tc_skip_classify:1;
>>>>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>>>>    */
>>>>>   #ifdef __BIG_ENDIAN_BITFIELD
>>>>>   #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 7)
>>>>> -#define TC_AT_INGRESS_MASK        (1 << 6)
>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 6)
>>>>
>>>> SKB_TSTAMP_TYPE_BIT2_MASK?
>>
>> nit. Shorten it to just SKB_TSTAMP_TYPE_MASK?
>>
> Okay i will do the same. Noted!
>> #ifdef __BIG_ENDIAN_BITFIELD
>> #define SKB_TSTAMP_TYPE_MASK    (3 << 6)
>> #define SKB_TSTAMP_TYPE_RSH    (6)    /* more on this later */
>> #else
>> #define SKB_TSTAMP_TYPE_MASK    (3)
>> #endif
>>
>>>>
>>> I was thinking to keep it as TAI because it will confuse developers. I hope thats okay.
>>
>> I think it is not very useful to distinguish each bit since it is an enum value now. It becomes more like the "pkt_type:3" and its PKT_TYPE_MAX.
>> I see what you are saying.
>>>>> +#define TC_AT_INGRESS_MASK        (1 << 5)
>>>>>   #else
>>>>>   #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 0)
>>>>> -#define TC_AT_INGRESS_MASK        (1 << 1)
>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 1)
>>>>> +#define TC_AT_INGRESS_MASK        (1 << 2)
>>>>>   #endif
>>>>>   #define SKB_BF_MONO_TC_OFFSET        offsetof(struct sk_buff, __mono_tc_offset)
>>>>>   @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>>>>       case CLOCK_MONOTONIC:
>>>>>           skb->tstamp_type = SKB_CLOCK_MONO;
>>>>>           break;
>>>>> +    case CLOCK_TAI:
>>>>> +        skb->tstamp_type = SKB_CLOCK_TAI;
>>>>> +        break;
>>>>> +    default:
>>>>> +        WARN_ONCE(true, "clockid %d not supported", tstamp_type);
>>>>
>>>> and set to 0 and default tstamp_type?
>>>> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this.
>>>>>       }
>>>>>   }
>>>>
>>>>>   >
>>>>   @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>>>       *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>>>>                     SKB_BF_MONO_TC_OFFSET);
>>>>>       *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>> -                SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>> +                SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>> +                SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>> +                SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>>       *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>>>>       *insn++ = BPF_JMP_A(1);
>>>>>       *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>>>>> +    *insn++ = BPF_JMP_A(1);
>>>>> +    *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
>>
>> With SKB_TSTAMP_TYPE_MASK defined like above, this could be simplified like this (untested):
>>
> Let me think this through and raise it as part of the next rfc patch. 
>> static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>                                                      struct bpf_insn *insn)
>> {
>>     __u8 value_reg = si->dst_reg;
>>     __u8 skb_reg = si->src_reg;
>>
>>     BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>>     *insn++ = BPF_LDX_MEM(BPF_B, value_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>     *insn++ = BPF_ALU32_IMM(BPF_AND, value_reg, SKB_TSTAMP_TYPE_MASK);
>> #ifdef __BIG_ENDIAN_BITFIELD
>>     *insn++ = BPF_ALU32_IMM(BPF_RSH, value_reg, SKB_TSTAMP_TYPE_RSH);
>> #else
>>     BUILD_BUG_ON(!(SKB_TSTAMP_TYPE_MASK & 0x1));
>> #endif
>>
>>     return insn;
>> }
>>
>>>>>         return insn;
>>>>>   }
>>>>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>>>>           __u8 tmp_reg = BPF_REG_AX;
>>>>>             *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>>> +        /*check if all three bits are set*/
>>>>>           *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>>>>> -        *insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK);
>>>>> +        /*if all 3 bits are set jump 3 instructions and clear the register */
>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>> +        /*Now check Mono is set with ingress mask if so clear */
>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>> +        /*Now Check tai is set with ingress mask if so clear */
>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>> +                    TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>> +        /*Now Check tai and mono are set if so clear */
>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>> +                    SKB_MONO_DELIVERY_TIME_MASK |
>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 1);
>>
>> Same as the bpf_convert_tstamp_type_read, this could be simplified with SKB_TSTAMP_TYPE_MASK.
>>
Willem and Martin, 
When do we clear the tstamp and make it 0 in bpf_convert_tstamp_read? meaning which configuration?  
I see previously(current upstream code) if mono_delivery is set and tc_ingress_mask is set 
upstream code used to set the tstamp as 0. 

Which means with addition of tai mask the new implementation should take care of following cases(correct me if i am wrong)
1. ( tai mask set + ingress mask set ) = Clear tstamp 
2. ( mono mask set + ingress mask set ) = Clear tstamp 
3. ( mono mask set + tai mask set + ingress mask set ) = Clear tstamp
4. ( No mask set ) = Clear tstamp 
5. ( Tai mask set + mono mask set ) = Clear tstamp 

This leaves us with only two values which can be support which is 0x1 and 0x2 

This means the tstamp_type should be either 0x1(mono) and tstamp_type 0x2 (tai) to set the value_reg with tstamp 
Is my understanding correct ? 

Do you think the below simplified version looks okay ? 

static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
						const struct bpf_insn *si,
						struct bpf_insn *insn)
{
	__u8 value_reg = si->dst_reg;
	__u8 skb_reg = si->src_reg;

BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
#ifdef CONFIG_NET_XGRESS
	/* If the tstamp_type is read,
	 * the bpf prog is aware the tstamp could have delivery time.
	 * Thus, read skb->tstamp as is if tstamp_type_access is true.
	 */
	if (!prog->tstamp_type_access) {
		/* AX is needed because src_reg and dst_reg could be the same */
		__u8 tmp_reg = BPF_REG_AX;

		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
		/* check if all three bits are set*/
		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
					TC_AT_INGRESS_MASK | SKB_TSTAMP_TYPE_MASK);

		/* If the value of tmp_reg is 7,6,5,4,3,0 which means invalid
		 * configuration set the tstamp to 0, value 0x1 and 0x2
		 * is correct configuration
		 */
#ifdef __BIG_ENDIAN_BITFIELD 
		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1 << SKB_TSTAMP_TYPE_RSH, 3);
		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2 << SKB_TSTAMP_TYPE_RSH, 2);
#endif
		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1, 3);
		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2, 2);
#endif
		/* skb->tc_at_ingress && skb->tstamp_type:2,
		 * read 0 as the (rcv) timestamp.
		 */
		*insn++ = BPF_MOV64_IMM(value_reg, 0);
		*insn++ = BPF_JMP_A(1);
	}
#endif

	*insn++ = BPF_LDX_MEM(BPF_DW, value_reg, skb_reg,
			      offsetof(struct sk_buff, tstamp));
	return insn;
}


>>>>
>>>> This looks as if all JEQ result in "if so clear"?
>>>>
>>>> Is the goal to only do something different for the two bits being 0x1,
>>>> can we have a single test with a two-bit mask, rather than four tests?
>>>>
>>> I think Martin wanted to take care of TAI as well. I will wait for his comment here
>>>
>>> My Goal was to take care of invalid combos which does not hold valid
>>> 1. If all 3 bits are set => invalid combo (Test case written is Insane)
>>> 2. If 2 bits are set (tai+mono)(Test case written is Insane) => this cannot happen (because clock base can only be one in skb)
>>> 3. If 2 bit are set (ingress + tai/mono) => This is existing logic + tai being added (clear tstamp in ingress)
>>> 4. For all other cases go ahead and fill in the tstamp in the dest register.
>>
>> If it is to ensure no new type is added without adding BPF_SKB_TSTAMP_DELIVERY_XYZ, I would simplify this runtime bpf insns here and use a BUILD_BUG_ON to catch it at compile time. Something like,
>>
>> enum skb_tstamp_type {
>>         SKB_CLOCK_REAL, /* Time base is skb is REALTIME */
>>         SKB_CLOCK_MONO, /* Time base is skb is MONOTONIC */
>>      SKB_CLOCK_TAI,  /* Time base in skb is TAI */
>>         __SKB_CLOCK_MAX = SKB_CLOCK_TAI,
>> };
>>
>> /* Same one used in the bpf_convert_tstamp_type_read() above */
>> BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>>
>> Another thing is, the UDP test in test_tc_dtime.c probably needs to be adjusted, the userspace is using the CLOCK_TAI in SO_TXTIME and it is getting forwarded now.
> Noted ! Let me check and evalute this as well. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-20  1:13           ` Abhishek Chauhan (ABC)
@ 2024-04-22 18:46             ` Martin KaFai Lau
  2024-04-24 18:03               ` Abhishek Chauhan (ABC)
  0 siblings, 1 reply; 16+ messages in thread
From: Martin KaFai Lau @ 2024-04-22 18:46 UTC (permalink / raw)
  To: Abhishek Chauhan (ABC), Willem de Bruijn
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Martin KaFai Lau,
	Daniel Borkmann, bpf, kernel

On 4/19/24 6:13 PM, Abhishek Chauhan (ABC) wrote:
> 
> 
> On 4/18/2024 5:30 PM, Abhishek Chauhan (ABC) wrote:
>>
>>
>> On 4/18/2024 2:57 PM, Martin KaFai Lau wrote:
>>> On 4/18/24 1:10 PM, Abhishek Chauhan (ABC) wrote:
>>>>>>    #ifdef CONFIG_NET_XGRESS
>>>>>>        __u8            tc_at_ingress:1;    /* See TC_AT_INGRESS_MASK */
>>>>>>        __u8            tc_skip_classify:1;
>>>>>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>>>>>     */
>>>>>>    #ifdef __BIG_ENDIAN_BITFIELD
>>>>>>    #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 7)
>>>>>> -#define TC_AT_INGRESS_MASK        (1 << 6)
>>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 6)
>>>>>
>>>>> SKB_TSTAMP_TYPE_BIT2_MASK?
>>>
>>> nit. Shorten it to just SKB_TSTAMP_TYPE_MASK?
>>>
>> Okay i will do the same. Noted!
>>> #ifdef __BIG_ENDIAN_BITFIELD
>>> #define SKB_TSTAMP_TYPE_MASK    (3 << 6)
>>> #define SKB_TSTAMP_TYPE_RSH    (6)    /* more on this later */
>>> #else
>>> #define SKB_TSTAMP_TYPE_MASK    (3)
>>> #endif
>>>
>>>>>
>>>> I was thinking to keep it as TAI because it will confuse developers. I hope thats okay.
>>>
>>> I think it is not very useful to distinguish each bit since it is an enum value now. It becomes more like the "pkt_type:3" and its PKT_TYPE_MAX.
>>> I see what you are saying.
>>>>>> +#define TC_AT_INGRESS_MASK        (1 << 5)
>>>>>>    #else
>>>>>>    #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 0)
>>>>>> -#define TC_AT_INGRESS_MASK        (1 << 1)
>>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 1)
>>>>>> +#define TC_AT_INGRESS_MASK        (1 << 2)
>>>>>>    #endif
>>>>>>    #define SKB_BF_MONO_TC_OFFSET        offsetof(struct sk_buff, __mono_tc_offset)
>>>>>>    @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>>>>>        case CLOCK_MONOTONIC:
>>>>>>            skb->tstamp_type = SKB_CLOCK_MONO;
>>>>>>            break;
>>>>>> +    case CLOCK_TAI:
>>>>>> +        skb->tstamp_type = SKB_CLOCK_TAI;
>>>>>> +        break;
>>>>>> +    default:
>>>>>> +        WARN_ONCE(true, "clockid %d not supported", tstamp_type);
>>>>>
>>>>> and set to 0 and default tstamp_type?
>>>>> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this.
>>>>>>        }
>>>>>>    }
>>>>>
>>>>>>    >
>>>>>    @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>>>>        *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>>>>>                      SKB_BF_MONO_TC_OFFSET);
>>>>>>        *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>> -                SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>>> +                SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>> +                SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>> +                SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>>>        *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>>>>>        *insn++ = BPF_JMP_A(1);
>>>>>>        *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>>>>>> +    *insn++ = BPF_JMP_A(1);
>>>>>> +    *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
>>>
>>> With SKB_TSTAMP_TYPE_MASK defined like above, this could be simplified like this (untested):
>>>
>> Let me think this through and raise it as part of the next rfc patch.
>>> static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>                                                       struct bpf_insn *insn)
>>> {
>>>      __u8 value_reg = si->dst_reg;
>>>      __u8 skb_reg = si->src_reg;
>>>
>>>      BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>>>      *insn++ = BPF_LDX_MEM(BPF_B, value_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>      *insn++ = BPF_ALU32_IMM(BPF_AND, value_reg, SKB_TSTAMP_TYPE_MASK);
>>> #ifdef __BIG_ENDIAN_BITFIELD
>>>      *insn++ = BPF_ALU32_IMM(BPF_RSH, value_reg, SKB_TSTAMP_TYPE_RSH);
>>> #else
>>>      BUILD_BUG_ON(!(SKB_TSTAMP_TYPE_MASK & 0x1));
>>> #endif
>>>
>>>      return insn;
>>> }
>>>
>>>>>>          return insn;
>>>>>>    }
>>>>>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>>>>>            __u8 tmp_reg = BPF_REG_AX;
>>>>>>              *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>>>> +        /*check if all three bits are set*/
>>>>>>            *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>>>>>> -        *insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK);
>>>>>> +        /*if all 3 bits are set jump 3 instructions and clear the register */
>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>>> +        /*Now check Mono is set with ingress mask if so clear */
>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>>> +        /*Now Check tai is set with ingress mask if so clear */
>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>> +                    TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>>> +        /*Now Check tai and mono are set if so clear */
>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>> +                    SKB_MONO_DELIVERY_TIME_MASK |
>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 1);
>>>
>>> Same as the bpf_convert_tstamp_type_read, this could be simplified with SKB_TSTAMP_TYPE_MASK.
>>>
> Willem and Martin,
> When do we clear the tstamp and make it 0 in bpf_convert_tstamp_read? meaning which configuration?

When the bpf prog does not check the skb->tstamp_type. It is
the "if (!prog->tstamp_type_access)" in bpf_convert_tstamp_read().

If bpf prog does not check the skb->tstamp_type and it is at ingress,
bpf prog expects recv tstamp (ie. real clock), so it needs to clear
out the tstamp (i.e read as 0 tstamp).

> I see previously(current upstream code) if mono_delivery is set and tc_ingress_mask is set
> upstream code used to set the tstamp as 0.
> 
> Which means with addition of tai mask the new implementation should take care of following cases(correct me if i am wrong)
> 1. ( tai mask set + ingress mask set ) = Clear tstamp
> 2. ( mono mask set + ingress mask set ) = Clear tstamp
> 3. ( mono mask set + tai mask set + ingress mask set ) = Clear tstamp
> 4. ( No mask set ) = Clear tstamp
> 5. ( Tai mask set + mono mask set ) = Clear tstamp

No need to check the individual mono and tai bit here. Check the
tstamp_type as a whole. Like in pseudo C:

if (skb->tc_at_ingress && skb->tstamp_type)
	value_reg = 0;

untested code for tstamp_read() and tstamp_write():

static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
                                                 const struct bpf_insn *si,
                                                 struct bpf_insn *insn)
{
	__u8 value_reg = si->dst_reg;
	__u8 skb_reg = si->src_reg;

#ifdef CONFIG_NET_XGRESS
	/* If the tstamp_type is read,
	 * the bpf prog is aware the tstamp could have delivery time.
	 * Thus, read skb->tstamp as is if tstamp_type_access is true.
	 */
	if (!prog->tstamp_type_access) {
		/* AX is needed because src_reg and dst_reg could be the same */
		__u8 tmp_reg = BPF_REG_AX;

		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
		*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
		/* goto <read> */
		BPF_JMP_A(4);
		*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, SKB_TSTAMP_TYPE_MASK, 1);
		/* goto <read> */
		BPF_JMP_A(2);
		/* skb->tc_at_ingress && skb->tstamp_type,
		 * read 0 as the (rcv) timestamp.
		 */
		*insn++ = BPF_MOV64_IMM(value_reg, 0);
		*insn++ = BPF_JMP_A(1);
	}
#endif

	/* <read>: value_reg = skb->tstamp */
	*insn++ = BPF_LDX_MEM(BPF_DW, value_reg, skb_reg,
			      offsetof(struct sk_buff, tstamp));
	return insn;
}

static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
                                                  const struct bpf_insn *si,
	                                         struct bpf_insn *insn)
{
	__u8 value_reg = si->src_reg;
	__u8 skb_reg = si->dst_reg;

#ifdef CONFIG_NET_XGRESS
	/* If the tstamp_type is read,
	 * the bpf prog is aware the tstamp could have delivery time.
	 * Thus, write skb->tstamp as is if tstamp_type_access is true.
	 * Otherwise, writing at ingress will have to clear the
	 * mono_delivery_time (skb->tstamp_type:1)bit also.
	 */
         if (!prog->tstamp_type_access) {
		__u8 tmp_reg = BPF_REG_AX;

		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
		/* Writing __sk_buff->tstamp as ingress, goto <clear> */
		*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
		/* goto <store> */
		*insn++ = BPF_JMP_A(2);
		/* <clear>: skb->tstamp_type */
		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_TSTAMP_TYPE_MASK);
		*insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg, SKB_BF_MONO_TC_OFFSET);
	}
#endif

	/* <store>: skb->tstamp = tstamp */
	*insn++ = BPF_RAW_INSN(BPF_CLASS(si->code) | BPF_DW | BPF_MEM,
			       skb_reg, value_reg, offsetof(struct sk_buff, tstamp), si->imm);
         return insn;
}

> 
> This leaves us with only two values which can be support which is 0x1 and 0x2
> 
> This means the tstamp_type should be either 0x1(mono) and tstamp_type 0x2 (tai) to set the value_reg with tstamp
> Is my understanding correct ?
> 
> Do you think the below simplified version looks okay ?
> 
> static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
> 						const struct bpf_insn *si,
> 						struct bpf_insn *insn)
> {
> 	__u8 value_reg = si->dst_reg;
> 	__u8 skb_reg = si->src_reg;
> 
> BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
> #ifdef CONFIG_NET_XGRESS
> 	/* If the tstamp_type is read,
> 	 * the bpf prog is aware the tstamp could have delivery time.
> 	 * Thus, read skb->tstamp as is if tstamp_type_access is true.
> 	 */
> 	if (!prog->tstamp_type_access) {
> 		/* AX is needed because src_reg and dst_reg could be the same */
> 		__u8 tmp_reg = BPF_REG_AX;
> 
> 		*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
> 		/* check if all three bits are set*/
> 		*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
> 					TC_AT_INGRESS_MASK | SKB_TSTAMP_TYPE_MASK);
> 
> 		/* If the value of tmp_reg is 7,6,5,4,3,0 which means invalid
> 		 * configuration set the tstamp to 0, value 0x1 and 0x2
> 		 * is correct configuration
> 		 */
> #ifdef __BIG_ENDIAN_BITFIELD
> 		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1 << SKB_TSTAMP_TYPE_RSH, 3);
> 		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2 << SKB_TSTAMP_TYPE_RSH, 2);
> #endif
> 		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1, 3);
> 		*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2, 2);
> #endif
> 		/* skb->tc_at_ingress && skb->tstamp_type:2,
> 		 * read 0 as the (rcv) timestamp.
> 		 */
> 		*insn++ = BPF_MOV64_IMM(value_reg, 0);
> 		*insn++ = BPF_JMP_A(1);
> 	}
> #endif
> 
> 	*insn++ = BPF_LDX_MEM(BPF_DW, value_reg, skb_reg,
> 			      offsetof(struct sk_buff, tstamp));
> 	return insn;
> }
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type
  2024-04-22 18:46             ` Martin KaFai Lau
@ 2024-04-24 18:03               ` Abhishek Chauhan (ABC)
  0 siblings, 0 replies; 16+ messages in thread
From: Abhishek Chauhan (ABC) @ 2024-04-24 18:03 UTC (permalink / raw)
  To: Martin KaFai Lau, Willem de Bruijn
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Andrew Halaney, Martin KaFai Lau,
	Daniel Borkmann, bpf, kernel



On 4/22/2024 11:46 AM, Martin KaFai Lau wrote:
> On 4/19/24 6:13 PM, Abhishek Chauhan (ABC) wrote:
>>
>>
>> On 4/18/2024 5:30 PM, Abhishek Chauhan (ABC) wrote:
>>>
>>>
>>> On 4/18/2024 2:57 PM, Martin KaFai Lau wrote:
>>>> On 4/18/24 1:10 PM, Abhishek Chauhan (ABC) wrote:
>>>>>>>    #ifdef CONFIG_NET_XGRESS
>>>>>>>        __u8            tc_at_ingress:1;    /* See TC_AT_INGRESS_MASK */
>>>>>>>        __u8            tc_skip_classify:1;
>>>>>>> @@ -1096,10 +1100,12 @@ struct sk_buff {
>>>>>>>     */
>>>>>>>    #ifdef __BIG_ENDIAN_BITFIELD
>>>>>>>    #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 7)
>>>>>>> -#define TC_AT_INGRESS_MASK        (1 << 6)
>>>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 6)
>>>>>>
>>>>>> SKB_TSTAMP_TYPE_BIT2_MASK?
>>>>
>>>> nit. Shorten it to just SKB_TSTAMP_TYPE_MASK?
>>>>
>>> Okay i will do the same. Noted!
>>>> #ifdef __BIG_ENDIAN_BITFIELD
>>>> #define SKB_TSTAMP_TYPE_MASK    (3 << 6)
>>>> #define SKB_TSTAMP_TYPE_RSH    (6)    /* more on this later */
>>>> #else
>>>> #define SKB_TSTAMP_TYPE_MASK    (3)
>>>> #endif
>>>>
>>>>>>
>>>>> I was thinking to keep it as TAI because it will confuse developers. I hope thats okay.
>>>>
>>>> I think it is not very useful to distinguish each bit since it is an enum value now. It becomes more like the "pkt_type:3" and its PKT_TYPE_MAX.
>>>> I see what you are saying.
>>>>>>> +#define TC_AT_INGRESS_MASK        (1 << 5)
>>>>>>>    #else
>>>>>>>    #define SKB_MONO_DELIVERY_TIME_MASK    (1 << 0)
>>>>>>> -#define TC_AT_INGRESS_MASK        (1 << 1)
>>>>>>> +#define SKB_TAI_DELIVERY_TIME_MASK    (1 << 1)
>>>>>>> +#define TC_AT_INGRESS_MASK        (1 << 2)
>>>>>>>    #endif
>>>>>>>    #define SKB_BF_MONO_TC_OFFSET        offsetof(struct sk_buff, __mono_tc_offset)
>>>>>>>    @@ -4206,6 +4212,11 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>>>>>>        case CLOCK_MONOTONIC:
>>>>>>>            skb->tstamp_type = SKB_CLOCK_MONO;
>>>>>>>            break;
>>>>>>> +    case CLOCK_TAI:
>>>>>>> +        skb->tstamp_type = SKB_CLOCK_TAI;
>>>>>>> +        break;
>>>>>>> +    default:
>>>>>>> +        WARN_ONCE(true, "clockid %d not supported", tstamp_type);
>>>>>>
>>>>>> and set to 0 and default tstamp_type?
>>>>>> Actually thinking about it. I feel if its unsupported just fall back to default is the correct thing. I will take care of this.
>>>>>>>        }
>>>>>>>    }
>>>>>>
>>>>>>>    >
>>>>>>    @@ -9372,10 +9378,16 @@ static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>>>>>        *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
>>>>>>>                      SKB_BF_MONO_TC_OFFSET);
>>>>>>>        *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>>> -                SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>>>> +                SKB_MONO_DELIVERY_TIME_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>>> +                SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>>>> +    *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
>>>>>>> +                SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>>>>        *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
>>>>>>>        *insn++ = BPF_JMP_A(1);
>>>>>>>        *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
>>>>>>> +    *insn++ = BPF_JMP_A(1);
>>>>>>> +    *insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_TAI);
>>>>
>>>> With SKB_TSTAMP_TYPE_MASK defined like above, this could be simplified like this (untested):
>>>>
>>> Let me think this through and raise it as part of the next rfc patch.
>>>> static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
>>>>                                                       struct bpf_insn *insn)
>>>> {
>>>>      __u8 value_reg = si->dst_reg;
>>>>      __u8 skb_reg = si->src_reg;
>>>>
>>>>      BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>>>>      *insn++ = BPF_LDX_MEM(BPF_B, value_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>>      *insn++ = BPF_ALU32_IMM(BPF_AND, value_reg, SKB_TSTAMP_TYPE_MASK);
>>>> #ifdef __BIG_ENDIAN_BITFIELD
>>>>      *insn++ = BPF_ALU32_IMM(BPF_RSH, value_reg, SKB_TSTAMP_TYPE_RSH);
>>>> #else
>>>>      BUILD_BUG_ON(!(SKB_TSTAMP_TYPE_MASK & 0x1));
>>>> #endif
>>>>
>>>>      return insn;
>>>> }
>>>>
>>>>>>>          return insn;
>>>>>>>    }
>>>>>>> @@ -9418,10 +9430,26 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>>>>>>            __u8 tmp_reg = BPF_REG_AX;
>>>>>>>              *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>>>>>> +        /*check if all three bits are set*/
>>>>>>>            *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
>>>>>>> -        *insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
>>>>>>> -                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
>>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK);
>>>>>>> +        /*if all 3 bits are set jump 3 instructions and clear the register */
>>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK |
>>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 4);
>>>>>>> +        /*Now check Mono is set with ingress mask if so clear */
>>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>>> +                    TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 3);
>>>>>>> +        /*Now Check tai is set with ingress mask if so clear */
>>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>>> +                    TC_AT_INGRESS_MASK | SKB_TAI_DELIVERY_TIME_MASK, 2);
>>>>>>> +        /*Now Check tai and mono are set if so clear */
>>>>>>> +        *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg,
>>>>>>> +                    SKB_MONO_DELIVERY_TIME_MASK |
>>>>>>> +                    SKB_TAI_DELIVERY_TIME_MASK, 1);
>>>>
>>>> Same as the bpf_convert_tstamp_type_read, this could be simplified with SKB_TSTAMP_TYPE_MASK.
>>>>
>> Willem and Martin,
>> When do we clear the tstamp and make it 0 in bpf_convert_tstamp_read? meaning which configuration?
> 
> When the bpf prog does not check the skb->tstamp_type. It is
> the "if (!prog->tstamp_type_access)" in bpf_convert_tstamp_read().
> 
> If bpf prog does not check the skb->tstamp_type and it is at ingress,
> bpf prog expects recv tstamp (ie. real clock), so it needs to clear
> out the tstamp (i.e read as 0 tstamp).
> 
>> I see previously(current upstream code) if mono_delivery is set and tc_ingress_mask is set
>> upstream code used to set the tstamp as 0.
>>
>> Which means with addition of tai mask the new implementation should take care of following cases(correct me if i am wrong)
>> 1. ( tai mask set + ingress mask set ) = Clear tstamp
>> 2. ( mono mask set + ingress mask set ) = Clear tstamp
>> 3. ( mono mask set + tai mask set + ingress mask set ) = Clear tstamp
>> 4. ( No mask set ) = Clear tstamp
>> 5. ( Tai mask set + mono mask set ) = Clear tstamp
> 
> No need to check the individual mono and tai bit here. Check the
> tstamp_type as a whole. Like in pseudo C:
> 
> if (skb->tc_at_ingress && skb->tstamp_type)
>     value_reg = 0;
> 
> untested code for tstamp_read() and tstamp_write():
> 
> static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>                                                 const struct bpf_insn *si,
>                                                 struct bpf_insn *insn)
> {
>     __u8 value_reg = si->dst_reg;
>     __u8 skb_reg = si->src_reg;
> 
> #ifdef CONFIG_NET_XGRESS
>     /* If the tstamp_type is read,
>      * the bpf prog is aware the tstamp could have delivery time.
>      * Thus, read skb->tstamp as is if tstamp_type_access is true.
>      */
>     if (!prog->tstamp_type_access) {
>         /* AX is needed because src_reg and dst_reg could be the same */
>         __u8 tmp_reg = BPF_REG_AX;
> 
>         *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>         *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
>         /* goto <read> */
>         BPF_JMP_A(4);
>         *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, SKB_TSTAMP_TYPE_MASK, 1);
>         /* goto <read> */
>         BPF_JMP_A(2);
>         /* skb->tc_at_ingress && skb->tstamp_type,
>          * read 0 as the (rcv) timestamp.
>          */
>         *insn++ = BPF_MOV64_IMM(value_reg, 0);
>         *insn++ = BPF_JMP_A(1);
>     }
> #endif
> 
>     /* <read>: value_reg = skb->tstamp */
>     *insn++ = BPF_LDX_MEM(BPF_DW, value_reg, skb_reg,
>                   offsetof(struct sk_buff, tstamp));
>     return insn;
> }
> 
> static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
>                                                  const struct bpf_insn *si,
>                                              struct bpf_insn *insn)
> {
>     __u8 value_reg = si->src_reg;
>     __u8 skb_reg = si->dst_reg;
> 
> #ifdef CONFIG_NET_XGRESS
>     /* If the tstamp_type is read,
>      * the bpf prog is aware the tstamp could have delivery time.
>      * Thus, write skb->tstamp as is if tstamp_type_access is true.
>      * Otherwise, writing at ingress will have to clear the
>      * mono_delivery_time (skb->tstamp_type:1)bit also.
>      */
>         if (!prog->tstamp_type_access) {
>         __u8 tmp_reg = BPF_REG_AX;
> 
>         *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>         /* Writing __sk_buff->tstamp as ingress, goto <clear> */
>         *insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
>         /* goto <store> */
>         *insn++ = BPF_JMP_A(2);
>         /* <clear>: skb->tstamp_type */
>         *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_TSTAMP_TYPE_MASK);
>         *insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg, SKB_BF_MONO_TC_OFFSET);
>     }
> #endif
> 
>     /* <store>: skb->tstamp = tstamp */
>     *insn++ = BPF_RAW_INSN(BPF_CLASS(si->code) | BPF_DW | BPF_MEM,
>                    skb_reg, value_reg, offsetof(struct sk_buff, tstamp), si->imm);
>         return insn;
> }
> 

Thanks Martin. I will raise the RFC patch v5 today. Also made changes in test_tc_dtime.c which 
needs a closer review too. 

>>
>> This leaves us with only two values which can be support which is 0x1 and 0x2
>>
>> This means the tstamp_type should be either 0x1(mono) and tstamp_type 0x2 (tai) to set the value_reg with tstamp
>> Is my understanding correct ?
>>
>> Do you think the below simplified version looks okay ?
>>
>> static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
>>                         const struct bpf_insn *si,
>>                         struct bpf_insn *insn)
>> {
>>     __u8 value_reg = si->dst_reg;
>>     __u8 skb_reg = si->src_reg;
>>
>> BUILD_BUG_ON(__SKB_CLOCK_MAX != BPF_SKB_TSTAMP_DELIVERY_TAI);
>> #ifdef CONFIG_NET_XGRESS
>>     /* If the tstamp_type is read,
>>      * the bpf prog is aware the tstamp could have delivery time.
>>      * Thus, read skb->tstamp as is if tstamp_type_access is true.
>>      */
>>     if (!prog->tstamp_type_access) {
>>         /* AX is needed because src_reg and dst_reg could be the same */
>>         __u8 tmp_reg = BPF_REG_AX;
>>
>>         *insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, SKB_BF_MONO_TC_OFFSET);
>>         /* check if all three bits are set*/
>>         *insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
>>                     TC_AT_INGRESS_MASK | SKB_TSTAMP_TYPE_MASK);
>>
>>         /* If the value of tmp_reg is 7,6,5,4,3,0 which means invalid
>>          * configuration set the tstamp to 0, value 0x1 and 0x2
>>          * is correct configuration
>>          */
>> #ifdef __BIG_ENDIAN_BITFIELD
>>         *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1 << SKB_TSTAMP_TYPE_RSH, 3);
>>         *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2 << SKB_TSTAMP_TYPE_RSH, 2);
>> #endif
>>         *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x1, 3);
>>         *insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0x2, 2);
>> #endif
>>         /* skb->tc_at_ingress && skb->tstamp_type:2,
>>          * read 0 as the (rcv) timestamp.
>>          */
>>         *insn++ = BPF_MOV64_IMM(value_reg, 0);
>>         *insn++ = BPF_JMP_A(1);
>>     }
>> #endif
>>
>>     *insn++ = BPF_LDX_MEM(BPF_DW, value_reg, skb_reg,
>>                   offsetof(struct sk_buff, tstamp));
>>     return insn;
>> }
>>
>>
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-04-24 18:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-18  0:43 [RFC PATCH bpf-next v4 0/2] Replace mono_delivery_time with tstamp_type Abhishek Chauhan
2024-04-18  0:43 ` [RFC PATCH bpf-next v4 1/2] net: Rename mono_delivery_time to tstamp_type for scalabilty Abhishek Chauhan
2024-04-18 18:47   ` Willem de Bruijn
2024-04-18 19:58     ` Abhishek Chauhan (ABC)
2024-04-18 20:11       ` Willem de Bruijn
2024-04-18 20:38         ` Abhishek Chauhan (ABC)
2024-04-18 20:49           ` Willem de Bruijn
2024-04-18 20:51             ` Abhishek Chauhan (ABC)
2024-04-18  0:43 ` [RFC PATCH bpf-next v4 2/2] net: Add additional bit to support clockid_t timestamp type Abhishek Chauhan
2024-04-18 19:06   ` Willem de Bruijn
2024-04-18 20:10     ` Abhishek Chauhan (ABC)
2024-04-18 21:57       ` Martin KaFai Lau
2024-04-19  0:30         ` Abhishek Chauhan (ABC)
2024-04-20  1:13           ` Abhishek Chauhan (ABC)
2024-04-22 18:46             ` Martin KaFai Lau
2024-04-24 18:03               ` Abhishek Chauhan (ABC)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.