All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP
@ 2021-08-03 17:10 Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Kishen Maloor

Background

This change adds kernel and userspace support for SO_TXTIME
in AF_XDP to include a specific TXTIME (aka "Launch Time")
with XDP frames issued from userspace XDP applications.

This feature may be used in conjunction with application-based traffic
shaping methods to delay less critical traffic and prioritize
latency sensitive packets to meet a desired QoS.

Implementation notes

The timestamp is stored in the XDP metadata area and passed down
to the NIC driver to configure the launch time.
The timestamp is internally conveyed to the sk_buff destined to generic 
mode NIC drivers. Alternatively, it may be read via a new Zero-Copy 
driver API.

The structure of XDP metadata that stores the TXTIME takes a form
(Shown below) that is aligned to ongoing work in the community
for describing XDP metadata using BTF:

struct xdp_user_tx_metadata {
        __u64 timestamp;
        __u32 md_valid;
        __u32 btf_id;
};

On the control path from userspace XDP applications, this change
repurposes the SO_TXTIME socket option to harness this 
feature, and adopts the above metadata struct for storing the TXTIME.

The (libbpf) userspace API has been expanded with two helper functons:

- int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
   
   Sets the SO_TXTIME option on the AF_XDP socket (using setsockopt()).

- void xsk_umem__set_md_txtime(void *umem_area, __u64 chunkAddr,
                               __s64 txtime)
   
   Packages the application supplied TXTIME into the above md struct 
   and stores it in the XDP metadata area, which precedes the XDP frame.

In practice, a userspace XDP application must ensure the following:

* Store the XDP packet at the location following the XDP metadata area:
   uint8_t *pkt = xsk_umem__get_data(xsk.buffer, chunkAddr) +
                                     sizeof(struct xdp_user_tx_metadata);

* Correctly set the pkt addr in the TX descriptor:
   tx_desc->addr = chunkAddr + sizeof(struct xdp_user_tx_metadata);

* Signal the kernel that it included metadata by setting the TX 
  descriptor options:
   tx_desc->options = XDP_DESC_OPTION_METADATA;

In the NIC driver, a new XSK Zero-Copy Driver API:
s64 xsk_buff_get_txtime(struct xsk_buff_pool *pool, struct xdp_desc *desc)
is used to retrieve and consume a Launch Time provided by the userspace XDP
application.

Accompanying the kernel and userspace changes are the following updates:

* Launch Time support along the XDP ZC path in the IGC driver by exercising
  the new XSK ZC driver API.

* SO_TXTIME support in the xdpsock sample application to illustrate the
  userspace API.

Jithu Joseph (3):
  igc: Launchtime support in XDP Tx ZC path
  samples/bpf/xdpsock_user.c: Make get_nsecs() generic
  samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage

Kishen Maloor (2):
  net: xdp: SO_TXTIME support in AF_XDP
  libbpf: SO_TXTIME support in AF_XDP

 drivers/net/ethernet/intel/igc/igc_main.c | 49 ++++++++++++++-
 include/net/xdp_sock.h                    |  1 +
 include/net/xdp_sock_drv.h                | 10 +++
 include/net/xsk_buff_pool.h               |  1 +
 include/uapi/linux/if_xdp.h               |  2 +
 include/uapi/linux/xdp_md_std.h           | 14 +++++
 net/xdp/xsk.c                             | 51 +++++++++++++++-
 net/xdp/xsk.h                             |  2 +
 net/xdp/xsk_buff_pool.c                   | 23 +++++++
 net/xdp/xsk_queue.h                       |  4 +-
 samples/bpf/xdpsock_user.c                | 74 ++++++++++++++++++++---
 tools/include/uapi/linux/if_xdp.h         |  2 +
 tools/include/uapi/linux/xdp_md_std.h     | 14 +++++
 tools/lib/bpf/xsk.h                       | 27 ++++++++-
 14 files changed, 258 insertions(+), 16 deletions(-)
 create mode 100644 include/uapi/linux/xdp_md_std.h
 create mode 100644 tools/include/uapi/linux/xdp_md_std.h

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC bpf-next 1/5] net: xdp: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
@ 2021-08-03 17:10 ` Kishen Maloor
  2021-08-03 20:05   ` kernel test robot
  2021-08-03 20:56   ` kernel test robot
  2021-08-03 17:10 ` [RFC bpf-next 2/5] libbpf: " Kishen Maloor
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Kishen Maloor

This change adds kernel support for SO_TXTIME in AF_XDP
to include a specific TXTIME (aka "Launch Time")
with XDP frames issued from userspace XDP applications.

The timestamp is stored in the XDP metadata area and
passed down to the NIC driver to configure the launch time.
The timestamp is internally conveyed to the sk_buff destined
to generic mode NIC drivers.

Alternatively, a new XSK Zero-Copy driver API:

s64 xsk_buff_get_txtime(struct xsk_buff_pool *pool,
                        struct xdp_desc *desc)

nay used to retrieve and consume the TXTIME provided by the
userspace XDP application.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 include/net/xdp_sock.h          |  1 +
 include/net/xdp_sock_drv.h      | 10 +++++++
 include/net/xsk_buff_pool.h     |  1 +
 include/uapi/linux/if_xdp.h     |  2 ++
 include/uapi/linux/xdp_md_std.h | 14 +++++++++
 net/xdp/xsk.c                   | 51 ++++++++++++++++++++++++++++++++-
 net/xdp/xsk.h                   |  2 ++
 net/xdp/xsk_buff_pool.c         | 23 +++++++++++++++
 net/xdp/xsk_queue.h             |  4 +--
 9 files changed, 105 insertions(+), 3 deletions(-)
 create mode 100644 include/uapi/linux/xdp_md_std.h

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index fff069d2ed1b..b78932921b44 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -58,6 +58,7 @@ struct xdp_sock {
 
 	struct xsk_queue *tx ____cacheline_aligned_in_smp;
 	struct list_head tx_list;
+	u32 so_txtime_mask;
 	/* Protects generic receive. */
 	spinlock_t rx_lock;
 
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 4e295541e396..7d190b454772 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -95,6 +95,11 @@ static inline dma_addr_t xsk_buff_raw_get_dma(struct xsk_buff_pool *pool,
 	return xp_raw_get_dma(pool, addr);
 }
 
+static inline s64 xsk_buff_get_txtime(struct xsk_buff_pool *pool, struct xdp_desc *desc)
+{
+	return xp_raw_get_txtime(pool, desc);
+}
+
 static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 {
 	return xp_raw_get_data(pool, addr);
@@ -232,6 +237,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 	return NULL;
 }
 
+static inline s64 xsk_buff_get_txtime(struct xsk_buff_pool *pool, u64 addr)
+{
+	return -1;
+}
+
 static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool)
 {
 }
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 7a9a23e7a604..32863cde4128 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -107,6 +107,7 @@ void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs);
 struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool);
 bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count);
 void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr);
+s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, struct xdp_desc *desc);
 dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr);
 static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb)
 {
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index a78a8096f4ce..31f81f82ed86 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -106,6 +106,8 @@ struct xdp_desc {
 	__u32 options;
 };
 
+#define XDP_DESC_OPTION_METADATA (1 << 0)
+
 /* UMEM descriptor is __u64 */
 
 #endif /* _LINUX_IF_XDP_H */
diff --git a/include/uapi/linux/xdp_md_std.h b/include/uapi/linux/xdp_md_std.h
new file mode 100644
index 000000000000..f00996a61639
--- /dev/null
+++ b/include/uapi/linux/xdp_md_std.h
@@ -0,0 +1,14 @@
+#ifndef _UAPI_LINUX_XDP_MD_STD_H
+#define _UAPI_LINUX_XDP_MD_STD_H
+
+#include <linux/types.h>
+
+#define XDP_METADATA_USER_TX_TIMESTAMP 0x1
+
+struct xdp_user_tx_metadata {
+	__u64 timestamp;
+	__u32 md_valid;
+	__u32 btf_id;
+};
+
+#endif /* _UAPI_LINUX_XDP_MD_STD_H */
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index d6b500dc4208..c92baedf18b8 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -339,6 +339,7 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
 			continue;
 		}
 
+		desc->options |= xs->so_txtime_mask;
 		/* This is the backpressure mechanism for the Tx path.
 		 * Reserve space in the completion queue and only proceed
 		 * if there is space in it. This avoids having to implement
@@ -374,7 +375,7 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, struct xdp_desc *
 				   u32 max_entries)
 {
 	struct xdp_sock *xs;
-	u32 nb_pkts;
+	u32 nb_pkts, i;
 
 	rcu_read_lock();
 	if (!list_is_singular(&pool->xsk_tx_list)) {
@@ -395,6 +396,9 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, struct xdp_desc *
 		goto out;
 	}
 
+	for (i = 0; i < nb_pkts; i++)
+		descs[i].options |= xs->so_txtime_mask;
+
 	/* This is the backpressure mechanism for the Tx path. Try to
 	 * reserve space in the completion queue for all packets, but
 	 * if there are fewer slots available, just process that many
@@ -445,6 +449,19 @@ static void xsk_destruct_skb(struct sk_buff *skb)
 	sock_wfree(skb);
 }
 
+static void xsk_process_so_txtime(struct xsk_buff_pool *pool,
+				  struct xdp_desc *desc,
+				  struct sk_buff *skb)
+{
+	if ((desc->options & (XDP_DESC_OPTION_METADATA | XDP_SOL_OPTION_SO_TXTIME))
+	    == (XDP_DESC_OPTION_METADATA | XDP_SOL_OPTION_SO_TXTIME)) {
+		s64 tstamp = xsk_buff_get_txtime(pool, desc);
+
+		if (tstamp != -1)
+			skb->tstamp = tstamp;
+	}
+}
+
 static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 					      struct xdp_desc *desc)
 {
@@ -469,6 +486,7 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 	ts = pool->unaligned ? len : pool->chunk_size;
 
 	buffer = xsk_buff_raw_get_data(pool, addr);
+	xsk_process_so_txtime(pool, desc, skb);
 	offset = offset_in_page(buffer);
 	addr = buffer - pool->addrs;
 
@@ -520,6 +538,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 		skb_put(skb, len);
 
 		buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
+		xsk_process_so_txtime(xs->pool, desc, skb);
 		err = skb_store_bits(skb, 0, buffer, len);
 		if (unlikely(err)) {
 			kfree_skb(skb);
@@ -557,6 +576,7 @@ static int xsk_generic_xmit(struct sock *sk)
 			goto out;
 		}
 
+		desc.options |= xs->so_txtime_mask;
 		skb = xsk_build_skb(xs, &desc);
 		if (IS_ERR(skb)) {
 			err = PTR_ERR(skb);
@@ -1098,6 +1118,23 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
 		mutex_unlock(&xs->mutex);
 		return err;
 	}
+	case SO_TXTIME:
+	{
+		u32 option;
+
+		if (optlen != sizeof(u32))
+			return -EINVAL;
+
+		if (copy_from_sockptr(&option, optval, sizeof(option)))
+			return -EFAULT;
+
+		if (option)
+			xs->so_txtime_mask |= XDP_SOL_OPTION_SO_TXTIME;
+		else
+			xs->so_txtime_mask &= ~XDP_SOL_OPTION_SO_TXTIME;
+
+		return 0;
+	}
 	default:
 		break;
 	}
@@ -1249,6 +1286,18 @@ static int xsk_getsockopt(struct socket *sock, int level, int optname,
 
 		return 0;
 	}
+	case SO_TXTIME:
+	{
+		u32 option = 0;
+
+		if (xs->so_txtime_mask & XDP_SOL_OPTION_SO_TXTIME)
+			option = 1;
+
+		if (copy_to_user(optval, &option, sizeof(option)))
+			return -EFAULT;
+
+		return 0;
+	}
 	default:
 		break;
 	}
diff --git a/net/xdp/xsk.h b/net/xdp/xsk.h
index a4bc4749faac..9c8686185d68 100644
--- a/net/xdp/xsk.h
+++ b/net/xdp/xsk.h
@@ -11,6 +11,8 @@
 #define XSK_NEXT_PG_CONTIG_SHIFT 0
 #define XSK_NEXT_PG_CONTIG_MASK BIT_ULL(XSK_NEXT_PG_CONTIG_SHIFT)
 
+#define XDP_SOL_OPTION_SO_TXTIME (1 << 31)
+
 struct xdp_ring_offset_v1 {
 	__u64 producer;
 	__u64 consumer;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 8de01aaac4a0..58bf877cc5cf 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -3,6 +3,7 @@
 #include <net/xsk_buff_pool.h>
 #include <net/xdp_sock.h>
 #include <net/xdp_sock_drv.h>
+#include <linux/xdp_md_std.h>
 
 #include "xsk_queue.h"
 #include "xdp_umem.h"
@@ -522,6 +523,28 @@ void xp_free(struct xdp_buff_xsk *xskb)
 }
 EXPORT_SYMBOL(xp_free);
 
+s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, struct xdp_desc *desc)
+{
+	struct xdp_user_tx_metadata *md;
+
+	if ((desc->addr % pool->chunk_size) <
+	    sizeof(struct xdp_user_tx_metadata))
+		return -1;
+
+	if (!(desc->options & XDP_SOL_OPTION_SO_TXTIME) ||
+	    !(desc->options & XDP_DESC_OPTION_METADATA))
+		return -1;
+
+	md = (xp_raw_get_data(pool, desc->addr) -
+	      sizeof(struct xdp_user_tx_metadata));
+
+	if (!(md->md_valid & XDP_METADATA_USER_TX_TIMESTAMP))
+		return -1;
+
+	return md->timestamp;
+}
+EXPORT_SYMBOL(xp_raw_get_txtime);
+
 void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 {
 	addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 9ae13cccfb28..f7b323374b6d 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -140,7 +140,7 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
 	if (chunk >= pool->addrs_cnt)
 		return false;
 
-	if (desc->options)
+	if (desc->options & ~XDP_DESC_OPTION_METADATA)
 		return false;
 	return true;
 }
@@ -160,7 +160,7 @@ static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool,
 	    xp_desc_crosses_non_contig_pg(pool, addr, desc->len))
 		return false;
 
-	if (desc->options)
+	if (desc->options & ~XDP_DESC_OPTION_METADATA)
 		return false;
 	return true;
 }
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC bpf-next 2/5] libbpf: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
@ 2021-08-03 17:10 ` Kishen Maloor
  2021-08-06 23:08   ` Andrii Nakryiko
  2021-08-18  9:49   ` Jesper Dangaard Brouer
  2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Kishen Maloor

This change adds userspace support for SO_TXTIME in AF_XDP
to include a specific TXTIME (aka "Launch Time")
with XDP frames issued from userspace XDP applications.

The userspace API has been expanded with two helper functons:

- int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
   Sets the SO_TXTIME option on the AF_XDP socket (using setsockopt()).

- void xsk_umem__set_md_txtime(void *umem_area, __u64 chunkAddr,
                               __s64 txtime)
   Packages the application supplied TXTIME into struct xdp_user_tx_metadata:
   struct xdp_user_tx_metadata {
        __u64 timestamp;
        __u32 md_valid;
        __u32 btf_id;
   };
   and stores it in the XDP metadata area, which precedes the XDP frame.

Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
---
 tools/include/uapi/linux/if_xdp.h     |  2 ++
 tools/include/uapi/linux/xdp_md_std.h | 14 ++++++++++++++
 tools/lib/bpf/xsk.h                   | 27 ++++++++++++++++++++++++++-
 3 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 tools/include/uapi/linux/xdp_md_std.h

diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
index a78a8096f4ce..31f81f82ed86 100644
--- a/tools/include/uapi/linux/if_xdp.h
+++ b/tools/include/uapi/linux/if_xdp.h
@@ -106,6 +106,8 @@ struct xdp_desc {
 	__u32 options;
 };
 
+#define XDP_DESC_OPTION_METADATA (1 << 0)
+
 /* UMEM descriptor is __u64 */
 
 #endif /* _LINUX_IF_XDP_H */
diff --git a/tools/include/uapi/linux/xdp_md_std.h b/tools/include/uapi/linux/xdp_md_std.h
new file mode 100644
index 000000000000..f00996a61639
--- /dev/null
+++ b/tools/include/uapi/linux/xdp_md_std.h
@@ -0,0 +1,14 @@
+#ifndef _UAPI_LINUX_XDP_MD_STD_H
+#define _UAPI_LINUX_XDP_MD_STD_H
+
+#include <linux/types.h>
+
+#define XDP_METADATA_USER_TX_TIMESTAMP 0x1
+
+struct xdp_user_tx_metadata {
+	__u64 timestamp;
+	__u32 md_valid;
+	__u32 btf_id;
+};
+
+#endif /* _UAPI_LINUX_XDP_MD_STD_H */
diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
index 01c12dca9c10..1b52ffe1c9a3 100644
--- a/tools/lib/bpf/xsk.h
+++ b/tools/lib/bpf/xsk.h
@@ -16,7 +16,8 @@
 #include <stdint.h>
 #include <stdbool.h>
 #include <linux/if_xdp.h>
-
+#include <linux/xdp_md_std.h>
+#include <errno.h>
 #include "libbpf.h"
 
 #ifdef __cplusplus
@@ -248,6 +249,30 @@ static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
 LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
 LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
 
+/* Helpers for SO_TXTIME */
+
+static inline void xsk_umem__set_md_txtime(void *umem_area, __u64 addr, __s64 txtime)
+{
+	struct xdp_user_tx_metadata *md;
+
+	md = (struct xdp_user_tx_metadata *)&((char *)umem_area)[addr];
+
+	md->timestamp = txtime;
+	md->md_valid |= XDP_METADATA_USER_TX_TIMESTAMP;
+}
+
+static inline int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
+{
+	unsigned int val = (enable) ? 1 : 0;
+	int err;
+
+	err = setsockopt(xsk_socket__fd(xsk), SOL_XDP, SO_TXTIME, &val, sizeof(val));
+
+	if (err)
+		return -errno;
+	return 0;
+}
+
 #define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
 #define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
 #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path
  2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 2/5] libbpf: " Kishen Maloor
@ 2021-08-03 17:10 ` Kishen Maloor
  2021-08-03 21:41   ` kernel test robot
                     ` (2 more replies)
  2021-08-03 17:10 ` [RFC bpf-next 4/5] samples/bpf/xdpsock_user.c: Make get_nsecs() generic Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage Kishen Maloor
  4 siblings, 3 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Jithu Joseph

From: Jithu Joseph <jithu.joseph@intel.com>

Adds Launchtime support in the XDP Zero copy Tx path.

If launchtime is enabled for the ring handling the
packet and if the launch-time data is present, programs
a context descriptor with the launchtime so that NIC
can transmit the packet at the appropriate time.

As for the non-ZC path, txtime info is copied into
skb->tstamp in the XDP layer and the existing Tx path
in the driver has support for honoring that.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 49 +++++++++++++++++++++--
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index b7aab35c1132..336cf6bc0a71 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2582,6 +2582,35 @@ static void igc_update_tx_stats(struct igc_q_vector *q_vector,
 	q_vector->tx.total_packets += packets;
 }
 
+static void igc_launchtm_ctxtdesc(struct igc_ring *tx_ring,
+				  ktime_t txtime)
+{
+	struct igc_adapter *adapter = netdev_priv(tx_ring->netdev);
+	struct igc_adv_tx_context_desc *context_desc;
+	u16 i = tx_ring->next_to_use;
+	u32 mss_l4len_idx = 0;
+	u32 type_tucmd = 0;
+
+	context_desc = IGC_TX_CTXTDESC(tx_ring, i);
+
+	i++;
+	tx_ring->next_to_use = (i < tx_ring->count) ? i : 0;
+
+	/* set bits to identify this as an advanced context descriptor */
+	type_tucmd |= IGC_TXD_CMD_DEXT | IGC_ADVTXD_DTYP_CTXT;
+
+	/* For i225, context index must be unique per ring. */
+	if (test_bit(IGC_RING_FLAG_TX_CTX_IDX, &tx_ring->flags))
+		mss_l4len_idx |= tx_ring->reg_idx << 4;
+
+	context_desc->vlan_macip_lens	= 0;
+	context_desc->type_tucmd_mlhl	= cpu_to_le32(type_tucmd);
+	context_desc->mss_l4len_idx	= cpu_to_le32(mss_l4len_idx);
+
+	context_desc->launch_time = igc_tx_launchtime(adapter,
+						      txtime);
+}
+
 static void igc_xdp_xmit_zc(struct igc_ring *ring)
 {
 	struct xsk_buff_pool *pool = ring->xsk_pool;
@@ -2599,11 +2628,25 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 
 	budget = igc_desc_unused(ring);
 
-	while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
+	while (xsk_tx_peek_desc(pool, &xdp_desc) && budget > 2) {
 		u32 cmd_type, olinfo_status;
 		struct igc_tx_buffer *bi;
+		s64 launch_tm = 0;
 		dma_addr_t dma;
 
+		bi = &ring->tx_buffer_info[ring->next_to_use];
+
+		if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {
+			launch_tm = xsk_buff_get_txtime(pool, &xdp_desc);
+			if (launch_tm != -1) {
+				budget--;
+				igc_launchtm_ctxtdesc(ring, (ktime_t)launch_tm);
+			}
+		}
+
+		/* re-read ntu as igc_launchtm_ctxtdesc() updates it */
+		ntu = ring->next_to_use;
+
 		cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
 			   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
 			   xdp_desc.len;
@@ -2612,12 +2655,12 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 		dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
 		xsk_buff_raw_dma_sync_for_device(pool, dma, xdp_desc.len);
 
+		budget--;
 		tx_desc = IGC_TX_DESC(ring, ntu);
 		tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type);
 		tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
 		tx_desc->read.buffer_addr = cpu_to_le64(dma);
 
-		bi = &ring->tx_buffer_info[ntu];
 		bi->type = IGC_TX_BUFFER_TYPE_XSK;
 		bi->protocol = 0;
 		bi->bytecount = xdp_desc.len;
@@ -2630,9 +2673,9 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 		ntu++;
 		if (ntu == ring->count)
 			ntu = 0;
+		ring->next_to_use = ntu;
 	}
 
-	ring->next_to_use = ntu;
 	if (tx_desc) {
 		igc_flush_tx_descriptors(ring);
 		xsk_tx_release(pool);
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC bpf-next 4/5] samples/bpf/xdpsock_user.c: Make get_nsecs() generic
  2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
                   ` (2 preceding siblings ...)
  2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
@ 2021-08-03 17:10 ` Kishen Maloor
  2021-08-03 17:10 ` [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage Kishen Maloor
  4 siblings, 0 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Jithu Joseph

From: Jithu Joseph <jithu.joseph@intel.com>

The helper function get_nsecs() assumes clock to be CLOCK_MONOTONIC.

TSN features like Launchtime uses CLOCK_TAI. Subsequent patch
extends this sample to show how Launchtime APIs maybe used
in XDP context.

In prepration for this, extend the function to add CLOCKID parameter.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
---
 samples/bpf/xdpsock_user.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 33d0bdebbed8..3fd2f6a0d1eb 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -159,11 +159,11 @@ static int num_socks;
 struct xsk_socket_info *xsks[MAX_SOCKS];
 int sock;
 
-static unsigned long get_nsecs(void)
+static unsigned long get_nsecs(int clockid)
 {
 	struct timespec ts;
 
-	clock_gettime(CLOCK_MONOTONIC, &ts);
+	clock_gettime(clockid, &ts);
 	return ts.tv_sec * 1000000000UL + ts.tv_nsec;
 }
 
@@ -354,7 +354,7 @@ static void dump_driver_stats(long dt)
 
 static void dump_stats(void)
 {
-	unsigned long now = get_nsecs();
+	unsigned long now = get_nsecs(CLOCK_MONOTONIC);
 	long dt = now - prev_time;
 	int i;
 
@@ -443,7 +443,7 @@ static void dump_stats(void)
 static bool is_benchmark_done(void)
 {
 	if (opt_duration > 0) {
-		unsigned long dt = (get_nsecs() - start_time);
+		unsigned long dt = (get_nsecs(CLOCK_MONOTONIC) - start_time);
 
 		if (dt >= opt_duration)
 			benchmark_done = true;
@@ -1683,7 +1683,7 @@ int main(int argc, char **argv)
 			exit_with_error(ret);
 	}
 
-	prev_time = get_nsecs();
+	prev_time = get_nsecs(CLOCK_MONOTONIC);
 	start_time = prev_time;
 
 	if (opt_bench == BENCH_RXDROP)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage
  2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
                   ` (3 preceding siblings ...)
  2021-08-03 17:10 ` [RFC bpf-next 4/5] samples/bpf/xdpsock_user.c: Make get_nsecs() generic Kishen Maloor
@ 2021-08-03 17:10 ` Kishen Maloor
  2021-08-18  8:54   ` Jesper Dangaard Brouer
  4 siblings, 1 reply; 16+ messages in thread
From: Kishen Maloor @ 2021-08-03 17:10 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Jithu Joseph

From: Jithu Joseph <jithu.joseph@intel.com>

Adds -L flag to the xdpsock commandline options. Using this
in conjuction with "-t txonly" option exercises the
Launchtime/Txtime APIs. These allows the application to specify
when each packet should be transmitted by the NIC.

Below is an example of how this option may be exercised:

sudo xdpsock -i enp1s0  -t -q 1 -N -s 128 -z -L

The above invocation would transmit  "batch_size" packets each spaced
1us apart. The first packet in the batch is to be launched
"LAUNCH_TIME_ADVANCE_NS" into the future and the remaining packets
in the batch should be spaced 1us apart.

Since launch-time enabled NICs would wait LAUNCH_TIME_ADVANCE_NS(500us)
between batches of packets, the emphasis of this path is not throughput.

Launch-time should be enabled for the chosen hardware queue using the
appropriate tc qdisc command before starting the application and
also NIC hardware clock should be synchronized to the system clock
using a mechanism like phc2sys.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
---
 samples/bpf/xdpsock_user.c | 64 +++++++++++++++++++++++++++++++++++---
 1 file changed, 60 insertions(+), 4 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 3fd2f6a0d1eb..a0fd3d5414ba 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -55,6 +55,9 @@
 
 #define DEBUG_HEXDUMP 0
 
+#define LAUNCH_TIME_ADVANCE_NS (500000)
+#define CLOCK_SYNC_DELAY (60)
+
 typedef __u64 u64;
 typedef __u32 u32;
 typedef __u16 u16;
@@ -99,6 +102,7 @@ static u32 opt_num_xsks = 1;
 static u32 prog_id;
 static bool opt_busy_poll;
 static bool opt_reduced_cap;
+static bool opt_launch_time;
 
 struct xsk_ring_stats {
 	unsigned long rx_npkts;
@@ -741,6 +745,8 @@ static inline u16 udp_csum(u32 saddr, u32 daddr, u32 len,
 
 #define ETH_FCS_SIZE 4
 
+#define MD_SIZE (sizeof(struct xdp_user_tx_metadata))
+
 #define PKT_HDR_SIZE (sizeof(struct ethhdr) + sizeof(struct iphdr) + \
 		      sizeof(struct udphdr))
 
@@ -798,8 +804,10 @@ static void gen_eth_hdr_data(void)
 
 static void gen_eth_frame(struct xsk_umem_info *umem, u64 addr)
 {
-	memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data,
-	       PKT_SIZE);
+	if (opt_launch_time)
+		memcpy(xsk_umem__get_data(umem->buffer, addr) + MD_SIZE, pkt_data, PKT_SIZE);
+	else
+		memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data, PKT_SIZE);
 }
 
 static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
@@ -927,6 +935,7 @@ static struct option long_options[] = {
 	{"irq-string", no_argument, 0, 'I'},
 	{"busy-poll", no_argument, 0, 'B'},
 	{"reduce-cap", no_argument, 0, 'R'},
+	{"launch-time", no_argument, 0, 'L'},
 	{0, 0, 0, 0}
 };
 
@@ -967,6 +976,7 @@ static void usage(const char *prog)
 		"  -I, --irq-string	Display driver interrupt statistics for interface associated with irq-string.\n"
 		"  -B, --busy-poll      Busy poll.\n"
 		"  -R, --reduce-cap	Use reduced capabilities (cannot be used with -M)\n"
+		"  -L, --launch-time	Toy example of launchtime using XDP\n"
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE,
 		opt_batch_size, MIN_PKT_SIZE, MIN_PKT_SIZE,
@@ -982,7 +992,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:BR",
+		c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:BRL",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -1087,6 +1097,9 @@ static void parse_command_line(int argc, char **argv)
 		case 'R':
 			opt_reduced_cap = true;
 			break;
+		case 'L':
+			opt_launch_time = true;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -1272,6 +1285,7 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 {
 	u32 idx;
 	unsigned int i;
+	u64 cur_ts, launch_time;
 
 	while (xsk_ring_prod__reserve(&xsk->tx, batch_size, &idx) <
 				      batch_size) {
@@ -1280,10 +1294,28 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 			return;
 	}
 
+	cur_ts = get_nsecs(CLOCK_TAI);
+
 	for (i = 0; i < batch_size; i++) {
 		struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(&xsk->tx,
 								  idx + i);
-		tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size;
+		if (opt_launch_time) {
+			/*
+			 * Direct the NIC to launch "batch_size" packets each spaced 1us apart.
+			 * The below calculation specifies that the first packet in the batch
+			 * is to be launched "LAUNCH_TIME_ADVANCE_NS" into the future and the
+			 * remaining packets in the batch should be spaced 1us apart.
+			 */
+			launch_time = cur_ts + LAUNCH_TIME_ADVANCE_NS + (1000 * i);
+			xsk_umem__set_md_txtime(xsk->umem->buffer,
+						((*frame_nb + i) * opt_xsk_frame_size),
+						launch_time);
+			tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size + MD_SIZE;
+			tx_desc->options = XDP_DESC_OPTION_METADATA;
+		} else {
+			tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size;
+		}
+
 		tx_desc->len = PKT_SIZE;
 	}
 
@@ -1293,6 +1325,16 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 	*frame_nb += batch_size;
 	*frame_nb %= NUM_FRAMES;
 	complete_tx_only(xsk, batch_size);
+	if (opt_launch_time) {
+		/*
+		 * Hold the Tx loop until all the frames from this batch has been
+		 * transmitted by the driver. This also ensures that all packets from
+		 * this batch reach the driver ASAP before the proposed future launchtime
+		 * becomes stale
+		 */
+		while (xsk->outstanding_tx)
+			complete_tx_only(xsk, opt_batch_size);
+	}
 }
 
 static inline int get_batch_size(int pkt_cnt)
@@ -1334,6 +1376,20 @@ static void tx_only_all(void)
 		fds[0].events = POLLOUT;
 	}
 
+	if (opt_launch_time) {
+		/*
+		 * For launch-time to be meaningful, the system clock and NIC h/w clock
+		 * needs to be synchronized. Many NIC driver implementations resets the NIC
+		 * during the bind operation in the XDP initialization flow path.
+		 * The delay below is intended as a best case approach to hold off packet
+		 * transmission till the syncronization is acheived.
+		 */
+		xsk_socket__enable_so_txtime(xsks[0]->xsk, true);
+		printf("Waiting for %ds for the NIC clock to synchronize with the system clock\n",
+		       CLOCK_SYNC_DELAY);
+		sleep(CLOCK_SYNC_DELAY);
+	}
+
 	while ((opt_pkt_count && pkt_cnt < opt_pkt_count) || !opt_pkt_count) {
 		int batch_size = get_batch_size(pkt_cnt);
 
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 1/5] net: xdp: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
@ 2021-08-03 20:05   ` kernel test robot
  2021-08-03 20:56   ` kernel test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-08-03 20:05 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1505 bytes --]

Hi Kishen,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: m68k-defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 10.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6d62df8261a3aa49b8bc987418d478bc0313ef8a
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
        git checkout 6d62df8261a3aa49b8bc987418d478bc0313ef8a
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross O=build_dir ARCH=m68k SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   m68k-linux-ld: net/xdp/xsk_buff_pool.o: in function `xp_raw_get_txtime':
>> xsk_buff_pool.c:(.text+0xce): undefined reference to `__umoddi3'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 17226 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 1/5] net: xdp: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
  2021-08-03 20:05   ` kernel test robot
@ 2021-08-03 20:56   ` kernel test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-08-03 20:56 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1492 bytes --]

Hi Kishen,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 10.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6d62df8261a3aa49b8bc987418d478bc0313ef8a
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
        git checkout 6d62df8261a3aa49b8bc987418d478bc0313ef8a
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross O=build_dir ARCH=m68k SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   m68k-linux-ld: net/xdp/xsk_buff_pool.o: in function `xp_raw_get_txtime':
>> (.text+0xe6): undefined reference to `__umoddi3'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 60696 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path
  2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
@ 2021-08-03 21:41   ` kernel test robot
  2021-08-04  4:15   ` kernel test robot
  2021-08-05 17:53   ` Kishen Maloor
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-08-03 21:41 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5139 bytes --]

Hi Kishen,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-a011-20210803 (attached as .config)
compiler: gcc-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/a631a80055e4e1c72563d33445336f091c7053e3
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
        git checkout a631a80055e4e1c72563d33445336f091c7053e3
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/intel/igc/igc_main.c: In function 'igc_xdp_xmit_zc':
   drivers/net/ethernet/intel/igc/igc_main.c:2639:54: error: 'XDP_DESC_OPTION_SO_TXTIME' undeclared (first use in this function); did you mean 'XDP_DESC_OPTION_METADATA'?
    2639 |   if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {
         |                                                      ^~~~~~~~~~~~~~~~~~~~~~~~~
         |                                                      XDP_DESC_OPTION_METADATA
   drivers/net/ethernet/intel/igc/igc_main.c:2639:54: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/net/ethernet/intel/igc/igc_main.c:2640:42: warning: passing argument 2 of 'xsk_buff_get_txtime' makes integer from pointer without a cast [-Wint-conversion]
    2640 |    launch_tm = xsk_buff_get_txtime(pool, &xdp_desc);
         |                                          ^~~~~~~~~
         |                                          |
         |                                          struct xdp_desc *
   In file included from drivers/net/ethernet/intel/igc/igc_main.c:14:
   include/net/xdp_sock_drv.h:240:71: note: expected 'u64' {aka 'long long unsigned int'} but argument is of type 'struct xdp_desc *'
     240 | static inline s64 xsk_buff_get_txtime(struct xsk_buff_pool *pool, u64 addr)
         |                                                                   ~~~~^~~~


vim +/xsk_buff_get_txtime +2640 drivers/net/ethernet/intel/igc/igc_main.c

  2613	
  2614	static void igc_xdp_xmit_zc(struct igc_ring *ring)
  2615	{
  2616		struct xsk_buff_pool *pool = ring->xsk_pool;
  2617		struct netdev_queue *nq = txring_txq(ring);
  2618		union igc_adv_tx_desc *tx_desc = NULL;
  2619		int cpu = smp_processor_id();
  2620		u16 ntu = ring->next_to_use;
  2621		struct xdp_desc xdp_desc;
  2622		u16 budget;
  2623	
  2624		if (!netif_carrier_ok(ring->netdev))
  2625			return;
  2626	
  2627		__netif_tx_lock(nq, cpu);
  2628	
  2629		budget = igc_desc_unused(ring);
  2630	
  2631		while (xsk_tx_peek_desc(pool, &xdp_desc) && budget > 2) {
  2632			u32 cmd_type, olinfo_status;
  2633			struct igc_tx_buffer *bi;
  2634			s64 launch_tm = 0;
  2635			dma_addr_t dma;
  2636	
  2637			bi = &ring->tx_buffer_info[ring->next_to_use];
  2638	
> 2639			if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {
> 2640				launch_tm = xsk_buff_get_txtime(pool, &xdp_desc);
  2641				if (launch_tm != -1) {
  2642					budget--;
  2643					igc_launchtm_ctxtdesc(ring, (ktime_t)launch_tm);
  2644				}
  2645			}
  2646	
  2647			/* re-read ntu as igc_launchtm_ctxtdesc() updates it */
  2648			ntu = ring->next_to_use;
  2649	
  2650			cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
  2651				   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
  2652				   xdp_desc.len;
  2653			olinfo_status = xdp_desc.len << IGC_ADVTXD_PAYLEN_SHIFT;
  2654	
  2655			dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
  2656			xsk_buff_raw_dma_sync_for_device(pool, dma, xdp_desc.len);
  2657	
  2658			budget--;
  2659			tx_desc = IGC_TX_DESC(ring, ntu);
  2660			tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type);
  2661			tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
  2662			tx_desc->read.buffer_addr = cpu_to_le64(dma);
  2663	
  2664			bi->type = IGC_TX_BUFFER_TYPE_XSK;
  2665			bi->protocol = 0;
  2666			bi->bytecount = xdp_desc.len;
  2667			bi->gso_segs = 1;
  2668			bi->time_stamp = jiffies;
  2669			bi->next_to_watch = tx_desc;
  2670	
  2671			netdev_tx_sent_queue(txring_txq(ring), xdp_desc.len);
  2672	
  2673			ntu++;
  2674			if (ntu == ring->count)
  2675				ntu = 0;
  2676			ring->next_to_use = ntu;
  2677		}
  2678	
  2679		if (tx_desc) {
  2680			igc_flush_tx_descriptors(ring);
  2681			xsk_tx_release(pool);
  2682		}
  2683	
  2684		__netif_tx_unlock(nq);
  2685	}
  2686	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 41843 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path
  2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
  2021-08-03 21:41   ` kernel test robot
@ 2021-08-04  4:15   ` kernel test robot
  2021-08-05 17:53   ` Kishen Maloor
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-08-04  4:15 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4525 bytes --]

Hi Kishen,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: arc-allyesconfig (attached as .config)
compiler: arceb-elf-gcc (GCC) 10.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/a631a80055e4e1c72563d33445336f091c7053e3
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Kishen-Maloor/SO_TXTIME-support-in-AF_XDP/20210804-011214
        git checkout a631a80055e4e1c72563d33445336f091c7053e3
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross O=build_dir ARCH=arc SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/igc/igc_main.c: In function 'igc_xdp_xmit_zc':
>> drivers/net/ethernet/intel/igc/igc_main.c:2639:54: error: 'XDP_DESC_OPTION_SO_TXTIME' undeclared (first use in this function); did you mean 'XDP_DESC_OPTION_METADATA'?
    2639 |   if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {
         |                                                      ^~~~~~~~~~~~~~~~~~~~~~~~~
         |                                                      XDP_DESC_OPTION_METADATA
   drivers/net/ethernet/intel/igc/igc_main.c:2639:54: note: each undeclared identifier is reported only once for each function it appears in


vim +2639 drivers/net/ethernet/intel/igc/igc_main.c

  2613	
  2614	static void igc_xdp_xmit_zc(struct igc_ring *ring)
  2615	{
  2616		struct xsk_buff_pool *pool = ring->xsk_pool;
  2617		struct netdev_queue *nq = txring_txq(ring);
  2618		union igc_adv_tx_desc *tx_desc = NULL;
  2619		int cpu = smp_processor_id();
  2620		u16 ntu = ring->next_to_use;
  2621		struct xdp_desc xdp_desc;
  2622		u16 budget;
  2623	
  2624		if (!netif_carrier_ok(ring->netdev))
  2625			return;
  2626	
  2627		__netif_tx_lock(nq, cpu);
  2628	
  2629		budget = igc_desc_unused(ring);
  2630	
  2631		while (xsk_tx_peek_desc(pool, &xdp_desc) && budget > 2) {
  2632			u32 cmd_type, olinfo_status;
  2633			struct igc_tx_buffer *bi;
  2634			s64 launch_tm = 0;
  2635			dma_addr_t dma;
  2636	
  2637			bi = &ring->tx_buffer_info[ring->next_to_use];
  2638	
> 2639			if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {
  2640				launch_tm = xsk_buff_get_txtime(pool, &xdp_desc);
  2641				if (launch_tm != -1) {
  2642					budget--;
  2643					igc_launchtm_ctxtdesc(ring, (ktime_t)launch_tm);
  2644				}
  2645			}
  2646	
  2647			/* re-read ntu as igc_launchtm_ctxtdesc() updates it */
  2648			ntu = ring->next_to_use;
  2649	
  2650			cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
  2651				   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
  2652				   xdp_desc.len;
  2653			olinfo_status = xdp_desc.len << IGC_ADVTXD_PAYLEN_SHIFT;
  2654	
  2655			dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
  2656			xsk_buff_raw_dma_sync_for_device(pool, dma, xdp_desc.len);
  2657	
  2658			budget--;
  2659			tx_desc = IGC_TX_DESC(ring, ntu);
  2660			tx_desc->read.cmd_type_len = cpu_to_le32(cmd_type);
  2661			tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
  2662			tx_desc->read.buffer_addr = cpu_to_le64(dma);
  2663	
  2664			bi->type = IGC_TX_BUFFER_TYPE_XSK;
  2665			bi->protocol = 0;
  2666			bi->bytecount = xdp_desc.len;
  2667			bi->gso_segs = 1;
  2668			bi->time_stamp = jiffies;
  2669			bi->next_to_watch = tx_desc;
  2670	
  2671			netdev_tx_sent_queue(txring_txq(ring), xdp_desc.len);
  2672	
  2673			ntu++;
  2674			if (ntu == ring->count)
  2675				ntu = 0;
  2676			ring->next_to_use = ntu;
  2677		}
  2678	
  2679		if (tx_desc) {
  2680			igc_flush_tx_descriptors(ring);
  2681			xsk_tx_release(pool);
  2682		}
  2683	
  2684		__netif_tx_unlock(nq);
  2685	}
  2686	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 68419 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path
  2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
  2021-08-03 21:41   ` kernel test robot
  2021-08-04  4:15   ` kernel test robot
@ 2021-08-05 17:53   ` Kishen Maloor
  2 siblings, 0 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-05 17:53 UTC (permalink / raw)
  To: bpf, netdev, hawk, magnus.karlsson; +Cc: Jithu Joseph

On 8/3/21 1:10 PM, Kishen Maloor wrote:
> +		if (ring->launchtime_enable && (xdp_desc.options & XDP_DESC_OPTION_SO_TXTIME)) {

The above line in "[RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path"
required a minor correction.
The fixed patch series is at the tip of this source tree: https://github.com/kmaloor/bpfnext

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 2/5] libbpf: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 ` [RFC bpf-next 2/5] libbpf: " Kishen Maloor
@ 2021-08-06 23:08   ` Andrii Nakryiko
  2021-08-18  9:49   ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2021-08-06 23:08 UTC (permalink / raw)
  To: Kishen Maloor
  Cc: bpf, Networking, Jesper Dangaard Brouer, Magnus Karlsson,
	Toke Høiland-Jørgensen

On Tue, Aug 3, 2021 at 10:10 AM Kishen Maloor <kishen.maloor@intel.com> wrote:
>
> This change adds userspace support for SO_TXTIME in AF_XDP
> to include a specific TXTIME (aka "Launch Time")
> with XDP frames issued from userspace XDP applications.
>
> The userspace API has been expanded with two helper functons:
>
> - int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
>    Sets the SO_TXTIME option on the AF_XDP socket (using setsockopt()).
>
> - void xsk_umem__set_md_txtime(void *umem_area, __u64 chunkAddr,
>                                __s64 txtime)
>    Packages the application supplied TXTIME into struct xdp_user_tx_metadata:
>    struct xdp_user_tx_metadata {
>         __u64 timestamp;
>         __u32 md_valid;
>         __u32 btf_id;
>    };
>    and stores it in the XDP metadata area, which precedes the XDP frame.
>
> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
> ---

Same comments as in [0] regarding the AF_XDP APIs in libbpf.

  [0] https://lore.kernel.org/bpf/CAEf4BzZ44wc-+r6o7vthddt5BoePdg0cQn83g8qkyPMAca4vvA@mail.gmail.com/

>  tools/include/uapi/linux/if_xdp.h     |  2 ++
>  tools/include/uapi/linux/xdp_md_std.h | 14 ++++++++++++++
>  tools/lib/bpf/xsk.h                   | 27 ++++++++++++++++++++++++++-
>  3 files changed, 42 insertions(+), 1 deletion(-)
>  create mode 100644 tools/include/uapi/linux/xdp_md_std.h
>

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage
  2021-08-03 17:10 ` [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage Kishen Maloor
@ 2021-08-18  8:54   ` Jesper Dangaard Brouer
  2021-08-19 19:32     ` Kishen Maloor
  0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2021-08-18  8:54 UTC (permalink / raw)
  To: Kishen Maloor, bpf, netdev, hawk, magnus.karlsson, Björn Töpel
  Cc: brouer, Jithu Joseph


On 03/08/2021 19.10, Kishen Maloor wrote:
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index 3fd2f6a0d1eb..a0fd3d5414ba 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
[...]
> @@ -741,6 +745,8 @@ static inline u16 udp_csum(u32 saddr, u32 daddr, u32 len,
>   
>   #define ETH_FCS_SIZE 4
>   
> +#define MD_SIZE (sizeof(struct xdp_user_tx_metadata))
> +
>   #define PKT_HDR_SIZE (sizeof(struct ethhdr) + sizeof(struct iphdr) + \
>   		      sizeof(struct udphdr))
>   
> @@ -798,8 +804,10 @@ static void gen_eth_hdr_data(void)
>   
>   static void gen_eth_frame(struct xsk_umem_info *umem, u64 addr)
>   {
> -	memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data,
> -	       PKT_SIZE);
> +	if (opt_launch_time)
> +		memcpy(xsk_umem__get_data(umem->buffer, addr) + MD_SIZE, pkt_data, PKT_SIZE);
> +	else
> +		memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data, PKT_SIZE);
>   }
>   

I imagined that AF_XDP 'addr' would still point to the start of the 
packet data, and that metadata area was access via a negative offset 
from 'addr'.

Maybe I misunderstood the code, but it looks like 'addr' 
(xsk_umem__get_data(umem->buffer, addr)) points to metadata area, is 
this correct?

(and to skip this the code does + MD_SIZE, before memcpy)

One problem/challenge with AF_XDP is that we don't have room in struct 
xdp_desc to store info on the size of the metadata area.  Bjørn came up 
with the idea of having btf_id as last member (access able via minus 4 
bytes), as this tells the kernel the size of metadata area.

Maybe you have come up with a better solution?
(of making the metadata area size dynamic)

--Jesper


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 2/5] libbpf: SO_TXTIME support in AF_XDP
  2021-08-03 17:10 ` [RFC bpf-next 2/5] libbpf: " Kishen Maloor
  2021-08-06 23:08   ` Andrii Nakryiko
@ 2021-08-18  9:49   ` Jesper Dangaard Brouer
  2021-08-19 19:32     ` Kishen Maloor
  1 sibling, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2021-08-18  9:49 UTC (permalink / raw)
  To: Kishen Maloor, bpf, netdev, hawk, magnus.karlsson,
	Björn Töpel, Andrii Nakryiko, Jithu Joseph
  Cc: brouer



On 03/08/2021 19.10, Kishen Maloor wrote:
> This change adds userspace support for SO_TXTIME in AF_XDP
> to include a specific TXTIME (aka "Launch Time")
> with XDP frames issued from userspace XDP applications.
> 
> The userspace API has been expanded with two helper functons:
> 
> - int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
>     Sets the SO_TXTIME option on the AF_XDP socket (using setsockopt()).
> 
> - void xsk_umem__set_md_txtime(void *umem_area, __u64 chunkAddr,
>                                 __s64 txtime)
>     Packages the application supplied TXTIME into struct xdp_user_tx_metadata:
>     struct xdp_user_tx_metadata {

Struct name is important and becomes UAPI. I'm not 100% convinced this 
is a good name.

For BPF programs libbpf can at load-time lookup the 'btf_id' via:

   btf_id = bpf_core_type_id_kernel(struct xdp_user_tx_metadata);

Example see[1]
  [1] https://github.com/xdp-project/bpf-examples/commit/2390b4b11079

I know this is AF_XDP userspace, but I hope Andrii can help guide us 
howto expose the bpf_core_type_id_kernel() API via libbpf, to be used by 
the AF_XDP userspace program.


>          __u64 timestamp;
>          __u32 md_valid;
>          __u32 btf_id;
>     };

I assume this struct is intended to be BTF "described".

Struct member *names* are very important for BTF. (E.g. see how 
'spinlock' have special meaning and is matched internally by kernel).

The member name 'timestamp' seems too generic.  This is a very specific 
'LaunchTime' feature, which could be reflected in the name.

Later it looks like you are encoding the "type" in md_valid, which I 
guess it is needed as timestamps can have different "types".
E.g. some of the clockid_t types from clock_gettime(2):
  CLOCK_REALTIME
  CLOCK_TAI
  CLOCK_MONOTONIC
  CLOCK_BOOTTIME

Which of these timestamp does XDP_METADATA_USER_TX_TIMESTAMP represent?
Or what timestamp type is the expected one?

In principle we could name the member 'Launch_Time_CLOCK_TAI' to encoded 
the clockid_t type in the name, but I think that would be too much (and 
require too advanced BTF helpers to extract type, having a clock_type 
member is easier to understand/consume from C).


>     and stores it in the XDP metadata area, which precedes the XDP frame.
> 
> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
> ---
>   tools/include/uapi/linux/if_xdp.h     |  2 ++
>   tools/include/uapi/linux/xdp_md_std.h | 14 ++++++++++++++
>   tools/lib/bpf/xsk.h                   | 27 ++++++++++++++++++++++++++-
>   3 files changed, 42 insertions(+), 1 deletion(-)
>   create mode 100644 tools/include/uapi/linux/xdp_md_std.h
> 
> diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
> index a78a8096f4ce..31f81f82ed86 100644
> --- a/tools/include/uapi/linux/if_xdp.h
> +++ b/tools/include/uapi/linux/if_xdp.h
> @@ -106,6 +106,8 @@ struct xdp_desc {
>   	__u32 options;
>   };
>   
> +#define XDP_DESC_OPTION_METADATA (1 << 0)
> +
>   /* UMEM descriptor is __u64 */
>   
>   #endif /* _LINUX_IF_XDP_H */
> diff --git a/tools/include/uapi/linux/xdp_md_std.h b/tools/include/uapi/linux/xdp_md_std.h
> new file mode 100644
> index 000000000000..f00996a61639
> --- /dev/null
> +++ b/tools/include/uapi/linux/xdp_md_std.h
> @@ -0,0 +1,14 @@
> +#ifndef _UAPI_LINUX_XDP_MD_STD_H
> +#define _UAPI_LINUX_XDP_MD_STD_H
> +
> +#include <linux/types.h>
> +
> +#define XDP_METADATA_USER_TX_TIMESTAMP 0x1
> +
> +struct xdp_user_tx_metadata {
> +	__u64 timestamp;
> +	__u32 md_valid;
> +	__u32 btf_id;
> +};
> +
> +#endif /* _UAPI_LINUX_XDP_MD_STD_H */
> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> index 01c12dca9c10..1b52ffe1c9a3 100644
> --- a/tools/lib/bpf/xsk.h
> +++ b/tools/lib/bpf/xsk.h
> @@ -16,7 +16,8 @@
>   #include <stdint.h>
>   #include <stdbool.h>
>   #include <linux/if_xdp.h>
> -
> +#include <linux/xdp_md_std.h>
> +#include <errno.h>
>   #include "libbpf.h"
>   
>   #ifdef __cplusplus
> @@ -248,6 +249,30 @@ static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
>   LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
>   LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
>   
> +/* Helpers for SO_TXTIME */
> +
> +static inline void xsk_umem__set_md_txtime(void *umem_area, __u64 addr, __s64 txtime)
> +{
> +	struct xdp_user_tx_metadata *md;
> +
> +	md = (struct xdp_user_tx_metadata *)&((char *)umem_area)[addr];
> +
> +	md->timestamp = txtime;
> +	md->md_valid |= XDP_METADATA_USER_TX_TIMESTAMP;

Is this encoding the "type" of the timestamp?

I don't see the btf_id being updated.  Does that happen in another patch?

As I note above we are current;y lacking an libbpf equivalent 
bpf_core_type_id_kernel() lookup function in userspace.

> +}
> +
> +static inline int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
> +{
> +	unsigned int val = (enable) ? 1 : 0;
> +	int err;
> +
> +	err = setsockopt(xsk_socket__fd(xsk), SOL_XDP, SO_TXTIME, &val, sizeof(val));
> +
> +	if (err)
> +		return -errno;
> +	return 0;
> +}
> +
>   #define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
>   #define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
>   #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage
  2021-08-18  8:54   ` Jesper Dangaard Brouer
@ 2021-08-19 19:32     ` Kishen Maloor
  0 siblings, 0 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-19 19:32 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf, netdev, hawk, magnus.karlsson,
	Björn Töpel
  Cc: brouer, Jithu Joseph

On 8/18/21 4:54 AM, Jesper Dangaard Brouer wrote:
> 
> On 03/08/2021 19.10, Kishen Maloor wrote:
>> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
>> index 3fd2f6a0d1eb..a0fd3d5414ba 100644
>> --- a/samples/bpf/xdpsock_user.c
>> +++ b/samples/bpf/xdpsock_user.c
> [...]
>> @@ -741,6 +745,8 @@ static inline u16 udp_csum(u32 saddr, u32 daddr, u32 len,
>>     #define ETH_FCS_SIZE 4
>>   +#define MD_SIZE (sizeof(struct xdp_user_tx_metadata))
>> +
>>   #define PKT_HDR_SIZE (sizeof(struct ethhdr) + sizeof(struct iphdr) + \
>>                 sizeof(struct udphdr))
>>   @@ -798,8 +804,10 @@ static void gen_eth_hdr_data(void)
>>     static void gen_eth_frame(struct xsk_umem_info *umem, u64 addr)
>>   {
>> -    memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data,
>> -           PKT_SIZE);
>> +    if (opt_launch_time)
>> +        memcpy(xsk_umem__get_data(umem->buffer, addr) + MD_SIZE, pkt_data, PKT_SIZE);
>> +    else
>> +        memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data, PKT_SIZE);
>>   }
>>   
> 
> I imagined that AF_XDP 'addr' would still point to the start of the packet data, and that metadata area was access via a negative offset from 'addr'.
> 

There is currently no kernel "infrastructure" on the TX path which factors in the concept of XDP metadata, so the application needs to make place for it. (For e.g., XDP_PACKET_HEADROOM has no de facto role on the TX path AFAIK).

xsk_umem__get_data() just returns the UMEM chunk at the user supplied 'addr' and applications need to write both the XDP packet and any accompanying metadata into this (raw) buffer.

In doing so, it places that metadata right ahead of the XDP packet (much like how that's structured in the RX path), and further plugs (addr + offset_to_XDP_packet (the metadata size, in other words)) into the TX descriptor 'addr' so that lower layers (e.g. the driver) can access the XDP packet as always.
(Note also that the TX descriptor 'len' would exclude the md size)

> Maybe I misunderstood the code, but it looks like 'addr' (xsk_umem__get_data(umem->buffer, addr)) points to metadata area, is this correct?
> 

No, more specifically, it is the user supplied UMEM chunk 'addr'. 

> (and to skip this the code does + MD_SIZE, before memcpy)

Since the packet is written (by the application) to immediately follow the metadata (both stored on the chunk), the code does that (+ MD_SIZE).

> 
> One problem/challenge with AF_XDP is that we don't have room in struct xdp_desc to store info on the size of the metadata area.  Bjørn came up with the idea of having btf_id as last member (access able via minus 4 bytes), as this tells the kernel the size of metadata area.

Yes, the RFC follows this idea and hence btf_id is the last member of struct xdp_user_tx_metadata.

> 
> Maybe you have come up with a better solution?
> (of making the metadata area size dynamic)
> 
> --Jesper
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC bpf-next 2/5] libbpf: SO_TXTIME support in AF_XDP
  2021-08-18  9:49   ` Jesper Dangaard Brouer
@ 2021-08-19 19:32     ` Kishen Maloor
  0 siblings, 0 replies; 16+ messages in thread
From: Kishen Maloor @ 2021-08-19 19:32 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf, netdev, hawk, magnus.karlsson,
	Björn Töpel, Andrii Nakryiko, Jithu Joseph
  Cc: brouer

On 8/18/21 5:49 AM, Jesper Dangaard Brouer wrote:
> 
> 
> On 03/08/2021 19.10, Kishen Maloor wrote:
>> This change adds userspace support for SO_TXTIME in AF_XDP
>> to include a specific TXTIME (aka "Launch Time")
>> with XDP frames issued from userspace XDP applications.
>>
>> The userspace API has been expanded with two helper functons:
>>
>> - int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
>>     Sets the SO_TXTIME option on the AF_XDP socket (using setsockopt()).
>>
>> - void xsk_umem__set_md_txtime(void *umem_area, __u64 chunkAddr,
>>                                 __s64 txtime)
>>     Packages the application supplied TXTIME into struct xdp_user_tx_metadata:
>>     struct xdp_user_tx_metadata {
> 
> Struct name is important and becomes UAPI. I'm not 100% convinced this is a good name.

Sure, we can choose a better name. Open to suggestions.

> 
> For BPF programs libbpf can at load-time lookup the 'btf_id' via:
> 
>   btf_id = bpf_core_type_id_kernel(struct xdp_user_tx_metadata);
> 
> Example see[1]
>  [1] https://github.com/xdp-project/bpf-examples/commit/2390b4b11079
> 
> I know this is AF_XDP userspace, but I hope Andrii can help guide us howto expose the bpf_core_type_id_kernel() API via libbpf, to be used by the AF_XDP userspace program.
> 
> 
>>          __u64 timestamp;
>>          __u32 md_valid;
>>          __u32 btf_id;
>>     };
> 
> I assume this struct is intended to be BTF "described".

Yes, the intention here was to be future-proof in a sense, as once all the related infrastructure/tooling for BTF comes into existence, it should hopefully be a 
simple matter to adapt this code to work with that.

> 
> Struct member *names* are very important for BTF. (E.g. see how 'spinlock' have special meaning and is matched internally by kernel).
> 
> The member name 'timestamp' seems too generic.  This is a very specific 'LaunchTime' feature, which could be reflected in the name.

Yes, LaunchTime is the specific use case addressed by this RFC, so open to suggestions on the struct member name (e.g. launch_time?).

> 
> Later it looks like you are encoding the "type" in md_valid, which I guess it is needed as timestamps can have different "types".

No, the purpose of md_valid is to flag all those md fields that have been set/conveyed by this metadata. So, XDP_METADATA_USER_TX_TIMESTAMP is 
meant as a reference to the 'timestamp' field in struct xdp_user_tx_metadata.

> E.g. some of the clockid_t types from clock_gettime(2):
>  CLOCK_REALTIME
>  CLOCK_TAI
>  CLOCK_MONOTONIC
>  CLOCK_BOOTTIME
> 
> Which of these timestamp does XDP_METADATA_USER_TX_TIMESTAMP represent?
> Or what timestamp type is the expected one?

CLOCK_TAI is what we use for LaunchTime, and supported by IGC/i225.

> 
> In principle we could name the member 'Launch_Time_CLOCK_TAI' to encoded the clockid_t type in the name, but I think that would be too much (and require too advanced BTF helpers to extract type, having a clock_type member is easier to understand/consume from C).
> 
> 
>>     and stores it in the XDP metadata area, which precedes the XDP frame.
>>
>> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
>> ---
>>   tools/include/uapi/linux/if_xdp.h     |  2 ++
>>   tools/include/uapi/linux/xdp_md_std.h | 14 ++++++++++++++
>>   tools/lib/bpf/xsk.h                   | 27 ++++++++++++++++++++++++++-
>>   3 files changed, 42 insertions(+), 1 deletion(-)
>>   create mode 100644 tools/include/uapi/linux/xdp_md_std.h
>>
>> diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
>> index a78a8096f4ce..31f81f82ed86 100644
>> --- a/tools/include/uapi/linux/if_xdp.h
>> +++ b/tools/include/uapi/linux/if_xdp.h
>> @@ -106,6 +106,8 @@ struct xdp_desc {
>>       __u32 options;
>>   };
>>   +#define XDP_DESC_OPTION_METADATA (1 << 0)
>> +
>>   /* UMEM descriptor is __u64 */
>>     #endif /* _LINUX_IF_XDP_H */
>> diff --git a/tools/include/uapi/linux/xdp_md_std.h b/tools/include/uapi/linux/xdp_md_std.h
>> new file mode 100644
>> index 000000000000..f00996a61639
>> --- /dev/null
>> +++ b/tools/include/uapi/linux/xdp_md_std.h
>> @@ -0,0 +1,14 @@
>> +#ifndef _UAPI_LINUX_XDP_MD_STD_H
>> +#define _UAPI_LINUX_XDP_MD_STD_H
>> +
>> +#include <linux/types.h>
>> +
>> +#define XDP_METADATA_USER_TX_TIMESTAMP 0x1
>> +
>> +struct xdp_user_tx_metadata {
>> +    __u64 timestamp;
>> +    __u32 md_valid;
>> +    __u32 btf_id;
>> +};
>> +
>> +#endif /* _UAPI_LINUX_XDP_MD_STD_H */
>> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
>> index 01c12dca9c10..1b52ffe1c9a3 100644
>> --- a/tools/lib/bpf/xsk.h
>> +++ b/tools/lib/bpf/xsk.h
>> @@ -16,7 +16,8 @@
>>   #include <stdint.h>
>>   #include <stdbool.h>
>>   #include <linux/if_xdp.h>
>> -
>> +#include <linux/xdp_md_std.h>
>> +#include <errno.h>
>>   #include "libbpf.h"
>>     #ifdef __cplusplus
>> @@ -248,6 +249,30 @@ static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
>>   LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
>>   LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
>>   +/* Helpers for SO_TXTIME */
>> +
>> +static inline void xsk_umem__set_md_txtime(void *umem_area, __u64 addr, __s64 txtime)
>> +{
>> +    struct xdp_user_tx_metadata *md;
>> +
>> +    md = (struct xdp_user_tx_metadata *)&((char *)umem_area)[addr];
>> +
>> +    md->timestamp = txtime;
>> +    md->md_valid |= XDP_METADATA_USER_TX_TIMESTAMP;
> 
> Is this encoding the "type" of the timestamp?

No, it is hinting at the field (timestamp) being conveyed in this metadata.

> 
> I don't see the btf_id being updated.  Does that happen in another patch?

No. As I understand, BTF support and it's overall applicability (particularly from AF_XDP userspace) is still a WIP. So, btf_id currently exists as a 
placeholder to facilitate future BTF integration.
 
Meanwhile, this RFC re-purposes SO_TXTIME on the control path from AF_XDP userspace to exercise the LaunchTime feature in the presented pattern.

> 
> As I note above we are current;y lacking an libbpf equivalent bpf_core_type_id_kernel() lookup function in userspace.
> 
>> +}
>> +
>> +static inline int xsk_socket__enable_so_txtime(struct xsk_socket *xsk, bool enable)
>> +{
>> +    unsigned int val = (enable) ? 1 : 0;
>> +    int err;
>> +
>> +    err = setsockopt(xsk_socket__fd(xsk), SOL_XDP, SO_TXTIME, &val, sizeof(val));
>> +
>> +    if (err)
>> +        return -errno;
>> +    return 0;
>> +}
>> +
>>   #define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
>>   #define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
>>   #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
>>
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-08-19 19:33 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-03 17:10 [RFC bpf-next 0/5] SO_TXTIME support in AF_XDP Kishen Maloor
2021-08-03 17:10 ` [RFC bpf-next 1/5] net: xdp: " Kishen Maloor
2021-08-03 20:05   ` kernel test robot
2021-08-03 20:56   ` kernel test robot
2021-08-03 17:10 ` [RFC bpf-next 2/5] libbpf: " Kishen Maloor
2021-08-06 23:08   ` Andrii Nakryiko
2021-08-18  9:49   ` Jesper Dangaard Brouer
2021-08-19 19:32     ` Kishen Maloor
2021-08-03 17:10 ` [RFC bpf-next 3/5] igc: Launchtime support in XDP Tx ZC path Kishen Maloor
2021-08-03 21:41   ` kernel test robot
2021-08-04  4:15   ` kernel test robot
2021-08-05 17:53   ` Kishen Maloor
2021-08-03 17:10 ` [RFC bpf-next 4/5] samples/bpf/xdpsock_user.c: Make get_nsecs() generic Kishen Maloor
2021-08-03 17:10 ` [RFC bpf-next 5/5] samples/bpf/xdpsock_user.c: Launchtime/TXTIME API usage Kishen Maloor
2021-08-18  8:54   ` Jesper Dangaard Brouer
2021-08-19 19:32     ` Kishen Maloor

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.