All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: TXTIME (Launch Time) setting per XDP frame from userspace.
       [not found] <00301BEA-4A94-4600-998A-BEDCF3330795@intel.com>
@ 2021-06-21 21:06 ` Jesper Dangaard Brouer
  2021-06-21 23:04   ` Maloor, Kishen
  0 siblings, 1 reply; 2+ messages in thread
From: Jesper Dangaard Brouer @ 2021-06-21 21:06 UTC (permalink / raw)
  To: Maloor, Kishen
  Cc: Zhang, Jessica, Desouza, Ederson, Joseph, Jithu, Gomes, Vinicius,
	brouer, XDP-hints working-group, Magnus Karlsson, BPF-dev-list,
	Walfred 'Fred' Tedeschi, Andrii Nakryiko


(Answer inlined below, please learn howto answer inline)

On Mon, 21 Jun 2021 19:08:54 +0000
"Maloor, Kishen" <kishen.maloor@intel.com> wrote:

> Hi Jesper,
> 
> I wanted to run some patches by you for your comments.

Thanks a lot for reaching out, I really appreciate it.

> Intel has a requirement to support an SO_TXTIME like capability with
> AF_XDP sockets, to set a precise per-packet transmission time from
> user space XDP applications.

This is great!  I also have a customer that have this requirement.
Thus, I'm very interested in collaborating on getting this feature
implemented in practice.


> It is also critical that this support be upstreamed into the Linux
> mainline.

I fully agree "upstream first".  For this to happen, we should discuss
the approach in a public forum as early as possible to get upstream
feedback.  Cc'ing bpf@vger.kernel.org.  Also adding our working group
mailing list as not everybody follow the kernel list Cc'ed
xdp-hints@xdp-project.net (https://lists.xdp-project.net/xdp-hints/).

Multiple of your Intel colleagues (from other departments) have also
reached out to me, and they are also on this xdp-hints mailing list.
It would be great if you can keep this list Cc'ed to keep everybody in
the loop.

Consider subscribing:
 https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/


> The implementation is roughly equivalent to that of SO_TXTIME for
> AF_PACKET. It includes:
>
> - a new XSK bind flag,

So, this feature is enabled for all packets on this AF_XDP socket.

> - libbpf API to set a TXTIME for a frame,
>
> - leverages the concept of UMEM headroom to store the per-packet
>   TXTIME (immediately preceding the XDP buff's data_hard_start),

This sounds like the XDP metadata area?
Or is this top of mem data-area (guess not, as you use +umem_headroom)?

What will happen when an XDP-prog use this same metadata area, and your
feature (patch below) is enabled (via AF_XDP) ?

E.g. can I set the TXTIME in XDP-prog (RX) and XDP_REDIRECT out an
interface with TXTIME enabled?


> - internally signals the NIC driver via the XDP descriptor when
>   there's a TXTIME bound to a frame, and

Looking at patch, if I don't call 'xsk_umem__set_txtime' (libbpf), then
the kernel (xp_raw_get_txtime) will pickup random-data left over by
previous user of the memory, right?

> - provides a kernel API to retrieve it in the driver so it may proceed
>   to set the Launch Time on the NIC.
> 
> It uses existing bitmasks and there's no overhead (i.e. not
> allocating any more memory than is already done), and there are no
> other dependencies.
>
> You may review the commit messages for all details.
> 
> kernel: https://github.com/kmaloor/bpfnext/commit/b72a336bcb8df4a26646a29962ca95d58172147f
> libbpf: https://github.com/kmaloor/libbpf/commit/eda68c1ad11ed60739837ad6e4fb256768231b55
> 
> We're currently testing these changes in conjunction with the igc
> driver.

Can you share the patches for the igc driver as well?

I want to see how you convert the "__s64 txtime" into the 30-bit
LaunchTime used[0] by the hardware (i225).  It would also be good to see
how this is used for driver igb/i210, that have a 25-bit LaunchTime.

Details on driver+hardware here: 
 [0] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org


> Meanwhile, we'd appreciate any comments you may have, particularly as
> an expert with the kernel XDP subsystem.

I think this TXTIME work is just *one* use-case of "xdp-hints".  IMHO we
should use something like BTF to describe the metadata memory area
in-order to support more use-cases.

Being concrete .e.g. your XSK socket bind call could register a BTF-ID
that it tells the kernel about the layout. I guess, we need some
guidance from BTF experts on how kernel can check the driver supports
this LaunchTime feature, Andrii?

Other use-cases: (1) On TX we also want to support asking for TX-csum
by hardware. (2) Support setting VLAN via TX-descriptor (hint TSN can
use VLAN priority bits).

Again thanks for reaching out, lets find a solution together with
upstream.
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Kernel:

 s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, u64 addr)
{
	return *(s64 *)((char *)xp_raw_get_data(pool, addr) - XDP_TXTIME_LEN);
}
EXPORT_SYMBOL(xp_raw_get_txtime);

libbpf:

 static inline void xsk_umem__set_txtime(void *umem_area, __u64 addr, __u32 umem_headroom, __s64 txtime)
{
	*(__s64 *)(&((char *)umem_area)[addr] + umem_headroom - XDP_TXTIME_LEN) = txtime;
}


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: TXTIME (Launch Time) setting per XDP frame from userspace.
  2021-06-21 21:06 ` TXTIME (Launch Time) setting per XDP frame from userspace Jesper Dangaard Brouer
@ 2021-06-21 23:04   ` Maloor, Kishen
  0 siblings, 0 replies; 2+ messages in thread
From: Maloor, Kishen @ 2021-06-21 23:04 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Zhang, Jessica, Desouza, Ederson, Joseph, Jithu, Gomes, Vinicius,
	XDP-hints working-group, Magnus Karlsson, BPF-dev-list, Tedeschi,
	Walfred, Andrii Nakryiko

Really appreciate the quick turnaround! Answers in-line below.
 
-- 
Kishen Maloor
Intel Corporation

On 6/21/21, 5:07 PM, "Jesper Dangaard Brouer" <brouer@redhat.com> wrote:


    (Answer inlined below, please learn howto answer inline)

    On Mon, 21 Jun 2021 19:08:54 +0000
    "Maloor, Kishen" <kishen.maloor@intel.com> wrote:

    > Hi Jesper,
    > 
    > I wanted to run some patches by you for your comments.

    Thanks a lot for reaching out, I really appreciate it.

    > Intel has a requirement to support an SO_TXTIME like capability with
    > AF_XDP sockets, to set a precise per-packet transmission time from
    > user space XDP applications.

    This is great!  I also have a customer that have this requirement.
    Thus, I'm very interested in collaborating on getting this feature
    implemented in practice.

Super!

    > It is also critical that this support be upstreamed into the Linux
    > mainline.

    I fully agree "upstream first".  For this to happen, we should discuss
    the approach in a public forum as early as possible to get upstream
    feedback.  Cc'ing bpf@vger.kernel.org.  Also adding our working group
    mailing list as not everybody follow the kernel list Cc'ed
    xdp-hints@xdp-project.net (https://lists.xdp-project.net/xdp-hints/).

    Multiple of your Intel colleagues (from other departments) have also
    reached out to me, and they are also on this xdp-hints mailing list.
    It would be great if you can keep this list Cc'ed to keep everybody in
    the loop.

    Consider subscribing:
     https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/

Yes, I am aware, although it seems (per my reading) the main focus there is on capturing metadata on the RX path and funneling
that into the XDP program (and possibly higher).
Whereas this change is solely focused on passing down TXTIME (in the TX path obv.) from the XDP application running in userspace.
Said another way, the aim here is to accomplish something equivalent to SO_TXTIME with AF_XDP.

    > The implementation is roughly equivalent to that of SO_TXTIME for
    > AF_PACKET. It includes:
    >
    > - a new XSK bind flag,

    So, this feature is enabled for all packets on this AF_XDP socket.

Yes, in setting  the XDP_TXTIME bind flag while creating the AF_XDP socket, it will apply to all packets sent through that socket.

    > - libbpf API to set a TXTIME for a frame,
    >
    > - leverages the concept of UMEM headroom to store the per-packet
    >   TXTIME (immediately preceding the XDP buff's data_hard_start),

    This sounds like the XDP metadata area?
    Or is this top of mem data-area (guess not, as you use +umem_headroom)?

No. More specifically it is stored in the UMEM headroom area (it's default size per XSK_UMEM__DEFAULT_FRAME_HEADROOM is 0). Currently, the code in net/xdp adjusts for
this headroom (pool->headroom) in one or more places so my understanding is that the entirety of the XDP frame (data_hard_start and beyond, including the metadata
area) follows the UMEM headroom.

    What will happen when an XDP-prog use this same metadata area, and your
    feature (patch below) is enabled (via AF_XDP) ?

So far as I understand, an XDP/BPF program that writes to the defined XDP metadata area will function as expected, i.e. it won't intrude in the UMEM headroom.

    E.g. can I set the TXTIME in XDP-prog (RX) and XDP_REDIRECT out an
    interface with TXTIME enabled?

I believe that's possible. This is kind of why I wanted to loop in experts such as  yourself as you'd have a more complete understanding (than myself) of the XDP code as it 
currently stands :) So, I'd appreciate all the feedback I can get to account for anything that's missed in the patches.

    > - internally signals the NIC driver via the XDP descriptor when
    >   there's a TXTIME bound to a frame, and

    Looking at patch, if I don't call 'xsk_umem__set_txtime' (libbpf), then
    the kernel (xp_raw_get_txtime) will pickup random-data left over by
    previous user of the memory, right?

Yes, though the expectation is that an XDP application that enables XDP_TXTIME on the socket will proceed to pass in a TXTIME with every frame it sends.

    > - provides a kernel API to retrieve it in the driver so it may proceed
    >   to set the Launch Time on the NIC.
    > 
    > It uses existing bitmasks and there's no overhead (i.e. not
    > allocating any more memory than is already done), and there are no
    > other dependencies.
    >
    > You may review the commit messages for all details.
    > 
    > kernel: https://github.com/kmaloor/bpfnext/commit/b72a336bcb8df4a26646a29962ca95d58172147f
    > libbpf: https://github.com/kmaloor/libbpf/commit/eda68c1ad11ed60739837ad6e4fb256768231b55
    > 
    > We're currently testing these changes in conjunction with the igc
    > driver.

    Can you share the patches for the igc driver as well?

That's currently in the works and being tested internally. We should hopefully share it soon.

    I want to see how you convert the "__s64 txtime" into the 30-bit
    LaunchTime used[0] by the hardware (i225).  It would also be good to see
    how this is used for driver igb/i210, that have a 25-bit LaunchTime.

    Details on driver+hardware here: 
     [0] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org


    > Meanwhile, we'd appreciate any comments you may have, particularly as
    > an expert with the kernel XDP subsystem.

    I think this TXTIME work is just *one* use-case of "xdp-hints".  IMHO we
    should use something like BTF to describe the metadata memory area
    in-order to support more use-cases.

    Being concrete .e.g. your XSK socket bind call could register a BTF-ID
    that it tells the kernel about the layout. I guess, we need some
    guidance from BTF experts on how kernel can check the driver supports
    this LaunchTime feature, Andrii?

    Other use-cases: (1) On TX we also want to support asking for TX-csum
    by hardware. (2) Support setting VLAN via TX-descriptor (hint TSN can
    use VLAN priority bits).

Well, to be perfectly clear, the TXTIME in this case is an application supplied actionable quantity, so I don't interpret it as "metadata". 
Whereas the examples you cite above indeed seem like metadata. In this patch, the goal is simply to replicate SO_TXTIME with AF_XDP.
I am aware of ongoing discussions for handling XDP metadata/hints and so far as I understand (correct me if I'm wrong), we're still a ways 
off from a solution that the community will uphold.

    Again thanks for reaching out, lets find a solution together with
    upstream.
    -- 
    Best regards,
      Jesper Dangaard Brouer
      MSc.CS, Principal Kernel Engineer at Red Hat
      LinkedIn: http://www.linkedin.com/in/brouer

    Kernel:

     s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, u64 addr)
    {
    	return *(s64 *)((char *)xp_raw_get_data(pool, addr) - XDP_TXTIME_LEN);
    }
    EXPORT_SYMBOL(xp_raw_get_txtime);

    libbpf:

     static inline void xsk_umem__set_txtime(void *umem_area, __u64 addr, __u32 umem_headroom, __s64 txtime)
    {
    	*(__s64 *)(&((char *)umem_area)[addr] + umem_headroom - XDP_TXTIME_LEN) = txtime;
    }



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-06-21 23:04 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <00301BEA-4A94-4600-998A-BEDCF3330795@intel.com>
2021-06-21 21:06 ` TXTIME (Launch Time) setting per XDP frame from userspace Jesper Dangaard Brouer
2021-06-21 23:04   ` Maloor, Kishen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.