bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Maloor, Kishen" <kishen.maloor@intel.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "Zhang, Jessica" <jessica.zhang@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>,
	"Joseph, Jithu" <jithu.joseph@intel.com>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	XDP-hints working-group <xdp-hints@xdp-project.net>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	BPF-dev-list <bpf@vger.kernel.org>,
	"Tedeschi, Walfred" <walfred.tedeschi@intel.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Subject: Re: TXTIME (Launch Time) setting per XDP frame from userspace.
Date: Mon, 21 Jun 2021 23:04:26 +0000	[thread overview]
Message-ID: <CA2F1BFA-C9D8-4027-BC73-A8AD703EEF76@intel.com> (raw)
In-Reply-To: <20210621230635.3a83851c@carbon>

Really appreciate the quick turnaround! Answers in-line below.
 
-- 
Kishen Maloor
Intel Corporation

On 6/21/21, 5:07 PM, "Jesper Dangaard Brouer" <brouer@redhat.com> wrote:


    (Answer inlined below, please learn howto answer inline)

    On Mon, 21 Jun 2021 19:08:54 +0000
    "Maloor, Kishen" <kishen.maloor@intel.com> wrote:

    > Hi Jesper,
    > 
    > I wanted to run some patches by you for your comments.

    Thanks a lot for reaching out, I really appreciate it.

    > Intel has a requirement to support an SO_TXTIME like capability with
    > AF_XDP sockets, to set a precise per-packet transmission time from
    > user space XDP applications.

    This is great!  I also have a customer that have this requirement.
    Thus, I'm very interested in collaborating on getting this feature
    implemented in practice.

Super!

    > It is also critical that this support be upstreamed into the Linux
    > mainline.

    I fully agree "upstream first".  For this to happen, we should discuss
    the approach in a public forum as early as possible to get upstream
    feedback.  Cc'ing bpf@vger.kernel.org.  Also adding our working group
    mailing list as not everybody follow the kernel list Cc'ed
    xdp-hints@xdp-project.net (https://lists.xdp-project.net/xdp-hints/).

    Multiple of your Intel colleagues (from other departments) have also
    reached out to me, and they are also on this xdp-hints mailing list.
    It would be great if you can keep this list Cc'ed to keep everybody in
    the loop.

    Consider subscribing:
     https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/

Yes, I am aware, although it seems (per my reading) the main focus there is on capturing metadata on the RX path and funneling
that into the XDP program (and possibly higher).
Whereas this change is solely focused on passing down TXTIME (in the TX path obv.) from the XDP application running in userspace.
Said another way, the aim here is to accomplish something equivalent to SO_TXTIME with AF_XDP.

    > The implementation is roughly equivalent to that of SO_TXTIME for
    > AF_PACKET. It includes:
    >
    > - a new XSK bind flag,

    So, this feature is enabled for all packets on this AF_XDP socket.

Yes, in setting  the XDP_TXTIME bind flag while creating the AF_XDP socket, it will apply to all packets sent through that socket.

    > - libbpf API to set a TXTIME for a frame,
    >
    > - leverages the concept of UMEM headroom to store the per-packet
    >   TXTIME (immediately preceding the XDP buff's data_hard_start),

    This sounds like the XDP metadata area?
    Or is this top of mem data-area (guess not, as you use +umem_headroom)?

No. More specifically it is stored in the UMEM headroom area (it's default size per XSK_UMEM__DEFAULT_FRAME_HEADROOM is 0). Currently, the code in net/xdp adjusts for
this headroom (pool->headroom) in one or more places so my understanding is that the entirety of the XDP frame (data_hard_start and beyond, including the metadata
area) follows the UMEM headroom.

    What will happen when an XDP-prog use this same metadata area, and your
    feature (patch below) is enabled (via AF_XDP) ?

So far as I understand, an XDP/BPF program that writes to the defined XDP metadata area will function as expected, i.e. it won't intrude in the UMEM headroom.

    E.g. can I set the TXTIME in XDP-prog (RX) and XDP_REDIRECT out an
    interface with TXTIME enabled?

I believe that's possible. This is kind of why I wanted to loop in experts such as  yourself as you'd have a more complete understanding (than myself) of the XDP code as it 
currently stands :) So, I'd appreciate all the feedback I can get to account for anything that's missed in the patches.

    > - internally signals the NIC driver via the XDP descriptor when
    >   there's a TXTIME bound to a frame, and

    Looking at patch, if I don't call 'xsk_umem__set_txtime' (libbpf), then
    the kernel (xp_raw_get_txtime) will pickup random-data left over by
    previous user of the memory, right?

Yes, though the expectation is that an XDP application that enables XDP_TXTIME on the socket will proceed to pass in a TXTIME with every frame it sends.

    > - provides a kernel API to retrieve it in the driver so it may proceed
    >   to set the Launch Time on the NIC.
    > 
    > It uses existing bitmasks and there's no overhead (i.e. not
    > allocating any more memory than is already done), and there are no
    > other dependencies.
    >
    > You may review the commit messages for all details.
    > 
    > kernel: https://github.com/kmaloor/bpfnext/commit/b72a336bcb8df4a26646a29962ca95d58172147f
    > libbpf: https://github.com/kmaloor/libbpf/commit/eda68c1ad11ed60739837ad6e4fb256768231b55
    > 
    > We're currently testing these changes in conjunction with the igc
    > driver.

    Can you share the patches for the igc driver as well?

That's currently in the works and being tested internally. We should hopefully share it soon.

    I want to see how you convert the "__s64 txtime" into the 30-bit
    LaunchTime used[0] by the hardware (i225).  It would also be good to see
    how this is used for driver igb/i210, that have a 25-bit LaunchTime.

    Details on driver+hardware here: 
     [0] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org


    > Meanwhile, we'd appreciate any comments you may have, particularly as
    > an expert with the kernel XDP subsystem.

    I think this TXTIME work is just *one* use-case of "xdp-hints".  IMHO we
    should use something like BTF to describe the metadata memory area
    in-order to support more use-cases.

    Being concrete .e.g. your XSK socket bind call could register a BTF-ID
    that it tells the kernel about the layout. I guess, we need some
    guidance from BTF experts on how kernel can check the driver supports
    this LaunchTime feature, Andrii?

    Other use-cases: (1) On TX we also want to support asking for TX-csum
    by hardware. (2) Support setting VLAN via TX-descriptor (hint TSN can
    use VLAN priority bits).

Well, to be perfectly clear, the TXTIME in this case is an application supplied actionable quantity, so I don't interpret it as "metadata". 
Whereas the examples you cite above indeed seem like metadata. In this patch, the goal is simply to replicate SO_TXTIME with AF_XDP.
I am aware of ongoing discussions for handling XDP metadata/hints and so far as I understand (correct me if I'm wrong), we're still a ways 
off from a solution that the community will uphold.

    Again thanks for reaching out, lets find a solution together with
    upstream.
    -- 
    Best regards,
      Jesper Dangaard Brouer
      MSc.CS, Principal Kernel Engineer at Red Hat
      LinkedIn: http://www.linkedin.com/in/brouer

    Kernel:

     s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, u64 addr)
    {
    	return *(s64 *)((char *)xp_raw_get_data(pool, addr) - XDP_TXTIME_LEN);
    }
    EXPORT_SYMBOL(xp_raw_get_txtime);

    libbpf:

     static inline void xsk_umem__set_txtime(void *umem_area, __u64 addr, __u32 umem_headroom, __s64 txtime)
    {
    	*(__s64 *)(&((char *)umem_area)[addr] + umem_headroom - XDP_TXTIME_LEN) = txtime;
    }



      reply	other threads:[~2021-06-21 23:04 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <00301BEA-4A94-4600-998A-BEDCF3330795@intel.com>
2021-06-21 21:06 ` TXTIME (Launch Time) setting per XDP frame from userspace Jesper Dangaard Brouer
2021-06-21 23:04   ` Maloor, Kishen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA2F1BFA-C9D8-4027-BC73-A8AD703EEF76@intel.com \
    --to=kishen.maloor@intel.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=ederson.desouza@intel.com \
    --cc=jessica.zhang@intel.com \
    --cc=jithu.joseph@intel.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=vinicius.gomes@intel.com \
    --cc=walfred.tedeschi@intel.com \
    --cc=xdp-hints@xdp-project.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).