From: "Björn Töpel" <bjorn.topel@gmail.com>
To: Tushar Dave <tushar.n.dave@oracle.com>
Cc: "Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Alexander Duyck" <alexander.h.duyck@intel.com>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Alexei Starovoitov" <ast@fb.com>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	michael.lundkvist@ericsson.com, ravineet.singh@ericsson.com,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	jesse.brandeburg@intel.com, anjali.singhai@intel.com,
	rami.rosen@intel.com, jeffrey.b.shaw@intel.com,
	ferruh.yigit@intel.com, qi.z.zhang@intel.com
Subject: Re: [RFC PATCH 01/14] packet: introduce AF_PACKET V4 userspace API
Date: Thu, 2 Nov 2017 17:47:01 +0100
Message-ID: <CAJ+HfNhmQNKFAe8BuFUM65BhcHivkF=Z-jH3RS4rGEPCB49QtQ@mail.gmail.com>
In-Reply-To: <34090015-d061-8f7a-18e3-c8c67ade800f@oracle.com>

2017-11-02 17:40 GMT+01:00 Tushar Dave <tushar.n.dave@oracle.com>:
>
>
> On 11/02/2017 03:06 AM, Björn Töpel wrote:
>>
>> On 2017-11-02 02:45, Willem de Bruijn wrote:
>>>
>>> On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.topel@gmail.com>
>>> wrote:
>>>>
>>>> From: Björn Töpel <bjorn.topel@intel.com>
>>>>
>>>> This patch adds the necessary AF_PACKET V4 structures for usage from
>>>> userspace. AF_PACKET V4 is a new interface optimized for high
>>>> performance packet processing.
>>>>
>>>> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>>>> ---
>>>>    include/uapi/linux/if_packet.h | 65 +++++++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 64 insertions(+), 1 deletion(-)
>>>>
>>>> +struct tpacket4_queue {
>>>> +       struct tpacket4_desc *ring;
>>>> +
>>>> +       unsigned int avail_idx;
>>>> +       unsigned int last_used_idx;
>>>> +       unsigned int num_free;
>>>> +       unsigned int ring_mask;
>>>> +};
>>>>
>>>>    struct packet_mreq {
>>>> @@ -294,6 +335,28 @@ struct packet_mreq {
>>>>           unsigned char   mr_address[8];
>>>>    };
>>>>
>>>> +/*
>>>> + * struct tpacket_memreg_req is used in conjunction with PACKET_MEMREG
>>>> + * to register user memory which should be used to store the packet
>>>> + * data.
>>>> + *
>>>> + * There are some constraints for the memory being registered:
>>>> + * - The memory area has to be memory page size aligned.
>>>> + * - The frame size has to be a power of 2.
>>>> + * - The frame size cannot be smaller than 2048B.
>>>> + * - The frame size cannot be larger than the memory page size.
>>>> + *
>>>> + * Corollary: The number of frames that can be stored is
>>>> + * len / frame_size.
>>>> + *
>>>> + */
>>>> +struct tpacket_memreg_req {
>>>> +       unsigned long   addr;           /* Start of packet data area */
>>>> +       unsigned long   len;            /* Length of packet data area */
>>>> +       unsigned int    frame_size;     /* Frame size */
>>>> +       unsigned int    data_headroom;  /* Frame head room */
>>>> +};
>>>
>>>
>>> Existing packet sockets take a tpacket_req, allocate memory and let the
>>> user process mmap this. I understand that TPACKET_V4 distinguishes
>>> the descriptor from packet pools, but could both use the existing structs
>>> and logic (packet_mmap)? That would avoid introducing a lot of new code
>>> just for granting user pages to the kernel.
>>>
>>
>> We could certainly pass the "tpacket_memreg_req" fields as part of
>> descriptor ring setup ("tpacket_req4"), but we went with having the
>> memory registration as a new, separate setsockopt. Keeping it separate
>> makes it easier to compare regions on the kernel side ("Is this the
>> same umem as another one?"). If we go down the path of passing the
>> range at descriptor ring setup, we need to handle all kinds of
>> overlapping ranges to determine whether a copy is needed, in those
>> cases where the packet buffer (i.e. umem) is shared between processes.
>
>
> Is there a reason to use a separate packet socket for umem? It looks like
> userspace has to create a separate packet socket for PACKET_MEMREG.
>

Let me clarify: you *can* use a separate socket for umem, but
you can also use the same/existing AF_PACKET socket for that.


Björn
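
For illustration, the registration constraints listed in the quoted
comment on struct tpacket_memreg_req can be checked from userspace
roughly as sketched below. This is only a sketch under assumptions: the
struct memreg_req_sketch type and the memreg_frames() helper are
hypothetical names that do not appear in the patch, the fields use
fixed-width types rather than the unsigned long posted in the RFC, and
the page size is passed in by the caller instead of being queried.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the RFC's tpacket_memreg_req (not a uapi type). */
struct memreg_req_sketch {
	uint64_t addr;          /* Start of packet data area */
	uint64_t len;           /* Length of packet data area */
	uint32_t frame_size;    /* Frame size */
	uint32_t data_headroom; /* Frame head room */
};

static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Return the number of frames the area can hold (len / frame_size),
 * or 0 if any constraint from the quoted comment is violated.
 */
static uint64_t memreg_frames(const struct memreg_req_sketch *req,
			      uint64_t page_size)
{
	if (page_size == 0 || req->addr % page_size != 0)
		return 0;	/* area must be page size aligned */
	if (!is_pow2(req->frame_size))
		return 0;	/* frame size must be a power of 2 */
	if (req->frame_size < 2048)
		return 0;	/* frame size cannot be smaller than 2048B */
	if (req->frame_size > page_size)
		return 0;	/* frame size cannot exceed the page size */
	return req->len / req->frame_size;
}
```

With a 4096-byte page, only frame sizes of 2048 and 4096 pass all four
checks.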

>
> -Tushar
>
>>> Also, use of unsigned long can cause problems on 32/64 bit compat
>>> environments. Prefer fixed width types in uapi. Same for pointer in
>>> tpacket4_queue.
>>
>>
>> I agree; we'll change to fixed-width types in the next version. Do you
>> (and others on the list) prefer __u32/__u64 or unsigned int / unsigned
>> long long?
>>
>>
>> Thanks,
>> Björn
>>
>
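
The 32/64-bit compat concern quoted above can be made concrete with a
small layout comparison. The sketch below is illustrative only: both
struct names are hypothetical, and it uses the <stdint.h> types so that
it builds outside the kernel tree, whereas a uapi header would normally
spell them __u32/__u64 from <linux/types.h>. The struct using unsigned
long mirrors the layout posted in the RFC, whose size follows the ABI;
the fixed-width variant has the same size and field offsets on both
32-bit and 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted in the RFC: sizeof(unsigned long) depends on the ABI. */
struct memreg_req_long {
	unsigned long addr;
	unsigned long len;
	unsigned int  frame_size;
	unsigned int  data_headroom;
};

/* Hypothetical fixed-width variant: identical layout for 32- and 64-bit. */
struct memreg_req_fixed {
	uint64_t addr;
	uint64_t len;
	uint32_t frame_size;
	uint32_t data_headroom;
};

/* Compile-time checks that the fixed-width layout has no ABI dependence. */
static_assert(sizeof(struct memreg_req_fixed) == 24,
	      "fixed-width request must be 24 bytes");
static_assert(offsetof(struct memreg_req_fixed, len) == 8,
	      "len must start at byte 8");
static_assert(offsetof(struct memreg_req_fixed, frame_size) == 16,
	      "frame_size must start at byte 16");
static_assert(offsetof(struct memreg_req_fixed, data_headroom) == 20,
	      "data_headroom must start at byte 20");
```

On a typical ILP32 build, struct memreg_req_long is 16 bytes, while on
LP64 it is 24 bytes, which is why a 32-bit process talking to a 64-bit
kernel would need a compat translation for it. The same reasoning
applies to the ring pointer in tpacket4_queue.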
