All of lore.kernel.org
 help / color / mirror / Atom feed
* AF_XDP only releasing from FQ in batches
@ 2019-06-21 19:20 Rafael Vargas
  2019-06-24 10:21 ` Eelco Chaudron
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael Vargas @ 2019-06-21 19:20 UTC (permalink / raw)
  To: xdp-newbies

Hi,

I'm trying to use AF_XDP and I'm using the xdpsock sample
implementation as a guide.

I've noticed that the Fill Queue slots are released in batches of 16
(kernel 5.1)

The xdpsock (rx_drop) implementation will lock waiting for the space in the FQ.
This seems it will work fine when receiving lots of packets, but will
loop indefinetely if traffic stops.

This is the expected behavior for the FQ?
Should I keep the FQ always with free slots in order to avoid blocking
when waiting for more packets?

Rafael Vargas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AF_XDP only releasing from FQ in batches
  2019-06-21 19:20 AF_XDP only releasing from FQ in batches Rafael Vargas
@ 2019-06-24 10:21 ` Eelco Chaudron
  2019-06-24 11:17   ` Magnus Karlsson
  2019-06-24 14:30   ` Rafael Vargas
  0 siblings, 2 replies; 6+ messages in thread
From: Eelco Chaudron @ 2019-06-24 10:21 UTC (permalink / raw)
  To: Rafael Vargas; +Cc: xdp-newbies



On 21 Jun 2019, at 21:20, Rafael Vargas wrote:

> Hi,
>
> I'm trying to use AF_XDP and I'm using the xdpsock sample
> implementation as a guide.
>
> I've noticed that the Fill Queue slots are released in batches of 16
> (kernel 5.1)
>
> The xdpsock (rx_drop) implementation will lock waiting for the space 
> in the FQ.
> This seems it will work fine when receiving lots of packets, but will
> loop indefinetely if traffic stops.

Yes I was running into the same problem, and you should not wait wait 
for the free slots, as that will put you in a loop waiting, and not 
processing packets until 16 are received.

This is how I solved it:


static void rx_pkts(struct xsk_socket_info *xsk)
{
   	unsigned int rcvd, stock_frames, i;
   	uint32_t idx_rx = 0, idx_fq = 0;
   	int ret;

   	rcvd = xsk_ring_cons__peek(&xsk->rx, RX_BATCH_SIZE, &idx_rx);
   	if (!rcvd)
   		return;

	/* Stuff the ring with as much frames as possible */
	stock_frames = xsk_ring_prod__free(&xsk->umem->fq);
	stock_frames = MIN(stock_frames, xsk_umem_free_frames(xsk));

	if (stock_frames > 0) {

		ret = xsk_ring_prod__reserve(&xsk->umem->fq, stock_frames,
					     &idx_fq);
		while (ret != stock_frames)
			ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
						     &idx_fq);

		for (i = 0; i < stock_frames; i++)
			*xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq++) =
				xsk_alloc_umem_frame(xsk);

		xsk_ring_prod__submit(&xsk->umem->fq, stock_frames);
	}

	/* Process received packets */
   	for (i = 0; i < rcvd; i++) {
   		uint64_t addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr;
   		uint32_t len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len;
   		char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);

   		// Handle packet

		xsk_free_umem_frame(xsk, addr);
   	}

   	xsk_ring_cons__release(&xsk->rx, rcvd);
	xsk->stats.rx_packets += rcvd;
   }


Where  xsk_ring_prod__free() is as follows (sent patch upstream for 
adding it as an API):

static inline __u32 xsk_ring_prod__free(struct xsk_ring_prod *r)
{
   	r->cached_cons = *r->consumer + r->size;
	return r->cached_cons - r->cached_prod;
}

And the following is an API around my UMEM frame handling:

xsk_umem_free_frames() -> How many buffers do I have available on my 
stack

xsk_free_umem_frame()  -> return a buffer to my buffer stack

xsk_alloc_umem_frame() -> Get a buffer from my buffer stack

> This is the expected behavior for the FQ?
> Should I keep the FQ always with free slots in order to avoid blocking
> when waiting for more packets?
>
> Rafael Vargas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AF_XDP only releasing from FQ in batches
  2019-06-24 10:21 ` Eelco Chaudron
@ 2019-06-24 11:17   ` Magnus Karlsson
  2019-06-24 11:41     ` Eelco Chaudron
  2019-06-24 14:30   ` Rafael Vargas
  1 sibling, 1 reply; 6+ messages in thread
From: Magnus Karlsson @ 2019-06-24 11:17 UTC (permalink / raw)
  To: Eelco Chaudron; +Cc: Rafael Vargas, xdp-newbies

On Mon, Jun 24, 2019 at 12:24 PM Eelco Chaudron <echaudro@redhat.com> wrote:
>
>
>
> On 21 Jun 2019, at 21:20, Rafael Vargas wrote:
>
> > Hi,
> >
> > I'm trying to use AF_XDP and I'm using the xdpsock sample
> > implementation as a guide.
> >
> > I've noticed that the Fill Queue slots are released in batches of 16
> > (kernel 5.1)
> >
> > The xdpsock (rx_drop) implementation will lock waiting for the space
> > in the FQ.
> > This seems it will work fine when receiving lots of packets, but will
> > loop indefinetely if traffic stops.
>
> Yes I was running into the same problem, and you should not wait wait
> for the free slots, as that will put you in a loop waiting, and not
> processing packets until 16 are received.
>
> This is how I solved it:
>
>
> static void rx_pkts(struct xsk_socket_info *xsk)
> {
>         unsigned int rcvd, stock_frames, i;
>         uint32_t idx_rx = 0, idx_fq = 0;
>         int ret;
>
>         rcvd = xsk_ring_cons__peek(&xsk->rx, RX_BATCH_SIZE, &idx_rx);
>         if (!rcvd)
>                 return;
>
>         /* Stuff the ring with as much frames as possible */
>         stock_frames = xsk_ring_prod__free(&xsk->umem->fq);
>         stock_frames = MIN(stock_frames, xsk_umem_free_frames(xsk));
>
>         if (stock_frames > 0) {
>
>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, stock_frames,
>                                              &idx_fq);
>                 while (ret != stock_frames)
>                         ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
>                                                      &idx_fq);
>
>                 for (i = 0; i < stock_frames; i++)
>                         *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq++) =
>                                 xsk_alloc_umem_frame(xsk);
>
>                 xsk_ring_prod__submit(&xsk->umem->fq, stock_frames);
>         }
>
>         /* Process received packets */
>         for (i = 0; i < rcvd; i++) {
>                 uint64_t addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr;
>                 uint32_t len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len;
>                 char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
>
>                 // Handle packet
>
>                 xsk_free_umem_frame(xsk, addr);
>         }
>
>         xsk_ring_cons__release(&xsk->rx, rcvd);
>         xsk->stats.rx_packets += rcvd;
>    }
>
>
> Where  xsk_ring_prod__free() is as follows (sent patch upstream for
> adding it as an API):

Eelco,

In the patch you sent, why not make it a patch set by adding these
improvements (using your new function) to the xdpsock sample app?
Please :-).

/Magnus

> static inline __u32 xsk_ring_prod__free(struct xsk_ring_prod *r)
> {
>         r->cached_cons = *r->consumer + r->size;
>         return r->cached_cons - r->cached_prod;
> }
>
> And the following is an API around my UMEM frame handling:
>
> xsk_umem_free_frames() -> How many buffers do I have available on my
> stack
>
> xsk_free_umem_frame()  -> return a buffer to my buffer stack
>
> xsk_alloc_umem_frame() -> Get a buffer from my buffer stack
>
> > This is the expected behavior for the FQ?
> > Should I keep the FQ always with free slots in order to avoid blocking
> > when waiting for more packets?
> >
> > Rafael Vargas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AF_XDP only releasing from FQ in batches
  2019-06-24 11:17   ` Magnus Karlsson
@ 2019-06-24 11:41     ` Eelco Chaudron
  0 siblings, 0 replies; 6+ messages in thread
From: Eelco Chaudron @ 2019-06-24 11:41 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: Rafael Vargas, xdp-newbies



On 24 Jun 2019, at 13:17, Magnus Karlsson wrote:

> On Mon, Jun 24, 2019 at 12:24 PM Eelco Chaudron <echaudro@redhat.com> 
> wrote:
>>
>>
>>
>> On 21 Jun 2019, at 21:20, Rafael Vargas wrote:
>>
>>> Hi,
>>>
>>> I'm trying to use AF_XDP and I'm using the xdpsock sample
>>> implementation as a guide.
>>>
>>> I've noticed that the Fill Queue slots are released in batches of 16
>>> (kernel 5.1)
>>>
>>> The xdpsock (rx_drop) implementation will lock waiting for the space
>>> in the FQ.
>>> This seems it will work fine when receiving lots of packets, but 
>>> will
>>> loop indefinetely if traffic stops.
>>
>> Yes I was running into the same problem, and you should not wait wait
>> for the free slots, as that will put you in a loop waiting, and not
>> processing packets until 16 are received.
>>
>> This is how I solved it:
>>
>>
>> static void rx_pkts(struct xsk_socket_info *xsk)
>> {
>>         unsigned int rcvd, stock_frames, i;
>>         uint32_t idx_rx = 0, idx_fq = 0;
>>         int ret;
>>
>>         rcvd = xsk_ring_cons__peek(&xsk->rx, RX_BATCH_SIZE, &idx_rx);
>>         if (!rcvd)
>>                 return;
>>
>>         /* Stuff the ring with as much frames as possible */
>>         stock_frames = xsk_ring_prod__free(&xsk->umem->fq);
>>         stock_frames = MIN(stock_frames, xsk_umem_free_frames(xsk));
>>
>>         if (stock_frames > 0) {
>>
>>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, 
>> stock_frames,
>>                                              &idx_fq);
>>                 while (ret != stock_frames)
>>                         ret = xsk_ring_prod__reserve(&xsk->umem->fq, 
>> rcvd,
>>                                                      &idx_fq);
>>
>>                 for (i = 0; i < stock_frames; i++)
>>                         *xsk_ring_prod__fill_addr(&xsk->umem->fq, 
>> idx_fq++) =
>>                                 xsk_alloc_umem_frame(xsk);
>>
>>                 xsk_ring_prod__submit(&xsk->umem->fq, stock_frames);
>>         }
>>
>>         /* Process received packets */
>>         for (i = 0; i < rcvd; i++) {
>>                 uint64_t addr = xsk_ring_cons__rx_desc(&xsk->rx, 
>> idx_rx)->addr;
>>                 uint32_t len = xsk_ring_cons__rx_desc(&xsk->rx, 
>> idx_rx++)->len;
>>                 char *pkt = xsk_umem__get_data(xsk->umem->buffer, 
>> addr);
>>
>>                 // Handle packet
>>
>>                 xsk_free_umem_frame(xsk, addr);
>>         }
>>
>>         xsk_ring_cons__release(&xsk->rx, rcvd);
>>         xsk->stats.rx_packets += rcvd;
>>    }
>>
>>
>> Where  xsk_ring_prod__free() is as follows (sent patch upstream for
>> adding it as an API):
>
> Eelco,
>
> In the patch you sent, why not make it a patch set by adding these
> improvements (using your new function) to the xdpsock sample app?
> Please :-).

The code above will be part of the 
https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP 
example I’m working on. Once I finish this, I’ll also try to do a 
backport to the example.

Just need to find some time to finish it :)

>> static inline __u32 xsk_ring_prod__free(struct xsk_ring_prod *r)
>> {
>>         r->cached_cons = *r->consumer + r->size;
>>         return r->cached_cons - r->cached_prod;
>> }
>>
>> And the following is an API around my UMEM frame handling:
>>
>> xsk_umem_free_frames() -> How many buffers do I have available on my
>> stack
>>
>> xsk_free_umem_frame()  -> return a buffer to my buffer stack
>>
>> xsk_alloc_umem_frame() -> Get a buffer from my buffer stack
>>
>>> This is the expected behavior for the FQ?
>>> Should I keep the FQ always with free slots in order to avoid 
>>> blocking
>>> when waiting for more packets?
>>>
>>> Rafael Vargas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AF_XDP only releasing from FQ in batches
  2019-06-24 10:21 ` Eelco Chaudron
  2019-06-24 11:17   ` Magnus Karlsson
@ 2019-06-24 14:30   ` Rafael Vargas
  2019-06-25  7:17     ` Eelco Chaudron
  1 sibling, 1 reply; 6+ messages in thread
From: Rafael Vargas @ 2019-06-24 14:30 UTC (permalink / raw)
  To: xdp-newbies

Em seg, 24 de jun de 2019 às 07:21, Eelco Chaudron
<echaudro@redhat.com> escreveu:

>         /* Stuff the ring with as much frames as possible */
>         stock_frames = xsk_ring_prod__free(&xsk->umem->fq);
>         stock_frames = MIN(stock_frames, xsk_umem_free_frames(xsk));
>
>         if (stock_frames > 0) {
>
>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, stock_frames,
>                                              &idx_fq);
>                 while (ret != stock_frames)
>                         ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
>                                                      &idx_fq);
once you've checked before the number of free entries, this while is
still needed?


> Where  xsk_ring_prod__free() is as follows (sent patch upstream for
> adding it as an API):
>
> static inline __u32 xsk_ring_prod__free(struct xsk_ring_prod *r)
> {
>         r->cached_cons = *r->consumer + r->size;
>         return r->cached_cons - r->cached_prod;
> }

this is basically  xsk_prod_nb_free, but always triggering the cache
refresh, right?

>
> And the following is an API around my UMEM frame handling:
>
> xsk_umem_free_frames() -> How many buffers do I have available on my
> stack
>
> xsk_free_umem_frame()  -> return a buffer to my buffer stack
>
> xsk_alloc_umem_frame() -> Get a buffer from my buffer stack
>

Is this API using another local ring or stack buffer or is it using
some clever trick I'm not figuring out?

I'll try this approach, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AF_XDP only releasing from FQ in batches
  2019-06-24 14:30   ` Rafael Vargas
@ 2019-06-25  7:17     ` Eelco Chaudron
  0 siblings, 0 replies; 6+ messages in thread
From: Eelco Chaudron @ 2019-06-25  7:17 UTC (permalink / raw)
  To: Rafael Vargas; +Cc: xdp-newbies



On 24 Jun 2019, at 16:30, Rafael Vargas wrote:

> Em seg, 24 de jun de 2019 às 07:21, Eelco Chaudron
> <echaudro@redhat.com> escreveu:
>
>>         /* Stuff the ring with as much frames as possible */
>>         stock_frames = xsk_ring_prod__free(&xsk->umem->fq);
>>         stock_frames = MIN(stock_frames, xsk_umem_free_frames(xsk));
>>
>>         if (stock_frames > 0) {
>>
>>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, stock_frames,
>>                                              &idx_fq);
>>                 while (ret != stock_frames)
>>                         ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
>>                                                      &idx_fq);
> once you've checked before the number of free entries, this while is
> still needed?

In theory not, just a safeguard…

>> Where  xsk_ring_prod__free() is as follows (sent patch upstream for
>> adding it as an API):
>>
>> static inline __u32 xsk_ring_prod__free(struct xsk_ring_prod *r)
>> {
>>         r->cached_cons = *r->consumer + r->size;
>>         return r->cached_cons - r->cached_prod;
>> }
>
> this is basically  xsk_prod_nb_free, but always triggering the cache
> refresh, right?

Yes, probably a xsk_prod_nb_free(, xsk_umem_free_frames(xsk)) would work

>>
>> And the following is an API around my UMEM frame handling:
>>
>> xsk_umem_free_frames() -> How many buffers do I have available on my
>> stack
>>
>> xsk_free_umem_frame()  -> return a buffer to my buffer stack
>>
>> xsk_alloc_umem_frame() -> Get a buffer from my buffer stack
>>
>
> Is this API using another local ring or stack buffer or is it using
> some clever trick I'm not figuring out?

Just a simple array with all the UMEM addresses:


struct xsk_socket_info {
    …

	uint64_t umem_frame_addr[NUM_FRAMES];
	uint32_t umem_frame_free;
}

static uint64_t xsk_alloc_umem_frame(struct xsk_socket_info *xsk)
{
	uint64_t frame;
	if (xsk->umem_frame_free == 0)
		return INVALID_UMEM_FRAME;

	frame = xsk->umem_frame_addr[--xsk->umem_frame_free];
	xsk->umem_frame_addr[xsk->umem_frame_free] = INVALID_UMEM_FRAME;
	return frame;
}

static void xsk_free_umem_frame(struct xsk_socket_info *xsk, uint64_t frame)
{
	assert(xsk->umem_frame_free < NUM_FRAMES);

	xsk->umem_frame_addr[xsk->umem_frame_free++] = frame;
}

static uint64_t xsk_umem_free_frames(struct xsk_socket_info *xsk)
{
	return xsk->umem_frame_free;
}


> I'll try this approach, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-06-25  7:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-21 19:20 AF_XDP only releasing from FQ in batches Rafael Vargas
2019-06-24 10:21 ` Eelco Chaudron
2019-06-24 11:17   ` Magnus Karlsson
2019-06-24 11:41     ` Eelco Chaudron
2019-06-24 14:30   ` Rafael Vargas
2019-06-25  7:17     ` Eelco Chaudron

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.