* behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-24 16:15 UTC
  To: linux-kernel

[Not on the list, please CC responses]

Currently, my application code uses blocking UDP sockets and is
basically structured like this:

while(1) {
    recvmsg(fd, ...);
    // do some work on the packet
    sendmsg(fd, ...);
}
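
Fleshed out a bit, one worker thread looks roughly like this (a
sketch, not my actual code; buffer size and error handling are
illustrative only):

#include <sys/socket.h>
#include <sys/uio.h>

static void *worker(void *arg)
{
    int fd = *(int *)arg;        /* one blocking UDP socket per thread */
    char buf[65536];
    struct sockaddr_storage peer;

    for (;;) {
        struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
        struct msghdr msg = {
            .msg_name = &peer, .msg_namelen = sizeof(peer),
            .msg_iov = &iov, .msg_iovlen = 1,
        };
        ssize_t len = recvmsg(fd, &msg, 0);  /* blocks for one packet */
        if (len < 0)
            continue;
        /* do some work on the packet, build the response in buf */
        iov.iov_len = len;                   /* response length here */
        sendmsg(fd, &msg, 0);                /* reply to msg_name */
    }
    return NULL;
}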

It uses a thread-per-socket model, and the "do some work" code is very
fast, and so this turns out to be more efficient than non-blocking
techniques for my use case.  Today I started playing with 2.6.33's new
recvmmsg(), hoping to convert my code like so (still on blocking
sockets):

while(1) {
    recvmmsg(fd, ...);
    // do some work on up to N packets
    // loop over sendmsg() foreach packet to be sent
    //   (or sendmmsg() if/when that interface becomes available)
}

The catch I ran into is that on a blocking socket, recvmmsg() will
block until *all* vlen packets have been received.  The behavior I'd
prefer for my use case would be for it to only block until at least
one packet is available, not until all are available.  Or in code
terms, the first internal call to recvmsg should use the supplied
flags, and the rest of the recvmsg calls should use flags |
MSG_DONTWAIT.  It's not clear to me which is the better default
behavior, but I feel like at the very least there should be a flag
that can switch behavior between the two possible interpretations of
"blocking".

Obviously, I can also work around this at the user level by simply
switching to a non-blocking socket and using select/poll before
recvmmsg, but then under any conditions where only 1 packet is
available (probably fairly common) I'm issuing two syscalls per packet
when only one should be necessary (and only one is necessary when
using recvmsg()).  This seems inefficient and antithetical to one of
the design goals of recvmmsg (cut down the syscalls:packets ratio).
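
For reference, that workaround would look something like this (a
sketch; fd is O_NONBLOCK here, and msgs/vlen/handle_packet are the
same placeholders as above):

struct pollfd pfd = { .fd = fd, .events = POLLIN };
if (poll(&pfd, 1, -1) > 0) {                  /* syscall #1: wait */
    int n = recvmmsg(fd, msgs, vlen, MSG_DONTWAIT, NULL); /* #2: drain */
    for (int i = 0; i < n; i++)
        handle_packet(&msgs[i]);
}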

Thoughts on this?

Thanks,
-- Brandon


* Re: behavior of recvmmsg() on blocking sockets
From: Chris Friesen @ 2010-03-24 17:41 UTC
  To: Brandon Black; +Cc: linux-kernel, netdev

On 03/24/2010 10:15 AM, Brandon Black wrote:
> [Not on the list, please CC responses]

Adding netdev to the CC list.

> Currently, my application code uses blocking UDP sockets and is
> basically structured like this:
> 
> while(1) {
>     recvmsg(fd, ...);
>     // do some work on the packet
>     sendmsg(fd, ...);
> }
> 
> It uses a thread-per-socket model

This doesn't scale well to large numbers of sockets... you get a lot of
unnecessary context switching.


> , and the "do some work" code is very
> fast, and so this turns out to be more efficient than non-blocking
> techniques for my use case.  Today I started playing with 2.6.33's new
> recvmmsg(), hoping to convert my code like so (still on blocking
> sockets):
> 
> while(1) {
>     recvmmsg(fd, ...);
>     // do some work on up to N packets
>     // loop over sendmsg() foreach packet to be sent
>     //   (or sendmmsg() if/when that interface becomes available)
> }
> 
> The catch I ran into is that on a blocking socket, recvmmsg() will
> block until *all* vlen packets have been received.  The behavior I'd
> prefer for my use case would be for it to only block until at least
> one packet is available, not until all are available.  Or in code
> terms, the first internal call to recvmsg should use the supplied
> flags, and the rest of the recvmsg calls should use flags |
> MSG_DONTWAIT.  It's not clear to me which is the better default
> behavior, but I feel like at the very least there should be a flag
> that can switch behavior between the two possible interpretations of
> "blocking".

On a sufficiently fast CPU there will only ever be one packet waiting,
so we'd still waste a lot of time doing one syscall per packet.

I suspect the intent is that you set the timeout to indicate the max
latency you're willing to accommodate.  Once the timeout expires, the
call will return with the packets received to that point.
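
Something like this, as a sketch (the 100us figure is arbitrary, and
glibc may not wrap recvmmsg() yet, in which case substitute
syscall(__NR_recvmmsg, ...)):

/* Batch packets for up to 100us per call; tune to the latency budget. */
struct timespec timeout = { .tv_sec = 0, .tv_nsec = 100 * 1000 };
int n = recvmmsg(fd, msgs, vlen, 0, &timeout);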

Chris


* Re: behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-24 18:28 UTC
  To: Chris Friesen; +Cc: linux-kernel, netdev

On Wed, Mar 24, 2010 at 12:41 PM, Chris Friesen <cfriesen@nortel.com> wrote:
> On 03/24/2010 10:15 AM, Brandon Black wrote:
>> It uses a thread-per-socket model
>
> This doesn't scale well to large numbers of sockets... you get a lot of
> unnecessary context switching.

It actually scales very well: linear to within measurement error in
my testing so far.  These are UDP server sockets where one request
packet maps to one response packet, with no longer-term
per-client state (this is a DNS server, to be specific).  The "do some
work" code doesn't have any inter-thread contention (no locks, no
writes to the same memory, etc), so the "threads" here may as well be
processes if that makes the discussion less confusing.  I haven't yet
found a model that scales as well for me.

> On a sufficiently fast CPU there will only ever be one packet waiting,
> so we'd still waste a lot of time doing one syscall per packet.

Based on loopback interface testing, when the socket is saturated with
packet throughput (one CPU core is locked up handling one socket), the
"do some work" code accounts for an average of roughly 10-20% of the
CPU time per request right now on a fairly fast Xeon; the rest is
spent in recvmsg()/sendmsg().  One potential way for things to "get
behind" would be that the time spent in my user code isn't a constant:
some requests will be processed slower than others.  If a particular
request is unusually slow for some reason (and there are potential
reasons) and 2+ packets backlog while handling it, recvmmsg() allows
me to catch up faster.

I'm also just not personally sure whether there are network
interfaces/drivers out there that could queue packets to the kernel
(to a single socket) faster than recvmsg() could dequeue them to
userspace, which is another reason recvmmsg() would make sense for
this.  Maybe that's not even possible, I have no idea.  But for the
moment, I've been operating on the assumption that if it's not
possible now, it likely will be possible at some point in the future.

> I suspect the intent is that you set the timeout to indicate the max
> latency you're willing to accommodate.  Once the timeout expires, the
> call will return with the packets received to that point.

Yes, I agree that's another option I have here, to use the timeout to
set a small but acceptable latency window for gathering multiple
packets.  That timeout wouldn't have a universally right value
though, so I'd probably have to pass it off to the user as a config
option and let them tune it.  Assuming no change is made to
recvmmsg(), this is probably the route I'll test and benchmark (versus
just sticking with plain recvmsg()).

I still think having a "block until at least one packet arrives" mode
for recvmmsg() makes sense though.

Thanks for the input,
-- Brandon


* Re: behavior of recvmmsg() on blocking sockets
From: drepper @ 2010-03-24 18:34 UTC
  To: Brandon Black; +Cc: Chris Friesen, linux-kernel, netdev


On Wed, Mar 24, 2010 at 11:28, Brandon Black <blblack@gmail.com> wrote:
> I still think having a "block until at least one packet arrives" mode
> for recvmmsg() makes sense though.

I agree.  This is the mode I've seen people asking for.  They want the call to return as quickly as possible if there is data and then with as many messages as possible.  A MSG_WAITFORONE flag would do the trick nicely.
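
Usage would then be a one-liner, something like this sketch (assuming
the flag gets added):

/* Block until at least one datagram arrives, then also return
 * whatever else is already queued, up to vlen. */
int n = recvmmsg(fd, msgs, vlen, MSG_WAITFORONE, NULL);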



* Re: behavior of recvmmsg() on blocking sockets
From: Chris Friesen @ 2010-03-24 19:36 UTC
  To: Brandon Black; +Cc: linux-kernel, netdev

On 03/24/2010 12:28 PM, Brandon Black wrote:
> On Wed, Mar 24, 2010 at 12:41 PM, Chris Friesen <cfriesen@nortel.com> wrote:
>> On 03/24/2010 10:15 AM, Brandon Black wrote:
>>> It uses a thread-per-socket model
>>
>> This doesn't scale well to large numbers of sockets... you get a lot of
>> unnecessary context switching.
> 
> It actually scales very well: linear to within measurement error in
> my testing so far.  These are UDP server sockets where one request
> packet maps to one response packet, with no longer-term
> per-client state (this is a DNS server, to be specific).  The "do some
> work" code doesn't have any inter-thread contention (no locks, no
> writes to the same memory, etc), so the "threads" here may as well be
> processes if that makes the discussion less confusing.  I haven't yet
> found a model that scales as well for me.

Note that I said "large numbers of sockets".  Like tens of thousands.
In addition to context switch overhead this can also lead to issues with
memory consumption due to stack frames.

> I'm also just not personally sure whether there are network
> interfaces/drivers out there that could queue packets to the kernel
> (to a single socket) faster than recvmsg() could dequeue them to
> userspace

A 10Gig NIC could do this easily depending on your CPU.

> I still think having a "block until at least one packet arrives" mode
> for recvmmsg() makes sense though.

Agreed, as long as developers are aware that it won't be the most
efficient mode of operation.

Consider the case where you want to do some other useful work in
addition to running your network server.  Every cpu cycle spent on the
network server is robbed from the other work.  In this scenario you want
to handle packets as efficiently as possible, so the timeout-based
behaviour is better since it is more likely to give you multiple packets
per syscall.

Chris


* Re: behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-24 19:55 UTC
  To: Chris Friesen; +Cc: linux-kernel, netdev

On Wed, Mar 24, 2010 at 2:36 PM, Chris Friesen <cfriesen@nortel.com> wrote:
> Note that I said "large numbers of sockets".  Like tens of thousands.
> In addition to context switch overhead this can also lead to issues with
> memory consumption due to stack frames.

Ok, agreed there.  In my case though, there will only ever be a
handful of sockets.  Ideally it would be just one socket.  The only
reason I allocate multiple sockets and spawn threads for them is
because you can't scale past one CPU core on a single socket, due to
the NIC and/or the driver and/or the socket locks and/or the basic
nature of the problem.

> Consider the case where you want to do some other useful work in
> addition to running your network server.  Every cpu cycle spent on the
> network server is robbed from the other work.  In this scenario you want
> to handle packets as efficiently as possible, so the timeout-based
> behaviour is better since it is more likely to give you multiple packets
> per syscall.

That's a good point, I tend to tunnelvision on the dedicated server
scenario.  I should probably have a user-level option for
timeout-based operation as well, since the decision here gets to the
systems admin/engineering level and will be situational.

Thanks,
-- Brandon


* Re: behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-24 23:35 UTC
  To: netdev; +Cc: linux-kernel

On Wed, Mar 24, 2010 at 1:34 PM,  <drepper@gmail.com> wrote:
> On Wed, Mar 24, 2010 at 11:28, Brandon Black <blblack@gmail.com> wrote:
>>
>> I still think having a "block until at least one packet arrives" mode
>> for recvmmsg() makes sense though.
>
> I agree.  This is the mode I've seen people asking for.  They want the call
> to return as quickly as possible if there is data and then with as many
> messages as possible.  A MSG_WAITFORONE flag would do the trick nicely.

This patch might be woefully inadequate, as I'm not intimately
familiar with the rest of the Linux socket code.  I'm not sure what
the impact is of (a) adding that new flag, which is the first beyond
the 16-bit space, or (b) having that extra undefined flag present
during the underlying recvmsg() calls, but the patch Works For Me for
my isolated case.  Thoughts?  (hoping gmail doesn't mangle this)

[blblack@xpc kernels]$ diff -u linux-2.6.33-orig/net/socket.c linux-2.6.33/net/socket.c
--- linux-2.6.33-orig/net/socket.c	2010-02-24 12:52:17.000000000 -0600
+++ linux-2.6.33/net/socket.c	2010-03-24 18:10:37.156234986 -0500
@@ -2133,7 +2133,10 @@

 		if (err)
 			break;
-		++datagrams;
+
+		/* MSG_WAITFORONE turns on MSG_DONTWAIT after one packet */
+		if (!datagrams++ && flags & MSG_WAITFORONE)
+			flags |= MSG_DONTWAIT;

 		if (timeout) {
 			ktime_get_ts(timeout);
[blblack@xpc kernels]$ diff -u linux-2.6.33-orig/include/linux/socket.h linux-2.6.33/include/linux/socket.h
--- linux-2.6.33-orig/include/linux/socket.h	2010-02-24 12:52:17.000000000 -0600
+++ linux-2.6.33/include/linux/socket.h	2010-03-24 17:35:14.009266280 -0500
@@ -255,6 +255,7 @@
 #define MSG_ERRQUEUE	0x2000	/* Fetch message from error queue */
 #define MSG_NOSIGNAL	0x4000	/* Do not generate SIGPIPE */
 #define MSG_MORE	0x8000	/* Sender will send more */
+#define MSG_WAITFORONE  0x10000 /* recvmmsg(): block until 1+ packets avail */

 #define MSG_EOF         MSG_FIN


* Re: behavior of recvmmsg() on blocking sockets
From: Ulrich Drepper @ 2010-03-26 12:00 UTC
  To: Brandon Black; +Cc: netdev, linux-kernel

On Wed, Mar 24, 2010 at 16:35, Brandon Black <blblack@gmail.com> wrote:
> +                /* MSG_WAITFORONE turns on MSG_DONTWAIT after one packet */
> +               if (!datagrams++ && flags & MSG_WAITFORONE)
> +                       flags |= MSG_DONTWAIT;

There should be an extra pair of parenthesis around the & operands but
aside from that I could imagine this to work.

Would be nice if the netdev folks would respond.


* Re: behavior of recvmmsg() on blocking sockets
From: Eric Dumazet @ 2010-03-26 14:20 UTC
  To: Ulrich Drepper; +Cc: Brandon Black, netdev, linux-kernel

On Friday, 26 March 2010 at 05:00 -0700, Ulrich Drepper wrote:
> On Wed, Mar 24, 2010 at 16:35, Brandon Black <blblack@gmail.com> wrote:
> > +                /* MSG_WAITFORONE turns on MSG_DONTWAIT after one packet */
> > +               if (!datagrams++ && flags & MSG_WAITFORONE)
> > +                       flags |= MSG_DONTWAIT;
> 
> There should be an extra pair of parenthesis around the & operands but
> aside from that I could imagine this to work.
> 
> Would be nice if the netdev folks would respond.


It seems fine; Brandon, please submit a formal patch.

Thanks




* Re: behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-27 13:19 UTC
  To: Chris Friesen, Arnaldo Carvalho de Melo; +Cc: linux-kernel, netdev

On Wed, Mar 24, 2010 at 2:55 PM, Brandon Black <blblack@gmail.com> wrote:
> On Wed, Mar 24, 2010 at 2:36 PM, Chris Friesen <cfriesen@nortel.com> wrote:
>> Consider the case where you want to do some other useful work in
>> addition to running your network server.  Every cpu cycle spent on the
>> network server is robbed from the other work.  In this scenario you want
>> to handle packets as efficiently as possible, so the timeout-based
>> behaviour is better since it is more likely to give you multiple packets
>> per syscall.
>
> That's a good point, I tend to tunnelvision on the dedicated server
> scenario.  I should probably have a user-level option for
> timeout-based operation as well, since the decision here gets to the
> systems admin/engineering level and will be situational.

I've been playing with the timeout argument to recvmmsg as well now,
and I'm struggling to see how one would ever use it correctly with the
current implementation.  It seems to rely on the assumption of a
never-ending stream of tightly-spaced input packets?  It seems like it
was meant for usage on blocking sockets.  Given a blocking socket with
timeout 0 (infinite), and a recvmmsg timeout of 100us, if you had a
very steady stream of input packets, recvmmsg would pull in all of
them that it could within a max timeframe of (100us +
time_to_execute_one_recvmsg).  However, any disruption to the input
stream for a time-window of N would result in delaying some
already-received packets by N.  For example, consider the case that 2
packets are already queued when you invoke recvmmsg(), but then the
next packet doesn't arrive for another 300ms.  In this scenario, you'd
end up with recvmmsg() blocking for 300ms and then returning all 3
packets, two of which have been delayed way beyond the specified
timeout.

-- Brandon


* Re: behavior of recvmmsg() on blocking sockets
From: Arnaldo Carvalho de Melo @ 2010-03-27 14:26 UTC
  To: Brandon Black; +Cc: Chris Friesen, linux-kernel, netdev

On Sat, Mar 27, 2010 at 08:19:09AM -0500, Brandon Black wrote:
> On Wed, Mar 24, 2010 at 2:55 PM, Brandon Black <blblack@gmail.com> wrote:
> > On Wed, Mar 24, 2010 at 2:36 PM, Chris Friesen <cfriesen@nortel.com> wrote:
> >> Consider the case where you want to do some other useful work in
> >> addition to running your network server.  Every cpu cycle spent on the
> >> network server is robbed from the other work.  In this scenario you want
> >> to handle packets as efficiently as possible, so the timeout-based
> >> behaviour is better since it is more likely to give you multiple packets
> >> per syscall.
> >
> > That's a good point, I tend to tunnelvision on the dedicated server
> > scenario.  I should probably have a user-level option for
> > timeout-based operation as well, since the decision here gets to the
> > systems admin/engineering level and will be situational.
> 
> I've been playing with the timeout argument to recvmmsg as well now,
> and I'm struggling to see how one would ever use it correctly with the
> current implementation.  It seems to rely on the assumption of a
> never-ending stream of tightly-spaced input packets?  It seems like it

As said by somebody else in this recent discussion (perhaps Chris), it
is based on the maximum latency acceptable.

If minimum latency is desired, use a zero timeout and grab however
many packets got queued up while the application was processing the
last batch.

If instead more packets are desired per batch and some latency is
acceptable, use a timeout.

10 Gbit/s interfaces were the target, but results from a simple app,
published when the syscall was initially posted, showed that this
helped even on a 1 Gbit/s eth.

> was meant for usage on blocking sockets.  Given a blocking socket with
> timeout 0 (infinite), and a recvmmsg timeout of 100us, if you had a
> very steady stream of input packets, recvmmsg would pull in all of
> them that it could within a max timeframe of (100us +
> time_to_execute_one_recvmsg).  However, any disruption to the input
> stream for a time-window of N would result in delaying some
> already-received packets by N.  For example, consider the case that 2
> packets are already queued when you invoke recvmmsg(), but then the
> next packet doesn't arrive for another 300ms.  In this scenario, you'd
> end up with recvmmsg() blocking for 300ms and then returning all 3
> packets, two of which have been delayed way beyond the specified
> timeout.

And that is a use case that is fixed by your patch; thanks, now we
cover more use cases :-)

- Arnaldo


* Re: behavior of recvmmsg() on blocking sockets
From: Chris Friesen @ 2010-03-29 16:18 UTC
  To: Brandon Black; +Cc: Arnaldo Carvalho de Melo, linux-kernel, netdev

On 03/27/2010 07:19 AM, Brandon Black wrote:

> I've been playing with the timeout argument to recvmmsg as well now,
> and I'm struggling to see how one would ever use it correctly with the
> current implementation.

I'd probably do something like this:

prev = current time
loop forever
	cur = current time
	timeout = max_latency - (cur - prev)
	recvmmsg(timeout)
	process all received messages
	prev = cur


Basically you determine the max latency you're willing to wait for a
packet to be handled, then subtract the amount of time you spent
processing messages from that and pass it into the recvmmsg() call as
the timeout.  That way no messages will be delayed for longer than the
max latency. (Not considering scheduling delays.)
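
In C, that might look like this (a sketch; max_latency_ns, msgs, vlen
and handle_msgs() are placeholders):

struct timespec prev, cur, timeout;
clock_gettime(CLOCK_MONOTONIC, &prev);
for (;;) {
    clock_gettime(CLOCK_MONOTONIC, &cur);
    /* budget = max latency minus the time spent since the last recv */
    long long left_ns = max_latency_ns
        - ((cur.tv_sec - prev.tv_sec) * 1000000000LL
           + (cur.tv_nsec - prev.tv_nsec));
    if (left_ns < 0)
        left_ns = 0;
    timeout.tv_sec  = left_ns / 1000000000LL;
    timeout.tv_nsec = left_ns % 1000000000LL;
    int n = recvmmsg(fd, msgs, vlen, 0, &timeout);
    if (n > 0)
        handle_msgs(msgs, n);    /* process all received messages */
    prev = cur;
}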

Chris


* Re: behavior of recvmmsg() on blocking sockets
From: Brandon Black @ 2010-03-29 17:24 UTC
  To: Chris Friesen; +Cc: Arnaldo Carvalho de Melo, linux-kernel, netdev

On Mon, Mar 29, 2010 at 11:18 AM, Chris Friesen <cfriesen@nortel.com> wrote:
>
> prev = current time
> loop forever
>        cur = current time
>        timeout = max_latency - (cur - prev)
>        recvmmsg(timeout)
>        process all received messages
>        prev = cur
>
>
> Basically you determine the max latency you're willing to wait for a
> packet to be handled, then subtract the amount of time you spent
> processing messages from that and pass it into the recvmmsg() call as
> the timeout.  That way no messages will be delayed for longer than the
> max latency. (Not considering scheduling delays.)

With a blocking socket, you'd also need to set SO_RCVTIMEO on the
underlying socket to some value that makes sense and is below your max
latency, because recvmmsg()'s timeout argument only applies in-between
underlying recvmsg() calls, not during them.  You're going to spend a
lot of time spinning if max_latency is low and there are any gaps in
the input stream though.  I guess for some uses this must make sense.
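
Something like this, as a sketch (50ms is an arbitrary value; it would
need to sit below the intended max latency):

/* Cap how long any single underlying recvmsg() may block, since
 * recvmmsg()'s own timeout is only checked in-between datagrams. */
struct timeval rcvto = { .tv_sec = 0, .tv_usec = 50 * 1000 };
setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &rcvto, sizeof(rcvto));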

The other potential usage is with non-blocking sockets, in which case
the timeout argument puts an upper bound on how long
recvmmsg() can spend fetching packets from the queue before it must
return, even if more packets are available.  Seems like for a given
kernel and hardware you could accomplish the same by tuning the vlen
argument.  In either case though, it seems like if you're running into
your hard latency limit on a non-blocking packet fetch and there are
already more packets waiting, you're probably verging on being unable
to meet the latency requirement for at least some of your packets, due
to a simple lack of CPU horsepower for the workload.

-- Brandon


* Re: behavior of recvmmsg() on blocking sockets
From: Chris Friesen @ 2010-03-29 17:48 UTC
  To: Brandon Black; +Cc: Arnaldo Carvalho de Melo, linux-kernel, netdev

On 03/29/2010 11:24 AM, Brandon Black wrote:
> On Mon, Mar 29, 2010 at 11:18 AM, Chris Friesen <cfriesen@nortel.com> wrote:
>>
>> prev = current time
>> loop forever
>>        cur = current time
>>        timeout = max_latency - (cur - prev)
>>        recvmmsg(timeout)
>>        process all received messages
>>        prev = cur
>>
>>
>> Basically you determine the max latency you're willing to wait for a
>> packet to be handled, then subtract the amount of time you spent
>> processing messages from that and pass it into the recvmmsg() call as
>> the timeout.  That way no messages will be delayed for longer than the
>> max latency. (Not considering scheduling delays.)
> 
> With a blocking socket, you'd also need to set SO_RCVTIMEO on the
> underlying socket to some value that makes sense and is below your max
> latency, because recvmmsg()'s timeout argument only applies in-between
> underlying recvmsg() calls, not during them.

Hmm...that's a good point.  For some reason I had been under the
impression that the timeout affected the underlying recvmsg() calls as
well.  I think it would make more sense for the kernel to abort a
blocking recvmsg() call once the timeout expires.

As for spending a lot of time spinning if there are gaps in the input
stream...in the cases where the time-based usage makes sense the normal
situation is that there are a lot of packets coming in.  A 10gig
ethernet pipe can theoretically receive something like 19 packets per
usec.  Doesn't take much of a delay before you probably have packets
waiting.

Chris

