* Same data to several sockets with just one syscall ?
@ 2016-02-12  8:53 Claudio Scordino
  2016-02-12 10:35 ` Eric Dumazet
  2016-02-12 14:25 ` Tom Herbert
  0 siblings, 2 replies; 6+ messages in thread
From: Claudio Scordino @ 2016-02-12  8:53 UTC (permalink / raw)
  To: netdev

Hi all.

Suppose I have an application that needs to send the very same data to
several sockets already connected. In this case, the application has
to call the sendto() syscall several times:

      for(...)
               sendto(...)

This makes the application waste time in entering/exiting the kernel
level several times.
Moreover, if I'm not wrong, the kernel is free to execute pending work
(e.g., softirqs) when returning from a syscall, making the application
experience further latency.

I therefore wonder if a mechanism exists for sending the data to
several sockets using just a single syscall. If not, has anybody ever
thought about adding a syscall like the following ?

      sendto_multicast(..., int number_of_sockets, const int sockets[])

I can't see any obvious reason why such an approach could be wrong.

Many thanks,

               Claudio

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Same data to several sockets with just one syscall ?
  2016-02-12  8:53 Same data to several sockets with just one syscall ? Claudio Scordino
@ 2016-02-12 10:35 ` Eric Dumazet
  2016-02-15 10:03   ` Claudio Scordino
  2016-02-12 14:25 ` Tom Herbert
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2016-02-12 10:35 UTC (permalink / raw)
  To: Claudio Scordino; +Cc: netdev

On Fri, 2016-02-12 at 09:53 +0100, Claudio Scordino wrote:

> This makes the application waste time in entering/exiting the kernel
> level several times.

syscall overhead is usually small. Real cost is actually getting to the
socket objects (fd manipulation), which you won't avoid with a
super-syscall anyway.

> Moreover, if I'm not wrong, the kernel is free to execute pending work
> (e.g., softirqs) when returning from a syscall, making the application
> experience further latency.

Well, softirqs can happen even in the middle of syscalls, not only at the
end of them.


* Re: Same data to several sockets with just one syscall ?
  2016-02-12  8:53 Same data to several sockets with just one syscall ? Claudio Scordino
  2016-02-12 10:35 ` Eric Dumazet
@ 2016-02-12 14:25 ` Tom Herbert
  1 sibling, 0 replies; 6+ messages in thread
From: Tom Herbert @ 2016-02-12 14:25 UTC (permalink / raw)
  To: Claudio Scordino; +Cc: Linux Kernel Network Developers

On Fri, Feb 12, 2016 at 9:53 AM, Claudio Scordino
<claudio@evidence.eu.com> wrote:
> Hi all.
>
> Suppose I have an application that needs to send the very same data to
> several sockets already connected. In this case, the application has
> to call the sendto() syscall several times:
>
>       for(...)
>                sendto(...)
>
> This makes the application waste time in entering/exiting the kernel
> level several times.
> Moreover, if I'm not wrong, the kernel is free to execute pending work
> (e.g., softirqs) when returning from a syscall, making the application
> experience further latency.
>
> I therefore wonder if a mechanism exists for sending the data to
> several sockets using just a single syscall. If not, has anybody ever
> thought about adding a syscall like the following ?
>
>       sendto_multicast(..., int number_of_sockets, const int sockets[])
>
> I can't see any obvious reason why such an approach could be wrong.
>
The design of KCM allows for this where sendmmsg can be used to send
messages over TCP to various connections. Overcoming HOL blocking
becomes the biggest issue to address with something like that I think.
Using recvmmsg to get messages from multiple TCP sockets should be
doable in KCM.

Tom

> Many thanks,
>
>                Claudio


* Re: Same data to several sockets with just one syscall ?
  2016-02-12 10:35 ` Eric Dumazet
@ 2016-02-15 10:03   ` Claudio Scordino
  2016-02-15 18:16     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Claudio Scordino @ 2016-02-15 10:03 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

Hi Eric,

2016-02-12 11:35 GMT+01:00 Eric Dumazet <eric.dumazet@gmail.com>:
> On Fri, 2016-02-12 at 09:53 +0100, Claudio Scordino wrote:
>
>> This makes the application waste time in entering/exiting the kernel
>> level several times.
>
> syscall overhead is usually small. Real cost is actually getting to the
> socket objects (fd manipulation), which you won't avoid with a
> super-syscall anyway.

Thank you for answering. I see your point.

However, assuming that a switch from user-space to kernel-space (and
back) needs about 200nsec of computation (which I guess is a
reasonable value for a 3GHz x86 architecture), the 50th receiver
experiences a latency of about 10 usec. In some domains (e.g.,
finance) this delay is not negligible.

Moving the "fan-out" code into kernel space would remove this waste of
time. IMHO, the latency reduction would pay back the 100 lines of code
for adding a new syscall.

Many thanks and best regards,

                 Claudio


* Re: Same data to several sockets with just one syscall ?
  2016-02-15 10:03   ` Claudio Scordino
@ 2016-02-15 18:16     ` Eric Dumazet
  2016-02-16  7:52       ` Claudio Scordino
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2016-02-15 18:16 UTC (permalink / raw)
  To: Claudio Scordino; +Cc: netdev

On Mon, 2016-02-15 at 11:03 +0100, Claudio Scordino wrote:
> Hi Eric,
> 
> 2016-02-12 11:35 GMT+01:00 Eric Dumazet <eric.dumazet@gmail.com>:
> > On Fri, 2016-02-12 at 09:53 +0100, Claudio Scordino wrote:
> >
> >> This makes the application waste time in entering/exiting the kernel
> >> level several times.
> >
> > syscall overhead is usually small. Real cost is actually getting to the
> > socket objects (fd manipulation), which you won't avoid with a
> > super-syscall anyway.
> 
> Thank you for answering. I see your point.
> 
> However, assuming that a switch from user-space to kernel-space (and
> back) needs about 200nsec of computation (which I guess is a
> reasonable value for a 3GHz x86 architecture), the 50th receiver
> experiences a latency of about 10 usec. In some domains (e.g.,
> finance) this delay is not negligible.

I thought these domains were using multicast.

> 
> Moving the "fan-out" code into kernel space would remove this waste of
> time. IMHO, the latency reduction would pay back the 100 lines of code
> for adding a new syscall.

It won't reduce the latency at all, and adds a lot of maintenance hassle.

syscall overhead is about 40 ns.
This is the time taken to transmit ~50 bytes on a 10Gbit link.

40ns * 50 = 2 usec only.

Feel free to implement your idea and test it, you'll discover the added
complexity is not worth it.


* Re: Same data to several sockets with just one syscall ?
  2016-02-15 18:16     ` Eric Dumazet
@ 2016-02-16  7:52       ` Claudio Scordino
  0 siblings, 0 replies; 6+ messages in thread
From: Claudio Scordino @ 2016-02-16  7:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

Hi Eric,

2016-02-15 19:16 GMT+01:00 Eric Dumazet <eric.dumazet@gmail.com>:
> On Mon, 2016-02-15 at 11:03 +0100, Claudio Scordino wrote:
>> Hi Eric,
>>
>> 2016-02-12 11:35 GMT+01:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> > On Fri, 2016-02-12 at 09:53 +0100, Claudio Scordino wrote:
>> >
>> >> This makes the application waste time in entering/exiting the kernel
>> >> level several times.
>> >
>> > syscall overhead is usually small. Real cost is actually getting to the
>> > socket objects (fd manipulation), which you won't avoid with a
>> > super-syscall anyway.
>>
>> Thank you for answering. I see your point.
>>
>> However, assuming that a switch from user-space to kernel-space (and
>> back) needs about 200nsec of computation (which I guess is a
>> reasonable value for a 3GHz x86 architecture), the 50th receiver
>> experiences a latency of about 10 usec. In some domains (e.g.,
>> finance) this delay is not negligible.
>
> I thought these domains were using multicast.

They don't :)

There are a couple of reasons behind their choice:

- Multicast works only in SOCK_DGRAM (i.e. unreliable)

- For a limited number of receivers (e.g. 50) and depending on the
data size, the latency of multicast is almost equal to that of TCP

>
>>
>> Moving the "fan-out" code into kernel space would remove this waste of
>> time. IMHO, the latency reduction would pay back the 100 lines of code
>> for adding a new syscall.
>
> It won't reduce the latency at all, and adds a lot of maintenance hassle.
>
> syscall overhead is about 40 ns.

I thought it was slightly higher. Does this time also include the
interrupt return to go back to user-space ?


> This is the time taken to transmit ~50 bytes on a 10Gbit link.
>
> 40ns * 50 = 2 usec only.
>
> Feel free to implement your idea and test it, you'll discover the added
> complexity is not worth it.

Honestly, I can't see how it could be that difficult: the kernel-side
code could just iterate on the existing syscall...

Can you please elaborate a bit further to let me understand why it
would be that complex ?

Many thanks and best regards,

                 Claudio

