* Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
       [not found] <9b2db90b0908060014r6a1763e8t1b3ee9310e012c25@mail.gmail.com>
@ 2009-08-06  7:15 ` Nir Tzachar
  2009-09-14 23:09   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Nir Tzachar @ 2009-08-06  7:15 UTC (permalink / raw)
  To: netdev

Hello.

Is there anything new with this patch? What are the plans for merging
it upstream?

Cheers.

p.s. please cc me, I am not registered to linux-netdev.

* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-08-06  7:15 ` Fwd: [RFC v3] net: Introduce recvmmsg socket syscall Nir Tzachar
@ 2009-09-14 23:09   ` Arnaldo Carvalho de Melo
  2009-09-15  8:37     ` Nir Tzachar
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-09-14 23:09 UTC (permalink / raw)
  To: Nir Tzachar; +Cc: Linux Networking Development Mailing List

On Thu, Aug 06, 2009 at 10:15:26AM +0300, Nir Tzachar wrote:
> Hello.
> 
> Is there anything new with this patch? What are the plans for merging
> it upstream?

I'm doing perf runs with a test app using recvmsg, then with the first
patch, which introduces recvmmsg, then with the second, which locks the
series of unlocked_recvmmsg calls just once. I'll try to get this posted
here soon.
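
For anyone who hasn't played with the patch yet, here is a minimal
sketch of a batched receive loop against the proposed prototype; the
port, batch size, and buffer size are placeholders, and error handling
is trimmed:

/*
 * Batched UDP sink sketch: one recvmmsg() call pulls up to VLEN
 * datagrams.  Placeholder port/batch size; trimmed error handling.
 */
#define _GNU_SOURCE
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

#define VLEN    8       /* datagrams per recvmmsg() call */
#define BUFSIZE 1500

int main(void)
{
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port   = htons(9999),  /* placeholder port */
		.sin_addr   = { .s_addr = htonl(INADDR_ANY) },
	};
	struct mmsghdr msgs[VLEN];
	struct iovec   iovecs[VLEN];
	char           bufs[VLEN][BUFSIZE];
	int fd = socket(AF_INET, SOCK_DGRAM, 0), i, n;

	if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("socket/bind");
		return 1;
	}

	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < VLEN; i++) {
		iovecs[i].iov_base         = bufs[i];
		iovecs[i].iov_len          = BUFSIZE;
		msgs[i].msg_hdr.msg_iov    = &iovecs[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	for (;;) {
		/* Up to VLEN datagrams per syscall; each mmsghdr's
		 * msg_len holds that datagram's size. */
		n = recvmmsg(fd, msgs, VLEN, 0, NULL);
		if (n < 0) {
			perror("recvmmsg");
			return 1;
		}
		for (i = 0; i < n; i++)
			printf("datagram %d: %u bytes\n", i, msgs[i].msg_len);
	}
}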

I'd really appreciate it if the people interested in this could try it
and post numbers too, to get this ball rolling again.

As for getting it upstream, well, posting numbers here would definitely
help with that :-)

- Arnaldo


* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-14 23:09   ` Arnaldo Carvalho de Melo
@ 2009-09-15  8:37     ` Nir Tzachar
  2009-09-15 14:11       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Nir Tzachar @ 2009-09-15  8:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Linux Networking Development Mailing List, Ziv Ayalon

On Tue, Sep 15, 2009 at 2:09 AM, Arnaldo Carvalho de
Melo<acme@ghostprotocols.net> wrote:
> On Thu, Aug 06, 2009 at 10:15:26AM +0300, Nir Tzachar wrote:
>> Hello.
>>
>> Is there anything new with this patch? What are the plans for merging
>> it upstream?
>
> I'm doing perf runs with a test app using recvmsg, then with the first
> patch, which introduces recvmmsg, then with the second, which locks the
> series of unlocked_recvmmsg calls just once. I'll try to get this posted
> here soon.
>
> I'd really appreciate it if the people interested in this could try it
> and post numbers too, to get this ball rolling again.
>
> As for getting it upstream, well, posting numbers here would definitely
> help with that :-)

Ok, here are some crude results:

Setup:
linux 2.6.29.2 with the third version of the patch, running on an
Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
9.04

Application:
A financial application, subscribing to quotes from a stock exchange.
Typical traffic is small (around 50-100 bytes) multicast packets in
large volumes. The application just receives the quotes and passes them
along.

The test:
Run two versions of the application, head to head: one using recvmsg
and the other recvmmsg. The data is passed to a third application that
measures the latency of the data.

Results:
In general, recvmmsg beats the pants off the regular recvmsg by a
whole millisecond (which might not sound like much, but is _really_ a lot
for us ;). The exact gap fluctuates between half a milli and
2 millis, but the average is 1 milli.

Conclusions:
We would _really_ like to see this patch go upstream. It gives an
important performance boost in our use cases.

We are willing to perform more accurate tests if needed, and would
appreciate feedback on how to conduct them.

Cheers,
Nir.

* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-15  8:37     ` Nir Tzachar
@ 2009-09-15 14:11       ` Arnaldo Carvalho de Melo
  2009-09-15 18:20         ` Nir Tzachar
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-09-15 14:11 UTC (permalink / raw)
  To: Nir Tzachar; +Cc: Linux Networking Development Mailing List, Ziv Ayalon

On Tue, Sep 15, 2009 at 11:37:36AM +0300, Nir Tzachar wrote:
> On Tue, Sep 15, 2009 at 2:09 AM, Arnaldo Carvalho de
> Melo<acme@ghostprotocols.net> wrote:
> > On Thu, Aug 06, 2009 at 10:15:26AM +0300, Nir Tzachar wrote:
> >> Hello.
> >>
> >> Is there anything new with this patch? What are the plans for merging
> >> it upstream?
> >
> > I'm doing perf runs with a test app using recvmsg, then with the first
> > patch, which introduces recvmmsg, then with the second, which locks the
> > series of unlocked_recvmmsg calls just once. I'll try to get this posted
> > here soon.
> >
> > I'd really appreciate it if the people interested in this could try it
> > and post numbers too, to get this ball rolling again.
> >
> > As for getting it upstream, well, posting numbers here would definitely
> > help with that :-)
> 
> Ok, here are some crude results:
> 
> Setup:
> linux 2.6.29.2 with the third version of the patch, running on an
> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
> 9.04

Which NIC? 10 Gbit/s?
 
> Application:
> A financial application, subscribing to quotes from a stock exchange.
> Typical traffic is small (around 50-100 bytes) multicast packets in
> large volumes. The application just receives the quotes and passes them
> along.

Exactly what made me work on this patch :-)
 
> The test:
> Run two versions of the application, head to head: one using recvmsg
> and the other recvmmsg. The data is passed to a third application that
> measures the latency of the data.
> 
> Results:
> In general, recvmmsg beats the pants off the regular recvmsg by a
> whole millisecond (which might not sound like much, but is _really_ a lot
> for us ;). The exact gap fluctuates between half a milli and
> 2 millis, but the average is 1 milli.

Do you have any testcase using publicly available software, like qpidd,
etc.? I'll eventually have to do that; for now I'm just using that
recvmmsg tool I posted, now with a recvmsg mode, and collecting 'perf
record' with and without callgraphs to post here. The client is just
pktgen spitting datagrams as if there is no tomorrow :-)
 
> Conclusions:
> We would _really_ like to see this patch go upstream. It gives an
> important performance boost in our use cases.

Great! I'll work on the lower-layer unlocked_recvmmsg followup patch,
addressing the concerns raised when I first posted it, and we should
see some more gains.
 
> We are willing to perform more accurate tests if needed, and would
> appreciate feedback on how to conduct them.

Showing that we get latency improvements is complementary to what I'm
doing, which for now is just showing the performance improvements and
what gives us this improvement (perf counter runs).

If you could come up with a testcase that you could share with us,
perhaps using one of these AMQP implementations, that would be great
too.
 
> Cheers,
> Nir.

Thanks a lot for sharing your results with us!

- Arnaldo

* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-15 14:11       ` Arnaldo Carvalho de Melo
@ 2009-09-15 18:20         ` Nir Tzachar
  2009-09-15 20:52           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Nir Tzachar @ 2009-09-15 18:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Linux Networking Development Mailing List, Ziv Ayalon

>> Setup:
>> linux 2.6.29.2 with the third version of the patch, running on an
>> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
>> 9.04
>
> Which NIC? 10 Gbit/s?

1G. We do not care as much about throughput as we do about latency...


>> Results:
>> In general, recvmmsg beats the pants off the regular recvmsg by a
>> whole millisecond (which might not sound like much, but is _really_ a lot
>> for us ;). The exact gap fluctuates between half a milli and
>> 2 millis, but the average is 1 milli.
>
> Do you have any testcase using publicly available software, like qpidd,
> etc.? I'll eventually have to do that; for now I'm just using that
> recvmmsg tool I posted, now with a recvmsg mode, and collecting 'perf
> record' with and without callgraphs to post here. The client is just
> pktgen spitting datagrams as if there is no tomorrow :-)

No. This was on a live, production system.


> Showing that we get latency improvements is complementary to what I'm
> doing, which for now is just showing the performance improvements and
> what gives us this improvement (perf counter runs).

We are more latency oriented and naturally concentrate on this
aspect of the problem. Producing numbers here is much easier...
I can easily come up with a test application that just measures the
latency of processing packets by employing a sending loop between two
hosts.
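
Something along these lines (just a sketch: the echo host's address,
the port, and the sample count are made up, and a real run would pin
CPUs and collect a full distribution):

/*
 * UDP ping-pong latency sketch: the peer runs a trivial echo; we
 * timestamp each round trip and print a one-way estimate in ns.
 */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
	struct sockaddr_in peer = {
		.sin_family = AF_INET,
		.sin_port   = htons(9999),  /* placeholder port */
	};
	struct timespec t0, t1;
	char buf[100];                      /* roughly quote-sized payload */
	int fd = socket(AF_INET, SOCK_DGRAM, 0), i;

	inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr); /* example address */
	memset(buf, 0, sizeof(buf));

	for (i = 0; i < 10000; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		sendto(fd, buf, sizeof(buf), 0,
		       (struct sockaddr *)&peer, sizeof(peer));
		recv(fd, buf, sizeof(buf), 0);  /* block until the echo */
		clock_gettime(CLOCK_MONOTONIC, &t1);

		printf("%ld\n", ((t1.tv_sec - t0.tv_sec) * 1000000000L +
				 (t1.tv_nsec - t0.tv_nsec)) / 2);
	}
	return 0;
}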

> If you could come up with a testcase that you could share with us,
> perhaps using one of these AMQP implementations, that would be great
> too.

Well, in our experience, AMQP and other solutions have latency issues.
Moreover, the receiving end of our application is a regular multicast
stream. I will implement the simple latency test I mentioned earlier,
and post some results soon.

Nir.

* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-15 18:20         ` Nir Tzachar
@ 2009-09-15 20:52           ` Arnaldo Carvalho de Melo
  2009-09-16  4:53             ` Simon Horman
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-09-15 20:52 UTC (permalink / raw)
  To: Nir Tzachar; +Cc: Linux Networking Development Mailing List, Ziv Ayalon

On Tue, Sep 15, 2009 at 09:20:13PM +0300, Nir Tzachar wrote:
> >> Setup:
> >> linux 2.6.29.2 with the third version of the patch, running on an
> >> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
> >> 9.04
> >
> > Which NIC? 10 Gbit/s?
> 
> 1G. We do not care as much about throughput as we do about latency...

OK, but anyway the 10 Gbit/s cards I've briefly played with all
exhibited lower latencies than all 1 Gbit/s ones; in fact, I've heard
of people moving to 10 Gbit/s not for the bandwidth, but for the lower
latencies :-)
 
> 
> >> Results:
> >> In general, recvmmsg beats the pants off the regular recvmsg by a
> >> whole millisecond (which might not sound like much, but is _really_ a lot
> >> for us ;). The exact gap fluctuates between half a milli and
> >> 2 millis, but the average is 1 milli.
> >
> > Do you have any testcase using publicly available software, like qpidd,
> > etc.? I'll eventually have to do that; for now I'm just using that
> > recvmmsg tool I posted, now with a recvmsg mode, and collecting 'perf
> > record' with and without callgraphs to post here. The client is just
> > pktgen spitting datagrams as if there is no tomorrow :-)
> 
> No. This was on a live, production system.

Wow :-)
> 
> > Showing that we get latency improvements is complementary to what I'm
> > doing, which for now is just showing the performance improvements and
> > what gives us this improvement (perf counter runs).
> 
> We are more latency oriented and naturally concentrate on this
> aspect of the problem. Producing numbers here is much easier...
> I can easily come up with a test application that just measures the
> latency of processing packets by employing a sending loop between two
> hosts.
> 
> > If you could come up with a testcase that you could share with us,
> > perhaps using one of these AMQP implementations, that would be great
> > too.
> 
> Well, in our experience, AMQP and other solutions have latency issues.
> Moreover, the receiving end of our application is a regular multicast
> stream. I will implement the simple latency test I mentioned earlier,
> and post some results soon.

OK.

And here are some callgraphs for a very simple app (attached) that stops
after receiving 1 million datagrams, collected with the perf tools in
the kernel. No packets-per-second numbers, other than the fact that the
recvmmsg test run collected far fewer samples (it finished quicker).

Client is pktgen sending 100 byte datagrams over a single tg3 1 Gbit/s
NIC, server is running over a bnx2 1 Gbit/s link as well, just a sink:

With recvmmsg, batch of 8 datagrams, no timeout:

http://oops.ghostprotocols.net:81/acme/perf.recvmmsg.step1.cg.data.txt.bz2

And with recvmsg:

http://oops.ghostprotocols.net:81/acme/perf.recvmsg.step1.cg.data.txt.bz2

Notice where we are spending time in the recvmmsg case... :-)
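
For readers without the attachment, the core loop would be something in
this spirit (an illustration, not the posted tool), stopping after 1
million datagrams with either one recvmsg() per datagram or recvmmsg()
in batches of 8 and no timeout:

#define _GNU_SOURCE
#include <sys/socket.h>

/* 'msgs' is assumed to be set up like the mmsghdr array in the
 * earlier sketch; only the receive strategy differs between runs. */
static void sink(int fd, int use_mmsg, struct mmsghdr *msgs)
{
	long received = 0;

	while (received < 1000000) {
		if (use_mmsg) {
			/* batch of 8 datagrams, NULL timeout */
			int n = recvmmsg(fd, msgs, 8, 0, NULL);
			if (n > 0)
				received += n;
		} else {
			/* one syscall per datagram */
			if (recvmsg(fd, &msgs[0].msg_hdr, 0) >= 0)
				received++;
		}
	}
}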

- Arnaldo

* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-15 20:52           ` Arnaldo Carvalho de Melo
@ 2009-09-16  4:53             ` Simon Horman
  2009-09-16 11:52               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Horman @ 2009-09-16  4:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Nir Tzachar, Linux Networking Development Mailing List, Ziv Ayalon

On Tue, Sep 15, 2009 at 05:52:07PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Sep 15, 2009 at 09:20:13PM +0300, Nir Tzachar wrote:
> > >> Setup:
> > >> linux 2.6.29.2 with the third version of the patch, running on an
> > >> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
> > >> 9.04
> > >
> > > Which NIC? 10 Gbit/s?
> > 
> > 1G. We do not care as much about throughput as we do about latency...
> 
> OK, but anyway the 10 Gbit/s cards I've briefly played with all
> exhibited lower latencies than all 1 Gbit/s ones; in fact, I've heard
> of people moving to 10 Gbit/s not for the bandwidth, but for the lower
> latencies :-)

Hi Arnaldo,

Out of curiosity, is that using a 10Gbit card with
a 10Gbit link or a 1Gbit link?



* Re: Fwd: [RFC v3] net: Introduce recvmmsg socket syscall
  2009-09-16  4:53             ` Simon Horman
@ 2009-09-16 11:52               ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-09-16 11:52 UTC (permalink / raw)
  To: Simon Horman
  Cc: Arnaldo Carvalho de Melo, Nir Tzachar,
	Linux Networking Development Mailing List, Ziv Ayalon

On Wed, Sep 16, 2009 at 02:53:44PM +1000, Simon Horman wrote:
> On Tue, Sep 15, 2009 at 05:52:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Sep 15, 2009 at 09:20:13PM +0300, Nir Tzachar wrote:
> > > >> Setup:
> > > >> linux 2.6.29.2 with the third version of the patch, running on an
> > > >> Intel Xeon X3220 2.4GHz quad core, with 4Gbyte of ram, running Ubuntu
> > > >> 9.04
> > > >
> > > > Which NIC? 10 Gbit/s?
> > > 
> > > 1G. We do not care as much about throughput as we do about latency...
> > 
> > OK, but anyway the 10 Gbit/s cards I've briefly played with all
> > exhibited lower latencies than all 1 Gbit/s ones; in fact, I've heard
> > of people moving to 10 Gbit/s not for the bandwidth, but for the lower
> > latencies :-)
> 
> Hi Arnaldo,
> 
> Out of curiosity, is that using a 10Gbit card with
> a 10Gbit link or a 1Gbit link?

It's been a while, but most details should be publicly available here:

http://www.stacresearch.com/node/4211

- Arnaldo
