linux-kernel.vger.kernel.org archive mirror
* Performance bottleneck with ndo_start_xmit
@ 2015-07-07 16:32 Jason A. Donenfeld
  2015-07-07 18:10 ` Stephen Hemminger
  0 siblings, 1 reply; 3+ messages in thread
From: Jason A. Donenfeld @ 2015-07-07 16:32 UTC (permalink / raw)
  To: netdev, linux-kernel

Hi folks,

I'm writing a kernel module that creates a virtual network device with
rtnl_link_register. At initialization time, it creates a UDP socket
with sock_create_kern. On ndo_start_xmit, it passes the data of the
skb to the UDP socket's sendmsg, after some minimal crypto and
processing. The device's MTU takes things into account properly. In
other words: it's a UDP-based tunnel device. And it works.

But I'm hitting a bottleneck in the send path (ndo_start_xmit) that I
can't seem to figure out. None of the aforementioned crypto or
processing contributes significantly. I boot up two virtual machines,
configure the tunnel on them, and run iperf to test bandwidth. Using
the tunnel device, I get around 450 Mbps. Without using the tunnel
device, I get around 5 Gbps. These performance characteristics remain
the same for 1 CPU and for 4 CPUs and for 8 CPUs.

When it maxes out at ~5 Gbps without using the tunnel device, the CPU
is at around 80%. When it maxes out at ~450 Mbps using the tunnel
device, the CPU is at 100%. Running perf top indicates that most of
the kernel time is spent in e1000_xmit, or the xmit function of
whichever driver underlies the UDP socket. Very little time is spent
in any functions related to my module or even inside UDP's sendmsg
call tree.
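
For a rough sense of scale, the figures above can be turned into a
per-packet CPU budget. This is only a back-of-the-envelope sketch, not
a measurement: it assumes every packet is 1500 bytes and that the
reported CPU percentage refers to a single core. Neither assumption is
stated in the message, and all names below are hypothetical.

```python
# Back-of-the-envelope per-packet CPU cost from the reported figures.
# ASSUMPTIONS (not from the original message): 1500-byte packets and
# CPU utilization measured against one core.

ASSUMED_PACKET_BYTES = 1500


def packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packet rate needed to carry the given bandwidth."""
    return bits_per_second / (packet_bytes * 8)


def cpu_us_per_packet(bits_per_second: float, packet_bytes: int,
                      cpu_fraction: float) -> float:
    """CPU microseconds spent per packet at the given utilization."""
    pps = packets_per_second(bits_per_second, packet_bytes)
    return cpu_fraction * 1_000_000 / pps


# Reported: ~450 Mbps at 100% CPU through the tunnel device.
tunnel_pps = packets_per_second(450e6, ASSUMED_PACKET_BYTES)
tunnel_us = cpu_us_per_packet(450e6, ASSUMED_PACKET_BYTES, 1.00)

# Reported: ~5 Gbps at ~80% CPU without the tunnel device.
direct_pps = packets_per_second(5e9, ASSUMED_PACKET_BYTES)
direct_us = cpu_us_per_packet(5e9, ASSUMED_PACKET_BYTES, 0.80)

if __name__ == "__main__":
    print(f"tunnel: {tunnel_pps:,.0f} pps, {tunnel_us:.2f} us/packet")
    print(f"direct: {direct_pps:,.0f} pps, {direct_us:.2f} us/packet")
    print(f"cost ratio: {tunnel_us / direct_us:.1f}x")
```

Under these assumptions the tunnel path costs roughly an order of
magnitude more CPU time per packet than the direct path. A different
packet size scales both figures equally, so the ratio depends only on
the reported bandwidth and CPU numbers.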

I'm stumped. I've tried workqueues, tasklets, all sorts of deferral.
I've tried not using a UDP _socket_ and instead constructing the
Ethernet, IP, and UDP headers myself, checksumming them, computing
the flowi4s, getting the MAC addresses, and passing the result to
dev_queue_xmit. But in all cases, the bandwidth stays the same:
450 Mbps at 100% CPU utilization, with the e1000_xmit (or
vmxnet3_xmit if I'm using that driver instead) function at the top of
the list in perf top.
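
To illustrate the header-checksumming step mentioned above, here is a
minimal userspace sketch of building an IPv4 header for a UDP payload
and filling in its checksum (the ones'-complement sum described in
RFC 1071). It is not the module's code: the function names, the
default TTL, and the fixed flags value are assumptions for
illustration, and in-kernel code would use the kernel's own checksum
helpers instead. The UDP checksum, which also covers a pseudo-header,
is not shown.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words, as in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int,
                proto: int = socket.IPPROTO_UDP, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options) with a valid checksum."""
    fields = [
        0x45,                    # version 4, header length 5 words
        0,                       # DSCP/ECN
        20 + payload_len,        # total length
        0,                       # identification
        0x4000,                  # flags: don't fragment (assumed)
        ttl,
        proto,
        0,                       # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    ]
    fmt = "!BBHHHBBH4s4s"
    fields[7] = internet_checksum(struct.pack(fmt, *fields))
    return struct.pack(fmt, *fields)


if __name__ == "__main__":
    header = ipv4_header("192.168.0.1", "192.168.0.199", payload_len=95)
    print(header.hex())
    # A header with a correct checksum sums to zero when re-checked.
    print(internet_checksum(header) == 0)
```

Re-running the checksum over the finished header is a quick sanity
check: a correctly filled header yields zero.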

I can confirm that the receive path never reaches 100% CPU
utilization, and hence the bottleneck is in the send path, described
above.

Can anyone help? Or point me in the right direction of where to learn?
I have exhausted all of the documentation resources I've been able to
find, and my eyes hurt from reading tens of thousands of lines of
kernel code trying to figure this out. I'm at a loss.

Any pointers would be greatly appreciated.

Regards,
Jason


* Re: Performance bottleneck with ndo_start_xmit
  2015-07-07 16:32 Performance bottleneck with ndo_start_xmit Jason A. Donenfeld
@ 2015-07-07 18:10 ` Stephen Hemminger
  2015-07-07 18:14   ` Jason A. Donenfeld
  0 siblings, 1 reply; 3+ messages in thread
From: Stephen Hemminger @ 2015-07-07 18:10 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: netdev, linux-kernel

On Tue, 7 Jul 2015 18:32:22 +0200
"Jason A. Donenfeld" <Jason@zx2c4.com> wrote:

> I'm writing a kernel module that creates a virtual network device with
> rtnl_link_register.

Is it open source, is the source available to look at?
If not, please solve your own problems.


* Re: Performance bottleneck with ndo_start_xmit
  2015-07-07 18:10 ` Stephen Hemminger
@ 2015-07-07 18:14   ` Jason A. Donenfeld
  0 siblings, 0 replies; 3+ messages in thread
From: Jason A. Donenfeld @ 2015-07-07 18:14 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, linux-kernel

On Tue, Jul 7, 2015 at 8:10 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Is it open source, is the source available to look at?
> If not, please solve your own problems.

Yes, it is. Right now the repo is password-protected: the software is
supposed to keep your data secure, but I haven't audited it yet, and I
don't want anyone to rely on it erroneously before I've made sure it's
safe. If my general question here doesn't turn up any good pointers,
I'll take the password off the repo and just add some massive
"do not use!" warnings.

