* [patch V2 00/21] can: c_can: Another pile of fixes and improvements
@ 2014-04-11  8:13 Thomas Gleixner
From: Thomas Gleixner @ 2014-04-11  8:13 UTC
  To: linux-can
  Cc: Alexander Stein, Oliver Hartkopp, Marc Kleine-Budde,
	Wolfgang Grandegger, Mark

Changes since V1:

 - Slightly modified version of the interrupt reduction patch
 - Included the fix for PCH / C_CAN
 - Lockless XMIT path
 - Further reduction of register access
 - Add the missing can.type setup in c_can_pci.c
 - A pile of code cleanups.

It would be nice to reduce the register access some more by relying
completely on the status interrupt, but it turned out that the TX/RXOK
is not reliable enough. So we need to invalidate the message objects
in the tx softirq handling.

But the overall change of this series is that the I/O load gets
reduced by about 45% according to perf top. Though that PCH thing
sucks. The beaglebone manages to almost saturate the bus with short
packets at 1Mbit while PCH fails miserably and that's solely related to
the miserable I/O performance.

time cangen can0 -g0 -p10 -I5A5 -L0 -x -n 1000000 

arm: real 0m51.510s   I/O read:  ~6%   I/O write: 1.5%   I/O time: ~3.5s
x86: real 1m48.533s   I/O read: ~29%   I/O write: 0.8%   I/O time: ~32s!!

That's both with HW loopback on, as my PCH does not have a
transceiver. Granted, the C_CAN in the PCH needs the double IF transfer
to prevent the message loss, unlike the D_CAN in the ARM chip, but even
taking that into account it is a whopping 16s per 1M messages vs. 3.5s
on ARM.
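
As a rough cross-check of the I/O times quoted above, the sketch below
multiplies the wall clock time by the combined perf top I/O share. That
formula is an assumption, not something stated in this mail, and the
function name is made up for illustration. The x86 result lands close to
the quoted ~32s and its half close to 16s; the ARM result comes out a bit
above the quoted ~3.5s, so the original figures may have been derived
slightly differently.

```python
# Hypothetical helper: estimate the time spent in register I/O as
# wall time * (read share + write share). The formula is an assumption.
def io_seconds(wall_s, read_share, write_share):
    return wall_s * (read_share + write_share)

# Numbers taken from the cangen run above (1M frames each).
arm_io = io_seconds(51.510, 0.06, 0.015)
x86_io = io_seconds(60 + 48.533, 0.29, 0.008)

# The C_CAN in the PCH needs the double IF transfer, so halve the x86
# figure to compare it with the single transfer on D_CAN.
x86_io_single = x86_io / 2

print(f"arm I/O time:            {arm_io:5.1f} s")
print(f"x86 I/O time:            {x86_io:5.1f} s")
print(f"x86 I/O time, single IF: {x86_io_single:5.1f} s")
```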

w/o loopback the arm I/O read load drops to ~3.5% on the sender side
and ~5.5% on the receiver side. The time drops to 50.5s on the
transmit side if we do not have to get all the RX packets from HW
loopback. On TX we have a ~10us large gap every 16 packets which is
caused by the queue stall as we have to wait for the last
packet in the "FIFO" to be transferred. 
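
To put "almost saturate the bus" into numbers, here is a small sketch
under stated assumptions: the cangen options above (-I5A5 -L0) produce
standard 11-bit ID data frames without payload, such a frame occupies 44
bits plus 3 bits of intermission on the wire, and stuff bits are ignored,
so the result is only a lower bound for the bus time. The names are
illustrative and not taken from the driver.

```python
BITRATE = 1_000_000        # 1 Mbit/s
FRAMES = 1_000_000         # cangen -n 1000000
# Standard data frame with DLC 0: 44 bits + 3 bit intermission.
# Stuff bits are not counted, so this is a lower bound.
FRAME_BITS_NO_STUFF = 44 + 3

def min_bus_seconds(frames, bits_per_frame=FRAME_BITS_NO_STUFF, bitrate=BITRATE):
    """Minimum time the frames occupy the bus when sent back to back."""
    return frames * bits_per_frame / bitrate

def utilization(wall_s, frames=FRAMES):
    """Lower bound of the bus utilization for a measured wall clock time."""
    return min_bus_seconds(frames) / wall_s

# ~10us gap after every 16th packet, summed up over the whole run.
stall_s = (FRAMES // 16) * 10e-6

print(f"min bus time:         {min_bus_seconds(FRAMES):.1f} s")
print(f"utilization, 51.51 s: {utilization(51.510):.1%}")
print(f"utilization, 50.5 s:  {utilization(50.5):.1%}")
print(f"total queue stall:    {stall_s:.3f} s")
```

With these assumptions the queue stall accounts for well under a second
of the gap between the lower bound and the measured time; stuff bits,
which this sketch leaves out, would push the lower bound up.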

It seems there is a reason why the ATOM perf events do not expose the
stalled cpu cycles. But it's easy to figure out. You can compare the
CAN load case with some other scenario which has 100% CPU utilization
by running 

# perf stat -a sleep 60

The interesting part is: insns per cycle

CAN:	 0.23  insns per cycle
Other:	 0.53  insns per cycle

I don't have comparison numbers for ARM because the required perf
events are not supported there, but the perf top numbers and the transfer performance tell a
clear story.

There might be room for a few improvements, but I'm running out of
cycles and I really want to get the IF3 DMA feature functional on the
TI chips, but that seems to be an equally tedious reverse engineering
problem as the rest of this.

Thanks,

        tglx

---------
 Kconfig     |    7 
 c_can.c     |  662 +++++++++++++++++++++++++++---------------------------------
 c_can.h     |   21 -
 c_can_pci.c |    2 
 4 files changed, 320 insertions(+), 372 deletions(-)






Thread overview: 26+ messages
2014-04-11  8:13 [patch V2 00/21] can: c_can: Another pile of fixes and improvements Thomas Gleixner
2014-04-11  8:13 ` [patch V2 02/21] can: c_can: Fix startup logic Thomas Gleixner
2014-04-11  8:13 ` [patch V2 01/21] can: c_can_pci: Set the type of the IP core Thomas Gleixner
2014-04-11  8:13 ` [patch V2 03/21] can: c_can: Make bus off interrupt disable logic work Thomas Gleixner
2014-04-11  8:13 ` [patch V2 04/21] can: c_can: Do not access skb after net_receive_skb() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 05/21] can: c_can: Handle state change correctly Thomas Gleixner
2014-04-11  8:13 ` [patch V2 06/21] can: c_can: Fix berr reporting Thomas Gleixner
2014-04-11  8:13 ` [patch V2 07/21] can: c_can: Always update error stats Thomas Gleixner
2014-04-11  8:13 ` [patch V2 08/21] can: c_can: Simplify buffer reenabling Thomas Gleixner
2014-04-11  8:13 ` [patch V2 09/21] can: c_can: Avoid status register update for D_CAN Thomas Gleixner
2014-04-11  8:13 ` [patch V2 10/21] can: c_can: Get rid of pointless interrupts Thomas Gleixner
2014-04-11  8:13 ` [patch V2 11/21] can: c_can : Disable rx split as workaround Thomas Gleixner
2014-04-11  8:13 ` [patch V2 13/21] can: c_can: Cleanup irq enable/disable Thomas Gleixner
2014-04-11  8:13 ` [patch V2 12/21] can: c_can": Work around C_CAN RX wreckage Thomas Gleixner
2014-04-14  8:38   ` Alexander Stein
2014-04-14 20:13     ` Thomas Gleixner
2014-04-14 20:17       ` Marc Kleine-Budde
2014-04-11  8:13 ` [patch V2 14/21] can: c_can: Cleanup c_can_read_msg_object() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 15/21] can: c_can Cleanup setup of receive buffers Thomas Gleixner
2014-04-11  8:13 ` [patch V2 16/21] can: c_can: Cleanup c_can_inval_msg_object() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 17/21] can: c_can: Cleanup c_can_msg_obj_put/get() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 18/21] can: c_can: Cleanup c_can_write_msg_object() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 19/21] can: c_can: Use proper u32 variables in c_can_write_msg_object() Thomas Gleixner
2014-04-11  8:13 ` [patch V2 21/21] can: c_can: Speed up tx buffer invalidation Thomas Gleixner
2014-04-11  8:13 ` [patch V2 20/21] can: c_can: Remove tx locking Thomas Gleixner
2014-04-14  8:38 ` [patch V2 00/21] can: c_can: Another pile of fixes and improvements Alexander Stein
