From: Tom Evans <tom_usenet@optusnet.com.au>
To: Marc Kleine-Budde <mkl@pengutronix.de>,
	dan.egnor@gmail.com, linux-can@vger.kernel.org
Subject: Re: Flexcan (was: Re: Fwd: Querying current tx_queue usage of a SocketCAN interface)
Date: Fri, 10 Apr 2015 16:35:43 +1000
Message-ID: <55276F3F.7020903@optusnet.com.au>
In-Reply-To: <552632F7.5090204@optusnet.com.au>

On 09/04/15 18:06, Tom Evans wrote:
> On 04/04/15 14:32, Tom Evans wrote:
>> On 2/04/2015 10:35 PM, Tom Evans wrote:
>> ...
>> And schedules NAPI to forward them from there rather than reading them from
>> the hardware FIFO.
>>
>> The purpose of NAPI is to make the interrupts as fast as possible, doing as
>> little work as possible, but servicing time-critical hardware so it doesn't
>> overflow/underflow. Operations like reading characters from a serial port.
>>
>> But that assumes the "little work" is fast. In the case of the FlexCAN driver,
>> it takes about 5 reads and a write to read a CAN message, and there may be six
>> messages in the FIFO.
>>
>> Not many accesses, but peripheral device registers can be notoriously slow on
>> some CPUs [1].
>  > ...
>> I'll try and measure this on Tuesday.
>
> Not quite tomorrow, but I have some results:
>
> [    1.494142] flexcan flexcan.1: One do_gettimeofday took 0 us)
> [    1.499903] flexcan flexcan.1: Ten do_gettimeofday took 4 us)
> [    1.505677] flexcan flexcan.1: 100 flexcan_read() took 23 us)
>
> I first measured the overhead of calling do_gettimeofday(), which is about
> 0.4us. So I can pretty much ignore that in this test.
>
> Then in a loop reading a FlexCAN control register, it took about 0.23us per
> read. That's 230ns or about 184 CPU clocks at 800MHz.
>
> OK, so this IS a slow peripheral.
>
> Given it takes about 5 reads to read one message, that's about 1.15us per
> message. With a queue depth of "6" that's a maximum extra delay of 6.9us.

That would only happen if interrupts were delayed for 6 whole CAN message 
times, which is over 600us. This should be unlikely. In the more common case, 
one interrupt would read one message, meaning only about 1.15us more than 
throwing to NAPI.
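
The arithmetic behind those figures can be written out as a small sketch. The helper names (ns_per_read, ns_to_clocks, drain_ns) are made up for illustration and are not part of the FlexCAN driver; the inputs are the measurements quoted above (100 reads in 23us, an 800MHz CPU, about 5 reads per message, a FIFO depth of 6).

```c
#include <assert.h>

/* Hypothetical helpers, only meant to reproduce the arithmetic in this mail. */

/* Nanoseconds per read, given the total loop time in microseconds. */
unsigned long ns_per_read(unsigned long total_us, unsigned long reads)
{
    assert(reads > 0);
    return total_us * 1000UL / reads;
}

/* CPU clocks that elapse in `ns` nanoseconds at `cpu_mhz` MHz. */
unsigned long ns_to_clocks(unsigned long ns, unsigned long cpu_mhz)
{
    return ns * cpu_mhz / 1000UL;
}

/* Time in ns to drain `msgs` messages at `reads_per_msg` reads each. */
unsigned long drain_ns(unsigned long read_ns, unsigned long reads_per_msg,
                       unsigned long msgs)
{
    return read_ns * reads_per_msg * msgs;
}
```

With the numbers from this thread, ns_per_read(23, 100) gives 230, ns_to_clocks(230, 800) gives 184, drain_ns(230, 5, 1) gives 1150 (1.15us) and drain_ns(230, 5, 6) gives 6900 (6.9us).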

Does anyone have any figures on how slow (how many CPU cycles to read and 
write) the other peripherals are on this CPU? This is something I've never 
seen in any Freescale manual for any of their CPUs.

I wonder if any of the other peripherals are faster? I can run that test myself:

[    1.588819] flexcan flexcan.1: 100 read(ssi)     @0x50014000  took 24 us
[    1.596449] flexcan flexcan.1: 100 read(esdhc1)  @0x50004000  took 25 us
[    1.604337] flexcan flexcan.1: 100 read(uart)    @0x5000c000  took 23 us
[    1.612051] flexcan flexcan.1: 100 read(flexcan) @0x53fc8000  took 23 us
[    1.620017] flexcan flexcan.1: 100 read(gpio)    @0x53f84000  took 26 us
[    1.627731] flexcan flexcan.1: 100 read(pwm)     @0x53fb8000  took 23 us
[    1.635358] flexcan flexcan.1: 100 read(i2c1)    @0x63fc0000  took 23 us
[    1.643076] flexcan flexcan.1: 100 read(fec)     @0x63fec000  took 27 us
[    1.650690] flexcan flexcan.1: 100 read(sdma)    @0x63fb0000  took 23 us
[    1.658406] flexcan flexcan.1: 100 read(sram)    @0xf8000000  took 17 us

The IRAM is a bit faster, but not by that much. I don't believe these tests. 
The IRAM is meant to be accessed in a few CLOCKS not a hundred! Maybe it is 
springing an MMU trap on every "I/O" access? That would account for the time.

I think I'm testing this the right way. The inner loop that reads the 
registers (after calling ioremap() to get an address), followed by its disassembly, is:

             tbase = ioremap(psDev->addr, 4096);
             do_gettimeofday(&now);
             reg = readl(tbase);
             for (i = 0; i < 100; i++)
             {
                 reg = readl(tbase);
             }
             do_gettimeofday(&now2);


  530:   ebfffffe    bl  0 <__arm_ioremap>
  534:   e2504000    subs    r4, r0, #0
...
  558:   e3a03064    mov r3, #100    ; 0x64
  55c:   e5942000    ldr r2, [r4]
  560:   f57ff04f    dsb sy
  564:   e2533001    subs    r3, r3, #1
  568:   e50b2030    str r2, [fp, #-48]  ; 0x30
  56c:   1afffffa    bne 55c <flexcan_probe+0x55c>
...

If I generate code that abuses "volatile" to read the registers, but leaves 
the "dsb" out, the time for the loop drops to 18us (180ns/read) for registers 
and 13us for the IRAM (130ns/read, or still about 100 CPU clocks at 800MHz).
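
For anyone who wants to try the same comparison outside the kernel, here is a rough user-space sketch of the "volatile read, with or without a barrier" loop. It is only an analogue under stated assumptions: avg_read_ns is a made-up name, it uses clock_gettime() instead of do_gettimeofday(), it reads ordinary (cached) memory rather than an ioremap()ed register, and atomic_thread_fence() is not guaranteed to emit the same "dsb sy" that readl() does. It will not reproduce the register timings above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: average cost in ns of one volatile 32-bit read,
 * measured over `reads` iterations. If with_fence is non-zero, a full
 * memory fence follows each read (a stand-in for the barrier in readl()).
 * The last value read is stored in *last when last is not NULL.
 */
double avg_read_ns(const volatile uint32_t *addr, unsigned reads,
                   int with_fence, uint32_t *last)
{
    struct timespec t0, t1;
    uint32_t reg = 0;
    unsigned i;

    assert(reads > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < reads; i++) {
        reg = *addr;                      /* volatile: one load per pass */
        if (with_fence)
            atomic_thread_fence(memory_order_seq_cst);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (last)
        *last = reg;
    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / reads;
}
```

Calling it twice on the same address, once with with_fence set to 0 and once with 1, gives the two per-read averages to compare. The loop count should be large enough that the two clock_gettime() calls do not dominate.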

I can believe the I/O registers are that slow, but why should the internal SRAM 
be that slow?

Tom


