linux-rt-users.vger.kernel.org archive mirror
* RT Summit 2018 & some advice on my application running ARM big.LITTLE
@ 2018-10-26 17:53 Christopher Obbard
From: Christopher Obbard @ 2018-10-26 17:53 UTC (permalink / raw)
  To: linux-rt-users

Hi Everyone,

First, can I thank you all for a very interesting day yesterday in Edinburgh!
Unfortunately I didn't get to stay much after 16:00 as we had to catch
a plane. It was very good to meet other RT users & listen to some
developers having in-depth conversations :-).


I am a user of the RT patchset in some fairly demanding multimedia
situations with a typical multichannel (8+) audio pipeline: ADC ->
buffer -> app -> buffer -> DAC.
The buffering is handled by jackd2, and the app runs with the same RT
priority and scheduling policy as the jack server.

Most of my work is on ARM (32- and 64-bit) platforms, with some on x86
and amd64 too, and we find that running the RT patchset really improves
audio latency.

One of the questions which came up yesterday was about the scheduler,
something I have not yet thought much about:
We are just setting the audio DMA interrupt (edma_ccint) to a priority
of around 95 with SCHED_FIFO, and the jack server to 90, also with
SCHED_FIFO.
Now this seems to work quite well on the single core 1 GHz Beaglebone system.
My first question: is there anything I am doing glaringly wrong here?
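
As a rough illustration of the setup described above, here is a minimal Python sketch that looks up an interrupt by name in /proc/interrupts, finds its handler thread (threaded interrupt handlers run as kernel threads named "irq/<number>-<name>") and switches it to SCHED_FIFO. The function names are hypothetical, the sample interrupt line in the comment is made up, and changing the policy needs root or CAP_SYS_NICE. It only covers the IRQ thread; the jack server is assumed to set its own priority.

```python
import os
from pathlib import Path


def irq_numbers_for(name_fragment, interrupts_text):
    """Return the IRQ numbers whose /proc/interrupts line mentions name_fragment.

    A line is assumed to look like (made-up example):
      " 20:   12345   INTC  12 Level   49000000.edma_ccint"
    """
    numbers = []
    for line in interrupts_text.splitlines():
        irq, sep, rest = line.partition(":")
        # Skip the CPU header line and non-numeric rows such as "NMI:" or "Err:".
        if sep and irq.strip().isdigit() and name_fragment in rest:
            numbers.append(int(irq.strip()))
    return numbers


def irq_thread_pids(irq, proc_root="/proc"):
    """Return the sorted PIDs of the kernel threads handling the given IRQ."""
    prefix = f"irq/{irq}-"
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # the task exited while we were scanning
        if comm.startswith(prefix):
            pids.append(int(entry.name))
    return sorted(pids)


def set_fifo_priority(pid, priority):
    """Switch pid to SCHED_FIFO at the given priority; False if not permitted."""
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False
    return True


if __name__ == "__main__":
    text = Path("/proc/interrupts").read_text()
    for irq in irq_numbers_for("edma_ccint", text):
        for pid in irq_thread_pids(irq):
            print(irq, pid, set_fifo_priority(pid, 95))
```

The same change can be made by hand with `chrt -f -p 95 <pid>` once the PID of the IRQ thread is known.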

So now I am working on a project that needs much more performance, so
naturally I want to throw multiple cores at it.
I have found the Rockchip RK3399, which has two Cortex-A72 cores & four
Cortex-A53 cores in a big.LITTLE style arrangement.

The story yesterday seemed to be that SMT is very bad and should be
disabled with RT; ARM does not have this feature, so I am okay.
The other topic mentioned was cache lines being shared between
multiple cores causing hard-to-reproduce outliers, & from what I have
read big.LITTLE shares cache lines between both processor types. So I
think I am going to have to disable the HMP and use the 4 fast cores?

Can you at all offer some quick advice to see if I am on the right track?



Cheers!

Christopher Obbard
64 Studio Ltd.


* Re: RT Summit 2018 & some advice on my application running ARM big.LITTLE
From: Sebastian Andrzej Siewior @ 2018-11-08 11:50 UTC (permalink / raw)
  To: Christopher Obbard; +Cc: linux-rt-users

On 2018-10-26 18:53:45 [+0100], Christopher Obbard wrote:
> Hi Everyone,
Hi,

> One of the questions which came up yesterday was about the scheduler,
> something I have not yet thought much about:
> We are just setting the audio DMA interrupt (edma_ccint) to a priority
> of around 95 with SCHED_FIFO, and the jack server to 90, also with
> SCHED_FIFO.
> Now this seems to work quite well on the single core 1 GHz Beaglebone system.
> My first question: is there anything I am doing glaringly wrong here?

The default priority is 50 for threaded interrupts. If edma_ccint is the
only one responsible for audio processing then lifting it should be fine.
You could try to use the sched_switch tracer and check if there is some
back-and-forth switching between edma_ccint and another interrupt (in
case there is another one involved in audio processing).
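
One way to run the check suggested above is to capture sched_switch events and count how often the audio IRQ thread hands over to another IRQ thread and back. The Python sketch below assumes the usual ftrace text format for sched_switch lines (prev_comm=... ==> next_comm=...); the function names are hypothetical and the thread names in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Matches the event part of an ftrace sched_switch line, for example:
#   sched_switch: prev_comm=A prev_pid=1 prev_prio=4 prev_state=S ==> next_comm=B next_pid=2 next_prio=49
SWITCH_RE = re.compile(
    r"sched_switch: prev_comm=(?P<prev>.+?) prev_pid=\d+ .*?"
    r"==> next_comm=(?P<next>.+?) next_pid=\d+"
)


def count_switches(trace_lines):
    """Count context switches per (previous task, next task) pair."""
    pairs = Counter()
    for line in trace_lines:
        match = SWITCH_RE.search(line)
        if match:
            pairs[(match.group("prev"), match.group("next"))] += 1
    return pairs


def irq_ping_pong(pairs, irq_comm):
    """Return {other IRQ thread: (switches to it, switches back from it)}.

    Only switches between irq_comm and other "irq/..." threads are counted.
    """
    others = {}
    for (prev, nxt), count in pairs.items():
        if prev == irq_comm and nxt.startswith("irq/"):
            to, back = others.get(nxt, (0, 0))
            others[nxt] = (to + count, back)
        elif nxt == irq_comm and prev.startswith("irq/"):
            to, back = others.get(prev, (0, 0))
            others[prev] = (to, back + count)
    return others
```

With the event enabled (typically via events/sched/sched_switch/enable in the tracing directory, usually /sys/kernel/tracing or /sys/kernel/debug/tracing), the lines of the trace file can be passed to count_switches(), and irq_ping_pong(pairs, "irq/20-49000000") then lists the other IRQ threads it alternates with. Task names in the trace are truncated to 15 characters.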

> So now I am working on a project that needs much more performance, so
> naturally I want to throw multiple cores at it.
> I have found the Rockchip RK3399, which has two Cortex-A72 cores & four
> Cortex-A53 cores in a big.LITTLE style arrangement.
> 
> The story yesterday seemed to be that SMT is very bad and should be
> disabled with RT; ARM does not have this feature, so I am okay.
I wouldn't say "very bad" but yes, the actual performance of one HT may
vary depending on how busy the other HT is and what it is doing. It
depends on how bad it can get and how much additional latency is still
acceptable for your case.

> The other topic mentioned was cache lines being shared between
> multiple cores causing hard-to-reproduce outliers, & from what I have
> read big.LITTLE shares cache lines between both processor types. So I
> think I am going to have to disable the HMP and use the 4 fast cores?

Two fast cores or four slow cores :)

I think it depends on what you are trying to achieve. If the outliers
are still in the range of "okay" then I wouldn't care much.
Usually the system is measured in the worst possible operating state and
checked if the measured latency is acceptable. That means load generating
applications like hackbench, disk-io or stress-ng (you name it) are run
and latency shouldn't suffer much. However if you start invalidating the
caches then the results get very bad.
From what I can see in [0], those two are separated. I think
there is an interconnect between the L2 and the main memory.
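
To make the worst-case measurement described above concrete, here is a crude Python sketch of a periodic wakeup-latency probe: it sleeps until a deadline, records how late it actually woke up, and reports min/avg/max while the load generators run in parallel. The function names and default values are hypothetical, and interpreter overhead inflates the numbers, so this only illustrates the principle; it is not a replacement for a dedicated tool such as cyclictest from rt-tests.

```python
import time


def measure_wakeup_latency(interval_us=1000, loops=1000):
    """Wake up every interval_us and return the lateness of each wakeup in us."""
    interval_ns = interval_us * 1000
    latencies = []
    deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # Lateness relative to the intended wakeup time.
        latencies.append((time.monotonic_ns() - deadline) / 1000)
        deadline += interval_ns
    return latencies


def summarize(latencies):
    """Return min, average and max; the max is what matters for RT."""
    return {
        "min": min(latencies),
        "avg": sum(latencies) / len(latencies),
        "max": max(latencies),
    }


if __name__ == "__main__":
    # Run this while hackbench, stress-ng or similar load is active.
    print(summarize(measure_wakeup_latency()))
```

The interesting figure is the maximum under load, compared against what the audio buffer size can tolerate.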

What might be bad for you latency-wise is if the task migrates from the big
to the little cluster. So task pinning on an RT system is always a good
idea, especially in this case :)
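
A minimal Python sketch of such pinning, using the CPU affinity calls in the os module. The CPU numbering in the comment (cpu0-3 for the Cortex-A53 cluster, cpu4-5 for the Cortex-A72 cluster) is an assumption about how the RK3399 is typically enumerated and should be verified on the target; the function names are hypothetical.

```python
import os


def parse_cpu_list(text):
    """Turn a CPU list such as "0,2-3" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def pin_to_cpus(cpus, pid=0):
    """Restrict pid (0 means the calling thread) to cpus; return the new mask."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Assumption: cpu4-5 are the two big cores on this board. Check first.
    print(pin_to_cpus(parse_cpu_list("4-5")))
```

For an existing program the same effect is available from the shell with `taskset -c 4-5 <command>`. Keeping all RT threads of the audio path on one cluster avoids the migration between big and little cores mentioned above.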

[0] http://opensource.rock-chips.com/wiki_RK3399

> Can you at all offer some quick advice to see if I am on the right track?
> 
> 
> 
> Cheers!
> 
> Christopher Obbard
> 64 Studio Ltd.

Sebastian
