* Re: Re: Disabling lapic timer for a certain core
@ 2010-03-04 21:03 M. Koehrer
  2010-03-05  7:05 ` Thomas Gleixner
  2010-03-05 13:54 ` M. Koehrer
  0 siblings, 2 replies; 12+ messages in thread
From: M. Koehrer @ 2010-03-04 21:03 UTC (permalink / raw)
  To: lclaudio, mathias_koehrer; +Cc: linux-rt-users

Hi Luis,

thanks for the reply.
It is an Intel Core2Quad CPU - one CPU with 4 cores,  
this is a SMP system...
Concerning the SMI issue:
I have globally disabled SMI with the LPC registers...
The main board is a server board from Supermicro, 
no USB enabled, VGA text mode only. No X running.

The total amount of time we have for a cycle is 40us.
Most often the application manages to run within 25-30us;
however, there is some jitter
that increases the run time to 45us, which is too slow.
I have done a kernel hack to be able to return directly from
the interrupt routine of the "Local Timer Interrupt" for core 3.
This helps, however it is really an ugly hack and I am looking
for a smarter way to do that. I have done this just to identify
the cause of the jitter.

The issue is that the kernel's interrupt routine
takes too long.
It would be fine to have the timer interrupt called more
often and to process only a subset of the
jobs with each call...
This would reduce the time for which user mode on that CPU
is interrupted by the timer routine.

The idea of running this application in an endless loop is to avoid
using timers (including the interrupt latency caused by them).
With pure polling, no interrupt is required.

Regards

Mathias

> | Hi all!
> | 
> | I am running the RT_PREEMPT kernel 2.6.31.2-rt13 on a Intel Quad Core
> CPU.
> | I start my kernel with isolcpus=1-7 option to force all processes to run
> on CPU core 0 only.
> | Now we have the need for a very fast loop. Within this loop some accesses
> from/to a PCIe I/O board
> | (mapped in user space) and some additional computation has to be done.
> | For this, I use a real time thread, run this on CPU core 3 and let it run
> in an endless polling loop.
> | So far everything works fine.
> | This thread is the only running user mode thread on CPU core 3.
> | However, we measure some run time jitters when accessing the I/O board in
> the range of up to 
> | 15 microseconds which are not tolerable by the application.
> | I see that on all cores the "Local Timer Interrupt" occurs 100 times a
> | second (of course this is the timer frequency select in the kernel
> configuration).
> 
> Are CPU0 and CPU3 on the same socket? Are you using a SMP or a NUMA box? I
> would suggest running your application in a CPU on a different socket just
> to ensure you are not having any cache issues.
> 
> Have you tried running hwlat_detect or smi-test? Your 15us threshold is
> pretty
> tight and could easilly be affected by SMI spikes (if present in your
> system).
> 
> Luis
> 


-- 
Mathias Koehrer
mathias_koehrer@arcor.de


--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Disabling lapic timer for a certain core
  2010-03-04 21:03 Re: Disabling lapic timer for a certain core M. Koehrer
@ 2010-03-05  7:05 ` Thomas Gleixner
  2010-03-05 13:54 ` M. Koehrer
  1 sibling, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2010-03-05  7:05 UTC (permalink / raw)
  To: M. Koehrer; +Cc: lclaudio, linux-rt-users

On Thu, 4 Mar 2010, M. Koehrer wrote:
> Hi Luis,

Please do _NOT_ top post. Please reply inline and in context.

> thanks for the reply.
> It is an Intel Core2Quad CPU - one CPU with 4 cores,  
> this is a SMP system...
> Concerning the SMI issue:
> I have globally disabled SMI with the LPC registers...

I hope you know what you are doing. Disabling SMIs can be dangerous in
various aspects:

  - thermal protection might be disabled by it
  - SMIs which fix chip(set) errata no longer work

You should at least confirm with your hardware vendor whether it's
safe to do so and which side effects you have to take into account.

> The main board is a server board from Supermicro, 
> no USB enabled, VGA text mode only. No X running.
> 
> The total amount of time we have for a cycle is 40us,

FYI, 40us is in the range where the hardware induced latencies can
bite you already badly. You run on a machine with shared L2 caches, so
your other core(s) might evict your code / data and you run into cache
misses which might take a while due to DMAs running etc...

These machines are designed for throughput, not for deterministic
behaviour in the single digit microsecond range.

> The issue is here, that the interrupt routine of the kernel
> takes too long here.
> It would be fine to have the timer interrupt called more 
> often and to process with each call only a subset of the 
> jobs to be done...
> This would reduce the time the CPU the user mode 
> is interrupted by the timer routine.

Err, by splitting the work you introduce even more overhead. That's
the wrong approach. The first question is which timers are running on
that CPU as you have isolated it.

/proc/timer_stats and /proc/timer_list and the event tracer might help
you to identify them.
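
As a minimal sketch of that first step, the following Python groups the
timer callbacks listed in /proc/timer_list by CPU. It assumes the usual
layout of that file ("cpu: N" headers followed by
"#n: <address>, function, S:state" entries); the helper name
timers_by_cpu is made up for illustration, and the exact format may
differ between kernel versions.

```python
import re
from collections import defaultdict

# Assumed layout of /proc/timer_list: a "cpu: N" header per CPU, then
# entries such as " #0: <ffff...>, tick_sched_timer, S:01".
CPU_RE = re.compile(r"^cpu:\s*(\d+)")
TIMER_RE = re.compile(r"^\s*#\d+:\s*<[^>]*>,\s*([\w.]+),\s*S:")


def timers_by_cpu(timer_list_text):
    """Map each CPU number to the timer callback names listed under it."""
    timers = defaultdict(list)
    cpu = None
    for line in timer_list_text.splitlines():
        cpu_match = CPU_RE.match(line)
        if cpu_match:
            cpu = int(cpu_match.group(1))
            continue
        timer_match = TIMER_RE.match(line)
        if timer_match and cpu is not None:
            timers[cpu].append(timer_match.group(1))
    return dict(timers)


def read_timer_list(path="/proc/timer_list"):
    """Read the file; this usually needs root privileges."""
    with open(path) as f:
        return f.read()
```

With the setup described in this thread,
timers_by_cpu(read_timer_list()).get(3, []) would show which callbacks
are still armed on the isolated core.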

In theory it's possible to remove the timer interrupt from such an
isolated core completely, but there needs to be some work done vs. the
scheduler, accounting, RCU etc. There are people looking into this,
but we have no patches yet.

> The idea of running this application in an endless loop is to avoid
> to use timers (including the interrupt latency caused by them).
> By pure polling, no interrupt is required.

What kind of application is that? Data acquisition or closed loop
processing?

Thanks,

	tglx


* Re: Re: Disabling lapic timer for a certain core
  2010-03-04 21:03 Re: Disabling lapic timer for a certain core M. Koehrer
  2010-03-05  7:05 ` Thomas Gleixner
@ 2010-03-05 13:54 ` M. Koehrer
  2010-03-06  9:18   ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: M. Koehrer @ 2010-03-05 13:54 UTC (permalink / raw)
  To: tglx, mathias_koehrer; +Cc: lclaudio, linux-rt-users

Hi Thomas, 

thank you very much for the reply.
> Please do _NOT_ top post. Please reply inline and in context.
Sorry for that...

> > The issue is here, that the interrupt routine of the kernel
> > takes too long here.
> > It would be fine to have the timer interrupt called more 
> > often and to process with each call only a subset of the 
> > jobs to be done...
> > This would reduce the time the CPU the user mode 
> > is interrupted by the timer routine.
> 
> Err, by splitting the work you introduce even more overhead. That's
> the wrong approach. The first question is which timers are running on
> that CPU as you have isolated it.
You are right. The total overhead is of course larger.
However, the overhead that could appear within any single one of our 40us loops
would be smaller. It is fine for me to have a 5us add-on with each loop.
But it is not acceptable to have a 15us add-on with every 1000th loop...
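
The trade-off described here can be put into numbers. The sketch below
uses only the figures quoted in this thread (40us budget, 25-30us typical
run time, 5us or 15us of timer overhead); the function names are
illustrative, and the "every 1000th loop" interval is taken from the text
as a rough figure, not a measurement.

```python
BUDGET_US = 40    # cycle budget stated in this thread
TYPICAL_US = 30   # upper end of the usual 25-30us run time


def worst_case_us(typical_us, overhead_us):
    """Run time of a loop iteration that is hit by the timer overhead."""
    return typical_us + overhead_us


def meets_budget(typical_us, overhead_us, budget_us=BUDGET_US):
    """True if even an interrupted iteration stays within the budget."""
    return worst_case_us(typical_us, overhead_us) <= budget_us


def mean_overhead_us(overhead_us, every_n_loops):
    """Average overhead per iteration when it occurs every n-th loop."""
    return overhead_us / every_n_loops


def summarize():
    """Print both scenarios side by side."""
    for label, overhead, every in (("rare 15us", 15, 1000),
                                   ("spread 5us", 5, 1)):
        print(label,
              "worst case:", worst_case_us(TYPICAL_US, overhead), "us,",
              "mean overhead:", mean_overhead_us(overhead, every), "us,",
              "within budget:", meets_budget(TYPICAL_US, overhead))
```

The rare 15us hit costs far less on average, but it is the worst case
(45us against a 40us budget) that matters for a closed loop.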

> 
> In theory it's possible to remove the timer interrupt from such an
> isolated core completely, but there needs to be some work done vs. the
> scheduler, accounting, RCU etc. There are people looking into this,
> but we have no patches yet.
I have checked the LAPIC addresses via the MSRs.
All LAPIC addresses for all CPU cores are the same. 
I assume they share the very same configuration, thus a minimum step 
would be to make a copy of this configuration data and to let CPU core 3
point to this copy. This would allow the timer to be disabled.

 
> 
> What kind of application is that ? Data acquistion or closed loop
> processsing ?
I am running a closed loop application.

Thank you very much.

Regards

Mathias

-- 
Mathias Koehrer
mathias_koehrer@arcor.de




* Re: Re: Disabling lapic timer for a certain core
  2010-03-05 13:54 ` M. Koehrer
@ 2010-03-06  9:18   ` Thomas Gleixner
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2010-03-06  9:18 UTC (permalink / raw)
  To: M. Koehrer; +Cc: lclaudio, linux-rt-users

On Fri, 5 Mar 2010, M. Koehrer wrote:
> > In theory it's possible to remove the timer interrupt from such an
> > isolated core completely, but there needs to be some work done vs. the
> > scheduler, accounting, RCU etc. There are people looking into this,
> > but we have no patches yet.
> I have checked the LAPIC addresses via the MSRs.
> All LAPIC addresses for all CPU cores are the same. 
> I assume they share the very same configuration, thus a minimum step 
> would be to make a copy of this configuration data and to let CPU core 3
> point to this copy. This would allow to disable the timer.

Really ? If it would be enough to disable the timer interrupt and not
let it fire, it would have been done years ago.

Did you even try to read what I said above ?

> >                 ...., but there needs to be some work done vs. the
> > scheduler, accounting, RCU etc.

Linux was not designed that way and it requires a non trivial amount
of work to get this sorted out:

> > There are people looking into this, but we have no patches yet.

Thanks,

	tglx


* Re: Disabling lapic timer for a certain core
  2011-05-23 21:40 Joe Howard
@ 2011-05-23 23:49 ` Frank Rowand
  0 siblings, 0 replies; 12+ messages in thread
From: Frank Rowand @ 2011-05-23 23:49 UTC (permalink / raw)
  To: Joe Howard; +Cc: linux-rt-users

On 05/23/11 14:40, Joe Howard wrote:
> Apologies if this is improper etiquette top-posting and bumping a
> thread that is over 1 year old, but I am curious if any progress has
> been made toward the desire to shield one or more cores from other
> interrupts and processes?  Particularly the lapic timer interrupt.
> 
> It seems the combination of cpusets and irq smp_affinity get 95% of
> the way there, but the "Local timer interrupts" cannot be disabled
> for a given core.  My application runs a polling loop that consumes
> 100% cpu time on a "shielded" core.  The code in the loop takes about
> 500ns to execute and runs once every 5000ns (using "rdtsc"
> instruction to throttle).  I'm seeing an increase of 3000ns duration
> in one cycle out of every 200 (corresponding to the default
> HZ=1000).
> 
> Thanks, -Joe H


At ELC 2011, Thomas Gleixner gave a presentation titled "Status of
Preempt-RT and why there is no roadmap".  The slides are at:

   http://elinux.org/images/c/ca/Elc2011_gleixner.pdf

On slide 12, one of the "Future features" listed is "Full CPU isolation".
So the good news is that the feature is on the radar screen.  The bad
news is that the future is not here yet.

-Frank






* Re: Disabling lapic timer for a certain core
@ 2011-05-23 21:40 Joe Howard
  2011-05-23 23:49 ` Frank Rowand
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Howard @ 2011-05-23 21:40 UTC (permalink / raw)
  To: linux-rt-users

Apologies if this is improper etiquette top-posting and bumping a thread that is over 1 year old, but I am curious if any progress has been made toward the desire to shield one or more cores from other interrupts and processes?  Particularly the lapic timer interrupt.

It seems the combination of cpusets and irq smp_affinity get 95% of the way there, but the "Local timer interrupts" cannot be disabled for a given core.  My application runs a polling loop that consumes 100% cpu time on a "shielded" core.  The code in the loop takes about 500ns to execute and runs once every 5000ns (using "rdtsc" instruction to throttle).  I'm seeing an increase of 3000ns duration in one cycle out of every 200 (corresponding to the default HZ=1000).
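
The relation between the loop period and HZ, and the shape of such a
throttled polling loop, can be sketched as follows. This is an
illustration only: time.perf_counter_ns() stands in for the rdtsc
instruction, the function names are hypothetical, and Python itself is
far too slow for a 500ns loop body.

```python
import time


def loops_per_tick(hz, period_ns):
    """Number of loop periods that fit between two timer ticks."""
    tick_ns = 1_000_000_000 // hz
    return tick_ns // period_ns


def paced_poll(work, period_ns, iterations):
    """Run work() once per period by busy-waiting.

    Returns the measured duration of each work() call in nanoseconds.
    """
    durations = []
    next_start = time.perf_counter_ns()
    for _ in range(iterations):
        while time.perf_counter_ns() < next_start:
            pass  # busy-wait: no sleep and no timer involved
        start = time.perf_counter_ns()
        work()
        durations.append(time.perf_counter_ns() - start)
        next_start += period_ns
    return durations


def count_outliers(durations, threshold_ns):
    """Count iterations that took longer than threshold_ns."""
    return sum(1 for d in durations if d > threshold_ns)
```

With HZ=1000 and a 5000ns period, loops_per_tick gives 200, which matches
the one-in-200 pattern reported above; with HZ=100 and a 40us loop the
same calculation gives 250.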

Thanks,
	-Joe H


On Fri, 5 Mar 2010, Mark Hounschell wrote:
> On 03/04/2010 09:04 AM, Luis Claudio R. Goncalves wrote:
> > On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
> > | Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
> > | I want to use the full 100% CPU core time for this single loop.
> > | 
> > | Any help or ideas are welcome!
> > | 
> 
> This has been a long standing issue with me too. Moving your process to
> another socket won't help you. It is not a cache issue. It is the local
> timer interrupt just as you suspect. I've played with disabling it on a
> core but haven't been successful. This is a problem with both the vanilla
> and RT kernels. No matter what you do as far as isolation of tasks and
> normal interrupts, the local timer interrupt kills ya. The kernel is broken
> in this regard, by design. Your processors aren't yours. The kernel
> developers insist on claiming a piece of every one of them for their code.
> The kernel people will never change/fix this flaw in it's basic design
> because only a few (hard real-time) consider it a problem. Those people are
> told to use something else and that Linux wasn't designed for that kind of
> thing.

That's crap in several ways.

1) This is not a problem where only real-time folks are interested
   in. HPC folks complain about the same thing.

2) As I said yesterday, we are aware of the problem and people are
   working on a solution. It's just not as trivial as disabling the
   timer interrupt. We need to sort out housekeeping stuff and solve
   other problems to gain full isolation of a core.

I'm really fed up with the attitude of folks who claim that the kernel
is broken and we just are not interested in fixing it. This has been
discussed several times at great length and all the details have been
pointed out which need work. But we have not seen a single patch from
the very people who whine about it.

> Unfortunately, the instructions (rdmsr and wrmsr) that could be used to
> disable/re-enable the local timer interrupt require DOM-0 privileges and
> can't be executed from user land. If it were not for that one little thing
> a solution would be easy. You wouldn't even need the RT patch set any more.
> 
> You could probably hack the kernel up such that you could get DOM-0
> privileges in user land but don't expect any help from any kernel people
> for that sort of thing.

True, simply because we are not interested in hacked up one off shit,
which breaks things left and right.
 
Thanks,

	tglx



* Re: Disabling lapic timer for a certain core
  2010-03-06  9:10     ` Thomas Gleixner
@ 2010-03-06 13:12       ` Mark Hounschell
  0 siblings, 0 replies; 12+ messages in thread
From: Mark Hounschell @ 2010-03-06 13:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mark Hounschell, Luis Claudio R. Goncalves, M. Koehrer, linux-rt-users

On 03/06/2010 04:10 AM, Thomas Gleixner wrote:
> On Fri, 5 Mar 2010, Mark Hounschell wrote:
>> On 03/04/2010 09:04 AM, Luis Claudio R. Goncalves wrote:
>>> On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
>>> | Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
>>> | I want to use the full 100% CPU core time for this single loop.
>>> | 
>>> | Any help or ideas are welcome!
>>> | 
>>
>> This has been a long standing issue with me too. Moving your process to
>> another socket won't help you. It is not a cache issue. It is the local
>> timer interrupt just as you suspect. I've played with disabling it on a
>> core but haven't been successful. This is a problem with both the vanilla
>> and RT kernels. No matter what you do as far as isolation of tasks and
>> normal interrupts, the local timer interrupt kills ya. The kernel is broken
>> in this regard, by design. Your processors aren't yours. The kernel
>> developers insist on claiming a piece of every one of them for their code.
>> The kernel people will never change/fix this flaw in it's basic design
>> because only a few (hard real-time) consider it a problem. Those people are
>> told to use something else and that Linux wasn't designed for that kind of
>> thing.
> 
> That's crap in several ways.
> 
> 1) This is not a problem where only real-time folks are interested
>    in. HPC folks complain about the same thing.
> 
> 2) As I said yesterday, we are aware of the problem and people are
>    working on a solution. It's just not that trivial as disabling the
>    timer interrupt. We need to sort out housekeeping stuff and solve
>    other problems to gain full isolation of a core.
> 

This I surely didn't know. I have reported problems in the past that were a
result of my application being an affinitized RT CPU hog and I've been told
that the kernel doesn't support it and there was nothing that could be
done. Between that scenario and local timer interrupts I have heard only
that it is not something that will be supported so what am I to think.
This is just personal experience on LKML. If people are truly thinking
about these things now, I commend that.

> I'm really fed up with the attitude of folks who claim that the kernel
> is broken and we just are not interested to fix it. This has been
> discussed several times in great length and all the details have been
> pointed out which need work. But we have not seen a single patch from
> the very people who whine about it.
> 

I wish I was knowledgeable enough of the kernel to get involved in this but
am currently not. Maybe as a tester of it I would be willing. I'm sorry if
I asserted an attitude, I was just noting personal experiences related to
these sorts of things.

>> Unfortunately, the instructions (rdmsr and wrmsr) that could be used to
>> disable/re-enable the local timer interrupt require DOM-0 privileges and
>> can't be executed from user land. If it were not for that one little thing
>> a solution would be easy. You wouldn't even need the RT patch set any more.
>>
>> You could probably hack the kernel up such that you could get DOM-0
>> privileges in user land but don't expect any help from any kernel people
>> for that sort of thing.
> 
> True, simply because we are not interested in hacked up one off shit,
> which breaks things left and right.

Of course, and I understand that. And I don't expect anything less.

Regards
Mark



* Re: Disabling lapic timer for a certain core
  2010-03-05  9:59   ` Mark Hounschell
  2010-03-05 10:31     ` Jan Kiszka
@ 2010-03-06  9:10     ` Thomas Gleixner
  2010-03-06 13:12       ` Mark Hounschell
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2010-03-06  9:10 UTC (permalink / raw)
  To: Mark Hounschell
  Cc: Luis Claudio R. Goncalves, M. Koehrer, linux-rt-users, Mark Hounschell

On Fri, 5 Mar 2010, Mark Hounschell wrote:
> On 03/04/2010 09:04 AM, Luis Claudio R. Goncalves wrote:
> > On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
> > | Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
> > | I want to use the full 100% CPU core time for this single loop.
> > | 
> > | Any help or ideas are welcome!
> > | 
> 
> This has been a long standing issue with me too. Moving your process to
> another socket won't help you. It is not a cache issue. It is the local
> timer interrupt just as you suspect. I've played with disabling it on a
> core but haven't been successful. This is a problem with both the vanilla
> and RT kernels. No matter what you do as far as isolation of tasks and
> normal interrupts, the local timer interrupt kills ya. The kernel is broken
> in this regard, by design. Your processors aren't yours. The kernel
> developers insist on claiming a piece of every one of them for their code.
> The kernel people will never change/fix this flaw in it's basic design
> because only a few (hard real-time) consider it a problem. Those people are
> told to use something else and that Linux wasn't designed for that kind of
> thing.

That's crap in several ways.

1) This is not a problem where only real-time folks are interested
   in. HPC folks complain about the same thing.

2) As I said yesterday, we are aware of the problem and people are
   working on a solution. It's just not as trivial as disabling the
   timer interrupt. We need to sort out housekeeping stuff and solve
   other problems to gain full isolation of a core.

I'm really fed up with the attitude of folks who claim that the kernel
is broken and we just are not interested in fixing it. This has been
discussed several times at great length and all the details have been
pointed out which need work. But we have not seen a single patch from
the very people who whine about it.

> Unfortunately, the instructions (rdmsr and wrmsr) that could be used to
> disable/re-enable the local timer interrupt require DOM-0 privileges and
> can't be executed from user land. If it were not for that one little thing
> a solution would be easy. You wouldn't even need the RT patch set any more.
> 
> You could probably hack the kernel up such that you could get DOM-0
> privileges in user land but don't expect any help from any kernel people
> for that sort of thing.

True, simply because we are not interested in hacked up one off shit,
which breaks things left and right.
 
Thanks,

	tglx


* Re: Disabling lapic timer for a certain core
  2010-03-05  9:59   ` Mark Hounschell
@ 2010-03-05 10:31     ` Jan Kiszka
  2010-03-06  9:10     ` Thomas Gleixner
  1 sibling, 0 replies; 12+ messages in thread
From: Jan Kiszka @ 2010-03-05 10:31 UTC (permalink / raw)
  To: dmarkh
  Cc: Luis Claudio R. Goncalves, M. Koehrer, linux-rt-users, Mark Hounschell

Mark Hounschell wrote:
> On 03/04/2010 09:04 AM, Luis Claudio R. Goncalves wrote:
>> On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
>> | Hi all!
>> | 
>> | I am running the RT_PREEMPT kernel 2.6.31.2-rt13 on a Intel Quad Core CPU.
>> | I start my kernel with isolcpus=1-7 option to force all processes to run on CPU core 0 only.
>> | Now we have the need for a very fast loop. Within this loop some accesses from/to a PCIe I/O board
>> | (mapped in user space) and some additional computation has to be done.
>> | For this, I use a real time thread, run this on CPU core 3 and let it run in an endless polling loop.
>> | So far everything works fine.
>> | This thread is the only running user mode thread on CPU core 3.
>> | However, we measure some run time jitters when accessing the I/O board in the range of up to 
>> | 15 microseconds which are not tolerable by the application.
>> | I see that on all cores the "Local Timer Interrupt" occurs 100 times a
>> | second (of course this is the timer frequency select in the kernel configuration).
>>
>> Are CPU0 and CPU3 on the same socket? Are you using a SMP or a NUMA box? I
>> would suggest running your application in a CPU on a different socket just
>> to ensure you are not having any cache issues.
>>
>> Have you tried running hwlat_detect or smi-test? Your 15us threshold is pretty
>> tight and could easilly be affected by SMI spikes (if present in your
>> system).
>>
>> Luis
>>
>> | My question is now:
>> | Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
>> | I want to use the full 100% CPU core time for this single loop.
>> | 
>> | Any help or ideas are welcome!
>> | 
> 
> This has been a long standing issue with me too. Moving your process to
> another socket won't help you. It is not a cache issue. It is the local
> timer interrupt just as you suspect. I've played with disabling it on a
> core but haven't been successful. This is a problem with both the vanilla
> and RT kernels. No matter what you do as far as isolation of tasks and
> normal interrupts, the local timer interrupt kills ya. The kernel is broken
> in this regard, by design. Your processors aren't yours. The kernel
> developers insist on claiming a piece of every one of them for their code.
> The kernel people will never change/fix this flaw in it's basic design
> because only a few (hard real-time) consider it a problem. Those people are
> told to use something else and that Linux wasn't designed for that kind of
> thing.

Never say never. Given a safe and not too intrusive design, I bet this
could become mainline.

> 
> Unfortunately, the instructions (rdmsr and wrmsr) that could be used to
> disable/re-enable the local timer interrupt require DOM-0 privileges and
> can't be executed from user land. If it were not for that one little thing
> a solution would be easy. You wouldn't even need the RT patch set any more.

Right, with perfect isolation, you could run user-space-only RT on a
PREEMPT_NONE kernel.

> 
> You could probably hack the kernel up such that you could get DOM-0
> privileges in user land but don't expect any help from any kernel people
> for that sort of thing.

This would kill your box within a very short time. Other CPUs will once in
a while want to talk to your isolated CPU that then spins with IRQs
disabled. So it doesn't react, and the whole system locks up hard.

> 
> Your best bet will be to attempt to use a high speed video cards GPU for
> your process. There are methods available (out side of the kernel) for
> this. I haven't got there yet but I think NVIDIA's CUDA may be an answer.
> 
> 
> http://en.wikipedia.org/wiki/CUDA

Careful: We currently have a bug report open with those guys as the nvidia
driver issues RT-killing cache flushes once in a while (wbinvd on all
CPUs...). Definitely on certain memory allocations which you may avoid,
but it's as yet unclear if these are all. Well, binary-only
$your_preferred_term, you know...

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Disabling lapic timer for a certain core
  2010-03-04 14:04 ` Luis Claudio R. Goncalves
@ 2010-03-05  9:59   ` Mark Hounschell
  2010-03-05 10:31     ` Jan Kiszka
  2010-03-06  9:10     ` Thomas Gleixner
  0 siblings, 2 replies; 12+ messages in thread
From: Mark Hounschell @ 2010-03-05  9:59 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves
  Cc: M. Koehrer, linux-rt-users, Mark Hounschell, Mark Hounschell

On 03/04/2010 09:04 AM, Luis Claudio R. Goncalves wrote:
> On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
> | Hi all!
> | 
> | I am running the RT_PREEMPT kernel 2.6.31.2-rt13 on a Intel Quad Core CPU.
> | I start my kernel with isolcpus=1-7 option to force all processes to run on CPU core 0 only.
> | Now we have the need for a very fast loop. Within this loop some accesses from/to a PCIe I/O board
> | (mapped in user space) and some additional computation has to be done.
> | For this, I use a real time thread, run this on CPU core 3 and let it run in an endless polling loop.
> | So far everything works fine.
> | This thread is the only running user mode thread on CPU core 3.
> | However, we measure some run time jitters when accessing the I/O board in the range of up to 
> | 15 microseconds which are not tolerable by the application.
> | I see that on all cores the "Local Timer Interrupt" occurs 100 times a
> | second (of course this is the timer frequency select in the kernel configuration).
> 
> Are CPU0 and CPU3 on the same socket? Are you using a SMP or a NUMA box? I
> would suggest running your application in a CPU on a different socket just
> to ensure you are not having any cache issues.
> 
> Have you tried running hwlat_detect or smi-test? Your 15us threshold is pretty
> tight and could easilly be affected by SMI spikes (if present in your
> system).
> 
> Luis
> 
> | My question is now:
> | Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
> | I want to use the full 100% CPU core time for this single loop.
> | 
> | Any help or ideas are welcome!
> | 

This has been a long standing issue with me too. Moving your process to
another socket won't help you. It is not a cache issue. It is the local
timer interrupt just as you suspect. I've played with disabling it on a
core but haven't been successful. This is a problem with both the vanilla
and RT kernels. No matter what you do as far as isolation of tasks and
normal interrupts, the local timer interrupt kills ya. The kernel is broken
in this regard, by design. Your processors aren't yours. The kernel
developers insist on claiming a piece of every one of them for their code.
The kernel people will never change/fix this flaw in its basic design
because only a few (hard real-time) consider it a problem. Those people are
told to use something else and that Linux wasn't designed for that kind of
thing.

Unfortunately, the instructions (rdmsr and wrmsr) that could be used to
disable/re-enable the local timer interrupt require ring 0 privileges and
can't be executed from user land. If it were not for that one little thing
a solution would be easy. You wouldn't even need the RT patch set any more.

You could probably hack the kernel up such that you could get ring 0
privileges in user land but don't expect any help from any kernel people
for that sort of thing.

Your best bet will be to attempt to use a high speed video card's GPU for
your process. There are methods available (outside of the kernel) for
this. I haven't got there yet but I think NVIDIA's CUDA may be an answer.


http://en.wikipedia.org/wiki/CUDA


Regards
Mark


* Re: Disabling lapic timer for a certain core
  2010-03-04 12:29 M. Koehrer
@ 2010-03-04 14:04 ` Luis Claudio R. Goncalves
  2010-03-05  9:59   ` Mark Hounschell
  0 siblings, 1 reply; 12+ messages in thread
From: Luis Claudio R. Goncalves @ 2010-03-04 14:04 UTC (permalink / raw)
  To: M. Koehrer; +Cc: linux-rt-users

On Thu, Mar 04, 2010 at 01:29:53PM +0100, M. Koehrer wrote:
| Hi all!
| 
| I am running the RT_PREEMPT kernel 2.6.31.2-rt13 on a Intel Quad Core CPU.
| I start my kernel with isolcpus=1-7 option to force all processes to run on CPU core 0 only.
| Now we have the need for a very fast loop. Within this loop some accesses from/to a PCIe I/O board
| (mapped in user space) and some additional computation has to be done.
| For this, I use a real time thread, run this on CPU core 3 and let it run in an endless polling loop.
| So far everything works fine.
| This thread is the only running user mode thread on CPU core 3.
| However, we measure some run time jitters when accessing the I/O board in the range of up to 
| 15 microseconds which are not tolerable by the application.
| I see that on all cores the "Local Timer Interrupt" occurs 100 times a
| second (of course this is the timer frequency select in the kernel configuration).

Are CPU0 and CPU3 on the same socket? Are you using a SMP or a NUMA box? I
would suggest running your application in a CPU on a different socket just
to ensure you are not having any cache issues.

Have you tried running hwlat_detect or smi-test? Your 15us threshold is pretty
tight and could easily be affected by SMI spikes (if present in your
system).

Luis

| My question is now:
| Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
| I want to use the full 100% CPU core time for this single loop.
| 
| Any help or ideas are welcome!
| 
| Thanks a lot.
| 
| Regards
| 
| Mathias
| 
| -- 
| Mathias Koehrer
| mathias_koehrer@arcor.de
---end quoted text---

-- 
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]



* Disabling lapic timer for a certain core
@ 2010-03-04 12:29 M. Koehrer
  2010-03-04 14:04 ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 12+ messages in thread
From: M. Koehrer @ 2010-03-04 12:29 UTC (permalink / raw)
  To: linux-rt-users

Hi all!

I am running the RT_PREEMPT kernel 2.6.31.2-rt13 on an Intel Quad Core CPU.
I start my kernel with isolcpus=1-7 option to force all processes to run on CPU core 0 only.
Now we have the need for a very fast loop. Within this loop some accesses from/to a PCIe I/O board
(mapped in user space) and some additional computation has to be done.
For this, I use a real time thread, run this on CPU core 3 and let it run in an endless polling loop.
So far everything works fine.
This thread is the only running user mode thread on CPU core 3.
However, we measure some run time jitters when accessing the I/O board in the range of up to 
15 microseconds which are not tolerable by the application.
I see that on all cores the "Local Timer Interrupt" occurs 100 times a second (of course this is the timer frequency
selected in the kernel configuration).
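
One way to confirm the per-core rate is to sample the "LOC:" line of
/proc/interrupts twice and divide the difference by the interval. A
minimal sketch, assuming the usual /proc/interrupts layout (one count
column per CPU followed by a description); the helper names are
hypothetical.

```python
import time


def parse_loc_counts(interrupts_text):
    """Return per-CPU local timer interrupt counts from /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no LOC: line found")


def loc_rates(before_text, after_text, interval_s):
    """Local timer interrupts per second on each CPU between two snapshots."""
    before = parse_loc_counts(before_text)
    after = parse_loc_counts(after_text)
    return [(b2 - b1) / interval_s for b1, b2 in zip(before, after)]


def sample_rates(interval_s=1.0, path="/proc/interrupts"):
    """Take two live snapshots interval_s apart and return the rates."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        second = f.read()
    return loc_rates(first, second, interval_s)
```

On a kernel configured with HZ=100, sample_rates() would be expected to
report roughly 100 per second for a busy core.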

My question is now:
Is it possible to disable the "Local Timer Interrupt" for core 3 as it is actually not required here?
I want to use the full 100% CPU core time for this single loop.

Any help or ideas are welcome!

Thanks a lot.

Regards

Mathias

-- 
Mathias Koehrer
mathias_koehrer@arcor.de




end of thread, other threads:[~2011-05-23 23:50 UTC | newest]

Thread overview: 12+ messages
2010-03-04 21:03 Re: Disabling lapic timer for a certain core M. Koehrer
2010-03-05  7:05 ` Thomas Gleixner
2010-03-05 13:54 ` M. Koehrer
2010-03-06  9:18   ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2011-05-23 21:40 Joe Howard
2011-05-23 23:49 ` Frank Rowand
2010-03-04 12:29 M. Koehrer
2010-03-04 14:04 ` Luis Claudio R. Goncalves
2010-03-05  9:59   ` Mark Hounschell
2010-03-05 10:31     ` Jan Kiszka
2010-03-06  9:10     ` Thomas Gleixner
2010-03-06 13:12       ` Mark Hounschell
