From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Nero Fernandez <grimlynch@domain.hid>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-core] co-kernel benchmarking on arm926
Date: Mon, 28 Jun 2010 23:50:41 +0200
Message-ID: <4C291931.7010402@domain.hid>
In-Reply-To: <AANLkTilOt6PsWdCc7faKOBDy1MdCZp8Lgib1d5Dzy3cz@domain.hid>

Nero Fernandez wrote:
> On Fri, Jun 25, 2010 at 8:30 PM, Philippe Gerum <rpm@xenomai.org> wrote:
> 
>> On Thu, 2010-06-24 at 17:05 +0530, Nero Fernandez wrote:
>>> Thanks for your response, Philippe.
>>>
>>> The concerns while carrying out my experiments were to:
>>>
>>>  - compare Xenomai co-kernel overheads (timer and context-switch
>>>    latencies) in Xenomai space vs similar native-Linux overheads.
>>>    These are presented in the first two sheets.
>>>
>>>  - find out how the addition of Xenomai and Xenomai+Adeos affects
>>>    the native kernel's performance. Here, lmbench was used on the
>>>    native Linux side to estimate the changes to standard Linux
>>>    services.
>> How can you reasonably estimate the overhead of co-kernel services
>> without running any co-kernel services? Interrupt pipelining is not a
>> co-kernel service. You do nothing with interrupt pipelining except
>> enable co-kernel services to be implemented with a real-time response
>> guarantee.
>>
> 
> Repeating myself, sheets 1 and 2 contain the results of running
> co-kernel services (real-time pthreads, message queues, semaphores
> and clock_nanosleep) and taking measurements of the scheduling and
> timer-base functionality provided by the co-kernel via the POSIX skin.
> 
> The same code was then built against native POSIX instead of the
> Xenomai POSIX skin, and similar measurements were taken for the Linux
> scheduler and timer base. This is something I cannot do with Xenomai's
> native-skin tests (i.e., use them for native Linux benchmarking).
> The point here is to demonstrate what kind of benefits may be drawn
> from Xenomai space without any code change.
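
For reference, here is a minimal sketch of the kind of measurement loop
described above (the period, loop count, priority and names are
illustrative, not taken from the spreadsheet). Built against the Xenomai
POSIX skin (using the wrapping flags reported by xeno-config), it
exercises the co-kernel scheduler and timers; built against plain glibc
with -lpthread -lrt, the very same code measures the native Linux
scheduler and timer base:

#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL
#define PERIOD_NS    1000000LL	/* 1 ms period -- illustrative */
#define LOOPS        10000	/* illustrative */

static long long ts_to_ns(const struct timespec *ts)
{
	return (long long)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

int main(void)
{
	struct sched_param p = { .sched_priority = 99 };
	struct timespec next, now;
	long long lat, max = 0;
	int i;

	/* Become a SCHED_FIFO thread; under the Xenomai POSIX skin
	 * this also puts the thread under the co-kernel scheduler. */
	pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

	clock_gettime(CLOCK_MONOTONIC, &next);
	for (i = 0; i < LOOPS; i++) {
		next.tv_nsec += PERIOD_NS;
		while (next.tv_nsec >= NSEC_PER_SEC) {
			next.tv_nsec -= NSEC_PER_SEC;
			next.tv_sec++;
		}
		/* Sleep until the absolute deadline... */
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
		/* ...then record how late the wakeup actually was. */
		clock_gettime(CLOCK_MONOTONIC, &now);
		lat = ts_to_ns(&now) - ts_to_ns(&next);
		if (lat > max)
			max = lat;
	}
	printf("max wakeup latency: %lld ns\n", max);
	return 0;
}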
> 
> 
> 
>>> Regarding the addition of latency measurements in the sys-timer
>>> handler, I performed a similar measurement from
>>> xnintr_clock_handler(), and the results were similar to the ones
>>> reported from the sys-timer handler in Xenomai-enabled Linux.
>> If your benchmark is about Xenomai, then at least make sure to provide
>> results for Xenomai services, used in a relevant application and
>> platform context. Pretending that you instrumented
>> xnintr_clock_handler() at some point and got some results, but
>> eventually decided to illustrate your benchmark with other "similar"
>> results obtained from totally unrelated instrumentation code does not
>> help establish the figures as relevant.
>>
>> Btw, hooking xnintr_clock_handler() is not correct. Again, benchmarking
>> interrupt latency with Xenomai has to measure the entire code path, from
>> the moment the interrupt is taken by the CPU, until it is delivered to
>> the Xenomai service user. By instrumenting directly in
>> xnintr_clock_handler(), your test bypasses the Xenomai timer handling
>> code which delivers the timer tick to the user code, and the
>> rescheduling procedure as well, so your figures are optimistically wrong
>> for any normal use case based on real-time tasks.
>>
> 
> Regarding hooking a measurement into the sys-timer itself: it serves
> to observe the changes that Xenomai's aperiodic handling of the system
> timer brings. This measurement does not attempt to measure the
> co-kernel services in any manner.
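
To make concrete what this sys-timer instrumentation does and does not
capture, here is a rough sketch (hypothetical names throughout:
read_tsc() stands in for reading __ipipe_mach_tsc as in the patch, and
the other helpers are placeholders, not actual kernel or patch code):

#include <linux/interrupt.h>

static unsigned long long tsc_at_entry;

static irqreturn_t instrumented_timer_irq(int irq, void *dev_id)
{
	/* Captured here: only the time from the hardware IRQ to
	 * handler entry (stand-in for reading __ipipe_mach_tsc). */
	tsc_at_entry = read_tsc();

	handle_timer_tick();	/* the normal tick processing */

	/* Logging the delta at handler exit still excludes everything
	 * downstream: Xenomai timer dispatch, rescheduling and the
	 * switch to the waiting task, i.e. the path a real-time
	 * application actually waits for -- which is Philippe's point
	 * about the figures being optimistic. */
	log_delta(tsc_at_entry - programmed_expiry());

	return IRQ_HANDLED;
}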
> 
> 
> 
>>> While trying to make both these measurements, I tried to take care
>>> that delay-value logging is done at the end of the handler routines,
>>> but the __ipipe_mach_tsc value is recorded at the beginning of the
>>> routine (a patch for this is included in the worksheet itself).
>> This patch is hopelessly useless and misleading. Unless your intent is
>> to have your application directly embodied into low-level interrupt
>> handlers, you are not measuring the actual overhead.
>>
>> Latency is not solely a matter of interrupt masking, but also a matter
>> of I/D cache misses, particularly on ARM - you have to traverse the
>> actual code until delivery to exhibit the latter.
>>
>> This is exactly what the latency tests shipped with Xenomai are for:
>> - /usr/xenomai/bin/latency -t0/1/2
>> - /usr/xenomai/bin/klatency
>> - /usr/xenomai/bin/irqbench
>>
>> If your system involves user-space tasks, then you should benchmark
>> user-space response time using latency [-t0]. If you plan to use
>> kernel-based tasks such as RTDM tasks, then latency -t1 and klatency
>> tests will provide correct results for your benchmark.
>> If you are interested only in interrupt latency, then latency -t2 will
>> help.
>>
>> If you do think that those tests do not measure what you seem to be
>> interested in, then you may want to explain why on this list, so that we
>> eventually understand what you are after.
>>
>>> Regarding the system, changing the kernel version would invalidate
>>> my results, as the system is a released CE device and there are no
>>> plans to upgrade the kernel.
>> Ok. But that makes your benchmark 100% irrelevant with respect to
>> assessing the real performance of a decent co-kernel on your setup.
>>
>>> AFAIK, enabling FCSE would limit the number of concurrent processes,
>>> hence becoming unviable in my scenario.
>> Ditto. Besides, FCSE as implemented in recent I-pipe patches has a
>> best-effort mode which lifts those limitations, at the expense of
>> voiding the latency guarantee, but on average that would still be
>> much better than always suffering the VIVT cache insanity without
>> FCSE.
>>
> 
> Thanks for mentioning this. I will try to enable this option for
> re-measurements.
> 
> 
>> Quoting a previous mail of yours, regarding your target:
>>> Processor       : ARM926EJ-S rev 5 (v5l)
>> The latency hit induced by VIVT caching on arm926 is typically in the
>> 180-200 us range under load in user-space, and 100-120 us in kernel
>> space. So, without FCSE, this would bite at each Xenomai __and__ linux
>> process context switch. Since your application requires that more than
>> 95 processes be available in the system, you will likely get quite a few
>> switches in any given period of time, unless most of them always sleep,
>> of course.
>>
>> Ok, so let me make some wild guesses here: you told us this is a CE-based
>> application; maybe it exists already? maybe it has to be put on steroids
>> for gaining decent real-time guarantees it doesn't have yet? and perhaps
>> the design of that application involves many processes undergoing
>> periodic activities, so lots of context switches with address space
>> changes during normal operations?
>>
>> And, you want that to run on arm926, with no FCSE, and likely not a huge
>> amount of RAM either, with more than 95 different address spaces? Don't
>> you think there might be a problem? If so, don't you think implementing
>> a benchmark based on those assumptions might be irrelevant at some
>> point?
>>
>>> As far as the adeos patch is concerned, i took a recent one (2.6.32)
>> I guess you meant 2.6.33?
>>
> 
> Correction, 2.6.30.

Ok. If you are interested in the FCSE code, you may want to use FCSE v4.
See the comparison on the hackbench test here:
http://sisyphus.hd.free.fr/~gilles/pub/fcse/hackbench-fcse-v4.png

I did not rebase the I-pipe patch for 2.6.30 on this new FCSE, but you
can find it in the patches for 2.6.31 and 2.6.33, or as standalone trees
in my Adeos git tree:
http://git.xenomai.org/?p=ipipe-gch.git;a=summary

Also note, since we are re-hashing things tonight: as Philippe told
you, 95 processes is actually a lot on a low-end ARM platform, so you
had better be sure that you really need more than 95 processes
(beware, we are talking about processes here, i.e. memory spaces, not
threads; a process may have as many threads as it wants) before
deciding not to use the FCSE guaranteed mode. Thinking that the number
of processes is unlimited on a low-end/embedded ARM system is an error:
it is limited by the available resources (RAM, CPU) on your system. The
lower the resources, the lower the practical limit, and I bet this
practical limit is much lower than you would like.
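
To make the processes-vs-threads distinction concrete, a small
illustrative sketch (not from this thread): all of the workers below
share their parent's address space, so the whole program consumes a
single FCSE translation slot, whereas fork()ing one process per worker
would consume one slot each:

#include <pthread.h>
#include <stdio.h>

#define NR_WORKERS 8	/* illustrative */

static void *worker(void *arg)
{
	/* All workers run in the same memory space: one process, one
	 * FCSE PID, no address-space switch when switching between
	 * them. */
	printf("worker %ld running\n", (long)arg);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_WORKERS];
	long i;

	for (i = 0; i < NR_WORKERS; i++)
		pthread_create(&tid[i], NULL, worker, (void *)i);
	/* A fork() per worker here would instead create NR_WORKERS
	 * distinct address spaces, each counting against the FCSE
	 * limit. */
	for (i = 0; i < NR_WORKERS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}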

-- 
					    Gilles.


