* [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
@ 2012-07-11 18:10 Jorge Ramirez Ortiz,  HCL Europe
  2012-07-12  0:21 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-11 18:10 UTC (permalink / raw)
  To: xenomai

Could I please get some background on the possible causes of this message? I have read that it might be related to __copy_to_user/__copy_from_user, but I have already validated that path in the driver, so it must be something else.
Also, I am OK to assume that the system is not reliable after this event.
thanks in advance
Jorge


::DISCLAIMER::
----------------------------------------------------------------------------------------------------------------------------------------------------

The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and other defects.

----------------------------------------------------------------------------------------------------------------------------------------------------



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-11 18:10 [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-12  0:21 ` Gilles Chanteperdrix
  2012-07-12  8:16   ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 1 reply; 16+ messages in thread
From: Gilles Chanteperdrix @ 2012-07-12  0:21 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/11/2012 08:10 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Please could I get some background with respect of the possible causes of this message? I have read it might be related to __copy_to/from user but I have already validated that path in the driver. Must be something else.
> Also, I am OK to assume that the system is not reliable after this event.
> thanks in advance

You can try and add a call to show_stack(NULL, NULL) at the place where
this message is printed.

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-12  0:21 ` Gilles Chanteperdrix
@ 2012-07-12  8:16   ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-12  8:31     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-12  8:16 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Thanks Gilles. Yes, unfortunately I can't recompile this kernel, so I was wondering about the system context of this fault within Xenomai.
BTW, how come the stack frame is not printed by default?



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-12  8:16   ` Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-12  8:31     ` Gilles Chanteperdrix
  2012-07-12 21:29       ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 1 reply; 16+ messages in thread
From: Gilles Chanteperdrix @ 2012-07-12  8:31 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/12/2012 10:16 AM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Thanks Gilles. Yes unfortunately I cant recompile this kernel so I
> was wondering about the system context of this fault within Xenomai. 
> BTW how come the stack frame is not printed by default.

Because it may not be safe to use show_stack() from RT context (for
instance, the print_symbol function, which prints symbol names, needs to
take a spinlock when a symbol is defined by a kernel module, and this
spinlock is not rt-safe). Anyway, the error message normally prints the
PC of the error.

Could you give us the exact error message? The version of Xenomai, the
Linux kernel, the version of the I-pipe patch you use?

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-12  8:31     ` Gilles Chanteperdrix
@ 2012-07-12 21:29       ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13  7:27         ` Gilles Chanteperdrix
  2012-07-13  9:34         ` Philippe Gerum
  0 siblings, 2 replies; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-12 21:29 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Hi Gilles

Information and context below.

kernel: 2.6.35.9                                                                                                                                      
xenomai:2.6.0                                                                                                                                         
ipipe:2.8-04                                                                                                                                          
cpu atom z530 1.6HH                                                                                                                                   
                                                                                                                                                      
Error                                                                                                                                                 
>>> [  242.962195] BUG: Unhandled exception over domain Xenomai at 0x892160 - switching to ROOT                                                       
>>> [  242.990052] Pid: 972, comm: InterruptTest Not tainted 2.6.35.9-prot-xeno-atom #4                                                               
>>> [  243.015979] Call Trace:                                                                                                                        
>>> [  243.025121]  [<c041b0b3>] __ipipe_handle_exception+0x203/0x210                                                                                 
>>> [  243.048691]  [<c07ef4fb>] error_code+0x63/0x70                                                                                                 
                                                                                                                                                      
System:                                                                                                                                               
1. Xenomai thread calling ~9000 PCI writes in a loop. Each of these 9000 IOCTLs is processed in the realtime path of the RTDM driver.
2. A library has an interrupt notifier on the same RTDM device.
   The interrupt notifier is a wait queue in the non-realtime path of the RTDM driver. Events are raised from the RTDM interrupt handler to notify the sleeping threads.
   The interrupt notifier is a timed wait every 100ms. When it times out, the Linux application reads (msgrcv) from a queue in a shared memory segment (shmget created).

The problem with the design is obvious, and the issue above was solved by modifying the library.
The interrupt notifier should be sleeping on an rtdm_event_t in the realtime path of the device instead of a Linux wait queue.

However, the other design, even if wrong, shouldn't have caused exceptions.
                                                                                                                                                      
Anyhow, I am as impressed as always with this piece of software. This is the toolkit that any embedded developer needs. And RTDM has been a great addition.

thanks
Jorge
   



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-12 21:29       ` Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-13  7:27         ` Gilles Chanteperdrix
  2012-07-13  9:34         ` Philippe Gerum
  1 sibling, 0 replies; 16+ messages in thread
From: Gilles Chanteperdrix @ 2012-07-13  7:27 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/12/2012 11:29 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
>>>> [  242.962195] BUG: Unhandled exception over domain Xenomai at 0x892160 - switching to ROOT                                                       

Weird message, I cannot seem to find it in the Xenomai sources. Anyway, the
message indicates that you should look at what is at 0x892160.

>  The interrupt notifier is a wait queue in the non-realtime path of
> the RTDM driver. Events are raised from the RTDM interrupt handler to
> notify the sleeping threads.

You mean you are calling something like "wake_up" from a real-time
interrupt? If this is what you are doing, it cannot work, for obvious
reasons.

If you want to do that, you should use an rtdm_nrtsig object. Use
rtdm_nrtsig_pend in the interrupt handler, and call wake_up in the
nrtsig_handler.
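
The deferral described above can be pictured with a small user-space model.
This is only a sketch: it does not use the RTDM or Linux APIs, every
model_ name is hypothetical, and the real rtdm_nrtsig_pend() and wake_up()
calls are replaced by stand-ins. The point it tries to illustrate is that the
real-time interrupt handler only marks work as pending, and the wake-up itself
runs later, once the Linux domain is executing again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: none of these names are Xenomai or Linux APIs. */

struct model_waitq {
	int sleepers;	/* threads currently blocked on the queue */
	int wakeups;	/* wake-ups delivered so far */
};

struct model_nrtsig {
	bool pending;			/* set from "RT" context, cleared from "Linux" */
	void (*handler)(void *arg);	/* runs later, in "Linux" context */
	void *arg;
};

/* Stand-in for wake_up(): only legal from the Linux (root) domain. */
static void model_wake_up(void *arg)
{
	struct model_waitq *q = arg;

	q->wakeups += q->sleepers;
	q->sleepers = 0;
}

/* Stand-in for rtdm_nrtsig_pend(): records that work is pending, nothing more. */
static void model_nrtsig_pend(struct model_nrtsig *sig)
{
	sig->pending = true;
}

/* Real-time IRQ handler: calls no Linux service, it only marks the signal. */
static void model_rt_irq_handler(struct model_nrtsig *sig)
{
	model_nrtsig_pend(sig);
}

/* Called when the Linux domain runs again; returns 1 if deferred work ran. */
static int model_linux_resumes(struct model_nrtsig *sig)
{
	assert(sig->handler != NULL);

	if (!sig->pending)
		return 0;

	sig->pending = false;
	sig->handler(sig->arg);
	return 1;
}
```

With two sleepers queued, calling model_rt_irq_handler() leaves both asleep;
they are woken only when model_linux_resumes() runs the deferred handler.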

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-12 21:29       ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13  7:27         ` Gilles Chanteperdrix
@ 2012-07-13  9:34         ` Philippe Gerum
  2012-07-13 10:41           ` Jorge Ramirez Ortiz,  HCL Europe
  1 sibling, 1 reply; 16+ messages in thread
From: Philippe Gerum @ 2012-07-13  9:34 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/12/2012 11:29 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Hi Gilles
>
> Information and context below.
>
> kernel: 2.6.35.9
> xenomai:2.6.0
> ipipe:2.8-04
> cpu atom z530 1.6HH
>
> Error
>>>> [  242.962195] BUG: Unhandled exception over domain Xenomai at 0x892160 - switching to ROOT
>>>> [  242.990052] Pid: 972, comm: InterruptTest Not tainted 2.6.35.9-prot-xeno-atom #4
>>>> [  243.015979] Call Trace:
>>>> [  243.025121]  [<c041b0b3>] __ipipe_handle_exception+0x203/0x210
>>>> [  243.048691]  [<c07ef4fb>] error_code+0x63/0x70
>
> System:
> 1. Xenomai thread calling ~9000 PCI writes in a loop. Each of these 9000 IOCTLs is processed in the realtime path of the RTDM driver.
> 2. A library has an interrupt notifier on the same RTDM device.
>     The interrupt notifier is a wait queue in the non-realtime path of the RTDM driver. Events are raised from the RTDM interrupt handler to notify the sleeping threads.
>     The interrupt notifier is a timed wait every 100ms. When it times out, the linux application reads (msgrcv) from a queue in a shared memory segment (shmget created).
>
> The problem with the design is obvious and the issue above solved by modifying the library.
> The interrupt notifier should be sleeping in an rtdm_event_t in the realtime path of the device instead of a linux wait queue.
>
> However, the other design, even if wrong, shouldnt have caused exceptions.

Your assumption is wrong: there is no way to guard against a real-time
context re-entering the regular Linux kernel from an unsafe place through
a plain function call, absolutely none. We can only detect this situation
after the fact, to give some debugging hints, hopefully before the complete
crash, by instrumenting code paths which may be spuriously called that way, e.g.:

diff --git a/kernel/sched.c b/kernel/sched.c
index 2d1e23a..2e0ba74 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3819,6 +3819,8 @@ static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
  {
  	wait_queue_t *curr, *next;

+	ipipe_check_context(ipipe_root_domain);
+
  	list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
  		unsigned flags = curr->flags;

At any rate, you should not expect the system to help you get away
with violations of basic dual kernel programming rules from kernel
space, this won't happen.
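
What the instrumented hunk above buys can be modelled in plain user-space C.
This is an assumption-laden sketch, not I-pipe code: model_check_context()
merely stands in for ipipe_check_context(), and all model_ names are
hypothetical. It shows the limitation being pointed out: the check reports
that a Linux-only path was entered from the wrong domain, but the call has
already happened by the time the check fires.

```c
#include <assert.h>
#include <stdio.h>

/* User-space model only: these are not I-pipe or Linux symbols. */

enum model_domain { MODEL_ROOT, MODEL_XENOMAI };

static enum model_domain model_current = MODEL_ROOT;	/* domain now running */
static int model_violations;				/* reports issued so far */

/* Stand-in for ipipe_check_context(): it reports, it does not prevent. */
static void model_check_context(enum model_domain expected, const char *where)
{
	assert(where != NULL);

	if (model_current != expected) {
		model_violations++;
		fprintf(stderr, "context violation in %s\n", where);
	}
}

/* Stand-in for the instrumented __wake_up_common(). */
static void model_wake_up_common(void)
{
	model_check_context(MODEL_ROOT, __func__);
	/* ... the wait-queue walk would follow here, whatever the check said ... */
}
```

Calling model_wake_up_common() while model_current is MODEL_XENOMAI bumps
model_violations and prints a hint, yet the function body still runs: the
instrumentation is a debugging aid, not a safety net.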

>
> Anyhow, I am as impressed as always with this piece of software. This is the toolkit that any embedded developer needs. And RTDM has been a great addition.
>
> thanks
> Jorge
>


-- 
Philippe.




^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13  9:34         ` Philippe Gerum
@ 2012-07-13 10:41           ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13 11:08             ` Philippe Gerum
  0 siblings, 1 reply; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-13 10:41 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

Sure, I agree with the implementation issue. Makes a lot of sense (I think Gilles also posted on this).

But now thinking about your response ('getting away from violation' caught my eye :)), it seems to me that RTDM (or its implementation, or my interpretation!) might be somehow inconsistent.

To the application developer, RTDM provides a unified interface for requests to the driver: the client does not know upfront whether the call will be handled in real-time or non-real-time context. Looking at the RTDM skin as the front door to the real-time software framework, RTDM is telling the client not to worry, "we will handle your request in the right context for you: just send them our way". Which is actually quite nice and provides a lot of data and flexibility to the driver designer.

Similarly, we can look at interrupts as the back door to that same framework; to the driver developer, RTDM provides only _one_ interface (rtdm_request_irq) to that back door, just like it does to the front door.

So, from the _system_ perspective, I don't see any reason why that back door (the interrupt handler) couldn't do the same and handle/delegate the notification of the non-realtime paths that were allowed in via the front door. And I am only talking from an architecture point of view of the framework.

Anyhow, thanks for the details and the patch.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 10:41           ` Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-13 11:08             ` Philippe Gerum
  2012-07-13 11:24               ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 1 reply; 16+ messages in thread
From: Philippe Gerum @ 2012-07-13 11:08 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/13/2012 12:41 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Sure, I agree with the implementation issue. Makes a lot of sense (I
> think Gilles also posted on this)
>
> But now thinking about your response (‘getting away from violation’
> caught my eye J), it seems to me that RTDM (or its implementation, or my
> interpretation!) might be somehow inconsistent.
>
> To the application developer, RTDM provides a unified interface for
> requests to the driver: the client ignores upfront whether the call will
> be handled in real-time or non-real-time context.

No, absolutely not. Of course not. RTDM provides a unified framework for
writing real-time device drivers building on a well-defined API, whose
design helps in porting that code back and forth between native Linux
and dual kernel implementations. Nothing less, nothing more.
Originally, one of the mission statements of RTDM was to stop the
proliferation of ad hoc mechanisms for interfacing user and driver code
in a real-time context. It is certainly not designed to hide the
requirements and constraints that each environment imposes on the
implementation. It could not, anyway.

> Looking at the RTDM
> skin as the front door to the real-time software framework, RTDM is
> telling the client not to worry, /“we will handle your request in the
> right context for you: just send them our way”./ Which is actually quite
> nice and provides a lot of data and flexibility to the driver designer.

Neither the userland client nor the kernel space IRQ handler may ignore
which context the service should run in; this is where you badly
misinterpret the core logic of dual kernel systems, and the dual kernel
incarnation of RTDM in particular.

The fact that RTDM services can switch context automatically in some
cases, when the call is issued from user-space, is by no means an excuse for:

- ignoring that doing so might induce latency for the userland caller.
So the caller should really know what it's doing, including from which
context.
- assuming that kernel code would benefit from the same feature, which
would not be achievable with reasonable means.

Again, RTDM is not meant to hide the target context from the issuing code;
it is meant to make the developer's life easier whenever possible. This
does not mean that the user code is allowed to fire any random call from
any random context, hoping for the best.
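
The automatic switch mentioned above, and its cost, can be sketched as a
user-space model. The assumption here (not stated in this thread) is a
convention whereby a driver supplies separate real-time and non-real-time
handlers, and a handler returns -ENOSYS to ask for the request to be retried
in the other mode. All model_ and demo_ names are hypothetical, not RTDM
symbols. The counter is there to make the latency point visible: every bounce
between handlers is a mode switch paid by the userland caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space model only: these are not RTDM symbols. */

enum model_mode { MODEL_PRIMARY, MODEL_SECONDARY };

struct model_caller {
	enum model_mode mode;	/* mode the calling thread currently runs in */
	int mode_switches;	/* how many times it had to migrate */
};

typedef int (*model_ioctl_fn)(int request);

struct model_driver {
	model_ioctl_fn ioctl_rt;	/* handler for real-time context */
	model_ioctl_fn ioctl_nrt;	/* handler for regular Linux context */
};

/* Try the handler matching the caller's mode; on -ENOSYS, switch and retry. */
static int model_ioctl(struct model_caller *c, const struct model_driver *d,
		       int request)
{
	model_ioctl_fn first, second;
	int ret;

	assert(c != NULL && d != NULL);

	first = c->mode == MODEL_PRIMARY ? d->ioctl_rt : d->ioctl_nrt;
	second = c->mode == MODEL_PRIMARY ? d->ioctl_nrt : d->ioctl_rt;

	ret = first ? first(request) : -ENOSYS;
	if (ret == -ENOSYS && second) {
		c->mode = c->mode == MODEL_PRIMARY ? MODEL_SECONDARY : MODEL_PRIMARY;
		c->mode_switches++;	/* this migration is the hidden latency */
		ret = second(request);
	}
	return ret;
}

#define MODEL_REQ_WRITE_REG	1	/* served in real-time context only */
#define MODEL_REQ_WAIT_IRQ	2	/* served in Linux context only */

static int demo_ioctl_rt(int request)
{
	return request == MODEL_REQ_WRITE_REG ? 0 : -ENOSYS;
}

static int demo_ioctl_nrt(int request)
{
	return request == MODEL_REQ_WAIT_IRQ ? 0 : -ENOSYS;
}
```

In this model, a caller that alternates register writes with waits served by
the non-real-time handler switches mode on every call, which is why the caller
needs to know which context each request ends up in.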

>
> Similarly, we can look at interrupts as the backdoor to that same
> framework; to the driver developer RTDM provides only _/one/_ interface
> (rtdm_request_irq) to that backdoor. Just like it does to the front door.

No, there must be a reason why we have rtdm_request_irq in addition to
request_irq: because the former specifically deals with real-time
interrupts, which can preempt any Linux activity. So, we do know that we
are dealing with a real-time context. Incidentally, this is where your
IRQ handler failed, by calling a wake_up function from the regular
Linux core.

By your logic, we should be able to hook regular Linux IRQs using
rtdm_request_irq, which we can't.

>
> So, from the __system__ perspective, I don’t see any reasons why that
> backdoor - the interrupt handler- couldn’t do the same and
> handle/delegate the notification of the non-realtime paths that were
> allowed in via the front door. And I am only talking from an
> architecture point of view of the framework

You are mentioning a design which is at odds with the basic constraints 
imposed on RTDM and its clients by the dual kernel nature of the system, 
which won't fly.

>
> Anyhow, thanks for the details and the patch.
>
> -----Original Message-----
> From: Philippe Gerum [mailto:rpm@xenomai.org]
> Sent: 13 July 2012 10:34
> To: Jorge Ramirez Ortiz, HCL Europe
> CC: Gilles Chanteperdrix; xenomai@xenomai.org
> Subject: Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
>
> On 07/12/2012 11:29 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
>> Hi Gilles
>>
>> Information and context below.
>>
>> kernel: 2.6.35.9
>> xenomai: 2.6.0
>> ipipe: 2.8-04
>> cpu atom z530 1.6HH
>>
>> Error
>>>>> [242.962195] BUG: Unhandled exception over domain Xenomai at 0x892160 - switching to ROOT
>>>>> [242.990052] Pid: 972, comm: InterruptTest Not tainted 2.6.35.9-prot-xeno-atom #4
>>>>> [243.015979] Call Trace:
>>>>> [243.025121][<c041b0b3>] __ipipe_handle_exception+0x203/0x210
>>>>> [243.048691][<c07ef4fb>] error_code+0x63/0x70
>>
>> System:
>> 1. Xenomai thread calling ~9000 PCI writes in a loop. Each of these 9000 IOCTLs is processed in the realtime path of the RTDM driver.
>> 2. A library has an interrupt notifier on the same RTDM device.
>> The interrupt notifier is a wait queue in the non-realtime path of the RTDM driver. Events are raised from the RTDM interrupt handler to notify the sleeping threads.
>> The interrupt notifier is a timed wait every 100ms. When it times out, the linux application reads (msgrcv) from a queue in a shared memory segment (shmget created).
>>
>> The problem with the design is obvious and the issue above solved by modifying the library.
>> The interrupt notifier should be sleeping in an rtdm_event_t in the realtime path of the device instead of a linux wait queue.
>>
>> However, the other design, even if wrong, shouldnt have caused exceptions.
>
> Your assumption is wrong: you cannot parry a real-time context
> reentering the regular linux kernel from an unsafe place through a plain
> function call, absolutely none. We can only detect this situation after
> the facts to give some debug hints, hopefully before the complete crash,
> by instrumenting code paths which may be spuriously called that way, e.g:
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 2d1e23a..2e0ba74 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -3819,6 +3819,8 @@ static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
>  {
>  	wait_queue_t *curr, *next;
> +	ipipe_check_context(ipipe_root_domain);
> +
>  	list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
>  		unsigned flags = curr->flags;
>
> At any rate, you should not expect the system to help you getting away
> with violations of basic dual kernel programming rules from kernel
> space, this won't happen.
>
>> Anyhow, I am as impressed as always with this piece of software. This is the toolkit that any embedded developer needs. And RTDM has been a great addition.
>>
>> thanks
>> Jorge
>>
>> ________________________________________
>> From: Gilles Chanteperdrix [gilles.chanteperdrix@xenomai.org]
>> Sent: 12 July 2012 09:31
>> To: Jorge Ramirez Ortiz,HCL Europe
>> Cc: xenomai@xenomai.org
>> Subject: Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
>>
>> On 07/12/2012 10:16 AM, Jorge Ramirez Ortiz, HCL Europe wrote:
>>> Thanks Gilles. Yes unfortunately I cant recompile this kernel so I
>>> was wondering about the system context of this fault within Xenomai.
>>> BTW how come the stack frame is not printed by default.
>>
>> Because it may not be safe to use show_stack() from RT context (for
>> instance, the print_symbol function to print symbol names need to take a
>> spinlock when a symbol is defined by a kernel module, and this spinlock
>> is not rt-safe). Anyway, the error message normally prints the PC of the
>> error.
>>
>> Could you give us the exact error messsage? The version of Xenomai, the
>> Linux kernel, the version of the I-pipe patch you use?
>>
>> --
>> Gilles.
>>
> --
> Philippe.


-- 
Philippe.





* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 11:08             ` Philippe Gerum
@ 2012-07-13 11:24               ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13 11:58                 ` Jan Kiszka
  2012-07-13 12:31                 ` Gilles Chanteperdrix
  0 siblings, 2 replies; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-13 11:24 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

Yes it does: the caller does not know upfront whether the call will be handled in realtime or non-realtime context by the driver.
The client (of course!) can/should (it doesn't really matter for the sake of the argument) take adequate measures to make sure it gets into the right path.
But the _interface_ does not guarantee which path it will take. This is a fact that you can't disagree with.

But please allow me to re-frame the discussion: I am not discussing realtime design practices here, or how to use the framework properly.
I am merely commenting on the _interfaces_ to the realtime framework and their consistency.



* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 11:24               ` Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-13 11:58                 ` Jan Kiszka
  2012-07-13 16:27                   ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13 12:31                 ` Gilles Chanteperdrix
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2012-07-13 11:58 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

[please don't top-post]

On 2012-07-13 13:24, Jorge Ramirez Ortiz, HCL Europe wrote:
> Yes it does: the caller ignores upfront whether the call will be handled in realtime or non-realtime context by the driver.
> The client (of course!) can/should (it doesn't really matter for the sake of the argument) take the adequate measures to make sure it will get into the adequate path.
> But the _interface_ does not guarantee which path it will take. This is a fact that you can't disagree with.
> 
> But please allow me to re-frame the discussion: I am not discussing here about realtime design practises or about how to use the framework properly.
> I am merely commenting on  the _interfaces_ to the realtime framework and their consistency.

As far as I understood, you were using interfaces outside of the scope
of the RTDM framework. Sorry, we can't change the Linux kernel to
gracefully handle all types of improper RT designs. We already have
quite some infrastructure to detect such scenarios, and if there are
holes, we are happy for suggestions (bug reports, patches etc.) to plug
them. But, e.g., failing a call like wake_up from wrong contexts is
impractical (there are too many spots to patch). Or what is your
expectation?
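
As a rough illustration of what "detecting" means here, the following user-space C sketch models a context check placed at the entry of a Linux-only service. It is only a model: `current_domain`, `check_context` and `model_wake_up` are hypothetical names, not I-pipe or kernel symbols. The check can report a call made from the wrong domain after the fact; it cannot make that call safe, and every such service would need its own check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two domains of a dual kernel system. */
enum domain { DOMAIN_ROOT, DOMAIN_RT };

static enum domain current_domain = DOMAIN_ROOT;
static int violations;   /* how many wrong-domain calls were detected */

/* Report, rather than prevent, a call made from the wrong domain. */
static void check_context(enum domain expected, const char *where)
{
    assert(where != NULL);
    if (current_domain != expected) {
        violations++;
        fprintf(stderr, "context check failed in %s\n", where);
    }
}

/* A Linux-only service instrumented with the check. */
static void model_wake_up(void)
{
    check_context(DOMAIN_ROOT, "model_wake_up");
    /* ... the real service would touch Linux scheduler state here ... */
}
```

Calling `model_wake_up()` while `current_domain` is `DOMAIN_RT` bumps `violations` and prints a hint, which is about all such instrumentation can offer.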

BTW, if you call wake_up under PREEMPT-RT from a hard IRQ handler, you
will get similar results: at best lockdep will bark at you, at worst
your box locks up hard. Different architecture, similar problem.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 11:24               ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13 11:58                 ` Jan Kiszka
@ 2012-07-13 12:31                 ` Gilles Chanteperdrix
  2012-07-13 16:39                   ` Jorge Ramirez Ortiz,  HCL Europe
  1 sibling, 1 reply; 16+ messages in thread
From: Gilles Chanteperdrix @ 2012-07-13 12:31 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/13/2012 01:24 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Yes it does: the caller ignores upfront whether the call will be
> handled in realtime or non-realtime context by the driver. The client
> (of course!) can/should (it doesn't really matter for the sake of the
> argument) take the adequate measures to make sure it will get into
> the adequate path. But the _interface_ does not guarantee which path
> it will take. This is a fact that you can't disagree with.
> 
> But please allow me to re-frame the discussion: I am not discussing
> here about realtime design practises or about how to use the
> framework properly. I am merely commenting on  the _interfaces_ to
> the realtime framework and their consistency.

Philippe and Jan have answered already, so I am going to make it really
short.

The Linux kernel API has hundreds of services, many of which cannot be
re-entered from an RT context; that is a consequence of the "dual
kernel" model. It would simply be silly to assume that we have put LARTs
in these hundreds of services. Knowing that you cannot call linux
services when in RT context in driver code is simply part of the things
you should know when developing an RTDM driver.

You forgot that, you got bitten, now get over it.

-- 
					    Gilles.



* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 11:58                 ` Jan Kiszka
@ 2012-07-13 16:27                   ` Jorge Ramirez Ortiz,  HCL Europe
  2012-07-13 17:47                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-13 16:27 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Please can we go back and reframe the discussion?

But first I'll reiterate: do ignore the marginal issue I had with the framework (I ported and went over 44,785 lines of ANSI C (kernel/user) and 1995 lines of C++ code in just a few days, this one being a big PCI driver, so I am not really losing sleep over it). Besides, it is not really relevant to the point I am trying to make or the reason why I am posting here.

My point is: yes, absolutely, I would expect the framework to allow wakeup calls to linux services from interrupt context.
Why not? Why is this the wrong expectation to have?
When the client calls the RTDM driver from the wrong context (a non-realtime context in the real-time world), the framework doesn't complain. It handles it and gives the driver the option to choose how it wants to handle it, which makes perfect sense to me.

And anyhow, failing that, I would have expected the real-time driver model to give the user the option to choose the back door context he wants to register to (which is my whole point and what PREEMPT_RT actually does in the linux domain). And I would expect an rtdm interface for that. So I think my point stands with respect to consistency of the interfaces: either give the client an rtdm_request_irq context flag to request different IRQ contexts, or handle wake_up events.

With respect to your comment about PREEMPT_RT I think it actually helps with what I am trying to articulate here. So I think we are looking at it from different angles.

Anyway I really don't have much more to contribute/say on this matter but as a user, I thought it would be right to spend some time sharing my views since I benefited from this project so many times.


* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 12:31                 ` Gilles Chanteperdrix
@ 2012-07-13 16:39                   ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 0 replies; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-13 16:39 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

OK. Bye now.

Jorge





* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 16:27                   ` Jorge Ramirez Ortiz,  HCL Europe
@ 2012-07-13 17:47                     ` Gilles Chanteperdrix
  2012-07-16  9:10                       ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 1 reply; 16+ messages in thread
From: Gilles Chanteperdrix @ 2012-07-13 17:47 UTC (permalink / raw)
  To: Jorge Ramirez Ortiz, HCL Europe; +Cc: xenomai

On 07/13/2012 06:27 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Please can we go back and reframe the discussion?
> 
> But first I'll reiterate: do ignore the marginal issue I had with the
> framework (I ported and went over 44,785 lines of ANSI-C(kernel/user)
> and 1995 lines of C++ code  in just a few days -this one was a big
> pci driver-....I am not really losing my sleep over it) . Besides It
> is not really relevant to the point I am trying to make and the
> reason why I am posting here.
> 
> My point is: yes, absolutely, I would expect the framework to allow
> wakeup calls to linux services from interrupt context. Why not? Why
> is this the wrong expectation to have?

Because waking up a task means interacting with the Linux scheduler, which,
in turn, means interacting with the scheduler data structures. But the
reason why xenomai has low latencies is that its interrupts may
interrupt linux anywhere, including in the middle of a critical section
where the scheduler data structures are in an inconsistent state. So,
you see, it cannot work.


> And anyhow, failing that, I would have expected the real-time driver
> model to give the user the option to choose the back-door context he
> wants to register to (which is my whole point and what PREEMPT_RT
> actually does in the Linux domain). And I would expect an RTDM
> interface for that.

RTDM has an interface for that; it is called rtdm_nrtsig.

-- 
					    Gilles.
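
The deferral idea described above can be sketched as a small user-space
model. This is only an illustration, not the RTDM API: every name in it
(nrt_signal, nrt_signal_pend, linux_run_pending, wake_linux_waiter,
linux_in_critical_section) is hypothetical, and the model is
single-threaded, so it ignores the atomicity a real kernel would need.
The real-time interrupt handler only records that a wakeup is wanted;
the handler that touches Linux state runs later, once Linux is no longer
inside a critical section. In RTDM the corresponding services are
rtdm_nrtsig_init() and rtdm_nrtsig_pend(); their exact signatures depend
on the Xenomai version, so check the doxygen documentation.

```c
/* User-space model of deferring a Linux wakeup out of real-time IRQ
 * context. All names are hypothetical; this is NOT the RTDM API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*nrt_handler_t)(void *arg);

struct nrt_signal {
    nrt_handler_t handler; /* runs later, in (modelled) Linux context */
    void *arg;             /* opaque argument passed to the handler */
    bool pending;          /* set by the real-time side */
};

/* True while (modelled) Linux is inside a scheduler critical section,
 * i.e. while its scheduler data structures may be inconsistent. */
static bool linux_in_critical_section;

static void nrt_signal_init(struct nrt_signal *s, nrt_handler_t h, void *arg)
{
    assert(h != NULL);
    s->handler = h;
    s->arg = arg;
    s->pending = false;
}

/* Called from the real-time interrupt handler. It only records the
 * request and never touches Linux scheduler state, so it is safe even
 * if the interrupt preempted Linux in the middle of a critical section. */
static void nrt_signal_pend(struct nrt_signal *s)
{
    s->pending = true;
}

/* Called when Linux regains control. Runs the deferred handler only if
 * one is pending and Linux is outside its critical section.
 * Returns true if the handler ran. */
static bool linux_run_pending(struct nrt_signal *s)
{
    if (linux_in_critical_section || !s->pending)
        return false;
    s->pending = false;
    s->handler(s->arg);
    return true;
}

/* Example handler: stands in for waking a task on a Linux wait queue. */
static void wake_linux_waiter(void *arg)
{
    int *wakeups = arg;
    (*wakeups)++;
}
```

In this model, pending the signal while linux_in_critical_section is set
has no visible effect until the flag is cleared and linux_run_pending()
is called; only then does wake_linux_waiter() run, exactly once.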



* Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT
  2012-07-13 17:47                     ` Gilles Chanteperdrix
@ 2012-07-16  9:10                       ` Jorge Ramirez Ortiz,  HCL Europe
  0 siblings, 0 replies; 16+ messages in thread
From: Jorge Ramirez Ortiz,  HCL Europe @ 2012-07-16  9:10 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Thanks. I appreciate the time you took to remind me (again) of this interface and of the issues with notifying Linux from the real-time domain.
I should also have read the doxygen docs more carefully (!)
cheers
jorge



-----Original Message-----
From: Gilles Chanteperdrix [mailto:gilles.chanteperdrix@xenomai.org]
Sent: 13 July 2012 18:48
To: Jorge Ramirez Ortiz, HCL Europe
Cc: Jan Kiszka; xenomai@xenomai.org
Subject: Re: [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT

On 07/13/2012 06:27 PM, Jorge Ramirez Ortiz, HCL Europe wrote:
> Please can we go back and reframe the discussion?
> 
> But first I'll reiterate: do ignore the marginal issue I had with the
> framework (I ported and went over 44,785 lines of ANSI C (kernel/user)
> and 1995 lines of C++ code in just a few days; this one was a big
> PCI driver. I am not really losing sleep over it). Besides, it
> is not really relevant to the point I am trying to make and the
> reason why I am posting here.
> 
> My point is: yes, absolutely, I would expect the framework to allow
> wakeup calls to Linux services from interrupt context. Why not? Why
> is this the wrong expectation to have?

Because waking up a task means interacting with the Linux scheduler,
which, in turn, means interacting with the scheduler data structures.
But the reason why Xenomai has low latencies is that its interrupts may
interrupt Linux anywhere, including in the middle of a critical section
where the scheduler data structures are in an inconsistent state. So,
you see, it cannot work.


> And anyhow, failing that, I would have expected the real-time driver
> model to give the user the option to choose the back-door context he
> wants to register to (which is my whole point and what PREEMPT_RT
> actually does in the Linux domain). And I would expect an RTDM
> interface for that.

RTDM has an interface for that; it is called rtdm_nrtsig.

-- 
					    Gilles.




end of thread, other threads:[~2012-07-16  9:10 UTC | newest]

Thread overview: 16+ messages
2012-07-11 18:10 [Xenomai] BUG: Unhandled exception over domain Xenomai - switching to ROOT Jorge Ramirez Ortiz,  HCL Europe
2012-07-12  0:21 ` Gilles Chanteperdrix
2012-07-12  8:16   ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-12  8:31     ` Gilles Chanteperdrix
2012-07-12 21:29       ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-13  7:27         ` Gilles Chanteperdrix
2012-07-13  9:34         ` Philippe Gerum
2012-07-13 10:41           ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-13 11:08             ` Philippe Gerum
2012-07-13 11:24               ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-13 11:58                 ` Jan Kiszka
2012-07-13 16:27                   ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-13 17:47                     ` Gilles Chanteperdrix
2012-07-16  9:10                       ` Jorge Ramirez Ortiz,  HCL Europe
2012-07-13 12:31                 ` Gilles Chanteperdrix
2012-07-13 16:39                   ` Jorge Ramirez Ortiz,  HCL Europe
