Oliver Hartkopp wrote:
> Additionally to the written stuff below (please read that first), i want
> to remark:
>
> - Remember that we are talking about a case that is not a standard
> operation mode but a (temporary) error condition that normally leads to
> a bus-off state and appears only in development and hardware setup phase!
> - i would suggest to use some low resolution timestamp (like jiffies)
> for this, which is very cheap in CPU usage
> - the throttling should be configured as a driver module parameter (e.g.
> bei_thr=0 or bei_thr=200) due to the need of the global use-case. If you
> are writing a CAN analysis tool you might want to set bei_thr=0; in other
> cases a default of 200ms might be the right thing.

We are falling back to #1, i.e. where we are now already. Your
suggestion doesn't help us to provide a generic RT-stack for Xenomai.

>
> Regards,
> Oliver
>
>
>
> Oliver Hartkopp wrote:
>> Wolfgang Grandegger wrote:
>>> Jan Kiszka wrote:
>>>> Wolfgang Grandegger wrote:
>>>>> Oliver Hartkopp wrote:
>>>>>
>>>>>> I would tend to reduce the notifications to the user by creating a
>>>>>> timer at the first bus error interrupt. The first BE irq would
>>>>>> lead to a CAN_ERR_BUSERROR and after a (configurable) time
>>>>>> (e.g. 250ms) the next information about bus errors is allowed to be
>>>>>> passed to the user. After this time period is over a new
>>>>>> CAN_ERR_BUSERROR may be passed to the user containing the count of
>>>>>> occurred bus errors somewhere in the data[]-section of the Error
>>>>>> Frame. When a normal RX/TX-interrupt indicates a 'working' CAN
>>>>>> again, the timer would be terminated.
>>>>>>
>>>>>> Instead of a fixed configurable time we could also think about a
>>>>>> dynamic behaviour (e.g. with increasing periods).
>>>>>>
>>>>>> What do you think about this?
>>>>> The question is if one bus-error does provide enough information on
>>>>> the cause of the electrical problem or if a sequence is better.
>>>>> Furthermore, I personally regard the use of timers as too heavy. But
>>>>> the solution is feasible, of course. Any other opinions?
>>>>>
>>>>
>>>> I think Oliver's suggestion points in the right direction. But instead
>>>> of only coding a timer into the stack, I still vote for closing the
>>>> loop over the application:
>>>>
>>>> After the first error in a potential series, the related error frame is
>>>> queued, listeners are woken up, and BEI is disabled for now. Once some
>>>> listener read the error frame *and* decided to call into the stack for
>>>> further bus errors, BEI is enabled again.
>>>>
>>>> That way the application decides about the error-related IRQ rate and
>>>> can easily throttle it by delaying the next receive call. Moreover,
>>>> threads of higher priority will be delayed at worst by one error IRQ.
>>>> This mechanism just needs some words in the documentation ("Be warned:
>>>> error frames may overwhelm you. Throttle your reception!"), but no
>>>> further user-visible config options.
>>>
>>> I understand, BEI interrupts get (re-)enabled in recvmsg() if the
>>> socket wants to receive bus errors. There can be multiple readers,
>>> but that's not a problem. Just some overhead in this function. This
>>> would also simplify the implementation as my previous one with
>>> "on-demand" bus error would be obsolete. I start to like this solution.
>>
>> Hm - to reenable the BEI on user interaction would be a nice thing BUT i
>> can see several problems:
>>
>> 1. In socketcan you have receive queues into the userspace with a
>> length >1

Can you explain to me what the problem behind this is? I don't see it yet.

>>
>> 2. How can we handle multiple subscribers (A reads three error frames
>> and reenables therefore the BEI, B reads nothing in this time). Please
>> remember: To have multiple applications is a vital idea from socketcan.

Same here, I don't see the issue.
A and B will both find the first error frame in their
queues/ring buffers/whatever. If A has higher priority (or gets an
earlier timeslice), it may already re-enable BEI before B was able to
run as well. But that's an application-specific scheduling issue and not
a problem of the CAN stack (often it is precisely what you want when
assigning priorities...).

>>
>> 3. The count of occurred BEIs gets lost (maybe this is unimportant)

Agreed, but I also don't consider this problematic.

>>
>> ----
>>
>> Regarding (2) the solution could be not to reenable the BEI for a device
>> until every subscriber has read his error frame. But this collides with
>> a raw-socket that's bound to 'any' device (ifindex = 0).

That can cause prio-inversion: a low-prio BEI-reader decides when a
high-prio one gets the next message. No-go for RT.

>>
>> Regarding (3) we could count the BEIs (which would not reduce the
>> interrupt load) or we just stop the BEI after the first occurrence which
>> might possibly not be enough for some people to implement the CAN
>> academically correct.
>>
>> As you may see here a tight coupling of the problems on the CAN bus with
>> the application(s!) is very tricky or even impossible in socketcan.
>> Regarding other network devices (like ethernet devices) the notification
>> about Layer 1/2 problems is unusual. The concept of creating error
>> frames was a good compromise for this reason.
>>
>> As i also would like to avoid to create a timer for "bus error
>> throttling", i got a new idea:
>>
>> - on the first BEI: create an error frame, set a counter to zero and
>>   save the current timestamp
>> - on the next BEI:
>>   - increment the counter
>>   - check if the time is up for the next error frame (e.g. after 200ms -
>>     configurable?)
>>   - if so: Send the next error frame (including the number of occurred
>>     error frames in this 200ms)
>>
>> BEI means ONLY to have a BEI (and no other error).
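
For reference, the steps listed above can be sketched roughly as follows.
This is only a user-space illustration in C, not driver code. The names
(struct bei_throttle, bei_throttle_irq, bei_throttle_reset) are made up for
this sketch, the millisecond argument stands in for a cheap timestamp such
as jiffies, and resetting the counter after each error frame is my
assumption, since the list does not say so explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device throttle state; not taken from any real driver. */
struct bei_throttle {
    bool     active;     /* a bus-error series is in progress */
    uint32_t count;      /* BEIs counted since the last error frame */
    uint64_t last_ms;    /* timestamp of the last error frame */
    uint64_t period_ms;  /* minimum gap between error frames; 0 = no throttling */
};

/*
 * Call once per bus-error interrupt. Returns true if an error frame
 * should be sent now. In that case *reported holds the number of BEIs
 * counted since the previous error frame (0 for the first frame of a
 * series). *reported is left untouched when false is returned.
 */
static bool bei_throttle_irq(struct bei_throttle *t, uint64_t now_ms,
                             uint32_t *reported)
{
    if (!t->active) {
        /* First BEI: send a frame, zero the counter, save the timestamp. */
        t->active = true;
        t->count = 0;
        t->last_ms = now_ms;
        *reported = 0;
        return true;
    }

    /* Any further BEI: count it, then check whether the period is over. */
    t->count++;
    if (now_ms - t->last_ms >= t->period_ms) {
        *reported = t->count;
        t->count = 0;          /* assumption: restart counting per frame */
        t->last_ms = now_ms;
        return true;
    }
    return false;
}

/* Assumed hook: a normal RX/TX interrupt ends the current error series. */
static void bei_throttle_reset(struct bei_throttle *t)
{
    t->active = false;
    t->count = 0;
}
```

With period_ms = 200, a BEI at t=1000 produces a frame, BEIs at t=1050 and
t=1199 are only counted, and the BEI at t=1200 produces the next frame with
a count of 3. Note that every BEI still enters the handler, so the IRQ load
itself is unchanged.
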
>>
>> Of course this does NOT reduce the interrupt load but all this
>> throttling is performed inside the interrupt context. This should not be
>> that problem, or is it? And we do not need a timer ...
>>
>> Any comments to this idea?
>>
>> Regards,
>> Oliver
>>

Well, I may overlook some pitfalls of my suggestion, so please help me
to understand your concerns.

Jan
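
To make the closed-loop proposal easier to discuss, here is a minimal
user-space model of it in C. It is a sketch under stated assumptions, not
RT-Socket-CAN or Socket-CAN code: struct can_dev_sim, struct listener,
sim_bei_irq and sim_recv are hypothetical names, queues are reduced to a
counter, and the receive call returns immediately instead of blocking.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_LISTENERS 4  /* illustrative limit for this model only */

/* Hypothetical listener (socket) state. */
struct listener {
    bool     wants_bus_errors;  /* subscribed to bus-error frames */
    unsigned pending;           /* queued error frames, reduced to a count */
};

/* Hypothetical device state. */
struct can_dev_sim {
    bool             bei_enabled;  /* models the controller's BEI enable bit */
    struct listener *listeners[MAX_LISTENERS];
    size_t           n_listeners;
};

/*
 * Bus-error interrupt: queue one error frame for every interested
 * listener, then disable BEI until some listener asks for more.
 */
static void sim_bei_irq(struct can_dev_sim *dev)
{
    if (!dev->bei_enabled)
        return;  /* while disabled, the hardware would raise no IRQ */

    for (size_t i = 0; i < dev->n_listeners; i++)
        if (dev->listeners[i]->wants_bus_errors)
            dev->listeners[i]->pending++;

    dev->bei_enabled = false;
}

/*
 * Receive call: hand out one queued frame if there is one, and re-enable
 * BEI if this listener wants bus errors. Returns true if a frame was
 * delivered. A real receive call would block instead of returning false.
 */
static bool sim_recv(struct can_dev_sim *dev, struct listener *l)
{
    bool got = l->pending > 0;

    if (got)
        l->pending--;
    if (l->wants_bus_errors)
        dev->bei_enabled = true;
    return got;
}
```

In this model, if listener A calls sim_recv() before listener B runs, BEI is
re-enabled and the next error frame simply lands in both queues; B finds two
frames when it eventually reads. The error IRQ rate is bounded by how fast
the quickest interested reader calls back into the stack.
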