From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <46039128.90609@domain.hid>
Date: Fri, 23 Mar 2007 09:34:48 +0100
From: Jan Kiszka
MIME-Version: 1.0
References: <46002EE0.9040406@domain.hid> <460167F8.50703@domain.hid>
 <46017CA7.2080801@domain.hid> <4601958C.90502@domain.hid>
 <4601A6E4.9020908@domain.hid> <46023991.4020301@domain.hid>
 <46036D32.7000603@domain.hid> <46036F22.60709@domain.hid>
In-Reply-To: <46036F22.60709@domain.hid>
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enig5AD6CE1172BFBA96485303ED"
Sender: jan.kiszka@domain.hid
Subject: [Xenomai-core] Re: RT-Socket-CAN bus error rate and latencies
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Oliver Hartkopp
Cc: socketcan-core@domain.hid, xenomai-core

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig5AD6CE1172BFBA96485303ED
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: quoted-printable

Oliver Hartkopp wrote:
> Additionally to the written stuff below (please read that first), I want
> to remark:
>
> - Remember that we are talking about a case that is not a standard
> operation mode but a (temporary) error condition that normally leads to
> a bus-off state and appears only in development and hardware setup phase!
> - I would suggest to use some low resolution timestamp (like jiffies)
> for this, which is very cheap in CPU usage
> - the throttling should be configured as a driver module parameter (e.g.
> bei_thr=0 or bei_thr=200) due to the need of the global use-case. If you
> are writing a CAN analysis tool you might want to set bei_thr=0; in other
> cases a default of 200ms might be the right thing.

We are falling back to #1, i.e. to where we already are now. Your
suggestion doesn't help us to provide a generic RT-stack for Xenomai.
>
> Regards,
> Oliver
>
>
>
> Oliver Hartkopp wrote:
>> Wolfgang Grandegger wrote:
>>> Jan Kiszka wrote:
>>>> Wolfgang Grandegger wrote:
>>>>> Oliver Hartkopp wrote:
>>>>>
>>>>>> I would tend to reduce the notifications to the user by creating a
>>>>>> timer at the first bus error interrupt. The first BE irq would
>>>>>> lead to a CAN_ERR_BUSERROR and after a (configurable) time
>>>>>> (e.g. 250ms) the next information about bus errors is allowed to be
>>>>>> passed to the user. After this time period is over a new
>>>>>> CAN_ERR_BUSERROR may be passed to the user containing the count of
>>>>>> occurred bus errors somewhere in the data[]-section of the Error
>>>>>> Frame. When a normal RX/TX-interrupt indicates a 'working' CAN
>>>>>> again, the timer would be terminated.
>>>>>>
>>>>>> Instead of a fixed, configurable time we could also think about a
>>>>>> dynamic behaviour (e.g. with increasing periods).
>>>>>>
>>>>>> What do you think about this?
>>>>> The question is if one bus-error does provide enough information on
>>>>> the cause of the electrical problem or if a sequence is better.
>>>>> Furthermore, I personally regard the use of timers as too heavy. But
>>>>> the solution is feasible, of course. Any other opinions?
>>>>>
>>>>
>>>> I think Oliver's suggestion points in the right direction. But instead
>>>> of only coding a timer into the stack, I still vote for closing the
>>>> loop over the application:
>>>>
>>>> After the first error in a potential series, the related error frame is
>>>> queued, listeners are woken up, and BEI is disabled for now. Once some
>>>> listener has read the error frame *and* decided to call into the stack
>>>> for further bus errors, BEI is enabled again.
>>>>
>>>> That way the application decides on the error-related IRQ rate and
>>>> can easily throttle it by delaying the next receive call.
>>>> Moreover, threads of higher priority will be delayed at worst by one
>>>> error IRQ. This mechanism just needs some words in the documentation
>>>> ("Be warned: error frames may overwhelm you. Throttle your
>>>> reception!"), but no further user-visible config options.
>>>
>>> I understand, BEI interrupts get (re-)enabled in recvmsg() if the
>>> socket wants to receive bus errors. There can be multiple readers,
>>> but that's not a problem. Just some overhead in this function. This
>>> would also simplify the implementation as my previous one with
>>> "on-demand" bus error would be obsolete. I am starting to like this
>>> solution.
>>
>> Hm - to reenable the BEI on user interaction would be a nice thing BUT I
>> can see several problems:
>>
>> 1. In socketcan you have receive queues into the userspace with a
>> length >1

Can you explain to me what the problem behind this is? I don't see it yet.

>>
>> 2. How can we handle multiple subscribers (A reads three error frames
>> and therefore reenables the BEI, B reads nothing in this time). Please
>> remember: To have multiple applications is a vital idea of socketcan.

Same here, I don't see the issue. A and B will both find the first error
frame in their queues/ring buffers/whatever. If A has higher priority
(or gets an earlier timeslice), it may already re-enable BEI before B
was able to run as well. But that's an application-specific scheduling
issue and not a problem of the CAN stack (often it is precisely what you
want when assigning priorities...).

>>
>> 3. The count of occurred BEIs gets lost (maybe this is unimportant)

Agreed, but I also don't consider this problematic.

>>
>> ----
>>
>> Regarding (2) the solution could be not to reenable the BEI for a device
>> until every subscriber has read his error frame. But this collides with
>> a raw-socket that's bound to 'any' device (ifindex = 0).
That can cause prio-inversion: a low-prio BEI-reader decides when a
high-prio one gets the next message. No-go for RT.

>>
>> Regarding (3) we could count the BEIs (which would not reduce the
>> interrupt load) or we just stop the BEI after the first occurrence which
>> might possibly not be enough for some people to implement the CAN
>> academically correct.
>>
>> As you may see here a tight coupling of the problems on the CAN bus with
>> the application(s!) is very tricky or even impossible in socketcan.
>> Regarding other network devices (like ethernet devices) the notification
>> about Layer 1/2 problems is unusual. The concept of creating error
>> frames was a good compromise for this reason.
>>
>> As I also would like to avoid creating a timer for "bus error
>> throttling", I got a new idea:
>>
>> - on the first BEI: create an error frame, set a counter to zero and
>>   save the current timestamp
>> - on the next BEI:
>>   - increment the counter
>>   - check if the time is up for the next error frame (e.g. after 200ms -
>>     configurable?)
>>   - if so: Send the next error frame (including the number of occurred
>>     error frames in this 200ms)
>>
>> BEI means ONLY to have a BEI (and no other error).
>>
>> Of course this does NOT reduce the interrupt load but all this
>> throttling is performed inside the interrupt context. This should not be
>> that much of a problem, or is it? And we do not need a timer ...
>>
>> Any comments on this idea?
>>
>> Regards,
>> Oliver
>>

Well, I may be overlooking some pitfalls of my suggestion, so please help
me to understand your concerns.
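
To make the two alternatives concrete, here is a minimal user-space
sketch in C of both schemes discussed above: the timestamp throttle that
runs inside the interrupt handler, and the variant where the receive
path re-arms BEI. All identifiers (bei_throttle, bei_gate, ...) are
hypothetical and not part of RT-Socket-CAN or socketcan. Time is passed
in as plain milliseconds instead of jiffies or a Xenomai clock, and the
receive path is modelled as a non-blocking poll. Resetting counter and
timestamp after each error frame is an assumption, as the proposal does
not spell that out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical, for illustration only. */

/* --- Scheme A: timestamp throttle inside the interrupt handler --- */
struct bei_throttle {
    bool     in_series;     /* false until the first bus error arrives */
    uint64_t last_frame_ms; /* when the last error frame was queued */
    uint32_t count;         /* bus errors since the last error frame */
    uint32_t period_ms;     /* e.g. 200; 0 reports every bus error */
};

/* Runs on EVERY bus-error interrupt. Returns true when an error frame
 * should be queued; *reported then holds the number of bus errors
 * counted since the previous frame (0 for the first frame). */
static bool bei_throttle_irq(struct bei_throttle *t, uint64_t now_ms,
                             uint32_t *reported)
{
    assert(t != NULL && reported != NULL);
    if (!t->in_series) {            /* first BEI of a series */
        t->in_series = true;
        t->last_frame_ms = now_ms;
        t->count = 0;
        *reported = 0;
        return true;
    }
    t->count++;
    if (now_ms - t->last_frame_ms < t->period_ms)
        return false;               /* IRQ was taken, frame suppressed */
    *reported = t->count;
    t->count = 0;                   /* assumption: restart the window */
    t->last_frame_ms = now_ms;
    return true;
}

/* --- Scheme B: receive path re-arms the bus-error interrupt --- */
struct bei_gate {
    bool irq_enabled;   /* models the controller's BEI enable bit */
    bool frame_pending; /* a single queued error frame, for brevity */
};

/* A bus error happens on the wire. Returns true only if it raised an
 * interrupt; while BEI is masked it costs no CPU time at all. */
static bool bei_gate_bus_error(struct bei_gate *g)
{
    assert(g != NULL);
    if (!g->irq_enabled)
        return false;
    g->frame_pending = true;        /* queue frame, wake listeners */
    g->irq_enabled = false;         /* mask BEI until a reader re-arms */
    return true;
}

/* Non-blocking receive of a socket subscribed to bus errors. Delivers a
 * pending frame if there is one; otherwise the caller is asking for
 * further bus errors, so BEI is enabled again. */
static bool bei_gate_recv(struct bei_gate *g)
{
    assert(g != NULL);
    if (g->frame_pending) {
        g->frame_pending = false;
        return true;                /* BEI stays masked for now */
    }
    g->irq_enabled = true;
    return false;
}
```

With one bus error per millisecond over one second and period_ms = 200,
scheme A queues five error frames but still runs the handler 1000 times.
In scheme B at most one interrupt is taken between two re-arming receive
calls, so the reader's receive rate bounds the IRQ rate.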
Jan


--------------enig5AD6CE1172BFBA96485303ED
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGA5EoniDOoMHTA+kRAmGjAJ9EIH18FtUHEJqcBCQSw0dOi5OsTgCfSk90
aZBDgDGFs5di7FBcwfoGeoY=
=eZ/5
-----END PGP SIGNATURE-----

--------------enig5AD6CE1172BFBA96485303ED--