From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <46039128.90609@domain.hid>
Date: Fri, 23 Mar 2007 09:34:48 +0100
From: Jan Kiszka <jan.kiszka@domain.hid>
MIME-Version: 1.0
References: <46002EE0.9040406@domain.hid>
	<460167F8.50703@domain.hid>	<46017CA7.2080801@domain.hid>
	<4601958C.90502@domain.hid> <4601A6E4.9020908@domain.hid>
	<46023991.4020301@domain.hid>
	<46036D32.7000603@domain.hid> <46036F22.60709@domain.hid>
In-Reply-To: <46036F22.60709@domain.hid>
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig5AD6CE1172BFBA96485303ED"
Sender: jan.kiszka@domain.hid
Subject: [Xenomai-core] Re: RT-Socket-CAN bus error rate and latencies
List-Id: "Xenomai life and development \(bug reports, patches,
	discussions\)" <xenomai.xenomai.org>
List-Unsubscribe: <https://mail.gna.org/listinfo/xenomai-core>,
	<mailto:xenomai-core-request@domain.hid>
List-Archive: </public/xenomai-core>
List-Post: <mailto:xenomai@xenomai.org>
List-Help: <mailto:xenomai-core-request@domain.hid>
List-Subscribe: <https://mail.gna.org/listinfo/xenomai-core>,
	<mailto:xenomai-core-request@domain.hid>
To: Oliver Hartkopp <socketcan@domain.hid>
Cc: socketcan-core@domain.hid, xenomai-core <xenomai@xenomai.org>

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig5AD6CE1172BFBA96485303ED
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: quoted-printable

Oliver Hartkopp wrote:
> In addition to what is written below (please read that first), I want
> to remark:
>
> - Remember that we are talking about a case that is not a standard
> operation mode but a (temporary) error condition that normally leads to
> a bus-off state and appears only in development and hardware setup phases!
> - I would suggest using a low-resolution timestamp (like jiffies)
> for this, which is very cheap in CPU usage.
> - The throttling should be configured as a driver module parameter (e.g.
> bei_thr=0 or bei_thr=200), since the right setting depends on the overall
> use case. If you are writing a CAN analysis tool, you might want to set
> bei_thr=0; in other cases a default of 200 ms might be the right thing.

We are falling back to #1, i.e. where we are now already. Your
suggestion doesn't help us to provide a generic RT-stack for Xenomai.

>
> Regards,
> Oliver
>
>
>
> Oliver Hartkopp wrote:
>> Wolfgang Grandegger wrote:
>>> Jan Kiszka wrote:
>>>> Wolfgang Grandegger wrote:
>>>>> Oliver Hartkopp wrote:
>>>>>
>>>>>> I would tend to reduce the notifications to the user by creating a
>>>>>> timer at the first bus error interrupt. The first BE irq would
>>>>>> lead to a CAN_ERR_BUSERROR and after a (configurable) time
>>>>>> (e.g. 250ms) the next information about bus errors is allowed to be
>>>>>> passed to the user. After this time period is over a new
>>>>>> CAN_ERR_BUSERROR may be passed to the user containing the count of
>>>>>> occurred bus errors somewhere in the data[]-section of the Error
>>>>>> Frame. When a normal RX/TX-interrupt indicates a 'working' CAN
>>>>>> again, the timer would be terminated.
>>>>>>
>>>>>> Instead of a fixed configurable time we could also think about a
>>>>>> dynamic behaviour (e.g. with increasing periods).
>>>>>>
>>>>>> What do you think about this?
>>>>> The question is if one bus-error does provide enough information on
>>>>> the cause of the electrical problem or if a sequence is better.
>>>>> Furthermore, I personally regard the use of timers as too heavy. But
>>>>> the solution is feasible, of course. Any other opinions?
>>>>>
>>>>
>>>> I think Oliver's suggestion points in the right direction. But instead
>>>> of only coding a timer into the stack, I still vote for closing the
>>>> loop over the application:
>>>>
>>>> After the first error in a potential series, the related error frame is
>>>> queued, listeners are woken up, and BEI is disabled for now. Once some
>>>> listener has read the error frame *and* decided to call into the stack
>>>> for further bus errors, BEI is enabled again.
>>>>
>>>> That way the application decides about the error-related IRQ rate and
>>>> can easily throttle it by delaying the next receive call. Moreover,
>>>> threads of higher priority will be delayed at worst by one error IRQ.
>>>> This mechanism just needs some words in the documentation ("Be warned:
>>>> error frames may overwhelm you. Throttle your reception!"), but no
>>>> further user-visible config options.
>>>
>>> I understand, BEI interrupts get (re-)enabled in recvmsg() if the
>>> socket wants to receive bus errors. There can be multiple readers,
>>> but that's not a problem. Just some overhead in this function. This
>>> would also simplify the implementation as my previous one with
>>> "on-demand" bus error would be obsolete. I start to like this solution.
>>
>> Hm - to reenable the BEI on user interaction would be a nice thing BUT I
>> can see several problems:
>>
>> 1. In socketcan you have receive queues into the userspace with a
>> length >1

Can you explain to me what the problem behind this is? I don't see it yet.

>>
>> 2. How can we handle multiple subscribers (A reads three error frames
>> and reenables therefore the BEI, B reads nothing in this time). Please
>> remember: To have multiple applications is a vital idea of socketcan.

Same here, I don't see the issue. A and B will both find the first error
frame in their queues/ring buffers/whatever. If A has higher priority
(or gets an earlier timeslice), it may already re-enable BEI before B
was able to run as well. But that's an application-specific scheduling
issue and not a problem of the CAN stack (often it is precisely what you
want when assigning priorities...).
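
To make the multi-reader case concrete, here is a minimal user-space model of the scheme. It is only a sketch: struct can_dev, struct reader, bus_error_irq() and reader_recv() are hypothetical names made up for this mail, not taken from the RT-Socket-CAN sources, and I assume that BEI is re-armed when a subscribed reader calls in while its own queue is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_READERS 4

/* Hypothetical per-socket state. */
struct reader {
    bool wants_bus_errors;   /* socket subscribed to bus error frames */
    unsigned pending;        /* error frames queued for this reader */
};

/* Hypothetical per-device state. */
struct can_dev {
    bool bei_enabled;        /* models the controller's BEI enable bit */
    unsigned irqs_handled;   /* bus error IRQs actually serviced */
    struct reader readers[MAX_READERS];
    size_t nreaders;
};

/* A bus error occurs on the wire. Returns true if an IRQ was taken. */
static bool bus_error_irq(struct can_dev *dev)
{
    if (!dev->bei_enabled)
        return false;        /* masked: no IRQ load at all */

    dev->irqs_handled++;
    for (size_t i = 0; i < dev->nreaders; i++)
        if (dev->readers[i].wants_bus_errors)
            dev->readers[i].pending++;   /* queue frame, wake listener */

    dev->bei_enabled = false;            /* quiet until a reader asks again */
    return true;
}

/*
 * Models the receive call of reader idx. Returns true if a queued error
 * frame was consumed. If the queue is empty, the reader is asking for
 * further bus errors, so BEI is re-armed (assumption, see above).
 */
static bool reader_recv(struct can_dev *dev, size_t idx)
{
    assert(idx < dev->nreaders);
    struct reader *r = &dev->readers[idx];

    if (r->pending > 0) {
        r->pending--;
        return true;
    }
    if (r->wants_bus_errors)
        dev->bei_enabled = true;
    return false;            /* a real stack would block here */
}
```

With readers A and B subscribed, a burst of bus errors costs one IRQ and leaves one frame in each queue. If A consumes its frame and calls in again before B has run, BEI is re-armed and the next error lands in both queues, so B simply finds two frames later.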

>>
>> 3. The count of occurred BEIs gets lost (maybe this is unimportant)

Agreed, but I also don't consider this problematic.

>>
>> ----
>>
>> Regarding (2) the solution could be not to reenable the BEI for a device
>> until every subscriber has read his error frame. But this collides with
>> a raw-socket that's bound to 'any' device (ifindex = 0).

That can cause prio-inversion: a low-prio BEI-reader decides when
a high-prio one gets the next message. No-go for RT.

>>
>> Regarding (3) we could count the BEIs (which would not reduce the
>> interrupt load) or we just stop the BEI after the first occurrence, which
>> might not be enough for some people to implement CAN in the
>> academically correct way.
>>
>> As you may see here a tight coupling of the problems on the CAN bus with
>> the application(s!) is very tricky or even impossible in socketcan.
>> Regarding other network devices (like ethernet devices) the notification
>> about Layer 1/2 problems is unusual. The concept of creating error
>> frames was a good compromise for this reason.
>>
>> As i also would like to avoid to create a timer for "bus error
>> throttling", i got a new idea:
>>
>> - on the first BEI: create an error frame, set a counter to zero and
>> save the current timestamp
>> - on the next BEI:
>>  - increment the counter
>>  - check if the time is up for the next error frame (e.g. after 200ms -
>> configurable?)
>>  - if so: Send the next error frame (including the number of occurred
>> error frames in this 200ms)
>>
>> BEI means ONLY to have a BEI (and no other error).
>>
>> Of course this does NOT reduce the interrupt load but all this
>> throttling is performed inside the interrupt context. This should not be
>> that much of a problem, or is it? And we do not need a timer ...
>>
>> Any comments to this idea?
>>
>> Regards,
>> Oliver
>>
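
For comparison, the counter/timestamp throttling quoted above could look roughly like this. Again, it is only a sketch under my own assumptions: struct bei_throttle and both functions are hypothetical names, `now` stands for any cheap low-resolution tick such as jiffies, the first frame of a series reports a count of 1, and the reset on a successful RX/TX is borrowed from the earlier timer proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device throttling state. */
struct bei_throttle {
    uint32_t period;       /* min ticks between error frames, 0 = no throttling */
    uint32_t last_report;  /* tick at which the last error frame was sent */
    uint32_t count;        /* bus errors seen since the last error frame */
    bool active;           /* false until the first BEI of a series */
};

/*
 * Called from the BEI handler. Returns true if an error frame should be
 * sent now; *reported then holds the number of bus errors it stands for.
 * Every BEI still costs an interrupt, only the frame rate is limited.
 */
static bool bei_throttle_irq(struct bei_throttle *t, uint32_t now,
                             uint32_t *reported)
{
    assert(t && reported);

    if (!t->active) {            /* first BEI: report immediately */
        t->active = true;
        t->count = 0;
        t->last_report = now;
        *reported = 1;           /* assumption: count this first error */
        return true;
    }

    t->count++;
    /* Unsigned subtraction keeps the check valid across tick wrap-around. */
    if ((uint32_t)(now - t->last_report) < t->period)
        return false;            /* time not up yet: only count */

    *reported = t->count;
    t->count = 0;
    t->last_report = now;
    return true;
}

/* Assumption: a normal RX/TX interrupt ends the current error series. */
static void bei_throttle_reset(struct bei_throttle *t)
{
    t->active = false;
    t->count = 0;
}
```

With period = 200 and errors at ticks 1000, 1050, 1100, 1199 and 1200, this sends two frames: one at 1000 reporting 1 error and one at 1200 reporting 4. With period = 0, every BEI produces a frame, which corresponds to the bei_thr=0 case for analysis tools.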

Well, I may be overlooking some pitfalls of my suggestion, so please help
me to understand your concerns.

Jan


--------------enig5AD6CE1172BFBA96485303ED
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGA5EoniDOoMHTA+kRAmGjAJ9EIH18FtUHEJqcBCQSw0dOi5OsTgCfSk90
aZBDgDGFs5di7FBcwfoGeoY=
=eZ/5
-----END PGP SIGNATURE-----

--------------enig5AD6CE1172BFBA96485303ED--