All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai-core] RT-CAN tx_sem
@ 2007-02-14 18:30 Jan Kiszka
  2007-02-15  8:37 ` [Xenomai-core] " Wolfgang Grandegger
  2007-02-15 13:25 ` Wolfgang Grandegger
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Kiszka @ 2007-02-14 18:30 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 713 bytes --]

Hi Wolfgang,

fiddling with the CAN utils, I noticed the behaviour that rtcansend on
freshly registered but stopped devices simply blocks. And when you
meanwhile call rtcanconfig up on that device, rtcansend will continue to
block.

The reason is that during startup of CAN devices the tx-semaphore is
re-initialised while it is already set up during device registration.
Re-init on an already initialised rtdm_sem is, say, undefined.

So rtdm_sem should better only be initialised/destroyed by the drivers.
Trying to send on a shut down CAN device could be catched differently,
e.g. via the device state. This likely needs more thoughts, take it as a
note that $something should be done.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-14 18:30 [Xenomai-core] RT-CAN tx_sem Jan Kiszka
@ 2007-02-15  8:37 ` Wolfgang Grandegger
  2007-02-15 13:25 ` Wolfgang Grandegger
  1 sibling, 0 replies; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-15  8:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Hi Wolfgang,
> 
> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
> freshly registered but stopped devices simply blocks. And when you
> meanwhile call rtcanconfig up on that device, rtcansend will continue to
> block.

This is a bug, the send function should return an error in that case.

> The reason is that during startup of CAN devices the tx-semaphore is
> re-initialised while it is already set up during device registration.
> Re-init on an already initialised rtdm_sem is, say, undefined.
> 
> So rtdm_sem should better only be initialised/destroyed by the drivers.
> Trying to send on a shut down CAN device could be catched differently,
> e.g. via the device state. This likely needs more thoughts, take it as a
> note that $something should be done.

I'm going to check this later today or tomorrow.

Wolfgang.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-14 18:30 [Xenomai-core] RT-CAN tx_sem Jan Kiszka
  2007-02-15  8:37 ` [Xenomai-core] " Wolfgang Grandegger
@ 2007-02-15 13:25 ` Wolfgang Grandegger
  2007-02-15 13:36   ` Jan Kiszka
  1 sibling, 1 reply; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-15 13:25 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]

Hi Jan,

Jan Kiszka wrote:
> Hi Wolfgang,
> 
> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
> freshly registered but stopped devices simply blocks. And when you
> meanwhile call rtcanconfig up on that device, rtcansend will continue to
> block.

I see the expected behavior on stopped devices:

   bash-2.05b# rtcansend rtcan0
   rt_dev_send: Network is down

This is, because the tx_sem is marked "destroyed" already when the 
device gets initialized. I wonder why your app blocks.

> The reason is that during startup of CAN devices the tx-semaphore is
> re-initialised while it is already set up during device registration.
> Re-init on an already initialised rtdm_sem is, say, undefined.

That makes definitely trouble. A CAN_MODE_START should only start the 
device if it's not active. The attached patch fixes this. Another 
possibility would be to force a CAN_MODE_STOP in case the device is 
already active (=restart).

> So rtdm_sem should better only be initialised/destroyed by the drivers.
> Trying to send on a shut down CAN device could be catched differently,
> e.g. via the device state. This likely needs more thoughts, take it as a
> note that $something should be done.

With the fix above I do not see further problems with the existing 
implementation using the "destroyed" state to mark the device unavailable.

Wolfgang.

[-- Attachment #2: xenomai-rtcan-txsem.patch --]
[-- Type: text/x-patch, Size: 556 bytes --]

+ diff -u xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_raw_dev.c
--- xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM	2007-02-15 11:21:43.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_raw_dev.c	2007-02-15 14:16:16.000000000 +0100
@@ -193,7 +193,8 @@
     switch (request) {
     case SIOCSCANMODE:
 	mode = (can_mode_t *)&ifr->ifr_ifru;
-	if (dev->do_set_mode)
+	if (dev->do_set_mode &&
+	    !(*mode == CAN_MODE_START && CAN_STATE_OPERATING(dev->state)))
 	    ret = dev->do_set_mode(dev, *mode, &lock_ctx);
 	break;
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 13:25 ` Wolfgang Grandegger
@ 2007-02-15 13:36   ` Jan Kiszka
  2007-02-15 13:54     ` Wolfgang Grandegger
  2007-02-15 14:01     ` Sebastian Smolorz
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Kiszka @ 2007-02-15 13:36 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2618 bytes --]

Wolfgang Grandegger wrote:
> Hi Jan,
> 
> Jan Kiszka wrote:
>> Hi Wolfgang,
>>
>> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
>> freshly registered but stopped devices simply blocks. And when you
>> meanwhile call rtcanconfig up on that device, rtcansend will continue to
>> block.
> 
> I see the expected behavior on stopped devices:
> 
>   bash-2.05b# rtcansend rtcan0
>   rt_dev_send: Network is down
> 
> This is, because the tx_sem is marked "destroyed" already when the
> device gets initialized. I wonder why your app blocks.

Unlikely, because the sem is _not_ marked destroyed on startup:

http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_dev.c#181

In case it makes a subtle difference: I tested with xeno_can_virt, but I
think I saw the same effect on real SJA1000 hw as well.

> 
>> The reason is that during startup of CAN devices the tx-semaphore is
>> re-initialised while it is already set up during device registration.
>> Re-init on an already initialised rtdm_sem is, say, undefined.
> 
> That makes definitely trouble. A CAN_MODE_START should only start the
> device if it's not active. The attached patch fixes this. Another
> possibility would be to force a CAN_MODE_STOP in case the device is
> already active (=restart).
> 
>> So rtdm_sem should better only be initialised/destroyed by the drivers.
>> Trying to send on a shut down CAN device could be catched differently,
>> e.g. via the device state. This likely needs more thoughts, take it as a
>> note that $something should be done.
> 
> With the fix above I do not see further problems with the existing
> implementation using the "destroyed" state to mark the device unavailable.

The problem of double init remains. I don't think that the sem state
should be used for encoding the CAN device state towards potential senders.

Jan

> 
> Wolfgang.
> 
> 
> ------------------------------------------------------------------------
> 
> + diff -u xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_raw_dev.c
> --- xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM	2007-02-15 11:21:43.000000000 +0100
> +++ xenomai/ksrc/drivers/can/rtcan_raw_dev.c	2007-02-15 14:16:16.000000000 +0100
> @@ -193,7 +193,8 @@
>      switch (request) {
>      case SIOCSCANMODE:
>  	mode = (can_mode_t *)&ifr->ifr_ifru;
> -	if (dev->do_set_mode)
> +	if (dev->do_set_mode &&
> +	    !(*mode == CAN_MODE_START && CAN_STATE_OPERATING(dev->state)))
>  	    ret = dev->do_set_mode(dev, *mode, &lock_ctx);
>  	break;
>  



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 13:36   ` Jan Kiszka
@ 2007-02-15 13:54     ` Wolfgang Grandegger
  2007-02-15 14:29       ` Jan Kiszka
  2007-02-15 14:01     ` Sebastian Smolorz
  1 sibling, 1 reply; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-15 13:54 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2329 bytes --]

Jan Kiszka wrote:
> Wolfgang Grandegger wrote:
>> Hi Jan,
>>
>> Jan Kiszka wrote:
>>> Hi Wolfgang,
>>>
>>> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
>>> freshly registered but stopped devices simply blocks. And when you
>>> meanwhile call rtcanconfig up on that device, rtcansend will continue to
>>> block.
>> I see the expected behavior on stopped devices:
>>
>>   bash-2.05b# rtcansend rtcan0
>>   rt_dev_send: Network is down
>>
>> This is, because the tx_sem is marked "destroyed" already when the
>> device gets initialized. I wonder why your app blocks.
> 
> Unlikely, because the sem is _not_ marked destroyed on startup:
> 
> http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_dev.c#181

You missed

     /* Set dummy state for following call */
     dev->state = CAN_STATE_ACTIVE;
     /* Enter reset mode */
     rtcan_sja_mode_stop(dev, NULL);

in the init part of rtcan_sja1000.c and rtcan_mscan.c, ...

> 
> In case it makes a subtle difference: I tested with xeno_can_virt, but I
> think I saw the same effect on real SJA1000 hw as well.

... which is missing in rtcan_virt.c. I already thought moving it to the 
common part. The attached patch fixes that.

>>> The reason is that during startup of CAN devices the tx-semaphore is
>>> re-initialised while it is already set up during device registration.
>>> Re-init on an already initialised rtdm_sem is, say, undefined.
>> That makes definitely trouble. A CAN_MODE_START should only start the
>> device if it's not active. The attached patch fixes this. Another
>> possibility would be to force a CAN_MODE_STOP in case the device is
>> already active (=restart).
>>
>>> So rtdm_sem should better only be initialised/destroyed by the drivers.
>>> Trying to send on a shut down CAN device could be catched differently,
>>> e.g. via the device state. This likely needs more thoughts, take it as a
>>> note that $something should be done.
>> With the fix above I do not see further problems with the existing
>> implementation using the "destroyed" state to mark the device unavailable.
> 
> The problem of double init remains. I don't think that the sem state
> should be used for encoding the CAN device state towards potential senders.

As I see it, Sebastian's trick saves code and synchronization.

Wolfgang.


[-- Attachment #2: xenomai-rtcan-txsem2.patch --]
[-- Type: text/x-patch, Size: 1169 bytes --]

+ diff -u xenomai/ksrc/drivers/can/rtcan_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_dev.c
--- xenomai/ksrc/drivers/can/rtcan_dev.c.TXSEM	2007-02-15 11:21:43.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_dev.c	2007-02-15 14:52:20.000000000 +0100
@@ -176,9 +176,9 @@
 
     rtdm_lock_init(&dev->device_lock);
 
-    /* Init TX Semaphore, will be destroyed forthwith
-     * when setting stop mode */
+    /* Init and destroy TX semaphore to mark TX unavailable. */
     rtdm_sem_init(&dev->tx_sem, 0);
+    rtdm_sem_destroy(&dev->tx_sem);
 #ifdef RTCAN_USE_REFCOUNT
     atomic_set(&dev->refcount, 0);
 #endif
+ diff -u xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_raw_dev.c
--- xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM	2007-02-15 11:21:43.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_raw_dev.c	2007-02-15 14:16:16.000000000 +0100
@@ -193,7 +193,8 @@
     switch (request) {
     case SIOCSCANMODE:
 	mode = (can_mode_t *)&ifr->ifr_ifru;
-	if (dev->do_set_mode)
+	if (dev->do_set_mode &&
+	    !(*mode == CAN_MODE_START && CAN_STATE_OPERATING(dev->state)))
 	    ret = dev->do_set_mode(dev, *mode, &lock_ctx);
 	break;
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 13:36   ` Jan Kiszka
  2007-02-15 13:54     ` Wolfgang Grandegger
@ 2007-02-15 14:01     ` Sebastian Smolorz
  1 sibling, 0 replies; 15+ messages in thread
From: Sebastian Smolorz @ 2007-02-15 14:01 UTC (permalink / raw)
  To: xenomai; +Cc: Jan Kiszka

Jan Kiszka wrote:
> I don't think that the sem state
> should be used for encoding the CAN device state towards potential senders.

Why not?

--
Sebastian


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 13:54     ` Wolfgang Grandegger
@ 2007-02-15 14:29       ` Jan Kiszka
  2007-02-15 16:45         ` Wolfgang Grandegger
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2007-02-15 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 3008 bytes --]

Wolfgang Grandegger wrote:
> Jan Kiszka wrote:
>> Wolfgang Grandegger wrote:
>>> Hi Jan,
>>>
>>> Jan Kiszka wrote:
>>>> Hi Wolfgang,
>>>>
>>>> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
>>>> freshly registered but stopped devices simply blocks. And when you
>>>> meanwhile call rtcanconfig up on that device, rtcansend will
>>>> continue to
>>>> block.
>>> I see the expected behavior on stopped devices:
>>>
>>>   bash-2.05b# rtcansend rtcan0
>>>   rt_dev_send: Network is down
>>>
>>> This is, because the tx_sem is marked "destroyed" already when the
>>> device gets initialized. I wonder why your app blocks.
>>
>> Unlikely, because the sem is _not_ marked destroyed on startup:
>>
>> http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_dev.c#181
>>
> 
> You missed
> 
>     /* Set dummy state for following call */
>     dev->state = CAN_STATE_ACTIVE;
>     /* Enter reset mode */
>     rtcan_sja_mode_stop(dev, NULL);
> 
> in the init part of rtcan_sja1000.c and rtcan_mscan.c, ...
> 
>>
>> In case it makes a subtle difference: I tested with xeno_can_virt, but I
>> think I saw the same effect on real SJA1000 hw as well.
> 
> ... which is missing in rtcan_virt.c. I already thought moving it to the
> common part. The attached patch fixes that.

I'm still not happy with this pattern, at least it's hidden from the
drivers now. But do not rtcan_sja1000.c and rtcan_mscan.c require some
changes as well to avoid double destruction when this happens in generic
code from now on?

> 
>>>> The reason is that during startup of CAN devices the tx-semaphore is
>>>> re-initialised while it is already set up during device registration.
>>>> Re-init on an already initialised rtdm_sem is, say, undefined.
>>> That makes definitely trouble. A CAN_MODE_START should only start the
>>> device if it's not active. The attached patch fixes this. Another
>>> possibility would be to force a CAN_MODE_STOP in case the device is
>>> already active (=restart).
>>>
>>>> So rtdm_sem should better only be initialised/destroyed by the drivers.
>>>> Trying to send on a shut down CAN device could be catched differently,
>>>> e.g. via the device state. This likely needs more thoughts, take it
>>>> as a
>>>> note that $something should be done.
>>> With the fix above I do not see further problems with the existing
>>> implementation using the "destroyed" state to mark the device
>>> unavailable.
>>
>> The problem of double init remains. I don't think that the sem state
>> should be used for encoding the CAN device state towards potential
>> senders.
> 
> As I see it, Sebastian's trick saves code and synchronization.

Well, we are discussing a single test here. Or doesn't dev->state hold
the require information as well. Moreover, the existing pattern was easy
to get wrong between core and driver.

On the other hand, if we manage to clean the code up, I may change my mind.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 14:29       ` Jan Kiszka
@ 2007-02-15 16:45         ` Wolfgang Grandegger
  2007-02-17 10:19           ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-15 16:45 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 3432 bytes --]

Jan Kiszka wrote:
> Wolfgang Grandegger wrote:
>> Jan Kiszka wrote:
>>> Wolfgang Grandegger wrote:
>>>> Hi Jan,
>>>>
>>>> Jan Kiszka wrote:
>>>>> Hi Wolfgang,
>>>>>
>>>>> fiddling with the CAN utils, I noticed the behaviour that rtcansend on
>>>>> freshly registered but stopped devices simply blocks. And when you
>>>>> meanwhile call rtcanconfig up on that device, rtcansend will
>>>>> continue to
>>>>> block.
>>>> I see the expected behavior on stopped devices:
>>>>
>>>>   bash-2.05b# rtcansend rtcan0
>>>>   rt_dev_send: Network is down
>>>>
>>>> This is, because the tx_sem is marked "destroyed" already when the
>>>> device gets initialized. I wonder why your app blocks.
>>> Unlikely, because the sem is _not_ marked destroyed on startup:
>>>
>>> http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_dev.c#181
>>>
>> You missed
>>
>>     /* Set dummy state for following call */
>>     dev->state = CAN_STATE_ACTIVE;
>>     /* Enter reset mode */
>>     rtcan_sja_mode_stop(dev, NULL);
>>
>> in the init part of rtcan_sja1000.c and rtcan_mscan.c, ...
>>
>>> In case it makes a subtle difference: I tested with xeno_can_virt, but I
>>> think I saw the same effect on real SJA1000 hw as well.
>> ... which is missing in rtcan_virt.c. I already thought moving it to the
>> common part. The attached patch fixes that.
> 
> I'm still not happy with this pattern, at least it's hidden from the
> drivers now. But do not rtcan_sja1000.c and rtcan_mscan.c require some
> changes as well to avoid double destruction when this happens in generic
> code from now on?

Well, what needs a better implementaion is the code fragment shown above 
but I don't have a quick solution at hand. Therefore I tend to the 
attached patch.

>>>>> The reason is that during startup of CAN devices the tx-semaphore is
>>>>> re-initialised while it is already set up during device registration.
>>>>> Re-init on an already initialised rtdm_sem is, say, undefined.
>>>> That makes definitely trouble. A CAN_MODE_START should only start the
>>>> device if it's not active. The attached patch fixes this. Another
>>>> possibility would be to force a CAN_MODE_STOP in case the device is
>>>> already active (=restart).
>>>>
>>>>> So rtdm_sem should better only be initialised/destroyed by the drivers.
>>>>> Trying to send on a shut down CAN device could be catched differently,
>>>>> e.g. via the device state. This likely needs more thoughts, take it
>>>>> as a
>>>>> note that $something should be done.
>>>> With the fix above I do not see further problems with the existing
>>>> implementation using the "destroyed" state to mark the device
>>>> unavailable.
>>> The problem of double init remains. I don't think that the sem state
>>> should be used for encoding the CAN device state towards potential
>>> senders.
>> As I see it, Sebastian's trick saves code and synchronization.
> 
> Well, we are discussing a single test here. Or doesn't dev->state hold
> the require information as well. Moreover, the existing pattern was easy
> to get wrong between core and driver.

We speak about adding:

  if (!CAN_STATE_OPERATING(dev->state)) {
	if (dev->state == CAN_STATE_SLEEPING)
	    return-ECOMM;
	else
	    return-ENETDOWN;
  }

before calling rtdm_sem_timeddown().

> On the other hand, if we manage to clean the code up, I may change my mind.

I think we should avoid the above lines in a frequently used function.

Wolfgang.

[-- Attachment #2: xenomai-rtcan-txsem3.patch --]
[-- Type: text/x-patch, Size: 2063 bytes --]

+ diff -u xenomai/ksrc/drivers/can/rtcan_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_dev.c
+ diff -u xenomai/ksrc/drivers/can/rtcan_raw.c.TXSEM xenomai/ksrc/drivers/can/rtcan_raw.c
--- xenomai/ksrc/drivers/can/rtcan_raw.c.TXSEM	2007-02-15 17:11:11.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_raw.c	2007-02-15 13:46:02.000000000 +0100
@@ -964,12 +964,13 @@
 
     /* Controller should be operating */
     if (!CAN_STATE_OPERATING(dev->state)) {
+	if (dev->state == CAN_STATE_SLEEPING) {
+	    ret = -ECOMM;
+	    rtdm_sem_up(&dev->tx_sem);
+	    goto send_out2;
+	}
         ret = -ENETDOWN;
         goto send_out2;
-    } else if (dev->state == CAN_STATE_SLEEPING) {
-        ret = -ECOMM;
-        rtdm_sem_up(&dev->tx_sem);
-        goto send_out2;
     }
 
     dev->tx_count++;
+ diff -u xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM xenomai/ksrc/drivers/can/rtcan_raw_dev.c
--- xenomai/ksrc/drivers/can/rtcan_raw_dev.c.TXSEM	2007-02-15 11:21:43.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_raw_dev.c	2007-02-15 14:16:16.000000000 +0100
@@ -193,7 +193,8 @@
     switch (request) {
     case SIOCSCANMODE:
 	mode = (can_mode_t *)&ifr->ifr_ifru;
-	if (dev->do_set_mode)
+	if (dev->do_set_mode &&
+	    !(*mode == CAN_MODE_START && CAN_STATE_OPERATING(dev->state)))
 	    ret = dev->do_set_mode(dev, *mode, &lock_ctx);
 	break;
 
+ diff -u xenomai/ksrc/drivers/can/rtcan_virt.c.TXSEM xenomai/ksrc/drivers/can/rtcan_virt.c
--- xenomai/ksrc/drivers/can/rtcan_virt.c.TXSEM	2007-02-15 11:24:14.000000000 +0100
+++ xenomai/ksrc/drivers/can/rtcan_virt.c	2007-02-15 17:13:23.000000000 +0100
@@ -102,8 +102,7 @@
 		break;
 
 	case CAN_MODE_START:
-		if (dev->state == CAN_STATE_STOPPED)
-			rtdm_sem_init(&dev->tx_sem, VIRT_TX_BUFS);
+		rtdm_sem_init(&dev->tx_sem, VIRT_TX_BUFS);
 		dev->state = CAN_STATE_ACTIVE;
 		break;
 
@@ -126,7 +125,7 @@
 	dev->ctrl_name = virt_ctlr_name;
 	dev->board_name = virt_board_name;
 
-	dev->state = CAN_STATE_STOPPED;
+	rtcan_virt_set_mode(dev, CAN_MODE_STOP, NULL);
 
 	strncpy(dev->name, RTCAN_DEV_NAME, IFNAMSIZ);
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-15 16:45         ` Wolfgang Grandegger
@ 2007-02-17 10:19           ` Jan Kiszka
  2007-02-18 18:31             ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2007-02-17 10:19 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 3893 bytes --]

Wolfgang Grandegger wrote:
> Jan Kiszka wrote:
>> Wolfgang Grandegger wrote:
>>> Jan Kiszka wrote:
>>>> Wolfgang Grandegger wrote:
>>>>> Hi Jan,
>>>>>
>>>>> Jan Kiszka wrote:
>>>>>> Hi Wolfgang,
>>>>>>
>>>>>> fiddling with the CAN utils, I noticed the behaviour that
>>>>>> rtcansend on
>>>>>> freshly registered but stopped devices simply blocks. And when you
>>>>>> meanwhile call rtcanconfig up on that device, rtcansend will
>>>>>> continue to
>>>>>> block.
>>>>> I see the expected behavior on stopped devices:
>>>>>
>>>>>   bash-2.05b# rtcansend rtcan0
>>>>>   rt_dev_send: Network is down
>>>>>
>>>>> This is, because the tx_sem is marked "destroyed" already when the
>>>>> device gets initialized. I wonder why your app blocks.
>>>> Unlikely, because the sem is _not_ marked destroyed on startup:
>>>>
>>>> http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_dev.c#181
>>>>
>>>>
>>> You missed
>>>
>>>     /* Set dummy state for following call */
>>>     dev->state = CAN_STATE_ACTIVE;
>>>     /* Enter reset mode */
>>>     rtcan_sja_mode_stop(dev, NULL);
>>>
>>> in the init part of rtcan_sja1000.c and rtcan_mscan.c, ...
>>>
>>>> In case it makes a subtle difference: I tested with xeno_can_virt,
>>>> but I
>>>> think I saw the same effect on real SJA1000 hw as well.
>>> ... which is missing in rtcan_virt.c. I already thought moving it to the
>>> common part. The attached patch fixes that.
>>
>> I'm still not happy with this pattern, at least it's hidden from the
>> drivers now. But do not rtcan_sja1000.c and rtcan_mscan.c require some
>> changes as well to avoid double destruction when this happens in generic
>> code from now on?
> 
> Well, what needs a better implementaion is the code fragment shown above
> but I don't have a quick solution at hand. Therefore I tend to the
> attached patch.
> 
>>>>>> The reason is that during startup of CAN devices the tx-semaphore is
>>>>>> re-initialised while it is already set up during device registration.
>>>>>> Re-init on an already initialised rtdm_sem is, say, undefined.
>>>>> That makes definitely trouble. A CAN_MODE_START should only start the
>>>>> device if it's not active. The attached patch fixes this. Another
>>>>> possibility would be to force a CAN_MODE_STOP in case the device is
>>>>> already active (=restart).
>>>>>
>>>>>> So rtdm_sem should better only be initialised/destroyed by the
>>>>>> drivers.
>>>>>> Trying to send on a shut down CAN device could be catched
>>>>>> differently,
>>>>>> e.g. via the device state. This likely needs more thoughts, take it
>>>>>> as a
>>>>>> note that $something should be done.
>>>>> With the fix above I do not see further problems with the existing
>>>>> implementation using the "destroyed" state to mark the device
>>>>> unavailable.
>>>> The problem of double init remains. I don't think that the sem state
>>>> should be used for encoding the CAN device state towards potential
>>>> senders.
>>> As I see it, Sebastian's trick saves code and synchronization.
>>
>> Well, we are discussing a single test here. Or doesn't dev->state hold
>> the require information as well. Moreover, the existing pattern was easy
>> to get wrong between core and driver.
> 
> We speak about adding:
> 
>  if (!CAN_STATE_OPERATING(dev->state)) {
>     if (dev->state == CAN_STATE_SLEEPING)
>         return-ECOMM;
>     else
>         return-ENETDOWN;
>  }
> 
> before calling rtdm_sem_timeddown().
> 

I thought about this issue again and found the reason for my vague bad
feeling: re-init is not atomic, thus racy. But also the test+sem_init
pattern I suggested is not save.

I guess we need to enhance rtdm_XXX_init in this regard to make the
RT-CAN use case an officially allowed one. /me is planning to spend more
thoughts on this the next days.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 249 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-17 10:19           ` Jan Kiszka
@ 2007-02-18 18:31             ` Jan Kiszka
  2007-02-18 19:53               ` Wolfgang Grandegger
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2007-02-18 18:31 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 756 bytes --]

Jan Kiszka wrote:
> I thought about this issue again and found the reason for my vague bad
> feeling: re-init is not atomic, thus racy. But also the test+sem_init
> pattern I suggested is not save.
> 
> I guess we need to enhance rtdm_XXX_init in this regard to make the
> RT-CAN use case an officially allowed one. /me is planning to spend more
> thoughts on this the next days.

OK, done, rtdm_{event,sem,mutex}_init are now protected by the nklock in
trunk. This should make the in-place re-initialisation of RTDM IPC
objects race-free and allow to use them in RT-CAN as originally
intended. I think I will back-port that pieces also to v2.3.x later.

What patch should now go in to avoid double init/destroy and fix rtcan_virt?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 249 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-18 18:31             ` Jan Kiszka
@ 2007-02-18 19:53               ` Wolfgang Grandegger
  2007-02-18 20:37                 ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-18 19:53 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 803 bytes --]

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> I thought about this issue again and found the reason for my vague bad
>> feeling: re-init is not atomic, thus racy. But also the test+sem_init
>> pattern I suggested is not save.
>>
>> I guess we need to enhance rtdm_XXX_init in this regard to make the
>> RT-CAN use case an officially allowed one. /me is planning to spend more
>> thoughts on this the next days.
> 
> OK, done, rtdm_{event,sem,mutex}_init are now protected by the nklock in
> trunk. This should make the in-place re-initialisation of RTDM IPC
> objects race-free and allow to use them in RT-CAN as originally
> intended. I think I will back-port that pieces also to v2.3.x later.
> 
> What patch should now go in to avoid double init/destroy and fix rtcan_virt?

Attached. Thanks.

Wolfgang.


[-- Attachment #2: xenomai-rtcan-txsem.patch --]
[-- Type: text/x-patch, Size: 2943 bytes --]

Index: quilt/xenomai/ksrc/drivers/can/rtcan_raw.c
===================================================================
--- quilt.orig/xenomai/ksrc/drivers/can/rtcan_raw.c
+++ quilt/xenomai/ksrc/drivers/can/rtcan_raw.c
@@ -967,12 +967,13 @@ ssize_t rtcan_raw_sendmsg(struct rtdm_de
 
     /* Controller should be operating */
     if (!CAN_STATE_OPERATING(dev->state)) {
+	if (dev->state == CAN_STATE_SLEEPING) {
+	    ret = -ECOMM;
+	    rtdm_sem_up(&dev->tx_sem);
+	    goto send_out2;
+	}
         ret = -ENETDOWN;
         goto send_out2;
-    } else if (dev->state == CAN_STATE_SLEEPING) {
-        ret = -ECOMM;
-        rtdm_sem_up(&dev->tx_sem);
-        goto send_out2;
     }
 
     dev->tx_count++;
Index: quilt/xenomai/ksrc/drivers/can/rtcan_raw_dev.c
===================================================================
--- quilt.orig/xenomai/ksrc/drivers/can/rtcan_raw_dev.c
+++ quilt/xenomai/ksrc/drivers/can/rtcan_raw_dev.c
@@ -193,7 +193,8 @@ static inline int rtcan_raw_ioctl_dev_se
     switch (request) {
     case SIOCSCANMODE:
 	mode = (can_mode_t *)&ifr->ifr_ifru;
-	if (dev->do_set_mode)
+	if (dev->do_set_mode &&
+	    !(*mode == CAN_MODE_START && CAN_STATE_OPERATING(dev->state)))
 	    ret = dev->do_set_mode(dev, *mode, &lock_ctx);
 	break;
 
Index: quilt/xenomai/ksrc/drivers/can/rtcan_virt.c
===================================================================
--- quilt.orig/xenomai/ksrc/drivers/can/rtcan_virt.c
+++ quilt/xenomai/ksrc/drivers/can/rtcan_virt.c
@@ -102,8 +102,7 @@ static int rtcan_virt_set_mode(struct rt
 		break;
 
 	case CAN_MODE_START:
-		if (dev->state == CAN_STATE_STOPPED)
-			rtdm_sem_init(&dev->tx_sem, VIRT_TX_BUFS);
+		rtdm_sem_init(&dev->tx_sem, VIRT_TX_BUFS);
 		dev->state = CAN_STATE_ACTIVE;
 		break;
 
@@ -126,7 +125,7 @@ static int __init rtcan_virt_init_one(in
 	dev->ctrl_name = virt_ctlr_name;
 	dev->board_name = virt_board_name;
 
-	dev->state = CAN_STATE_STOPPED;
+	rtcan_virt_set_mode(dev, CAN_MODE_STOP, NULL);
 
 	strncpy(dev->name, RTCAN_DEV_NAME, IFNAMSIZ);
 
Index: quilt/xenomai/ChangeLog
===================================================================
--- quilt.orig/xenomai/ChangeLog
+++ quilt/xenomai/ChangeLog
@@ -1,5 +1,15 @@
 2007-02-18  Wolfgang Grandegger  <wg@domain.hid>
 
+	* xenomai/ksrc/drivers/can/rtcan_raw.c,
+	xenomai/ksrc/drivers/can/rtcan_raw_dev.c
+	xenomai/ksrc/drivers/can/rtcan_virt.c: The TX semaphore must be in
+	a destroyed state for proper operation. This was not the case for
+	the "virt" driver. Furthermore, to avoid re-initialization of the
+	TX semaphore, the device specific start routine is only called when
+	the device is not operating to avoid re-initialization.
+
+2007-02-18  Wolfgang Grandegger  <wg@domain.hid>
+
 	* ksrc/drivers/can/rtcan_raw.c, ksrc/drivers/can/rtcan_socket.h: add
 	prefix RTCAN_ to TIMESTAMP_SIZE, HAS_TIMESTAMP and HAS_NO_TIMESTAMP
 	to avoild name clashes, e.g. TIMESTAMP_SIZE is used by the kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-18 19:53               ` Wolfgang Grandegger
@ 2007-02-18 20:37                 ` Jan Kiszka
  2007-02-19 14:04                   ` Wolfgang Grandegger
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2007-02-18 20:37 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]

Wolfgang Grandegger wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> I thought about this issue again and found the reason for my vague bad
>>> feeling: re-init is not atomic, thus racy. But also the test+sem_init
>>> pattern I suggested is not save.
>>>
>>> I guess we need to enhance rtdm_XXX_init in this regard to make the
>>> RT-CAN use case an officially allowed one. /me is planning to spend more
>>> thoughts on this the next days.
>>
>> OK, done, rtdm_{event,sem,mutex}_init are now protected by the nklock in
>> trunk. This should make the in-place re-initialisation of RTDM IPC
>> objects race-free and allow to use them in RT-CAN as originally
>> intended. I think I will back-port that pieces also to v2.3.x later.
>>
>> What patch should now go in to avoid double init/destroy and fix
>> rtcan_virt?
> 
> Attached. Thanks.
> 

Just applied to trunk.

To avoid lock nesting where possible, I reordered some code in
rtcan_raw_sendmsg. Please re-check if I messed anything up. Then I could
commit to stable, too.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 249 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-18 20:37                 ` Jan Kiszka
@ 2007-02-19 14:04                   ` Wolfgang Grandegger
  2007-02-19 14:15                     ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-19 14:04 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Wolfgang Grandegger wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> I thought about this issue again and found the reason for my vague bad
>>>> feeling: re-init is not atomic, thus racy. But also the test+sem_init
>>>> pattern I suggested is not save.
>>>>
>>>> I guess we need to enhance rtdm_XXX_init in this regard to make the
>>>> RT-CAN use case an officially allowed one. /me is planning to spend more
>>>> thoughts on this the next days.
>>> OK, done, rtdm_{event,sem,mutex}_init are now protected by the nklock in
>>> trunk. This should make the in-place re-initialisation of RTDM IPC
>>> objects race-free and allow to use them in RT-CAN as originally
>>> intended. I think I will back-port that pieces also to v2.3.x later.

Well, I looked to your changes and do not yet understand what they are 
good for. From my point of view, it does _only_ make sense to use "init" 
and "destroy" in a serialized way, like RT-Socket-CAN does. If you do 
"re-init" without destroy, you might be in trouble if the object is in a 
wait state. Have I missed something?

>>> What patch should now go in to avoid double init/destroy and fix
>>> rtcan_virt?
>> Attached. Thanks.
>>
> 
> Just applied to trunk.

Thanks.

> To avoid lock nesting where possible, I reordered some code in
> rtcan_raw_sendmsg. Please re-check if I messed anything up. Then I could
> commit to stable, too.

Hm, I do not see critical changes in rtcan_raw.c. I have tested the CAN 
drivers today and did'nt realize any new problem, apart from an old bug 
:-(fix coming soon).

Wolfgang.





> Jan
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-19 14:04                   ` Wolfgang Grandegger
@ 2007-02-19 14:15                     ` Jan Kiszka
  2007-02-19 14:48                       ` Wolfgang Grandegger
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2007-02-19 14:15 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2471 bytes --]

Wolfgang Grandegger wrote:
> Jan Kiszka wrote:
>> Wolfgang Grandegger wrote:
>>> Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>> I thought about this issue again and found the reason for my vague bad
>>>>> feeling: re-init is not atomic, thus racy. But also the test+sem_init
>>>>> pattern I suggested is not save.
>>>>>
>>>>> I guess we need to enhance rtdm_XXX_init in this regard to make the
>>>>> RT-CAN use case an officially allowed one. /me is planning to spend
>>>>> more
>>>>> thoughts on this the next days.
>>>> OK, done, rtdm_{event,sem,mutex}_init are now protected by the
>>>> nklock in
>>>> trunk. This should make the in-place re-initialisation of RTDM IPC
>>>> objects race-free and allow to use them in RT-CAN as originally
>>>> intended. I think I will back-port that pieces also to v2.3.x later.
> 
> Well, I looked to your changes and do not yet understand what they are
> good for. From my point of view, it does _only_ make sense to use "init"
> and "destroy" in a serialized way, like RT-Socket-CAN does. If you do
> "re-init" without destroy, you might be in trouble if the object is in a
> wait state. Have I missed something?

I didn't target re-init without previous destroy - that remains illegal
as before.

But when you poke on some destroyed object like CAN does, you have to
ensure that the user never sees some half-initialised object. You could
achieve this by setting the object state last and spreading memory
barriers, but for now I resorted to a simple lock around the whole
process to avoid changing nucleus code for this special use case.

> 
>>>> What patch should now go in to avoid double init/destroy and fix
>>>> rtcan_virt?
>>> Attached. Thanks.
>>>
>>
>> Just applied to trunk.
> 
> Thanks.
> 
>> To avoid lock nesting where possible, I reordered some code in
>> rtcan_raw_sendmsg. Please re-check if I messed anything up. Then I could
>> commit to stable, too.
> 
> Hm, I do not see critical changes in rtcan_raw.c. I have tested the CAN
> drivers today and did'nt realize any new problem, apart from an old bug
> :-(fix coming soon).

This block was affected:

http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_raw.c?v=SVN-trunk#965

As it is an error path, testing may not reveal potential bugs quick
enough. I just wanted you to have an unbiased glance. But as I trusted
myself this time, the code made it into 2.3.x already. :)

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Re: RT-CAN tx_sem
  2007-02-19 14:15                     ` Jan Kiszka
@ 2007-02-19 14:48                       ` Wolfgang Grandegger
  0 siblings, 0 replies; 15+ messages in thread
From: Wolfgang Grandegger @ 2007-02-19 14:48 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Wolfgang Grandegger wrote:
>> Jan Kiszka wrote:
>>> Wolfgang Grandegger wrote:
>>>> Jan Kiszka wrote:
>>>>> Jan Kiszka wrote:
>>>>>> I thought about this issue again and found the reason for my vague bad
>>>>>> feeling: re-init is not atomic, thus racy. But also the test+sem_init
>>>>>> pattern I suggested is not save.
>>>>>>
>>>>>> I guess we need to enhance rtdm_XXX_init in this regard to make the
>>>>>> RT-CAN use case an officially allowed one. /me is planning to spend
>>>>>> more
>>>>>> thoughts on this the next days.
>>>>> OK, done, rtdm_{event,sem,mutex}_init are now protected by the
>>>>> nklock in
>>>>> trunk. This should make the in-place re-initialisation of RTDM IPC
>>>>> objects race-free and allow to use them in RT-CAN as originally
>>>>> intended. I think I will back-port that pieces also to v2.3.x later.
>> Well, I looked to your changes and do not yet understand what they are
>> good for. From my point of view, it does _only_ make sense to use "init"
>> and "destroy" in a serialized way, like RT-Socket-CAN does. If you do
>> "re-init" without destroy, you might be in trouble if the object is in a
>> wait state. Have I missed something?
> 
> I didn't target re-init without previous destroy - that remains illegal
> as before.
> 
> But when you poke on some destroyed object like CAN does, you have to
> ensure that the user never sees some half-initialised object. You could
> achieve this by setting the object state last and spreading memory
> barriers, but for now I resorted to a simple lock around the whole
> process to avoid changing nucleus code for this special use case.

OK, for RTDM over Linux-rt, I put the clearing of the destroy flag in 
the init function to the end.

>>>>> What patch should now go in to avoid double init/destroy and fix
>>>>> rtcan_virt?
>>>> Attached. Thanks.
>>>>
>>> Just applied to trunk.
>> Thanks.
>>
>>> To avoid lock nesting where possible, I reordered some code in
>>> rtcan_raw_sendmsg. Please re-check if I messed anything up. Then I could
>>> commit to stable, too.
>> Hm, I do not see critical changes in rtcan_raw.c. I have tested the CAN
>> drivers today and did'nt realize any new problem, apart from an old bug
>> :-(fix coming soon).
> 
> This block was affected:
> 
> http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/drivers/can/rtcan_raw.c?v=SVN-trunk#965
> 
> As it is an error path, testing may not reveal potential bugs quick
> enough. I just wanted you to have an unbiased glance. But as I trusted
> myself this time, the code made it into 2.3.x already. :)

Ah, OK, looks good ;-).

Thanks,

Wolfgang.


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2007-02-19 14:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-14 18:30 [Xenomai-core] RT-CAN tx_sem Jan Kiszka
2007-02-15  8:37 ` [Xenomai-core] " Wolfgang Grandegger
2007-02-15 13:25 ` Wolfgang Grandegger
2007-02-15 13:36   ` Jan Kiszka
2007-02-15 13:54     ` Wolfgang Grandegger
2007-02-15 14:29       ` Jan Kiszka
2007-02-15 16:45         ` Wolfgang Grandegger
2007-02-17 10:19           ` Jan Kiszka
2007-02-18 18:31             ` Jan Kiszka
2007-02-18 19:53               ` Wolfgang Grandegger
2007-02-18 20:37                 ` Jan Kiszka
2007-02-19 14:04                   ` Wolfgang Grandegger
2007-02-19 14:15                     ` Jan Kiszka
2007-02-19 14:48                       ` Wolfgang Grandegger
2007-02-15 14:01     ` Sebastian Smolorz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.