All of lore.kernel.org
 help / color / mirror / Atom feed
* Deadcode discussion based on Arm NS phys timer
@ 2022-10-24  9:07 Michal Orzel
  2022-10-24 10:51 ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Orzel @ 2022-10-24  9:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, Stefano Stabellini, Bertrand Marquis

Hello,

Recently I came across a deadcode in Xen Arm arch timer code. Briefly speaking, we are routing
the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use of it (as it uses the hypervisor timer CNTHP).
This timer is fully emulated, which means that there is nothing that can trigger such IRQ. This code is
a left over from early days, where the CNTHP was buggy on some models and we had to use the CNTP instead.

As far as the problem itself is not really interesting, it raises a question of what to do with a deadcode,
as there might be/are other deadcode places in Xen. One may say that it is useful to keep it, because one day,
someone might need it when dealing with yet another broken HW. Such person would still need to modify the other
part of the code (e.g. reprogram_timer), but there would be less work required overall. Personally, I'm not in favor of
such approach, because we should not really support possible scenarios with broken HW (except for erratas listing known issues).
Also, as part of the certification/FUSA process, there should be no deadcode and we should have explanation for every block of code.

There are different ways to deal with a deadcode:
1. Get rid of it completely
2. Leave it as it is
3. Admit that it can be useful one day and:
  3.1. protect it with #if 0
  3.2. protect it with a new Kconfig option (disabled by default) using #ifdef
  3.3. protect it with a new Kconfig option (disabled by default) using IS_ENABLED (to make sure code always compile)
  3.4. protect it with a command line option (allowing to choose the timer to be used by Xen)
...

Let me know what you think.

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-24  9:07 Deadcode discussion based on Arm NS phys timer Michal Orzel
@ 2022-10-24 10:51 ` Julien Grall
  2022-10-24 11:41   ` Michal Orzel
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2022-10-24 10:51 UTC (permalink / raw)
  To: Michal Orzel, xen-devel; +Cc: Stefano Stabellini, Bertrand Marquis

On 24/10/2022 10:07, Michal Orzel wrote:
> Hello,

Hi Michal,

> Recently I came across a deadcode in Xen Arm arch timer code. Briefly speaking, we are routing
> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use of it (as it uses the hypervisor timer CNTHP).
> This timer is fully emulated, which means that there is nothing that can trigger such IRQ. This code is
> a left over from early days, where the CNTHP was buggy on some models and we had to use the CNTP instead.
> 
> As far as the problem itself is not really interesting, it raises a question of what to do with a deadcode,
> as there might be/are other deadcode places in Xen.

There are multiple definition of deadcode. Depending on which one you 
chose, then this could cover IS_ENABLED() and possibly #ifdef. So this 
would result to a lot of places impacted with the decision.

So can you clarify what you mean by deadcode?

> One may say that it is useful to keep it, because one day,
> someone might need it when dealing with yet another broken HW. Such person would still need to modify the other
> part of the code (e.g. reprogram_timer), but there would be less work required overall. Personally, I'm not in favor of
> such approach, because we should not really support possible scenarios with broken HW (except for erratas listing known issues).

The difference between "broken HW" and "HW with known errata" is a bit 
unclear to me. Can you clarify how you would make the difference here?

In particular, at which point do you consider that the HW should not be 
supported by Xen?

> Also, as part of the certification/FUSA process, there should be no deadcode and we should have explanation for every block of code.

See above. What are you trying to cover by deadcode? Would protecting 
code with IS_ENABLED() (or #ifdef) ok?

> 
> There are different ways to deal with a deadcode: > 1. Get rid of it completely
> 2. Leave it as it is
> 3. Admit that it can be useful one day and:
>    3.1. protect it with #if 0
>    3.2. protect it with a new Kconfig option (disabled by default) using #ifdef
>    3.3. protect it with a new Kconfig option (disabled by default) using IS_ENABLED (to make sure code always compile)
>    3.4. protect it with a command line option (allowing to choose the timer to be used by Xen)
> ...
> 
> Let me know what you think.

Before answering the question, I would need some clarifications on your aim.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-24 10:51 ` Julien Grall
@ 2022-10-24 11:41   ` Michal Orzel
  2022-10-24 13:29     ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Orzel @ 2022-10-24 11:41 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Stefano Stabellini, Bertrand Marquis

Hi Julien,

On 24/10/2022 12:51, Julien Grall wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> On 24/10/2022 10:07, Michal Orzel wrote:
>> Hello,
> 
> Hi Michal,
> 
>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly speaking, we are routing
>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use of it (as it uses the hypervisor timer CNTHP).
>> This timer is fully emulated, which means that there is nothing that can trigger such IRQ. This code is
>> a left over from early days, where the CNTHP was buggy on some models and we had to use the CNTP instead.
>>
>> As far as the problem itself is not really interesting, it raises a question of what to do with a deadcode,
>> as there might be/are other deadcode places in Xen.
> 
> There are multiple definition of deadcode. Depending on which one you
> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
> would result to a lot of places impacted with the decision.
> 
> So can you clarify what you mean by deadcode?
In the timer example, I think we have both a deadcode and unreachable code.
For the purpose of this discussion, let's take the MISRA definition of a deadcode which is a "code that can be executed
but has no effect on the functional behavior of the program". This differs from the unreachable code definition that is
a "code that cannot be executed". Setting up the IRQ for Xen is an example of a deadcode. Code within IRQ handler is an unreachable code
(there is nothing that can trigger this IRQ).

What I mean by deadcode happens to be the sum of the two cases above i.e. the code that cannot be executed as well as the code that
does not impact the functionality of the program.

> 
>> One may say that it is useful to keep it, because one day,
>> someone might need it when dealing with yet another broken HW. Such person would still need to modify the other
>> part of the code (e.g. reprogram_timer), but there would be less work required overall. Personally, I'm not in favor of
>> such approach, because we should not really support possible scenarios with broken HW (except for erratas listing known issues).
> 
> The difference between "broken HW" and "HW with known errata" is a bit
> unclear to me. Can you clarify how you would make the difference here?
> 
> In particular, at which point do you consider that the HW should not be
> supported by Xen?
I'm not saying that HW should not be supported. The difference for me between broken HW and
HW with known errata is that for the former, the incorrect behavior is often due to the early support stage,
using emulators/models instead of real HW, whereas for the latter, the HW is already released and it happens to be that it is buggy
(the HW vendor is aware of the issue and released erratas). Do we have any example in Xen for supporting broken HW,
whose vendor is not aware of the issue or did not release any errata?

> 
>> Also, as part of the certification/FUSA process, there should be no deadcode and we should have explanation for every block of code.
> 
> See above. What are you trying to cover by deadcode? Would protecting
> code with IS_ENABLED() (or #ifdef) ok?
I think this would be ok from the certification point of view (this would at least means, that we are aware of the issue
and we took some steps). Otherwise, such code is just an example of a deadcode/unreachable code.

> 
>>
>> There are different ways to deal with a deadcode: > 1. Get rid of it completely
>> 2. Leave it as it is
>> 3. Admit that it can be useful one day and:
>>    3.1. protect it with #if 0
>>    3.2. protect it with a new Kconfig option (disabled by default) using #ifdef
>>    3.3. protect it with a new Kconfig option (disabled by default) using IS_ENABLED (to make sure code always compile)
>>    3.4. protect it with a command line option (allowing to choose the timer to be used by Xen)
>> ...
>>
>> Let me know what you think.
> 
> Before answering the question, I would need some clarifications on your aim.
> 
> Cheers,
> 
> --
> Julien Grall

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-24 11:41   ` Michal Orzel
@ 2022-10-24 13:29     ` Julien Grall
  2022-10-25  1:29       ` Stefano Stabellini
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2022-10-24 13:29 UTC (permalink / raw)
  To: Michal Orzel, xen-devel; +Cc: Stefano Stabellini, Bertrand Marquis



On 24/10/2022 12:41, Michal Orzel wrote:
> Hi Julien,

Hi Michal,

> 
> On 24/10/2022 12:51, Julien Grall wrote:
>> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>>
>>
>> On 24/10/2022 10:07, Michal Orzel wrote:
>>> Hello,
>>
>> Hi Michal,
>>
>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly speaking, we are routing
>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use of it (as it uses the hypervisor timer CNTHP).
>>> This timer is fully emulated, which means that there is nothing that can trigger such IRQ. This code is
>>> a left over from early days, where the CNTHP was buggy on some models and we had to use the CNTP instead.
>>>
>>> As far as the problem itself is not really interesting, it raises a question of what to do with a deadcode,
>>> as there might be/are other deadcode places in Xen.
>>
>> There are multiple definition of deadcode. Depending on which one you
>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>> would result to a lot of places impacted with the decision.
>>
>> So can you clarify what you mean by deadcode?
> In the timer example, I think we have both a deadcode and unreachable code.
> For the purpose of this discussion, let's take the MISRA definition of a deadcode which is a "code that can be executed
> but has no effect on the functional behavior of the program". This differs from the unreachable code definition that is
> a "code that cannot be executed". Setting up the IRQ for Xen is an example of a deadcode. Code within IRQ handler is an unreachable code
> (there is nothing that can trigger this IRQ).
> 
> What I mean by deadcode happens to be the sum of the two cases above i.e. the code that cannot be executed as well as the code that
> does not impact the functionality of the program.
> 
>>
>>> One may say that it is useful to keep it, because one day,
>>> someone might need it when dealing with yet another broken HW. Such person would still need to modify the other
>>> part of the code (e.g. reprogram_timer), but there would be less work required overall. Personally, I'm not in favor of
>>> such approach, because we should not really support possible scenarios with broken HW (except for erratas listing known issues).
>>
>> The difference between "broken HW" and "HW with known errata" is a bit
>> unclear to me. Can you clarify how you would make the difference here?
>>
>> In particular, at which point do you consider that the HW should not be
>> supported by Xen?
> I'm not saying that HW should not be supported. The difference for me between broken HW and
> HW with known errata is that for the former, the incorrect behavior is often due to the early support stage,
> using emulators/models instead of real HW, whereas for the latter, the HW is already released and it happens to be that it is buggy
> (the HW vendor is aware of the issue and released erratas). 

Thanks for the clarification. What I would call broken is anything that 
can't be fixed in software. For a not too fictional example, an HW where 
PCI devices are using the same stream ID. So effectively, passthrough 
can't be safely supported.

Regarding, not yet released HW, I don't think Xen should have workaround 
for them. I wouldn't even call it "broken" because they are not yet 
released and it is common to have bug in early revision.

> Do we have any example in Xen for supporting broken HW,
> whose vendor is not aware of the issue or did not release any errata?
I will not cite any HW on the ML. But from my experience, the vendors 
are not very vocal about issues in public (some don't even seem to have 
public doc). The best way to find the issues is to look at Linux commit.

> 
>>
>>> Also, as part of the certification/FUSA process, there should be no deadcode and we should have explanation for every block of code.
>>
>> See above. What are you trying to cover by deadcode? Would protecting
>> code with IS_ENABLED() (or #ifdef) ok?
> I think this would be ok from the certification point of view (this would at least means, that we are aware of the issue
> and we took some steps). Otherwise, such code is just an example of a deadcode/unreachable code.

Thanks for the clarification. So the exact approach will depend on the 
context....

> 
>>
>>>
>>> There are different ways to deal with a deadcode: > 1. Get rid of it completely
>>> 2. Leave it as it is

... this is my preference in the context of the timer. If the other 
don't like it, then 1 would be my preference.

In general, my preference would be either 3.3 or 3.2 (see below).

>>> 3. Admit that it can be useful one day and:
>>>     3.1. protect it with #if 0

#if 0 should not be used in Xen code. IMHO this is the worse of all the 
world.

>>>     3.2. protect it with a new Kconfig option (disabled by default) using #ifdef
>>>     3.3. protect it with a new Kconfig option (disabled by default) using IS_ENABLED (to make sure code always compile)

I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to 
get the code compiled when !IS_ENABLED.

Similar to one if this is to move all the affected code in a separate 
file with using obj-$(CONFIG...). That would only work for large chunk 
of code and would be preferred over 3.2.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-24 13:29     ` Julien Grall
@ 2022-10-25  1:29       ` Stefano Stabellini
  2022-10-25  7:11         ` Michal Orzel
  0 siblings, 1 reply; 13+ messages in thread
From: Stefano Stabellini @ 2022-10-25  1:29 UTC (permalink / raw)
  To: Julien Grall
  Cc: Michal Orzel, xen-devel, Stefano Stabellini, Bertrand Marquis

On Mon, 24 Oct 2022, Julien Grall wrote:
> > On 24/10/2022 12:51, Julien Grall wrote:
> > > Caution: This message originated from an External Source. Use proper
> > > caution when opening attachments, clicking links, or responding.
> > > 
> > > 
> > > On 24/10/2022 10:07, Michal Orzel wrote:
> > > > Hello,
> > > 
> > > Hi Michal,
> > > 
> > > > Recently I came across a deadcode in Xen Arm arch timer code. Briefly
> > > > speaking, we are routing
> > > > the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
> > > > of it (as it uses the hypervisor timer CNTHP).
> > > > This timer is fully emulated, which means that there is nothing that can
> > > > trigger such IRQ. This code is
> > > > a left over from early days, where the CNTHP was buggy on some models
> > > > and we had to use the CNTP instead.
> > > > 
> > > > As far as the problem itself is not really interesting, it raises a
> > > > question of what to do with a deadcode,
> > > > as there might be/are other deadcode places in Xen.
> > > 
> > > There are multiple definition of deadcode. Depending on which one you
> > > chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
> > > would result to a lot of places impacted with the decision.
> > > 
> > > So can you clarify what you mean by deadcode?
> > In the timer example, I think we have both a deadcode and unreachable code.
> > For the purpose of this discussion, let's take the MISRA definition of a
> > deadcode which is a "code that can be executed
> > but has no effect on the functional behavior of the program". This differs
> > from the unreachable code definition that is
> > a "code that cannot be executed". Setting up the IRQ for Xen is an example
> > of a deadcode. Code within IRQ handler is an unreachable code
> > (there is nothing that can trigger this IRQ).
> > 
> > What I mean by deadcode happens to be the sum of the two cases above i.e.
> > the code that cannot be executed as well as the code that
> > does not impact the functionality of the program.
> > 
> > > 
> > > > One may say that it is useful to keep it, because one day,
> > > > someone might need it when dealing with yet another broken HW. Such
> > > > person would still need to modify the other
> > > > part of the code (e.g. reprogram_timer), but there would be less work
> > > > required overall. Personally, I'm not in favor of
> > > > such approach, because we should not really support possible scenarios
> > > > with broken HW (except for erratas listing known issues).
> > > 
> > > The difference between "broken HW" and "HW with known errata" is a bit
> > > unclear to me. Can you clarify how you would make the difference here?
> > > 
> > > In particular, at which point do you consider that the HW should not be
> > > supported by Xen?
> > I'm not saying that HW should not be supported. The difference for me
> > between broken HW and
> > HW with known errata is that for the former, the incorrect behavior is often
> > due to the early support stage,
> > using emulators/models instead of real HW, whereas for the latter, the HW is
> > already released and it happens to be that it is buggy
> > (the HW vendor is aware of the issue and released erratas). 
> 
> Thanks for the clarification. What I would call broken is anything that can't
> be fixed in software. For a not too fictional example, an HW where PCI devices
> are using the same stream ID. So effectively, passthrough can't be safely
> supported.
> 
> Regarding, not yet released HW, I don't think Xen should have workaround for
> them. I wouldn't even call it "broken" because they are not yet released and
> it is common to have bug in early revision.
> 
> > Do we have any example in Xen for supporting broken HW,
> > whose vendor is not aware of the issue or did not release any errata?
> I will not cite any HW on the ML. But from my experience, the vendors are not
> very vocal about issues in public (some don't even seem to have public doc).
> The best way to find the issues is to look at Linux commit.
> 
> > 
> > > 
> > > > Also, as part of the certification/FUSA process, there should be no
> > > > deadcode and we should have explanation for every block of code.
> > > 
> > > See above. What are you trying to cover by deadcode? Would protecting
> > > code with IS_ENABLED() (or #ifdef) ok?
> > I think this would be ok from the certification point of view (this would at
> > least means, that we are aware of the issue
> > and we took some steps). Otherwise, such code is just an example of a
> > deadcode/unreachable code.
> 
> Thanks for the clarification. So the exact approach will depend on the
> context....
>
> > > > There are different ways to deal with a deadcode: > 1. Get rid of it
> > > > completely
> > > > 2. Leave it as it is
> 
> ... this is my preference in the context of the timer.

From a certification point of view, the fewer lines of code the better,
and ideally all the lines of code used for the certified build should be
testable and used.

So I think 2. is the lest useful option from a certification
perspective. For this reason, I'd prefer another alternative.


> If the other don't like it, then 1 would be my preference.
> 
> In general, my preference would be either 3.3 or 3.2 (see below).

I also think that 3.2 and 3.3 are good options for the general case. For
the timer, I can see why 1 is your (second) preference and I am fine
with 1 as well.


> > > > 3. Admit that it can be useful one day and:
> > > >     3.1. protect it with #if 0
> 
> #if 0 should not be used in Xen code. IMHO this is the worse of all the world.
> 
> > > >     3.2. protect it with a new Kconfig option (disabled by default)
> > > > using #ifdef
> > > >     3.3. protect it with a new Kconfig option (disabled by default)
> > > > using IS_ENABLED (to make sure code always compile)
> 
> I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to get
> the code compiled when !IS_ENABLED.
> 
> Similar to one if this is to move all the affected code in a separate file
> with using obj-$(CONFIG...). That would only work for large chunk of code and
> would be preferred over 3.2.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-25  1:29       ` Stefano Stabellini
@ 2022-10-25  7:11         ` Michal Orzel
  2022-10-25  7:45           ` Bertrand Marquis
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Orzel @ 2022-10-25  7:11 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Bertrand Marquis

Hi,

On 25/10/2022 03:29, Stefano Stabellini wrote:
> 
> 
> On Mon, 24 Oct 2022, Julien Grall wrote:
>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>> Caution: This message originated from an External Source. Use proper
>>>> caution when opening attachments, clicking links, or responding.
>>>>
>>>>
>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>> Hello,
>>>>
>>>> Hi Michal,
>>>>
>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>> speaking, we are routing
>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>> trigger such IRQ. This code is
>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>> and we had to use the CNTP instead.
>>>>>
>>>>> As far as the problem itself is not really interesting, it raises a
>>>>> question of what to do with a deadcode,
>>>>> as there might be/are other deadcode places in Xen.
>>>>
>>>> There are multiple definition of deadcode. Depending on which one you
>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>> would result to a lot of places impacted with the decision.
>>>>
>>>> So can you clarify what you mean by deadcode?
>>> In the timer example, I think we have both a deadcode and unreachable code.
>>> For the purpose of this discussion, let's take the MISRA definition of a
>>> deadcode which is a "code that can be executed
>>> but has no effect on the functional behavior of the program". This differs
>>> from the unreachable code definition that is
>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>> of a deadcode. Code within IRQ handler is an unreachable code
>>> (there is nothing that can trigger this IRQ).
>>>
>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>> the code that cannot be executed as well as the code that
>>> does not impact the functionality of the program.
>>>
>>>>
>>>>> One may say that it is useful to keep it, because one day,
>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>> person would still need to modify the other
>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>> required overall. Personally, I'm not in favor of
>>>>> such approach, because we should not really support possible scenarios
>>>>> with broken HW (except for erratas listing known issues).
>>>>
>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>
>>>> In particular, at which point do you consider that the HW should not be
>>>> supported by Xen?
>>> I'm not saying that HW should not be supported. The difference for me
>>> between broken HW and
>>> HW with known errata is that for the former, the incorrect behavior is often
>>> due to the early support stage,
>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>> already released and it happens to be that it is buggy
>>> (the HW vendor is aware of the issue and released erratas).
>>
>> Thanks for the clarification. What I would call broken is anything that can't
>> be fixed in software. For a not too fictional example, an HW where PCI devices
>> are using the same stream ID. So effectively, passthrough can't be safely
>> supported.
>>
>> Regarding, not yet released HW, I don't think Xen should have workaround for
>> them. I wouldn't even call it "broken" because they are not yet released and
>> it is common to have bug in early revision.
>>
>>> Do we have any example in Xen for supporting broken HW,
>>> whose vendor is not aware of the issue or did not release any errata?
>> I will not cite any HW on the ML. But from my experience, the vendors are not
>> very vocal about issues in public (some don't even seem to have public doc).
>> The best way to find the issues is to look at Linux commit.
>>
>>>
>>>>
>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>> deadcode and we should have explanation for every block of code.
>>>>
>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>> code with IS_ENABLED() (or #ifdef) ok?
>>> I think this would be ok from the certification point of view (this would at
>>> least means, that we are aware of the issue
>>> and we took some steps). Otherwise, such code is just an example of a
>>> deadcode/unreachable code.
>>
>> Thanks for the clarification. So the exact approach will depend on the
>> context....
>>
>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>> completely
>>>>> 2. Leave it as it is
>>
>> ... this is my preference in the context of the timer.
> 
> From a certification point of view, the fewer lines of code the better,
> and ideally all the lines of code used for the certified build should be
> testable and used.
> 
> So I think 2. is the lest useful option from a certification
> perspective. For this reason, I'd prefer another alternative.
> 
> 
>> If the other don't like it, then 1 would be my preference.
>>
>> In general, my preference would be either 3.3 or 3.2 (see below).
> 
> I also think that 3.2 and 3.3 are good options for the general case. For
> the timer, I can see why 1 is your (second) preference and I am fine
> with 1 as well.
Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.

> 
> 
>>>>> 3. Admit that it can be useful one day and:
>>>>>     3.1. protect it with #if 0
>>
>> #if 0 should not be used in Xen code. IMHO this is the worse of all the world.
I share your opinion here Julien. Unfortunately we still have quite a few examples
in the Arm code using this either to mark something as TODO or to comment out
parts of the code waiting for future support. This is mostly in SMMU code that
was taken from Linux but already diverged quite far (maybe some cleanup is necessary).

>>
>>>>>     3.2. protect it with a new Kconfig option (disabled by default)
>>>>> using #ifdef
>>>>>     3.3. protect it with a new Kconfig option (disabled by default)
>>>>> using IS_ENABLED (to make sure code always compile)
>>
>> I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to get
>> the code compiled when !IS_ENABLED.
>>
>> Similar to one if this is to move all the affected code in a separate file
>> with using obj-$(CONFIG...). That would only work for large chunk of code and
>> would be preferred over 3.2.
> 

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-25  7:11         ` Michal Orzel
@ 2022-10-25  7:45           ` Bertrand Marquis
  2022-10-25  8:07             ` Michal Orzel
  0 siblings, 1 reply; 13+ messages in thread
From: Bertrand Marquis @ 2022-10-25  7:45 UTC (permalink / raw)
  To: Michal Orzel; +Cc: Stefano Stabellini, Julien Grall, xen-devel

Hi Michal,

> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
> 
> Hi,
> 
> On 25/10/2022 03:29, Stefano Stabellini wrote:
>> 
>> 
>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>> Caution: This message originated from an External Source. Use proper
>>>>> caution when opening attachments, clicking links, or responding.
>>>>> 
>>>>> 
>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>> Hello,
>>>>> 
>>>>> Hi Michal,
>>>>> 
>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>> speaking, we are routing
>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>> trigger such IRQ. This code is
>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>> and we had to use the CNTP instead.
>>>>>> 
>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>> question of what to do with a deadcode,
>>>>>> as there might be/are other deadcode places in Xen.
>>>>> 
>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>> would result to a lot of places impacted with the decision.
>>>>> 
>>>>> So can you clarify what you mean by deadcode?
>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>> deadcode which is a "code that can be executed
>>>> but has no effect on the functional behavior of the program". This differs
>>>> from the unreachable code definition that is
>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>> (there is nothing that can trigger this IRQ).
>>>> 
>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>> the code that cannot be executed as well as the code that
>>>> does not impact the functionality of the program.
>>>> 
>>>>> 
>>>>>> One may say that it is useful to keep it, because one day,
>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>> person would still need to modify the other
>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>> required overall. Personally, I'm not in favor of
>>>>>> such approach, because we should not really support possible scenarios
>>>>>> with broken HW (except for erratas listing known issues).
>>>>> 
>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>> 
>>>>> In particular, at which point do you consider that the HW should not be
>>>>> supported by Xen?
>>>> I'm not saying that HW should not be supported. The difference for me
>>>> between broken HW and
>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>> due to the early support stage,
>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>> already released and it happens to be that it is buggy
>>>> (the HW vendor is aware of the issue and released erratas).
>>> 
>>> Thanks for the clarification. What I would call broken is anything that can't
>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>> are using the same stream ID. So effectively, passthrough can't be safely
>>> supported.
>>> 
>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>> them. I wouldn't even call it "broken" because they are not yet released and
>>> it is common to have bug in early revision.
>>> 
>>>> Do we have any example in Xen for supporting broken HW,
>>>> whose vendor is not aware of the issue or did not release any errata?
>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>> very vocal about issues in public (some don't even seem to have public doc).
>>> The best way to find the issues is to look at Linux commit.
>>> 
>>>> 
>>>>> 
>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>> deadcode and we should have explanation for every block of code.
>>>>> 
>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>> I think this would be ok from the certification point of view (this would at
>>>> least means, that we are aware of the issue
>>>> and we took some steps). Otherwise, such code is just an example of a
>>>> deadcode/unreachable code.
>>> 
>>> Thanks for the clarification. So the exact approach will depend on the
>>> context....
>>> 
>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>> completely
>>>>>> 2. Leave it as it is
>>> 
>>> ... this is my preference in the context of the timer.
>> 
>> From a certification point of view, the fewer lines of code the better,
>> and ideally all the lines of code used for the certified build should be
>> testable and used.
>> 
>> So I think 2. is the lest useful option from a certification
>> perspective. For this reason, I'd prefer another alternative.
>> 
>> 
>>> If the other don't like it, then 1 would be my preference.
>>> 
>>> In general, my preference would be either 3.3 or 3.2 (see below).
>> 
>> I also think that 3.2 and 3.3 are good options for the general case. For
>> the timer, I can see why 1 is your (second) preference and I am fine
>> with 1 as well.
> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.

We need to get rid of dead code and removing it is not always the best solution.

If the code is or could be useful for someone some day, protecting it with ifdef is ok.

In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
code so that we can compile out what we do not need and code not applying to
some hardware is a case where we will do that (does not mean that by default
we will not compile it in but we will make it easier to reduce the code size for a
specific use case).

So 3.2 and 3.3 are ok for me.

> 
>> 
>> 
>>>>>> 3. Admit that it can be useful one day and:
>>>>>>    3.1. protect it with #if 0
>>> 
>>> #if 0 should not be used in Xen code. IMHO this is the worse of all the world.
> I share your opinion here Julien. Unfortunately we still have quite a few examples
> in the Arm code using this either to mark something as TODO or to comment out
> parts of the code waiting for future support. This is mostly in SMMU code that
> was taken from Linux but already diverged quite far (maybe some cleanup is necessary).

Definitely the SMMU code will need some cleaning.
#if 0 are a no go from a certification point of view.

Cheers
Bertrand

> 
>>> 
>>>>>>    3.2. protect it with a new Kconfig option (disabled by default)
>>>>>> using #ifdef
>>>>>>    3.3. protect it with a new Kconfig option (disabled by default)
>>>>>> using IS_ENABLED (to make sure code always compile)
>>> 
>>> I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to get
>>> the code compiled when !IS_ENABLED.
>>> 
>>> Similar to one if this is to move all the affected code in a separate file
>>> with using obj-$(CONFIG...). That would only work for large chunk of code and
>>> would be preferred over 3.2.
>> 
> 
> ~Michal



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-25  7:45           ` Bertrand Marquis
@ 2022-10-25  8:07             ` Michal Orzel
  2022-10-25  8:20               ` Bertrand Marquis
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Orzel @ 2022-10-25  8:07 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: Stefano Stabellini, Julien Grall, xen-devel

Hi Bertrand,

On 25/10/2022 09:45, Bertrand Marquis wrote:
> 
> 
> Hi Michal,
> 
>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
>>
>> Hi,
>>
>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>
>>>
>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>
>>>>>>
>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>> Hello,
>>>>>>
>>>>>> Hi Michal,
>>>>>>
>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>> speaking, we are routing
>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>>> trigger such IRQ. This code is
>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>> and we had to use the CNTP instead.
>>>>>>>
>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>> question of what to do with a deadcode,
>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>
>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>> would result to a lot of places impacted with the decision.
>>>>>>
>>>>>> So can you clarify what you mean by deadcode?
>>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>> deadcode which is a "code that can be executed
>>>>> but has no effect on the functional behavior of the program". This differs
>>>>> from the unreachable code definition that is
>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>> (there is nothing that can trigger this IRQ).
>>>>>
>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>>> the code that cannot be executed as well as the code that
>>>>> does not impact the functionality of the program.
>>>>>
>>>>>>
>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>> person would still need to modify the other
>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>
>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>
>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>> supported by Xen?
>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>> between broken HW and
>>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>>> due to the early support stage,
>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>>> already released and it happens to be that it is buggy
>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>
>>>> Thanks for the clarification. What I would call broken is anything that can't
>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>> supported.
>>>>
>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>>> them. I wouldn't even call it "broken" because they are not yet released and
>>>> it is common to have bug in early revision.
>>>>
>>>>> Do we have any example in Xen for supporting broken HW,
>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>>> very vocal about issues in public (some don't even seem to have public doc).
>>>> The best way to find the issues is to look at Linux commit.
>>>>
>>>>>
>>>>>>
>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>
>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>> I think this would be ok from the certification point of view (this would at
>>>>> least means, that we are aware of the issue
>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>> deadcode/unreachable code.
>>>>
>>>> Thanks for the clarification. So the exact approach will depend on the
>>>> context....
>>>>
>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>> completely
>>>>>>> 2. Leave it as it is
>>>>
>>>> ... this is my preference in the context of the timer.
>>>
>>> From a certification point of view, the fewer lines of code the better,
>>> and ideally all the lines of code used for the certified build should be
>>> testable and used.
>>>
>>> So I think 2. is the lest useful option from a certification
>>> perspective. For this reason, I'd prefer another alternative.
>>>
>>>
>>>> If the other don't like it, then 1 would be my preference.
>>>>
>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>
>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>> the timer, I can see why 1 is your (second) preference and I am fine
>>> with 1 as well.
>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
> 
> We need to get rid of dead code and removing it is not always the best solution.
> 
> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
> 
> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
> code so that we can compile out what we do not need and code not applying to
> some hardware is a case where we will do that (does not mean that by default
> we will not compile it in but we will make it easier to reduce the code size for a
> specific use case).
> 
> So 3.2 and 3.3 are ok for me.

So we all agree that the code in the current form is a no go from certification purposes.
That is good :)

The reason why I opt for solution 1 and not the others is that in the latter case it would
mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
This is not really correct. Also in the current form, it would also require adding more
code to time.c code because at the moment using CNTP for Xen would not work out of the box.
The architecture defines the hypervisor timer for a purpose. If it does not work, it means
that the HW is problematic. I agree that we would want to support such use case but I'm not
really aware of any issue like that. Adding more code and Kconfig options just because
one day someone may face issues with a new HW is something I am not a fan of.

I would agree with your solution under the condition that the code is already ready
for the timer switch.

I guess we need another round of sharing opinions.

> 
>>
>>>
>>>
>>>>>>> 3. Admit that it can be useful one day and:
>>>>>>>    3.1. protect it with #if 0
>>>>
>>>> #if 0 should not be used in Xen code. IMHO this is the worse of all the world.
>> I share your opinion here Julien. Unfortunately we still have quite a few examples
>> in the Arm code using this either to mark something as TODO or to comment out
>> parts of the code waiting for future support. This is mostly in SMMU code that
>> was taken from Linux but already diverged quite far (maybe some cleanup is necessary).
> 
> Definitely the SMMU code will need some cleaning.
> #if 0 are a no go from a certification point of view.
> 
> Cheers
> Bertrand
> 
>>
>>>>
>>>>>>>    3.2. protect it with a new Kconfig option (disabled by default)
>>>>>>> using #ifdef
>>>>>>>    3.3. protect it with a new Kconfig option (disabled by default)
>>>>>>> using IS_ENABLED (to make sure code always compile)
>>>>
>>>> I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to get
>>>> the code compiled when !IS_ENABLED.
>>>>
>>>> Similar to one if this is to move all the affected code in a separate file
>>>> with using obj-$(CONFIG...). That would only work for large chunk of code and
>>>> would be preferred over 3.2.
>>>
>>
>> ~Michal
> 

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-25  8:07             ` Michal Orzel
@ 2022-10-25  8:20               ` Bertrand Marquis
  2022-10-26 11:29                 ` Michal Orzel
  0 siblings, 1 reply; 13+ messages in thread
From: Bertrand Marquis @ 2022-10-25  8:20 UTC (permalink / raw)
  To: Michal Orzel; +Cc: Stefano Stabellini, Julien Grall, xen-devel

Hi Michal,

> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@amd.com> wrote:
> 
> Hi Bertrand,
> 
> On 25/10/2022 09:45, Bertrand Marquis wrote:
>> 
>> 
>> Hi Michal,
>> 
>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
>>> 
>>> Hi,
>>> 
>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>> 
>>>> 
>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>> 
>>>>>>> 
>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>>> Hello,
>>>>>>> 
>>>>>>> Hi Michal,
>>>>>>> 
>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>>> speaking, we are routing
>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>>>> trigger such IRQ. This code is
>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>>> and we had to use the CNTP instead.
>>>>>>>> 
>>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>>> question of what to do with a deadcode,
>>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>> 
>>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>>> would result to a lot of places impacted with the decision.
>>>>>>> 
>>>>>>> So can you clarify what you mean by deadcode?
>>>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>>> deadcode which is a "code that can be executed
>>>>>> but has no effect on the functional behavior of the program". This differs
>>>>>> from the unreachable code definition that is
>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>>> (there is nothing that can trigger this IRQ).
>>>>>> 
>>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>>>> the code that cannot be executed as well as the code that
>>>>>> does not impact the functionality of the program.
>>>>>> 
>>>>>>> 
>>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>>> person would still need to modify the other
>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>> 
>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>> 
>>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>>> supported by Xen?
>>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>>> between broken HW and
>>>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>>>> due to the early support stage,
>>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>>>> already released and it happens to be that it is buggy
>>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>> 
>>>>> Thanks for the clarification. What I would call broken is anything that can't
>>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>>> supported.
>>>>> 
>>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>>>> them. I wouldn't even call it "broken" because they are not yet released and
>>>>> it is common to have bug in early revision.
>>>>> 
>>>>>> Do we have any example in Xen for supporting broken HW,
>>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>>>> very vocal about issues in public (some don't even seem to have public doc).
>>>>> The best way to find the issues is to look at Linux commit.
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>> 
>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>>> I think this would be ok from the certification point of view (this would at
>>>>>> least means, that we are aware of the issue
>>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>>> deadcode/unreachable code.
>>>>> 
>>>>> Thanks for the clarification. So the exact approach will depend on the
>>>>> context....
>>>>> 
>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>>> completely
>>>>>>>> 2. Leave it as it is
>>>>> 
>>>>> ... this is my preference in the context of the timer.
>>>> 
>>>> From a certification point of view, the fewer lines of code the better,
>>>> and ideally all the lines of code used for the certified build should be
>>>> testable and used.
>>>> 
>>>> So I think 2. is the lest useful option from a certification
>>>> perspective. For this reason, I'd prefer another alternative.
>>>> 
>>>> 
>>>>> If the other don't like it, then 1 would be my preference.
>>>>> 
>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>> 
>>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>>> the timer, I can see why 1 is your (second) preference and I am fine
>>>> with 1 as well.
>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
>> 
>> We need to get rid of dead code and removing it is not always the best solution.
>> 
>> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
>> 
>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
>> code so that we can compile out what we do not need and code not applying to
>> some hardware is a case where we will do that (does not mean that by default
>> we will not compile it in but we will make it easier to reduce the code size for a
>> specific use case).
>> 
>> So 3.2 and 3.3 are ok for me.
> 
> So we all agree that the code in the current form is a no go from certification purposes.
> That is good :)
> 
> The reason why I opt for solution 1 and not the others is that in the latter case it would
> mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
> This is not really correct. Also in the current form, it would also require adding more
> code to time.c code because at the moment using CNTP for Xen would not work out of the box.
> The architecture defines the hypervisor timer for a purpose. If it does not work, it means
> that the HW is problematic. I agree that we would want to support such use case but I'm not
> really aware of any issue like that. Adding more code and Kconfig options just because
> one day someone may face issues with a new HW is something I am not a fan of.

I see 2 solutions here:
- somehow push the code to a different file (not quite sure this is feasible here)
- remove completely the code with a clean commit. Doing this it will be easy for someone needing this to later revert the patch

It is definitely true here that adding more code to keep some unused code does not really make sense.
And let’s be realistic here, if we need that one day, it will not take ages to support it somehow.

As said, from a pure certification point of view:
- we must not have deadcode
- proper ifdef is acceptable
- if 0 is not acceptable
- commented code is not acceptable

> 
> I would agree with your solution under the condition that the code is already ready
> for the timer switch.
> 
> I guess we need another round of sharing opinions.
> 

I guess so yes

Cheers
Bertrand


>> 
>>> 
>>>> 
>>>> 
>>>>>>>> 3. Admit that it can be useful one day and:
>>>>>>>>   3.1. protect it with #if 0
>>>>> 
>>>>> #if 0 should not be used in Xen code. IMHO this is the worse of all the world.
>>> I share your opinion here Julien. Unfortunately we still have quite a few examples
>>> in the Arm code using this either to mark something as TODO or to comment out
>>> parts of the code waiting for future support. This is mostly in SMMU code that
>>> was taken from Linux but already diverged quite far (maybe some cleanup is necessary).
>> 
>> Definitely the SMMU code will need some cleaning.
>> #if 0 are a no go from a certification point of view.
>> 
>> Cheers
>> Bertrand
>> 
>>> 
>>>>> 
>>>>>>>>   3.2. protect it with a new Kconfig option (disabled by default)
>>>>>>>> using #ifdef
>>>>>>>>   3.3. protect it with a new Kconfig option (disabled by default)
>>>>>>>> using IS_ENABLED (to make sure code always compile)
>>>>> 
>>>>> I would prefer 3.3 over 3.2. 3.2 would be used if it is too difficult to get
>>>>> the code compiled when !IS_ENABLED.
>>>>> 
>>>>> Similar to one if this is to move all the affected code in a separate file
>>>>> with using obj-$(CONFIG...). That would only work for large chunk of code and
>>>>> would be preferred over 3.2.
>>>> 
>>> 
>>> ~Michal
>> 
> 
> ~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-25  8:20               ` Bertrand Marquis
@ 2022-10-26 11:29                 ` Michal Orzel
  2022-10-26 12:20                   ` Bertrand Marquis
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Orzel @ 2022-10-26 11:29 UTC (permalink / raw)
  To: Bertrand Marquis, Stefano Stabellini, Julien Grall; +Cc: xen-devel

Hi all,

On 25/10/2022 10:20, Bertrand Marquis wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> Hi Michal,
> 
>> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@amd.com> wrote:
>>
>> Hi Bertrand,
>>
>> On 25/10/2022 09:45, Bertrand Marquis wrote:
>>>
>>>
>>> Hi Michal,
>>>
>>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>>>
>>>>>
>>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Hi Michal,
>>>>>>>>
>>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>>>> speaking, we are routing
>>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>>>>> trigger such IRQ. This code is
>>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>>>> and we had to use the CNTP instead.
>>>>>>>>>
>>>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>>>> question of what to do with a deadcode,
>>>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>>>
>>>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>>>> would result to a lot of places impacted with the decision.
>>>>>>>>
>>>>>>>> So can you clarify what you mean by deadcode?
>>>>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>>>> deadcode which is a "code that can be executed
>>>>>>> but has no effect on the functional behavior of the program". This differs
>>>>>>> from the unreachable code definition that is
>>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>>>> (there is nothing that can trigger this IRQ).
>>>>>>>
>>>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>>>>> the code that cannot be executed as well as the code that
>>>>>>> does not impact the functionality of the program.
>>>>>>>
>>>>>>>>
>>>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>>>> person would still need to modify the other
>>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>>>
>>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>>>
>>>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>>>> supported by Xen?
>>>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>>>> between broken HW and
>>>>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>>>>> due to the early support stage,
>>>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>>>>> already released and it happens to be that it is buggy
>>>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>>>
>>>>>> Thanks for the clarification. What I would call broken is anything that can't
>>>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>>>> supported.
>>>>>>
>>>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>>>>> them. I wouldn't even call it "broken" because they are not yet released and
>>>>>> it is common to have bug in early revision.
>>>>>>
>>>>>>> Do we have any example in Xen for supporting broken HW,
>>>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>>>>> very vocal about issues in public (some don't even seem to have public doc).
>>>>>> The best way to find the issues is to look at Linux commit.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>>>
>>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>>>> I think this would be ok from the certification point of view (this would at
>>>>>>> least means, that we are aware of the issue
>>>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>>>> deadcode/unreachable code.
>>>>>>
>>>>>> Thanks for the clarification. So the exact approach will depend on the
>>>>>> context....
>>>>>>
>>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>>>> completely
>>>>>>>>> 2. Leave it as it is
>>>>>>
>>>>>> ... this is my preference in the context of the timer.
>>>>>
>>>>> From a certification point of view, the fewer lines of code the better,
>>>>> and ideally all the lines of code used for the certified build should be
>>>>> testable and used.
>>>>>
>>>>> So I think 2. is the lest useful option from a certification
>>>>> perspective. For this reason, I'd prefer another alternative.
>>>>>
>>>>>
>>>>>> If the other don't like it, then 1 would be my preference.
>>>>>>
>>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>>>
>>>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>>>> the timer, I can see why 1 is your (second) preference and I am fine
>>>>> with 1 as well.
>>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
>>>
>>> We need to get rid of dead code and removing it is not always the best solution.
>>>
>>> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
>>>
>>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
>>> code so that we can compile out what we do not need and code not applying to
>>> some hardware is a case where we will do that (does not mean that by default
>>> we will not compile it in but we will make it easier to reduce the code size for a
>>> specific use case).
>>>
>>> So 3.2 and 3.3 are ok for me.
>>
>> So we all agree that the code in the current form is a no go from certification purposes.
>> That is good :)
>>
>> The reason why I opt for solution 1 and not the others is that in the latter case it would
>> mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
>> This is not really correct. Also in the current form, it would also require adding more
>> code to time.c code because at the moment using CNTP for Xen would not work out of the box.
>> The architecture defines the hypervisor timer for a purpose. If it does not work, it means
>> that the HW is problematic. I agree that we would want to support such use case but I'm not
>> really aware of any issue like that. Adding more code and Kconfig options just because
>> one day someone may face issues with a new HW is something I am not a fan of.
> 
> I see 2 solutions here:
> - somehow push the code to a different file (not quite sure this is feasible here)
> - remove completely the code with a clean commit. Doing this it will be easy for someone needing this to later revert the patch
> 
> It is definitely true here that adding more code to keep some unused code does not really make sense.
> And let’s be realistic here, if we need that one day, it will not take ages to support it somehow.
> 
> As said, from a pure certification point of view:
> - we must not have deadcode
> - proper ifdef is acceptable
> - if 0 is not acceptable
> - commented code is not acceptable

Given that we agree on that (+ IS_ENABLED option if possible), and the option 1 seems
to be the best choice for the timer, I will create a patch removing the IRQ path to get rid
of the deadcode/unreachable code.

Do you think this is something we want for 4.17?
The risk is low as the code is already dead and the benefit is that we have no deadcode.
What do you think?

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-26 11:29                 ` Michal Orzel
@ 2022-10-26 12:20                   ` Bertrand Marquis
  2022-10-26 15:30                     ` Stefano Stabellini
  0 siblings, 1 reply; 13+ messages in thread
From: Bertrand Marquis @ 2022-10-26 12:20 UTC (permalink / raw)
  To: Michal Orzel; +Cc: Stefano Stabellini, Julien Grall, xen-devel

Hi Michal,

> On 26 Oct 2022, at 12:29, Michal Orzel <michal.orzel@amd.com> wrote:
> 
> Hi all,
> 
> On 25/10/2022 10:20, Bertrand Marquis wrote:
>> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>> 
>> 
>> Hi Michal,
>> 
>>> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@amd.com> wrote:
>>> 
>>> Hi Bertrand,
>>> 
>>> On 25/10/2022 09:45, Bertrand Marquis wrote:
>>>> 
>>>> 
>>>> Hi Michal,
>>>> 
>>>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>>>> 
>>>>>> 
>>>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>> Hi Michal,
>>>>>>>>> 
>>>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>>>>> speaking, we are routing
>>>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>>>>>> trigger such IRQ. This code is
>>>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>>>>> and we had to use the CNTP instead.
>>>>>>>>>> 
>>>>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>>>>> question of what to do with a deadcode,
>>>>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>>>> 
>>>>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>>>>> would result to a lot of places impacted with the decision.
>>>>>>>>> 
>>>>>>>>> So can you clarify what you mean by deadcode?
>>>>>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>>>>> deadcode which is a "code that can be executed
>>>>>>>> but has no effect on the functional behavior of the program". This differs
>>>>>>>> from the unreachable code definition that is
>>>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>>>>> (there is nothing that can trigger this IRQ).
>>>>>>>> 
>>>>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>>>>>> the code that cannot be executed as well as the code that
>>>>>>>> does not impact the functionality of the program.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>>>>> person would still need to modify the other
>>>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>>>> 
>>>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>>>> 
>>>>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>>>>> supported by Xen?
>>>>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>>>>> between broken HW and
>>>>>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>>>>>> due to the early support stage,
>>>>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>>>>>> already released and it happens to be that it is buggy
>>>>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>>>> 
>>>>>>> Thanks for the clarification. What I would call broken is anything that can't
>>>>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>>>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>>>>> supported.
>>>>>>> 
>>>>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>>>>>> them. I wouldn't even call it "broken" because they are not yet released and
>>>>>>> it is common to have bug in early revision.
>>>>>>> 
>>>>>>>> Do we have any example in Xen for supporting broken HW,
>>>>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>>>>>> very vocal about issues in public (some don't even seem to have public doc).
>>>>>>> The best way to find the issues is to look at Linux commit.
>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>>>> 
>>>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>>>>> I think this would be ok from the certification point of view (this would at
>>>>>>>> least means, that we are aware of the issue
>>>>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>>>>> deadcode/unreachable code.
>>>>>>> 
>>>>>>> Thanks for the clarification. So the exact approach will depend on the
>>>>>>> context....
>>>>>>> 
>>>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>>>>> completely
>>>>>>>>>> 2. Leave it as it is
>>>>>>> 
>>>>>>> ... this is my preference in the context of the timer.
>>>>>> 
>>>>>> From a certification point of view, the fewer lines of code the better,
>>>>>> and ideally all the lines of code used for the certified build should be
>>>>>> testable and used.
>>>>>> 
>>>>>> So I think 2. is the lest useful option from a certification
>>>>>> perspective. For this reason, I'd prefer another alternative.
>>>>>> 
>>>>>> 
>>>>>>> If the other don't like it, then 1 would be my preference.
>>>>>>> 
>>>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>>>> 
>>>>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>>>>> the timer, I can see why 1 is your (second) preference and I am fine
>>>>>> with 1 as well.
>>>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
>>>> 
>>>> We need to get rid of dead code and removing it is not always the best solution.
>>>> 
>>>> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
>>>> 
>>>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
>>>> code so that we can compile out what we do not need and code not applying to
>>>> some hardware is a case where we will do that (does not mean that by default
>>>> we will not compile it in but we will make it easier to reduce the code size for a
>>>> specific use case).
>>>> 
>>>> So 3.2 and 3.3 are ok for me.
>>> 
>>> So we all agree that the code in the current form is a no go from certification purposes.
>>> That is good :)
>>> 
>>> The reason why I opt for solution 1 and not the others is that in the latter case it would
>>> mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
>>> This is not really correct. Also in the current form, it would also require adding more
>>> code to time.c code because at the moment using CNTP for Xen would not work out of the box.
>>> The architecture defines the hypervisor timer for a purpose. If it does not work, it means
>>> that the HW is problematic. I agree that we would want to support such use case but I'm not
>>> really aware of any issue like that. Adding more code and Kconfig options just because
>>> one day someone may face issues with a new HW is something I am not a fan of.
>> 
>> I see 2 solutions here:
>> - somehow push the code to a different file (not quite sure this is feasible here)
>> - remove completely the code with a clean commit. Doing this it will be easy for someone needing this to later revert the patch
>> 
>> It is definitely true here that adding more code to keep some unused code does not really make sense.
>> And let’s be realistic here, if we need that one day, it will not take ages to support it somehow.
>> 
>> As said, from a pure certification point of view:
>> - we must not have deadcode
>> - proper ifdef is acceptable
>> - if 0 is not acceptable
>> - commented code is not acceptable
> 
> Given that we agree on that (+ IS_ENABLED option if possible), and the option 1 seems
> to be the best choice for the timer, I will create a patch removing the IRQ path to get rid
> of the deadcode/unreachable code.
> 
> Do you think this is something we want for 4.17?
> The risk is low as the code is already dead and the benefit is that we have no deadcode.
> What do you think?
> 

We are very near from the release so from my point of view as it is not solving a bug, this should not go into 4.17.

Bertrand

> ~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-26 12:20                   ` Bertrand Marquis
@ 2022-10-26 15:30                     ` Stefano Stabellini
  2022-10-26 15:53                       ` Michal Orzel
  0 siblings, 1 reply; 13+ messages in thread
From: Stefano Stabellini @ 2022-10-26 15:30 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: Michal Orzel, Stefano Stabellini, Julien Grall, xen-devel

[-- Attachment #1: Type: text/plain, Size: 9756 bytes --]

On Wed, 26 Oct 2022, Bertrand Marquis wrote:
> > On 26 Oct 2022, at 12:29, Michal Orzel <michal.orzel@amd.com> wrote:
> > 
> > Hi all,
> > 
> > On 25/10/2022 10:20, Bertrand Marquis wrote:
> >> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> >> 
> >> 
> >> Hi Michal,
> >> 
> >>> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@amd.com> wrote:
> >>> 
> >>> Hi Bertrand,
> >>> 
> >>> On 25/10/2022 09:45, Bertrand Marquis wrote:
> >>>> 
> >>>> 
> >>>> Hi Michal,
> >>>> 
> >>>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
> >>>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
> >>>>>> 
> >>>>>> 
> >>>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
> >>>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
> >>>>>>>>> Caution: This message originated from an External Source. Use proper
> >>>>>>>>> caution when opening attachments, clicking links, or responding.
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
> >>>>>>>>>> Hello,
> >>>>>>>>> 
> >>>>>>>>> Hi Michal,
> >>>>>>>>> 
> >>>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
> >>>>>>>>>> speaking, we are routing
> >>>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
> >>>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
> >>>>>>>>>> This timer is fully emulated, which means that there is nothing that can
> >>>>>>>>>> trigger such IRQ. This code is
> >>>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
> >>>>>>>>>> and we had to use the CNTP instead.
> >>>>>>>>>> 
> >>>>>>>>>> As far as the problem itself is not really interesting, it raises a
> >>>>>>>>>> question of what to do with a deadcode,
> >>>>>>>>>> as there might be/are other deadcode places in Xen.
> >>>>>>>>> 
> >>>>>>>>> There are multiple definition of deadcode. Depending on which one you
> >>>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
> >>>>>>>>> would result to a lot of places impacted with the decision.
> >>>>>>>>> 
> >>>>>>>>> So can you clarify what you mean by deadcode?
> >>>>>>>> In the timer example, I think we have both a deadcode and unreachable code.
> >>>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
> >>>>>>>> deadcode which is a "code that can be executed
> >>>>>>>> but has no effect on the functional behavior of the program". This differs
> >>>>>>>> from the unreachable code definition that is
> >>>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
> >>>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
> >>>>>>>> (there is nothing that can trigger this IRQ).
> >>>>>>>> 
> >>>>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
> >>>>>>>> the code that cannot be executed as well as the code that
> >>>>>>>> does not impact the functionality of the program.
> >>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>>> One may say that it is useful to keep it, because one day,
> >>>>>>>>>> someone might need it when dealing with yet another broken HW. Such
> >>>>>>>>>> person would still need to modify the other
> >>>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
> >>>>>>>>>> required overall. Personally, I'm not in favor of
> >>>>>>>>>> such approach, because we should not really support possible scenarios
> >>>>>>>>>> with broken HW (except for erratas listing known issues).
> >>>>>>>>> 
> >>>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
> >>>>>>>>> unclear to me. Can you clarify how you would make the difference here?
> >>>>>>>>> 
> >>>>>>>>> In particular, at which point do you consider that the HW should not be
> >>>>>>>>> supported by Xen?
> >>>>>>>> I'm not saying that HW should not be supported. The difference for me
> >>>>>>>> between broken HW and
> >>>>>>>> HW with known errata is that for the former, the incorrect behavior is often
> >>>>>>>> due to the early support stage,
> >>>>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
> >>>>>>>> already released and it happens to be that it is buggy
> >>>>>>>> (the HW vendor is aware of the issue and released erratas).
> >>>>>>> 
> >>>>>>> Thanks for the clarification. What I would call broken is anything that can't
> >>>>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
> >>>>>>> are using the same stream ID. So effectively, passthrough can't be safely
> >>>>>>> supported.
> >>>>>>> 
> >>>>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
> >>>>>>> them. I wouldn't even call it "broken" because they are not yet released and
> >>>>>>> it is common to have bug in early revision.
> >>>>>>> 
> >>>>>>>> Do we have any example in Xen for supporting broken HW,
> >>>>>>>> whose vendor is not aware of the issue or did not release any errata?
> >>>>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
> >>>>>>> very vocal about issues in public (some don't even seem to have public doc).
> >>>>>>> The best way to find the issues is to look at Linux commit.
> >>>>>>> 
> >>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>>> Also, as part of the certification/FUSA process, there should be no
> >>>>>>>>>> deadcode and we should have explanation for every block of code.
> >>>>>>>>> 
> >>>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
> >>>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
> >>>>>>>> I think this would be ok from the certification point of view (this would at
> >>>>>>>> least means, that we are aware of the issue
> >>>>>>>> and we took some steps). Otherwise, such code is just an example of a
> >>>>>>>> deadcode/unreachable code.
> >>>>>>> 
> >>>>>>> Thanks for the clarification. So the exact approach will depend on the
> >>>>>>> context....
> >>>>>>> 
> >>>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
> >>>>>>>>>> completely
> >>>>>>>>>> 2. Leave it as it is
> >>>>>>> 
> >>>>>>> ... this is my preference in the context of the timer.
> >>>>>> 
> >>>>>> From a certification point of view, the fewer lines of code the better,
> >>>>>> and ideally all the lines of code used for the certified build should be
> >>>>>> testable and used.
> >>>>>> 
> >>>>>> So I think 2. is the lest useful option from a certification
> >>>>>> perspective. For this reason, I'd prefer another alternative.
> >>>>>> 
> >>>>>> 
> >>>>>>> If the other don't like it, then 1 would be my preference.
> >>>>>>> 
> >>>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
> >>>>>> 
> >>>>>> I also think that 3.2 and 3.3 are good options for the general case. For
> >>>>>> the timer, I can see why 1 is your (second) preference and I am fine
> >>>>>> with 1 as well.
> >>>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
> >>>> 
> >>>> We need to get rid of dead code and removing it is not always the best solution.
> >>>> 
> >>>> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
> >>>> 
> >>>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
> >>>> code so that we can compile out what we do not need and code not applying to
> >>>> some hardware is a case where we will do that (does not mean that by default
> >>>> we will not compile it in but we will make it easier to reduce the code size for a
> >>>> specific use case).
> >>>> 
> >>>> So 3.2 and 3.3 are ok for me.
> >>> 
> >>> So we all agree that the code in the current form is a no go from certification purposes.
> >>> That is good :)
> >>> 
> >>> The reason why I opt for solution 1 and not the others is that in the latter case it would
> >>> mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
> >>> This is not really correct. Also in the current form, it would also require adding more
> >>> code to time.c code because at the moment using CNTP for Xen would not work out of the box.
> >>> The architecture defines the hypervisor timer for a purpose. If it does not work, it means
> >>> that the HW is problematic. I agree that we would want to support such use case but I'm not
> >>> really aware of any issue like that. Adding more code and Kconfig options just because
> >>> one day someone may face issues with a new HW is something I am not a fan of.
> >> 
> >> I see 2 solutions here:
> >> - somehow push the code to a different file (not quite sure this is feasible here)
> >> - remove completely the code with a clean commit. Doing this it will be easy for someone needing this to later revert the patch
> >> 
> >> It is definitely true here that adding more code to keep some unused code does not really make sense.
> >> And let’s be realistic here, if we need that one day, it will not take ages to support it somehow.
> >> 
> >> As said, from a pure certification point of view:
> >> - we must not have deadcode
> >> - proper ifdef is acceptable
> >> - if 0 is not acceptable
> >> - commented code is not acceptable
> > 
> > Given that we agree on that (+ IS_ENABLED option if possible), and the option 1 seems
> > to be the best choice for the timer, I will create a patch removing the IRQ path to get rid
> > of the deadcode/unreachable code.
> > 
> > Do you think this is something we want for 4.17?
> > The risk is low as the code is already dead and the benefit is that we have no deadcode.
> > What do you think?
> > 
> 
> We are very near from the release so from my point of view as it is not solving a bug, this should not go into 4.17.

I agree

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadcode discussion based on Arm NS phys timer
  2022-10-26 15:30                     ` Stefano Stabellini
@ 2022-10-26 15:53                       ` Michal Orzel
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Orzel @ 2022-10-26 15:53 UTC (permalink / raw)
  To: Stefano Stabellini, Bertrand Marquis; +Cc: Julien Grall, xen-devel

Hi,

On 26/10/2022 17:30, Stefano Stabellini wrote:
> 
> 
> On Wed, 26 Oct 2022, Bertrand Marquis wrote:
>>> On 26 Oct 2022, at 12:29, Michal Orzel <michal.orzel@amd.com> wrote:
>>>
>>> Hi all,
>>>
>>> On 25/10/2022 10:20, Bertrand Marquis wrote:
>>>> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>>>>
>>>>
>>>> Hi Michal,
>>>>
>>>>> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@amd.com> wrote:
>>>>>
>>>>> Hi Bertrand,
>>>>>
>>>>> On 25/10/2022 09:45, Bertrand Marquis wrote:
>>>>>>
>>>>>>
>>>>>> Hi Michal,
>>>>>>
>>>>>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@amd.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Hi Michal,
>>>>>>>>>>>
>>>>>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>>>>>>> speaking, we are routing
>>>>>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>>>>>>> This timer is fully emulated, which means that there is nothing that can
>>>>>>>>>>>> trigger such IRQ. This code is
>>>>>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>>>>>>> and we had to use the CNTP instead.
>>>>>>>>>>>>
>>>>>>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>>>>>>> question of what to do with a deadcode,
>>>>>>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>>>>>>
>>>>>>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>>>>>>> would result to a lot of places impacted with the decision.
>>>>>>>>>>>
>>>>>>>>>>> So can you clarify what you mean by deadcode?
>>>>>>>>>> In the timer example, I think we have both a deadcode and unreachable code.
>>>>>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>>>>>>> deadcode which is a "code that can be executed
>>>>>>>>>> but has no effect on the functional behavior of the program". This differs
>>>>>>>>>> from the unreachable code definition that is
>>>>>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an example
>>>>>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>>>>>>> (there is nothing that can trigger this IRQ).
>>>>>>>>>>
>>>>>>>>>> What I mean by deadcode happens to be the sum of the two cases above i.e.
>>>>>>>>>> the code that cannot be executed as well as the code that
>>>>>>>>>> does not impact the functionality of the program.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>>>>>>> person would still need to modify the other
>>>>>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>>>>>>
>>>>>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>>>>>>
>>>>>>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>>>>>>> supported by Xen?
>>>>>>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>>>>>>> between broken HW and
>>>>>>>>>> HW with known errata is that for the former, the incorrect behavior is often
>>>>>>>>>> due to the early support stage,
>>>>>>>>>> using emulators/models instead of real HW, whereas for the latter, the HW is
>>>>>>>>>> already released and it happens to be that it is buggy
>>>>>>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>>>>>>
>>>>>>>>> Thanks for the clarification. What I would call broken is anything that can't
>>>>>>>>> be fixed in software. For a not too fictional example, an HW where PCI devices
>>>>>>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>>>>>>> supported.
>>>>>>>>>
>>>>>>>>> Regarding, not yet released HW, I don't think Xen should have workaround for
>>>>>>>>> them. I wouldn't even call it "broken" because they are not yet released and
>>>>>>>>> it is common to have bug in early revision.
>>>>>>>>>
>>>>>>>>>> Do we have any example in Xen for supporting broken HW,
>>>>>>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>>>>>>> I will not cite any HW on the ML. But from my experience, the vendors are not
>>>>>>>>> very vocal about issues in public (some don't even seem to have public doc).
>>>>>>>>> The best way to find the issues is to look at Linux commit.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>>>>>>
>>>>>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>>>>>>> I think this would be ok from the certification point of view (this would at
>>>>>>>>>> least means, that we are aware of the issue
>>>>>>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>>>>>>> deadcode/unreachable code.
>>>>>>>>>
>>>>>>>>> Thanks for the clarification. So the exact approach will depend on the
>>>>>>>>> context....
>>>>>>>>>
>>>>>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>>>>>>> completely
>>>>>>>>>>>> 2. Leave it as it is
>>>>>>>>>
>>>>>>>>> ... this is my preference in the context of the timer.
>>>>>>>>
>>>>>>>> From a certification point of view, the fewer lines of code the better,
>>>>>>>> and ideally all the lines of code used for the certified build should be
>>>>>>>> testable and used.
>>>>>>>>
>>>>>>>> So I think 2. is the lest useful option from a certification
>>>>>>>> perspective. For this reason, I'd prefer another alternative.
>>>>>>>>
>>>>>>>>
>>>>>>>>> If the other don't like it, then 1 would be my preference.
>>>>>>>>>
>>>>>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>>>>>>
>>>>>>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>>>>>>> the timer, I can see why 1 is your (second) preference and I am fine
>>>>>>>> with 1 as well.
>>>>>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his opinion.
>>>>>>
>>>>>> We need to get rid of dead code and removing it is not always the best solution.
>>>>>>
>>>>>> If the code is or could be useful for someone some day, protecting it with ifdef is ok.
>>>>>>
>>>>>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in the
>>>>>> code so that we can compile out what we do not need and code not applying to
>>>>>> some hardware is a case where we will do that (does not mean that by default
>>>>>> we will not compile it in but we will make it easier to reduce the code size for a
>>>>>> specific use case).
>>>>>>
>>>>>> So 3.2 and 3.3 are ok for me.
>>>>>
>>>>> So we all agree that the code in the current form is a no go from certification purposes.
>>>>> That is good :)
>>>>>
>>>>> The reason why I opt for solution 1 and not the others is that in the latter case it would
>>>>> mean introducing the Kconfig option to allow user to select the timer to be used by Xen.
>>>>> This is not really correct. Also in the current form, it would also require adding more
>>>>> code to time.c code because at the moment using CNTP for Xen would not work out of the box.
>>>>> The architecture defines the hypervisor timer for a purpose. If it does not work, it means
>>>>> that the HW is problematic. I agree that we would want to support such use case but I'm not
>>>>> really aware of any issue like that. Adding more code and Kconfig options just because
>>>>> one day someone may face issues with a new HW is something I am not a fan of.
>>>>
>>>> I see 2 solutions here:
>>>> - somehow push the code to a different file (not quite sure this is feasible here)
>>>> - remove completely the code with a clean commit. Doing this it will be easy for someone needing this to later revert the patch
>>>>
>>>> It is definitely true here that adding more code to keep some unused code does not really make sense.
>>>> And let’s be realistic here, if we need that one day, it will not take ages to support it somehow.
>>>>
>>>> As said, from a pure certification point of view:
>>>> - we must not have deadcode
>>>> - proper ifdef is acceptable
>>>> - if 0 is not acceptable
>>>> - commented code is not acceptable
>>>
>>> Given that we agree on that (+ IS_ENABLED option if possible), and the option 1 seems
>>> to be the best choice for the timer, I will create a patch removing the IRQ path to get rid
>>> of the deadcode/unreachable code.
>>>
>>> Do you think this is something we want for 4.17?
>>> The risk is low as the code is already dead and the benefit is that we have no deadcode.
>>> What do you think?
>>>
>>
>> We are very near from the release so from my point of view as it is not solving a bug, this should not go into 4.17.
> 
> I agree
This works for me and thank you all for the discussion!
This thread and its outcome will certainly be useful for similar issues in the future.

~Michal


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-10-26 15:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24  9:07 Deadcode discussion based on Arm NS phys timer Michal Orzel
2022-10-24 10:51 ` Julien Grall
2022-10-24 11:41   ` Michal Orzel
2022-10-24 13:29     ` Julien Grall
2022-10-25  1:29       ` Stefano Stabellini
2022-10-25  7:11         ` Michal Orzel
2022-10-25  7:45           ` Bertrand Marquis
2022-10-25  8:07             ` Michal Orzel
2022-10-25  8:20               ` Bertrand Marquis
2022-10-26 11:29                 ` Michal Orzel
2022-10-26 12:20                   ` Bertrand Marquis
2022-10-26 15:30                     ` Stefano Stabellini
2022-10-26 15:53                       ` Michal Orzel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.